Data Migration - The Traditional Way

Modified on Fri, 13 Oct, 2023 at 4:38 AM

The phrase "traditional way of specification in data migration" typically refers to the detailed planning and requirements needed to execute a successful data migration project.

This is often combined with the typical waterfall approach to data migration: a linear, sequential process in which each phase must be completed before the next begins. It is a traditional method often used in software development and project management.

There is, however, a better way!

The mapping rules governing the data migration need to be specified and maintained, which can be done in various ways. It is common to use textual specifications, such as Word documents or Excel worksheets, which are then handed over to developers. The developers turn these specifications into code that, when executed, transforms and validates the data as specified in the documents provided.
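To make the hand-over concrete, here is a minimal sketch of what a developer might write from two rows of a mapping worksheet. The field names (CUST_NM, DOB, customer_name, birth_date), the date format, and the function name are hypothetical and serve only as illustration; real specifications and implementations will differ.

```python
from datetime import datetime

# Hypothetical mapping rules, as they might appear in a worksheet:
#   target field   | source field | rule
#   customer_name  | CUST_NM      | trim whitespace, must not be empty
#   birth_date     | DOB          | convert DD.MM.YYYY to ISO 8601 (YYYY-MM-DD)


def migrate_customer(source_row):
    """Transform one source row and collect validation errors.

    Returns a (target_row, errors) pair. The rules are hand-coded from
    the worksheet above, which is exactly where drift can creep in.
    """
    errors = []
    target = {}

    # Rule 1: trim the name and reject empty values.
    name = (source_row.get("CUST_NM") or "").strip()
    if not name:
        errors.append("customer_name: source field CUST_NM is empty")
    target["customer_name"] = name

    # Rule 2: parse the source date format and emit an ISO 8601 date.
    raw_dob = source_row.get("DOB") or ""
    try:
        target["birth_date"] = datetime.strptime(raw_dob, "%d.%m.%Y").date().isoformat()
    except ValueError:
        errors.append(f"birth_date: cannot parse DOB value {raw_dob!r}")
        target["birth_date"] = None

    return target, errors
```

Note that the rules exist twice in this sketch: once in the comment (standing in for the worksheet) and once in the code. Nothing forces the two to stay in step when either one changes.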

This approach makes sense as the users creating these specifications should have in-depth knowledge of the data and the business processes supported by that data. Many "business experts" prefer using familiar applications like Word or Excel to produce these specifications.

Based on these specifications, a set of executables must be developed and maintained. Typically, a team of developers is tasked with writing programs that execute the data migration based on the specifications. Historically, third-generation programming languages were used for this purpose, but nowadays, leveraging the capabilities of powerful Extract, Transform, Load (ETL) tools is becoming increasingly common.

However, the tools commonly used for specifications (Word, Excel, etc.) lack support for version control, cross-referencing, and validating the consistency of the produced mappings. Inconsistencies can lead to communication challenges and wasted effort during implementation. In some cases, these inconsistencies may go unnoticed until late in the data migration project's lifecycle, making them difficult, expensive, and risky to correct.
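As an illustration of the kind of consistency validation that a Word document or Excel worksheet cannot perform on its own, the sketch below checks a list of mapping rules against the known source and target fields. The data layout (dictionaries with "source" and "target" keys) and the function name are assumptions made for this example, not a description of any particular tool.

```python
def check_mappings(mappings, source_fields, target_fields):
    """Return a list of human-readable inconsistencies in a mapping spec.

    mappings:      list of dicts with hypothetical keys "source" and "target"
    source_fields: set of field names that exist in the source system
    target_fields: set of field names that exist in the target system
    """
    problems = []
    seen_targets = set()

    for m in mappings:
        # A mapping must refer to fields that actually exist.
        if m["source"] not in source_fields:
            problems.append(f"unknown source field: {m['source']}")
        if m["target"] not in target_fields:
            problems.append(f"unknown target field: {m['target']}")
        # Two rules writing to the same target field contradict each other.
        if m["target"] in seen_targets:
            problems.append(f"target field mapped more than once: {m['target']}")
        seen_targets.add(m["target"])

    # Every target field should be covered by at least one rule.
    for field in sorted(set(target_fields) - seen_targets):
        problems.append(f"target field has no mapping: {field}")

    return problems
```

Run against a specification in which one rule misspells DOB as DBO and no rule covers an email field, a check like this would report the unknown source field, the doubly mapped target, and the unmapped target immediately, instead of leaving them to surface during implementation or testing.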

Regardless of the chosen toolset, writing the data migration specifications and implementing them are critical components of any project. The success of the data migration depends on the quality of the specifications and their translation into executable programs. As the specifications often grow in size and complexity, maintaining their validity and coherence becomes challenging or outright unmanageable.

As the project evolves, specifications mature and changes are made to the target system, making it increasingly difficult to ensure complete consistency between the specifications and the executables. As deadlines approach, there is a risk of leaving the specifications behind while directly modifying the implementations. As implementation complexity increases, the team responsible for implementation may reject requirements or requests from the business experts due to earlier implementation choices based on outdated specifications.

Finally, the effort invested in the specifications becomes useless as they drift further away from the reality reflected in the executables. The audit trail is most often lost in the process, making it difficult to trace issues found after go-live back to the source.