Plotting a smooth path to data migration.
Because the system itself is seen as the investment, any data migration effort is often viewed as a necessary but unfortunate cost, leading to an over-simplified, under-funded approach. When the hidden challenges are understood and the migration is managed as part of the investment, it is much more likely to deliver accurate data that supports the needs of the business, and to mitigate the risk of delays, budget over-runs and scope reductions.
Key drivers of data complexity
A combination of trends is accelerating the need to manage data migration activity more effectively as part of a corporate data quality strategy:
* Mergers, acquisitions and restructuring of disparate systems
* Compliance--the need to validate data against regulations and standards such as Basel II and Sarbanes-Oxley (SOX)
* Data volume--escalating data is increasing the burden of managing data
* Data diversity--introduction of data in new formats e.g. RFID, SMS, email
* Data decay--data is volatile; customer data typically deteriorates at 10-25% a year
* Data denial--organizations are often unaware of their data quality issues and lack the expertise or a senior sponsor to champion decisive action
* Technical advances--proliferation of new data devices, platforms and operating systems
* Economic factors--with pressure on margins, all corporate data must help organizations compete more effectively
Does data migration get the attention it deserves?
Data migration is usually part of a larger project deliverable and typically the majority of business attention is focused on the package selection and configuration rather than ensuring that the data that populates the new system is fit for purpose. There are some clear reasons why data migration subprojects tend to be 'planned' so lightly. Choosing the new system is an exciting, strategic business activity that usually entails working with new technologies, new suppliers and new opportunities. In short, it is the sexy part of the project.
In contrast, data migration planning is seen as a simple matter of shifting data from one bucket to another via a process that is a necessary administrative burden and extra cost. As a result, planning is often left too late, and the required resources and difficulty of the migration are frequently underestimated. It is regarded as a mundane and thankless task, and in some instances people know they are migrating themselves out of a job.
Mapping a faster route to the unknown
In classic situations the main priority is the safe physical transfer of data from the source(s) to the target without disrupting the business. The focus has been on preparing detailed mapping specifications or rules for moving source data to the target, based on:
* Ad-hoc queries of the source system
* Analysis of a small sample set of source data
* Little knowledge or documentation of how the source systems work
* Little knowledge of how the target system works--these parameters may also be moving as modifications are made during implementation
* Insufficient access to the target system
* Much IT input and little business input
Consequently, mapping specifications and code are often metadata-driven rather than content-driven. Because the actual content of the data is not adequately checked, many assumptions are made, resulting in significant errors and a high rework rate.
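The gap between what the metadata says and what the data contains can be illustrated with a small sketch. It assumes a hypothetical legacy column declared as a mandatory ISO date; the column definition, the sample values and the `1900-01-01` placeholder are all invented for illustration and do not come from any particular system.

```python
from datetime import datetime

# Hypothetical metadata for a legacy column: declared as a mandatory date.
declared = {"name": "date_of_birth", "type": "date", "nullable": False}

# Hypothetical source values, invented for illustration.
values = ["1975-03-14", "", "1900-01-01", "14/03/1975", "1982-11-30", None]


def is_iso_date(value):
    """Return True if the value parses as YYYY-MM-DD."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def content_check(values, placeholder="1900-01-01"):
    """Count how the actual content departs from the declared definition."""
    findings = {"missing": 0, "wrong_format": 0, "placeholder": 0, "ok": 0}
    for value in values:
        if value is None or str(value).strip() == "":
            findings["missing"] += 1        # declared NOT NULL, yet empty
        elif not is_iso_date(value):
            findings["wrong_format"] += 1   # declared a date, stored differently
        elif value == placeholder:
            findings["placeholder"] += 1    # valid date, but a dummy default
        else:
            findings["ok"] += 1
    return findings


findings = content_check(values)
print(declared["name"], findings)
```

A mapping rule written from the declaration alone would treat all six values as valid dates; in this invented sample the content check accepts only two of them.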
Analyzing only small data samples and relying on metadata descriptions is a major risk and is likely to give rise to the following:
* Time and budget estimates will fall short of actual needs
* The target system will not perform effectively
* Workarounds will need to be implemented and resourced
* Remedial data cleansing work will need to be devised and resourced
* The costs of missing the deadline will include maintaining the team, continued running costs of legacy systems and downtime on the target application
* The new system will be seen to be at fault, making it harder to gain user acceptance
* Management confidence will be questioned
Discovering the missing links with profiling and auditing
The most effective way of delivering a data migration program is to fully understand the data sources before starting to specify migration code. This is best achieved with a complete profile and audit of all source data within scope at an early stage and can deliver tangible benefits:
* With complete visibility of all source data, the team can identify and address potential problems that might have remained hidden until a later stage
* The rules for planning, mapping, building and testing migration code can be based on a thorough analysis of all source data rather than a small sample set
* Decisions can then be made on the basis of proven facts rather than assumptions
* Early data validation can assist with the choice of the migration method
* Establishing an in-depth repository of knowledge about the sources to be migrated enables organizations to deliver more accurate specifications for transferring data faster
* Full data auditing can reduce the cost of code amends at test stage by up to 80%
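As a rough illustration of what a profile of source data might record, the sketch below reads every row of an extract and tallies blanks, repeated values and value patterns per column. The extract, the column names and the pattern notation (letters become `A`, digits become `9`) are assumptions made for this example; a real audit would run against the full source and cover far more checks.

```python
import csv
import io
import re
from collections import Counter

# Invented sample standing in for a full source extract.
SOURCE = """customer_id,postcode,phone
1001,CB4 0WS,01223 555 0100
1002,,01223 555 0101
1003,cb40ws,n/a
1003,SW1A 1AA,
"""


def shape(value):
    """Reduce a value to a pattern: letters become A, digits become 9."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", value))


def profile(rows):
    """Return per-column counts of rows, blanks, values and patterns."""
    report = {}
    for row in rows:
        for column, value in row.items():
            stats = report.setdefault(
                column,
                {"rows": 0, "blank": 0, "values": Counter(), "patterns": Counter()},
            )
            stats["rows"] += 1
            value = (value or "").strip()
            if not value:
                stats["blank"] += 1
                continue
            stats["values"][value] += 1
            stats["patterns"][shape(value)] += 1
    return report


report = profile(csv.DictReader(io.StringIO(SOURCE)))
for column, stats in report.items():
    # Occurrences beyond the first of any value, e.g. a repeated key.
    repeats = sum(count - 1 for count in stats["values"].values())
    print(column, "blank:", stats["blank"], "repeats:", repeats,
          "patterns:", dict(stats["patterns"]))
```

Even on this tiny invented extract the profile surfaces a repeated customer_id, a blank postcode, three different postcode patterns and a phone field holding "n/a", which are the kinds of issue that sampling and metadata alone tend to miss.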
By applying a top-down, target-driven rationale, scoping decisions can be prioritized in line with their value to the organization, using criteria such as region, line of business and product type. Refining data outwards from the core saves time and effort.
The result of this greater understanding of the data is a refined scope, stipulating what data will be migrated and how it needs to be transformed and cleaned en route.
Whatever the reason for the data migration, the ultimate aim should be to improve corporate performance and deliver competitive advantage. In order to succeed, data migrations must be given the attention they deserve, rather than simply being considered part of a larger underlying project. Without this attention, and without proper planning, there is a high risk of budgets being exceeded, timescales slipping, or even the project failing completely.
Steve Tuck, CTO, Datanomic
Title Annotation: DATABASE AND NETWORK INTELLIGENCE
Publication: Database and Network Journal
Date: Feb 1, 2008