9 October 2009

Notes on a data migration

When preparing to test a data migration we wanted to know how many records needed to be inspected. I googled for a calculator and this site came up.

Let me give a hypothetical scenario to illustrate how it can work for you.

I have 1,000 VIP accounts being migrated from a legacy system to a new online travel management tool.

We want to migrate their current details including name, company, address, contact details etc. We are also interested in migrating past travel details, but this will be deferred to a later release.

The VIPs are important customers to us and we have no tolerance for mistakes. By no tolerance I mean that if a handful of them have a problem, I can handle it, but no more than that because, as I said, they are all VIPs. So no tolerance means 1% room for error.

The developer work has been unit tested and peer reviewed and it looks like the data is in pretty good shape in the legacy system, so I only need a 90% confidence level on the test.

(This stands in contrast to our 85% unit test coverage, 0% automated UI tests, and 75% coverage of cases in UAT.)

We also do a pre-test check in the developer environment: we inspect 20 records and find three mistakes, so we add in a response bias of 15% (which is 3/20).

As a first pass I try an error tolerance of 5%, with a confidence level of 90% and a bias of 15%. The calculator tells me that out of 1,000 cases I need to test 122 records.
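For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the calculator uses the standard sample size formula for a proportion with a finite population correction. That is an assumption on my part, not something the calculator confirms, although the formula does reproduce the 122 figure. The function name and parameter names are made up for illustration.

```python
import math
from statistics import NormalDist


def sample_size(population, margin, confidence, expected_rate):
    """Records to inspect, using the normal approximation for a proportion
    plus a finite population correction (assumed, not confirmed, to be
    what the online calculator does)."""
    # Two-sided z-score, about 1.645 for a 90% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size if the population were effectively infinite.
    n0 = z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2
    # Shrink it for a finite population, then round up to whole records.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# 1,000 accounts, 5% tolerance, 90% confidence, 15% "bias" from the pre-test.
print(sample_size(1000, 0.05, 0.90, 0.15))  # 122
```

The 15% goes in as the expected proportion of bad records. The closer that figure is to 50%, the more records the formula asks for.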

That scenario includes the potential for up to 50 VIPs (5% of 1,000) to encounter a problem at some point, based on the data we migrate. Maybe that’s one a week over the year, or maybe, because they are VIPs and travel a hell of a lot, it’s 50 in the first month. And we know that each problem costs us $300. The potential cost of errors at this threshold is $15,000, which is about equivalent to 3 weeks of testing time.

So we know that the potential cost here outweighs the cost of some additional tests. If we target an error tolerance of 1% we need to run 776 tests. Each test takes 10 minutes, so that’s about four weeks’ worth of testing/inspecting effort. That’s a bit too expensive, so we start exploring the numbers in between until we find the right balance between risk and cost.
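Exploring the numbers in between can be done with a short script rather than by re-entering values in the calculator. This is a sketch only. It assumes the calculator uses the standard sample size formula with a finite population correction, and it takes the $300 per problem and 10 minutes per test from the scenario above. The function names and the list of tolerances tried are illustrative.

```python
import math
from statistics import NormalDist

POPULATION = 1000        # VIP accounts being migrated
CONFIDENCE = 0.90        # confidence level chosen for the test
EXPECTED_RATE = 0.15     # 3 mistakes in the 20-record pre-test check
COST_PER_PROBLEM = 300   # dollars per problem a VIP encounters
MINUTES_PER_TEST = 10    # time to inspect one record


def sample_size(population, margin, confidence, expected_rate):
    """Assumed calculator formula: normal approximation for a proportion
    with a finite population correction, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def trade_off(margin):
    """Return (tests to run, testing hours, dollar exposure) for a tolerance."""
    tests = sample_size(POPULATION, margin, CONFIDENCE, EXPECTED_RATE)
    hours = tests * MINUTES_PER_TEST / 60
    # Exposure follows the reasoning in the post: tolerance x accounts x cost.
    exposure = round(margin * POPULATION) * COST_PER_PROBLEM
    return tests, hours, exposure


for margin in (0.05, 0.04, 0.03, 0.02, 0.01):
    tests, hours, exposure = trade_off(margin)
    print(f"{margin:.0%} tolerance: {tests} tests, "
          f"{hours:.0f} hours, ${exposure:,} exposure")
```

Each row puts the testing effort next to the dollar exposure for one tolerance, which is the balance between risk and cost being looked for.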

Okay, I am no stats expert and would love some commentary on this scenario.