Most testers will have faced problems with test data. Recently I ran into them again… The specific data that I needed was available only in limited quantities on the test environment. In some cases I would have chosen to skip some of the tests, but this case involved a high-risk area. One way of getting more test data was simply to create it the normal way, the same way you would create new customers, for instance. So that’s what I started to do: create the specific sets of customers that I needed for my test cases. It took a lot of time, but given the high risk, I decided it was worth it.
With a lot of nice data prepared, I could finally start executing the tests that were needed to provide insight into the new functionality. The first couple of test cases passed, but after that the failures wouldn’t stop. The first couple of cases were meant to work with newly created data, so the data was correct for them. The more complex test cases demanded historically correct data, which is not easy to fake. At that point we decided to refresh the test environment with a newer set of transformed production data. This set would have more of the data that I needed, but the refresh would also take the complete test environment offline for at least one day. This had to be coordinated with the other people who used the test environment, but in the end we agreed on a time frame.
Too bad… the refresh of the environment failed due to low disk space. So we lost a complete day of testing and gained nothing. We actually lost another day restoring the test environment to a usable state. We are still not sure how to test the parts that need historical data without completely faking it. Faking it takes too much time, and I still wonder how valid a test is when all of its data is fake.
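A failure like the low disk space above is the kind of thing a small pre-flight check might catch before the environment is taken offline. The following is only a minimal sketch, not what we actually ran: the function names, the 20% safety margin and the example size are all assumptions made for illustration, and you would need to estimate the real size of your own data set.

```python
import shutil


def enough_space(free_bytes: int, required_bytes: int, safety_factor: float = 1.2) -> bool:
    """Return True if free_bytes covers required_bytes plus a safety margin.

    The 1.2 safety factor is an assumed default, not a recommendation.
    """
    return free_bytes >= required_bytes * safety_factor


def preflight_check(path: str, required_bytes: int, safety_factor: float = 1.2) -> None:
    """Raise before the refresh starts if the target volume looks too small."""
    free_bytes = shutil.disk_usage(path).free
    if not enough_space(free_bytes, required_bytes, safety_factor):
        raise RuntimeError(
            f"Not enough disk space at {path}: {free_bytes} bytes free, "
            f"{int(required_bytes * safety_factor)} bytes wanted"
        )


if __name__ == "__main__":
    # Hypothetical values: a 1 KiB "data set" restored into the current directory.
    preflight_check(".", required_bytes=1024)
    print("Pre-flight check passed; safe to start the refresh.")
```

Running a check like this as the first step of the refresh would let it fail in seconds, while the old environment is still intact, instead of halfway through.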
Test data is crucial for testing, so please also test how to back up, restore and refresh your environments!