Designing A Data Warehouse Test Plan

Koos van Strien walks through some of the high-level concepts when automating data warehouse tests:

In my current project, I’ve got a database containing everything to perform these tests:

  • Tables with identical structure to the ones in the staging area (plus two columns “TestSuiteName” and “TestName”)
  • A table containing the mapping from test-input table to target database, schema and table
  • A stored procedure to purge the DWH (all layers) in the test environment
  • A stored procedure to insert the data for a specific testsuite / name

When preparing a specific test case (the “insert rows for test case” step from the diagram above), the rows needed for that case are copied into the DWH:

Testing warehouses is certainly not a trivial exercise but given how complex warehouse ETL tends to be, having good tests reduces the number of 3 AM pages.

Related Posts

The Difficulties Of Database Load Testing

Brent Ozar shares some of the trouble you might run into when database load testing: Managers think that to simulate more load, they can just take the production queries and replay them multiple times, simultaneously, from the replay tool. We’ve already talked about how you can’t reliably replay deletes, but even inserts and updates cause a […]

Read More

Simulating Load With Powershell

Grant Fritchey builds a quick script to simulate load using PowerShell: The idea is to be able to easily do one of several different things. By commenting out different sections of the code, I can change the general behavior. Most of the work is done in the  # Run forever section of the code.First, I’ll […]

Read More