Designing A Data Warehouse Test Plan

Koos van Strien walks through some of the high-level concepts involved in automating data warehouse tests:

In my current project, I’ve got a database containing everything to perform these tests:

  • Tables with identical structure to the ones in the staging area (plus two columns “TestSuiteName” and “TestName”)
  • A table containing the mapping from test-input table to target database, schema and table
  • A stored procedure to purge the DWH (all layers) in the test environment
  • A stored procedure to insert the data for a specific testsuite / name

When preparing a specific test case (the “insert rows for test case” step from the diagram above), the rows needed for that case are copied into the DWH:

Testing warehouses is certainly not a trivial exercise, but given how complex warehouse ETL tends to be, having good tests reduces the number of 3 AM pages.
