Designing A Data Warehouse Test Plan

Koos van Strien walks through some of the high-level concepts when automating data warehouse tests:

In my current project, I’ve got a database containing everything to perform these tests:

  • Tables with identical structure to the ones in the staging area (plus two columns “TestSuiteName” and “TestName”)
  • A table containing the mapping from test-input table to target database, schema and table
  • A stored procedure to purge the DWH (all layers) in the test environment
  • A stored procedure to insert the data for a specific testsuite / name

When preparing a specific test case (the “insert rows for test case” step from the diagram above), the rows needed for that case are copied into the DWH:

Testing warehouses is certainly not a trivial exercise but given how complex warehouse ETL tends to be, having good tests reduces the number of 3 AM pages.

Related Posts

Excluding Checks With dbachecks

Garry Bargsley shows us how to set a config which lets us exclude particular checks when running dbachecks: While tweaking my Invoke-DbcCheck  the list of  -ExcludeCheck checks keeps growing and growing. 1 Invoke-DbcCheck -SqlInstance $Servers -ComputerName $Servers -Check $_ -ExcludeDatabase ReportServer, ReportServerTempDB -ExcludeCheck TestLastBackup, TestLastBackupVerifyOnly, LinkedServerConnection, SPN, MaintenanceSolution, SaRenamed, LastGoodCheckDb, LogShipping, InvalidDatabaseOwner -PassThru | Update-DbcPowerBiDataSource -Environment Production Sure […]

Read More

Temp Tables In Redshift

Derik Hammer has some notes on temporary tables in Amazon Redshift: One difference between regular tables and temporary tables is how they are typically used. Temporary tables are session scoped which means that adding them into a process or report will probably cause them to be created multiple times. Temporary tables might be very similar […]

Read More