Data Warehouse Automation

Koos van Strien provides some thoughts on data warehouse automation tools:

Currently, I think there are two main approaches to Data Warehouse Automation:

  1. Data Warehouse Generation: You provide sources, mappings, datatype mappings, etc. The tool generates code (or artifacts).
  2. Data Warehouse Automation (DWA): The tool not only generates code / artifacts, but also manages the existing Data Warehouse by offering continuous insight into data flows, actual lineage, row numbers, etc.

The difference might seem small, but IMHO it is visible most clearly whenever changes occur in the Data Warehouse: the second class of tools can handle those changes (while preserving history). The first class of tools provides you with the new structures, but you need to handle the preservation of history yourself (as you would have without DWA).
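
As a rough illustration of that distinction (not taken from Koos's post or from any particular DWA product), here is a minimal Python sketch. The `Column` class, both function names, and the `dim_customer` table are hypothetical, and the SQL is generic. `generate_create` stands in for the first class of tools: it emits the new structure from metadata and nothing more. `generate_migration` hints at what the second class adds: it compares the old and new metadata and emits an in-place change, so rows already loaded stay where they are. It only covers added columns; a real tool would handle far more.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical metadata for one target column."""
    name: str
    sql_type: str


def generate_create(table: str, columns: list[Column]) -> str:
    """Generation only: emit a fresh CREATE TABLE from the metadata."""
    cols = ",\n  ".join(f"{c.name} {c.sql_type}" for c in columns)
    return f"CREATE TABLE {table} (\n  {cols}\n);"


def generate_migration(table: str, old: list[Column], new: list[Column]) -> list[str]:
    """Change handling: emit ALTER statements for columns that were added,
    leaving the existing table (and its historical rows) in place."""
    old_names = {c.name for c in old}
    return [
        f"ALTER TABLE {table} ADD COLUMN {c.name} {c.sql_type};"
        for c in new
        if c.name not in old_names
    ]


v1 = [Column("customer_id", "INT"), Column("name", "VARCHAR(100)")]
v2 = v1 + [Column("email", "VARCHAR(255)")]

print(generate_create("dim_customer", v2))
print(generate_migration("dim_customer", v1, v2))
```

With only `generate_create`, deploying `v2` means getting a new table definition and working out yourself how to carry the existing rows over, which is the manual history preservation the quote describes.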

Read on for a contrast of these two approaches.
