Lambda Architecture

Koos van Strien looks at lambda architecture and asks if it works for data warehouses:

The Lambda Architecture is pretty well documented – online1 as well as in the book I just mentioned2. For a quick overview, Lambda Architecture is basically a system where the raw data is always stored, and never thrown away. All information that’s derived from this raw data is always recomputed – often stated as query = function(all data). This provides for a fool-proof architecture that’s rigorously simple (compared to classic RDBMS solutions), made up of three layers:

Admittedly, about half of this went over my head, but there are some good book and webpage recommendations to learn more about lambda architecture and Data Vault.

Related Posts

Temp Tables In Redshift

Derik Hammer has some notes on temporary tables in Amazon Redshift: One difference between regular tables and temporary tables is how they are typically used. Temporary tables are session scoped which means that adding them into a process or report will probably cause them to be created multiple times. Temporary tables might be very similar […]

Read More

Virtualize Data Or Move It?

James Serra contrasts data virtualization with traditional ETL moving data to a warehouse: Data virtualization integrates data from disparate sources, locations and formats, without replicating or moving the data, to create a single “virtual” data layer that delivers unified data services to support multiple applications and users. Data movement is the process of extracting data from source […]

Read More


May 2016
« Apr Jun »