Lambda Architecture

Koos van Strien looks at lambda architecture and asks if it works for data warehouses:

The Lambda Architecture is pretty well documented – online as well as in the book I just mentioned. For a quick overview, Lambda Architecture is basically a system where the raw data is always stored, and never thrown away. All information that’s derived from this raw data is always recomputed – often stated as query = function(all data). This provides for a fool-proof architecture that’s rigorously simple (compared to classic RDBMS solutions), made up of three layers:

Admittedly, about half of this went over my head, but there are some good book and webpage recommendations to learn more about lambda architecture and Data Vault.
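
For readers who want the query = function(all data) idea from the excerpt in concrete form, here is a minimal Python sketch. It is not taken from Koos's post: the `PageView` schema and the `record` and `views_per_url` helpers are hypothetical names chosen for illustration. It only shows the principle of keeping raw facts immutable and recomputing every derived result from the full dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PageView:
    """One immutable raw fact (hypothetical schema, for illustration only)."""
    user: str
    url: str


# Append-only raw dataset: facts are added, never updated or deleted.
raw_events: list[PageView] = []


def record(event: PageView) -> None:
    """Store a new raw fact; existing facts are left untouched."""
    raw_events.append(event)


def views_per_url(all_data: list[PageView]) -> Counter:
    """query = function(all data): rebuild the derived view from scratch."""
    return Counter(event.url for event in all_data)


record(PageView("ann", "/home"))
record(PageView("bob", "/home"))
record(PageView("ann", "/pricing"))

# The derived view is never patched in place; it is recomputed from raw data.
print(views_per_url(raw_events))
```

Because the derived counts are always rebuilt from the raw events, a bug in `views_per_url` can be fixed and the view regenerated without losing anything, which is roughly the robustness the excerpt is pointing at. Recomputing over all data on every query is of course the naive version of the idea.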

