Transforming Data: ELT Or ETL?

Kevin Feasel

2018-04-09

ETL

Artyom Keydunov argues that Extract-Load-Transform is a better model than Extract-Transform-Load:

ETL arose to solve a problem of providing businesses with clean and ready-to-analyze data. We remove dirty and irrelevant data and transform, enrich, and reshape the rest. The example of this could be sessionization: the process of creating sessions out of raw pageviews and users’ events.
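To make sessionization concrete, here is a minimal sketch that is not taken from the quoted post. It assumes events arrive as `(user_id, timestamp)` pairs and that 30 minutes of inactivity ends a session. Both the input shape and the threshold are illustrative assumptions, as are the names `sessionize` and `SESSION_GAP`.

```python
from datetime import datetime, timedelta

# Assumed inactivity threshold; real systems choose their own value.
SESSION_GAP = timedelta(minutes=30)


def sessionize(events, gap=SESSION_GAP):
    """Collapse raw (user_id, timestamp) events into per-user sessions."""
    sessions = []
    # Sorting by (user_id, timestamp) puts each user's events together in time order.
    for user_id, ts in sorted(events):
        current = sessions[-1] if sessions else None
        same_session = (
            current is not None
            and current["user_id"] == user_id
            and ts - current["end"] <= gap
        )
        if same_session:
            # Extend the open session for this user.
            current["end"] = ts
            current["events"] += 1
        else:
            # New user, or the gap was exceeded: start a new session.
            sessions.append(
                {"user_id": user_id, "start": ts, "end": ts, "events": 1}
            )
    return sessions


pageviews = [
    ("a", datetime(2018, 4, 9, 10, 0)),
    ("b", datetime(2018, 4, 9, 10, 5)),
    ("a", datetime(2018, 4, 9, 10, 10)),
    ("a", datetime(2018, 4, 9, 11, 0)),
]
sessions = sessionize(pageviews)
```

The output keeps only the start, end, and event count of each session, which is exactly the kind of reshaping that discards detail from the raw pageviews.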

ETL is complicated, especially the transformation part. It requires at least several months for a small-sized (less than 500 employees) company to get up and running. Once you have the initial transform jobs implemented, never-ending changes and updates will begin because data always evolves with business.

The other problem of ETL is that during the transformation, we reshape data into some specific form. This form usually lacks some data’s resolution and does not include data that is useless for that time or for that particular task. Often, “useless” data becomes “useful.” For example, if business users request daily data instead of weekly, then you will have to fix your transformation process, reshape data, and reload it. That would take a few weeks more.
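The daily-versus-weekly point can be sketched in a few lines. This is a toy illustration rather than anything from the quoted post: the rows, the `aggregate` helper, and the grouping functions are all hypothetical. If the raw rows are kept, as in ELT, changing the grain means changing one function; if only the weekly totals were loaded, the daily figures cannot be recovered from them.

```python
from collections import defaultdict
from datetime import date


def aggregate(rows, key):
    """Sum amounts from (day, amount) rows, grouped by key(day)."""
    totals = defaultdict(int)
    for day, amount in rows:
        totals[key(day)] += amount
    return dict(totals)


def by_week(day):
    # Group by (ISO year, ISO week number).
    iso = day.isocalendar()
    return (iso[0], iso[1])


def by_day(day):
    return day


# Hypothetical raw data, kept at its original resolution.
raw_rows = [
    (date(2018, 4, 2), 10),
    (date(2018, 4, 3), 5),
    (date(2018, 4, 9), 7),
]

weekly = aggregate(raw_rows, by_week)  # what a weekly ETL job would have loaded
daily = aggregate(raw_rows, by_day)    # only possible because raw_rows still exists
```

Going from `raw_rows` to `weekly` is a one-way trip: the two April rows in the first week become a single total, and nothing in `weekly` says how that total was split across days.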

Read on for more, including his argument for why ELT is better.

