Transforming Data: ELT Or ETL?

Artyom Keydunov argues that Extract-Load-Transform is a better model than Extract-Transform-Load:

ETL arose to solve a problem of providing businesses with clean and ready-to-analyze data. We remove dirty and irrelevant data and transform, enrich, and reshape the rest. The example of this could be sessionization: the process of creating sessions out of raw pageviews and users’ events.

ETL is complicated, especially the transformation part. It requires at least several months for a small-sized (less than 500 employees) company to get up and running. Once you have the initial transform jobs implemented, never-ending changes and updates will begin because data always evolves with business.

The other problem of ETL is that during the transformation, we reshape data into some specific form. This form usually lacks some data’s resolution and does not include data that is useless for that time or for that particular task. Often, “useless” data becomes “useful.” For example, if business users request daily data instead of weekly, then you will have to fix your transformation process, reshape data, and reload it. That would take a few weeks more.

Read on for more, including his argument for why ELT is better.

Related Posts

Loading CSVs Into Azure Using dbatools

Stuart Moore has a quick Powershell script which loads CSV data into Azure SQL Database using dbatools: To get some of this data usable for reporting we’re importing it into Azure SQL Database so people can start working their way through it, and we can fix up errors before we push it through into Azure Data Lake […]

Read More

Uploading Files To Azure Blob Storage With Data Factory V2

Kevin Feasel

2018-03-26

Cloud, ETL

Ben Jarvis shows how to use Azure Data Factory V2 to upload files from an on-prem server to Azure Blob Storage: In ADF V2 the integration runtime is responsible for providing the compute infrastructure that carries out data movement between data stores. A self-hosted integration runtime is an on-premise version of the integration runtime that […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

April 2018
MTWTFSS
« Mar  
 1
2345678
9101112131415
16171819202122
23242526272829
30