The Continued Importance Of ETL

Kevin Feasel

2017-05-08

ETL

Andy Leonard explains that good old ETL remains vital to an organization:

A Problem

As Jen points out earlier in her Analytics Market Commoditization and Consolidation post (you should read it all – it’s awesome – like all of Jen’s posts!) many analytics solution providers share the “Same look, same marketing story, same saves time and allows users [to] avoid evil IT.”

I can hear some of you thinking, “Are you telling us analytics doesn’t work, Andy?” Goodness no. I’m telling you hype and sales strategy work in the analytics market as well as anywhere. When asked why a solution may not perform to expectations, the #1 response is “your data is not clean.”

Data engineering (think ETL specifically designed for analytics and “big data”) is the backbone behind data science.  To Andy’s point, the data engineer’s job is to get clean, context-heavy data in front of a data scientist, the same way a “classical” Business Intelligence specialist works with analysts.

Related Posts

Streaming ETL Using CDC And Event Hub

Rolf Tesmer combines Change Data Capture and Event Hubs to build a streaming ETL solution: The solution picks up the SQL data changes from the CDC Change Tracking system tables, creates JSON messages from the change rows, and then posts the message to an Azure Event Hub.  Once landed in the Event Hub an Azure […]

Read More

Real-Time Streaming ETL With Kafka Streams

Yeva Byzek has a tutorial using Kafka and Kafka Streams to perform real-time ETL: Let’s consider an application that does some real-time stateful stream processing with the Kafka Streams API. We’ll run through a specific example of the end-to-end reference architecture and show you how to: Run a Kafka source connector to read data from […]

Read More

Categories