Streaming ETL Using CDC And Event Hub

Rolf Tesmer combines Change Data Capture and Event Hubs to build a streaming ETL solution:

The solution picks up the SQL data changes from the CDC Change Tracking system tables, creates JSON messages from the change rows, and then posts the message to an Azure Event Hub.  Once landed in the Event Hub an Azure Stream Analytics (ASA) Job distributes the changes into the multiple outputs.

What I found pretty cool was that I could transmit SQL delta changes from source to target in as little as 5 seconds end to end!

There are a bunch of steps, but the end result is worth it.

Related Posts

Last-Click Attribution With Databricks Delta

Caryl Yuhas and Denny Lee give us an example of building a last-click digital marketing attribution model with Databricks Delta: The first thing we will need to do is to establish the impression and conversion data streams.   The impression data stream provides us a real-time view of the attributes associated with those customers who were served the […]

Read More

Getting Started With Azure Databricks

David Peter Hansen has a quick walkthrough of Azure Databricks: RUN MACHINE LEARNING JOBS ON A SINGLE NODE A Databricks cluster has one driver node and one or more worker nodes. The Databricks runtime includes common used Python libraries, such as scikit-learn. However, they do not distribute their algorithms. Running a ML job only on the driver might not […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31