Yeva Byzek has a tutorial using Kafka and Kafka Streams to perform real-time ETL:
Let’s consider an application that does some real-time stateful stream processing with the Kafka Streams API. We’ll run through a specific example of the end-to-end reference architecture and show you how to:
- Run a Kafka source connector to read data from another system (a SQLite3 database), then modify the data in-flight using Single Message Transforms (SMTs) before writing it to the Kafka cluster
- Process and enrich the data from a Java application using the Kafka Streams API (e.g. count and sum)
- Run a Kafka sink connector to write data from the Kafka cluster to another system (AWS S3)
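To give a rough flavor of the first step (these sketches are mine, not taken from the tutorial; the file paths, field names, and topic prefix are illustrative), a Kafka Connect JDBC source connector reading from SQLite and applying an InsertField Single Message Transform might be configured like this in standalone mode:

```properties
# Hypothetical standalone-mode source connector config (names and paths are examples).
name=jdbc-sqlite-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
# Path to the SQLite database file (adjust to your environment).
connection.url=jdbc:sqlite:/path/to/example.db
# Pick up new rows by an incrementing id column.
mode=incrementing
incrementing.column.name=id
# Each table becomes a topic prefixed with "sqlite-".
topic.prefix=sqlite-
# Example SMT: stamp every record value with a static field before it reaches Kafka.
transforms=InsertSource
transforms.InsertSource.type=org.apache.kafka.connect.transforms.InsertField$Value
transforms.InsertSource.static.field=data_source
transforms.InsertSource.static.value=sqlite
```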
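For the stream-processing step, a minimal Kafka Streams application that counts and sums records per key might look like the following sketch (the topic names, key/value types, and store names are assumptions, not the tutorial's actual code):

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class StreamingEtlExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streaming-etl-example");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    StreamsBuilder builder = new StreamsBuilder();

    // Hypothetical input topic written by the source connector:
    // key = item id, value = sale amount.
    KStream<String, Long> sales = builder.stream(
        "sqlite-sales", Consumed.with(Serdes.String(), Serdes.Long()));

    // Count the number of records seen per key.
    KTable<String, Long> counts = sales
        .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
        .count(Materialized.as("sales-count-store"));

    // Sum the values per key.
    KTable<String, Long> sums = sales
        .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
        .reduce(Long::sum, Materialized.as("sales-sum-store"));

    // Write both aggregates back to Kafka for the sink connector to export.
    counts.toStream().to("sales-count", Produced.with(Serdes.String(), Serdes.Long()));
    sums.toStream().to("sales-sum", Produced.with(Serdes.String(), Serdes.Long()));

    KafkaStreams streams = new KafkaStreams(builder.build(), props);
    streams.start();
    Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
  }
}
```

The aggregates are continuously updated KTables, so the output topics always reflect the latest counts and sums as new rows arrive from the source connector.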
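And for the last step, a Confluent S3 sink connector that exports those output topics to a bucket could be configured along these lines (bucket, region, and topic names are placeholders):

```properties
# Hypothetical S3 sink connector config (bucket, region, and topics are examples).
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
# Export the aggregate topics produced by the Streams application.
topics=sales-count,sales-sum
s3.bucket.name=my-streaming-etl-bucket
s3.region=us-west-2
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
# Write an S3 object after every 3 records (small value for demo purposes).
flush.size=3
```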
Read the whole thing.