Kafka Plus Spark Streaming

Prasad Alle shows how to integrate Kafka with Spark Streaming on AWS:

Stream processing walkthrough

The entire pattern can be implemented in a few simple steps:

  1. Set up Kafka on AWS.

  2. Spin up an EMR 5.0 cluster with Hadoop, Hive, and Spark.

  3. Create a Kafka topic.

  4. Run the Spark Streaming app to process clickstream events.

  5. Use the Kafka producer app to publish clickstream events into Kafka topic.

  6. Explore clickstream events data with SparkSQL.

This is a pretty easy-to-follow walkthrough with some good tips at the end.

Related Posts

Building TensorFlow Neural Networks On Spark With Keras

Jules Damji has an example of using the PyCharm IDE to use Keras to build TensorFlow neural network models on the Databricks MLflow library: Our example in the video is a simple Keras network, modified from Keras Model Examples, that creates a simple multi-layer binary classification model with a couple of hidden and dropout layers and […]

Read More

Creating SQL Server Images In Azure Container Registry

Andrew Pruski shows us how to save Docker container images to the Azure Container Registry using Powershell: Awesome! Our custom image is in our ACR! But has it worked? Has it really? Oh ye of little faith… I guess the only way to find out is to run a container! So let’s run a Azure […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31