Stream-To-Stream Joins In Spark

Ayush Tiwari shows how to join a pair of streams in Apache Spark 2.3:

In Spark 2.3, it added support for stream-stream joins, i.e, we can join two streaming Datasets/DataFrames and in this blog we are going to see how beautifully spark now give support for joining the two streaming dataframes.

I this example, I am going to use

Apache Spark 2.3.0
Apache Kafka 0.11.0.1
Scala 2.11.8

Click through for the demo.

Related Posts

HDP 3.0 Released

Roni Fontaine and Saumitra Buragohain announce Hortonworks Data Platform version 3.0: Other additional capabilities include: Scalability and availability with NameNode federation, allowing customers to scale to thousands of nodes and a billion files. Higher availability with multiple name nodes and standby capabilities allow for the undisrupted, continuous cluster operations if a namenode goes down. Lower […]

Read More

Visualizing Data In Real Time With SQL Server And Dash

Tomaz Kastrun shows how to use Python Dash to visualize data living in SQL Server in real time: The need for visualizing the real-time data (or near-real time) has been and still is a very important daily driver for many businesses. Microsoft SQL Server has many capabilities to visualize streaming data and this time, I […]

Read More

Categories

May 2018
MTWTFSS
« Apr Jun »
 123456
78910111213
14151617181920
21222324252627
28293031