Joining Streams Of Data

Chuck Blake gives an example of joining two streams of data together in Wallaroo:

The joining event streams pattern takes multiple data pipelines and joins them to produce a new signal message that can be acted upon by a later process.

This pattern is used in a variety of use cases. Here are a few examples:

  • Merging data for an individual across a variety of social media accounts.

  • Merging click data from a variety of devices (e.g. mobile and desktop) for an individual user.

  • Tracking locations of delivery vehicles and assets that need to be delivered.

  • Monitoring electronic trading activity for clients on a variety of trading venues.

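Conceptually, it’s very similar to a normal join operation, but there is a time element which complicates things: matching events rarely arrive at the same moment, so the join has to decide how long to hold an event while waiting for its counterpart on the other stream.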
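To make the time element concrete, here is a minimal sketch of a windowed join in plain Python. It is not Wallaroo's API and not the code from the linked post; the `Event` and `WindowedJoiner` names, the `window` parameter, and the "mobile"/"desktop" source labels are all hypothetical, chosen to mirror the click-data example above. It assumes events arrive roughly in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str   # which stream the event came from (hypothetical labels)
    key: str      # join key, e.g. a user id
    ts: float     # event time in seconds
    payload: str


class WindowedJoiner:
    """Pairs events from two sources that share a key and whose
    timestamps are at most `window` seconds apart."""

    def __init__(self, left: str, right: str, window: float):
        self.left = left
        self.right = right
        self.window = window
        # One buffer per source, holding recent events grouped by key.
        self.buffers = {left: defaultdict(deque), right: defaultdict(deque)}

    def process(self, event: Event) -> list:
        if event.source not in self.buffers:
            raise ValueError(f"unknown source: {event.source}")
        other = self.right if event.source == self.left else self.left
        candidates = self.buffers[other][event.key]

        # Evict events from the other stream that are too old to match.
        # Assumption: timestamps are roughly non-decreasing.
        while candidates and event.ts - candidates[0].ts > self.window:
            candidates.popleft()

        # Emit (left_event, right_event) pairs for everything still in range.
        matches = []
        for c in candidates:
            if abs(event.ts - c.ts) <= self.window:
                matches.append((c, event) if c.source == self.left else (event, c))

        # Keep this event around so later events on the other stream can match it.
        self.buffers[event.source][event.key].append(event)
        return matches


joiner = WindowedJoiner("mobile", "desktop", window=10.0)
first = Event("mobile", "u1", 0.0, "click A")
second = Event("desktop", "u1", 5.0, "click B")
late = Event("mobile", "u1", 20.0, "click C")

no_match_yet = joiner.process(first)    # nothing buffered on the other side
joined = joiner.process(second)         # within 10 seconds of `first`
too_late = joiner.process(late)         # `second` is 15 seconds old by now
```

The window is the design decision that an ordinary batch join never has to make: a longer window catches more matches but holds more state in memory, while a shorter one drops pairs whose events arrive far apart.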
