Unit Testing Spark Streaming DStreams

Anuj Saxena gives an example of using StreamingSuiteBase to build unit tests for DStreams in Spark Streaming:

So what’s the problem? How do we execute streaming logic in a test environment?

We can write integration test cases and provide the actual environment there, but for unit testing we need a testing environment that does not depend on any external application.

Click through for the example.
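The heart of the approach is the StreamingSuiteBase trait from Holden Karau's spark-testing-base library, which lets you push in-memory batches through a DStream operation on a local StreamingContext and compare the output, with no external cluster involved. Below is a minimal sketch of that pattern; the word-count operation and test data are my own illustration, not code from Anuj's post.

import com.holdenkarau.spark.testing.StreamingSuiteBase
import org.apache.spark.streaming.dstream.DStream

// A minimal sketch: the word-count logic and inputs here are illustrative,
// not taken from the linked post.
class WordCountStreamSpec extends StreamingSuiteBase {

  // The DStream transformation under test.
  def wordCount(lines: DStream[String]): DStream[(String, Int)] =
    lines.flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

  test("counts words per micro-batch") {
    // Each inner List is one micro-batch of input.
    val input = List(List("spark streaming", "spark"))
    val expected = List(List(("spark", 2), ("streaming", 1)))

    // testOperation feeds the batches through the operation on a local
    // StreamingContext and checks the results against the expected output.
    testOperation(input, wordCount _, expected, ordered = false)
  }
}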
