Using Datadog To Monitor Spark Clusters On EMR

Priya Matpadi walks us through one way to monitor Spark clusters on Amazon ElasticMapReduce:

We recently implemented a Spark streaming application, which consumes data from from multiple Kafka topics. The data consumed from Kafka comprises different types of telemetry events generated by mobile devices. We decided to host the Spark cluster using the Amazon EMR service, which manages a fleet of EC2 instances to run our data-processing pipelines.

As part of preparing the cluster and application for deployment to production, we needed to implement monitoring so we could track the streaming application and the Spark infrastructure itself. At a high level, we wanted ensure that we could monitor the different components of the application, understand performance parameters, and get alerted when things go wrong.

In this post, we’ll walk through how we aggregated relevant metrics in Datadog from our Spark streaming application running on a YARN cluster in EMR.

Check it out.  If this is interesting, Priya’s blog has the full series.

Related Posts

Controlling Azure Services In R With AzureR

Hong Ooi announces a new set of packages called AzureR: As background, some of you may remember the AzureSMR package, which was written a few years back as an R interface to Azure. AzureSMR was very successful and gained a significant number of users, but it was never meant to be maintainable in the long term. As […]

Read More

Game Theory With Apache Spark

Konor Unyelioglu has a four-part series on solving game theoretical problems with Apache Spark.  Part one lays out the scenario: One application of game theory is finding optimal resource allocation. For example, as discussed in this article, resource management for heterogeneous wireless networks involves sharing network links, e.g. 3G, Wi-Fi, WiMAX, LTE, between mobile devices of […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

November 2018
MTWTFSS
« Oct  
 1234
567891011
12131415161718
19202122232425
2627282930