Spark Metrics

Swaroop Ramachandra looks at some key metrics for Spark administration:

Once you have identified and broken down the Spark and associated infrastructure and application components you want to monitor, you need to understand the metrics that you should really care about: the ones that affect the performance of your application as well as your infrastructure. Let’s dig deeper into some of the things you should care about monitoring.

  1. In Spark, memory-related issues are typical if you haven’t paid attention to memory usage when building your application. Make sure you track garbage collection and memory across the cluster on each component, specifically the executors and the driver. Garbage collection stalls or abnormal patterns can increase back pressure.
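As a rough illustration of tracking garbage collection and memory per component, the sketch below reads executor summaries from Spark's monitoring REST API (`/api/v1/applications/<app-id>/executors`, where the driver is listed with id `driver`) and computes GC time as a fraction of task time. The helper names `fetch_executors` and `gc_report`, the base URL, and the 10% alert threshold are assumptions made for this example and are not taken from the linked article. Note that `memoryUsed` and `maxMemory` in this endpoint describe storage memory, not total JVM heap.

```python
import json
from urllib.request import urlopen


def fetch_executors(base_url, app_id):
    """Fetch executor summaries from the Spark REST API.

    base_url is hypothetical, e.g. "http://driver-host:4040".
    """
    url = f"{base_url}/api/v1/applications/{app_id}/executors"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def gc_report(executors, gc_threshold=0.10):
    """Summarize GC and storage-memory pressure for each executor.

    gc_threshold is an assumed cutoff: flag any executor whose
    GC time exceeds this fraction of its total task time.
    """
    report = []
    for ex in executors:
        duration = ex.get("totalDuration", 0)  # total task time, ms
        gc_time = ex.get("totalGCTime", 0)     # JVM GC time, ms
        max_mem = ex.get("maxMemory", 0)       # storage memory limit, bytes
        used_mem = ex.get("memoryUsed", 0)     # storage memory used, bytes

        # Guard against division by zero (the driver runs no tasks).
        gc_fraction = gc_time / duration if duration else 0.0
        storage_fraction = used_mem / max_mem if max_mem else 0.0

        report.append({
            "id": ex.get("id"),
            "gc_fraction": gc_fraction,
            "storage_fraction": storage_fraction,
            "gc_alert": gc_fraction > gc_threshold,
        })
    return report
```

Against a running application, something like `gc_report(fetch_executors("http://driver-host:4040", app_id))` could be polled on a schedule and the flagged executors sent to whatever alerting system is in use.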

There are a few metrics of note here.  Check it out.
