The YARN Fair Scheduler

Kevin Feasel

2016-06-17

Hadoop

Justin Kestelyn discusses the Fair Scheduler in YARN:

Assume that we have a YARN cluster with total resources <memory: 800GB, vcores: 200> with two queues: root.busy (weight=1.0) and root.sometimes_busy (weight=3.0). There are generally four scenarios of interest:

 

  • Scenario A: The busy queue is full with applications, and the sometimes_busy queue has a handful of running applications (say 10%, i.e. <memory: 80GB, vcores: 20>). Soon, a large number of applications are added to the sometimes_busy queue in a relatively short time window. All the new applications in sometimes_busy will be pending, and will become active as containers finish up in the busy queue. If the tasks in the busy queue are fairly short-lived, then the applications in the sometimes_busy queue will not wait long to get containers assigned. However, if the tasks in the busy queue take a long time to finish, the new applications in the sometimes_busy queue will stay pending for a long time. In either case, as the applications in the sometimes_busy queue become active, many of the running applications in the busy queue will take much longer to finish.
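To put rough numbers on Scenario A, here is a minimal Python sketch, not the scheduler's actual code. It assumes fair shares are simply proportional to queue weights (ignoring minimum and maximum resources, scheduling policy, and preemption) and that the busy queue holds everything sometimes_busy is not using. The helper names `steady_fair_shares` and `wait_until_fair_share` are hypothetical, and the waiting-time model (equal-length tasks finishing at a steady rate) is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    memory_gb: float
    vcores: float


def steady_fair_shares(total, weights):
    """Split cluster resources across queues in proportion to their weights.

    Assumption: no min/max resources and every queue has demand.
    """
    weight_sum = sum(weights.values())
    return {
        queue: Resources(
            memory_gb=total.memory_gb * weight / weight_sum,
            vcores=total.vcores * weight / weight_sum,
        )
        for queue, weight in weights.items()
    }


def wait_until_fair_share(current_gb, fair_gb, donor_gb, task_minutes):
    """Toy estimate of how long a queue waits to reach its fair share.

    Hypothetical model: the donor queue's tasks all run for task_minutes,
    finish at a steady rate, and every freed container goes to the waiting
    queue. No preemption is modeled.
    """
    shortfall_gb = max(fair_gb - current_gb, 0.0)
    if donor_gb <= 0:
        return float("inf")
    fraction_of_donor_needed = min(shortfall_gb / donor_gb, 1.0)
    return fraction_of_donor_needed * task_minutes


cluster = Resources(memory_gb=800, vcores=200)
shares = steady_fair_shares(
    cluster, {"root.busy": 1.0, "root.sometimes_busy": 3.0}
)
for queue, share in shares.items():
    print(queue, share)

# Scenario A: sometimes_busy runs at 80GB, busy is assumed to hold the other 720GB.
fair_gb = shares["root.sometimes_busy"].memory_gb
for task_minutes in (5, 120):
    wait = wait_until_fair_share(80, fair_gb, 720, task_minutes)
    print(f"busy tasks of {task_minutes} min: about {wait:.1f} min to reach fair share")
```

Under these assumptions sometimes_busy is entitled to three quarters of the cluster, so it has to wait for most of the busy queue's containers to finish. That is why the length of the busy queue's tasks dominates how long the new applications stay pending.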

 

If you’re interested in a deeper dive into YARN, this is a good series to start with.

