Yarn Service Framework Coming

Kevin Feasel

2018-02-01

Hadoop

Jian He, et al, announce the Yarn Service Framework:

Apache Hadoop YARN is well known as the general resource-management platform for big-data applications such as MapReduce, Hive / Tez and Spark. It abstracts the complicated cluster resource management and scheduling from higher level applications and enables them to focus solely on their own application specific logic.

In addition to big-data apps, another broad spectrum of workloads we see today are long running services such as HBase, Hive/LLAP and container (e.g. Docker) based services. In the past year, the YARN community has been working hard to build first-class support for long running services on YARN.

This is going to ship with Hadoop 3.1.

Related Posts

Tips For Using PolyBase With Cloudera QuickStart VM

I have a post on using Cloudera’s QuickStart VM with PolyBase: Here’s something which tripped me up a little bit while connecting to Cloudera using SQL Server. The data node name, instead of being quickstart.cloudera like the host name, is actually localhost. You can change this in /etc/cloudera-scm-agent/config.ini. Because PolyBase needs to have direct access to the data nodes, […]

Read More

Bayesian Modeling Of Hardware Failure Rates

Sean Owen shows how you can use Bayesian statistical approaches with Spark Streaming, using the example of hard drive failure rates: This data doesn’t arrive all at once, in reality. It arrives in a stream, and so it’s natural to run these kind of queries continuously. This is simple with Apache Spark’s Structured Streaming, and proceeds […]

Read More

Categories

February 2018
MTWTFSS
« Jan Mar »
 1234
567891011
12131415161718
19202122232425
262728