Yarn Service Framework Coming

Kevin Feasel



Jian He, et al, announce the Yarn Service Framework:

Apache Hadoop YARN is well known as the general resource-management platform for big-data applications such as MapReduce, Hive / Tez and Spark. It abstracts the complicated cluster resource management and scheduling from higher level applications and enables them to focus solely on their own application specific logic.

In addition to big-data apps, another broad spectrum of workloads we see today are long running services such as HBase, Hive/LLAP and container (e.g. Docker) based services. In the past year, the YARN community has been working hard to build first-class support for long running services on YARN.

This is going to ship with Hadoop 3.1.

Related Posts

Joining Multiple Types Of Data With KSQL

Robin Moffatt has an example where he enriches streaming CSV data with information stored in MySQL: This is a continuous query that executes in the background until explicitly terminated by the user. In effect, these are stream processing applications, and all we need to create them is SQL! Here all we’ve done is an enrichment (joining two […]

Read More

Kafka Partitioning Strategies

Amy Boyle shares some thoughts on Kafka partitioning strategy: If you have enough load that you need more than a single instance of your application, you need to partition your data. The producer clients decide which topic partition data ends up in, but it’s what the consumer applications will do with that data that drives […]

Read More


February 2018
« Jan Mar »