Hadoop 3.0 Ships

Alex Woodie reports that Hadoop 3.0 is officially out there, and looks at what’s forthcoming in 3.1 and 3.2:

As we told you last week, Hadoop 3.0 brings two big new features that are compelling in their own right. These include support for erasure coding, which should boost storage efficiency by 50% thanks to more efficient data replication, and YARN Federation, which should allow Hadoop clusters to scale up to 40,000 nodes.

The delivery of Hadoop 3.0 shows that the open source community is responding to the demands of industry, said Doug Cutting, original co-creator of Apache Hadoop and the chief architect at Cloudera.

“It’s tremendous to see this significant progress, from the raw tool of eleven years ago, to the mature software in today’s release,” he said in a press release. “With this milestone, Hadoop better meets the requirements of its growing role in enterprise data systems.”

But some of the new features in Hadoop 3.0 weren’t designed to bring immediate rewards to users. Instead, they pave the way for the Apache Hadoop community to deliver more compelling features with versions 3.1 and 3.2, according to Hortonworks director of engineering Vinod Kumar Vavilapalli, who’s also a committer on the Apache Hadoop project.

“Hadoop 3.0 is actually a building block, a foundation, for more exciting things to come in 3.1 and 3.2,” he said.

Click through to see some of those exciting things.
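
The 50% figure quoted for erasure coding is easier to see with a little arithmetic. The sketch below is only an illustration and assumes two things the quoted article does not spell out: that the baseline is HDFS-style 3x replication, and that the erasure coding policy is Reed-Solomon with 6 data units and 3 parity units. The function names are made up for this example and are not part of any Hadoop API.

```python
# Back-of-the-envelope comparison of raw storage use.
# Assumptions (not stated in the quoted article): 3x replication as the
# baseline and a Reed-Solomon layout with 6 data + 3 parity units.


def raw_storage_replicated(logical_size: float, replicas: int = 3) -> float:
    """Raw storage consumed when every block is stored `replicas` times."""
    return logical_size * replicas


def raw_storage_erasure_coded(
    logical_size: float, data_units: int = 6, parity_units: int = 3
) -> float:
    """Raw storage consumed when each group of data units gets parity units."""
    return logical_size * (data_units + parity_units) / data_units


logical = 1000.0  # hypothetical data set size, in any unit (say, TB)
replicated = raw_storage_replicated(logical)
encoded = raw_storage_erasure_coded(logical)
savings = 1 - encoded / replicated

print(f"3x replication: {replicated:.0f}")
print(f"RS(6,3) coding: {encoded:.0f}")
print(f"raw storage saved: {savings:.0%}")
```

Under these assumptions the same data needs half the raw disk space, which is one way to read the "50%" claim. A different policy or replication factor would give a different number.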
