Alex Woodie reports that Hadoop 3.0 is officially out there, and looks at what’s forthcoming in 3.1 and 3.2:
As we told you about last week, Hadoop 3.0 brings two big new features that are compelling in their own right. That includes support for erasure coding, which should boost storage efficiency by 50% thanks to more efficient data replication; and YARN Federation, which should allow Hadoop clusters to scale up to 40,000 nodes.
The delivery of Hadoop 3.0 shows that open open source community is responding to demands of industry, said Doug Cutting, original co-creator of Apache Hadoop and the chief architect at Cloudera.
“It’s tremendous to see this significant progress, from the raw tool of eleven years ago, to the mature software in today’s release,” he said in a press release. “With this milestone, Hadoop better meets the requirements of its growing role in enterprise data systems.
But some of the new features in Hadoop 3.0 weren’t designed to bring immediate rewards to users. Instead, they pave the way for the Apache Hadoop community to deliver more compelling features with versions 3.1 and versions 3.2, according to Hortonworks director of engineering Vinod Kumar Vavilapalli, who’s also a committer on the Apache Hadoop project.
“Hadoop 3.0 is actually a building block, a foundation, for more exciting things to come in 3.1 and 3.2,” he said.
Click through to see some of those exciting things.