The Evolution Of Hadoop

Holden Ackerman has an interesting analysis of Qubole customers’ adoption of Hadoop 2:

In Qubole’s 2018 Data Activation Report, we did a deep-dive analysis of how companies are adopting and using different big data engines. As part of this research, we found some fascinating details about Hadoop that we will detail in the rest of this blog.

A common misconception in the market is that Hadoop is dying. However, when you hear people refer to this, they often mean “MapReduce” as a standalone resource manager and “HDFS” as being the primary storage component that is dying. Beyond this, Hadoop as a framework is a core base for the entire big data ecosystem (Apache Airflow, Apache Oozie, Apache Hbase, Apache Spark, Apache Storm, Apache Flink, Apache Pig, Apache Hive, Apache NiFi, Apache Kafka, Apache Sqoop…the list goes on).

I clipped this portion rather than the direct analysis because I think it’s an important point:  the Hadoop ecosystem is thriving as the matter of primary importance switches from what was important a decade ago (batch processing of large amounts of data on servers with direct attached storage) to what is important today (a combination of batch and streaming processing of large amounts of data on virtualized and often cloud-based servers with network-attached flash storage).

Related Posts

Locking Azure Resources

Stuart Moore shows us how we can lock Azure resources to prevent accidental mistakes: A resource deletion may not sound like too much of a big thing if you’re deploying Infrastructure as code, hey we’ll just terraform apply again and it’ll pop backup. In theory that’s a great idea, just with one big problem. The new resource […]

Read More

Automate VM Shutdown

Meagan Longoria has a script to shut off an Azure VM when a SQL Agent job finishes: The runbook sets the Azure context to the appropriate subscription (especially important when you are a guest user in someone else’s tenant). Then it checks if the VM is started. If it is, it goes into a do-while […]

Read More

Categories

April 2018
MTWTFSS
« Mar May »
 1
2345678
9101112131415
16171819202122
23242526272829
30