The Evolution Of Hadoop

Holden Ackerman has an interesting analysis of Qubole customers’ adoption of Hadoop 2:

In Qubole’s 2018 Data Activation Report, we did a deep-dive analysis of how companies are adopting and using different big data engines. As part of this research, we found some fascinating details about Hadoop that we will detail in the rest of this blog.

A common misconception in the market is that Hadoop is dying. However, when you hear people refer to this, they often mean “MapReduce” as a standalone resource manager and “HDFS” as being the primary storage component that is dying. Beyond this, Hadoop as a framework is a core base for the entire big data ecosystem (Apache Airflow, Apache Oozie, Apache Hbase, Apache Spark, Apache Storm, Apache Flink, Apache Pig, Apache Hive, Apache NiFi, Apache Kafka, Apache Sqoop…the list goes on).

I clipped this portion rather than the direct analysis because I think it’s an important point:  the Hadoop ecosystem is thriving as the matter of primary importance switches from what was important a decade ago (batch processing of large amounts of data on servers with direct attached storage) to what is important today (a combination of batch and streaming processing of large amounts of data on virtualized and often cloud-based servers with network-attached flash storage).

Related Posts

Saving To Excel From Azure Data Studio

Bob Pusateri shows us how you can export to Excel from Azure Data Studio: In SQL Server Management Studio, there’s no single-step way to save a result set to Excel. Most commonly I will just copy/paste a result set into a spreadsheet, but depending on the size of the result set and the types of […]

Read More

Things To Know About Databricks UAP

Kara Annanie has five things you should know about the Databricks Unified Analytics Platform: 4.     A Spark Dataframe is not the same as a Pandas/R DataframeSpark Dataframes are specifically designed to use distributed memory to perform operations across a cluster whereas Pandas/R Dataframes can only run on one computer. This means that you need to […]

Read More

Categories

April 2018
MTWTFSS
« Mar May »
 1
2345678
9101112131415
16171819202122
23242526272829
30