Amit Kulkarni shows how to install Azure Data Lake Store support on your “older” Hadoop clusters:

How old is really old?

The Azure Data Lake Store binaries have been broadly certified for Hadoop distributions after 3.0 and above. We are really in uncharted territory for lower versions. So the farther away you go from 3.0 the higher the likelihood of them not working. My personal recommendation is to go no lower than 2.6. After that your mileage may really vary.

This is a good article, so do check it out. A mini-rant follows: Hadoop version 2.6 is not old. Nor is 2.7. Version 2.7 is the most recent production-worthy branch, and 3.0 isn’t expected to go GA until August.
