Data Lake 3.0

Vinod Kumar Vavilapalli describes the modern data lake:

During the past few years though, end-to-end business use-cases have evolved to another level.

  • The end-to-end business problems are now mostly solved by multiple applications working together.
  • As the platform matured, users have increasingly started wanting to solely focus on the business application layers, and getting impatient to get on with developing their main business-logic.
  • However, YARN, and for that matter any other related platform, hasn’t catered to this evolving need, leaving the users to unwillingly get involved in the painstaking details of wiring applications together, keeping them up, manually scaling them as need arises etc.

Manual plumbing of all these different colored services in tiresome! Further, there is a clear need for seamless aggregate deployment, lifecycle management and application wireup. This is the gap that needs to be bridged between what these end-to-end business use-cases need from the platform and what the platform offers today. If these features are provided, then the business use cases authors can singularly focus on the business logic.

This is a higher-level “where are we at?” kind of post which could be helpful if you’re new to the data lake concept.

Related Posts

Generating Load For Kafka With JMeter

Anup Shirolkar shows us a way to use JMeter to generate load for Apache Kafka clusters: The Anomalia Machina is going to require (at least!) one more thing as stated in the intro, loading with lots of data! Kafka is a log aggregation system and operates on a publish-subscribe mechanism. The Kafka cluster in Anomalia Machina […]

Read More

Data Science And Data Engineering In HDP 3.0

Saumitra Buragohain, et al, show off some of the things added to the Hortonworks Data Platform for data scientists and data engineers: We leverage the power of HDP 3.0 from efficient storage (erasure coding), GPU pooling to containerized TensorFlow and Zeppelin to enable this use case. We will the save the details for a different […]

Read More

Categories

March 2017
MTWTFSS
« Feb Apr »
 12345
6789101112
13141516171819
20212223242526
2728293031