The Need For Multiple Warehouse Architectures

James Serra argues in favor of a data lake approach and a traditional data warehouse:

I think the ultimate question is: Can all the benefits of a traditional relational data warehouse be implemented inside of a Hadoop data lake with interactive querying via Hive LLAP or Spark SQL, or should I use both a data lake and a relational data warehouse in my big data solution?  The short answer is you should use both.  The rest of this post will dig into the reasons why.

I touched on this ultimate question in a blog that is now over a few years old at Hadoop and Data Warehouses so this is a good time to provide an update.  I also touched on this topic in my blogs Use cases of various products for a big data cloud solutionData lake detailsWhy use a data lake?and What is a data lake? and my presentation Big data architectures and the data lake.  

Read on for James’s argument, which is good.  My argument is summed up as follows:  the purpose of a data warehouse is to solve known business problems—that is, to help build reports that people on the business side need based on established requirements.  The purpose of a data lake is to hold all kinds of data and curate it for when people come looking for something they didn’t know they needed.

Related Posts

Mounting HDFS As A Local Filesystem

Guy Shilo looks at two techniques for mounting HDFS as a local filesystem: NFS Gateway is a HDFS component that enables the use to expose HDFS through NFS3 interface so that Linux machines can mount it and access it just as a local filesystem. The manual installation is quite cumbersome and is covered here. Cloudera manager […]

Read More

How Humio Uses Kafka

Kresten Krab describes ways that Humio uses Apache Kafka for their product: Humio is a log analytics system built to run both on-prem and as a hosted offering. It is designed for “on-prem first” because, in many logging use cases, you need the privacy and security of managing your own logging solution. And because volume […]

Read More

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031