Data Lake Planning

Melissa Coates discusses some of the planning involved with creating a data lake:

Does a Data Lake Replace a Data Warehouse?

I’m biased here, and a firm believer that modern data warehousing is still very important. Therefore, I believe that a data lake, in and of itself, doesn’t entirely replace the need for a data warehouse (or data marts) which contain cleansed data in a user-friendly format. The data warehouse doesn’t absolutely have to be in a relational database anymore, but it does need a semantic layer which is easy to work with that most business users can access for the most common reporting needs.

On this question, my answer is “Absolutely not.” Data warehouses are designed to answer specific, known business questions. They’re great for regulatory reporting, quarterly reports to shareholders, weekly reports to management, etc. Data lakes are designed for ad hoc analysis of information. Read the whole thing.
