Ashish Kumar has started a series on data lake essentials:
Data Lake architecture is all about storing large amounts of data which can be structured, semi-structured or unstructured, e.g. web server logs, RDBMS data, NoSql data, social media, sensors, IoT data and third-party data. A data lake can store the data in the same format as its source systems or transform it before storing.
The main purpose of a data lake is to make organizational data from different sources, accessible to a variety of end users like business analysts, data engineers, data scientists, product managers, executives, etc, in order to enable these personas to leverage insights in a cost-effective manner, for improved business performance. Today, many forms of advanced analytics are only possible on data lakes.
Click through for more information on what a data lake should provide—whether that be in-house or a cloud provider.