In my previous blog I showed how you can stream Twitter data to an Event Hub and stream the data to a Power BI live dashboard. In this post, I am going to show you how to store this data for long term storage. An Event Hub stores your events temporarily. That means it does not store them for later analysis. Say you want to analyze whether negative or positive tweets have an impact on your sales, you would need to store tweets for a historical view.
The question is where to store this data: directly to the datawarehouse, or store it to a data lake? This really depends on the architecture that you want to have. A data lake is often used to store the raw data historically. Is is especially interesting because it allows to store any kind of data, structured or unstructured and it is quite cheap compared to Azure SQL database or Azure SQL datawarehouse. So for that reason, we are going to store it in a data lake.
Jesse walks us through data lake creation and data migration from Event Hubs into a Data Lake Storage container.