Press "Enter" to skip to content

Concatenating Many Files in Azure Blob Storage

Drew Furgiuele concatenates a lot of files:

Lately, I’ve found myself with a few requests from friends and users that have a particular problem: they’ve got themselves a data lake in Azure, and they can read and write files just fine to it. The problem, though, is that sometimes they need to take a series of files and mash them all together, or as the cool kids call it: concatenate them. And when it comes to third party tools and methods that can do the trick, you’re spoiled for choice: Azure Data Factory, Spark via Databricks, or even PowerShell.

Case in point: I was working with someone who had tens of thousands of CSV files that they needed to merge together into one big file, but they were already out in their Azure storage account. That doesn’t sound so bad, does it? 

Drew explains why it is, but also why it isn’t. So click through and check that out.