A common requirement that I see from customers who are processing text files in data lakes with Azure Data Factory, is to read and process files where there are variable numbers of lines that precede both the data headers and the data that needs to be processed. ADF already has facilities that handle the ability to switch headers off or on as well as the ability to specify parameterized skip line counts. However, in many cases, files that are received for processing have variable numbers of superfluous lines that need to be skipped.
In ADF, between pipeline activities and data flows, there are a number of ways to handle this scenario. In this post, I am going to demonstrate one such technique.
Read on to see which technique Mark chose and how to get it working.