Combining Files In C#

Chris Koester shows how to combine a set of CSVs without duplicating their header rows:

The timings in this post came from combining 8 csv files with 13 columns and a combined total of 9.2 million rows.

I first tried combining the files with the PowerShell technique described here. It was painfully slow and took an hour and a half! This is likely because it is deserializing and then serializing every bit of data in the files, which adds a lot of unnecessary overhead.

Next I tried the C# script below using LINQPad. When reading from and writing to a network share, it took 3 minutes and 56 seconds. Much better! Next I tried it on a local SSD drive and it took just 44 seconds.

Read on for the script itself. The ReadAllLines method reads an entire file into memory at once, so it works fine as long as no single file is larger than your available memory.
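For files that might not fit in memory, here is a minimal sketch of the same idea. It is not Chris's script: it swaps in File.ReadLines, which streams a file lazily, in place of ReadAllLines. The CsvCombiner class, the Combine method, and its parameter names are hypothetical. The sketch assumes every file has the same single-line header and that the destination file sits outside the source folder.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CsvCombiner
{
    // Hypothetical helper: concatenates every *.csv file in sourceFolder into
    // destinationFile, writing the header row only once.
    // Assumes all files share the same single-line header.
    public static void Combine(string sourceFolder, string destinationFile)
    {
        // Sort the paths so the output order is deterministic.
        string[] files = Directory.GetFiles(sourceFolder, "*.csv")
                                  .OrderBy(f => f, StringComparer.Ordinal)
                                  .ToArray();

        using (var writer = new StreamWriter(destinationFile))
        {
            for (int i = 0; i < files.Length; i++)
            {
                // File.ReadLines yields one line at a time instead of loading
                // the whole file. Keep the header from the first file only.
                int linesToSkip = i == 0 ? 0 : 1;
                foreach (string line in File.ReadLines(files[i]).Skip(linesToSkip))
                {
                    writer.WriteLine(line);
                }
            }
        }
    }
}
```

In LINQPad or a console app, a call would look like CsvCombiner.Combine(@"C:\data\csv", @"C:\data\combined.csv"), where both paths are placeholders. Because lines are copied as text without any parsing, this keeps the low overhead that made the C# approach faster than the PowerShell one.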
