Poor Man’s More

Chris Koester has a quick-and-easy file reader in a few lines of C#:

This post describes one way that you can read the top N rows from large text files with C#. This is very useful when working with giant files that are too big to open, but you need to view a portion of them to determine the schema, data types, etc.

I’ve used PowerShell many times to do this with large csv files, but in this example we’re going to use C# and look at the Wikipedia XML dump of pages and articles. The 3017-03-01 dump is very large and comes in at 59.5 GB.

I’ve had to write something similar before on Windows machines where I didn’t have access to more/less.  It’s really helpful for perusing the first few lines of gigantic log files.

Related Posts

Azure Functions Basics

Vincent-Philippe Lauzon explains the basics of Azure Functions: In general, serverless refers to an economical model where we pay for compute resources used as opposed to “servers”. Wait…  isn’t that what the Cloud is about? Well, yes, on a macro-scale it is, but serverless brings it to a micro-scale. In the cloud we can provision […]

Read More

Building Dynamic Row Headers With ML Services

Dave Mason tries to get around his RESULT SETS limitation when using SQL Server Machine Learning Services: The columns in the data frame clearly have names, but SQL Server isn’t using them. The data frame columns have types in R too (more on this in a moment). Now that makes me wonder about the data […]

Read More

Categories

March 2017
MTWTFSS
« Feb Apr »
 12345
6789101112
13141516171819
20212223242526
2728293031