Polybase With Compression

I have a post looking at Polybase support for different compression formats:

This is a very interesting set of results.  First, 7Zip archives do not work with the default encoding.  I’m not particularly surprised by this result: 7Zip support is relatively scarce across the board, and it’s a niche file format (though a very efficient one).

The next failure case is tar.  Tar is a weird case because Polybase missed the first row in the file but was able to collect the remaining 776 records.  The same goes for .tar.gz.  I unpacked the .tar file and confirmed that the constituent SecondBasemen.csv file did in fact have all 777 records, so it’s something weird about the codec.
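
To repeat that sanity check without unpacking the archive by hand, something like the following sketch might help. It uses Python's standard tarfile module to count the non-empty lines of one member of a .tar or .tar.gz file. The function name and the archive path in the usage note are hypothetical, and the sketch assumes the CSV has no header row and no embedded newlines inside quoted fields.

```python
import tarfile


def count_lines_in_tar_member(archive_path, member_name):
    """Count non-empty lines of one file stored inside a .tar or .tar.gz archive."""
    # Mode "r:*" lets tarfile detect gzip or bzip2 compression on its own.
    with tarfile.open(archive_path, mode="r:*") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            # extractfile returns None for members that are not regular files.
            raise ValueError(f"{member_name} is not a regular file in the archive")
        with extracted:
            # Assumption: one record per line, no header row.
            return sum(1 for line in extracted if line.strip())
```

Calling `count_lines_in_tar_member("SecondBasemen.tar", "SecondBasemen.csv")` (the archive name here is a placeholder) and comparing the result with the row count Polybase returns would show whether the archive or the reader is dropping the record.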

Stick to BZip2 and GZip if you’re using flat files.
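
As a rough sketch of producing those two formats from a flat file, the snippet below uses only Python's standard gzip and bz2 modules. The helper name `compress_flat_file` is made up for illustration, and uploading the result and defining the matching external file format in SQL Server are not shown.

```python
import bz2
import gzip
import shutil

# Map each format name to (opener function, file extension).
_CODECS = {"gzip": (gzip.open, ".gz"), "bzip2": (bz2.open, ".bz2")}


def compress_flat_file(source_path, fmt="gzip"):
    """Write a compressed copy of source_path next to it and return the new path."""
    opener, extension = _CODECS[fmt]
    target_path = source_path + extension
    # Stream the bytes across so large files never sit fully in memory.
    with open(source_path, "rb") as src, opener(target_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target_path
```

Both outputs compress the file directly, with no archive container wrapped around it, which is one difference from the tar case above.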


