I have a post on using Polybase with Azure SQL Data Warehouse:
That’s a header row, and I’m okay with it not making its way in. As a quick aside, I should note that I picked tailnum as my distribution key. The airplane’s tail number is unique to that craft, so there absolutely will be more than 60 distinct values, and as I recall, this data set didn’t have too many NULL values. After loading the 2008 data, I loaded all years’ data the same way, except selecting from dbo.Flights instead of Flights2008.
Click through for more details, including the CETAS statement, which I’d love to see in on-prem SQL Server.
Comments closed