PolyBase — SQL to SQL

I have a post covering PolyBase from SQL Server to SQL Server:

Historically, PolyBase has three separate external entities: external data sources, external file formats, and external tables. External data sources tell SQL Server where the remote data is stored. External file formats tell SQL Server what the shape of that data looks like—in other words, CSV, tab-separated, Parquet, ORC, etc. External tables tell SQL Server the structure of some data of a particular external file format at a particular external data source.

With PolyBase V2—connectivity with SQL Server, Cosmos DB, Oracle, Spark, Hive, and a boatload of other external data sources—we no longer need external file formats because we ingest structured data. Therefore, we only need an external data source and an external table. You will need SQL Server 2019 to play along and I’d recommend keeping up on CTPs—PolyBase is under active development so being a CTP behind may mean hitting bugs which have subsequently been fixed.

I want this to get even better, to the point where external tables are a no-brainer over linked servers in terms of performance.

Related Posts

PolyBase and External Column Names

I have another post looking at external columns on PolyBase V2 data sources: I’m going to use external two tables in this experiment. In the left corner, we have some ORC files stored in Azure Blob Storage which we’ll represent as FireIncidents2017. In the right corner, we have data stored in a remote SQL Server […]

Read More

PolyBase and Azul Zulu OpenJDK

I have a post looking at one of the more interesting changes in SQL Server 2019 CTP 3.2: One of the more interesting parts of SQL Server 2019 CTP 3.2’s release notes is the relationship between Microsoft and Azul Systems. Travis Wright covers it in some detail, as well as what it means for customers. Prior […]

Read More

Categories

May 2019
MTWTFSS
« Apr Jun »
 12345
6789101112
13141516171819
20212223242526
2728293031