Upcoming PolyBase Enhancements

James Serra knows how to get my interest:

PolyBase was first made available in Analytics Platform System in March 2013, and then in SQL Server 2016. The announcement at the PASS Summit was that by preview early next year, in addition to Hadoop and Azure blob storage, PolyBase will support Teradata, Oracle, SQL Server, and MongoDB in SQL Server 2016. And the Azure Data Lake Store will be supported in Azure SQL Data Warehouse PolyBase.

With SQL Server 2016, you can create a cluster of SQL Server instances to process large data sets from external data sources in a scale-out fashion for better query performance (see PolyBase scale-out groups).

I’m excited for the future of PolyBase and am looking forward to vNext and vNext + 1 (for the stuff that they can’t possibly get done in time for vNext).
