Forcing Polybase External Pushdown

I have a post showing how to control predicate pushdown in Polybase:

As a reminder, in order to allow predicate pushdown to occur, we need to hit a Hadoop cluster; we can’t use predicate pushdown on other systems like Azure Blob Storage.  Second, we need to have a resource manager link set up in our external data source.  Third, we need to make sure that everything is configured correctly on the Polybase side.  But once you have those items in place, it’s possible to use the FORCE EXTERNALPUSHDOWN command like so:

There’s also discussion of preventing MapReduce job creation as well as a pushdown-related error I had received in the past.

Related Posts

Testing Kafka Streams Applications

Yeva Byzek continues her series on testing Kafka-based streaming applications: When you create a stream processing application with Kafka’s Streams API, you create a Topologyeither using the StreamsBuilder DSL or the low-level Processor API. Normally, the topology runs with the KafkaStreams class, which connects to a Kafka cluster and begins processing when you call start(). For testing though, connecting to a running […]

Read More

Auto ML With SQL Server 2019 Big Data Clusters

Marco Inchiosa has a model scenario for using Big Data Clusters to scale out a machine learning problem: H2O provides popular open source software for data science and machine learning on big data, including Apache SparkTM integration. It provides two open source python AutoML classes: h2o.automl.H2OAutoML and pysparkling.ml.H2OAutoML. Both APIs use the same underlying algorithm implementations, […]

Read More

Categories

December 2016
MTWTFSS
« Nov Jan »
 1234
567891011
12131415161718
19202122232425
262728293031