Configuring Polybase

I have a blog post up on configuring Polybase:

Microsoft’s next recommendation is to make sure that predicate pushdown is enabled.  To do that, we’re going to go back to the Hadoop VM and grab our yarn.application.classpath from there.  To do that, cd to /etc/hadoop/conf/ and vi yarn-site.xml (or use whatever other text reader you want).  Copy the value for yarn.application.classpath, which should be a pretty long string.  Mine looks like:


Now that you have a copy of that value, go to your SQL Server installation directory (by default, C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf) and open up yarn-site.xml.  Paste the value into the corresponding yarn.application.classpath setting and you’re good to go.

This is part one of a series on using Polybase.

Related Posts

Flint: Time Series With Spark

Li Jin and Kevin Rasmussen cover the concepts of Flint, a time-series library built on Apache Spark: Time series analysis has two components: time series manipulation and time series modeling. Time series manipulation is the process of manipulating and transforming data into features for training a model. Time series manipulation is used for tasks like data […]

Read More

ElasticMapReduce And RStudio

Tanzir Musabbir demonstrates how to set up Amazon ElasticMapReduce to include an RStudio edge node: RStudio Server provides a browser-based interface for R and a popular tool among data scientists. Data scientist use Apache Spark cluster running on  Amazon EMR to perform distributed training. In a previous blog post, the author showed how you can install RStudio Server on Amazon […]

Read More


June 2016
« May Jul »