Configuring Polybase

I have a blog post up on configuring Polybase:

Microsoft’s next recommendation is to make sure that predicate pushdown is enabled.  To do that, we’re going to go back to the Hadoop VM and grab our yarn.application.classpath from there.  To do that, cd to /etc/hadoop/conf/ and vi yarn-site.xml (or use whatever other text reader you want).  Copy the value for yarn.application.classpath, which should be a pretty long string.  Mine looks like:

1
<value>$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>

Now that you have a copy of that value, go to your SQL Server installation directory (by default, C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf) and open up yarn-site.xml.  Paste the value into the corresponding yarn.application.classpath setting and you’re good to go.

This is part one of a series on using Polybase.

Related Posts

Running Hive LLAP As A YARN Service

Gour Saha, et al, demonstrate running Apache Hive LLAP as a YARN service: Making LLAP as a first-class YARN service also enables us to use some of the other powerful features in YARN that were added in Apache Hadoop 3.0 / 3.1, some of them are noted below. Advanced container placement scheduling such as affinity […]

Read More

Flattening JSON Data With Databricks

Ivan Vazharov gives us a Databricks notebook to parse and flatten JSON using PySpark: With Databricks you get: An easy way to infer the JSON schema and avoid creating it manually Subtle changes in the JSON schema won’t break things The ability to explode nested lists into rows in a very easy way (see the […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930