Polybase With HDP 2.5

I run into some issues with Polybase and Hortonworks Data Platform 2.5:

First, it’s interesting to note that the Polybase engine uses “pdw_user” as its user account.  That’s not a blocker here because I have an open door policy on my Hadoop cluster:  no security lockdown because it’s a sandbox with no important information.  Second, my IP address on the main machine is and the name node for my Hadoop sandbox is at  These logs show that my main machine runs a getfileinfo command against /tmp/ootp/secondbasemen.csv.  Then, the Polybase engine asks permission to open /tmp/ootp/secondbasemen.csv and is granted permission.  Then…nothing.  It waits for 20-30 seconds and tries again.  After four failures, it gives up.  This is why it’s taking about 90 seconds to return an error message:  it tries four times.

Aside from this audit log, there was nothing interesting on the Hadoop side.  The YARN logs had nothing in them, indicating that whatever request happened never made it that far.

Here’s hoping there’s a solution in the future.

Related Posts

Kafka Partitioning Strategies

Amy Boyle shares some thoughts on Kafka partitioning strategy: If you have enough load that you need more than a single instance of your application, you need to partition your data. The producer clients decide which topic partition data ends up in, but it’s what the consumer applications will do with that data that drives […]

Read More

Single-Node Hadoop 3 Installation

Mark Litwintschik has a fairly simple guide for installing Hadoop 3 on a single node for testing: This post is meant to help people explore Hadoop 3 without feeling the need they should be using 50+ machines to do so. I’ll be using a fresh installation of Ubuntu 16.04.2 LTS on a single computer. The […]

Read More


October 2016
« Sep Nov »