Polybase Setup Errors

Murshed Zaman on the Azure CAT team covers a number of Polybase configuration errors:

SSMS Error:

Any Select query fails with the following error.
Msg 106000, Level 16, State 1, Line 1
Java heap space

Possible Reason:

Illegal input may cause the java out of memory error.  In this particular case the file was not in UTF8 format. DMS tries to read the whole file as one row since it cannot decode the row delimiter and runs into Java heap space error.

Possible Solution:

Convert the file to UTF8 format since PolyBase currently requires UTF8 format for text delimited files.

I imagine that this page will get quite a few hits over the years, as there currently exists limited information on how to solve these issues if you run into them, and some of the error messages (especially the one quoted above) have nothing to do with root causes.

Related Posts

Databricks Runtime 5.2 Released

Nakul Jamadagni announces Databricks Runtime 5.2: Delta Time TravelTime Travel, released as an Experimental feature, adds the ability to query a snapshot of a table using a timestamp string or a version, using SQL syntax as well as DataFrameReader options for timestamp expressions.Sample codeSELECT count() FROM events TIMESTAMP AS OF timestamp_expressionSELECT count() FROM events VERSION AS OF version Time travel looks a bit like temporal tables in SQL Server.

Read More

Kafka And The Differing Aims Of Data Professionals

Kai Waehner argues that there is an impedence mismatch between data engineers, data scientists, and ML production engineers: Data scientists love Python, period. Therefore, the majority of machine learning/deep learning frameworks focus on Python APIs. Both the stablest and most cutting edge APIs, as well as the majority of examples and tutorials use Python APIs. […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930