Press "Enter" to skip to content

Category: Polybase

Issue with PolyBase and Cosmos DB

I found an issue with connecting to Cosmos DB from PolyBase after installing SQL Server 2019 CU2:

After upgrading to SQL Server 2019 CU2, I noticed some issues when trying to connect to a Cosmos DB collection via PolyBase. Specifically, I started getting the following error message:

Msg 105082, Level 16, State 1, Line 35
105082;Generic ODBC error: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at <<my Cosmos account name>>.documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15) Additional error <2>: ErrorMsg: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at <<my Cosmos account name>> .documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15), SqlState: HY000, NativeError: 110 .

Read on for a couple attempts at a solution and some more detail.

Comments closed

PolyBase Bug Around Windows Authentication

I have a post documenting a bug in SQL Server 2019:

Here’s the short version of the bug. If you are connected using a Windows authenticated account and attempt to perform a PolyBase-related action, such as creating an external data source or querying from an external table, you receive the following error:

Msg 46721, Level 20, State 1, Line 5
Login failed. The login is from an untrusted domain and cannot be used with Integrated authentication.

Because this is an error with a severity level of 20, it kills your session.

Click through for the workaround. I had hoped that this would have been fixed with CU3, but it’s still in there.

Comments closed

PolyBase and Excel

I have a post on setting up PolyBase to work with Microsoft Excel:

If you tried to use Microsoft’s Excel driver prior to 2019 CU2, you’d get the following error:

Msg 105082, Level 16, State 1, Line LineNumber
105082;Generic ODBC error: [Microsoft][ODBC Excel Driver]Optional feature not implemented

To this point, I recommended in PolyBase Revealed that you use a different driver, like CData’s, which did work. CData’s driver still works (I assume…PolyBase ODBC support is a fluid situation, it seems), but now I can officially say that PolyBase supports the Microsoft Access Database Engine Redistributable driver for Microsoft Excel. Let’s go to the tape.

Click through for the instructions.

Comments closed

Investigating the Big Data Cluster Data Pool

Mohammad Darab takes us through Big Data Cluster data pools:

Data pools enable the creation of scale-out data marts. Whether your data is being ingested from Spark jobs or SQL, it is stored into the data pool. Data is distributed across one, or two, SQL Server instances running queries against it is more efficient.

Whether the data is being ingested from IoT device, Kafka, another relational data source (like Oracle or Teradata), it all is stored into the data pool instances and are available as “data marts” for the consumer to work with. There is no need to go back out to the original data source each time you want to query the data. It is all available inside the data pool instances.

This lets you cache data brought in via PolyBase and spread it across a number of instances. That’s pretty powerful.

Comments closed

Diagnosing PolyBase Errors

Niels Berglund takes us through an odd “incorrect syntax” error with PolyBase:

What we see in Figure 1incorrect syntax exception, is strange, as I have executed the same code in a SQL Server 2019 Big Data Cluster, (BDC), without any issues, and the forum poster executed the same in SQL Server 2019 Enterprise Edition also without any issues.

Ok, but what about creating an external table against a relational data source – where we do not need to define an external file format?

There is a straightforward answer as to why the specific error message pops up, but I agree with Niels that it’d be nice to have “here’s the problem and here’s the solution” types of error messages. The deeper you get into the product—especially the older Hadoop external data source—the worse the error messages get.

Comments closed

When PolyBase Startup Fails in SQL Server 2019

Niels Berglund hits an annoying error after installing PolyBase on SQL Server 2019:

At MS Ignite in Orlando November 4 – 8, 2019, Microsoft announced the general availability of SQL Server 2019. At the same time, the SQL Server 2019 Developers Edition appeared as an MSDN download, and of course, I downloaded it and installed it on my dev box.

After the installation, I noticed that PolyBase did not start up correctly, and I saw dump files all over the place. After some investigation, I figured out what the issue was, and this blog post describes the fix.

This only affects Developer and Express editions, not Standard or Enterprise.

Comments closed

PolyBase and Dockerized Hadoop

I have a solution to a problem which vexed me for quite some time:

Quite some time ago, I posted about PolyBase and the Hortonworks Data Platform 2.5 (and later) sandbox.

The summary of the problem is that data nodes in HDP 2.5 and later are on a Docker private network. For most cases, this works fine, but PolyBase expects publicly accessible data nodes by default—one of its performance enhancements with Hadoop was to have PolyBase scale-out group members interact directly with the Hadoop data nodes rather than having everything go through the NameNode and PolyBase control node.

Click through for the solution.

Comments closed

PolyBase and External Column Names

I have another post looking at external columns on PolyBase V2 data sources:

I’m going to use external two tables in this experiment. In the left corner, we have some ORC files stored in Azure Blob Storage which we’ll represent as FireIncidents2017. In the right corner, we have data stored in a remote SQL Server instance which we’ll call LineItem. The data doesn’t really matter that much, but to give you an idea of where we’re going, I’ll show each table. 

There’s quite a bit you can do here.

Comments closed

PolyBase and Azul Zulu OpenJDK

I have a post looking at one of the more interesting changes in SQL Server 2019 CTP 3.2:

One of the more interesting parts of SQL Server 2019 CTP 3.2’s release notes is the relationship between Microsoft and Azul Systems. Travis Wright covers it in some detail, as well as what it means for customers.

Prior to SQL Server 2019 CTP 3.2, installing PolyBase required an installation of Oracle’s Java Runtime Environment 7 Update 51 or higher, either directly from Oracle or through OpenJDK.

Java is still required if you want to read from or write to Hadoop or Azure Blob Storage. Oracle’s flavor of Java is no longer required, however.

Comments closed