An Intro to SQL Server Big Data Clusters

Mohammad Darab has a series on Big Data Clusters. Part zero explains what they are:

The *absolute* unique feature that BDCs offer that *no* other company or product offers is: Data Virtualization

The new and enhanced SQL Server 2019 Polybase feature comes with connectors to many different data sources: Oracle, Teradata, Apache Spark, MongoDB, Azure Cosmos DB, and even ODBC connectivity to IBM’s DB2, SAP HANA and excel (see image below)

Part one shows how to set one up:

So far, Microsoft does not have a simple way to create a Big Data Cluster. It’s a bit cumbersome of a process and the learning curve is a bit steep. However, Microsoft is currently working on making it easier to deploy a Big Data Cluster via Notebook in Azure Data Studio and eventually some type of “deployment wizard.” But for now, the only option is to do it the long way.

The series will continue, but check out the setup work.

Related Posts

Creating Big Data Clusters with Azure Data Studio

Niels Berglund takes us through the creation of a Big Data Cluster by using Azure Data Studio to generate a notebook: I wrote a blog post back in November 2018, about how to install and deploy SQL Server 2019 Big Data Cluster on Azure Kubernetes Service. Back then SQL Server 2019 Big Data Cluster was in private preview, (CTP 2.1 I […]

Read More

Develop BDC PySpark Jobs in Visual Studio Code

Jenny Jiang announces a new capability in Visual Studio Code: With the Visual Studio Code extension, you can enjoy native Python programming experiences such as linting, debugging support, language service, and so on. You can run current line, run selected lines of code, or run all for your PY file. You can import and export a .ipynb notebook and perform […]

Read More

Categories

June 2019
MTWTFSS
« May Jul »
 12
3456789
10111213141516
17181920212223
24252627282930