Press "Enter" to skip to content

Category: Synapse Analytics

External Tables and the Serverless SQL Pool

Ryan Adams continues a series on querying the serverless SQL pool in Azure Synapse Analytics:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the second method which uses an external table to query a path within the lake.

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  Here you will use the Synapse Serverless Pool to query the data in your ADLS account.   

Read on for a demonstration.

Comments closed

Reading the Data Lake with the Serverless Pool via OPENROWSET

Ryan Adams begins a series on reading data from the data lake:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the first method which uses OPENROWSET to query a path within the lake. 

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  The article will focus on how you can use the Synapse Serverless Pool to query the data in your ADLS account.   

Click through for a primer on the topic, as well as a demo video.

Comments closed

CETAS in the Serverless Pool and Blob Storage Variants

Dennes Torres gives us a warning:

While making some CETAS tests, I discovered an interesting new behaviour. The following error message was displayed and it was very strange:

Msg 16539, Level 16, State 1, Line 1
Operation failed since the external data source ‘https://euwe01devqigsa01.blob.core.windows.net/dennes/filescsv/’ has underlying storage account that is not in the list of Outbound Firewall Rules on the server. Please add this storage account to the list of Outbound Firewall Rules on your server and retry the operation.

Read on for the cause of this error message.

Comments closed

Spark Structured Streaming with Synapse

Ryan Adams builds a demo:

In this post we are going to look at an example of streaming IoT temperature data in Synapse Spark.  I have an IoT device that will stream temperature data from two sensors to IoT hub. We’ll use Synapse Spark to process the data, and finally write the output to persistent storage. Here is what our architecture will look like: 

Click through for the architectural diagram and step-by-step on how to put the demo together.

Comments closed

Creating a DNS Alias for a Dedicated SQL Pool

Resham Popli creates an alias:

This blog will walk through the details on how to enable custom DNS (Domain Name System) entries on an Azure Synapse dedicated pool inside an Azure Synapse workspace in case of disaster recovery.

The DNS alias provides a translation layer that can redirect your client programs to different servers. This layer spares you the difficulties of having to find and edit all the clients and their connection strings (in disaster recovery implementation). This is not supported out-of-box, so we need to take extra steps to enable this feature. There are some limitations, so please read these steps carefully.

Read on to learn how and what those limitations are.

Comments closed

Executing Multiple Notebooks in one Spark Pool with Genie

Shalu Ganotra Chadra, et al, explain what Synapse Genie is:

The Genie framework is a metadata driven utility written in Python. It is implemented using threading (ThreadPoolExecutor module) and directed acyclic graph (Networkx library). It consists of a wrapper notebook, that reads metadata of notebooks and executes them within a single Spark session. Each notebook is invoked on a thread with MSSparkutils.run() command based on the available resources in the Spark pool. The dependencies between notebooks are understood and tracked through a directed acyclic graph.

Read on for more information about how you can use it and what the setup process looks like.

Comments closed

Cosmos DB to Data Explorer Synapse Link

Vicent-Philippe Lauzon makes an announcement:

We recently made our new Kusto data connection available in public preview:  Cosmos DB to Azure Data Explorer Synapse Link.

This does look like a marketing-heavy announcement but the short version is that you can ingest data from Cosmos DB into Data Explorer pools via Synapse Link rather than creating your own ETL process. The previous Cosmos DB connector for Synapse Link tied to a dedicated SQL pool.

Comments closed

Finding Data Factory Objects in Synapse Studio

Kevin Chant pulls out the magnifying glass and compass:

In this post I want to cover where you can find Azure Data Factory objects in Synapse Studio. I want to do this post for a couple of reasons.

First reason is that at the start of the year I published a post on how to automate a Data Factory pipeline migration to an Azure Synapse Analytics workspace using Azure DevOps.

Even though I showed one way that you can automate the migrations of a Data Factory to a Synapse workspace I did not show where you can view them in Synapse Studio.

Second reason is because I keep telling everybody they can use the same pipelines in Azure Synapse Analytics but the objects can be found in different places.

Read on for the two places to find Data Factory objects.

Comments closed

Sharing Results between Notebooks with MSSparkUtils

Liliam Leme provides an answer to a common Synapse Spark pool question:

I’ve been reviewing customer questions centered around “Have I tried using MSSparkUtils to solve the problem?”

One of the questions asked was how to share results between notebooks. Every time you hit “run” in a notebook, it starts a new Spark cluster which means that each notebook would be using different sessions. Making it impossible to share results between executions of notebooks. MSSparkUtils offers a solution to handle this exact scenario. 

Read on to see what MSSparkUtils is and how it helps in this case.

Comments closed