Press "Enter" to skip to content

Category: Synapse Analytics

Disabling Public Network Access in Synapse

Ryan Adams builds a private endpoint:

If you disable public access to your Azure Synapse Workspace you will get the following error message when attempting to open Synapse Studio. 

“Failed to load one or more resources due to forbidden issue, error code 403.” 

Click through for more information about routing for Synapse resources and what you’d need to do in order to disable public network access entirely.

Comments closed

Power BI Incremental Refresh with Non-Standard Dates in Parquet Files

Shabnam Watson hits on a specific but interesting use case:

The most common scenario for setting up the out of the box incremental refresh in Power BI is to base it off of a datetime column; however, there are cases when you may want to set up incremental refresh based off of a column with a data type other than datetime. Examples are when you are working with a smart date ID (01012023 for Jan 1,2023) column or when you are working with a source system that has partitioned data using a column such as Year that has a numeric data type.

A use case for the latter scenario is when you are working with Parquet/Delta files via Azure Synapse Analytics Serverless SQL Pool. When working with larger datasets, it is typical to see the Parquet/Delta files partitioned by date ranges. Depending on how much data there is, the partitioning may be at the Year level instead of Day.

With that scenario in mind, read on to learn how you can minimize your Power BI processing time and costs when doing incremental refresh.

Comments closed

Azure Synapse Analytics January 2023 Updates

Ryan Majidimehr shares an overview of what the Synapse team has been working on:

ADX contains native support for detecting anomalies over multiple time series by using the function series_decompose_anomalies(). This function can analyze thousands of time series in seconds, enabling near real time monitoring solutions and workflows based on ADX. Univariate analysis is simpler, faster, and easily scalable and is applicable to most real-life scenarios. However, there are some cases where it might miss anomalies that can only be detected by analyzing multiple metrics at the same time.

For some scenarios, there might be a need for a true multivariate model, jointly analyzing multiple metrics. This can be achieved now in ADX using the new Python-based multivariate anomaly detection functions.

The themes for this month are Spark, Data Exploration (via Kusto), and data integration.

Comments closed

External Tables and the Serverless SQL Pool

Ryan Adams continues a series on querying the serverless SQL pool in Azure Synapse Analytics:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the second method which uses an external table to query a path within the lake.

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  Here you will use the Synapse Serverless Pool to query the data in your ADLS account.   

Read on for a demonstration.

Comments closed

CI/CD for the Synapse Serverless SQL Pool

Kevin Chant has some links for us:

Last week the January 2023 video of the Azure Synapse Analytics and Microsoft MVP series was released. Where I covered how to do CI/CD for dedicated SQL Pools in Azure Synapse Analytics.

Since the release of that video the most popular question raised has been how to perform CI/CD for serverless SQL Pools within Azure Synapse Analytics?

Read on for links galore.

Comments closed

Reading the Data Lake with the Serverless Pool via OPENROWSET

Ryan Adams begins a series on reading data from the data lake:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the first method which uses OPENROWSET to query a path within the lake. 

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  The article will focus on how you can use the Synapse Serverless Pool to query the data in your ADLS account.   

Click through for a primer on the topic, as well as a demo video.

Comments closed

CETAS in the Serverless Pool and Blob Storage Variants

Dennes Torres gives us a warning:

While making some CETAS tests, I discovered an interesting new behaviour. The following error message was displayed and it was very strange:

Msg 16539, Level 16, State 1, Line 1
Operation failed since the external data source ‘https://euwe01devqigsa01.blob.core.windows.net/dennes/filescsv/’ has underlying storage account that is not in the list of Outbound Firewall Rules on the server. Please add this storage account to the list of Outbound Firewall Rules on your server and retry the operation.

Read on for the cause of this error message.

Comments closed

Spark Structured Streaming with Synapse

Ryan Adams builds a demo:

In this post we are going to look at an example of streaming IoT temperature data in Synapse Spark.  I have an IoT device that will stream temperature data from two sensors to IoT hub. We’ll use Synapse Spark to process the data, and finally write the output to persistent storage. Here is what our architecture will look like: 

Click through for the architectural diagram and step-by-step on how to put the demo together.

Comments closed

Creating a DNS Alias for a Dedicated SQL Pool

Resham Popli creates an alias:

This blog will walk through the details on how to enable custom DNS (Domain Name System) entries on an Azure Synapse dedicated pool inside an Azure Synapse workspace in case of disaster recovery.

The DNS alias provides a translation layer that can redirect your client programs to different servers. This layer spares you the difficulties of having to find and edit all the clients and their connection strings (in disaster recovery implementation). This is not supported out-of-box, so we need to take extra steps to enable this feature. There are some limitations, so please read these steps carefully.

Read on to learn how and what those limitations are.

Comments closed