Press "Enter" to skip to content

Category: Synapse Analytics

Configuring Classifiers for Workload Management

Reiss McSporran continues a series on workload management in Azure Synapse Analytics:

So in part 1 we found out what Workload Management was and why we wanted to use it. (If you haven’t read that part, please click here and read that post before you carry on, as we’re jumping in at the deep end!) In this part we’ll be learning how to configure it.

Continuing from the end of the previous post, you have two main options to configure this. First is to assign a user or role to one of the existing system defined workload groups, second is to create a custom workload group and assign a user or role to this instead.

Read on to learn how to create classifiers, what the rules look like for them, and their importance.

Comments closed

Querying Private Blob Storage Containers with Azure Synapse Analytics

Dennes Torres looks at some private information:

The queries from the previous article were made against the public container in the blob storage. However, if the container is private, you will need to authenticate with the container. In this article, you’ll learn how to query private blob storage with SQL.

NOTE: Be sure that the Azure Synapse Workspace and the storage account with the sample files are set up before following along with this article. You will also need to replace your storage account URL each time that a storage account URL is used in the article.

There are three possible authentication methods, and these methods may have some variation according to the type of storage account and the access configuration. I will not dig into details about storage here and leave that for a future article.

Read on for the three authorization methods and a lot of detail on using SAS tokens (the preferred method) to access this data.

Comments closed

A Review of Serverless SQL Pools

Teo Lachev lays out a review of serverless SQL pools in Azure Synapse Analytics:

Being able to query files using SQL is great. Some of you might remember the U-SQL language that was introduced a few years ago alongside Azure Data Lake Storage (ADLS) Gen 1 which is now deprecated. It never caught up because it looked like SQL but it wasn’t (it was actually closer to C#). Now we’re talking about real SQL. To query files! This opens the possibility to implement a logical warehouse (the emphasis is on logical as everyone to my knowledge who tried to replace a data warehouse with a data lake has failed). Or, you can connect Power BI to the serverless endpoint and start querying all these files in DirectQuery mode. So, this enables real-time BI on top of file extracts.

Click through for the pros and cons of using serverless SQL pools today.

Comments closed

Moving Artifacts between Folders in Synapse Studio

Wolfgang Strasser looks at a recent update:

Another small but very powerful usability extension in Azure Synapse Studio was added at the beginning of June: Move artifacts across folders in Synapse Studio (without extra clicks but with drag&drop)

Once again, the release notes list contained the short sentence that made me curious… hmm… that sound nice… In one of my previous post, I described the “old” way of moving artifacts around in Synapse Studio.

Click through for a demonstration.

Comments closed

Workload Management in Azure Synapse Analytics

Reiss McSporran starts a new series on workload management in Azure Synapse Analytics:

So you’ve started using Synapse SQL Pools, you’ve scaled up to improve performance, but your queries aren’t going as fast as you’d hoped. You take a look at resource utilisation and you see that as you scale up, your total resource utilisation per query goes down. But didn’t we scale up to allow our query to use more resource and run quicker? What’s going on here? A little thing called Workload Management.

What is Workload Management? At a very high level, Workload Management allows you to control how your workload uses system resources. The main problem here is defaults!

Click through for the full story. Or at least the full part 1 of the even fuller story.

Comments closed

Creating a dacpac for a Dedicated SQL Pool

Kevin Chant shows how to use Azure DevOps to create a dacpac for an Azure Synapse Analytics dedicated SQL pool:

By the end of this post, you will know how to create a dacpac for a dedicated SQL Pool within Azure Pipelines for your CI/CD deployments. Plus, how you can synchronize a database project created in Azure Data Studio with a Git repository in Azure DevOps.

In a previous post I covered how you can create a dacpac for an Azure Synapse Analytics dedicated SQL Pool using Azure Data Studio. In that post I stated that you could create a dacpac for the database project using Azure DevOps.

With this in mind, I will use the same database project that I created in that post.

Click through for the process.

Comments closed

Improving Parquet External Table Performance in Dedicated SQL Pools

Jovan Popovic shows us two ways of accessing data in Parquet files in Azure Synapse Analytics dedicated SQL pools:

Azure Synapse Analytics enables you to read Parquet files stored in the Azure Data Lake storage using the T-SQL language and high-performance Parquet readers. The key characteristic of these high-performance Parquet readers is that they are using the native (C++) code for reading Parquet files, unlike the existing Polybase Parquet reader technology that uses the Java code. These native readers are introduced in the serverless SQL pools in Azure Synapse Analytics workspaces.

In many experiments, this native technology that is used in the serverless SQL pools demonstrated better performance compared to the existing Polybase external table in the dedicated SQL pools.

This native technology for reading Parquet files is now also available in the dedicated SQL pools. In the dedicated Pools in Azure Synapse Analytics, you can create external tables that use native code to read Parquet files and improve performance of your queries that access external Parquet files.

Click through for the process, as well as what kind of performance differences you can see. Some of the queries ended up being worse for native tables versus PolyBase tables, but the majority were a good bit better.

Comments closed

Accessing ADLS Gen2 Data in Serverless SQL Pools with SAS Tokens

Neel Ball shows how you can use various techniques, including SAS tokens, to access data stored in Azure Data Lake Storage Gen2 from Azure Synapse Analytics serverless SQL pools:

You have a data lake that contains employee and social feed data. You have data residing in an employee folder that is used by HR team members and twitter for live social feeds that is usually used by marketing folks. If you use SAS token or RBAC, you cannot control to the folder level.

How do you allow users to perform data exploration using synapse serverless with fine grain control on underlying storage.

Read on for one solution.

Comments closed

Creating a dacpac for a Dedicated SQL Pool

Kevin Chant shows us how to build out a dacpac file for an Azure Synapse Analytics dedicated SQL Pool:

In reality, you can create a dacpac for a database that’s inside an Azure Synapse Analytics dedicated SQL Pool using a lot of the methods that you use to create them for SQL Server databases.

Azure Data Studio can be an appealing alternative SQL Server Data Tools (SSDT) for tasks like this. Due to various reasons. For instance, it’s a multi-platform solution that is easy to install.

With this in mind, I decided in this post to cover how to create a dacpac for an Azure Synapse Analytics dedicated SQL Pool using Azure Data Studio.

Click through to see how.

Comments closed