Press "Enter" to skip to content

Curated SQL Posts

Untrusted Shared Access Signature Certificates in SQL Server

William Assaf diagnoses an issue:

If you’ve tried doing Backup to URL with SQL Server using a Shared Access Signature (SAS) certificate and received this error:

Error: 18204, Severity: 16, State: 1.BackupDiskFile::CreateMedia: Backup device ‘https://account.blob.core.windows.net/container/folder/msdb_log_202005170801.trn’ failed to create. Operating system error 50(The request is not supported.).Cannot open backup device ‘https://account.blob.core.windows.net/container/folder/msdb_log_202005170801.trn’. Operating system error 50(The request is not supported.). [SQLSTATE 42000] (Error 3201) BACKUP LOG is terminating abnormally. [SQLSTATE 42000] (Error 3013). NOTE: The step was retried the requested number of times (1) without succeeding. The step failed. 

You may have received the same error I encountered.

This error popped up only after startup of SQL Server. To resolve the problem, we’d recreate the SAS key, using the same cert in the same script, and the backups would start working again. This affected all types of SQL database backups.

William did some troubleshooting and explains the core problem. It’s a weird one.

Comments closed

Updated SQL Server Diagnostic Queries

Glenn Berry has an updated set of DMV queries for us:

These are my SQL Server Diagnostic Information Queries for June 2020, aka my DMV Diagnostic Queries. They allow you to get a very comprehensive view of the configuration and performance of your SQL Server instance in a short amount of time. There are separate versions of these T-SQL queries for SQL Server 2005 through SQL Server 2019. I also have separate versions for SQL Managed Instance and Azure SQL Database. My diagnostic queries have been used by many people around the world since 2009. I make regular improvements to these queries each month.

This is one of my favorite methods for learning about a new SQL Server instance.

Comments closed

The Rollup and Cube Operators

Greg Dodd digs into the ROLLUP and CUBE operators in a two-parter. First, ROLLUP:

As you can see, we now have these null’s popping up, but with totals. Row 5 for example, tells us that in 2017 there were 1,427,461 people living in Hawaii. Row 11 tells us that there are 2,438,188 people living in Rhode Island and Hawaii in 2017. Row 22 tells us that there were 2,429,070 people living in Rhode Island and Hawaii in 2018, and finally row 23 tells us that in total there have been 4,867,268 people in 2017 and 2018. This last row is a bit useless for this data as the overlap of those people would be huge, but for something like sales data, this number could be useful.

Next, CUBE:

For those with a keen eye you’ll see that I’ve started at row 28 in that screenshot. When we run the GROUP BY without ROLLUP or CUBE we get just 16 rows. With ROLLUP that grows to 23, but with CUBE it explodes out to 57. Why?

I’ve used ROLLUP several times with proper hierarchical data (e.g., product category, product sub-category, product) and it does an excellent job of summarizing that sort of data. CUBE has always returned too many rows for my liking. But the operator I go to most frequently is GROUPING SETS, as then I get to control the levels.

Comments closed

Fun with Randomness

Holger von Jouanne-Diedrich takes us through some random thoughts:

Many a student thinks that the first pic was created by some underlying pattern (because of its points clumping together in some areas while leaving others empty) and that the second one is “more” random. The truth is that technically both are not random (but only pseudo-random) but the first resembles “true” randomness more closely while the second is a low-discrepancy sequence.

While coming to the point of pseudo-randomness in a moment “true” randomness may appear to have a tendency to occur in clusters or clumps (technically called Poisson clumping). This is the effect seen (or shall I say heard) in the iPod shuffling function. Apple changed it to a more regular behaviour (in the spirit of the second picture)… which was then perceived to be more random (as with my students)!

Read the whole thing.

Comments closed

Dynamic Partition Pruning in Apache Spark 3.0

Anjali Sharma walks us through a nice improvement in Spark SQL coming with Apache Spark 3.0:

Partition pruning in Spark is a performance optimization that limits the number of files and partitions that Spark reads when querying. After partitioning the data, queries that match certain partition filter criteria improve performance by allowing Spark to only read a subset of the directories and files. When partition filters are present, the catalyst optimizer pushes down the partition filters. The scan reads only the directories that match the partition filters, thus reducing disk I/O.

However, in reality data engineers don’t just execute a single query, or single filter in their queries, and the common case is that they actually have dimensional tables, small tables that they need to join with a larger fact table. So in this case, we can no longer apply static partition pruning because the filter is on one side of the join, and the table that is more appealing and more attractive to prune is on the other side of the join. So, we have a problem now.

And that’s where dynamic partition pruning comes into play.

Comments closed

Migrating from Azure SQL DB to an Azure SQL Managed Instance

Eitan Blumin walks us through the process of moving from Azure SQL Database to an Azure SQL Managed Instance:

What we cannot do:

– You cannot backup and restore from Azure SQL DB to a Managed Instance.
– You cannot use Azure SQL DB as a source in Azure Data Migration Service (DMS).
– You cannot use Azure SQL DB as a source in Data Migration Assistant (DMA).
– You cannot use Azure SQL DB as a source in DB Mirroring, AlwaysOn, Transaction Log Shipping, or Replication.
– Implementing a custom T-SQL, SSIS, or ADF (Azure Data Factory) solution will be too complicated, and will require an unacceptable development overhead.

That last one is a bit iffy, though Eitan’s two solutions are going to be easier than a custom solution.

Comments closed

Power BI Report Page Access Control

Gilbert Quevauvilliers walks us through access control on Power BI pages:

With the update to the latest version in Power BI Desktop there is now the capability to control which users will see which pages in a report.

This is achieved using a combination of Row Level Security (RLS) and conditional formatting. This makes it a secure way of controlling access for specific users. I will demonstrate how this works below.

This is a clever solution to the problem of access control.

Comments closed

Pulling GROUP BY Above a Join

Paul White has fun with a SQL Server optimization:

When the optimizer moves a Group By above a Join it has to preserve the semantics. The new sequence of operations must be guaranteed to return the same results as the original in all possible circumstances.

One cannot just pick up a Group By and arbitrarily move it around the query tree without risking incorrect results.

Read on for the demo, including a couple ingenious tricks the optimizer has up its sleeve.

Comments closed

Optimizing Power Query Merges

Chris Webb wants to make your joins in Power Query faster:

The first question I decided to investigate was this:

Does the number of columns in a table affect the performance of a merge?

First of all, I created two identical queries called First and Second that connected to the CSV file, used the first row from the file as the headers, and set the data types to all seven columns to Whole Number.

Click through for the answer to this question. Chris promises a series out of this and I would expect there to be enough content for that.

Comments closed

The Power BI Pro’s Guide to Azure Synapse Analytics

Brett Powell gives a quick overview of a new whitepaper:

The Power BI Professional’s Guide to Azure Synapse Analytics, a white paper I wrote describing the Synapse Analytics platform and its benefits and use cases for Power BI professionals, was published a couple weeks ago. This post discusses a few themes from this paper and also shares a couple notes that weren’t included.

There are some interesting notes in here, so check those out and also get ahold of the whitepaper to understand how Power BI relates to the artist formerly known as Azure SQL Data Warehouse.

Comments closed