Press "Enter" to skip to content

Curated SQL Posts

The Value of Mirroring in Microsoft Fabric

Nikola Ilic talks mirroring:

First things first. Before I show you how to leverage this feature in Microsoft Fabric, let’s first explain the feature itself.

But, before we explain the feature itself, we need to go one step back and examine the key logic behind the Microsoft Fabric workloads, so that you understand the full context of the Mirroring importance.

Take that context and then you get an idea of how mirroring becomes so important for the Microsoft Fabric experience.

Comments closed

Dealing with Parameter Sniffing using Multiple Execution Plans

Andy Brownsword deals with statistical skew in the data:

Dynamic SQL has many uses and one of these can help us fix Parameter Sniffing issues. Here we’ll look at how it can be used to generate multiple execution plans for the same query.

Parameter sniffing is a common issue. Even for simple queries we can run into suboptimal plans being produced. There are multiple ways we can use Dynamic SQL to solve this challenge. Here we’ll demonstrate one technique: Comment Injection.

My one note about a good post (other than, you should read it) is that parameter sniffing is not itself a bad thing. 95%+ of the time, it’s a great thing. It’s that last 5% or so that give it a bad name.

Comments closed

Comparing pgvector and Postgres ARRAY

Ernst-Georg Schmid makes a comp based on a mass spectrometry database:

As said in the introduction, mass spectrometry is one, if not the tool to identify unknown compounds, to quantify known compounds, and to determine the structure of molecules. But it is a lot of work, and you need reference spectra to compare against.

So, there are curated databases of validated spectra available, like MassBank JapanMassBank Europe and the NIST mass spectral libraries. Laboratories might also want to store their own libraries for future use.

However, such databases often come in their own formats and with their own retrieval software. If you need to efficiently connect spectra to other data, e.g. chemical structures or genomic data, this calls for central management and a common API.

Read on to see the comparison of the pgvector extension versus built-in functionality with ARRAY.

Comments closed

Workspace Folders in Microsoft Fabric

Koen Verbeeck double-checks the calendar:

That’s right, this is not an April Fool’s Joke! The most anticipated feature of Microsoft Fabric has arrived! I’m not talking about decent CI/CD support, or OneSecurity. Nope, this is all about the ability to create folders in your workspaces! Very important, since Fabric is a centralized SaaS data platform that allows you to create a gazillion different objects, but until now you had now way of actually organizing them.

To give you an idea about how many objects, this is what the filter currently shows (and some items are missing, like Eventhouse):

This is big. Even on a small proof of concept that I worked on, the lack of folders was annoying. On a full project, the pain becomes worse. Granted, it’s in public preview, so it might not be available to everybody right off the bat, but it’s certainly a step in the direction of usefulness.

Comments closed

Maintaining Dynamic IP Rules for Azure Network Security Groups

Daniel Hutmacher shares a couple scripts:

Recently, my home ISP has started changing my public IP address. This causes me some headache because I have a couple of Azure Network Security Group rules (think of them as firewall rules) that specifically allow my home IP access to all of my Azure resources. When my home IP changes, those rules have to be updated accordingly.

So I made a PowerShell-based solution to automatically maintain them.

Read on for the process.

Comments closed

Finding Duplicate Statistics in SQL Server

Jose Manuel Jurado Diaz searches for clones:

Some time ago, we encountered a support case where a customer experienced significant delays in updating auto-created and user-created statistics. I would like to share the insights gained from this experience, including the underlying causes of the issue and the potential solutions we identified and implemented to address the problem effectively.

Read on for a demo to set up the scenario and the cause of the problem, as well as how to fix it.

Comments closed

Quantile Normalization in R

Steven Sanderson has achieved normality:

Before we dive into the code, let’s understand the concept behind quantile normalization. At its core, quantile normalization aims to equalize the distributions of multiple datasets by aligning their quantiles. This ensures that each dataset has the same distribution of values, making meaningful comparisons possible.

This is a bit different from normalizing individual data points in one dataset, as you can see in the post.

Comments closed

Power BI Paginated Reports over Excel and Web Services

Chris Webb shares an announcement:

By far the most exciting announcement for me this week was the new release of Power BI Report Builder that has Power Query built in, allowing you to connect to far more data sources in paginated reports than you ever could before. There’s a very detailed blog post and video showing you how this new functionality works here:

Given that this seems to be the spiritual replacement for SSRS, it’s good to see them still working on it.

Comments closed

SSMS 20 Minus Azure Data Studio

Erik Darling makes an observation:

I was quite publicly mystified by the coupling of Azure Data Studio into the SQL Server Management Studio installer. They’re different tools for different people.

This isn’t a tirade against Azure Data Studio, nor is it a victory lap since apparently the feedback item I posted was part of the decision to remove it.

This is purely informational, since the announcement that it’s not in there anymore is nearly as quiet as the announcement that it was being included back in SQL Server Management Studio 18.7, and only slightly louder than the availability of a command line switch to skip installing it.

I say this as someone whose primary SQL tool is Azure Data Studio: good. There’s a lot of functionality overlap between the two but it’s better making both of them optional.

Also, I’m subscribed to that same RSS feed Erik mentioned and fully agree with Erik’s assessment of Erin. Even if she is a Michigan fan.

Comments closed