Curated SQL – Page 257 – A Fine Slice Of SQL Server

Finding Row Counts in SQL Server

Published 2024-04-03 by Kevin Feasel

Today, I was working with SQL Server to get row counts from several tables so I thought I’d be smart and work with some functions in SQL Server to make it smarter / easier.

Now, if I am truly only getting “straight” row counts from these tables, I would be able to create a query like the below that would provide the answers with no problem:

Read on for the normal approach, as well as a more complicated approach made necessary due to some business logic requirements.

Date Calculation Bug in Power Query ODBC Code

Published 2024-04-03 by Kevin Feasel

Meagan Longoria files a report:

I was working on an imported Power BI semantic model, adding some fiscal year calculations to my date table. The date table was sourced from a view in Databricks Unity Catalog. I didn’t have access to add more fields to the view, so I was adding the fields in Power Query first, with plans to request they be added to the view in the future. I got some unexpected results, which turned into a bug being logged for the ODBC code for Power Query.

If you are only analyzing data in the last 20 years, you won’t see this bug. But if you are doing long-term analysis including years before 2000, you might just run into it.

Read on to see the bug, how you can replicate it, and three workarounds you can use to avoid it.

The Value of Mirroring in Microsoft Fabric

Published 2024-04-03 by Kevin Feasel

Nikola Ilic talks mirroring:

First things first. Before I show you how to leverage this feature in Microsoft Fabric, let’s first explain the feature itself.

But, before we explain the feature itself, we need to go one step back and examine the key logic behind the Microsoft Fabric workloads, so that you understand the full context of the Mirroring importance.

Take that context and then you get an idea of how mirroring becomes so important for the Microsoft Fabric experience.

Dealing with Parameter Sniffing using Multiple Execution Plans

Published 2024-04-03 by Kevin Feasel

Andy Brownsword deals with statistical skew in the data:

Dynamic SQL has many uses and one of these can help us fix Parameter Sniffing issues. Here we’ll look at how it can be used to generate multiple execution plans for the same query.

Parameter sniffing is a common issue. Even for simple queries we can run into suboptimal plans being produced. There are multiple ways we can use Dynamic SQL to solve this challenge. Here we’ll demonstrate one technique: Comment Injection.

My one note about a good post (other than that you should read it) is that parameter sniffing is not itself a bad thing. 95%+ of the time, it’s a great thing. It’s that last 5% or so that gives it a bad name.
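
To make the comment injection idea a little more concrete, here is a minimal Python sketch of the string-building half of it. This is not Andy's code. The table, the column names, the threshold, and the helper functions are all hypothetical, and it assumes the plan cache treats statements with different text as different statements, so that each distinct comment can end up with its own plan.

```python
# Hypothetical threshold: parameter values expected to match more rows
# than this are routed to a separate statement text.
LARGE_ROW_THRESHOLD = 10_000

# Hypothetical query; the parameter marker stays in place so the
# statement is still parameterized.
BASE_QUERY = "SELECT OrderID, Total FROM dbo.Orders WHERE CustomerID = @CustomerID"


def plan_bucket(expected_rows: int) -> str:
    """Classify a parameter value by how many rows we expect it to match."""
    return "large" if expected_rows > LARGE_ROW_THRESHOLD else "small"


def inject_comment(sql: str, bucket: str) -> str:
    """Append a comment naming the bucket, changing the statement text only."""
    return f"{sql} /* plan: {bucket} */"


def build_statement(expected_rows: int) -> str:
    """Return the statement text to execute for this parameter value."""
    return inject_comment(BASE_QUERY, plan_bucket(expected_rows))


if __name__ == "__main__":
    print(build_statement(5))
    print(build_statement(50_000))
```

The query logic is identical in both cases. Only the trailing comment differs, so values in the same bucket share a statement text while values in different buckets do not. How you estimate the expected row count is the hard part and is left out here.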

New Updates to the Big Book of R

Published 2024-04-02 by Kevin Feasel

Oscar Baruffa has been busy:

I’m very happy to announce the addition of 6 new books to the Big Book of R collection, which now stands at about 420 books in total!

Thanks to Isabella Velásquez, Emil Hvitfeldt and Metehan GÜNGÖR for their submissions :).

Read on for a link to the updates, as well as to the Big Book of R itself. H/T R-Bloggers.

Comparing pgvector and Postgres ARRAY

Published 2024-04-02 by Kevin Feasel

Ernst-Georg Schmid makes a comp based on a mass spectrometry database:

As said in the introduction, mass spectrometry is one, if not the tool to identify unknown compounds, to quantify known compounds, and to determine the structure of molecules. But it is a lot of work, and you need reference spectra to compare against.

So, there are curated databases of validated spectra available, like MassBank Japan, MassBank Europe and the NIST mass spectral libraries. Laboratories might also want to store their own libraries for future use.

However, such databases often come in their own formats and with their own retrieval software. If you need to efficiently connect spectra to other data, e.g. chemical structures or genomic data, this calls for central management and a common API.

Read on to see the comparison of the pgvector extension versus built-in functionality with ARRAY.

Workspace Folders in Microsoft Fabric

Published 2024-04-02 by Kevin Feasel

Koen Verbeeck double-checks the calendar:

That’s right, this is not an April Fool’s Joke! The most anticipated feature of Microsoft Fabric has arrived! I’m not talking about decent CI/CD support, or OneSecurity. Nope, this is all about the ability to create folders in your workspaces! Very important, since Fabric is a centralized SaaS data platform that allows you to create a gazillion different objects, but until now you had no way of actually organizing them.

To give you an idea about how many objects, this is what the filter currently shows (and some items are missing, like Eventhouse):

This is big. Even on a small proof of concept that I worked on, the lack of folders was annoying. On a full project, the pain becomes worse. Granted, it’s in public preview, so it might not be available to everybody right off the bat, but it’s certainly a step in the direction of usefulness.

Maintaining Dynamic IP Rules for Azure Network Security Groups

Published 2024-04-02 by Kevin Feasel

Daniel Hutmacher shares a couple of scripts:

Recently, my home ISP has started changing my public IP address. This causes me some headache because I have a couple of Azure Network Security Group rules (think of them as firewall rules) that specifically allow my home IP access to all of my Azure resources. When my home IP changes, those rules have to be updated accordingly.

So I made a PowerShell-based solution to automatically maintain them.

Read on for the process.

Finding Duplicate Statistics in SQL Server

Published 2024-04-02 by Kevin Feasel

Jose Manuel Jurado Diaz searches for clones:

Some time ago, we encountered a support case where a customer experienced significant delays in updating auto-created and user-created statistics. I would like to share the insights gained from this experience, including the underlying causes of the issue and the potential solutions we identified and implemented to address the problem effectively.

Read on for a demo to set up the scenario and the cause of the problem, as well as how to fix it.

Quantile Normalization in R

Published 2024-04-01 by Kevin Feasel

Steven Sanderson has achieved normality:

Before we dive into the code, let’s understand the concept behind quantile normalization. At its core, quantile normalization aims to equalize the distributions of multiple datasets by aligning their quantiles. This ensures that each dataset has the same distribution of values, making meaningful comparisons possible.

This is a bit different from normalizing individual data points in one dataset, as you can see in the post.
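
For a rough feel of what aligning quantiles means, here is a small Python sketch (Steven's post works in R, and this is not his code). It assumes every dataset has the same number of values, uses the mean across datasets at each rank as the shared reference distribution, and breaks ties by original position. Those are simplifying choices for illustration, and `quantile_normalize` is a made-up name.

```python
from statistics import mean


def quantile_normalize(columns):
    """Give every dataset the same set of values while keeping each
    dataset's own rank order.

    columns: a list of equal-length lists of numbers, one list per dataset.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all datasets must have the same length")

    # Sort each dataset, then average across datasets at each rank.
    # The result is the shared reference distribution.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(col[rank] for col in sorted_cols) for rank in range(n)]

    normalized = []
    for col in columns:
        # Positions of this dataset's values from smallest to largest.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, position in enumerate(order):
            # The value at this rank is replaced by the reference value.
            out[position] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    print(quantile_normalize([[2, 4, 6], [10, 30, 20]]))
```

After the call, both datasets contain exactly the same values, and each keeps its original ordering of smallest to largest.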
