Press "Enter" to skip to content

Curated SQL Posts

Microsoft Fabric and Tabular Editor

Johnny Winter is excited:

Why the excitement on my part? Well to take advantage of all the great features in Tabular Editor, you really need to be able to connect and write via XMLA, be that for doing CI/CD pipelines or by making edits directly on the dataset.

What great new features does Tabular Editor unlock that you can’t just do in the online Power BI modelling experience in Fabric… tons!

Read on to see how Tabular Editor plays with Microsoft Fabric.

Comments closed

Script Comparison with kdiff3

Steve Jones is speaking my language:

I had a customer recently ask if SQL Compare could show them the differences in two scripts they’ve written. They weren’t using version control (tsk, tsk, shame), but saw SQL Compare and the “Scripts folder” option. This isn’t used for random scripts, but I do have a better solution: KDiff3.

KDiff is an old project that is used to analyze multiple files and merge the differences. There is an archived SoundForge location, but the more modern version is here. That’s the current code location, and you can see the readme for details. To get started, download and install it.

I remember (cue “Pepperidge Farms Remembers” meme) back when kdiff3 was only available in KDE. That’s when I first learned of it, and ever since there was a Windows port, I’ve been a dedicated user. Yes, it’s an old tool, but it works really well.

Comments closed

SSMS Tips and Tricks

Vlad Drumea shares a few tips around SQL Server Management Studio:

In this post I cover my favorite SSMS tips and tricks that I’ve picked up along the years, and on which I rely on a daily basis in my workflow.

If you’re interested in my SQL Server Management Studio configuration recommendations – check this post out. This also contains how to have two rows of Query Editor tabs, so I’m not covering that here again.

You can find the latest version of SSMS on the official download page.

This is a great list of items and if you’re a daily driver of SSMS, you’ll want to check it out.

Comments closed

Reviewing an Execution Plan for a Window Function

Andy Brownsword categorizes the components:

Using last week’s sample data we can run the query below to demonstrate operators typically used for a window function:

The result of this query is a set of data with a running total of the Sales Value within each Financial Quarter.

We’ll follow the data through some of the operators in this execution plan to understand their part in the function. As with regular execution plans we’ll be working from right to left.

Read on for the key operators.

Comments closed

Power Regression in R

Steven Sanderson’s power level is over 9000:

In the realm of statistics, power regression stands out as a versatile tool for exploring the relationship between two variables, where one variable is the power of the other. This type of regression is particularly useful when there’s an inherent nonlinear relationship between the variables, often characterized by an exponential or inverse relationship.

Read on to learn more about the definition of power regression and how to perform it in R using a technique called “swole linear regression.” Or at least that’s what I think the technique should be called. Which is probably why I’m not in charge of naming things.

Comments closed

Plotting The Effects of Noise on R^2

Tomaz Kastrun messes with R^2:

So, an R-squared of 0.59 might show how well the data fit to the model (hence goodness of fit) and also explains about 59% of the variation in our dependent variable.

Given this logic, we prefer our regression models to have a high R-squared. Yes? Right! And by useless test, with adding random noise to a function, what happens next?

I like Tomaz’s scenario here and think he does a good job demonstrating the outcome. I do, however, struggle with the characterization of “making R^2 useless.” When the error term approaches an enormous value relative to the regressable components, that R^2 is telling you that something else is dominating the relationship between the independent variables and dependent variable. And this is correct: that error term does dominate. I suppose the problem here is philosophical: we call it an error term but what it signifies is “information we don’t understand about the relationship between these variables.” Yes, in this toy example, it was randomly-generated noise. But in a real dataset, it’s not random; it’s inexplicable, at least given the information you know at that time and the mechanisms you use to analyze the relationship.

Comments closed

Failure Writing Backups to Azure Blob Storage Due to Limits Reached

David Fowler hits a wall:

Picture this, you’re happily backing up your database to a Azure blob storage until suddenly it starts mysteriously failing with the error…

Write to backup block blob device https://****** failed. Device has reached its limit of allowed blocks.

What’s going on, nothing’s changed?!

Read on to learn the cause of this issue as well as three ways to fix it.

Comments closed

Indexing for Substring Searches

Daniel Hutmacher prepares the bloom filter:

A question from a client got me thinking. Relational databases (at least the ones I know and love) can’t really index for queries that use LIKE queries for a substring of a column value. If you want to search for strings beginning with a given string, a regular rowstore index will have you covered. If you’re looking for entire words or sentences, a full text index might be a good call. But because of the very way indexes work, you’ll never get great performance searching for just arbitrary parts of a string.

So today I’ll put on my lab coat and do a little rocket surgery, just to prove to the world that it can be done.

The suffix tree approach was an interesting one. I’ve also seen people attack this problem using bloom filters (as I alluded to in the link text) and n-grams. A commenter does note n-grams (specifically, tri-grams) as a viable solution as well.

Comments closed

Exploring a Dataset for Microsoft Fabric Suitability

Eugene Meidinger continues a series on learning Microsoft Fabric:

This is week 1 where I try to take Magic the Gathering draft data to learn Microsoft Fabric. Check out week 0 for some reasoning why.

So, before I do anything else, I want to get a sense of the data I’m looking at to see if it’s suitable for this project. I download the data, and because it’s gzipped, I use 7-zip to open it up on windows 10, or Windows explorer on Windows 11. In either case, the first thing I notice is the huge size disparity. When compressed, it is a quarter of a gigabyte. Uncompressed, it’s about 10 GB. This tells us something.

Read on to learn more about the dataset and how Eugene tackled some of the exploratory data analysis.

I also agree completely with Eugene’s point about serendipity. Keeping your metaphorical eyes open will increase the likelihood that you’ll just happen upon something that can help you later, or something that serves a need you didn’t know you had. I used to wander around the library back in my university days because I didn’t know what I didn’t know about topics (that is, the “unknown unknown” quadrant), so I’d just pick up some books that caught my eye. Not all of them are hits, though enough were to make the strategy worthwhile.

Comments closed

When an Update Doesn’t Update

Aaron Bertrand offers some troubleshooting advice:

Tell me if you’ve heard this one before:

I changed data from my application, but when I checked the database, I couldn’t see the change!

I’ve seen this. Loads. It can be quite perplexing for folks because they expect to see an error message if the insert, update, or delete failed. I put this post together to provide some things you can investigate if this happens to you – you are sure that you updated the data, but when you check using SQL Server Management Studio (SSMS), your change isn’t there. For the remainder of the post, I’m going to use the word “update” to mean any change to the data, even MERGE {shudder}.

Read on for three major classes of reason. One bonus reason: you left the transaction open. Most application frameworks will close transactions after a statement, but if you’re hand-writing transaction logic in your app, forgetting a COMMIT can happen.

Comments closed