Press "Enter" to skip to content

Curated SQL Posts

Sparse Columns and Space Utilization

Steve Jones gins up a demo:

I saw this as a question submitted at SQL Server Central, and wasn’t sure it was correct, but when I checked, I was surprised. If you choose to designate columns as sparse, but you have a lot of data, you can use more space.

This post looks at how things are stored and the impact if much of your data isn’t null.

I consider sparse columns a relic of the mid-aughts era, when storage was a lot more expensive and compression was an Enterprise Edition-only feature. Given that you can use page compression in any edition of SQL Server nowadays, I don’t think there’s a viable reason ever to have a sparse column.

Also, definitely check out the comments, where Jeff Moden has a great one.

Leave a Comment

K-Means Clustering in SQL Server

Sebastiao Pereira implements k-means clustering in T-SQL:

K-means clustering is an unsupervised machine learning algorithm used to group data into k distinct clusters based on their similarity, allowing for customer segmentation, anomaly detection, trend analysis, etc. The most common machine learning tutorials focus on Python or R. Normally, data is stored in SQL Server, and it is necessary to move data out of the database to apply clustering algorithms and then, if necessary, to update the original data with the cluster numbers. Is it possible to do it directly in SQL Server?

Given the work you have to do to implement this, I can’t imagine that it would be particularly fast. But it is neat to see that it’s possible.

Leave a Comment

Collations and citext in PostgreSQL

Umut Tekin picks up on a thread:

Recently, I read Laurenz Albe’s blog about case insensitive string search. He recommended case insensitive collations saying, “it won’t be any more expensive (and maybe more intuitive) than the current techniques of using upper() or the citext extension are today”. In other words, a custom collation can be used instead of citext to achieve case insensivity with a lower performance penalty. Today we will be testing it.

Click through for the results of those tests.

Leave a Comment

Comparing Write-Back Options for Power BI

Jon Vöge compares two options:

We’ve previously on this blog covered Power Apps write-back for Power BI/Fabric comprehensively, and in the past months we’ve taken a stab at the Fabric Native solution: Translytical Task Flows.

However, when comparing the different options, which solution actually comes out on top?

Read on as Jon contrasts the two options and explains when you might want to use each.

Leave a Comment

Modifying Power BI Page Visibility and Active Status via Semantic Link Labs

Meagan Longoria hides (or shows) a page:

Setting page visibility and the active page are often overlooked last steps when publishing a Power BI report. It’s easy to forget the active page since it’s just set to whatever page was open when you last saved the report. But we don’t have to settle for manually checking these things before we deploy to a new workspace (e.g., from dev to prod). If our report is in PBIR format, we can run Fabric notebooks to do this for us.

Click through for a notebook and an explanation.

Leave a Comment

SQL Server Auditing Bug Allows Data Exfiltration without Logging

Andreas Wolter describes a bug in SQL Server’s auditing capabilities:

Last week, I was contacted by an IT Leader from Saudi-Arabia who previously found several CVE’s in Oracle and Microsoft SQL Server. He wanted my opinion on a newly discovered security issue in SQL Server Auditing.

Interestingly, his findings directly overlap with a topic I wrote about just last month: Using Data Classification to Audit Data Access.

Emad Al-Mousa identified two vulnerabilities in the SENSITIVE_BATCH_COMPLETED Audit Action Group. Microsoft Security Response Center (MSRC) acknowledged the issue but classified it as low priority – meaning it may not be addressed until a major release, if at all.

Read on to see what the issue is and how you can trigger it today. Andreas also includes a workaround that will work in the meantime.

Leave a Comment

Automating SQL Server Deployments via dbatools

David Seis digs into scripted SQL Server installation:

In this and the next two blog posts I will be bringing diverse dbatools commands into scripts that can handle a complete deployment, do a checkup of major health and configuration metrics, and do a true up of a pre-existing instance. This post will cover the complete deployment, which if you have been reading the audit series will be much more than just the SQL install of last post. This time we are aiming for the whole thing. Install, update, configure host, configure SQL, Deploy maintenance. Everything  I can think of!!

Clicking next-next-next one or two times for SQL Server installation is fine—it gives you an idea of what capabilities are available and what you need to know about. By the time you’ve installed SQL Server 5-10 times, you should familiarize yourself with the configuration files (especially because they get auto-generated for you after you use the GUI—SQL Server itself uses these to install!), and should be looking for ways to automate this process and avoid misclicks or wasting time that you could otherwise be using by reading Curated SQL.

Leave a Comment

No More Default Semantic Models in Microsoft Fabric

Nicky van Vroenhoven has good news for us:

Another quick post, because today is an important day for everyone working with Fabric and Power BI!

Last month, Microsoft announced they are Sunsetting Default Semantic Models: Yaay! 
Today marks that day: No more automatic child semantic models!

The idea of having a default semantic model seemed like a good one, but the problem was that too many environments had very specific needs that a default semantic model couldn’t anticipate or address. As a result, these tended to confuse end users more than save them time.

Leave a Comment