Curated SQL Posts

Local Vector Search in SQL Server 2025

Andy Yun gives vector search a try:

With the announcement of SQL Server 2025 Public Preview, hopefully you are interested in test driving Vector Search.

Microsoft has already posted demo code, but it’s only for OpenAI on Azure. But many of us are wondering about running things locally. So I thought I’d share a step-by-step of getting Ollama set up and running locally on my laptop. End-to-end, these instructions should take less than 30 minutes to complete.

Andy’s process involves downloading and running an embedding model to generate the vectors, creating an external model pointing to Ollama, and using it to generate embeddings.
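
For a rough sense of what happens once those embeddings exist, here is a minimal Python sketch of the ranking step behind a vector search: compute cosine distance between a query vector and each stored vector, then keep the closest matches. This is not Andy's code and not SQL Server's implementation. The three-dimensional vectors, the document names, and the `cosine_distance` and `nearest` helpers are all made up for illustration; real embedding models emit vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, documents, k=2):
    """Return the names of the k documents whose vectors are closest to query."""
    ranked = sorted(
        documents.items(),
        key=lambda item: cosine_distance(query, item[1]),
    )
    return [name for name, _ in ranked[:k]]


# Toy "embeddings", invented for this example only.
docs = {
    "index tuning": [0.9, 0.1, 0.0],
    "backup strategy": [0.1, 0.9, 0.1],
    "query plans": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

print(nearest(query, docs))
```

A brute-force scan like this is fine for a handful of rows; the point of a database-side vector index is to avoid comparing the query against every stored vector.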

JSON Indexes in SQL Server 2025 CTP 2.0

Daniel Hutmacher gives it a try:

Starting today, the public preview of SQL Server 2025 is available to download!

One really interesting new feature that got my attention was the addition of JSON indexes. I’m a big fan of everything that makes working with JSON easier, since JSON blobs are so much easier to work with than table variables when you’re moving data from point A to point B. This is especially true when you’re working with complex, relational data.

Daniel lays out some of the limitations of JSON index creation and also some of the performance gains you might see from it. This will be most helpful in data engineering scenarios, shredding JSON from various services, but the normalization purist in me says that if you’re shredding JSON enough to need indexes, it’s probably time to normalize that data.
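
To make the "time to normalize" point concrete, here is a small Python sketch of shredding a JSON document into a header row and child rows, the same shape you would land in two related tables. The payload, its field names, and the `shred_order` helper are hypothetical and do not come from Daniel's post.

```python
import json

# Hypothetical payload from some upstream service.
payload = """
{
  "order_id": 1,
  "customer": "Contoso",
  "lines": [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 1}
  ]
}
"""


def shred_order(document):
    """Split one JSON order into a header row and a list of line rows."""
    order = json.loads(document)
    header = {"order_id": order["order_id"], "customer": order["customer"]}
    lines = [
        {
            "order_id": order["order_id"],  # key back to the header row
            "line_no": line_no,
            "sku": line["sku"],
            "qty": line["qty"],
        }
        for line_no, line in enumerate(order["lines"], start=1)
    ]
    return header, lines


header, lines = shred_order(payload)
print(header)
for row in lines:
    print(row)
```

Once the data looks like this, ordinary indexes on the child table do the work that a JSON index would otherwise have to do against the blob.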

Running SQL Server 2022 on Ubuntu 24.04

Laerte Junior gives it a go:

Microsoft does not yet support this edition of Ubuntu, but there are some workarounds to make it work. This should not be used for production usage and this blog is for educational/testing purposes only.

For my installation, I am using an AWS EC2 Ubuntu 24.04 with 2 GB of RAM. 2 GB of RAM is the minimum required. This guide is targeted towards people who have installed SQL Server on previous versions of Ubuntu.

Laerte got it to work, but honestly, I’d rather wait for official support, especially if you’re stuck installing older versions of security-related packages (libldap vs the libldap2 that exists on Ubuntu 24.04).

The Unreliability of Microsoft Fabric

Brent Ozar points out some major issues:

The link https://aka.ms/fabricsupport takes you to a localized status page that almost always shows all green checkmarks – even when the service is on fire. During last month’s 12+hour overnight outage, people were screaming on Reddit overnight that things were down, but the status dashboard was showing all green. When Microsoft employees woke up, they asked if people were still having problems – and then eventually got around to updating the status page to reflect the outage when it was clear that things were really borked.

Redditors have resorted to relying on reporting Fabric outages to Statusgator, who then tracks the time gap between a burst of user outage reports, to the time Microsoft actually updates their status page – and it ain’t pretty:

Click through for Brent’s take and an embarrassingly bad post-mortem. Given that Microsoft Fabric is a software-as-a-service product, there’s an inherent level of trust necessary in using it: you’re relying upon the platform team to ensure things are running smoothly and that you get what you’re paying for. Incidents like this erode that trust. Outages themselves are bad but they do happen. The real problem is in not embracing the outage: be clear with customers on current status and cause, and ensure people can easily see the history of events.

Auditing SQL Server Login Options

Chad Callihan audits logins:

Do you know who is logging into your SQL Server?

I was once asked about the need to track SQL Server logins. Many servers were already tracking failed logins. Where the issue came up in this case was tracking successful logins to determine login usage. Let’s take a quick look at how we can track both failed and successful logins.

Security-oriented me always wants both failed and successful logins, as you want to know if the person who failed to log in eight times did in fact successfully log in on the ninth attempt.
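
As a sketch of why you need both kinds of event, the Python below walks a time-ordered list of login attempts and flags any success that follows a run of failures for the same login. It assumes you have already pulled the audit data out of SQL Server into `(login_name, succeeded)` pairs; that event shape, the `suspicious_successes` name, and the threshold of three are my own assumptions and are not from Chad's post.

```python
def suspicious_successes(events, threshold=3):
    """Flag successful logins that come right after repeated failures.

    events: iterable of (login_name, succeeded) tuples in time order.
    Returns a list of (login_name, failures_before_success) tuples.
    """
    failure_streak = {}
    flagged = []
    for login, succeeded in events:
        if succeeded:
            if failure_streak.get(login, 0) >= threshold:
                flagged.append((login, failure_streak[login]))
            failure_streak[login] = 0  # a success resets the streak
        else:
            failure_streak[login] = failure_streak.get(login, 0) + 1
    return flagged


# Eight failures followed by a success, plus one unremarkable login.
events = [("app_user", False)] * 8 + [("app_user", True), ("dba", True)]
print(suspicious_successes(events))
```

With only failed logins audited, the ninth attempt in this example would be invisible.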

New Capabilities in SQL Server 2025 CTP 2.0

Randolph West lays out some favorite features:

Three years ago, when the first public preview of SQL Server 2022 (CTP 2.0) was announced, I was a few months in at the SQL Docs team, and had very little to do with that release.

Three years later, the team is slightly larger (we’re called Data Docs now), and I was much more involved with helping scores of people merge the content for SQL Server 2025 (CTP 2.0).

Click through for Randolph’s favorite features for administrators and for developers that are available right now in the Community Technology Preview.

The Spurious Correlations R Package

Mauricio Vargas S. shows correlation:

spuriouscorrelations package started as a fun project for one of my tutorials.

Here is a case of an interesting correlation: the number of people who drowned by falling into a pool and the number of films Nicolas Cage appeared in.

Click through for examples and how to use the package. If you’re interested in more of these, Tyler Vigen’s website has plenty, and he even wrote a book. H/T R-Bloggers.
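
If you want to see how little it takes to get a high correlation coefficient, here is a minimal Python sketch of the Pearson calculation. The two series are numbers I invented for illustration; they are not the package's data and not real statistics. The `pearson` helper is likewise my own and is not part of the spuriouscorrelations package.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented yearly counts for two unrelated things that both drift upward.
series_a = [10, 12, 11, 14, 15]
series_b = [2, 3, 2, 4, 4]

r = pearson(series_a, series_b)
print(round(r, 3))
```

Two short series that both trend in the same direction will correlate strongly whether or not they have anything to do with each other, which is the whole joke.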

Backfilling Data in TimescaleDB

Semab Tariq takes us through a problem:

Backfilling data into a TimescaleDB hypertable in production can be very tricky, especially when automated processes like compression policies are involved. From past experience, we have seen that if backfill operations aren’t handled properly, they can interfere with these automated tasks, sometimes causing them to stop working altogether. 

This blog covers a safer and more reliable approach to backfilling hypertables, along with best practices to prevent disruptions to compression and other background processes.

Read on for several tips. Backfills can be challenging in any database, but time-series databases like TimescaleDB introduce their own unique issues.

Methods to Expand a Power BI Matrix Visual

Chris Webb runs some performance tests:

If you have a Power BI report with a matrix visual on it, it’s quite likely that you’ll want all the levels in the matrix to be fully expanded by default. But did you know that the way you expand all the levels could have performance implications, especially if you’re using DirectQuery mode? Here’s an example.

Click through to see what options are available to you and their performance implications.
