Category: Generative AI

Thoughts on AI-Driven Database Development in 2026

Brent Ozar shares some thoughts:

In the PollGab question queue for Office Hours, MyRobotOverlordAsks asked a question that merited a full blog post answer:

My company announced during some AI training that within the next 12 months we won’t be writing any of our own code. Instead, we’ll be babysitting agents. What’s your opinion on this from a DB dev / DBA POV? MSSQL Dev tends to lag, so I’d personally be surprised.

If this sounds completely alien to you, check out this blog post by developer Armin Ronacher. In it, he discusses how 2025 was the year when he reluctantly shifted his development process to the point where now he spends most of his time doing exactly what MyRobotOverlordAsks’ company is proposing: rather than writing the code directly, he now asks AI tools to build and debug things for him, and he spends his time tweaking what they produce. (Update 2025/01/07: for another example, check out Eugene Meidinger’s post on his uses of AI.)

Brent is generally bullish on the idea. I agree that a lot of companies will move in this direction, but am not at all bullish that it’ll work well. I think this is mostly the latest iteration of Stack Overflow-driven development, except with less copy and paste of bad code and more generation of bad code.

If you want the really spicy version of this take, you’ll have to talk to me in person.

SQL Server 2025 and Vector Data

Tomaz Kastrun continues a series on SQL Server 2025 with several posts on vector data. First up is the new vector data type:

The vector data type is designed to store vector data optimized for operations such as similarity search and machine learning applications. Vectors are stored in an optimized binary format but are exposed as JSON arrays for convenience.

Implicit and explicit conversion from and to the vector type can be done using varchar, nvarchar, and json types.
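
To make the "binary storage, JSON array on the surface" idea concrete, here is a minimal Python sketch. It is only an illustration: the source does not describe SQL Server's actual on-disk layout, so the little-endian float32 packing and both function names are my own assumptions.

```python
import json
import struct


def from_json_array(text: str) -> bytes:
    """Parse a JSON array of numbers and pack it as little-endian float32 values.

    Assumption: float32 packing stands in for "an optimized binary format";
    it is not SQL Server's documented storage layout.
    """
    values = json.loads(text)
    return struct.pack(f"<{len(values)}f", *values)


def to_json_array(blob: bytes) -> str:
    """Unpack float32 values and expose them again as a JSON array."""
    count = len(blob) // 4  # each float32 occupies four bytes
    return json.dumps(list(struct.unpack(f"<{count}f", blob)))


packed = from_json_array("[0.5, -1.25, 2.0]")
round_tripped = to_json_array(packed)
```

The sample values are exactly representable in float32, so they round-trip unchanged; arbitrary decimals would generally come back with small rounding differences.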

Second is information on vector functions:

Yesterday we looked into Vector data type and how to create table, insert the vector and read it. With SQL Server 2025, vector data type comes equipped also with couple of functions:
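
To give a feel for the arithmetic behind similarity search on vectors, here is a minimal Python sketch of a norm, a normalization, and a cosine distance. The function names are my own, not T-SQL function names, and this is an illustration of the math rather than SQL Server's implementation.

```python
import math


def norm(v):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Scale a vector to unit length; a zero vector has no direction to keep."""
    length = norm(v)
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in v]


def cosine_distance(a, b):
    """1 minus cosine similarity: 0 for the same direction, 1 for orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm(a) * norm(b))
```

For example, `cosine_distance([1.0, 0.0], [0.0, 1.0])` is 1.0 because the vectors are orthogonal, while two vectors pointing the same way score close to 0 regardless of their lengths.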

And third is how to generate embeddings and store the results in SQL Server:

AI_GENERATE_EMBEDDINGS is a built-in function that creates embeddings (vector arrays) using a pre-created AI model definition stored in the database.

Before running, we need to register the model; creating the master key, database scope credentials and Creating external model.

REST API Invocation in SQL Server 2025

Tomaz Kastrun continues an advent of SQL Server 2025. First up is external REST API endpoint execution:

This new functionality, you can call to the system stored procedure sp_invoke_external_rest_endpoint, and call / get:

– Call REST/GraphQL endpoints from other Azure services
– Have data processed via an Azure Function
– Update a Power BI dashboard
– Call an on-premises REST endpoint
– Talk to Azure OpenAI services

Then, Tomaz uses this to call a language model:

After short introduction into the sp_invoke_external_rest_endpoint we will look into creating a REST endpoint for using LLM.

Gresham’s Law and AI-Generated Texts

John Mount describes a problem:

I would like to write a bit about text. That is: technical writing, legal briefs, or even an opinion piece such as this note. Such writings make up much of our society and form a “marketplace of ideas.”

Texts are now very cheap to produce using large language models (LLMs). Some simulated texts remain correct and useful, and some contain numerous subtle flaws and fabrications. In my opinion it remains expensive to reliably determine which text is which type, as LLMs are not as good at detection as fabrication.

Read on for some of the challenges that have come with the proliferation of language models and text auto-generation. John mentions scientific conferences being overwhelmed with AI-generated abstracts, peer reviews, and the like. In the technical world, we’re also seeing an inundation of AI-generated abstracts. For example, we’ve developed a few key tells for submissions to speak at our user group and will automatically reject abstracts that hit those tells. I’m sure there’s a false positive rate there, but that kind of protection mechanism is important to avoid no-shows from artificially generated abstracts.

Scaling On-Prem Vector Search with Ollama and Nginx

Anthony Nocentino solves a problem:

When you call out to an external embedding service from T-SQL via REST over HTTPS, you’re limited by the throughput of that backend. If you’re running a single Ollama instance, you’ll quickly hit a ceiling on how fast you can generate embeddings, especially for large datasets. I recently attended an event and discussed this topic. My first attempt at generating embeddings was for a three-million-row table. I had access to some world-class hardware to generate the embeddings. When I arrived at the lab and initiated the embedding generation process for this dataset, I quickly realized it would take approximately 9 days to complete. Upon closer examination, I found that I was not utilizing the GPUs to their full potential; in fact, I was only using about 15% of one GPU’s capacity. So I started to cook up this concept in my head, and here we are, load balancing embedding generation across multiple instances of ollama to more fully utilize the resources.

Click through for the solution.
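
As a rough sketch of the load-balancing idea in the quote, the Python below hands work items to several backends in round-robin order and estimates how long a job would take. Anthony's actual solution uses Nginx, so treat this as an illustration of the concept only. The backend addresses are hypothetical (11434 is Ollama's default port), the helper names are mine, and the estimate assumes throughput scales linearly with the number of backends, which real hardware may not deliver.

```python
from itertools import cycle


def assign_round_robin(items, backends):
    """Pair each item with a backend, cycling through the backends in order."""
    if not backends:
        raise ValueError("at least one backend is required")
    return list(zip(items, cycle(backends)))


def estimated_days(total_rows, rows_per_second_per_backend, backend_count):
    """Ideal-case duration, assuming linear scaling across backends."""
    seconds = total_rows / (rows_per_second_per_backend * backend_count)
    return seconds / 86_400  # seconds per day


# Hypothetical backend addresses for illustration.
BACKENDS = [
    "http://localhost:11434",
    "http://localhost:11435",
    "http://localhost:11436",
]

# Rate implied by the quoted figures: three million rows in about nine days,
# which works out to a little under four rows per second.
baseline_rate = 3_000_000 / (9 * 86_400)

assignments = assign_round_robin(["row-1", "row-2", "row-3", "row-4"], BACKENDS)
```

Under that linear-scaling assumption, three backends at the baseline rate would bring the nine-day job down to roughly three days.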

Natural Language Querying in SQL Server

Hadi Fadlallah shells out to an API:

Data is usually the most important asset in organizations, but only SQL developers can frequently access that data. Technical teams often write queries for non-technical users. This restricts agility, slows decision-making, and creates a bottleneck in data accessibility. One possible remedy is natural language processing (NLP), which enables users to ask questions in simple English and receive answers without knowing any code. Still, the majority of NLP-to-SQL solutions are cloud-based, which raises issues with cost and privacy.

This particular solution has nothing to do with the embedding features in SQL Server 2025. Instead, it essentially shells out to an Ollama API and runs the resulting SQL query. It’s reasonably neat, but I’d have so many qualms about putting anything like this into production.
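
One of those qualms is running whatever statement the model hands back. Here is a minimal Python sketch of a conservative pre-check that only lets through a single statement that looks read-only. The function name and keyword list are my own assumptions, and a keyword filter like this is not a security boundary; a least-privilege, read-only login would still be the real safeguard.

```python
import re

# Keywords that suggest a statement changes data, schema, or permissions.
# Assumption: this list is illustrative and certainly not exhaustive.
FORBIDDEN = {
    "insert", "update", "delete", "merge", "drop", "alter", "create",
    "truncate", "exec", "execute", "grant", "revoke", "into",
}


def looks_read_only(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT or WITH
    and contains none of the forbidden keywords as standalone words."""
    text = sql.strip().rstrip(";").strip()
    if not text or ";" in text:
        return False  # empty input, or more than one statement
    words = re.findall(r"[a-z_]+", text.lower())
    if not words or words[0] not in ("select", "with"):
        return False
    return not any(word in FORBIDDEN for word in words)
```

The check errs on the side of rejecting: a harmless query with the word "drop" inside a string literal would be refused, which seems like the right trade-off for generated SQL.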

T-SQL Tuesday 189 Round-Up

Taiob Ali summarizes this month’s T-SQL Tuesday:

I would like to thank all the participants of T-SQL Tuesday #189. If I missed your post, it was not intentional. Please let me know, and I will add it to this list.

I am proud of this community and feel lucky to be a small part of it. I admire everyone who joined the blog party and shared their thoughts on how AI is changing our careers, as well as your thoughts on AI tools.

Click through to see the responses.

Building a Vector Data Demo Database for SQL Server 2025

Andy Yun has a new demo database:

Today, I have the honor and pleasure of debuting a new presentation for MSSQLTips: A Practical Introduction to Vector Search in SQL Server 2025 (you can watch the recording here too). To accompany that new presentation, I opted to create a new demo database instead of retrofitting one of my existing demo databases. And I’m sharing it with you so you don’t have to go through the headache of taking an existing database and creating vector embeddings.

Click through for Andy’s demo database, which is approximately 16 GB in size, so not a tiny one.

Copilots, MCP Servers, and Connection Strings

Chad Baldwin shares a warning:

Well, a few days ago, I ran into the result of one of those awkward pieces when combining the MSSQL extension for VS Code, MSSQL MCP Server and Copilot.

The short of it is…I asked Copilot to change the connection used by the MSSQL extension to use a particular database. I later asked Copilot to describe a table in the database (which uses the MSSQL MCP server), only for it to claim the table didn’t exist. I realized right away it was due to competing connections between the MSSQL extension and the MSSQL MCP Server configuration. It was also at that moment where I realized this situation could potentially be SO MUCH worse than simply not finding a table…

So let’s set up a worst case scenario and see what happens.

This is basically the equivalent of “Wait, that SSMS window was production? Uh-oh.” Not that this has ever happened to me, of course. Or any of you. Nope.

EchoLeak: Zero-Click Copilot Vulnerability

Alex Woodie reports on a vulnerability:

The Microsoft Copilot vulnerability, dubbed EchoLeak, was listed as CVE-2025-32711 in the NIST’s National Vulnerability Database, which gave the flaw a severity score of 9.3. According to Aim Labs, which discovered EchoLeak and shared its research with the world last week, the “zero-click” flaw could “allow attackers to automatically exfiltrate sensitive and proprietary information from M365 Copilot context, without the user’s awareness, or relying on any specific victim behavior.” Microsoft patched the flaw the following day.

The blog post linked above is pretty interesting. Microsoft has patched the vulnerability, so this particular attack vector shouldn’t be an issue. But it will certainly open up the doors for more fun ways of exploiting generative AI-based services.
