
Category: Generative AI

Finding a Burrito in Ireland

Andrew Pruski has my attention and my interest:

A while back I posted about a couple of side projects that I’ve been working on when I get the chance. One of those was the Burrito Bot…a bot to make burrito recommendations in Ireland 🙂

Over the last 18 months or so I’ve reworked this project to utilise the new vector search functionality in SQL Server 2025…so now it looks like this:

Andrew also owns the burrito-bot.com domain, as he showed it off live during his presentation at Data Saturdays Chicago. Unfortunately, it seems the service is not up at the moment, so your best bet might be to take the code from the Burrito Bot GitHub repo and build your own.

Leave a Comment

The Yin and Yang of Language Models

Erik Darling details a series of problems and somewhat-solutions:

This post is inspired by my BFFL Joe Sack’s wonderful post: Keep Humans in the Circle.

It’s an attempt to detail my progression using LLMs, up through how I’m using them today to build my free SQL Server Monitoring and Query Plan analysis tools.

While these days I do find LLMs (specifically Claude Code) to be wonderful enablers for my ideas, they still require quite a bit of guidance and QA, and they’re quite capable of (and sometimes seemingly eager to) wreck your day.

I did enjoy reading about Erik’s journey, so I figure I’ll share something of my own.

Jonathan Stewart (linking SQL Bites because I’m actively trying to shame him into creating a first video) has gone head-first into Claude Code and has dragged me along kicking and screaming. A lot of what Erik mentions resonates well with me, but there’s something that Jonathan developed that has helped: the Hater persona as an MCP server.

The Hater persona’s job is to critique whatever solution the language model comes up with: find all of the nitpicks, point out the major gaps in implementation, come up with scenarios in which this just won’t work, that kind of thing. I’ve been Jonathan’s Hater-as-a-Service for years, so naturally, he named the MCP server Kevin. Artificial Kevin can iterate to come up with the biggest problems and feed that information back into the main model to fix things up. After several rounds of this, I’ve found that there aren’t nearly as many rough edges as you might find at the start.

Even so, I still stand by the assertion that language models are akin to drunken interns, and the extent to which you trust the output of a language model is on you. But in fairness, hiring the average dev from Fiverr gives you the same experience but a few orders of magnitude slower.

Leave a Comment

Managing Non-Deterministic Behavior in Language Models

Alexander Arvidsson sets expectations:

You’ve written a prompt. It works beautifully. You ship it to production.

Three days later, someone reports wildly different answers to identical questions. You run the exact same input and get a different result than yesterday. Your test suite passes locally, fails in CI, passes again on re-run.

Welcome back to non-determinism in Large Language Models.

Click through for some practical tips on how you can reduce non-deterministic behavior, as well as the trade-offs of doing so.

Leave a Comment

Where the Buck Stops

Louis Davidson talks slop:

I loathe the phrase AI Slop. I have said it before, I don’t like the phrase because it is generally attributed to some content that a person has posted. I blame the poster, not the generator. We all use AI these days, just like they used tractors to farm, computers to do accounting work, and CGI to produce movies. These are all tools.

But when I sign my name to something, it is really and truly mine. In this blog, I will discuss this and more. So as the title says, don’t blame AI, Google, a person’s teachers in grade school, nope. Blame the person who said, “This is good enough to put out in my name”, or in other words, the person in the byline. For this post and video, that is Louis Davidson.

I understand where Louis is going with this and it’s fair. When you publish something, the person ultimately responsible looks suspiciously like the picture on your driver’s license. But I think it can serve as a useful descriptive term for a category of garbage output without removing agency from the perpetrator.

Comments closed

Using Database Properties to Assist Generative AI Solutions

Brent Ozar makes use of extended properties:

You can add database instructions as extended properties at the database or object level, and when Copilot works with those objects, it’ll read your instructions and use them to shape its advice.

For example, you can add a database-level property called a “constitution” with your company’s coding standards, like this:
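The actual code is in Brent’s post; a minimal sketch of the idea using `sp_addextendedproperty` might look like the following, where the property name and the standards text are illustrative rather than Brent’s exact example:

```sql
-- Add a database-level extended property holding coding standards.
-- Calling sp_addextendedproperty with no level parameters attaches
-- the property at the database level.
EXEC sys.sp_addextendedproperty
    @name  = N'constitution',
    @value = N'Always pair TOP with ORDER BY. Prefer EXISTS over IN for subqueries.';

-- Read it back; class 0 means database-level properties.
SELECT name, value
FROM sys.extended_properties
WHERE class = 0;
```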

Andy Brownsword has another example:

The new Database Instructions are text stored against database objects to add more context about the object and how it should be used. A simple example:

We find a Sales table with a Price column. Is that the price for a single unit or the line total? Does that include or exclude VAT? What about discounts?

This is where context is king, and Database Instructions allow us to annotate these details and remove the ambiguity.
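As a sketch of recording that context the old-fashioned way, here is a column-level extended property on the hypothetical `Sales.Price` column from Andy’s example (schema and property value are assumptions for illustration):

```sql
-- Annotate the Price column so there's no ambiguity about its meaning.
EXEC sys.sp_addextendedproperty
    @name       = N'MS_Description',
    @value      = N'Unit price per item, excluding VAT and before any discounts.',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Sales',
    @level2type = N'COLUMN', @level2name = N'Price';
```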

Database properties are a criminally underused part of SQL Server—in part because there wasn’t great tooling around how to display or work with these properties—and if this forces people to be a bit thoughtful in design and after-the-fact documentation on database objects, so much the better.

Comments closed

Thoughts on AI-Driven Database Development in 2026

Brent Ozar shares some thoughts:

In the PollGab question queue for Office Hours, MyRobotOverlordAsks asked a question that merited a full blog post answer:

My company announced during some AI training that within the next 12 months we won’t be writing any of our own code. Instead, we’ll be babysitting agents. What’s your opinion on this from a DB dev / DBA POV? MSSQL Dev tends to lag, so I’d personally be surprised.

If this sounds completely alien to you, check out this blog post by developer Armin Ronacher. In it, he discusses how 2025 was the year when he reluctantly shifted his development process to the point where now he spends most of his time doing exactly what MyRobotOverlordAsks’ company is proposing: rather than writing the code directly, he now asks AI tools to build and debug things for him, and he spends his time tweaking what they produce. (Update 2025/01/07: for another example, check out Eugene Meidinger’s post on his uses of AI.)

Brent is generally bullish on the idea. I agree that a lot of companies will move in this direction, but am not at all bullish that it’ll work well. I think this is mostly the latest iteration of Stack Overflow-driven development, except with less copy and paste of bad code and more generation of bad code.

If you want the really spicy version of this take, you’ll have to talk to me in person.

Comments closed

SQL Server 2025 and Vector Data

Tomaz Kastrun continues a series on SQL Server 2025 with several posts on vector data. First up is the new vector data type:

The vector data type is designed to store vector data optimized for operations such as similarity search and machine learning applications. Vectors are stored in an optimized binary format but are exposed as JSON arrays for convenience.

Implicit and explicit conversion from and to the vector type can be done using varchar, nvarchar, and json types.
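A minimal sketch of the pattern Tomaz describes, with an assumed table name and a tiny three-dimensional vector for readability:

```sql
-- vector(3) stores a 3-dimensional vector in an optimized binary format.
CREATE TABLE dbo.Embeddings
(
    Id        int IDENTITY PRIMARY KEY,
    Body      nvarchar(max),
    Embedding vector(3)
);

-- Insert via a JSON-array literal; the string converts to vector implicitly.
INSERT dbo.Embeddings (Body, Embedding)
VALUES (N'burrito', '[0.1, 0.2, 0.3]');

-- Reading it back exposes the vector as a JSON array.
SELECT Id, Embedding
FROM dbo.Embeddings;
```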

Second is information on vector functions:

Yesterday we looked into the vector data type and how to create a table, insert a vector, and read it back. With SQL Server 2025, the vector data type also comes equipped with a couple of functions:
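For a taste of what those functions look like, here is a small sketch using `VECTOR_DISTANCE`, which supports cosine, Euclidean, and dot-product metrics:

```sql
-- Two orthogonal unit vectors, declared inline for illustration.
DECLARE @a vector(3) = '[1, 0, 0]',
        @b vector(3) = '[0, 1, 0]';

SELECT
    VECTOR_DISTANCE('cosine',    @a, @b) AS CosineDistance,
    VECTOR_DISTANCE('euclidean', @a, @b) AS EuclideanDistance;
```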

And third is how to generate embeddings and store the results in SQL Server:

AI_GENERATE_EMBEDDINGS is a built-in function that creates embeddings (vector arrays) using a pre-created AI model definition stored in the database.

Before running it, we need to register the model: creating the master key, database scoped credentials, and the external model.
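Tomaz’s post walks through the full setup; as a rough sketch of the shape of it, with the endpoint and model name as placeholders (and the master key and credential steps omitted):

```sql
-- Register an external embedding model definition in the database.
-- LOCATION and MODEL are placeholders for your actual Ollama endpoint and model.
CREATE EXTERNAL MODEL BurritoEmbeddings
WITH
(
    LOCATION   = 'https://localhost:11434/api/embed',
    API_FORMAT = 'Ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL      = 'nomic-embed-text'
);

-- Generate an embedding for a piece of text using that model.
SELECT AI_GENERATE_EMBEDDINGS(N'carne asada burrito' USE MODEL BurritoEmbeddings);
```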

Comments closed

REST API Invocation in SQL Server 2025

Tomaz Kastrun continues an advent of SQL Server 2025. First up is external REST API endpoint execution:

With this new functionality, you can call the system stored procedure sp_invoke_external_rest_endpoint to:

– Call REST/GraphQL endpoints from other Azure services
– Have data processed via an Azure Function
– Update a Power BI dashboard
– Call an on-premises REST endpoint
– Talk to Azure OpenAI services
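A minimal sketch of a call, with a placeholder URL (note that on-box SQL Server requires enabling the external REST endpoint feature before this works):

```sql
DECLARE @response nvarchar(max);

-- Invoke an HTTP endpoint from T-SQL; the URL here is a placeholder.
EXEC sys.sp_invoke_external_rest_endpoint
    @url      = N'https://example.com/api/status',
    @method   = N'GET',
    @response = @response OUTPUT;

-- @response contains a JSON envelope with the HTTP status and result payload.
SELECT @response;
```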

Then, Tomaz uses this to call a language model:

After a short introduction to sp_invoke_external_rest_endpoint, we will look into creating a REST endpoint for using an LLM.

Comments closed

Gresham’s Law and AI-Generated Texts

John Mount describes a problem:

I would like to write a bit about text. That is: technical writing, legal briefs, or even an opinion piece such as this note. Such writings make up much of our society and form a “marketplace of ideas.”

Texts are now very cheap to produce using large language models (LLMs). Some simulated texts remain correct and useful, and some contain numerous subtle flaws and fabrications. In my opinion it remains expensive to reliably determine which text is which type, as LLMs are not as good at detection as fabrication.

Read on for some of the challenges that have come with the proliferation of language models and text auto-generation. John mentions scientific conferences being overwhelmed with AI-generated abstracts, peer reviews, and the like. In the technical community, we’re seeing the same inundation. For example, we’ve developed a few key tells for submissions to speak at our user group and will automatically reject abstracts that hit those tells. I’m sure there’s a false positive rate there, but that kind of protection mechanism is important for avoiding no-shows from speakers who submitted artificially generated abstracts.

Comments closed

Scaling On-Prem Vector Search with Ollama and Nginx

Anthony Nocentino solves a problem:

When you call out to an external embedding service from T-SQL via REST over HTTPS, you’re limited by the throughput of that backend. If you’re running a single Ollama instance, you’ll quickly hit a ceiling on how fast you can generate embeddings, especially for large datasets. I recently attended an event and discussed this topic. My first attempt at generating embeddings was for a three-million-row table. I had access to some world-class hardware to generate the embeddings. When I arrived at the lab and initiated the embedding generation process for this dataset, I quickly realized it would take approximately 9 days to complete. Upon closer examination, I found that I was not utilizing the GPUs to their full potential; in fact, I was only using about 15% of one GPU’s capacity. So I started to cook up this concept in my head, and here we are, load balancing embedding generation across multiple instances of ollama to more fully utilize the resources.

Click through for the solution.
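Anthony’s post has the full configuration; the heart of the approach is an nginx upstream block that round-robins requests across several Ollama instances. A rough sketch, with illustrative ports and the TLS details omitted:

```nginx
# Round-robin embedding requests across multiple Ollama instances.
upstream ollama_backends {
    server 127.0.0.1:11434;   # instance 1 (ports are illustrative)
    server 127.0.0.1:11435;   # instance 2
    server 127.0.0.1:11436;   # instance 3
}

server {
    listen 443 ssl;
    # ssl_certificate / ssl_certificate_key directives omitted for brevity

    location / {
        proxy_pass http://ollama_backends;
        proxy_read_timeout 300s;   # embedding calls can be slow on large batches
    }
}
```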

Comments closed