Category: Generative AI

Major Announcements from Microsoft Build 2026

James Serra puts together a list:

Once again there were a number of Microsoft Build announcements related to data and AI, and some were very impressive. Below are my favorites. I am prioritizing the data announcements first, because that is where my brain naturally goes (and because AI without good data is just a very confident intern with access to a keyboard).

The biggest announcements across Microsoft Fabric and databases can be found in the Microsoft Build 2026: Building agentic apps with Microsoft Fabric and Microsoft Databases blog post.

This looks like another year of Fabric + AI as (almost) the entire data platform focus at Build.

Noise in CRAN Package Additions

Joseph Rickert shows a consequence of lowering the bar for application development:

If you are reading this post on R-bloggers, you will probably know that I have been publishing my selection of the “Top 40” new R packages on CRAN for quite some time. I did this first as part of my work at Revolution Analytics, then on R Views for RStudio and Posit, and now here on R Works. It used to take about a day’s worth of pleasurable work spread out over a month to select forty interesting packages. For a hundred or so packages, I could look at all of the package webpages, download and play with a small number of them. Now, the “Top 40” has become a real hamster-on-the-wheel project. The following plot shows my count of the number of new packages to make it to CRAN since I began publishing on R Works.

Click through to see what Joseph has laid out. The part that surprises me is that, historically, CRAN was pretty difficult to get a package into, and you typically needed to jump through a certain number of quality gates. I suppose that has to have changed, given what Joseph notes about the lack of documentation in many of these new packages. But it could be that my understanding of it was wrong. H/T R-Bloggers.

Vibe Coding and Maintenance

Buck Woody has an essay:

Artificial Intelligence constructs, from Large Language Models answering questions to Agentic AI that runs various workflows are fantastic, amazing, helpful tools in getting a job done. They aren’t quite completely automating entire tasks (The best ones as of this writing are correctly implementing around one out of three tasks accurately: https://llm-stats.com/benchmarks/apex-agents) but they are still a very helpful tool. “Vibe Coding” which means explaining to a model that can write code (or a Codex) what you want the code to do, trying it out, then correcting it until it does the thing, is prevalent everywhere now. And it’s easy to do.

But the code a Codex creates meets a single need: to ship.

This matches pretty well with what I’ve seen. You can definitely build something, which may be good enough for single-person use. But maintenance is a separate story altogether and raises the old adage that you can only maintain code less sophisticated than your knowledge level. Between that and cognitive overload, you can easily end up with a code base that you can’t understand.

Next Token Selection in Language Models

Ivan Palomares Carrascosa explains how three knobs shape the outputs of a language model:

In this article, you will learn how logits, temperature, and top-p sampling work together to control next-token prediction in large language models.

Topics we will cover include:

  • What logits are and how they are produced by a transformer’s final linear layer.
  • How temperature and top-p (nucleus sampling) shape the probability distribution used for token selection.
  • How these three components fit into a sequential pipeline that governs LLM output generation.

Click through for that explanation.
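
To make the pipeline concrete, here is a minimal sketch of logits to temperature-scaled softmax to top-p filtering to a random draw. It is my own illustration rather than code from Ivan's article: the function names, the four-token vocabulary, and the logit values are all made up, and it uses only the Python standard library.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}  # renormalize the survivors


def sample_next_token(logits, temperature=1.0, top_p=0.9, rng=random):
    """Logits -> temperature-scaled softmax -> nucleus filter -> random draw."""
    nucleus = top_p_filter(softmax(logits, temperature), top_p)
    token_ids = list(nucleus)
    weights = [nucleus[i] for i in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]


# Hypothetical logits for a four-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]
token = sample_next_token(logits, temperature=0.7, top_p=0.9, rng=random.Random(42))
```

Lowering the temperature sharpens the distribution toward the highest logit, and lowering top-p shrinks the set of candidate tokens. With these made-up numbers, a temperature of 0.7 and a top-p of 0.9 leave only the two most likely tokens in play.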

Generating Sample Data in Fabric Dataflows

Chris Webb builds some data:

Back in December the FabricAI.Prompt() M function was released in Fabric Dataflows Gen2. Most of the people writing about it at that time, as in this great post by my colleague Sandeep Pawar, focused on calling this function for each row in a table – something that the UI in the editor makes easy. However the FabricAI.Prompt() function itself is a lot more flexible. You can use it to summarise whole tables of data as I showed here; you can also use it to generate sample data. This is similar to what I blogged about here where I got Copilot to generate M code that returned sample data but using FabricAI.Prompt() is maybe a bit simpler.

Click through to see how.

Finding a Burrito in Ireland

Andrew Pruski has my attention and my interest:

A while back I posted about a couple of side projects that I’ve been working on when I get the chance. One of those was the Burrito Bot…a bot to make burrito recommendations in Ireland 🙂

Over the last 18 months or so I’ve reworked this project to utilise the new vector search functionality in SQL Server 2025…so now it looks like this: –

Andrew also owns the burrito-bot.com domain, as he showed it off live during his presentation at Data Saturdays Chicago. Unfortunately, it seems the service is not up at the moment, so your best bet might be to take the code from the Burrito Bot GitHub repo and build your own.

The Yin and Yang of Language Models

Erik Darling details a series of problems and somewhat-solutions:

This post is inspired by my BFFL Joe Sack’s wonderful post: Keep Humans in the Circle.

It’s an attempt to detail my progression using LLMs, up through how I’m using them today to build my free SQL Server Monitoring and Query Plan analysis tools.

While these days I do find LLMs (specifically Claude Code) to be wonderful enablers for my ideas, they still require quite a bit of guidance and QA, and they’re quite capable of (and sometimes seemingly eager to) wreck your day.

I did enjoy reading about Erik’s journey, so I figure I’ll share something of my own.

Jonathan Stewart (linking SQL Bites because I’m actively trying to shame him into creating a first video) has gone head-first into Claude Code and has dragged me along kicking and screaming. A lot of what Erik mentions resonates well with me, but there’s something that Jonathan developed that has helped: the Hater persona as an MCP server.

The Hater persona’s job is to critique whatever solution the language model comes up with. Find all of the nitpicks, point out the major gaps in implementation, come up with scenarios in which this just won’t work, that kind of thing. I’ve been Jonathan’s Hater-as-a-Service for years, so naturally, he named the MCP server Kevin. Artificial Kevin can iterate to come up with the biggest problems and feed that information back into the main model to fix it up. After several rounds of this, I’ve found that there aren’t nearly as many rough edges as you might find at the start.

Even so, I still stand by the assertion that language models are akin to drunken interns, and the extent to which you trust the output of a language model is on you. But in fairness, hiring the average dev from Fiverr gives you the same experience but a few orders of magnitude slower.
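
For a sense of the shape of that critique loop, here is a rough sketch. To be clear, this is not Jonathan's MCP server: the function names are hypothetical, and the toy critic and reviser are stand-ins for what would really be calls to the critic persona and the main model.

```python
from typing import Callable, List


def refine_with_critic(
    draft: str,
    critique: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Alternate critique and revision until the critic is quiet or rounds run out."""
    for _ in range(max_rounds):
        problems = critique(draft)
        if not problems:
            break
        draft = revise(draft, problems)
    return draft


# Toy stand-ins: a real setup would call a critic persona and the main model here.
def toy_critic(solution: str) -> List[str]:
    problems = []
    if "error handling" not in solution:
        problems.append("no error handling")
    if "tests" not in solution:
        problems.append("no tests")
    return problems[:1]  # report only the biggest problem each round


def toy_reviser(solution: str, problems: List[str]) -> str:
    fixes = {"no error handling": "error handling", "no tests": "tests"}
    return solution + " + " + " + ".join(fixes[p] for p in problems)


result = refine_with_critic("first draft", toy_critic, toy_reviser)
```

The round cap matters: a critic whose whole job is finding fault will rarely run out of complaints, so something other than the critic has to decide when to stop.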

Managing Non-Deterministic Behavior in Language Models

Alexander Arvidsson sets expectations:

You’ve written a prompt. It works beautifully. You ship it to production.

Three days later, someone reports wildly different answers to identical questions. You run the exact same input and get a different result than yesterday. Your test suite passes locally, fails in CI, passes again on re-run.

Welcome back to non-determinism in Large Language Models.

Click through for some practical tips on how you can reduce non-deterministic behavior, as well as the trade-offs of doing so.

Where the Buck Stops

Louis Davidson talks slop:

I loathe the phrase AI Slop. I have said it before, I don’t like the phrase because it is generally attributed to some content that a person has posted. I blame the poster, not the generator. We all use AI these days, just like they used tractors to farm, computers to do accounting work, and CGI to produce movies. These are all tools.

But when I sign my name to something, it is really and truly mine. In this blog, I will discuss this and more. So as the title says, don’t blame AI, Google, a person’s teachers in grade school, nope. Blame the person who said, “This is good enough to put out in my name”, or in other words, the person in the byline. For this post and video, that is Louis Davidson.

I understand where Louis is going with this and it’s fair. When you publish something, the person ultimately responsible looks suspiciously like the picture on your driver’s license. But I think “AI slop” can still serve as a useful descriptive term for a category of garbage output without removing agency from the perpetrator.

Using Database Properties to Assist Generative AI Solutions

Brent Ozar makes use of extended properties:

You can add database instructions as extended properties at the database or object level, and when Copilot works with those objects, it’ll read your instructions and use them to shape its advice.

For example, you can add a database-level property called a “constitution” with your company’s coding standards, like this:

Andy Brownsword has another example:

The new Database Instructions are text stored against database objects to add more context about the object and how it should be used. A simple example:

We find a Sales table with a Price column. Is that the price for a single unit or the line total? Does that include or exclude VAT? What about discounts?

This is where context is king, and Database Instructions allow us to annotate these details and remove the ambiguity.

Database properties are a criminally underused part of SQL Server—in part because there wasn’t great tooling around how to display or work with these properties—and if this forces people to be a bit thoughtful in design and after-the-fact documentation on database objects, so much the better.
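
If you have not worked with extended properties before, here is a small sketch of a helper that builds the T-SQL for the built-in sys.sp_addextendedproperty procedure. The helper itself is my own illustration, and the property names and instruction text are placeholders: the excerpts above do not spell out the exact names Copilot looks for, so check Brent's and Andy's posts for those.

```python
def quote_literal(text: str) -> str:
    """Render text as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def add_property_sql(name, value, schema=None, table=None, column=None):
    """Build an EXEC sys.sp_addextendedproperty statement.

    With no schema, table, or column, the property attaches to the database itself.
    """
    if table and not schema:
        raise ValueError("a table-level property needs a schema")
    if column and not table:
        raise ValueError("a column-level property needs a table")
    args = [f"@name = {quote_literal(name)}", f"@value = {quote_literal(value)}"]
    levels = [("SCHEMA", schema), ("TABLE", table), ("COLUMN", column)]
    for depth, (level_type, level_name) in enumerate(levels):
        if level_name is None:
            break
        args.append(f"@level{depth}type = {quote_literal(level_type)}")
        args.append(f"@level{depth}name = {quote_literal(level_name)}")
    return "EXEC sys.sp_addextendedproperty " + ", ".join(args) + ";"


# Placeholder property names and text; the names Copilot expects are an assumption.
db_sql = add_property_sql("constitution", "Use two-part names in all queries.")
col_sql = add_property_sql(
    "instructions",
    "Unit price, excluding VAT and discounts.",
    schema="dbo",
    table="Sales",
    column="Price",
)
```

Printing col_sql gives a statement you could run to annotate the Price column in Andy's Sales table example. Reading properties back is a query against sys.extended_properties.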
