Press "Enter" to skip to content

Category: Generative AI

Scaling On-Prem Vector Search with Ollama and Nginx

Anthony Nocentino solves a problem:

When you call out to an external embedding service from T-SQL via REST over HTTPS, you’re limited by the throughput of that backend. If you’re running a single Ollama instance, you’ll quickly hit a ceiling on how fast you can generate embeddings, especially for large datasets. I recently attended an event and discussed this topic. My first attempt at generating embeddings was for a three-million-row table. I had access to some world-class hardware to generate the embeddings. When I arrived at the lab and initiated the embedding generation process for this dataset, I quickly realized it would take approximately 9 days to complete. Upon closer examination, I found that I was not utilizing the GPUs to their full potential; in fact, I was only using about 15% of one GPU’s capacity. So I started to cook up this concept in my head, and here we are, load balancing embedding generation across multiple instances of ollama to more fully utilize the resources.

Click through for the solution.

Leave a Comment

Natural Language Querying in SQL Server

Hadi Fadlallah shells out to an API:

Data is usually the most important asset in organizations, but only SQL developers can frequently access that data. Technical teams often write queries for non-technical users. This restricts agility, slows decision-making, and creates a bottleneck in data accessibility. One possible remedy is natural language processing (NLP), which enables users to ask questions in simple English and receive answers without knowing any code. Still, the majority of NLP-to-SQL solutions are cloud-based, which raises issues with cost and privacy.

This particular solution has nothing to do with the embedding features in SQL Server 2025. Instead, it essentially shells out to an Ollama API and runs the resulting SQL query. It’s reasonably neat but I’d have so many qualms putting anything like this into production.

Leave a Comment

T-SQL Tuesday 189 Round-Up

Taiob Ali summarizes this month’s T-SQL Tuesday:

I would like to thank all the participants of T-SQL Tuesday #189. If I missed your post, it was not intentional. Please let me know, and I will add it to this list.

I am proud of this community and feel lucky to be a small part of it. I admire everyone who joined the blog party and shared their thoughts on how AI is changing our careers, as well as your thoughts on AI tools.

Click through to see the responses.

Comments closed

Building a Vector Data Demo Database for SQL Server 2025

Andy Yun has a new demo database:

Today, I have the honor and pleasure of debuting a new presentation for MSSQLTips: A Practical Introduction to Vector Search in SQL Server 2025 (you can watch the recording here too). To accompany that new presentation, I opted to create a new demo database instead of retrofitting one of my existing demo databases. And I’m sharing it with you so you don’t have to go through the headache of taking an existing database and creating vector embeddings.

Click through for Andy’s demo database, which is approximately 16 GB in size, so not a tiny one.

Comments closed

Copilots, MCP Servers, and Connection Strings

Chad Baldwin shares a warning:

Well, a few days ago, I ran into the result of one of those awkward pieces when combining the MSSQL extension for VS Code, MSSQL MCP Server and Copilot.

The short of it is…I asked Copilot to change the connection used by the MSSQL extension to use a particular database. I later asked Copilot to describe a table in the database (which uses the MSSQL MCP server), only for it to claim the table didn’t exist. I realized right away it was due to competing connections between the MSSQL extension and the MSSQL MCP Server configuration. It was also at that moment where I realized this situation could potentially be SO MUCH worse than simply not finding a table…

So let’s set up a worst case scenario and see what happens.

This is basically the equivalent of “Wait, that SSMS window was production? Uh-oh.” Not that this has ever happened to me, of course. Or any of you. Nope.

Comments closed

EchoLeak: Zero-Click Copilot Vulnerability

Alex Woodie reports on a vulnerability:

The Microsoft Copilot vulnerability, dubbed EchoLeak, was listed as CVE-2025-32711 in the NIST’s National Vulnerability Database, which gave the flaw a severity score of 9.3. According to Aim Labs, which discovered EchoLeak and shared its research with the world last week, the “zero-click” flaw could “allow attackers to automatically exfiltrate sensitive and proprietary information from M365 Copilot context, without the user’s awareness, or relying on any specific victim behavior.” Microsoft patched the flaw the following day.

The blog post linked above is pretty interesting. Microsoft has patched the vulnerability, so this particular attack vector shouldn’t be an issue. But it will certainly open up the doors for more fun ways of exploiting generative AI-based services.

Comments closed

Trying out Microsoft Fabric Data Agents

Wolfgang Strasser gives a generative AI solution built into Microsoft Fabric a try:

Today, I wanted to give the new Fabric Data Agents a try. According to the documentation, a Fabric Data Agent is defined as follows:

Data agent in Microsoft Fabric is a new Microsoft Fabric feature that allows you to build your own conversational Q&A systems using generative AI. A Fabric data agent makes data insights more accessible and actionable for everyone in your organization. With a Fabric data agent, your team can have conversations, with plain English-language questions, about the data that your organization stored in Fabric OneLake and then receive relevant answers. This way, even people without technical expertise in AI or a deep understanding of the data structure can receive precise and context-rich answers.

Let’s give it a try and build our first Data Agent.

Click through for the pre-requisites, the setup process, and how everything looked for Wolfgang.

Comments closed

Local Vector Search in SQL Server 2025

Andy Yun gives vector search a try:

With the announcement of SQL Server 2025 Public Preview, hopefully you are interested in test driving Vector Search.

Microsoft has already posted demo code, but it’s only for OpenAI on Azure. But many of us are wondering about running things locally. So I thought I’d share a step-by-step of getting Ollama setup and running locally on my laptop. End-to-end, these instructions should take less than 30 minutes to complete.

Andy’s process involves downloading and running an embedding model to generate the vectors, creating an external model pointing to Ollama, and using it to generate embeddings.

Comments closed

Building a Multi-Agent Orchestrator with Flink and Kafka

Sean Falconer builds an orchestration engine:

Just as some problems are too big for one person to solve, some tasks are too complex for a single artificial intelligence (AI) agent to handle. Instead, the best approach is to decompose problems into smaller, specialized units so that multiple agents can work together as a team.

This is the foundation of a multi-agent system—networks of agents, each with a specific role, collaborating to solve larger problems.

Read on for the overview. There’s also a code repository and a free e-book on the topic.

Comments closed

Model Documentation via Fabric Data Agent

Chris Webb gets some answers:

AI is meant to help us automate boring tasks, and what could be more boring than creating documentation for your Power BI semantic models? It’s such a tedious task that most people don’t bother; there’s also an ecosystem of third party tools that do this job for you, and you can also build your own solution for this using DAX DMVs or the new-ish INFO functions (see here for a good example). That got me wondering: can you use Fabric Data Agents to generate documentation for you? And what’s more, why even generate documentation when you can just ask a Data Agent the questions that you’d need to generate documentation to answer?

For a simple scenario, Chris was able to get pretty solid results. As complexity grows, your mileage may vary.

Comments closed