
Category: Vectors

Generating Local Text Embeddings in SQL Server 2025

Greg Low continues a series on local text embeddings:

In the first article of this series, I explained how to install and configure Ollama to host text embeddings models locally. I also demonstrated how to install Caddy as a proxy to allow SQL Server to use Ollama via https-based calls. In this article, I’ll show you how to make use of this at the SQL Server end.

Greg mentions a few embedding models, but the one I’m pushing right now is Microsoft Research’s Harrier OSS v1 model, specifically the 600M-parameter version. It does extremely well in the MTEB leaderboards (the 27B variant is at the top of the board as of June 2026) and has a permissive license. It generates vectors in 1024 dimensions, so the embeddings are a bit chunky, taking up 4 KB apiece. But the results are really good.
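
As a rough check on that size, here is a minimal Python sketch. It assumes each dimension is stored as a 4-byte single-precision float and ignores any per-value overhead a database engine may add on top; the function name is made up for illustration.

```python
import struct


def embedding_size_bytes(dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw payload size of one embedding, excluding any storage overhead."""
    return dimensions * bytes_per_value


# Pack a dummy 1024-dimension vector as little-endian float32 values
# to confirm the arithmetic against an actual byte string.
dummy = [0.0] * 1024
packed = struct.pack(f"<{len(dummy)}f", *dummy)

size = embedding_size_bytes(1024)
print(size, len(packed))  # 4096 4096
```

At 4,096 bytes per row, a million embeddings comes to roughly 4 GB of raw vector data before indexes.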


Vector Search with Oracle against Iceberg Tables

Brendan Tierney performs a search:

In my previous blog posts I’ve explored how to use Iceberg Tables and how to integrate these in with your Data Lake. Additionally, I showed how to set up your Oracle Data Lake (Database) to access the data stored in Iceberg Tables stored in OCI Object Storage. To access this Iceberg Table data from the Oracle Database we created an External Table. This allows us to query the Iceberg Table data as if it was internal to the database. With all new releases there is continuous improvement in the features and to make them easier to use. One such new feature (as of 23.26.1) is the ability to read vector data types from an External Table. This new feature is called or referred to as ‘Vectors on Ice’.

With Oracle Database External Tables now supporting vector embeddings stored in Iceberg Tables, you can generate vector embeddings with your preferred embedding model (external to Oracle using your favourite tool/library), store them in Iceberg Tables in cloud object storage (OCI Object Storage, AWS S3, etc.), and run semantic search from Oracle AI Database, accessing vector data stored within the database and externally with the minimum of data movement and with similar SQL queries.

Click through to see how.


Hosting a Local Embedding Service for SQL Server 2025

Greg Low has a demo:

Instead of working directly with text, images, or other rich content, an embedding represents that content as a set of numbers that capture semantic relationships learned by a model. This lets systems work with meaning in a mathematical way rather than relying on concepts like text matching.

Embeddings are the output of trained text-based AI models. One possibly surprising concept is that they’re used for similarity, as opposed to facts – and for relative closeness instead of exact matches.

SQL Server is not designed to host or execute those models. This isn’t a limitation; it is a design choice. While it would be possible to run code within SQL Server to generate embeddings, it just wouldn’t be a good idea.

Note that SQL Server can already run machine learning models directly, and use the PREDICT statement to make predictions. We don’t want to be doing that with the language models we need for embeddings, though.

There are also some really good local embedding models, and between that and the vector similarity functionality in SQL Server, you can perform semantic search without needing to reach out to an online service.
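
To make the "relative closeness" idea concrete, here is a minimal Python sketch of cosine similarity, one common way to compare embeddings. The three-dimensional vectors and their labels are invented toy data (real embeddings have hundreds or thousands of dimensions), and this is not a description of how SQL Server computes similarity internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, made up for illustration only.
documents = {
    "burrito recipe": [0.9, 0.1, 0.0],
    "taco recipe": [0.8, 0.3, 0.1],
    "log shipping": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

# Rank documents from most to least similar to the query.
ranked = sorted(
    ((name, cosine_similarity(query, vec)) for name, vec in documents.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Nothing here is an exact match: the ranking only says which documents point in roughly the same direction as the query, which is the sense in which embeddings capture similarity rather than facts.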


Vector Chunking and SQL Server 2025

Greg Low breaks down a document:

If you’ve started to work with vector databases and looked at using text embeddings for AI search, you might have come across the term chunking and wondered what it relates to. In this article, I’ll explain the concept in general – and then show how it works in SQL Server 2025.

Read on for that explanation. Greg also includes a quick example of how this looks in SQL Server 2025 when passing text data through an embedding model.
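
For a feel of what chunking means in general, here is a minimal Python sketch that splits text into fixed-size, overlapping runs of words. The function name, the word-based splitting, and the default sizes are all assumptions chosen for illustration; this is not the approach from Greg’s article or the way SQL Server 2025 does it.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of up to chunk_size words, each sharing
    `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be between 0 and chunk_size - 1")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text, so we do not
        # emit a trailing chunk made up only of already-covered words.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk would then be passed to the embedding model separately. The overlap keeps a sentence that straddles a boundary from losing its context in both neighbours.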


Visualizing High-Dimensional Vectors

Andrew Pruski takes a look:

Following on from my previous post on building The Burrito Bot, I want to delve into visualisation of vector embeddings that were generated from the restaurant data pulled from Google Maps.

Those embeddings had 1536 dimensions, each dimension corresponding to an axis within a high dimensional space, with embeddings that have similar meanings grouped together in that high dimensional space.

1536 dimensions…is a lot of dimensions! And for me, a hard concept to get my head around. It all just feels so abstract (to me anyway), I want to see what they actually look like!

Click through for a link to a website that helps with that visualization. It ultimately performs principal component analysis (PCA) to get 1536 (or however many) dimensions down to 3 principal components. It’s not perfect, but it does give us the ability to reason over the data.
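
For the curious, here is a minimal pure-Python sketch of the PCA idea: centre the data, then find the directions of greatest variance by power iteration. It is an approximation written for readability, not the implementation the linked website uses, and the function names, iteration count, and tolerance are assumptions. A numerical library would be the practical choice for real 1536-dimension data.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_project(vectors, n_components=3, iterations=100, seed=0):
    """Project vectors onto their (approximate) top principal components."""
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])

    # Centre each dimension on its mean.
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]
    total = sum(_dot(row, row) for row in centered)  # overall variance scale

    components = []
    for _ in range(n_components):
        # Start from a random unit vector.
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(_dot(w, w))
        w = [x / norm for x in w]
        for _ in range(iterations):
            # Multiply by X^T X without building the d-by-d matrix.
            scores = [_dot(row, w) for row in centered]
            w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(d)]
            # Remove any part already explained by earlier components.
            for c in components:
                shared = _dot(w, c)
                w = [x - shared * y for x, y in zip(w, c)]
            norm = math.sqrt(_dot(w, w))
            if norm <= 1e-10 * total:
                w = [0.0] * d  # no variance left in this direction
                break
            w = [x / norm for x in w]
        components.append(w)

    return [[_dot(row, c) for c in components] for row in centered]


# Toy data: 10 made-up vectors with 8 dimensions each.
rng = random.Random(42)
toy = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(10)]
for point in pca_project(toy):
    print([round(x, 3) for x in point])
```

Each input vector comes out as three coordinates that can be plotted directly. Distances in the plot only approximate distances in the original space, which is why the result is useful for reasoning over the data but not perfect.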


The State of Vector Indexes in SQL Server 2025

Rebecca Lewis separates marketing hype from reality:

Microsoft’s entire marketing pitch for SQL Server 2025 is ‘the AI-ready database.’ It went GA on November 18, 2025. We are now four months in. Here is what is actually GA, what is still behind a preview flag, and what that means if you are evaluating this for production.

Read on for a list, as well as a summary of Erik Darling’s great work on the topic.

My take on this is that vector indexes are where columnstore indexes were in SQL Server 2012: a neat idea, but not ready for prime time. It took until 2016 before columnstore indexes were actually worthwhile (primarily, the introduction of clustered columnstore indexes and ability to rebuild indexes), so we’ll see if it takes as long for vector indexes to get all of the necessary functionality.


Performing Log Shipping between SQL Server Versions

Greg Low answers a question:

One of the discussion lists that I participate in had a brief discussion the other day about whether or not it’s possible to perform log shipping between different versions of SQL Server.

Specifically, can you do log shipping between SQL Server 2017 and SQL Server 2025?

Because the question was not in the header, it does not violate Betteridge’s Law of Headlines. Well done, Greg.


Vectors and Columnstore Indexes

Niko Neugebauer continues a series on columnstore indexes:

In this post we are going to test one of the more promising technologies in SQL Server-based offerings – the Vector data type and its relationship with Columnstore Indexes. The tests I am running right now are executed against SQL Server 2025 RTM, the latest and greatest SQL Server version available to customers. Given that some parts of SQL Server 2025 were delivered as Preview Features, the current situation might change in the future for SQL Server 2025 (at least, Half-precision float support should evolve into a fully supported feature, in my opinion). At the very least, I do expect reasonably fast evolution of the space on Azure SQL Database & Azure SQL Managed Instance.

This seems like more pain than joy, which is the unfortunate reality of v1 features in SQL Server these days.
