Search – Curated SQL

Vector Search from Scratch

Published 2025-06-11 by Kevin Feasel

In this article, I’ll walk you through every step from generating vector representations to searching using cosine similarity, and we’ll even visualize what’s happening behind the scenes. By the end, you’ll not only understand how vector search works but also have a working implementation you can build on. So, let’s get started.

It’s kind of funny how simple this is, but it is. A lot of the complexity is around data quality operations, as well as optimizing the search process.
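
To give a sense of how little code the core idea needs, here is a minimal sketch of brute-force vector search using cosine similarity. It is not the linked article's implementation: the function names, the three-dimensional vectors, and the document labels are all made up for illustration, and real embeddings would come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def search(query_vec, corpus, top_k=2):
    """Brute-force search: score every document, return the best top_k."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_k]

# Hypothetical toy "embeddings" (assumption: invented values, not model output)
corpus = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax forms": [0.0, 0.2, 0.9],
}

results = search([1.0, 0.2, 0.0], corpus)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

Scoring every document like this is fine for small data sets. The search-optimization work mentioned above is largely about avoiding that full scan once the corpus gets big.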

Semantic Search in PostgreSQL

Published 2025-03-31 by Kevin Feasel

Hans-Jürgen Schönig performs a search:

PostgreSQL offers advanced capabilities through extensions like pgvector, which enable semantic search at a level and quality never achieved before. Unlike traditional text search, which mostly relies on trivial string comparison, semantic search in PostgreSQL goes beyond keywords by understanding context and meaning, enhancing relevance.

The quick idea here is that converting words (or parts of words) into vectors can maintain most of the semantic meaning behind the words. Then, when we perform certain types of vector comparisons, we can take advantage of this semantic meaning and find results whose wording differs from our query but whose concept matches what we want. Click through for the full article.

Knowledge Management via Azure OpenAI

Published 2023-12-01 by Kevin Feasel

Paul Hernandez builds a system:

In this post, I would like to show you how I implemented a simple use case to exemplify how you can query your data by implementing a chat application using Azure OpenAI. Of course, we can do more than answer questions: LLMs are also capable of summarizing texts or extracting entities. I decided to call it “Knowledge Management Assistant”, since I would like to use the application to assist me with some tedious tasks, which consume some of my limited time.

Click through for the process. I would have recommended checking the box for vector search, though I imagine that would have blown past the limitations of the Basic tier of Azure AI Search (née Azure Cognitive Search).

Combining Cosmos DB and Azure Search

Published 2023-08-10 by Kevin Feasel

Hasan Savran does some looking:

In my previous post, I discussed the process of establishing a Free-text search for Azure Cosmos DB. Towards the end, I demonstrated how to carry out a free-text search using the Azure Portal. Now, I will guide you on how to perform this search using code. To perform this search by code, I created a basic console application and added Azure.Search.Documents and Microsoft.Azure.Cosmos.

Click through for that demonstration.

Full-Text Search in Cosmos DB via Cognitive Services

Published 2023-07-28 by Kevin Feasel

Hasan Savran performs a search:

Incorporating Full-Text Search functionality into your application can enable users to locate what they are searching for effortlessly. Searching for specific words or phrases within a database has always been a difficulty, particularly for relational databases. Throughout my career, I’ve had countless discussions/arguments with DBAs about the importance of implementing full-text search in a relational database. We are in totally different times now: users want to search by voice, image, or video.

Full-Text Search functionality is not part of Azure Cosmos DB’s Database Engine. Firstly, we must establish the Azure Cognitive Search service and link the data from Azure Cosmos DB to the Search Service. The process of setting up Azure Cognitive Search is relatively straightforward. Like other Azure services, you will need to answer similar types of questions beforehand. (Subscription, Resource Group, a name for the service, region, and tier)

By the way, Azure Cognitive Search is very similar to Elasticsearch, for those of you familiar with that technology.

A Summary of Full-Text Search in SQL Server

Published 2023-05-10 by Kevin Feasel

Paul Hernandez gives us a primer on full-text search in SQL Server:

Sometimes you want to perform a search using one or more keywords over one or multiple character columns in a table. Clustered, nonclustered or columnstore indexes (organized in a B-Tree structure) will help you with such a task. You can of course use the LIKE operator and do wildcard text searches, but this is still inefficient. Full-text search in SQL Server and Azure SQL lets you perform full-text queries against character-based data in your tables.

Read on to learn more about the topic. I’ve used full-text search with some success once, and my count of failed attempts (cases where I tried to use FTS but it wasn’t a good fit and didn’t work) is a little bit higher. The biggest thing I found was that it struggled with very large numbers of rows: I had tried examples with 50-100 million rows and the index never finished building.
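
For a rough feel of why a wildcard LIKE search and a full-text index behave so differently, here is a small sketch contrasting a row-by-row substring scan with an inverted index that maps words to row ids. This is a simplified illustration, not SQL Server's actual implementation; the function names and sample rows are invented for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(rows):
    """Map each word to the set of row ids containing it (an inverted index)."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for token in tokenize(text):
            index[token].add(row_id)
    return index

def like_scan(rows, keyword):
    """Rough analogue of LIKE '%keyword%': inspects every row on every query."""
    needle = keyword.lower()
    return {row_id for row_id, text in rows.items() if needle in text.lower()}

def index_search(index, *keywords):
    """Return row ids containing all keywords, using index lookups only."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*sets) if sets else set()

# Hypothetical sample rows (assumption: made up for this sketch)
rows = {
    1: "Full-text search in SQL Server",
    2: "Wildcard searches with the LIKE operator",
    3: "Building a full-text index on large tables",
}

index = build_index(rows)
print(index_search(index, "full", "text"))
print(like_scan(rows, "search"))
print(index_search(index, "search"))
```

The scan touches every row for each query, while the index does the expensive work once up front. The index in this sketch matches whole words only, so "search" does not find "searches"; real full-text engines typically add word-form matching and stopword handling on top of this.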

Creating an Elasticsearch Pipeline

Published 2023-03-16 by Kevin Feasel

The Big Data in Real World team builds a pipeline:

A pipeline is a definition of a series of processors that are to be executed in the same order as they are declared.

Think of a processor as a series of instructions that will be executed.

In this post we are going to create a pipeline to add a field named doc_timestamp to all the documents that are added to the index.

Click through for the process. In Elasticsearch, ingest pipelines aren’t for moving data but rather for performing some common operations or tasks prior to indexing the data.
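
As a sketch of what such a pipeline might look like, the snippet below builds a pipeline body with a single `set` processor that writes the ingest timestamp into doc_timestamp. The `set` processor and the `_ingest.timestamp` metadata field are standard Elasticsearch features, but whether the linked post uses exactly this processor is an assumption. The `apply_pipeline` function is a hypothetical local stand-in that mimics running processors in declared order; it is not part of any Elasticsearch client.

```python
import json
from datetime import datetime, timezone

# Pipeline body of the kind sent to PUT _ingest/pipeline/<name>
# (assumption: the linked post may define its pipeline differently)
pipeline = {
    "description": "Add doc_timestamp to every incoming document",
    "processors": [
        {"set": {"field": "doc_timestamp", "value": "{{_ingest.timestamp}}"}}
    ],
}

def apply_pipeline(pipeline, doc):
    """Toy local stand-in: run each 'set' processor in declared order."""
    result = dict(doc)  # leave the caller's document untouched
    ingest_time = datetime.now(timezone.utc).isoformat()
    for processor in pipeline["processors"]:
        for kind, config in processor.items():
            if kind != "set":
                raise ValueError(f"unsupported processor in this sketch: {kind}")
            value = config["value"]
            if value == "{{_ingest.timestamp}}":
                value = ingest_time  # substitute the template locally
            result[config["field"]] = value
    return result

print(json.dumps(pipeline, indent=2))
doc = apply_pipeline(pipeline, {"title": "hello"})
print(doc)
```

In a real cluster, the transformation happens server-side before indexing, once the pipeline is registered and referenced at index time.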

Role-Based Access Controls in Amazon OpenSearch

Published 2023-03-16 by Kevin Feasel

Scott Chang and Muthu Pitchaimani show how to assign rights in Amazon OpenSearch to IAM groups:

Amazon OpenSearch Service is a managed service that makes it simple to secure, deploy, and operate OpenSearch clusters at scale in the AWS Cloud. AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. To build a strong least-privilege security posture, customers also wanted fine-grained access control to manage dashboard permission by user role. In this post, we demonstrate a step-by-step procedure to implement IAM Identity Center to OpenSearch Service via native SAML integration, and configure role-based access control in OpenSearch Dashboards by using group attributes in IAM Identity Center. You can follow the steps in this post to achieve both authentication and authorization for OpenSearch Service based on the groups configured in IAM Identity Center.

Click through for the process.

Understanding Azure Cognitive Search Costs

Published 2023-03-14 by Kevin Feasel

Matt Eland doesn’t want to break the bank:

Let’s continue my recent trend in exploring pricing tips for the various parts of AI and Machine Learning on Azure with a dive into Azure Cognitive Search.

Sometimes confused with the AI offerings of Azure Cognitive Services, the entirely different Azure Cognitive Search is a rich service that allows you to index a variety of files and documents, extract meaning from those documents, and provide rich search results to users.

In this article we’ll explore the pricing structure of Azure Cognitive Search and highlight some things you should be aware of as you plan and develop your Cognitive Search resources.

Read the whole thing if you’re thinking of using Azure Cognitive Search. It’s a good service and I think the pricing model is fairly straightforward, though there are always nuances to these things.

Semantic Search in Azure Cognitive Search

Published 2021-03-03 by Kevin Feasel

Rangan Majumder, et al, have an article on how semantic search works in Azure Cognitive Search:

As part of our AI at Scale effort, we lean heavily on recent developments in large Transformer-based language models to improve the relevance quality of Microsoft Bing. These improvements allow a search engine to go beyond keyword matching to searching using the semantic meaning behind words and content. We call this transformational ability semantic search—a major showcase of what AI at Scale can deliver for customers.

Semantic search has significantly advanced the quality of Bing search results, and it has been a companywide effort: top applied scientists and engineers from Bing leverage the latest technology from Microsoft Research and Microsoft Azure. Maximizing the power of AI at Scale requires a lot of sophistication. One needs to pretrain large Transformer-based models, perform multi-task fine-tuning across various tasks, and distill big models to a servable form with very minimal loss of quality. We recognize that it takes a large group of specialized talent to integrate and deploy AI at Scale products for customers, and many companies can’t afford these types of teams. To empower every person and every organization on the planet, we need to significantly lower the bar for everyone to use AI at Scale technology.

Click through to learn more about the technology.

Category: Search