Regular Expressions In Lucene

Kendra Little looks at Azure Search searches:

I wanted to be able to find all architect jobs using something like ‘%rchit%’ as well, because there’s not a lot of great ways to do this in SQL Server.

In SQL Server, you can use a traditional B-Tree index to seek, but only based on the letters at the beginning of a character column.  If I want to know every business title that contains ‘%rchit%’, I’m going to have to scan an entire index.

SQL Server fulltext indexes don’t solve the double-wildcard problem, either. Fulltext indexes support word prefix searches– so a fulltext index would be great at finding all job titles that contain a word that starts with ‘Arch%’.

Sometimes that’s enough. But a lot of times, you do need to find a substring anywhere in a word. And sometimes you do want to offload that from your database.

This is the kind of problem Lucene (and its follow-up implementations, like Elasticsearch) was designed to solve.  Read on for more details as Kendra solves the problem in Azure Search.

Related Posts

Saving An ADF Pipeline As A Template

Rayis Imayev shares with us how you can save an Azure Data Factory pipeline as a template: Azure Data Factory (ADF) provides you with a framework for creating data transformation solutions in the Microsoft cloud environment. Recently introduced Template Gallery for ADF pipelines can speed up this development process and provide you with helpful information to create additional activity […]

Read More

Improving The Azure Automated AG Experience

Allan Hirt would like to see a few improvements to the experience when creating availability groups on Azure VMs: What does this mean? To have a supported WSFC-based configuration (doesn’t matter what you are running on it – could be something non-SQL Server), you need to pass validation. xFailOverCluster does not allow this to be run. You […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31