Fuzzy Searches In SQL Server

Kevin Feasel

2017-02-24

Data

Phil Factor wants fuzzy searches done inside the relational database:

Many times in the past, I’ve had arguments with members of the development teams who, when we are discussing fuzzy searches, draw themselves up to their full height, look dignified, and say that a relational database is no place to be doing fuzzy searches or spell-checking. It should, they say, be done within the application layer. This is nonsense, and we can prove it with a stopwatch.

We are dealing with data. Relational databases do this well, but it just has to be done right. This implies searching on well-indexed fields such as the primary key, and not being ashamed of having quite large working tables. It means dealing with the majority of cases as rapidly as possible. It implies learning from failures to find a match. It means, most of all, a re-think from a procedural strategy.

This is a very interesting article, as Phil’s tend to be.  I enjoy these types of solutions where it requires almost an inversion of mindset:  instead of writing code which understands the data you intended, writing simpler code which looks at intention-laden data.

Related Posts

Kaggle-Maintained Data

Noah Daniels announces Maintained by Kaggle data sets: The “Maintained by Kaggle” badge means that Kaggle is now and will continue to actively maintain that dataset. This includes regular updates to descriptions and metadata, quicker response rates in discussion, and accurate current data from the source. Our goal is to create seamless workflows that allow […]

Read More

t-closeness And Data Anonymity

John Cook shares some thoughts about k-anonymity and t-closeness: The idea of k-anonymity is that every database record appears at least k times. If you have a lot of records and few fields, your value of k could be high. But as you get more fields, it becomes more likely that a combination of fields is unique. If k = 1, then k-anonymity offers […]

Read More

Categories

February 2017
MTWTFSS
« Jan Mar »
 12345
6789101112
13141516171819
20212223242526
2728