Metaphones In SQL

Phil Factor builds a function to generate metaphones in SQL Server:

Metaphone algorithms are designed to produce an approximate phonetic representation, in ASCII, of regular “dictionary” words and names in English and some Latin-based languages. It is intended for indexing words by their English pronunciation. It is one of the more popular of the phonetic algorithms and was published by Lawrence Philips in 1990. A Metaphone is up to ten characters in length.

It is used for fuzzy searches for records where each string to be searched has an index with a Metaphone key. You search for all records with the same or similar metaphone key and then refine the search by some ranking algorithm such as Damerau–Levenshtein distance. Metaphone searches are particularly popular with ‘ancestor’ sites that search on surnames where spellings vary considerably for the same surname. The current version, Metaphone 3, is actively maintained by Lawrence Philips, developed to account for all spelling variations commonly found in English words, first and last names found in the United States and Europe, and non-English words whose native pronunciations are familiar to English-speakers. The source of Metaphone 3 is proprietary, and Lawrence charges a fee to supply the source.

Read on for the script.

Related Posts

Pitfalls with Window Functions

Itzik Ben-Gan takes us through two issues you might run into when using window functions: There are two common pitfalls involving window functions, both of which are the result of counterintuitive implicit defaults that are imposed by the SQL standard. One pitfall has to do with calculations of running totals where you get a window […]

Read More

Spaces in CHAR Columns

John McCormack wants to store a single space in a CHAR(1) column: I was asked by a colleague why his where clause wasn’t being selective when filtering on a space value. The column was a char(1) data type. To understand the curious case of the space in char(1), we need to understand how the char […]

Read More

Categories

February 2017
MTWTFSS
« Jan Mar »
 12345
6789101112
13141516171819
20212223242526
2728