Implementing SoundEx

Kevin Feasel

2016-06-15

Search

Dror Helper shows how to implement SoundEx in C#:

It’s fairly easy to follow the steps of the algorithm (as defined by Wikipedia):

  1. Retain the first letter of the name and drop all other occurrences of a, e, i, o, u, y, h, w.

  2. Replace consonants with digits as follows (after the first letter):

    • b, f, p, v → 1

    • c, g, j, k, q, s, x, z → 2

    • d, t → 3

    • l → 4

    • m, n → 5

    • r → 6

  3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by ‘h’ or ‘w’ are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.

  4. If the word has too few letters to assign three numbers, pad with zeros until there are three. If there are more than three numbers, keep only the first three.
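The four steps above can be sketched in a few lines. This is a minimal Python illustration of the rules as quoted, not Dror Helper's C# code; the names `CODES` and `soundex` are made up for this example, and it assumes plain ASCII names (any character without a digit code is treated like a vowel).

```python
# Letter-to-digit table from step 2 (hypothetical name, for illustration only).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a four-character code: first letter plus three digits."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    # Step 3 also applies to the first letter, so start from its code.
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are dropped and do NOT separate equal codes.
            continue
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            digits.append(code)
        # A vowel resets prev to "", so the same code after a vowel is kept.
        prev = code
    # Step 4: pad with zeros, then keep the letter and three digits.
    return (first.upper() + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    for example in ("Robert", "Rupert", "Ashcraft", "Tymczak", "Pfister"):
        print(example, soundex(example))
```

Tracking only the previous code is enough to cover step 3: skipping h and w leaves the previous code untouched, while a vowel clears it, which is what makes "Ashcraft" collapse its s and c but "Tymczak" keep the k after the vowel.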

SQL Server also supports SOUNDEX as a built-in function.
