
The DIFFERENCE() and SOUNDEX() Functions

Hadi Fadlallah looks at two methods of string distance:

Soundex is a phonetic algorithm developed by Robert C. Russell and Margaret King Odell in the early 1900s. This algorithm is used to index names as they are pronounced in English. The main goal of such an algorithm is to encode homophones to the same representation to be matched even if there are some slight spelling differences. As an example, consider the names “Smith” and “Smyth”, or “Mohamad” and “Mouhammad”. Soundex mainly encodes consonants and only encodes a vowel if it is the first letter of the name.
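To make the encoding concrete, here is a minimal Python sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent duplicates, and pad to four characters. The function name `soundex` and the table `CODES` are illustrative choices, not taken from the article, and individual database engines may differ in edge cases such as how "h" and "w" are handled.

```python
# Consonant groups for American Soundex; vowels and h, w, y get no digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return a 4-character Soundex code (illustrative sketch, not an engine's exact logic)."""
    # Keep only ASCII letters; this sketch ignores accented characters.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = CODES.get(letters[0], "")  # the first letter's code also suppresses duplicates

    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code after it is kept

    return result.ljust(4, "0")


print(soundex("Smith"), soundex("Smyth"))        # same code for both spellings
print(soundex("Mohamad"), soundex("Mouhammad"))  # same code for both spellings
```

Both pairs from the paragraph above collapse to a single code each, which is what lets a lookup on the code match either spelling.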

Being one of the most popular phonetic algorithms, Soundex was implemented in multiple database engines such as Oracle, SQL Server, MySQL, SQLite, and PostgreSQL.

Neither function is perfect, and both really do limit you to a single word (or a small group of words), but they are useful.
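As a rough illustration of what a DIFFERENCE()-style score measures, the sketch below compares two four-character Soundex codes position by position and returns a value from 0 (nothing in common) to 4 (identical codes). The helper name `difference_from_codes` is hypothetical, and the position-wise count is an assumption about the scoring; a given engine's DIFFERENCE() may compute its 0 to 4 result differently.

```python
def difference_from_codes(code_a: str, code_b: str) -> int:
    """Count matching positions between two Soundex codes (hypothetical helper)."""
    # Assumes both inputs are already 4-character Soundex codes.
    return sum(1 for a, b in zip(code_a, code_b) if a == b)


# S530 is the usual Soundex code for both "Smith" and "Smyth".
print(difference_from_codes("S530", "S530"))
# R163 ("Robert") and R150 ("Rubin") share only their first two positions.
print(difference_from_codes("R163", "R150"))
```

Because the score is derived from a single code per input, it says little about multi-word strings, which is the limitation noted above.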