Making Text Search Faster

Rob Farley looks at binary collations and specifically hash functions on aggregation:

If there’s an index on the column you’re grouping, then changing the collation is going to hurt a bit. Grouping could take advantage of a Stream Aggregate under our indexed collation, but changing the column is like throwing it away the index order (ORDER BY doesn’t get handled well by changing the collation) means a Hash is required. But comparing two query plans that both use Hash Match (Aggregate), one on a case-insensitive collation and one on a binary collation, then I found the latter was slightly faster. Not as drastic a change as searching, but still 10-30% better. One would run in about 12 seconds, and one in about 10.

Be sure to check out his comments for more details.

Anglicize Values

Dave Mattingly shows an easy way to anglicize values:

If your customer’s name is “José” but you search for “Jose”, you won’t (by default) find him.

Here’s a simple way to take care of that in your SQL database, without changing the data that you have.

If a particularly system only needs to support one language (e.g., English), this can be helpful, at least until somebody throws in Chinese or Hebrew characters.  That said, supporting Unicode is the best move when available.

Categories

July 2017
MTWTFSS
« Jun  
 12
3456789
10111213141516
17181920212223
24252627282930
31