Making Text Search Faster

Rob Farley looks at binary collations and specifically hash functions on aggregation:

If there’s an index on the column you’re grouping, then changing the collation is going to hurt a bit. Grouping could take advantage of a Stream Aggregate under our indexed collation, but changing the column is like throwing it away the index order (ORDER BY doesn’t get handled well by changing the collation) means a Hash is required. But comparing two query plans that both use Hash Match (Aggregate), one on a case-insensitive collation and one on a binary collation, then I found the latter was slightly faster. Not as drastic a change as searching, but still 10-30% better. One would run in about 12 seconds, and one in about 10.

Be sure to check out his comments for more details.

Anglicize Values

Dave Mattingly shows an easy way to anglicize values:

If your customer’s name is “José” but you search for “Jose”, you won’t (by default) find him.

Here’s a simple way to take care of that in your SQL database, without changing the data that you have.

If a particularly system only needs to support one language (e.g., English), this can be helpful, at least until somebody throws in Chinese or Hebrew characters.  That said, supporting Unicode is the best move when available.

Categories

May 2017
MTWTFSS
« Apr  
1234567
891011121314
15161718192021
22232425262728
293031