Making Text Search Faster

Rob Farley looks at binary collations and specifically hash functions on aggregation:

If there’s an index on the column you’re grouping, then changing the collation is going to hurt a bit. Grouping could take advantage of a Stream Aggregate under our indexed collation, but changing the column is like throwing it away the index order (ORDER BY doesn’t get handled well by changing the collation) means a Hash is required. But comparing two query plans that both use Hash Match (Aggregate), one on a case-insensitive collation and one on a binary collation, then I found the latter was slightly faster. Not as drastic a change as searching, but still 10-30% better. One would run in about 12 seconds, and one in about 10.

Be sure to check out his comments for more details.

Related Posts

Meidinger’s Law

Eugene Meidinger shares his thoughts on the future: Since we are prognosticating, I want to take a guess at one of the constraints limiting the future.  I present you with Meidinger’s law: An industry’s growth is constrained by how much your junior dev can learn in two years. Let me explain. On my team, one […]

Read More

Using Date Types In Warehouses

Koen Verbeeck argues that date keys in warehouses should be actual date types: The worst are by far the string representation, as there is no actual check on the contents. It can literally contain everything. And is ’01/02/2018′ the first of February 2018 (like any sane person would read, because days come before months), or […]

Read More


March 2016
« Feb Apr »