Press "Enter" to skip to content

Memory-Optimized Hash Indexes And Bucket Counts

Jack Li relates a story about how it’s important to think through bucket count for memory-optimized hash indexes:

I was working with a customer to troubleshoot memory optimized table issues.  In this scenario, our customer uses a memory optimized table variable.  He put 1 million rows of data into the table variable and then process it.  Based on what he said, I tried to come up with a repro to see if I can duplicate the problem.   While troubleshooting that issue, I ran into another issue where I can’t even insert 1 million row into a memory optimized table variable.

My rule of thumb is 30-50% more buckets than expected rows.  That way I can handle some level of unexpected growth while keeping the chance of a hash collision and really slow linked list scan as low as possible.  The official guidance says performance should be “acceptable” with up to 5x rows per buckets, but my experience has been that “acceptable” is a generous term at that point.