Cardinality Estimation And String Splits

Dan Holmes points out a quirk of estimated row counts with CLR-based functions:

That is an enormous amount of data.  What if you needed to sort that?  What if you joined this to another table or view and a spool was required.  What it it was a hash join and a memory grant was required?  The demand that this seemingly innocuous statement placed on your server could be overwhelming.

The memory grant could create system variability that is very difficult to find.  There is a thread on MSDN that I started which exposes what prompted this post.  (The plan that was causing much of the problem is at this link.)

It’s important to keep in mind the good enough “big round figures” that SQL Server uses for row estimation when stats are unavailable (e.g., linked server to Hive or a CLR function like in the post).  These estimates aren’t always correct, and there are edge cases like the one in the post in which the estimates will be radically wrong and begin to affect your server.

Related Posts

Hiding Work: The Nested Loop Operator

Erik Darling explains that the nested loop operator is like a duck: there’s more going on beneath the surface than it lets on: I’m going to talk about my favorite example, because it can cause a lot of confusion, and can hide a lot of the work it’s doing behind what appears to be a […]

Read More

Stored Procedure IF Branching and Performance

Erik Darling explains that the IF block in a stored procedure won’t help you with performance: Making plan choices with IF branches like this plain doesn’t work.The optimizer compiles a plan for both branches based on the initial compile value.What you end up with is a stored proc that doesn’t do what it’s supposed to […]

Read More

Categories

February 2016
MTWTFSS
« Jan Mar »
1234567
891011121314
15161718192021
22232425262728
29