APPROX_COUNT_DISTINCT

Kevin Feasel

2018-10-15

T-SQL

Niko Neugebauer is happy with a new function in SQL Server 2019:

A rather interesting result takes place if we scale our database to 100GB TPCH and run the very same queries – the total elapsed time jumps to 50% difference (from 30%), the CPU execution time difference is kept at 50%, but the memory grant gives the biggest difference ever – those 24.476 MB are still intact for the APPROX_DISTINCT_COUNT, while the COUNT(DISTINCT) asks for just a bit over 11GB ! Besides going through a completely different gateway on the bigger machines, running COUNT(DISTINCT) will bring your system to a full stop way before the same will take place with the APPROX_DISTINCT_COUNT.
Regarding the precision – in my tests I did not see the difference going over 1%.

Test before using this function, but if you don’t the correct number and can make do with “close enough,” this can save a boatload of memory on larger tables.

Related Posts

Faster Scalar Functions In SQL Server 2019

Brent Ozar looks at improvements the SQL Server team has made to scalar functions in 2019: My database has to be in 2019 compat mode to enable Froid, the function-inlining magic. Run the same query again, and the metrics are wildly different: Runtime: 4 seconds CPU time: 4 seconds Logical reads: 3,247,991 (which still sounds bad, […]

Read More

Uncovering Complexity In SQL Objects

Michael J. Swart helps us uncover hidden complexity in database objects: The other day, Erin Stellato asked a question on twitter about the value of nested SPs. Here’s how I weighed in: I’m not a fan of nested anything. Too much hidden complexity. Code reusability leads to queries that are jack of all trades, master […]

Read More

Categories

October 2018
MTWTFSS
« Sep Nov »
1234567
891011121314
15161718192021
22232425262728
293031