Press "Enter" to skip to content

APPROX_COUNT_DISTINCT

Niko Neugebauer is happy with a new function in SQL Server 2019:

A rather interesting result takes place if we scale our database to 100GB TPCH and run the very same queries – the total elapsed time jumps to 50% difference (from 30%), the CPU execution time difference is kept at 50%, but the memory grant gives the biggest difference ever – those 24.476 MB are still intact for the APPROX_DISTINCT_COUNT, while the COUNT(DISTINCT) asks for just a bit over 11GB ! Besides going through a completely different gateway on the bigger machines, running COUNT(DISTINCT) will bring your system to a full stop way before the same will take place with the APPROX_DISTINCT_COUNT.
Regarding the precision – in my tests I did not see the difference going over 1%.

Test before using this function, but if you don’t the correct number and can make do with “close enough,” this can save a boatload of memory on larger tables.