Paul White takes us through several performance improvements around aggregate pushdown:
SQL Server 2016 introduced serial batch mode processing and aggregate pushdown. When pushdown is successful, aggregation is performed within the Columnstore Scan operator itself, possibly operating directly on compressed data, and taking advantage of SIMD CPU instructions.
The performance improvements possible with aggregate pushdown can be very substantial. The documentation lists some of the conditions required to achieve pushdown, but there are cases where the lack of ‘locally aggregated rows’ cannot be fully explained from those details alone.
This article covers additional factors that affect aggregate pushdown for GROUP BY queries only. Scalar aggregate pushdown (aggregation without a GROUP BY clause), filter pushdown, and expression pushdown may be covered in a future post.
Read the whole thing.