266ms was the partitioned table under SQL Server 2016 (compatibility level 120) while 353ms of the total elapsed time was obtained on SQL Server 2014! This represents a solid 25% improvement
All execution plans will have the same iterators, but will differ on the overall estimated cost (the non-partitioned queries will be way lower than the partitioned ones), as well as the distribution of the estimated costs within the execution plan, but as for the rest – it will be quite similar, like the one shown on the image below:
These improvements were swamped by the aggregate predicate pushdown improvements in 2016, at least in Niko’s example, but I’ll take a free 25%-33% performance improvement.