Robins Tharakan notes an upcoming performance boost:
The hidden cost of knowing too much. That’s one way to describe what happens when your data is skewed, Postgres statistics targets are set high, and the planner tries to estimate a join.
For over 20 years, Postgres has used a simple O(N^2) loop to compare Most Common Values (MCVs) when estimating equi-join selectivity. That worked fine when statistics targets were small (default_statistics_target defaults to 100). But in the modern era, Postgres best practices often recommend cranking that value up, and customers are known to run with higher settings (1000 and sometimes beyond) to handle complex data distributions. Add a 10-join query to the mix, and this “dumb loop” can easily become a silent performance killer during planning.

That changes in Postgres 19.
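To make the cost concrete, here is a minimal standalone sketch of that comparison, loosely modeled on the planner's behavior rather than the actual Postgres source (MCVs are simplified to ints here). Every MCV on one side is checked against every unmatched MCV on the other, so a single join clause can cost up to N * N equality checks at plan time when both columns carry N-entry MCV lists.

```c
/*
 * Illustrative sketch only: loosely modeled on the planner's equi-join
 * MCV comparison, NOT the actual Postgres source. With a statistics
 * target of N, each side's MCV list can hold up to N entries, so one
 * join clause can cost up to N * N equality checks during planning.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    int    *values;  /* the most common values themselves */
    double *freqs;   /* relative frequency of each MCV */
    int     n;       /* list length, bounded by the statistics target */
} MCVList;

/* Sum the joint frequency of values common to both MCV lists: O(n1 * n2). */
static double
mcv_match_freq(const MCVList *a, const MCVList *b)
{
    bool   *matched = calloc(b->n, sizeof(bool));
    double  matchfreq = 0.0;

    for (int i = 0; i < a->n; i++)           /* outer side's MCVs */
    {
        for (int j = 0; j < b->n; j++)       /* inner side's MCVs */
        {
            if (matched[j])
                continue;                    /* each inner MCV pairs once */
            if (a->values[i] == b->values[j])
            {
                matched[j] = true;
                matchfreq += a->freqs[i] * b->freqs[j];
                break;
            }
        }
    }
    free(matched);
    return matchfreq;
}

int
main(void)
{
    int     va[] = {1, 2, 3};
    double  fa[] = {0.5, 0.3, 0.1};
    int     vb[] = {2, 3, 4};
    double  fb[] = {0.6, 0.2, 0.1};
    MCVList a = {va, fa, 3};
    MCVList b = {vb, fb, 3};

    printf("matched MCV frequency: %g\n", mcv_match_freq(&a, &b));
    return 0;
}
```

At the default target of 100 that inner work is bounded by roughly 10,000 comparisons per clause; at a target of 1000 it is up to 1,000,000, and that cost is paid per join clause, every time the query is planned.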
Read on for an example of the problem and a look at the fix that is on the way.