Henrietta Dombrovskaya digs into the write-ahead log:
First, when the users fixed one of the primary suspects jobs, the situation with WAL growth didn’t change. Second, the rate of the growth couldn’t be explained by these suboptimal jobs: the data volumes they were removing and reloading were still magnitudes smaller than the WAL size we were dealing with. Finally, I decided to do what I should have done from the start – to take a look at what exactly was in these super-fast growing WALs.
Read on to learn what Henrietta found. Also check out the comments for some additional context.