I wonder:
- If ‘max server memory’ weren’t being overridden by a more reasonable target (because max server memory is set to 120 GB on a 32 GB VM), would the behavior still be the same before reaching target? I bet it would be.
- Is this behavior specific to not yet having reached target? Or, once target is reached, would backup buffers be allocated (potentially triggering a shrink of the bpool) and then returned to the OS afterward, requiring the bpool to grow again?
- What other allocations are returned directly to the OS rather than given back to the memory manager to add to free memory? I bet CLR does this, too. Really large query plans? Nah, I bet those go back into the memory manager’s free memory.
- Is this a big deal? I bet it could be, especially on a system prone to developing persistent foreign memory among its NUMA nodes. In most cases, it probably wouldn’t matter.
Good questions, to which I have zero answers.
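That said, the foreign-memory question is at least observable. A rough sketch of where I'd start looking — this just reads the per-node memory DMV, and assumes the `foreign_committed_kb` column of `sys.dm_os_memory_nodes` (present in recent SQL Server versions) reflects the remote-node memory I'm worried about:

```sql
-- Per-NUMA-node committed memory, including foreign (remote-node) commits.
-- Persistently large foreign_committed_kb values would hint at the
-- "persistent foreign memory" scenario speculated about above.
SELECT memory_node_id,
       virtual_address_space_committed_kb,
       foreign_committed_kb
FROM sys.dm_os_memory_nodes
WHERE memory_node_id <> 64;  -- node 64 is the DAC node; exclude it
```

Sampling this before, during, and after a backup (or after a bpool shrink/grow cycle) would be one way to see whether those buffer round-trips actually leave foreign memory behind.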