I was a bit perplexed by this issue, since the system was certainly no slouch — plenty of cores, 3TB of memory, and XtremIO storage. And none of these FlushCache messages were ever paired with the 15 second I/O warnings in the error log. Still, if you stack a bunch of high-transaction databases on there, checkpoint processing can get pretty sluggish. Not so much because of the direct I/O, but more reconciliation that has to be done with a massive number of dirty pages (not just from committed transactions) scattered across such a large amount of memory, and potentially waiting on the lazywriter (since there is only one for the whole instance).
Read on for several links and the results of Aaron’s testing.