Press "Enter" to skip to content

The OOM Killer Cometh

Venu Cherukupalli shows how to keep Linux’s Out of Memory Killer from taking down SQL Server:

When an index rebuild was kicked off on a large table (around 25GB), the reindex operation terminated, and the availability group had failed over to the other replica.

Upon further investigation, we discovered that the SQL Server process terminated at the time reindex operation was run and this resulted in the failover.

To determine the reason for the unexpected shutdown, we reviewed the Linux System Logs (/var/log/messages on RHEL) & pacemaker logs. From the pacemaker logs and system logs, we saw entries indicating that oom-killer was invoked, and as a result SQL Server process was terminated.

Read on for the two solutions.  I was hoping for a solution that involved making the SQL Server executable immune from oom-killer’s wily ways, but not so much in this post.