2021-04-20 16:02:17

OOMKiller and httpd

How to set up httpd to survive when OOMKiller kills one of its children.

In Copr, we have had a leaking process in our frontend. One route was leaking a few megabytes. The route has a separate child process in httpd, so only that one process was leaking. We still have not identified the culprit, and in the meantime we had to fight with OOMKiller.

A few megabytes here and there, and the process grew too big. We ran out of memory. OOMKiller came and killed the process (as it was the biggest one). Usually, you would not care. Httpd kills its children periodically, and when one is killed, the master process starts a new child immediately. But…

The default OOMPolicy is stop. Here I will quote from man systemd.service:

OOMPolicy= 
Configure the Out-Of-Memory (OOM) killer policy. On Linux, when memory
becomes scarce the kernel might decide to kill a running process in order to free up 
memory and reduce memory pressure. This setting takes one of continue, stop or
kill. If set to continue and a process of the service is killed by the kernel's
OOM killer this is logged but the service continues running. If set to stop the
event is logged but the service is terminated cleanly by the service manager.
If set to kill and one of the service's processes is killed by the OOM killer
the kernel is instructed to kill all remaining processes of the service, too.
Defaults to the setting DefaultOOMPolicy= in systemd-system.conf(5) is set to,
except for services where Delegate= is turned on, where it defaults to continue.

Stop means that once OOMKiller kills the process, systemd stops the rest of the service too: the master process and all the siblings. Effectively systemctl stop httpd. :((

Do you want to try it? Grab mod_oom from Copr project.

Put in httpd.conf:

  LoadModule oom_module modules/mod_oom.so
  <Location /oom>
    SetHandler oom
  </Location>

Then visit http://localhost/oom. It will eat all your memory, and you will see what happens. :) Your httpd service will be stopped.

How to improve this?

You can put OOMPolicy=continue in the [Service] section of httpd.service. In fact, Joe Orton made this the default in Fedora. More information is in BZ 1947475.
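If you would rather not edit the packaged unit file, a systemd drop-in should do the same job. This is only a minimal sketch: the file name oom.conf is my arbitrary choice, and the path assumes the usual drop-in directory for a unit named httpd.service.

```ini
# /etc/systemd/system/httpd.service.d/oom.conf
# (hypothetical file name; any *.conf in this directory is read)
[Service]
# Log the OOM kill, but keep the remaining httpd processes running.
OOMPolicy=continue
```

Running systemctl edit httpd.service creates such a drop-in for you. After writing the file by hand, run systemctl daemon-reload (and restart httpd if in doubt); systemctl show -p OOMPolicy httpd.service should then report the new value.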

The next time you have an OOM in a child process, you will likely not even notice, unless you carefully check the log.

Big thanks go to Joe Orton and Pavel Raiskup for the investigation.


Posted by Miroslav Suchý