Python close_fds issue

So I spent the better part of my evening tracking down a bug that turned out to be a “feature” of Python.

I had just installed the GitPlugin for Trac when I started to experience problems. When browsing the source inside Trac, pages took over 30 seconds to load and sometimes failed completely. A lot of searching didn’t help much, so I set about debugging the problem myself. The first thing I noticed was that Apache was pinning the processor at 100% for a good 30 seconds. I attached strace to the process and saw something like this:

[pid 22682] close(43029)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43030)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43031)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43032)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43033)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43034)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43035)                = -1 EBADF (Bad file descriptor)
[pid 22682] close(43036)                = -1 EBADF (Bad file descriptor)

This obviously didn’t look good! After some tinkering I found that the problem went away when I ran Trac standalone instead of under mod_python or fcgi. This turned out to be a bit of a red herring, because I then wasted time trying to figure out what was different between a standalone process and one running inside Apache.

After playing around with environment variables, I gave up and resorted to printf-style debugging of the Trac Git plugin. I found that the actual call to git was taking on the order of seconds, whereas calling it myself from the command line took milliseconds. The line of code (in PyGIT.py) looked a bit like this:

p = Popen(self.__build_git_cmd(git_cmd, *cmd_args), stdin=None, 
        stdout=PIPE, stderr=PIPE, close_fds=True)

Now, when I removed the close_fds argument, the problem went away! After some more digging I found this bug report, which describes the behaviour of close_fds: Python spins in a tight loop calling close() on every possible file descriptor number, whether it is open or not. WTF! You can see the Python code here:

        def _close_fds(self, but):
            os.closerange(3, but)
            os.closerange(but + 1, MAXFD)
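Later Python versions avoid this brute-force sweep by asking the kernel which descriptors are actually open, instead of guessing up to MAXFD. A minimal sketch of that idea, assuming a Linux system with /proc mounted (the helper name open_fds is mine):

```python
import os

def open_fds():
    """Return the file descriptors currently open in this process.

    Reads /proc/self/fd (Linux-specific), so only descriptors that
    actually exist are listed -- no need to loop up to MAXFD and
    collect thousands of EBADF errors.
    """
    return [int(name) for name in os.listdir("/proc/self/fd")]

# stdout and stderr (fds 1 and 2) should normally show up here.
print(sorted(open_fds()))
```

Note that the listing briefly includes the descriptor used to read the /proc/self/fd directory itself, which is harmless for the close-before-exec use case.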

So the simple fix was to remove close_fds, so that Python doesn’t stupidly spin calling close(). I suspect the reason I only noticed this when running inside Apache is that Apache must run with a much larger MAXFD. Hopefully in the future Python will change this behaviour and find a more sensible way to close all file descriptors, especially since this bug report advises changing the close_fds default to true.
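You can check the Apache theory yourself: the MAXFD in the snippet above is derived from sysconf, so printing it from a shell and from inside Apache should show the difference. A quick sketch (the derivation comment reflects Python 2’s subprocess module):

```python
import os

# Python 2's subprocess module derived MAXFD roughly as:
#   MAXFD = os.sysconf("SC_OPEN_MAX")
# With close_fds=True, every spawn made up to this many close() calls.
maxfd = os.sysconf("SC_OPEN_MAX")
print("MAXFD for this process:", maxfd)
```

A web server tuned for many concurrent connections often raises this limit into the tens of thousands, which is why the tight close() loop only hurt under Apache.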
