
Subprocess zombies

Cleaning up your errant commandlines.

Symptoms

I was using the subprocess module to call some external command-line programs from a web service. The intended effect was that the output of the command line (e.g. fasta -O testseq.fasta refset.fasta ...) would be captured and displayed on the web page.

Problems ensued: at first the results took too long to generate and the page timed out. But even after speeding up the query (and increasing the timeout period for the page), I was still seeing requests go comatose, even trivial ones.

Diagnosis

It's all to do with subprocess. Developers could reasonably assume that if they delete or dispose of the Popen object created by subprocess, the underlying process is disposed of too. It isn't necessarily. In CPython's implementation (at least on POSIX systems), a Popen object that is destroyed while its child is still running gets parked on a module-level list, subprocess._active, and a cleanup routine polls that list the next time a Popen is created. Nothing stops the child in the meantime. The net effect is that a lot of running processes, or terminated but unreaped ones (zombies), can accumulate.
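A small illustration of that lifecycle, assuming Python 3 and using a throwaway interpreter (via sys.executable) as a stand-in for the real external program. It only shows that a Popen object does not learn its child's exit status until something asks for it; it is a sketch, not a reproduction of the web service problem.

```python
import subprocess
import sys

# Stand-in child: a Python interpreter that exits immediately.
proc = subprocess.Popen([sys.executable, "-c", "pass"])

# Nothing has asked for the exit status yet, so Popen has not recorded one,
# even if the child has already finished.
status_before = proc.returncode

# wait() blocks until the child exits, collects its status, and lets the
# operating system release the process-table entry.
status_after = proc.wait()

print(status_before, status_after)
```

Until wait() (or poll(), or communicate()) is called, a finished child stays in the process table, which is what shows up as a zombie.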

Solution

There's no good answer, but you may want to do this to your Popens:

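import subprocess

proc = subprocess.Popen(...)
# ... use it ...
if proc.poll() is None:
    # Still running: send SIGTERM (on POSIX, same as os.kill(proc.pid, signal.SIGTERM))
    proc.terminate()
proc.wait()  # collect the exit status so the child does not linger as a zombie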
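The same idea can be wrapped in a helper. This is a minimal sketch assuming Python 3.3 or later (for the timeout argument to wait()); the name stop_process and the grace_seconds parameter are made up for illustration, and the sleeping child is a stand-in for a long-running external program.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Terminate proc if it is still running, then reap it.

    Returns the child's exit code.
    """
    if proc.poll() is None:
        proc.terminate()  # polite request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # forceful stop (SIGKILL on POSIX)
    # Always wait, so the exit status is collected and no zombie is left.
    return proc.wait()


# Stand-in for a long-running external program.
sleeper = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)
exit_code = stop_process(sleeper)
print(exit_code)
```

The final wait() matters as much as the signal: terminating a child without collecting its status still leaves a zombie behind.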

Further complications

subprocess can trip you up in other ways. For example, a call that runs fine on one computer or with one set of parameters may appear to hang on a different computer or with different parameters. This occurs at the wait stage:

proc = subprocess.Popen(..., stdout=subprocess.PIPE, stderr=subprocess.PIPE)
result = proc.wait()  # hangs here if the child fills a pipe

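The problem is to do with I/O buffers. With stdout=PIPE and stderr=PIPE, the program's output goes into operating-system pipes of limited capacity. Once a pipe is full, the program blocks on its next write until the output is consumed (i.e. read). But the parent is sitting in wait() and never reads, so both sides hang. Whether this happens depends on how much output a particular run produces, which is why it varies with parameters and machines.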

There are a number of solutions for this, none elegant. You can continually read the buffers, emptying them so more can be put in. You can create open file handles and pass these to stdout and stderr, for the output to be directed there. You can slip a redirect into the commandline:

myprog --arg1 --arg2 > redirect.txt

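although that will only take care of stdout, and it only works if the command is run through a shell (shell=True). Finally, leaving both stdout and stderr set to None (which is the default) avoids the hang as well: the child then inherits the parent's output streams and no pipe is involved, but you can no longer capture its output.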
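Two of these options, sketched under the assumption of Python 3. The noisy child below is a stand-in that writes more than a typical pipe buffer holds to both streams. Popen.communicate() is the standard-library way of "continually reading the buffers"; the temporary files illustrate the open-file-handle approach.

```python
import subprocess
import sys
import tempfile

# Stand-in child that writes a lot of output to both stdout and stderr.
noisy = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('o' * 200000); sys.stderr.write('e' * 200000)",
]

# Option 1: communicate() reads both pipes until end-of-file, then waits
# for the child, so neither pipe can fill up and block it.
proc = subprocess.Popen(noisy, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()

# Option 2: direct the output to real files, so there is no pipe to fill.
with tempfile.TemporaryFile() as out_file, tempfile.TemporaryFile() as err_file:
    proc2 = subprocess.Popen(noisy, stdout=out_file, stderr=err_file)
    proc2.wait()  # safe here: the child writes to files, not pipes
    out_file.seek(0)
    file_out = out_file.read()

print(len(out), len(err), len(file_out))
```

communicate() holds all the output in memory, so for very large results the file-based variant may be the better fit.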
