Have subprocess.Popen only wait on its child process to return, but not any grandchildren

I have a python script that does this:

p = subprocess.Popen(["python3", "pythonscript.py"], stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=False)
theStdin=request.input.encode('utf-8')
(outputhere,errorshere) = p.communicate(input=theStdin)

It works as expected: it waits for the subprocess to finish via p.communicate(). However, within pythonscript.py I want to "fire and forget" a "grandchild" process. I'm currently doing this by overriding the join method:

class EverLastingProcess(Process):
    def join(self, *args, **kwargs):
        pass  # Override join so it doesn't block; otherwise the parent waits.
    def __del__(self):
        pass

And starting it like this:

p = EverLastingProcess(target=nameOfMyFunction, args=(arg1, etc,), daemon=False)
p.start()

This also works fine when I run pythonscript.py in a bash terminal or bash script: control returns and a response is produced while the process started by EverLastingProcess keeps going. However, when I run pythonscript.py with Popen as shown above, the timings suggest that Popen is waiting on the grandchild to finish. How can I make it so that Popen only waits on the child process, and not on any grandchild processes?
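One likely explanation for the blocking (an assumption, but consistent with the timings): communicate() waits not only for the child to exit but also for EOF on the stdout/stderr pipes, and a grandchild that inherits those pipe descriptors keeps them open. A minimal POSIX-only sketch, using a hypothetical inline child script run via python -c, shows that once the grandchild redirects its inherited descriptors, the parent call returns promptly even though the grandchild is still sleeping:

```python
import subprocess
import sys
import textwrap
import time

# Hypothetical child script: it forks a grandchild that sleeps for 3 seconds.
# When "detach" is passed, the grandchild points its stdout/stderr at
# /dev/null, so it stops holding the caller's capture pipes open.
child_src = textwrap.dedent("""
    import os, sys, time
    if os.fork() == 0:                    # grandchild
        if "detach" in sys.argv:
            devnull = os.open(os.devnull, os.O_RDWR)
            os.dup2(devnull, 1)           # release the inherited stdout pipe
            os.dup2(devnull, 2)           # release the inherited stderr pipe
        time.sleep(3)                     # long-running "fire and forget" work
        os._exit(0)
    print("child done")                   # only the child prints this
""")

start = time.monotonic()
subprocess.run([sys.executable, "-c", child_src, "detach"], capture_output=True)
elapsed = time.monotonic() - start
print("detached grandchild: returned in about", round(elapsed, 1), "s")
```

Dropping the "detach" argument makes the same call block for the grandchild's full 3-second sleep, because the capture pipes only reach EOF when the grandchild exits.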



Solution 1:[1]

The other solution (overriding join, with the shell=True addition; see Solution 2 below) stopped working when we upgraded our Python recently.

There are many references on the internet covering pieces of this, but it took some doing to assemble a working solution to the entire problem.

The following solution has been tested in Python 3.9.5 and 3.9.7.

Problem Synopsis

The names of the scripts match those in the code example below.

A top-level program (grandparent.py):

  • Uses subprocess.run or subprocess.Popen to call a program (parent.py)
  • Checks return value from parent.py for sanity.
  • Collects stdout and stderr from the main process 'parent.py'.
  • Does not want to wait around for the grandchild to complete.

The called program (parent.py):

  • Might do some stuff first.
  • Spawns a very long process (the grandchild - "longProcess" in the code below).
  • Might do a little more work.
  • Returns its results and exits while the grandchild (longProcess) continues doing what it does.

Solution Synopsis

The important part isn't so much what happens with subprocess; the critical part is how the grandchild/longProcess is created. It is necessary to ensure that the grandchild is truly emancipated from parent.py.

  • Subprocess only needs to be used in a way that captures output.
  • The longProcess (grandchild) needs the following to happen:
    • It should be started using multiprocessing.
    • It needs multiprocessing's 'daemon' set to False.
    • It should also be invoked using the double-fork procedure.
    • In the double-fork, extra work needs to be done to ensure that the process is truly separate from parent.py. Specifically:
      • Move the execution away from the environment of parent.py.
      • Use file handling to ensure that the grandchild no longer uses the file handles (stdin, stdout, stderr) inherited from parent.py.
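The "truly separate" requirement above can be verified directly. The sketch below (POSIX-only; the report path is a placeholder chosen for this example) performs the same double fork with setsid and then checks that the daemon ends up in a different session from the original process:

```python
import os
import time

# Placeholder path where the daemon publishes its session id for inspection.
report = "/tmp/daemon_check"
if os.path.exists(report):
    os.remove(report)

if os.fork() == 0:                # first fork: intermediate child
    os.setsid()                   # new session, no controlling terminal
    if os.fork() == 0:            # second fork: the emancipated daemon
        tmp = report + ".tmp"
        with open(tmp, "w") as f:
            f.write(f"daemon sid={os.getsid(0)}\n")
        os.replace(tmp, report)   # atomic publish so readers never see a partial file
        os._exit(0)
    os._exit(0)                   # intermediate child exits immediately

os.wait()                         # reap only the intermediate child; the daemon lives on
deadline = time.monotonic() + 5
while not os.path.exists(report) and time.monotonic() < deadline:
    time.sleep(0.05)              # poll for the daemon's report
print(open(report).read().strip())
print("our sid:", os.getsid(0))
```

Because of the setsid() between the two forks, the daemon's session id differs from the original process's, which is what decouples it from the parent's terminal and process group.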

Example Code

grandparent.py - calls parent.py using subprocess.run()

#!/usr/bin/env python3
import subprocess 
p = subprocess.run(["/usr/bin/python3", "/path/to/parent.py"], capture_output=True) 

## Comment the following if you don't need reassurance

print("The return code is:  " + str(p.returncode))
print("The standard out is: ")
print(p.stdout)
print("The standard error is: ")
print(p.stderr)

parent.py - starts the longProcess/grandchild and exits, leaving the grandchild running. After 10 seconds, the grandchild will write timing info to /tmp/timelog.

#!/usr/bin/env python3

import time
def longProcess():
    time.sleep(10)
    with open("/tmp/timelog", "w") as fo:
        fo.write("I slept!  The time now is: " + time.asctime(time.localtime()) + "\n")


import os,sys
def spawnDaemon(func):
    # do the UNIX double-fork magic, see Stevens' "Advanced
    # Programming in the UNIX Environment" for details (ISBN 0201563177)
    try:
        pid = os.fork()
        if pid > 0: # parent process
            return
    except OSError as e:
        print("fork #1 failed. See next. " )
        print(e)
        sys.exit(1)

    # Decouple from the parent environment.
    os.chdir("/")
    os.setsid()
    os.umask(0)

    # do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent
            sys.exit(0)
    except OSError as e:
        print("fork #2 failed. See next. ")
        print(e)
        sys.exit(1)

    # Redirect standard file descriptors.
    # Here, they are reassigned to /dev/null, but they could go elsewhere.
    sys.stdout.flush()
    sys.stderr.flush()
    si = open('/dev/null', 'r')
    so = open('/dev/null', 'a+')
    se = open('/dev/null', 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    # Run your daemon
    func()

    # Ensure that the daemon exits when complete
    os._exit(os.EX_OK)


import multiprocessing
daemonicGrandchild = multiprocessing.Process(target=spawnDaemon, args=(longProcess,))
daemonicGrandchild.daemon = False
daemonicGrandchild.start()
print("have started the daemon")  # This will get captured as stdout by grandparent.py

References

The code above was mainly inspired by the following two resources.

  1. This reference is succinct about the use of the double-fork but does not include the file handling we need in this situation.
  2. This reference contains the needed file handling, but does many other things that we do not need.

Solution 2:[2]

Edit: the below stopped working after a Python upgrade, see the accepted answer from Lachele.

Working answer from a colleague, change to shell=True like this:

p = subprocess.Popen("python3 pythonscript.py", stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)

I've tested it, and the grandchild subprocesses stay alive after the child process returns; Popen does not wait for them to finish.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 BrokenBenchmark
Solution 2