Why use devnull with subprocess.check_output in python2?

I have inherited a complicated (to me) toolbox from a PhD student before me and it uses python2 instead of 3 since it's older. The main file is in python2 and uses lots of modules and user-made scripts within it. A key one involves a python2 script running a C++ program which then returns a long string of data for the python2 script to interpret and pass back to the main script. This was working fine until I had to move from the computing cluster this code was written on to the newer cluster, going from torque to slurm.

In addition I did upgrade part of the toolbox, specifically the programs called by the C++ program - but the program calls these and runs them correctly and literally just returns numbers calculated through them back to the python script so I do not believe that upgrade is the problem.

In the original code python2 runs the C++ code like so:

def main(template_file, pars, vars):
    """Run requested command and return list of result values."""

    global arguments, timeout, timelimit, formula_eval, data_fields_code

    with open(os.devnull) as devnull, timelimit(timeout):
        output = subprocess.check_output(
            arguments + ([template_file] if template_file else []),
            stdin=devnull,
            stderr=subprocess.STDOUT
        )
    
    # group 0 is the full number match
    # make sure it stays that way when changing number_pattern!
    all_numbers = [float(x[0]) for x in re.findall(number_pattern, output)]
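
On the title question: stdin=devnull gives the child an empty standard input, so it sees end-of-file immediately instead of waiting on whatever the parent's stdin is attached to (a terminal, or something else under a batch scheduler). stderr=subprocess.STDOUT merges the child's error messages into the captured output. A minimal sketch of both effects follows; the child here is a throwaway Python one-liner standing in for the C++ program, not the real toolbox.

```python
import os
import subprocess
import sys

# Hypothetical child process: writes to stdout and stderr, then reads stdin.
child_code = (
    "import sys\n"
    "sys.stdout.write('result 1.5\\n')\n"
    "sys.stdout.flush()\n"
    "sys.stderr.write('warning from child\\n')\n"
    "sys.stderr.flush()\n"
    "data = sys.stdin.read()\n"  # returns at once: devnull is always at EOF
    "sys.stdout.write('stdin bytes: %d\\n' % len(data))\n"
)

with open(os.devnull) as devnull:
    output = subprocess.check_output(
        [sys.executable, "-c", child_code],
        stdin=devnull,             # child cannot block waiting for input
        stderr=subprocess.STDOUT,  # error text ends up in `output` too
    )

text = output.decode("ascii")
print(text)
```

Without stdin=devnull the child inherits the parent's stdin, and without stderr=subprocess.STDOUT its error messages go to the parent's stderr rather than into the returned string.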

arguments is python2 bytecode; it contains the command to run the C++ as well as a point in some space (like (x, y, z), all as decimals). As you can see, the template file is not required; it's just an alternate way of inputting data to the system. On the C++ end it looks like so:

int       yt_in;  // - Type
double    Z7_in;  // - Z7
double    mH_in;  // - mH
double   mHc_in;  // - mHc
double    mA_in;  // - mA
double   cba_in;  // - cos(b-a)
double    tb_in;  // - tan(b)

if     ( argc == 2 )   // - filename as input
{
    ifstream file( argv[1]);
    //  file.open( fname.c_str(), ios::in);
    file >> yt_in;
    file >> Z7_in;
    file >> mH_in;
    file >> mHc_in;
    file >> mA_in;
    file >> cba_in;
    file >> tb_in;
    file.close();
}

else if ( argc == 8 )   // - parameters as input
{
     yt_in      = (int)   atoi(argv[1]);
     Z7_in      = (double)atof(argv[2]);
     mH_in      = (double)atof(argv[3]);
     mHc_in     = (double)atof(argv[4]);
     mA_in      = (double)atof(argv[5]);
     cba_in     = (double)atof(argv[6]);
     tb_in      = (double)atof(argv[7]);
}

So if a template is given the first option is used; if not, the bytecode already has the variables in it to pass directly. (The main python2 file generates a point by default, so it is easily passed along. I am not really sure why there are two different ways of doing this; I have only ever used it where points are generated in the main file. I would also like to mention that there is not really any documentation for me to refer to.)
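
The two C++ branches are selected purely by argc, which counts the program name as well. A rough sketch of command lists that would reach each branch is below. The executable name, the parameter values, and the build_command helper are all hypothetical, and how the real arguments list is assembled is not shown in the question.

```python
# Hypothetical executable name and parameter values, for illustration only.
program = "./calculator"
point = [1, 0.5, 300.0, 350.0, 320.0, 0.1, 2.0]  # yt, Z7, mH, mHc, mA, cos(b-a), tan(b)


def build_command(program, point=None, template_file=None):
    """Return the argv list; its length is the argc the C++ program sees."""
    if template_file:
        return [program, template_file]          # argc == 2: read values from file
    return [program] + [str(v) for v in point]   # argc == 8: values on command line


by_file = build_command(program, template_file="point.dat")
by_args = build_command(program, point=point)
print(len(by_file), len(by_args))
```

Any other list length matches neither branch, in which case the seven input variables are never assigned by the code shown.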

Below is the original method of output for the C++ file:

std::cout

// -- Input
<< Z7_in  << " "                    // 1
<< mH_in  << " "                   // 2

....

<< k_huu  << " "                    // 75
<< k_hdd  <<                        // 76

    std::endl;

    printf("\n\n");
    printf("Finished passing variables over!");
    printf("\n\n");

    return 0;
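
On the Python side, this space-separated line is what re.findall scans for numbers. The sketch below shows how that parsing step could behave. The number_pattern here is an assumption (the real one is not shown in the question), written with more than one group so that findall returns tuples and x[0] is the full match, as the comment in the original code requires. The sample output string is also made up.

```python
import re

# Assumed pattern: group 0 of each findall tuple is the whole number,
# the inner groups are the mantissa and the optional exponent.
number_pattern = r"([-+]?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?)"

# Made-up stand-in for what the C++ program prints.
output = "0.5 300 1.2e-3 -4.0\n\n\nFinished passing variables over!\n\n"

all_numbers = [float(x[0]) for x in re.findall(number_pattern, output)]
print(all_numbers)
```

Because stderr is merged into the same string in the original call, any digits in a warning or error message printed by the C++ side would be picked up as extra values by a pattern like this.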

When I try using the original method for calling the C++ code the program just gets stuck and loops on the steps beforehand (it is designed to loop, selecting a point and processing it then selecting another etc, but it does not run the C++ code on the points).

I tried altering the code so that instead of streaming the data from the c++ code it would write it to a file named with a randomised string. Then just pass this string to cout for the python2 script to receive and use to open the file and read in the data. The file gets created correctly by the C++ initially, I can find and open it myself and see that it is working. The file name seems to be passed back to the python correctly also as I can get it to print out the string and the type of the output, which is string as expected.

But then, instead of carrying on with the python2 script as expected, it starts running the C++ again on the same point instead of going back to the main python script and generating a new one. It does this four times for some reason, then gives up. My changed python2 code is below:

def main(template_file, pars, vars):
    """Run requested command and return list of result values."""
    print '### --- Inside SimpleProcessor.main --- '
    print '                                        '
    global arguments, timeout, timelimit, formula_eval, data_fields_code

    print 'arguments', arguments
    print 'formula_eval', formula_eval
    print'data_fields_code', data_fields_code
    print 'template_file', template_file

    cwd = os.getcwd()
    print "current working dir is: " + str(cwd)
    output = subprocess.check_output(
            arguments + [template_file] if template_file else [])

    print output
    print "                      "
    print "The above is output from check_output"
    print "                      "
    print( "result type is: ", type(output))

    print "Current dir is: " + str(os.getcwd())
    path_to_file = str(output)
    print "the output file is: " + str(path_to_file)

    if os.path.isfile(path_to_file):
        print str(path_to_file) + " exists"
    else:
        print str(path_to_file) + " could not be located"
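
One difference from the original call is worth noting, whether or not it is related to the looping: the parentheses around the conditional were dropped. A conditional expression binds more loosely than +, so the modified call passes an empty list whenever template_file is empty. A small sketch with a made-up argument list:

```python
def as_written(arguments, template_file):
    # Parsed as: (arguments + [template_file]) if template_file else []
    return arguments + [template_file] if template_file else []


def as_original(arguments, template_file):
    # Only the optional trailing element depends on template_file.
    return arguments + ([template_file] if template_file else [])


arguments = ["./calculator", "1", "0.5"]  # hypothetical command list

print(as_written(arguments, None))   # the whole command is lost
print(as_original(arguments, None))  # the command is kept
```

With a template file both forms give the same list; without one, the modified form hands check_output an empty command, which cannot launch anything.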

Sometimes it thinks the file exists, sometimes it doesn't while doing this weird looping. The file is present the whole time though, and it's actually using the same random string each time so it's always looking for the same file. I don't know how that's happening since the C++ generates the random string. I can only think that there's something wrong with the subprocess.check_output where it's taking in the same information multiple times. But the python doesn't even reach the same point before restarting the loop each time! Sometimes it gets as far as indicating it can't find the datafile, sometimes it only gets as far as print output. I can't switch to python3 (unfortunately) as the main python script is python2 and as a PhD student I don't have the time needed to upgrade that.
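
One thing that can be ruled out cheaply when a path comes back through check_output: the returned string is the child's output verbatim, including any trailing newline, and os.path.isfile treats that newline as part of the file name. A minimal sketch, using a temporary file as a stand-in for the data file and a simulated output string (both are assumptions, not the toolbox's real values):

```python
import os
import tempfile

# Stand-in for the data file the C++ program would create (hypothetical).
handle = tempfile.NamedTemporaryFile(delete=False)
handle.close()

# Simulate what check_output would return if the child printed the path
# followed by a newline.
output = handle.name + "\n"

found_raw = os.path.isfile(output)               # newline is part of the name
found_stripped = os.path.isfile(output.strip())  # the actual path

os.remove(handle.name)
print(found_raw, found_stripped)
```

If the C++ side prints anything besides the path, stripping alone is not enough and the path would have to be picked out of the output first.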

Can anyone help me?

----------Edit----------------- It turned out that a single line in a python file higher up in the chain had been deleted somehow (I ssh into the cluster and the connection has been dropping every five minutes recently, so I suspect something got overwritten accidentally). I don't fully understand how that line led to these weird effects, but thanks everyone for trying to help and also thank goodness for GitHub backups!!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
