PBSPro qsub Output File Name for Job Array

I would like PBSPro standard output files to have easily understood names when using job arrays, but I have not found a way to do this.

Here is a set of jobs for discussion:

Job id            Name             User              Time Use S Queue 
----------------  ---------------- ----------------  -------- - -----  
651902.srvname    pl_0000          xxxxxxxxx         00:00:00 R large  
651903[].srvname  dp_0000-0001     xxxxxxxxx                0 H large     
651904.srvname    bp_0100          xxxxxxxxx                0 H large  
651905[].srvname  dp_0000-bpx6     xxxxxxxxx                0 H large

Running qsub without changing the output name produces one file per subjob of a job array. For example, the following files are produced for job 651905[]:

651905[1].srvname.OU  
651905[2].srvname.OU  
...  
651905[x].srvname.OU

Using qsub -o [JOBNAME], where the job name is known at launch, results in a single file for the whole job array, so the standard output of only one subjob is available.

The desired output file set is:

dp_0000-bpx6[1].OU     
dp_0000-bpx6[2].OU
...
dp_0000-bpx6[x].OU

How can this be accomplished? In other words, how can the output file name be set to something more understandable while preserving the array index?

A secondary question: how can I include the sequence number along with the job name? Something like:

dp_0000-bpx6.651905[1].OU     
dp_0000-bpx6.651905[2].OU
...
dp_0000-bpx6.651905[x].OU  


Solution 1:[1]

This won't get you all the way there, but it comes close.

qsub -J "0-512:512" -N pl_0000 -o pl_0000.^array_index^ -- /usr/bin/echo "HI"

This produces the following output. Each array index gets its own output file.

-rw------- 1 pbsdev pbsdev     3 Apr 29 21:00 pl_0000.0
-rw------- 1 pbsdev pbsdev     3 Apr 29 21:00 pl_0000.512
-rw------- 1 pbsdev pbsdev     0 Apr 29 21:00 pl_0000.e1441.0
-rw------- 1 pbsdev pbsdev     0 Apr 29 21:00 pl_0000.e1441.512
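The same options can be generated from the job name so that it is typed only once. This is a minimal sketch, assuming a POSIX shell. The function name array_qsub_args is hypothetical, the range 1-6 is a made-up example, and the .OU/.ER suffixes are chosen only to resemble the default names in the question. It prints an option string and does not submit anything.

```shell
# Hypothetical helper: print qsub options that give each subjob its own
# stdout/stderr file named after the job, using the ^array_index^ placeholder.
array_qsub_args() {
    name=$1    # job name, e.g. dp_0000-bpx6
    range=$2   # array range as passed to qsub -J, e.g. 1-6 (assumed example)
    printf '%s\n' "-J $range -N $name -o $name.^array_index^.OU -e $name.^array_index^.ER"
}

array_qsub_args dp_0000-bpx6 1-6
# prints: -J 1-6 -N dp_0000-bpx6 -o dp_0000-bpx6.^array_index^.OU -e dp_0000-bpx6.^array_index^.ER
```

If the job name contains no whitespace, the result could be passed along as `qsub $(array_qsub_args dp_0000-bpx6 1-6) -- /usr/bin/echo "HI"`, relying on the shell's word splitting.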

This one will take the jobid and put it in the output filename along with the index.

jobid=$(qsub -h -J "0-512:512" -N pl_0000 -- /usr/bin/echo "HI") && qalter -o "pl_0000.${jobid}.^array_index^.OU" "${jobid}" && qrls "${jobid}"; echo "${jobid}"

1446[].pdw-s1

This produces the following output. Each output file name contains both the job ID and the array index.

drwxrwxr-x 2 root   pbsdev 69632 Apr 29 21:05 .
drwxrwxr-x 6 root   pbsdev 20480 Apr 29 20:49 ..
-rw------- 1 pbsdev pbsdev     3 Apr 29 21:05 pl_0000.1446[].pdw-s1.0.OU
-rw------- 1 pbsdev pbsdev     3 Apr 29 21:05 pl_0000.1446[].pdw-s1.512.OU
-rw------- 1 pbsdev pbsdev     0 Apr 29 21:05 pl_0000.e1446.0
-rw------- 1 pbsdev pbsdev     0 Apr 29 21:05 pl_0000.e1446.512


JOBID     v            v Array Index
pl_0000.1446[].pdw-s1.512.OU

You can also pass -e to qalter to change the error output path in the same way. Note that -h is needed so that the job is submitted in a held state; this gives qalter time to modify it before qrls releases it.
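The file names above embed the full job ID, including the [] marker and the server name. If only the sequence number is wanted, as in the question's 651905, it can be cut out of the ID before calling qalter. This is a minimal sketch using shell parameter expansion. The function name pbs_seq is hypothetical, and it assumes job IDs have the form shown in the listings above.

```shell
# Hypothetical helper: reduce a PBS job ID to its leading sequence number.
# Assumes IDs look like "1446[].pdw-s1" (array job) or "651902.srvname".
pbs_seq() {
    id=$1
    id=${id%%.*}     # drop the ".server" suffix    -> "1446[]"
    id=${id%%\[*}    # drop the "[...]" array marker -> "1446"
    printf '%s\n' "$id"
}

jobid='1446[].pdw-s1'
seq_no=$(pbs_seq "$jobid")
echo "pl_0000.${seq_no}.^array_index^.OU"
# prints: pl_0000.1446.^array_index^.OU
```

The printed string could then be given to qalter -o in place of the path that contains the full job ID.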

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 hunterofcow