Slurm - How to create a job array with jobs that are not in the same folder

I have a folder structure which is like this:

/home/01/01/script.R
/home/01/02/script.R
/home/01/03/script.R
/home/02/01/script.R
/home/02/02/script.R
/home/02/03/script.R
/home/03/01/script.R
/home/03/02/script.R
/home/03/03/script.R

I want to submit all of these scripts to Slurm together as one job array. However, I am running into problems because they are not in the same folder. What I currently know how to do is submit these scripts as three separate job arrays: one at /home/01, a second at /home/02, and a third at /home/03. Is there an easy way to submit all nine jobs as part of a single array, WITHOUT putting them all in the same folder? The folder structure must stay exactly as shown here.

This is the script I am currently using, which doesn't work:

#!/bin/bash
# submit_array.sh

#SBATCH --job-name=array_test
#SBATCH --mail-user=[email protected]
#SBATCH --mail-type=end
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --mem=50                      
#SBATCH --time=0-00:01:00               
#SBATCH --qos=standard

declare -a combinations
index=0
for dataset in `seq -w 01 03`
do
    for chain in `seq -w 01 03`
    do
        combinations[$index]="$dataset $chain"
        index=$((index + 1))

    done
done

parameters=(${combinations[${SLURM_ARRAY_TASK_ID}]})

dataset=${parameters[0]}
chain=${parameters[1]}

module add R

cd /home/$dataset/$chain
R CMD BATCH script.R

Any help would be appreciated, thanks!



Solution 1:[1]

One method is to encode each folder combination as a separate ID in the sbatch array. The shell script can then parse the associated ${SLURM_ARRAY_TASK_ID} with substring parameter expansion, as follows:

sbatch -a 101,102,103,201,202,203,301,302,303 ./submit_array.sh

where the contents of submit_array.sh are:

#!/bin/bash
# submit_array.sh

#SBATCH --job-name=array_test
#SBATCH --mail-user=[email protected]
#SBATCH --mail-type=end
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --mem=50                      
#SBATCH --time=0-00:01:00               
#SBATCH --qos=standard

# job array IDs usually cannot have 4 digits (e.g. 0101),
# so we prepend "0" to the first digit to get the dataset folder
dataset="0${SLURM_ARRAY_TASK_ID::1}"
# then chain uses the last two digits
chain=${SLURM_ARRAY_TASK_ID:1:2}

module add R

# stop early if the folder does not exist, so R is not run in the wrong place
cd "/home/${dataset}/${chain}" || exit 1
R CMD BATCH script.R
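
To check the ID scheme before submitting anything, the encoding and decoding can be tried out in a plain bash session. The following is only a sketch: build_array_ids and task_dir are hypothetical helper names that are not part of the answer above, and it assumes single-digit dataset numbers (1 to 9) and two-digit chain numbers, as in the question.

```shell
#!/bin/bash
# Sketch only: build_array_ids and task_dir are hypothetical helpers.

# Print a comma-separated ID list (e.g. 101,102,...,303)
# for datasets 1..$1 and chains 1..$2.
build_array_ids() {
    local n_datasets=$1 n_chains=$2 ids=() d c
    for ((d = 1; d <= n_datasets; d++)); do
        for ((c = 1; c <= n_chains; c++)); do
            # one digit for the dataset, two digits for the chain
            ids+=("$(printf '%d%02d' "$d" "$c")")
        done
    done
    local IFS=,          # join the array elements with commas
    echo "${ids[*]}"
}

# Map a three-digit task ID back to its working directory,
# using the same substring expansion as submit_array.sh.
task_dir() {
    local id=$1
    local dataset="0${id::1}"   # first digit, zero-padded
    local chain="${id:1:2}"     # last two digits
    echo "/home/${dataset}/${chain}"
}

ids=$(build_array_ids 3 3)
echo "sbatch -a ${ids} ./submit_array.sh"
for id in ${ids//,/ }; do
    echo "${id} -> $(task_dir "$id")"
done
```

The first line printed is the sbatch command to run, and the remaining lines show which folder each array task would change into. Note that the largest ID also has to stay below the array size limit configured on your cluster, so this scheme does not scale to arbitrary folder counts.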

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pcamach2