If else script in bash using grep and awk

I am trying to write a script that checks whether each snp value (column $4 of the test file) is present in another file (the map file). If it is, the script should print the snp value and the distance taken from the map file (distance is column $4 of the map file). If the snp value from the test file is not present in the map file, it should print the snp value with 0 (zero) in the second column as the distance.

The script is:

for chr in {1..22}; 
do
for snp in awk '{print $4}' test$chr.bim
i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')
if [[ $i == "0" ]]
then 
echo "$snp 0" >> position.$chr
else
distance=$(grep $snp map$chr.txt | awk '{print $4}')
echo "$snp $distance" >> position.$chr
fi
done
done

My map file looks like this:

Chromosome  Position(bp)    Rate(cM/Mb) Map(cM)
chr22   16051347    8.096992    0.000000
chr22   16052618    8.131520    0.010291
chr22   16053624    8.131967    0.018472

and so on..

My test file looks like this:

22  16051347    0   16051347    C   A
22  16052618    0   16052618    G   T
22  17306184    0   17306184    T   G

and so on..

I'm getting the following syntax errors:

position.sh: line 6: syntax error near unexpected token `i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')'
position.sh: line 6: `i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')' 

Any tip?



Solution 1:[1]

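The inner for loop fails to parse because its do keyword is missing, which is why bash reports the error on the following line. Even with do added, for snp in awk '{print $4}' test$chr.bim would loop over the literal words awk, {print $4} and test$chr.bim instead of running awk; that needs a command substitution, $(...). There are also several inefficiencies here, such as running grep twice for every value.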

Try this:

for chr in {1..22}; do
    awk '{print $4}' "test$chr.bim" |
    while IFS="" read -r snp; do
        if ! grep -q "$snp" "map$chr.txt"; then
            echo "$snp 0"
        else
            awk -v snp="$snp" '
                $0 ~ snp { print snp, $4 }' "map$chr.txt"
        fi  >> "position.$chr"
    done
done
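
One caveat with the loop above: grep -q "$snp" matches the value anywhere on the line, so a position that happens to be a substring of another position (or of a rate value) would count as present. A minimal sketch of an exact comparison on column 2 follows; has_position is a hypothetical helper name and the sample row is taken from the question.

```shell
# has_position POSITION MAPFILE
# Hypothetical helper: succeeds only if column 2 of MAPFILE equals POSITION exactly.
has_position() {
    awk -v pos="$1" '$2 == pos { found = 1; exit } END { exit !found }' "$2"
}

# Demo with one row of the map file from the question.
tmp=$(mktemp -d)
printf '%s\n' 'chr22 16051347 8.096992 0.000000' > "$tmp/map.txt"

# grep treats the shorter number as a match because it is a substring.
grep -q '1605134' "$tmp/map.txt" && echo "grep: 1605134 matches as a substring"

# The field comparison only accepts the full value.
has_position 1605134 "$tmp/map.txt" || echo "awk: 1605134 is not a position in the map"
has_position 16051347 "$tmp/map.txt" && echo "awk: 16051347 is a position in the map"
```

In the loop, the test would then read if ! has_position "$snp" "map$chr.txt"; then.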

The entire thing could probably be further refactored to a single Awk script.

for chr in {1..22}; do
    awk 'NR == FNR { ++a[$4]; next }
      $2 in a { print $2, $4; ++found[$2] }
      END { for(k in a) if (!found[k]) print k, 0 }' \
         "test$chr.bim"  "map$chr.txt" >> "position.$chr"
done
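
The Awk version above emits matches in map-file order and collects the misses at the end in arbitrary order. If the output should follow the order of the test file instead, one possible variation is to read the map file first and then look up each test row. This is only a sketch: lookup_positions is a hypothetical helper name, and the sample data is the handful of rows shown in the question.

```shell
# lookup_positions MAPFILE TESTFILE
# Hypothetical helper: for every row of TESTFILE, print column 4 followed by
# the Map(cM) value of the matching MAPFILE row, or 0 if there is none.
lookup_positions() {
    awk 'NR == FNR { dist[$2] = $4; next }   # first file: remember distance by position
         { if ($4 in dist) print $4, dist[$4]; else print $4, 0 }' "$1" "$2"
}

# Demo with the sample rows from the question.
tmp=$(mktemp -d)
printf '%s\n' \
    'Chromosome Position(bp) Rate(cM/Mb) Map(cM)' \
    'chr22 16051347 8.096992 0.000000' \
    'chr22 16052618 8.131520 0.010291' \
    'chr22 16053624 8.131967 0.018472' > "$tmp/map22.txt"
printf '%s\n' \
    '22 16051347 0 16051347 C A' \
    '22 16052618 0 16052618 G T' \
    '22 17306184 0 17306184 T G' > "$tmp/test22.bim"

lookup_positions "$tmp/map22.txt" "$tmp/test22.bim"
```

For the sample rows this prints 16051347 0.000000, 16052618 0.010291 and 17306184 0, one pair per line. Inside the chromosome loop the call would be lookup_positions "map$chr.txt" "test$chr.bim" >> "position.$chr".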

The correct for syntax for what I'm guessing you wanted would look like

for snp in $(awk '{print $4}' "test$chr.bim"); do

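but this has other problems: the unquoted command substitution is split on whitespace and is subject to pathname expansion, so the loop iterates over words rather than lines. See "Don't Read Lines With For" on Greg's Wiki.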
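
A small illustration of the difference, using made-up two-word lines rather than the question's data (where each value is a single word and the for form would happen to work):

```shell
# Two lines of input, each containing a space (illustrative data only).
input=$(printf 'rs1 first\nrs2 second')

# for splits the unquoted expansion on any whitespace: four iterations.
count_for=0
for item in $input; do
    count_for=$((count_for + 1))
done

# while read consumes one whole line per iteration: two iterations.
count_read=0
while IFS= read -r line; do
    count_read=$((count_read + 1))
done <<EOF
$input
EOF

echo "for: $count_for items, while read: $count_read lines"
```

This prints "for: 4 items, while read: 2 lines".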

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1