'Iterate over a csv and change the values of a column that meets a condition
I have to use bash to iterate over a CSV file and replace the values of a column that meets a condition. Finally, the results have to be stored in an output file.
I have written this code, which reads the file and stores the content in an array. On iterating over the file, if the value at column 13 is equal to "NULL" then the value of this record has to be replaced by "0". Once the file is reviewed the output with the replaced values is stored at file_b.
#!/bin/bash
file="./2022_Accidentalidad.csv"
while IFS=; read -ra array
do
if [[ ${array[13]} == "NULL" ]]; then
echo "${array[13]}" | sed -n 's/NULL/0/g'
fi
done < $file > file_b.csv
The problem is that file_b is empty. Nothing is written there. How could I do this?
I cannot use AWK, and have to use or a FOR or a WHILE command to iterate over the file.
Sample input:
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;NULL;NULL;443359,226;4472082,272;N;NULL
Expected output
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL
Thanks a lot in advance.
Solution 1:[1]
You don't need sed
. Just replace $array[13]
with 0
. Then print the entire array with ;
separators between the fields.
( # in a subshell
IFS=';' # set IFS, that affects `read` and `"${array[*]}"`
while read -ra array
do
if [[ ${array[13]} == "NULL" ]]; then
array[13]=0
fi
echo "${array[*]}"
done
) < $file > file_b.csv
echo
uses the first character of $IFS
as the output field separator.
Solution 2:[2]
When awk
is also possible:
awk 'BEGIN{FS=OFS=";"} NR==2 && $14=="NULL"{$14=0} {print}' "$file" > file_b.csv
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
Solution 3:[3]
One idea using a regex and the BASH_REMATCH
array:
regex='(([^;]*;){13})(NULL)(;.*)'
while read -r line
do
[[ "${line}" =~ $regex ]] &&
line="${BASH_REMATCH[1]}0${BASH_REMATCH[4]}"
# uncomment following line to display contents of BASH_REMATCH[] array
# declare -p BASH_REMATCH
echo "${line}"
done < file.csv > file_b.csv
This generates:
$ cat file_b.csv
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | glenn jackman |
Solution 2 | Cyrus |
Solution 3 | markp-fuso |