How to replace "00" in data with "N/A" skipping first row in sed?
I'm working with GWAS data. My data looks like this:
IID,kgp11004425,rs11274005,kgp183005,rs746410036,kgp7979600
1,00,AG,GT,AK,00
32,AG,GG,AA,00,AT
300,TT,AA,00,AG,AA
400,GG,AG,00,GT,GG
Desired Output:
IID,kgp11004425,rs11274005,kgp183005,rs746410036,kgp7979600
1,N/A,AG,GT,AK,N/A
32,AG,GG,AA,N/A,AT
98,TT,AA,N/A,AG,AA
3,GG,AG,N/A,GT,GG
Here I'm trying to replace "00" with "N/A", but 00 also occurs in the header row and in the first column (IID), so those get changed too: the header becomes kgp11N/A4425, rs11274N/A5, kgp183N/A5, ... and ID values such as 300, 400, 500 become 3N/A, 4N/A, 5N/A. The bash command I used:
sed 's~00~N/A~g' allSNIPsFinaldata.csv
Can anyone please help me skip the header row and the first column when applying this substitution?
Solution 1:[1]
You may specify an address to select the line(s) to apply the command to. Thus you might choose to exclude the first line like this:
sed '1!s~00~N/A~g' allSNIPsFinaldata.csv
As a side note, your example isn't actually CSV despite the file name: the header is comma-delimited, but the rest of the file uses spaces.
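To see what the 1! address does on its own, here is a minimal sketch run against a few rows copied from the question (kept comma-delimited as shown there; the variable names data and result are only for illustration). It protects the header, but the pattern still matches 00 anywhere in the remaining lines, so an ID such as 300 is still altered.

```shell
# Sample rows copied from the question; "data" and "result" are illustrative names.
data='IID,kgp11004425,rs11274005,kgp183005,rs746410036,kgp7979600
1,00,AG,GT,AK,00
300,TT,AA,00,AG,AA'

# 1! negates the address: the substitution runs on every line except line 1.
result=$(printf '%s\n' "$data" | sed '1!s~00~N/A~g')
printf '%s\n' "$result"
```

The header line comes back unchanged, while the last row becomes 3N/A,TT,AA,N/A,AG,AA, which is why the other answers add some form of boundary around the 00.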
Solution 2:[2]
With 2 capture groups, you can use this sed command:
sed -E 's~(^|[[:blank:]])00([[:blank:]]|$)~\1N/A\2~g' file
IID, kgp11004425, rs11274005, kgp183005, rs746410036, kgp7979600
1 N/A AG GT AK N/A
32 AG GG AA N/A AT
98 TT AA N/A AG AA
3 GG AG N/A GT GG
Details:
(^|[[:blank:]]): Match start or a whitespace in capture group #1
00: Match 00
([[:blank:]]|$): Match end or a whitespace in capture group #2
\1N/A\2: Replacement to put back value of capture group #1 followed by N/A followed by value of capture group #2
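As a quick check of the capture-group approach, the sketch below runs the same command on a few space-delimited rows (the 300 row uses the ID from the question's input; data and result are illustrative names). It assumes a sed that accepts -E for extended regular expressions.

```shell
# Space-delimited sample; the header keeps its commas as in the output above.
data='IID, kgp11004425, rs11274005, kgp183005, rs746410036, kgp7979600
1 00 AG GT AK 00
300 TT AA 00 AG AA'

# Group 1 holds what precedes 00 (line start or a blank), group 2 what follows
# it (a blank or line end); both are written back around N/A.
result=$(printf '%s\n' "$data" | sed -E 's~(^|[[:blank:]])00([[:blank:]]|$)~\1N/A\2~g')
printf '%s\n' "$result"
```

Since 00 must be a whole blank-delimited field to match, the header and the ID 300 are left alone. One thing to watch: the trailing blank is consumed by each match, so two directly adjacent 00 fields may not both be replaced in a single pass.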
Solution 3:[3]
Using sed
$ sed 's|\<00\>|N/A|g' input_file
IID, kgp11004425, rs11274005, kgp183005, rs746410036, kgp7979600
1 N/A AG GT AK N/A
32 AG GG AA N/A AT
98 TT AA N/A AG AA
3 GG AG N/A GT GG
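The \< and \> word-boundary anchors are a GNU sed extension, so the following sketch assumes GNU sed. It shows that they work the same way on comma-delimited rows like those in the question, because a comma is a non-word character just as a space is (data and result are illustrative names).

```shell
# Comma-delimited rows shortened from the question's sample.
data='IID,kgp11004425,rs11274005
1,00,AG
300,TT,00'

# \< and \> match only at the start and end of a word, so 00 must stand alone.
result=$(printf '%s\n' "$data" | sed 's|\<00\>|N/A|g')
printf '%s\n' "$result"
```

The 00 inside kgp11004425 and inside 300 is surrounded by other word characters, so neither is touched.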
Solution 4:[4]
You can also skip the first row by applying the substitution only from the second line through the last:
sed '2,$s~00~N/A~g' allSNIPsFinaldata.csv
If you don't want partial word matches, you can put word boundaries around the 00 in different ways.
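One possible combination, as a sketch: keep the 2,$ address and wrap the 00 in \b, which is a word-boundary escape available in GNU sed (an assumption here; other sed implementations may not support it). The rows are the input from the question, and data and result are illustrative names.

```shell
# Input rows copied from the question.
data='IID,kgp11004425,rs11274005,kgp183005,rs746410036,kgp7979600
1,00,AG,GT,AK,00
32,AG,GG,AA,00,AT
300,TT,AA,00,AG,AA
400,GG,AG,00,GT,GG'

# 2,$ limits the command to lines 2 through the last; \b00\b matches only a
# standalone 00, so IDs such as 300 and 400 stay intact.
result=$(printf '%s\n' "$data" | sed '2,$s~\b00\b~N/A~g')
printf '%s\n' "$result"
```

Note that an ID that is exactly 00 would still be replaced, since the boundary check alone does not know about columns.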
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Klaus Klein |
Solution 2 | anubhava |
Solution 3 | HatLess |
Solution 4 |