sed multiline delete with pattern
I want to delete all multiline occurrences of a pattern like:
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 222
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
I want to delete all portions between START-TAG and END-TAG that contain specific IDs.
So after deleting ID: 222, only this would remain:
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
I have a blacklist of IDs that should be removed.
I assume a quite simple multiline sed regex script would do it. Can anyone help?
It is very similar to the question "sed multiline replace", but not the same.
Solution 1:[1]
You can use the following:
sed '/{START-TAG/{:a;N;/END-TAG}/!ba};/ID: 222/d' data.txt
Breakdown:
/{START-TAG/ {      # Match '{START-TAG'
    :a              # Create label a
    N               # Read next line into pattern space
    /END-TAG}/!     # If not matching 'END-TAG}'...
        ba          # Then goto a
}                   # End /{START-TAG/ block
/ID: 222/d          # If pattern space matched 'ID: 222' then delete it.
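The question mentions a blacklist of IDs, so here is a rough sketch of how the same idea could be extended to several IDs. It is only an illustration under some assumptions: the file names blacklist.txt, data.txt and delete.sed are made up, the blacklist holds one plain numeric ID per line, and every ID sits on its own line inside a block. The loop is split over several -e options because not every sed accepts a label followed by ";" or "}" on one line. The \n on both sides of the ID is there so that 222 would not also hit something like ID: 2222.

```shell
#!/bin/sh
# Sample input mirroring the question (data.txt is a hypothetical file name).
cat > data.txt <<'EOF'
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 222
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
EOF

# Hypothetical blacklist: one numeric ID per line.
printf '%s\n' 222 333 > blacklist.txt

# Turn each ID into a delete command of the form /\nID: 222\n/d.
# Inside the joined block the ID line has a newline on both sides,
# so the \n anchors make the match cover the whole line.
sed 's|.*|/\\nID: &\\n/d|' blacklist.txt > delete.sed

# Join each START-TAG..END-TAG block into the pattern space,
# then apply the generated delete commands to it.
sed -e '/{START-TAG/{' -e ':a' -e 'N' -e '/END-TAG}/!ba' -e '}' \
    -f delete.sed data.txt
```

With the sample files above, only the ID: 111 block should be left. If the IDs can contain characters that are special in a regex, or a "/", they would need escaping before being written to delete.sed.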
Solution 2:[2]
Don't use sed for anything that involves multiple lines; just use awk for a robust, portable solution. Given the sample input from the question you referenced, if the blocks are always separated by blank lines:
$ awk -v RS= -v ORS='\n\n' '!/ID: 222/' file
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
Otherwise:
$ awk '/{START-TAG/{f=1} f{rec=rec $0 ORS} /END-TAG}/{if (rec !~ /ID: 222/) printf "%s", rec; rec=f=""}' file
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
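For the blacklist case from the question, the awk approach might be extended along these lines. This is a sketch, not a tested drop-in: filter_blocks, blacklist.txt and data.txt are names invented for the example, the blacklist is assumed to be non-empty with one ID per line, and each ID line is assumed to look like "ID: <value>". The ID is compared as a whole field rather than with a regex, so 222 does not match 2222, and lines outside any block are passed through.

```shell
#!/bin/sh
# Sample files; blacklist.txt and data.txt are hypothetical names.
printf '%s\n' 222 333 > blacklist.txt
cat > data.txt <<'EOF'
{START-TAG
foo bar
ID: 111
foo bar
END-TAG}
{START-TAG
foo bar
ID: 222
foo bar
END-TAG}
{START-TAG
foo bar
ID: 333
foo bar
END-TAG}
EOF

# filter_blocks BLACKLIST DATA: print DATA without the blacklisted blocks.
filter_blocks() {
    awk '
        # First file: remember every blacklisted ID (one per line).
        NR == FNR { if (NF) bad[$1]; next }

        # A start tag opens a new block buffer.
        /^[{]START-TAG/ { in_block = 1; rec = ""; drop = 0 }

        in_block {
            rec = rec $0 ORS
            # Compare the ID as a whole field, so 222 does not match 2222.
            if ($1 == "ID:" && ($2 in bad)) drop = 1
            if (/^END-TAG[}]/) {
                if (!drop) printf "%s", rec
                in_block = 0
            }
            next
        }

        # Lines outside any block pass through unchanged.
        { print }
    ' "$1" "$2"
}

filter_blocks blacklist.txt data.txt
```

With the sample files this should print only the ID: 111 block. The braces are written as [{] and [}] so that they are taken literally by awk implementations that treat a bare { in a regex as the start of an interval.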
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | vinzee |