sed multiline delete with pattern

I want to delete all multiline occurrences of a pattern like this:

  {START-TAG
  foo bar
  ID: 111
  foo bar
  END-TAG}

  {START-TAG
  foo bar
  ID: 222
  foo bar
  END-TAG}

  {START-TAG
  foo bar
  ID: 333
  foo bar
  END-TAG}

I want to delete all portions between START-TAG and END-TAG that contain specific IDs.

So after deleting ID: 222, only this would remain:

  {START-TAG
  foo bar
  ID: 111
  foo bar
  END-TAG}


  {START-TAG
  foo bar
  ID: 333
  foo bar
  END-TAG}

I have a blacklist of IDs that should be removed.

I assume a quite simple multiline sed regex script would do it. Can anyone help?

This is very similar to the question "sed multiline replace", but not the same.



Solution 1:[1]

You can use the following:

sed '/{START-TAG/{:a;N;/END-TAG}/!ba};/ID: 222/d' data.txt

Breakdown:

/{START-TAG/ { # Match '{START-TAG'
:a             # Create label a
N              # Read next line into pattern space
/END-TAG}/!    # If not matching 'END-TAG}'...
           ba  # Then goto a
}              # End /{START-TAG/ block
/ID: 222/d     # If pattern space matched 'ID: 222' then delete it. 
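
The question mentions a blacklist of several IDs, while the command above handles one. A minimal sketch of one way to extend it, assuming the IDs are plain digits with no regex metacharacters and that nothing follows the ID on its line: build the sed script in a loop and append one delete command per ID. The function name delete_blocks_sed is hypothetical. The script is written one command per line rather than joined with `;`, since putting `;` after a label is a GNU sed extension.

```shell
# Hypothetical helper: delete_blocks_sed FILE ID...
# Prints FILE without the START-TAG/END-TAG blocks whose ID is listed.
# Assumption: IDs are plain digits and end their line (no trailing spaces).
delete_blocks_sed() {
    file=$1
    shift
    # Same loop as above: gather a whole block into the pattern space.
    script='/{START-TAG/{
:a
N
/END-TAG}/!ba
}'
    for id in "$@"; do
        # \n matches the newline embedded after the ID, so 22 does not match 222.
        script="$script
/ID: $id\\n/d"
    done
    sed "$script" "$file"
}
```

For example, `delete_blocks_sed data.txt 222 333` would drop both of those blocks and leave the rest of the file, including the blank lines between blocks, untouched.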

Solution 2:[2]

Don't use sed for anything that involves multiple lines; awk gives you a more robust, portable solution. Given the sample input from the question you referenced, if the blocks are always separated by blank lines, setting RS to the empty string makes awk read each blank-line-separated block as one record:

$ awk -v RS= -v ORS='\n\n' '!/ID: 222/' file
  {START-TAG
  foo bar
  ID: 111
  foo bar
  END-TAG}

  {START-TAG
  foo bar
  ID: 333
  foo bar
  END-TAG}

Otherwise:

$ awk '/{START-TAG/{f=1} f{rec=rec $0 ORS} /END-TAG}/{if (rec !~ /ID: 222/) print rec; rec=f=""}' file
  {START-TAG
  foo bar
  ID: 111
  foo bar
  END-TAG}

  {START-TAG
  foo bar
  ID: 333
  foo bar
  END-TAG}
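
Since the question mentions a blacklist of IDs, the second script can be adapted to read that list from a file. The following is only a sketch under stated assumptions: the blacklist holds one ID per line, each block has a line whose first field is `ID:` followed by the ID, and the function name filter_blocks_awk and both file names are hypothetical. It compares whole IDs through an array lookup instead of a regex, so an ID such as 22 does not accidentally match 222.

```shell
# Hypothetical helper: filter_blocks_awk BLACKLIST FILE
# Prints the blocks of FILE whose ID is not listed in BLACKLIST.
# Assumption: BLACKLIST contains one ID per line, e.g. "222".
filter_blocks_awk() {
    awk '
        FILENAME == ARGV[1] {            # first file: load the blacklist
            if ($1 != "") bad[$1] = 1
            next
        }
        index($0, "{START-TAG") {        # a new block starts on this line
            in_block = 1; rec = ""; id = ""
        }
        in_block {
            rec = rec $0 ORS             # collect the lines of the block
            if ($1 == "ID:") id = $2     # remember the ID of the block
        }
        index($0, "END-TAG}") {          # block complete: keep or drop it
            if (in_block && !(id in bad)) print rec
            in_block = 0; rec = ""
        }
    ' "$1" "$2"
}
```

A call such as `filter_blocks_awk blacklist.txt file` would print the remaining blocks, each followed by a blank line, as in the output above. Like the script it is based on, it discards any text outside the tagged blocks.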

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 vinzee