A sed-like tool to insert an HTML snippet (a div block) after a <div id="jsn-page">...</div>
I am looking for a way to insert a block <div id="jsn-content-bottom">...code...</div> after a block <div id="jsn-body">...</div>.
I want to use a shell script because I need to apply this insertion to multiple HTML files in nested directories.
In a first attempt, I tried to use sed. The issue is that I don't know how to find the closing </div> tag that matches the opening tag <div id="jsn-body">. There are several other <div> tags nested inside the <div id="jsn-body"> block, and I need to find this particular closing tag (its line number may be enough) because I want to insert the block <div id="jsn-content-bottom">...code...</div> just after it.
Can anyone see an easy way to find the line of this closing tag? I assume I would use sed in my shell script, but I am open to other tools or Linux commands that would make processing these HTML files easier.
One last thing: I would like the inserted block to be stored in a file, and to use that file for the insertion (with cat or a similar command).
Update
For the moment, the solution suggested by ctac_ is almost working. You can test it on the HTML source index.html.txt, with the snippet to insert in insert.txt, using the suggested command line, i.e.:
awk '
NR==FNR{b=b$0RS;next}
/<div id="jsn-body">/{a=1;s[d]++}
a && /<div/{s[d]++}
a && /<\/div/{s[d]--}
a && s[d]==1{a=0;print $0RS b;next}1' insert.txt index.html.txt > outfile.html.txt
Unfortunately, when I grep for 'jsn-content-bottom' in the output of the above awk command (i.e. after removing the redirection "> outfile.html.txt"), no match is displayed.
I don't know where the error could come from.
You can test the solution given by ctac_ on these files (index.html.txt and insert.txt) with the awk command above.
Solution 1:[1]
You can try this awk command:
awk '
NR==FNR{b=b$0RS;next}
/<div id="jsn-page">/{a=1;d++}
a && /<div/{d++}
a && /<\/div/{d--}
a && d==1{a=0;print $0RS b;next}1' insert.txt infile.html >outfile.html
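insert.txt contains the block <div id="jsn-content-bottom">...code...</div>.
awk first reads this file and keeps its content in b.
It then reads infile.html and looks for the start of the jsn-page block.
a is a flag that tells we are inside the block.
Each time '<div' is seen, d is incremented (start of a block). The line that opens jsn-page matches both the second and the third rule, so d is 2 after it.
Each time '</div' is seen, d is decremented (end of a block).
When d returns to 1, the jsn-page block has ended.
a=0 marks that we are out of the block,
so the current line is printed, followed by b (the content of the insert file).
Each rule fires at most once per line, so a line holding several opening or several closing tags is miscounted.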
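If the HTML puts several tags on one line, the same depth-counting idea can be sketched with gsub, which returns the number of matches on a line. This is only a minimal sketch, not a real HTML parser. The function names insert_after_block and apply_recursively, the default id jsn-page, and the *.html file pattern are assumptions made for illustration. It also assumes that no other <div> precedes the target tag on its opening line, and it inserts the snippet after the whole line that closes the block.

```shell
#!/bin/sh
# Sketch only: all names below are illustrative, not taken from the question.

# insert_after_block SNIPPET HTML [ID]
# Prints HTML with the contents of SNIPPET inserted after the line on which
# the <div id="ID"> block closes. ID defaults to jsn-page.
insert_after_block() {
    awk -v id="${3:-jsn-page}" '
    NR == FNR { b = b $0 RS; next }       # first file: buffer the snippet
    !a && !done && index($0, "<div id=\"" id "\"") { a = 1 }  # block starts
    a {
        line = $0
        d += gsub(/<div[ >]/, "&", line)  # every opening <div on this line
        d -= gsub(/<\/div>/, "&", line)   # every closing </div> on this line
        if (d <= 0) {                     # depth back to zero: block closed
            a = 0; done = 1
            print
            printf "%s", b                # snippet goes right after that line
            next
        }
    }
    { print }
    ' "$1" "$2"
}

# apply_recursively SNIPPET DIR [ID]
# Rewrites every *.html file under DIR in place, through a temporary file.
apply_recursively() {
    find "$2" -type f -name '*.html' | while IFS= read -r f; do
        insert_after_block "$1" "$f" "$3" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

For example, insert_after_block insert.txt infile.html jsn-body > outfile.html handles one file, and apply_recursively insert.txt ./site jsn-body walks a directory tree. Running it twice on the same tree inserts the snippet twice, so it is worth trying it on a copy first.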
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |