Find and replace with PowerShell based on the next line
I'm trying to find and replace via PowerShell based on what's on the next line of the line I want to replace. For example, the following text file:
blahblah
flimflam
zimzam
If the line after blahblah is flimflam, replace blahblah with new stuff.
Here's the code I have so far:
$reader = New-Object System.IO.StreamReader($myFile.FullName);
$FileContents=$reader.ReadToEnd()
$reader.Close()
if(the line after "blahblah" == "flimflam") #pseudo code
{
$FileContents=$FileContents.Replace("blahblah","new stuff")
}
If the next line is anything other than flimflam, do nothing.
One idea I had was to replace "blahblah n` flimflam" with "new stuff", but I can't get it to work. I think I might be onto something with including the new line character though.
Solution 1:[1]
While your use of System.IO.StreamReader works, it's generally easier to use Get-Content -Raw to read a file into memory in full, as a single, multi-line string.
If performance is a concern, you can still use .NET types directly, in which case [System.IO.File]::ReadAllText($myFile.FullName) is a much simpler alternative, although if there's any performance gain to be had over Get-Content -Raw at all, it is probably insignificant.
To specify the input file's encoding explicitly, use Get-Content -Raw -Encoding <encoding> / [System.IO.File]::ReadAllText($myFile.FullName, <encoding>).
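As a rough illustration of the two ways of reading a whole file described above, the following sketch writes a small sample file and reads it back both ways. The file name and the choice of UTF-8 are assumptions made only for this example.

```powershell
# Hypothetical sample file in the temp directory, created only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'next-line-sample.txt'
Set-Content -LiteralPath $path -Value 'blahblah', 'flimflam', 'zimzam' -Encoding UTF8

# Cmdlet way: -Raw returns ONE multi-line string instead of an array of lines.
$viaCmdlet = Get-Content -LiteralPath $path -Raw -Encoding UTF8

# .NET way: pass a full path, because .NET's working directory
# can differ from PowerShell's current location.
$viaDotNet = [System.IO.File]::ReadAllText($path, [System.Text.Encoding]::UTF8)

# Both variables should now hold the same single string.
$same = $viaCmdlet -eq $viaDotNet

Remove-Item -LiteralPath $path
```

Without -Raw, Get-Content would emit the lines one by one, which is why a regex that spans a line break needs the -Raw form.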
The [string] type's .Replace() method is limited to literal string replacement, so advanced matching such as limiting matches to a full line is not an option. Use PowerShell's regex-based -replace operator instead.
To prevent confusion with PowerShell's string expansion (string interpolation) in double-quoted ("...") strings, it's generally preferable to use -replace with single-quoted ('...') strings, which PowerShell treats as literals, so you can focus on regex constructs in the string.
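To make the difference between literal and regex-based replacement concrete, here is a minimal sketch; the sample text is made up for the illustration.

```powershell
$text = 'a.c abc'

# .Replace() takes its first argument literally: only the literal "a.c" is replaced.
$literal = $text.Replace('a.c', 'X')                     # 'X abc'

# -replace interprets it as a regex: "." matches any character, so both tokens match.
$regex = $text -replace 'a.c', 'X'                       # 'X X'

# [regex]::Escape() gives -replace literal behavior when that is what you want.
$escaped = $text -replace [regex]::Escape('a.c'), 'X'    # 'X abc'
```

Note also that -replace is case-insensitive by default (use -creplace for case-sensitive matching), whereas .Replace() is case-sensitive.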
PS> $FileContents -replace '(?m)^blahblah(?=\r?\nflimflam\r?$)', 'new stuff'
new stuff
flimflam
zimzam
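- (?m) uses inline option m (multi-line) to make anchors ^ / $ match the start / end of each line (instead of the string as a whole).
- (?=...) is a look-ahead assertion that matches without including the matching part in the overall match, so that it doesn't get replaced.
- \r?\n is a platform-agnostic way to match a newline sequence / character: CRLF (\r\n) on Windows, LF-only (\n) on Unix-like platforms.
- Note that in .NET regexes, $ in multi-line mode matches only before \n (or at the very end of the string), not before \r; if the input uses CRLF newlines and flimflam is not the last line, write flimflam\r?$ so that the trailing \r is accounted for.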
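The following sketch exercises a look-ahead pattern of this kind against in-memory strings with both newline styles, plus a case where the next line is something else. The variable names are made up for the illustration, and the pattern includes \r? before $ because .NET's multi-line $ matches only before \n.

```powershell
# Replace a line consisting of "blahblah" only if the NEXT line is exactly "flimflam".
$pattern = '(?m)^blahblah(?=\r?\nflimflam\r?$)'

$crlf  = "blahblah`r`nflimflam`r`nzimzam"   # Windows-style newlines
$lf    = "blahblah`nflimflam`nzimzam"       # Unix-style newlines
$other = "blahblah`nzimzam`nflimflam"       # next line is NOT flimflam

$fixedCrlf = $crlf  -replace $pattern, 'new stuff'
$fixedLf   = $lf    -replace $pattern, 'new stuff'
$untouched = $other -replace $pattern, 'new stuff'   # -replace returns the input unchanged if nothing matches
```

Because the look-ahead does not consume the newline or the flimflam line, the original newline style of the input is preserved in the result.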
Solution 2:[2]
With a RegEx with a positive lookahead you can replace the previous line without even knowing the content of that line:
(Get-Content .\SO_53398250.txt -raw) -replace "(?sm)^[^`r`n]+(?=`r?`nflimflam)","new stuff"|
Set-Content .\SO_53398250_2.txt
See the RegEx explained on regex101.com (with different escaping `n => \n)
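One thing to be aware of with this pattern is that the look-ahead is not anchored at the end, so a following line that merely starts with flimflam also qualifies. The sketch below, using made-up sample text and the same pattern written with regex escapes in single quotes, contrasts that with an end-anchored variant; the anchored variant is an assumption about what may be wanted, not part of the answer above.

```powershell
$text = "first`r`nflimflam`r`nsecond`r`nflimflam extra"

# Same idea as above: any line whose next line STARTS WITH flimflam is replaced.
$loose = $text -replace '(?m)^[^\r\n]+(?=\r?\nflimflam)', 'new stuff'

# Variant (assumption): require the next line to be exactly "flimflam".
# \r? before $ is needed because $ matches only before \n in .NET.
$strict = $text -replace '(?m)^[^\r\n]+(?=\r?\nflimflam\r?$)', 'new stuff'
```

With this input, $loose replaces both "first" and "second", while $strict replaces only "first".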
Solution 3:[3]
This question asks for a reusable cmdlet that supports streaming as much as possible...
Replace-String
Function Replace-String {
    [CmdletBinding()][OutputType([String[]])]Param (
        [String]$Match, [String]$Replacement, [Int]$Offset = 0,
        [Parameter(ValueFromPipeLine = $True)][String[]]$InputObject
    )
    Begin {
        $Count = 0
        $Buffer = New-Object String[] ([Math]::Abs($Offset))
    }
    Process {
        $InputObject | ForEach-Object {
            If ($Offset -gt 0) {
                If ($Buffer[$Count % $Offset] -Match $Match) {$Replacement} Else {$_}
            } ElseIf ($Offset -lt 0) {
                If ($Count -ge -$Offset) {If ($_ -Match $Match) {$Replacement} Else {$Buffer[$Count % $Offset]}}
            } Else {
                If ($_ -Match $Match) {$Replacement} Else {$_}
            }
            If ($Offset) {$Buffer[$Count++ % $Offset] = $_}
        }
    }
    End {
        For ($i = 0; $i -gt $Offset; $i--) {$Buffer[$Count++ % $Offset]}
    }
}
Syntax
$InputObject | Replace-String [-Match] <String to find>
[-Replacement] <Replacement string to use>
[-Offset] <Offset relative to the matched string>
Examples:
Replace the found string:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 0
One
Two
X
Four
Five
Replace the string prior to the found string:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X -1
One
X
Three
Four
Five
Replace the second string prior to the found string:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X -2
X
Two
Three
Four
Five
Replace the string after the found string:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 1
One
Two
Three
X
Five
Replace the second string after the found string:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 2
One
Two
Three
Four
X
Replace the strings prior to the (two) strings that contain a T:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String T X -1
X
X
Three
Four
Five
Replace the strings after the (two) strings that contain a T:
'One', 'Two', 'Three', 'Four', 'Five' | Replace-String T X 1
One
Two
X
X
Five
Specific to the question:
'blahblah', 'flimflam','zimzam' | Replace-String 'flimflam' 'new stuff' -1
new stuff
flimflam
zimzam
Parameters
-InputObject <String[]>
(From pipeline)
A stream of strings to match and replace
-Match <String>
The string to match in the stream.
Note that the -Match operator is used for this parameter, which means that it supports regular expressions. If the whole string needs to match, use start-of-line and end-of-line anchors, e.g.: -Match '^Three$'.
-Replacement <String>
The string to use for replacing the target(s).
-Offset <Int> = 0
The offset, relative to the matched string(s), of the string to replace. The default is 0, meaning: replace the matched string(s).
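As a small illustration of why anchors matter for a regex-based match parameter, the sketch below applies the -match operator directly to the sample strings used in the examples; the variable names are made up.

```powershell
$items = 'One', 'Two', 'Three', 'Four', 'Five'

# Unanchored and case-insensitive: every string containing a "t" or "T" qualifies.
$containsT = @($items | Where-Object { $_ -match 'T' })          # Two, Three

# Anchored: only a string that is exactly "Three" qualifies.
$exactThree = @($items | Where-Object { $_ -match '^Three$' })   # Three

# -cmatch is the case-sensitive variant of -match.
$upperF = @($items | Where-Object { $_ -cmatch 'F' })            # Four, Five
```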
Background
A little background on the programming in this cmdlet:
- The Input Processing Methods (Begin {...}, Process {...}, End {...}) are used to pass the strings as fast as possible through the cmdlet and release them for the next cmdlet in the pipeline. This cmdlet is designed for the middle of a pipeline (e.g. Get-Content $myFile | Replace-String A B 1 | ...). To take advantage of the pipeline:
  - Avoid brackets (like: ($List) | Replace-String A B)
  - Avoid assigning the output (like: $Array = ... | Replace-String A B)
  - Avoid parameters that read in the whole content (such as Get-Content -Raw)
- In both cases (replacing behind, using a negative offset, or ahead, using a positive offset), a buffer of the size of the offset is required ($Buffer = New-Object String[] ([Math]::Abs($Offset)))
- To speed up the process, the script cycles through the buffer ($Buffer[$Count % $Offset]) rather than shifting the contained items
- If ($Count -ge -$Offset) {... holds back the first input strings (as many as the offset), because it can only be determined later whether an input string needs to be replaced or not
- In the end (End {...), if the $Offset is negative, the buffer (which contains the rest of the input strings) is released. In other words, a negative offset (e.g. -Offset -$n) will buffer $n strings and cause the output to run $n strings behind the input stream
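The hold-back idea can be seen in isolation in the following simplified sketch, which handles only the case from the question (an offset of -1) and therefore needs a single held line rather than a cycling buffer. The function name Update-LineBeforeMatch and its parameters are hypothetical and not part of the cmdlet above.

```powershell
# Hypothetical helper for illustration: replaces a line when the NEXT line matches.
function Update-LineBeforeMatch {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$Match,
        [Parameter(Mandatory)][string]$Replacement,
        [Parameter(ValueFromPipeline)][string[]]$InputObject
    )
    begin {
        $held = $null       # the one line currently held back
        $hasHeld = $false   # distinguishes "nothing held yet" from a held empty line
    }
    process {
        foreach ($line in $InputObject) {
            if ($hasHeld) {
                # Only now is the successor of the held line known, so emit it (or its replacement).
                if ($line -match $Match) { $Replacement } else { $held }
            }
            $held = $line
            $hasHeld = $true
        }
    }
    end {
        # The last line has no successor, so it is released unchanged.
        if ($hasHeld) { $held }
    }
}

$result = 'blahblah', 'flimflam', 'zimzam' |
    Update-LineBeforeMatch -Match '^flimflam$' -Replacement 'new stuff'
```

As with the cmdlet above, the output runs one string behind the input, so in a real pipeline it would sit between Get-Content and a Set-Content that writes to a different file than the one still being read.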
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | |