R: Match an odd number of repetitions

I would like to match a string like \code, but not when the backslash is escaped.

I think that one way of doing this could be matching an odd number of backslashes. Then for example, assuming \code is an expression to be replaced by 1234:

\code would become 1234, \\code should stay as is, \\\code would become \\1234, etc.

In R, given the strings:

message(o <- "\\\\\\code")
# \\\code
message(e <- "\\\\code")
# \\code

A partially working attempt in R is:

message(gsub("((?:\\\\{2})?)\\\\code", "\\11234", o, perl=TRUE))
# \\1234
message(gsub("((?:\\\\{2})*)\\\\code", "\\11234", e, perl=TRUE))
# \1234

The regex matches both the odd and the even case. To make it work, I need a way to match the double backslashes, "\\", more greedily (whenever they are present), so that the second backslash of an escaped pair cannot start a match on its own.

Of course, if there is a better strategy to match a "\sequence" (when not escaped) that would be equally fine.



Solution 1:[1]

You may use

rx <- "(?<!\\\\)(?:\\\\{2})*\\K\\\\code"

Replace with 1234.

Details

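  • (?<!\\) - a negative lookbehind that fails the match if there is a \ immediately to the left of the current location
  • (?:\\{2})* - match and consume 0 or more occurrences of a double backslash
  • \K - match reset operator that discards all text matched so far, so the consumed backslash pairs are not replaced
  • \\code - the literal text \code.

The parts above are shown as the regex engine sees them; inside an R string literal each backslash is doubled once more.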
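To see what the lookbehind contributes, here is a minimal sketch that runs the pattern on the even case with and without it. The variable names are illustrative, and the second pattern is not part of the answer; it is only the same regex with the lookbehind removed.

```r
# Patterns as R string literals; the second one is the answer's regex
# with the lookbehind removed (for comparison only).
with_lookbehind    <- "(?<!\\\\)(?:\\\\{2})*\\K\\\\code"
without_lookbehind <- "(?:\\\\{2})*\\K\\\\code"

even <- "\\\\code"  # the text \\code (an escaped backslash, then "code")

res_with    <- gsub(with_lookbehind, "1234", even, perl = TRUE)
res_without <- gsub(without_lookbehind, "1234", even, perl = TRUE)

message(res_with)     # \\code  (left alone)
message(res_without)  # \1234   (wrongly replaced)
```

Without the lookbehind, the engine can start a match at the second backslash, which is the same failure as in the question's attempt.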

R demo:

rx <- "(?<!\\\\)(?:\\\\{2})*\\K\\\\code"
message(gsub(rx, "1234", "\\\\\\code", perl=TRUE)) # \\\code
# => \\1234
message(gsub(rx, "1234", "\\\\code", perl=TRUE))   # \\code
# => \\code
message(gsub(rx, "1234", "\\code", perl=TRUE)) # \code
# => 1234
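If you would rather stay close to the capture-group attempt from the question, the same lookbehind can be combined with a group instead of \K. This is a sketch, not part of the original answer; the function name replace_unescaped_code and its value argument are made up for illustration, and value is assumed to contain no backslashes (gsub would interpret them in the replacement).

```r
# Hypothetical helper: replace an unescaped \code with `value`.
# Group 1 captures the preceding escaped pairs and is written back via \1.
replace_unescaped_code <- function(x, value = "1234") {
  rx <- "(?<!\\\\)((?:\\\\{2})*)\\\\code"
  gsub(rx, paste0("\\1", value), x, perl = TRUE)
}

inputs <- c("\\code", "\\\\code", "\\\\\\code")  # \code, \\code, \\\code
out <- replace_unescaped_code(inputs)
message(paste(out, collapse = "\n"))
# 1234
# \\code
# \\1234
```

The only change from the question's attempt is the leading (?<!\\\\), which stops a match from starting in the middle of a run of backslashes.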

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1