Problem with quantifiers and look-behind

### Ruby 1.8.7 ###

require 'rubygems'
require 'oniguruma' # for look-behind

Oniguruma::ORegexp.new('h(?=\w*)')
# => /h(?=\w*)/

Oniguruma::ORegexp.new('(?<=\w*)o')
# => ArgumentError: Oniguruma Error: invalid pattern in look-behind

Oniguruma::ORegexp.new('(?<=\w)o')
# => /(?<=\w)o/


### Ruby 1.9.2 rc-2 ###

"hello".match(/h(?=\w*)/)
# => #<MatchData "h">

"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/

"hello".match(/(?<=\w)o/)
# => #<MatchData "o"> 

Can't I use quantifiers inside a look-behind?



Solution 1:[1]

The issue is that Ruby doesn't support variable-length lookbehinds. Quantifiers aren't out per se, but they can't cause the length of the lookbehind to be nondeterministic.
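To illustrate the distinction, here is a minimal sketch. The helper name lookbehind_compiles? is made up for this example. It assumes a Ruby version where Regexp.new raises RegexpError for a pattern the engine rejects. A fixed-count quantifier such as \w{3} keeps the lookbehind length constant, so it should be accepted, while \w* is the form that produced the error in the question.

```ruby
# Hypothetical helper (not part of Ruby): reports whether a pattern compiles.
def lookbehind_compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Fixed-count quantifier: the lookbehind is always exactly three characters.
p lookbehind_compiles?('(?<=\w{3})o')

# Variable-length quantifier: the form rejected in the question above.
p lookbehind_compiles?('(?<=\w*)o')

# The fixed-length version matches the "o" preceded by three word characters.
m = "hello".match(/(?<=\w{3})o/)
p m[0]        # => "o"
p m.begin(0)  # => 4
```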

Perl has traditionally had the same restriction, as have most major regex engines; .NET is a notable exception.

Try a plain match with a capture group, such as (\w*)\W*?o, instead of the lookbehind.
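A rough sketch of that approach follows. The helper name prefix_before_o is invented for illustration. Unlike a lookbehind, the prefix is consumed as part of the overall match, so it is read back from the capture group rather than excluded from the match.

```ruby
# Hypothetical helper: returns the word characters captured before an "o",
# or nil when the text contains no match.
def prefix_before_o(text)
  m = text.match(/(\w*)\W*?o/)
  m && m[1]
end

m = "hello".match(/(\w*)\W*?o/)
p m[0]                      # => "hello" (the whole match includes the prefix)
p m[1]                      # => "hell"  (the part a lookbehind would have checked)

p prefix_before_o("hello")  # => "hell"
p prefix_before_o("xyz")    # => nil
```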

Solution 2:[2]

I was banging my head against the same problem, and Borealid's answer helped explain the issue well.

However, that got me thinking. Maybe the quantifier does not need to be inside the lookbehind, but can be applied on the lookbehind itself?

"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/

"hello".match(/(?<=\w)*o/)
# => #<MatchData "o">

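So now we have a variable number of constant-length lookbehinds, which seems to bypass the issue for me. :) One caveat: * also permits zero repetitions, so it is worth checking that the lookbehind is still being enforced for your input.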

Solution 3:[3]

For those who found this thread in 2022 with Ruby version >= 2.0, use \K:

$ ruby -e 'p ARGV.first.match(/h\K\w*/)' hello
#<MatchData "ello">

Quoted from https://ruby-doc.org/core-3.1.2/doc/regexp_rdoc.html

\K - Uses an positive lookbehind of the content preceding \K in the regexp.
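As a sketch of how this applies to the original question, \K can follow a variable-length prefix such as \w*, which a lookbehind cannot contain. The helper name replace_o_after_word_chars is made up for this example, and Ruby 2.0 or later is assumed.

```ruby
# Text matched before \K is required but dropped from the reported match.
m = "hello".match(/h\K\w*/)
p m[0]        # => "ello"
p m.begin(0)  # => 1

# Hypothetical helper: replaces only the "o", while \w* before \K plays the
# role of the variable-length prefix.
def replace_o_after_word_chars(text, replacement)
  text.sub(/\w*\Ko/, replacement)
end

p replace_o_after_word_chars("hello", "0")  # => "hell0"
```

Because \w* can also match zero characters, this pattern behaves like the (?<=\w*) form from the question rather than the stricter (?<=\w) one.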

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Borealid
Solution 2 lime
Solution 3 Weihang Jian