'Safe regex patterns from ReDos attack

I've recently faced with some redos attack issues.

Explain in simple steps:

Regex denial of services: it means the attacker can put some malicious/crafted inputs to bring your server down by making it impossible to stop to finding the correct pattern, so it takes your whole CPU, and finally causing internal server error.

e.g. if you have a pattern like ^((ab)*)+$, and the attacker put a malicious input like abababababababababababababababababa it will cause a catastrophic error: enter image description here enter image description here

The issue comes to the surface when using Nested quantifiers, quantifying over a sub-expression which can, itself, match in many ways in the same position. as you can see in the below picture(https://jex.im/regulex):

enter image description here

There are many solutions for many pattern matching problems, e.g if you want a pattern for URL there are tons of answers(also in the StackOverflow), which are good answers but almost vulnerable to this kind of attack.

I've found some useful tools like safe-regex: https://www.npmjs.com/package/safe-regex , which works good but have false-positives and false negatives. As you already know, Safe Regex Patterns from Redos Attack are hard to find.

Need

I'm asking, is there any list of safe regex patterns out there to use for common uses like passwords, URLs,etc.?

Useful resource

Useful for only js platform, https://github.com/validatorjs/validator.js

Update

I've struggled with this issue and found there are some libraries like re2, and validator.js, which are good tools, and found out that java solves this problem from v9 and erlang too, but in javascript regex engine still has the problem in chrome, but in firefox, it will throw an error which is good to handle in try cache, and finally, I've put my tries to make a list for this purpose at this Github link:

https://github.com/phoenixdevio/safe-regex-patterns

still couldn't found a good solution. although I know there may be a solution using the atomic group. it will be great if anyone could help with this to make the list more and better.



Solution 1:[1]

This doesn't directy answer your question, but the simplest way (in my experience) to avoid this type of attack is to use a regex library based off of https://github.com/google/re2, as it will not be vulnerable to ReDOS attacks. For node the reference library is https://github.com/uhop/node-re2/

RE2 consciously avoids any regular expression features that require worst-case exponential time to evaluate. These features are essentially those that describe a Context-Free Language (CFL) rather than a Regular Expression, and are extensions to the traditional regular expression language because some people don't know when enough is enough.

The most noteworthy missing features are backreferences and lookahead assertions. If your application uses these features, you should continue to use RegExp. But since these features are fundamentally vulnerable to ReDoS, you should strongly consider replacing them.

If your regex isn't compatible with it, then it is almosy certainly vulnerable to ReDOS, and (as a bonus) many that are typically vulnerable to ReDOS (like your example) are still valid, but not vulnerable in re2.

Solution 2:[2]

You can avoid ReDOS by using atomic groups and possessive quantifiers.

While these features are not supported natively in JS (there is a proposal pending), you can emulate them by using the /(?=(...))\1/ pattern around the bit that would otherwise backtrack. That pattern means that whatever \1 matches will be set in stone for this parse, since the JS RegExp engines won't backtrack into look ahead assertions (/(?=...)/), per spec.

Adding visual noise to an already complex RegExp may not be everyone's cup of tea though, and in that case, you may want to use a library like compose-regrexp that provides an atomic() helper that can be composed with other RegExp-building functions:

import {atomic, sequence} from 'compose-regexp'

const ReDOS = /^((ab)*)+$/
const fixed = sequence(/^/, atomic(/((ab)*)+/), /$/)

You can see it in action here.

Here's the console log if you don't want to leave SO:

match success:  true
REDOS, matching: ababababababababababababababab: 0.6982421875 ms
match success:  true
fixed, matching: ababababababababababababababab: 0.18505859375 ms
match success:  false
REDOS, matching: abababababababababababababababa: 1.5869140625 ms
match success:  false
fixed, matching: abababababababababababababababa: 0.241943359375 ms
match success:  false
REDOS, matching: ababababababababababababababababa: 2.820068359375 ms
match success:  false
fixed, matching: ababababababababababababababababa: 0.14306640625 ms
match success:  false
REDOS, matching: abababababababababababababababababa: 5.0400390625 ms
match success:  false
fixed, matching: abababababababababababababababababa: 0.135986328125 ms
match success:  false
REDOS, matching: ababababababababababababababababababa: 9.785888671875 ms
match success:  false
fixed, matching: ababababababababababababababababababa: 0.131103515625 ms

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 dave
Solution 2