Regex to capture only numeric fields and strip $ and comma, with no match if there are any alphanumeric characters

I'm trying to write a regex that will strip out $ and , from a value and not match at all if there are any other non-numerics.

$100 -> 100
$12,203.00 -> 12203.00
12JAN2022 -> no match

I have gotten sort of close with this:

^(?:[$,]*)(([0-9.]{1,3})(?:[,.]?))+(?:[$,]*)$

However, this doesn't capture the full numeric value in $1, because the repeated digit groups end up as separate sub-captures, as you can see here: https://regex101.com/r/4bOJtB/1



Solution 1:[1]

In .NET, a group keeps every capture it makes, so you can use a named capturing group to capture all parts of the number and then concatenate them. That said, it is more straightforward to remove the characters you do not need in a post-processing step.

Here is some example code:

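using System;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = @"^\$*(?:(?<v>\d{1,3})(?:,(?<v>\d{3}))*|(?<v>\d+))(?<v>\.\d+)?$";
var tests = new[] {"$100", "$12,203.00", "12JAN2022"};
foreach (var test in tests) {
    // Group "v" keeps every capture; join them to rebuild the number.
    // A failed match has no captures, so the result is an empty string.
    var result = string.Concat(Regex.Match(test, pattern)
            .Groups["v"].Captures.Cast<Capture>().Select(x => x.Value));
    Console.WriteLine("{0} -> {1}", test, result.Length > 0 ? result : "No match");
}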

Output:

$100 -> 100
$12,203.00 -> 12203.00
12JAN2022 -> No match

The regex is

^\$*(?:(?<v>\d{1,3})(?:,(?<v>\d{3}))*|(?<v>\d+))(?<v>\.\d+)?$

Details:

  • ^ - start of string
  • \$* - zero or more dollar symbols
  • (?:(?<v>\d{1,3})(?:,(?<v>\d{3}))*|(?<v>\d+)) - either one to three digits (captured into Group "v") and then zero or more occurrences of a comma and then three digits (captured into Group "v"), or one or more digits (captured into Group "v")
  • (?<v>\.\d+)? - an optional occurrence of . and one or more digits (all captured into Group "v")
  • $ - end of string.
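The post-processing alternative mentioned at the start of this answer could look roughly like the sketch below: validate the whole string first, then delete the $ and , characters from it. The pattern is the one above with the named groups removed, and TryCleanNumber is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Same number shape as the pattern above, without the named groups.
var pattern = @"^\$*(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d+)?$";

// Hypothetical helper: returns true and the cleaned number when the
// whole input matches, otherwise false and an empty string.
bool TryCleanNumber(string input, out string cleaned)
{
    cleaned = Regex.IsMatch(input, pattern)
        ? input.Replace("$", "").Replace(",", "")
        : "";
    return cleaned.Length > 0;
}

foreach (var test in new[] { "$100", "$12,203.00", "12JAN2022" })
{
    var ok = TryCleanNumber(test, out var result);
    Console.WriteLine("{0} -> {1}", test, ok ? result : "No match");
}
```

Because validation happens before the replacement, the comma grouping is still checked, so an input such as 1,2,3 is rejected rather than turned into 123.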

Solution 2:[2]

I don't know how to achieve this in a single regexp, but in my opinion dividing the problem into smaller steps is a good idea: it's easier to implement, and easier to maintain and understand later, without spending time deciphering regex magic.

  1. Replace all $ and , with an empty string: [$,] => (empty string)

  2. Match only digits and periods as a capture group: ^((\d{1,3}\.?)+)$ (you may need to adjust this to your requirements, e.g. where periods are allowed).
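As a rough sketch, the two steps could be written in C# as shown below. TryStripAndValidate is a hypothetical helper name, and the step-2 pattern is a stricter substitute for the one in the list (an assumption on my part): it allows at most one period, and only between digits.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper implementing the two steps.
bool TryStripAndValidate(string input, out string number)
{
    // Step 1: delete every "$" and ",".
    var stripped = Regex.Replace(input, @"[$,]", "");

    // Step 2 (assumed stricter pattern): digits, then an optional
    // fractional part; anything else is rejected.
    var valid = Regex.IsMatch(stripped, @"^\d+(?:\.\d+)?$");
    number = valid ? stripped : "";
    return valid;
}

foreach (var test in new[] { "$100", "$12,203.00", "12JAN2022" })
{
    var ok = TryStripAndValidate(test, out var value);
    Console.WriteLine("{0} -> {1}", test, ok ? value : "No match");
}
```

One trade-off of stripping first: the comma positions are never checked, so an input like 1,2,3 would be accepted as 123. If that matters, validate the grouping before removing the commas.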

Hope this helps!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew
Solution 2