Limit integer and decimal parts length in flex

Is there a way I can extract the number of digits before and after the '.' in a float in Flex? I want to limit the integer part to 4 digits and the decimal part to 8 digits, so that the total length, including the '.', is at most 13 characters.

This is what I did:


I get an error only when the float is longer than 13 characters. When the integer part has more than 4 digits or the decimal part has more than 8 digits, I don't get any error.

Thank you for your help.



Solution 1:[1]

In (f)lex, macros ({...}) are just macros; they're replaced with their definition (normally surrounded by parentheses to avoid the usual problem with macro expansion). So you can't use {IntPart} and {DecPart} to perform actions on subsequences in the {float} pattern. Either {float} (that is, the macro expansion of that macro) matches, or one of the other two patterns matches.

That's going to have confusing results because your {IntPart} pattern does not match what you want it to. You want it to match either a 0 or an integer which doesn't start with 0. That would be [1-9][0-9]*|0. The pattern ([1-9][0-9])*|0 instead matches 0 or integers of even length (and then only if the digits in odd positions are not 0). Other integers will be matched by the {DecPart} pattern, which is also active (because all rules are always active, unless you use scanner states).

Since some integers match one of those patterns and other integers match the other one, it's quite possible that the wrong length test will be applied. The integer 12345, for example, will match the {DecPart} pattern and will be compared with the length 8, so it won't trigger an error message. So the first thing you should do is to try to fix your patterns so that they actually match what you want them to, remembering that you need to match the entire token.
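
The difference between the two integer patterns can be checked outside flex. The following is a minimal sketch that assumes POSIX regcomp/regexec are available; the helper matches_whole and the names BROKEN_INT and FIXED_INT are hypothetical, and the ^ and $ anchors are added only to imitate a whole-token match.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical names for the two patterns discussed above, anchored so
   that the whole string has to match. */
static const char *const BROKEN_INT = "^(([1-9][0-9])*|0)$";
static const char *const FIXED_INT  = "^([1-9][0-9]*|0)$";

/* Returns 1 if text matches the POSIX extended regex pattern, 0 if it
   does not match or the pattern fails to compile. */
int matches_whole(const char *pattern, const char *text)
{
    regex_t re;
    int matched;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With these definitions, matches_whole(BROKEN_INT, "12345") should be 0 while matches_whole(FIXED_INT, "12345") should be 1, which is the odd-length case described above.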

You could, of course, send off partial tokens, thereby complicating the grammar somewhat. One way to do that would be to use something like this:

0|[1-9][0-9]*      { /* integer part: at most 4 digits */
                     if (yyleng <= 4) return INT_PART;
                     fprintf(stderr, "%s: Integer part too long\n", yytext);
                     return BAD_TOKEN;
                   }
"."[0-9]+          { /* decimal part: the '.' plus at most 8 digits */
                     if (yyleng <= 9) return DEC_PART;
                     fprintf(stderr, "%s: Decimal part too long\n", yytext);
                     return BAD_TOKEN;
                   }

but then your parser will have to stick the two things together. And the parser doesn't actually know whether there was whitespace between the two parts, so that's going to need some more work. [Note 1]

Personally, I'd just do the match and later on check to make sure that neither part is too long. Or convert the whole thing to a floating point number and compare it with 10000, which is undoubtedly the simplest option.
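
As a rough sketch of that "match first, check later" approach, the helper below could be called from the action of a single float rule. Everything named here (check_parts, value_in_range, the result codes and the two limits) is hypothetical and not part of flex; the limits of 4 and 8 come from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Limits from the question; the names are illustrative only. */
static const size_t max_int_digits = 4;
static const size_t max_dec_digits = 8;

/* Hypothetical result codes for the length check. */
enum part_check { PARTS_OK, INT_TOO_LONG, DEC_TOO_LONG };

/* Check a token that the scanner has already matched as a float,
   for example with a pattern like (0|[1-9][0-9]*)"."[0-9]+ . */
enum part_check check_parts(const char *text)
{
    const char *dot;
    size_t int_len, dec_len;

    assert(text != NULL);
    dot = strchr(text, '.');
    /* Digits before the '.', or the whole token if there is no '.'. */
    int_len = dot ? (size_t)(dot - text) : strlen(text);
    /* Digits after the '.', or none if there is no '.'. */
    dec_len = dot ? strlen(dot + 1) : 0;

    if (int_len > max_int_digits) return INT_TOO_LONG;
    if (dec_len > max_dec_digits) return DEC_TOO_LONG;
    return PARTS_OK;
}

/* Alternative: convert and compare with 10000. Note that this only
   bounds the integer part, not the number of decimal digits. */
int value_in_range(const char *text)
{
    assert(text != NULL);
    return strtod(text, NULL) < 10000.0;
}
```

Inside a flex action the call would presumably be check_parts(yytext), with the action printing an error and returning a bad-token code when the result is not PARTS_OK.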


Notes:

  1. You could use a lexer state (a "start condition") to only allow the decimal part rule immediately after an integer part has been recognised. And you could even condition that all on the existence of the ".". So technically, you could do what you want. But it's a lot of work for little purpose, and the consequence will be code which is much harder to maintain.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1