I have a grammar that looked like this:
    @skip { space }
    @skip {} {
      Value {
        Number Unit?
      }
    }
This causes lezer-generator to produce the error: Inconsistent skip sets after Number
What I want is to accept something like “12m” or “12” but not “12 m”. The example in the System Guide uses an ending token, but I want to avoid that if possible.
The problem is the ? after Unit, which means that it is not clear what to skip after reducing a number. There is currently no solution to this: contextual skipping only works for rules that have a clear end position (such as strings). One hack would be to have a custom tokenizer that distinguishes Number and NumberBeforeUnit, use a numberWithUnit { NumberBeforeUnit Unit } rule in a @skip {} block, and match Number | numberWithUnit in Value (moving Value out of the @skip block).
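The workaround described above might look roughly like the grammar below. This is an untested sketch, not a confirmed solution: the tokenizer name numberTokens, the module path "./tokens", the @top rule, and the Unit definition are placeholders chosen for illustration, and the external tokenizer that tells the two number tokens apart still has to be written separately.

```
@top Program { Value+ }

@skip { space }

// Value sits outside the `@skip {}` block, so whitespace around it is skipped as usual.
Value { Number | numberWithUnit }

// Nothing is skipped inside this block, so no whitespace may separate
// the number from its unit. The rule always ends with Unit, which gives
// it a clear end position.
@skip {} {
  numberWithUnit { NumberBeforeUnit Unit }
}

// Hypothetical external tokenizer: it is assumed to emit NumberBeforeUnit
// when the digits are directly followed by a unit, and Number otherwise.
@external tokens numberTokens from "./tokens" { Number, NumberBeforeUnit }

@tokens {
  // Placeholder definition; substitute the real unit syntax.
  Unit { @asciiLetter+ }
  space { @whitespace+ }
}
```

Because numberWithUnit starts with a lowercase letter, it does not show up as a node in the syntax tree; only NumberBeforeUnit and Unit appear under Value.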
That would work, thanks! In my case, the difference would be that NumberBeforeUnit doesn’t have any whitespace after it, while Number should. Creating an ExternalTokenizer for that should be straightforward, but I wonder if there is syntax within the Lezer grammar notation to do that? Something similar to regex lookahead, basically.
It looks like the official CSS parser also uses an external tokenizer for this, so I assume that is the best way to go at the moment. Thanks again!
A solution that worked for my language was to force a space/eof token after the ?:
    @top Program { Value+ }
    @skip { space }
    @skip {} {
      Value {
        Number Unit? (space | eof)
      }
    }
    @tokens {
      // One or more digits, so that multi-digit numbers like "12" form a single token
      Number { @digit+ }
      Unit { "u" | "n" | "i" | "t" }
      space { @whitespace+ }
      eof { @eof }
    }
YMMV of course.