How to deal with the “Inconsistent skip sets” error?

I have a grammar that looks like this:

@skip { space }
@skip{}{
	Value {
		Number Unit?
	}
}

which causes lezer-generator to produce the error: Inconsistent skip sets after Number

What I want is to accept something like “12m” or “12” but not “12     m”. The System Guide page’s example uses an ending token, but I want to avoid that if possible.

The problem is the ? after Unit, which means that it is not clear what to skip after reducing a number. There is currently no solution to this; contextual skipping only works for rules where there’s a clear end position (such as strings). One hack would be to have a custom tokenizer that distinguishes Number and NumberBeforeUnit, use a numberWithUnit { NumberBeforeUnit Unit } rule in a @skip {} block, and match Number | numberWithUnit in Value (moving Value out of the @skip block).

That would work, thanks! In my case, the difference would be that NumberBeforeUnit doesn’t have any whitespace after it, while Number should. Creating an ExternalTokenizer for that should be straightforward, but I wonder if there is syntax within the Lezer grammar language to do that? Something similar to regex lookahead, basically.

It looks like the official CSS parser also uses an external tokenizer for this, so I assume that is the best way to go at the moment. Thanks again!
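
For anyone landing here later, this is a minimal sketch of what the external tokenizer approach discussed above could look like. It assumes a unit always starts with a lowercase ASCII letter and that numbers are plain digit runs. The names readNumber and NumberTerm, the placeholder term ids, and the file names in the comments are illustrative assumptions, not taken from any existing parser. The token function only relies on the `next`, `advance()` and `acceptToken()` members of the input stream.

```javascript
// Sketch of the token function for the Number / NumberBeforeUnit workaround.
// In a real project the term ids come from the generated terms file and the
// function is wrapped in an ExternalTokenizer, roughly:
//
//   import {ExternalTokenizer} from "@lezer/lr"
//   import {Number as NumberTerm, NumberBeforeUnit} from "./parser.terms.js"
//   export const numbers = new ExternalTokenizer(readNumber)
//
// with a grammar along these lines (file and export names are assumptions):
//
//   @external tokens numbers from "./tokens.js" { Number, NumberBeforeUnit }
//   @skip { space }
//   Value { Number | numberWithUnit }
//   @skip {} { numberWithUnit { NumberBeforeUnit Unit } }

// Placeholder term ids so the sketch runs on its own (hypothetical values).
const NumberTerm = 1, NumberBeforeUnit = 2

const isDigit = ch => ch >= 48 && ch <= 57       // "0".."9"
const isUnitStart = ch => ch >= 97 && ch <= 122  // "a".."z", assumed unit alphabet

function readNumber(input) {
  if (!isDigit(input.next)) return               // not a number: emit no token
  while (isDigit(input.next)) input.advance()    // consume the digit run
  // One character of lookahead decides which token to emit.
  input.acceptToken(isUnitStart(input.next) ? NumberBeforeUnit : NumberTerm)
}
```

With this, “12m” should tokenize as NumberBeforeUnit followed by Unit, while the “12” in “12     m” becomes a plain Number, so the whitespace-separated unit is not attached to it.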

A solution that worked for my language was to force a space/eof token after the ?:

@top Program { Value+ }

@skip { space }
@skip{}{
	Value {
		Number Unit? (space | eof)
	} 
}

@tokens {
  Number { @digit+ }
  Unit {"u"|"n"|"i"|"t"}
  space { @whitespace+ }
  eof { @eof }
}

YMMV of course.