Issue with overlapping tokens

I am having trouble figuring out how to deal with overlapping tokens. Here is an example of the behavior I want:

# 1. Two numbers
100 99 ==> Program(Number, Number)

# 2. Two numbers, one negative
100 -99 ==> Program(Number, Number)

# 3. Verb usage of `-`, with space
100- 99 ==> Program(Number, Verb, Number)

# 4. Verb usage of `-`, without space
100-99 ==> Program(Number, Verb, Number)

And here is my grammar, simplified for this example:

@top Program { Number space Number | Number space? Verb space? Number }
@tokens {
  @precedence { Number, Verb }
  Verb { "-" }
  Number { "-"? @digit+ }
  space { " "+ }
}

The fourth test case is failing.

Without the precedence, the error is: Overlapping tokens Number and Verb used in same context (example: "-" vs "-0").

How can I get the tokenizer to use the surrounding context? If `-` directly follows a digit (no space between them), then it has to be a Verb and not part of a Number. Otherwise, it’s part of a Number.

If your language is whitespace-sensitive, you either have to explicitly include the space in the grammar (instead of using @skip), or you have to use external tokenizers that look at surrounding text and make decisions based on that. Lezer’s built-in tokens are strictly regular, so they don’t support lookahead/lookbehind.
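To make the look-behind rule concrete, here is a minimal standalone sketch of the decision an external tokenizer would have to make. It is not Lezer code: `tokenize`, `Kind`, and `isDigit` are hypothetical names made up for illustration, and it works on a plain string rather than on Lezer's input stream. In a real grammar the same check would presumably live in an `ExternalTokenizer` from `@lezer/lr`, using `input.peek(-1)` to inspect the previous character; that wiring is left out here.

```typescript
// Hypothetical stand-in for the two token kinds in the grammar above.
type Kind = "Number" | "Verb";

const isDigit = (ch: string | undefined): boolean =>
  ch !== undefined && ch >= "0" && ch <= "9";

// Splits the input into Number/Verb tokens, skipping spaces.
// A "-" starts a Number only when it does NOT directly follow a digit
// and IS directly followed by a digit; otherwise it is a Verb.
function tokenize(text: string): Kind[] {
  const out: Kind[] = [];
  let pos = 0;
  while (pos < text.length) {
    const ch = text[pos];
    if (ch === " ") {
      pos++;
      continue;
    }
    // Look-behind: the character immediately before the current one.
    const prev = pos > 0 ? text[pos - 1] : undefined;
    const startsNumber =
      isDigit(ch) ||
      (ch === "-" && !isDigit(prev) && isDigit(text[pos + 1]));
    if (startsNumber) {
      pos++; // consume the first digit or the leading "-"
      while (isDigit(text[pos])) pos++;
      out.push("Number");
    } else if (ch === "-") {
      pos++;
      out.push("Verb");
    } else {
      throw new Error(`Unexpected character ${JSON.stringify(ch)} at ${pos}`);
    }
  }
  return out;
}
```

With this rule, `tokenize("100 -99")` yields two Numbers, while `tokenize("100-99")` and `tokenize("100- 99")` both yield Number, Verb, Number, matching the four test cases in the question.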

Ok, thanks. I don’t think I was using @skip in my example, unless there is some kind of default skip.

The external tokenizer worked well for this too. Thanks.