Examples of using an External Tokenizer

I’m very new to CodeMirror 6, so forgive my ignorance, though I do have experience with ANTLR and Flex/Bison/Yacc.

Are there examples of creating and hooking up an external tokenizer somewhere? So far, I have only found isolated examples that aren’t very helpful because they lack explanation.

More specifically, I want “LABEL” tokens of the form `^[A-Za-z][^\n()\[\]]+:`, which can contain whitespace and must start at the beginning of a line. The tokenizer must not consume the newline character before the label (if there even is one), and it must handle a label on the first line. I also want a “POWER” token of the form `-?[1-9][0-9]?`, where ‘-’ is otherwise a skip token. The rest of my tokens can keep using my current *.grammar file, with lower priority of course.

Most of the parsers in the Lezer GitHub org use an external tokenizer of some kind. The indentation tokenizer for Python also needs to ensure it is directly after a newline, so it may be a useful example.
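
For the LABEL and POWER tokens described in the question, a rough sketch of the token functions might look like this. It is not taken from any existing Lezer grammar. The names `StreamLike`, `StringStream`, `readLabel`, and `readPower` are hypothetical, as are the term numbers used at the bottom. The sketch relies only on `peek` and `acceptToken`, which the `InputStream` in @lezer/lr provides, and it assumes `peek` returns -1 outside the input. Because it only peeks, it never consumes the preceding newline, and a label on the first line is handled by the -1 check.

```typescript
// Subset of the @lezer/lr InputStream interface that this sketch relies on.
interface StreamLike {
  peek(offset: number): number; // code unit at pos + offset; assumed -1 outside the input
  acceptToken(term: number, endOffset?: number): void;
}

const NEWLINE = 10, MINUS = 45, COLON = 58;
const isLetter = (c: number) => (c >= 65 && c <= 90) || (c >= 97 && c <= 122);
// A label run stops at end of input, a newline, or one of ( ) [ ]
const endsRun = (c: number) =>
  c < 0 || c === NEWLINE || c === 40 || c === 41 || c === 91 || c === 93;

// LABEL: ^[A-Za-z][^\n()\[\]]+:
function readLabel(input: StreamLike, labelTerm: number): void {
  const prev = input.peek(-1);
  if (prev !== -1 && prev !== NEWLINE) return; // not at the start of a line
  if (!isLetter(input.peek(0))) return;
  let lastColon = -1;
  for (let off = 1; !endsRun(input.peek(off)); off++) {
    // The `+` needs at least one character between the letter and the colon.
    if (input.peek(off) === COLON && off >= 2) lastColon = off;
  }
  // Like the greedy regex, end the token just after the last colon in the run.
  if (lastColon > 0) input.acceptToken(labelTerm, lastColon + 1);
}

// POWER: -?[1-9][0-9]?
function readPower(input: StreamLike, powerTerm: number): void {
  let off = input.peek(0) === MINUS ? 1 : 0;
  const first = input.peek(off);
  if (first < 49 || first > 57) return; // no token: a lone '-' is left to the grammar
  off++;
  const second = input.peek(off);
  if (second >= 48 && second <= 57) off++;
  input.acceptToken(powerTerm, off);
}

// Hypothetical stand-in for InputStream, only for trying the functions out.
class StringStream implements StreamLike {
  token = -1;
  end = -1;
  text: string;
  pos: number;
  constructor(text: string, pos = 0) {
    this.text = text;
    this.pos = pos;
  }
  peek(offset: number): number {
    const i = this.pos + offset;
    return i < 0 || i >= this.text.length ? -1 : this.text.charCodeAt(i);
  }
  acceptToken(term: number, endOffset = 0): void {
    this.token = term;
    this.end = this.pos + endOffset;
  }
}

const LABEL = 1, POWER = 2; // hypothetical term ids; real ones come from the generated terms file
const demo = new StringStream("x\nAb: c", 2); // positioned just after the newline
readLabel(demo, LABEL);
console.log(demo.token === LABEL, demo.end);
```

In a real setup, each function would be wrapped as `new ExternalTokenizer(input => readLabel(input, Label))` with `ExternalTokenizer` imported from @lezer/lr, and declared in the grammar with something like `@external tokens labelTokenizer from "./tokens" { Label }`. As far as I understand it, placing that declaration before the `@tokens` block is what gives the external tokens priority over the regular ones.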

I ended up searching GitHub for examples and there were plenty.

I got it working. Turned out to be simple.