ExternalTokenizer does not work as expected

I’d like to write a JavaScript-based template language that embeds JavaScript code inside {{ }}.

The syntax looks like

aa{{ a }}aa

which should output a syntax tree


I’m not using the template grammar from the mixed-parsing documentation because it doesn’t handle {{ "}}" }} and some other cases well. So I decided to build on the official lezer-javascript package instead.
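
To illustrate why an input like {{ "}}" }} is awkward, here is a minimal sketch. It is not taken from the mixed-parsing documentation, and the function names naiveEnd and stringAwareEnd are hypothetical. It assumes a scanner that treats the first }} after the opening {{ as the end of the interpolation, which is one way such a failure can arise, and compares it with a scanner that skips over quoted strings.

```javascript
// Naive approach (assumed for illustration): the interpolation ends at the
// first "}}" found after the opening "{{" at index `open`.
function naiveEnd(src, open) {
  const close = src.indexOf("}}", open + 2)
  return close === -1 ? -1 : close + 2
}

// String-aware approach: skip over single- and double-quoted strings so that
// a "}}" inside a string literal is not mistaken for the closing delimiter.
function stringAwareEnd(src, open) {
  let i = open + 2
  while (i < src.length) {
    const ch = src[i]
    if (ch === '"' || ch === "'") {
      // Move to the matching quote, stepping over backslash escapes.
      i++
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++
        i++
      }
      i++ // step past the closing quote
    } else if (ch === "}" && src[i + 1] === "}") {
      return i + 2
    } else {
      i++
    }
  }
  return -1 // no closing delimiter found
}

const sample = '{{ "}}" }}'
console.log(sample.slice(0, naiveEnd(sample, 0)))       // cut off inside the string
console.log(sample.slice(0, stringAwareEnd(sample, 0))) // the whole interpolation
```

Even the string-aware version ignores template literals, comments, regular expressions, and nested object literals, which is why reusing a full JavaScript grammar for the expression part seemed like the more robust route.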

Here are my changes to lezer-javascript:

  1. Added some lines to the grammar file
@top JSTemplate { (Text | CustomInterpolation)* }

CustomInterpolation { "{{" expression? "}}" }

@external tokens tokenizeText from "./tokens" { Text }
  2. Implemented the external tokenizer tokenizeText, which provides the Text token (to better explain the issue, I simplified the logic so that it only treats the character a as a Text token)
export const tokenizeText = new ExternalTokenizer((input, stack) => {
  if (String.fromCharCode(input.next) === 'a') {
    input.acceptToken(Text, 1) // assumed: accept the single character as a Text token
  }
})

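For reference, a non-simplified Text tokenizer could look roughly like the sketch below. This is only an outline under stated assumptions: the term id Text = 1 is a placeholder (the real value comes from the generated terms file), and scanText, MockInput, and runScan are hypothetical names. MockInput is a stand-in that imitates just the next, peek, advance, and acceptToken members of the input stream, so the scanning logic can be tried outside the parser.

```javascript
// Placeholder term id (assumption); in a real build, import Text from the
// generated terms file instead.
const Text = 1

// Tokenizer body: consume characters until "{{" or the end of input, then
// emit a Text token if anything was consumed.
function scanText(input) {
  let consumed = 0
  while (input.next !== -1) {
    // 123 is the character code of "{"
    if (input.next === 123 && input.peek(1) === 123) break
    input.advance()
    consumed++
  }
  if (consumed > 0) input.acceptToken(Text)
}

// Hypothetical stand-in for the input stream, covering only what scanText uses.
class MockInput {
  constructor(text, pos = 0) {
    this.text = text
    this.pos = pos
    this.token = null
  }
  // Code unit at the current position, or -1 at the end of the input.
  get next() {
    return this.pos < this.text.length ? this.text.charCodeAt(this.pos) : -1
  }
  // Code unit at an offset from the current position, or -1 out of range.
  peek(offset) {
    const i = this.pos + offset
    return i >= 0 && i < this.text.length ? this.text.charCodeAt(i) : -1
  }
  advance(n = 1) {
    this.pos += n
    return this.next
  }
  // Record the accepted token; its end defaults to the current position.
  acceptToken(type, endOffset = 0) {
    this.token = { type, end: this.pos + endOffset }
  }
}

// Run the scan from a given position and return the accepted token, if any.
function runScan(text, pos = 0) {
  const input = new MockInput(text, pos)
  scanText(input)
  return input.token
}

console.log(runScan("aa{{ a }}aa"))    // Text token covering the leading "aa"
console.log(runScan("{{ a }}"))        // null, nothing to consume before "{{"
console.log(runScan("aa{{ a }}aa", 9)) // Text token covering the trailing "aa"
```

In the real tokens file, the same function would be wrapped as new ExternalTokenizer(scanText), with ExternalTokenizer imported from @lezer/lr.
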
After making these changes, I built the parser and used it with the following code:

const templateParser = parser.configure({
  top: 'JSTemplate'
})

templateParser.parse(`aa{{ a }}aa`)

The following syntax tree was generated


The expected syntax tree should be


It seems tokenizeText is not called at the beginning of the input, and no Text token is generated there.

Am I doing something wrong that causes this strange behavior?

Did you import the Text term?

import { Text } from './yourgrammar.terms'

Your grammar works for me here https://lezer-playground.vercel.app/

Thanks for the reply. Text is already imported in my code.

I made a repro link here: GitHub - pure-bot/javascript: A JavaScript lezer grammar

After cloning the repo, install the dependencies, build the parser with npm run build, and then run the test.mjs file with the node test.mjs command.

You can see the following output: