Although the original idea of using an external tokenizer is probably still the right approach for most people, I found that my language is simple enough that I can include a minimal tokenizer in the outer language by expanding on the example in SrivishnuR’s question. It won’t fully support something like JavaScript (backtick template literals inside an expression aren’t handled correctly, for example), but I hope it will help others who are learning to use Lezer.
@top TemplateString { (Expression | StringContent)* }

@local tokens {
  ExpressionStart[closedBy=ExpressionEnd] { "${" }
  @else StringContent
}

@precedence {
  Expression
  StringContent
}

@skip {} {
  Expression {
    ExpressionStart Interpolation ExpressionEnd
  }
  Interpolation {
    (InnerString | ScriptText)*
  }
}

@local tokens {
  // Get strings, but don't let them be terminated by escaped quotes
  InnerString { '"' (("\\" ![}] ) | ( !["] ))* '"' | "'" (("\\" ![}] ) | ( !['] ))* "'" }
  ExpressionEnd[openedBy=ExpressionStart] { "}" }
  @else ScriptText
}
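
For anyone who wants to see the intended splitting without running Lezer, here is a rough hand-written sketch in TypeScript of what the grammar is meant to do: text outside `${ ... }` becomes `StringContent`, and inside an expression a `}` only closes it when it is not within a quoted string. This is not Lezer code and not how Lezer tokenizes internally. The names `Segment`, `splitTemplate`, `findExpressionEnd`, and `skipQuoted` are made up for illustration, and the sketch assumes that a backslash inside a quoted string escapes whatever character follows it.

```typescript
// Hypothetical illustration of the splitting the grammar aims for.
type Segment = { kind: "StringContent" | "Expression"; text: string };

// Returns the index just past the closing quote of the string that
// starts at `start`, or input.length if the string is unterminated.
function skipQuoted(input: string, start: number): number {
  const quote = input.charAt(start);
  let pos = start + 1;
  while (pos < input.length) {
    const ch = input.charAt(pos);
    if (ch === "\\") {
      pos += 2; // assumption: backslash escapes the next character
    } else if (ch === quote) {
      return pos + 1;
    } else {
      pos++;
    }
  }
  return input.length;
}

// Returns the index of the "}" that closes the expression whose body
// starts at `from`, ignoring any "}" inside quoted strings.
function findExpressionEnd(input: string, from: number): number {
  let pos = from;
  while (pos < input.length) {
    const ch = input.charAt(pos);
    if (ch === "}") return pos;
    if (ch === '"' || ch === "'") {
      pos = skipQuoted(input, pos);
    } else {
      pos++;
    }
  }
  return input.length; // unterminated expression
}

// Splits a template string into literal text and expression bodies.
function splitTemplate(input: string): Segment[] {
  const segments: Segment[] = [];
  let pos = 0;
  let literalStart = 0;
  while (pos < input.length) {
    if (input.slice(pos, pos + 2) === "${") {
      if (pos > literalStart) {
        segments.push({ kind: "StringContent", text: input.slice(literalStart, pos) });
      }
      const end = findExpressionEnd(input, pos + 2);
      segments.push({ kind: "Expression", text: input.slice(pos + 2, end) });
      pos = end < input.length ? end + 1 : end; // step over the closing "}"
      literalStart = pos;
    } else {
      pos++;
    }
  }
  if (literalStart < input.length) {
    segments.push({ kind: "StringContent", text: input.slice(literalStart) });
  }
  return segments;
}
```

Calling `splitTemplate('Hello ${user["}"]}!')` gives three segments: the literal `Hello `, the expression body `user["}"]`, and the literal `!`. The `}` inside the quoted string does not end the expression, which is the same job `InnerString` does in the grammar.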