parseMixed with outer StreamLanguage parser

EricMCornelius · October 24, 2023, 4:40pm

Hi there,

Struggling a little bit to figure out what a sensible approach would be here.

I’ve got a YAML document with embedded JavaScript snippets in the string blocks.

Using the legacy StreamLanguage parser works fine on its own for YAML support.

However, I was hoping to use the parseMixed functionality as per the Mixed Language example, but it seems like there’s no means to nest a lezer parser inside a non-lezer outer language?

Is there any way to still use overlays to get full language support for the subsets of the document which are JavaScript? Just looking for where to actually get started, as I’m not conversant enough in the project yet to understand how languages are assigned to specific document subsets.

marijn · October 25, 2023, 6:49am

No—stream modes only emit tokens, no bigger syntax tree nodes, so the mixed parser doesn’t have anything that would tell it the extent of the JavaScript snippet.

EricMCornelius · October 25, 2023, 2:52pm

Hi Marijn,

I’ve already got the ability to extract the code block string literal locations from a separate parsing pass that is executed in a change observer, which means I can use the JS language’s parser.parse call to get back trees for the appropriate subset ranges of the document.

I’m stuck at that point though, not clear to me how one might overlay or merge those trees when the outer language is a StreamLanguage or whether that’s entirely impossible.

The overlay logic seems to already take priority over the outer language in that manner, but perhaps there’s no way to specify that behavior as an overlay range vs a direct node replacement?

Was hoping there might be a simple solution to the conceptual idea of intercepting all language logic in the manner of a node overlay for a document range instead.

marijn · October 25, 2023, 3:07pm

One solution would be to use a custom top-level parser that divides the document into YAML and JavaScript parts, and then parseMixed to apply the stream parser to the YAML and the JavaScript parser to the JS parts.
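
A minimal sketch of just the dividing step in that suggestion, not a full parser. It assumes (hypothetically) that the JavaScript lives in the bodies of YAML block scalars, i.e. lines following `key: |` or `key: >` that are indented deeper than the key. `Segment`, `divideDocument`, and the header regex are illustrative names, not part of CodeMirror or Lezer. A custom top-level parser could emit one node per segment, so that parseMixed has node extents to attach the stream parser or the JavaScript parser to.

```typescript
// Hypothetical segment type: one contiguous region of the document.
type Segment = { kind: "yaml" | "javascript"; from: number; to: number };

// Assumption: a block scalar starts on a line like "key: |" or "key: >",
// optionally with a chomping indicator (+ or -).
const blockScalarHeader = /^( *)[^\s#:][^:]*:\s*[|>][+-]?\s*$/;

function divideDocument(doc: string): Segment[] {
  const segments: Segment[] = [];
  const lines = doc.split("\n");
  let yamlStart = 0; // start offset of the YAML segment not yet emitted
  let pos = 0;       // start offset of lines[i]
  let i = 0;

  while (i < lines.length) {
    const header = blockScalarHeader.exec(lines[i]);
    const nextPos = pos + lines[i].length + 1;
    if (!header) {
      pos = nextPos;
      i++;
      continue;
    }

    // Collect following lines that are blank or indented deeper than the key.
    const keyIndent = header[1].length;
    const bodyFrom = nextPos;
    let bodyTo = bodyFrom;
    let j = i + 1;
    let p = nextPos;
    while (j < lines.length) {
      const line = lines[j];
      const blank = line.trim() === "";
      if (!blank && line.search(/\S/) <= keyIndent) break;
      if (!blank) bodyTo = p + line.length; // end at the last non-blank body line
      p += line.length + 1;
      j++;
    }

    if (bodyTo > bodyFrom) {
      // Everything up to and including the header line stays YAML.
      segments.push({ kind: "yaml", from: yamlStart, to: bodyFrom });
      segments.push({ kind: "javascript", from: bodyFrom, to: bodyTo });
      yamlStart = bodyTo;
    }
    i = j;
    pos = p;
  }

  // Trailing YAML after the last JavaScript segment (or the whole document).
  if (yamlStart < doc.length) {
    segments.push({ kind: "yaml", from: yamlStart, to: doc.length });
  }
  return segments;
}
```

The segments are contiguous and cover the whole document, so they map directly onto child nodes of a top-level tree. The JavaScript ranges keep their leading indentation, which a JavaScript parser would treat as ordinary whitespace.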