Nested Autocomplete for Custom Languages

AashJ · January 25, 2023, 11:34pm

When going through the examples, I noticed that I can configure a basic mixed HTML parser with autocomplete via the following:

import { parser as htmlParser } from "@lezer/html";
import { parseMixed } from "@lezer/common";
import {
  foldInside,
  foldNodeProp,
  LanguageSupport,
  LRLanguage,
} from "@codemirror/language";
import { javascript, javascriptLanguage } from "@codemirror/lang-javascript";

export const mixedHTMLParser = htmlParser.configure({
  wrap: parseMixed((node) =>
    node.name == "ScriptText" ? { parser: javascriptLanguage.parser } : null
  ),
  props: [foldNodeProp.add({ Element: foldInside })],
});

const mixedHTMLLanguage = LRLanguage.define({ parser: mixedHTMLParser });

export const mixedHTML = () => {
  return new LanguageSupport(mixedHTMLLanguage, [javascript().support]);
};

When using this plugin, I get all the language support from JavaScript inside of script tags!

However, when I implement my own mixed language with JavaScript language support:

import { parser as templateParser } from "lib/parser";
import { parseMixed } from "@lezer/common";
import {
  foldInside,
  foldNodeProp,
  LanguageSupport,
  LRLanguage,
} from "@codemirror/language";
import { javascript, javascriptLanguage } from "@codemirror/lang-javascript";

export const templateStringParser = templateParser.configure({
  wrap: parseMixed((node) => {
    return node.name == "JavaScriptText"
      ? { parser: javascriptLanguage.parser }
      : null;
  }),
  props: [
    foldNodeProp.add({
      Expression: foldInside,
    }),
  ],
});

export const templateStringLanguage = LRLanguage.define({
  parser: templateStringParser,
});

export const templateString = () => {
  return new LanguageSupport(templateStringLanguage, [
    javascript({ jsx: false }).support,
  ]);
};

I get syntax highlighting for JavaScript but no autocomplete. Any idea what could be going wrong here? I can post the grammar for the templateParser if needed.

cody · January 26, 2023, 1:41am

Also running into this

marijn · January 26, 2023, 7:10am

I don’t have your grammar, but when I change that code to use the @lezer/html parser and target the UnquotedAttributeValue nodes instead, JavaScript completion works inside of those.

marijn · January 26, 2023, 7:12am

Also running into this highly specific issue 2 hours after the post was made? That’s quite a coincidence. Or, in case you’re a coworker of the OP ‘bumping’ this or a spam account trying to build up post count, please don’t — it’s annoying.

AashJ · January 26, 2023, 6:30pm

Thanks for the prompt response, Marijn. Would the grammar/tokenizer have any impact on how support extensions for nested parsers are run? The HTML example works well for me, so I'm wondering what the difference could be.

Here’s my grammar:

@top TemplateString { Expression+ }

Expression {
    OpenExpression JavaScriptText CloseExpression
}

OpenExpression {
    "{{"
}

JavaScriptText { javascriptText* }

@external tokens scriptTokens from "./tokens.js" {
    javascriptText
    CloseExpression
}

with tokens.js being:

import { ExternalTokenizer } from "@lezer/lr";
import { CloseExpression, javascriptText } from "./parser.terms";

const closeTemplate = 125; // character code of "}"

function expressionTokenizer() {
  return new ExternalTokenizer((input) => {
    // state: 0 = nothing matched, 1 = one "}" matched.
    // i counts the characters consumed so far in this call.
    for (let state = 0, i = 0; ; i++) {
      if (input.next < 0) {
        // End of input: emit whatever text was consumed
        if (i) input.acceptToken(javascriptText);
        break;
      }
      if (input.next == closeTemplate && state == 1) {
        // Second "}" in a row. If text precedes the "}}", emit it up to
        // (but not including) the first "}"; otherwise emit the "}}" itself.
        if (i > 1) input.acceptToken(javascriptText, -1);
        else input.acceptToken(CloseExpression, 1);
        break;
      }
      state = input.next == closeTemplate ? 1 : 0;
      input.advance();
    }
  });
}

export const scriptTokens = expressionTokenizer();

Parsing the string {{asdf}} produces these nodes:
Node TemplateString from 0 to 8
Node Expression from 0 to 8
Node OpenExpression from 0 to 2
Node JavaScriptText from 2 to 6
Node CloseExpression from 6 to 8

Given that tree, it seems like I should be able to just run the JavaScript parser on JavaScriptText and get the appropriate support extensions.

cody · January 26, 2023, 7:13pm

My apologies, I just wanted to follow along with the thread, I should have just bookmarked it or watched it. Won’t happen again!

AashJ · January 27, 2023, 12:08am

Ah, I actually got this to work by removing the ifNotIn block from the JavaScript autocomplete.

Thanks for the help again! I really appreciate it and the work on this library.

AashJ · January 27, 2023, 7:40pm

Interestingly, though, this suggests that my mixed language parsing isn’t exactly correct, as I shouldn’t need to modify the JavaScript language extensions…

AashJ · January 28, 2023, 12:26am

OK, final update for anybody else who’s following along: it turns out my grammar and the JavaScript grammar had a node name collision!

It's important to note that you should prefix the node names in your grammars if you plan on mixing them and don’t want any node name collisions.
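
For readers wondering how a name collision can silence completion, here is a minimal sketch. It does not use CodeMirror at all: the node objects, the completionAllowed helper, and the contents of the dontComplete list are hypothetical stand-ins, and the thread never states which node name collided. The sketch assumes the outer grammar's top node name (TemplateString) also appears in a list of JavaScript node names inside which completion is suppressed, and that the check walks up through all ancestors, including those of the outer language.

```javascript
// Minimal stand-in for a syntax tree node: a name, a parent, and a flag that
// marks the top node of a (possibly nested) parse. Not a real Lezer node.
function node(name, parent = null, isTop = false) {
  return { name, parent, isTop };
}

// Assumed exclusion list; the list used by the real JavaScript support may differ.
const dontComplete = ["String", "TemplateString", "LineComment", "BlockComment"];

// Returns true when completion should be offered at `inner`.
// With stopAtTop = false the walk continues into the outer language's nodes.
function completionAllowed(inner, excluded, stopAtTop) {
  for (let cur = inner; cur; cur = cur.parent) {
    if (excluded.includes(cur.name)) return false;
    if (stopAtTop && cur.isTop) break;
  }
  return true;
}

// Simplified model of the tree from this thread, with a nested JavaScript parse.
const outerTop = node("TemplateString", null, true);
const expression = node("Expression", outerTop);
const jsText = node("JavaScriptText", expression);
const script = node("Script", jsText, true); // top node of the nested parse
const variable = node("VariableName", node("ExpressionStatement", script));

console.log(completionAllowed(variable, dontComplete, false)); // false
console.log(completionAllowed(variable, dontComplete, true)); // true
```

In this model the first call fails only because the walk reaches the outer TemplateString node. Stopping at the nested parse's top node is one conceivable guard and is shown purely for illustration.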

marijn · January 28, 2023, 9:47am

I see. This patch should prevent that in the future. Definitely don’t prefix your node names, that’d get very ugly.

yswang0927 · August 2, 2023, 7:22am

Hi AashJ,

I now also need to parse content like: text {{ var a = Date.now() }} text content. The whole content is plain text, but the content between {{ and }} should be parsed as JavaScript.

Can you help me? Could you share your complete parser code?
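
As a starting point for this kind of input, the following is a rough sketch of locating the {{ ... }} regions in plain text. It is not the code from earlier in the thread; findExpressionRanges is a hypothetical helper, and wiring the resulting ranges into a mixed-language parser is left open. Like the tokenizer above, it treats the first }} after an opening {{ as the end of the expression, so JavaScript that itself contains }} (for example nested object literals) would be cut short.

```javascript
// Hypothetical helper: find the JavaScript regions inside "{{ ... }}" markers.
// Each range covers the text between the delimiters, not the delimiters.
function findExpressionRanges(text) {
  const ranges = [];
  let pos = 0;
  while (pos < text.length) {
    const open = text.indexOf("{{", pos);
    if (open < 0) break; // no more expressions
    const close = text.indexOf("}}", open + 2);
    if (close < 0) break; // unterminated expression: treat the rest as plain text
    ranges.push({ from: open + 2, to: close });
    pos = close + 2;
  }
  return ranges;
}

const sample = "text {{ var a = Date.now() }} text content";
const found = findExpressionRanges(sample);
// Print the JavaScript source of each region
console.log(found.map((r) => sample.slice(r.from, r.to)));
```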