parseMixed with lang-html AttributeValue

I’m trying to implement a lit-html style language parser with parseMixed, using HTML as the outer parser and JavaScript as the inner one. I’ve got parseMixed working with something like:

wrap: parseMixed(node => {
  if (node.name === "Text") {
    return { parser: javascript.parser };
  }
  return null;
})

But when I also match node.name === "AttributeValue", to parse a document like <p title="${someJsExpression}"> where the attribute can be quoted or unquoted, the contents of the AttributeValue node include the double quotes, so it isn’t recognized as a JS expression.

If I use UnquotedAttributeValue and don’t quote the values in the document being parsed, that works fine. But lit-html allows JS expressions in both quoted and unquoted attributes.

I took a look at the HTML Lezer grammar, and it seems my parsing isn’t working because AttributeValue is defined as always including the quotes. So what the JS parser sees is "${some js expression}", quotes included, and it reads that as a plain string literal rather than as the expression inside.

  1. Is there a way to set up parseMixed so that, when quotes are present, only the content between the quotes of the AttributeValue is parsed with the nested language? As far as I can tell, there’s no node type/name in the parser for that.

  2. What would the impact be if the Lezer HTML grammar were changed so that the double quotes became their own node type, like Is (=), and AttributeValue only ever covered the contents inside the quotes?

You can use the overlay property to specify the range of the node that should be parsed with the inner parser.
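
A minimal sketch of what that could look like, assuming (as described in the question) that a quoted AttributeValue node spans both quote characters, so the inner content runs from node.from + 1 to node.to - 1. The helper name nestAttributeValue and its innerParser parameter are hypothetical, and the wiring into parseMixed is shown only as a comment:

```javascript
// Hypothetical helper: builds the value returned from a parseMixed callback
// for a quoted AttributeValue node. `innerParser` stands in for whatever
// parser is being nested (for example the JavaScript parser).
function nestAttributeValue(node, innerParser) {
  if (node.name !== "AttributeValue") return null;
  return {
    parser: innerParser,
    // Only this range is handed to the inner parser; the quote
    // characters at node.from and node.to - 1 are left out.
    overlay: [{ from: node.from + 1, to: node.to - 1 }],
  };
}

// Wiring (not executed here), assuming parseMixed from @lezer/common:
//   wrap: parseMixed(node =>
//     node.name === "Text"
//       ? { parser: javascript.parser }
//       : nestAttributeValue(node, javascript.parser))
```

UnquotedAttributeValue has no quotes to skip, so it can keep returning the nested parser without an overlay.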

awesome, that’s working great!

What is the recommended way to tell if an AttributeValue is just ""? Should I check node.to - node.from > 2 before returning the nested parser, or is there a better way?

Checking for a length greater than 2 seems reasonable.
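
For reference, that check might be sketched as below, assuming the node covers both quote characters so an empty value has a length of exactly 2. The helper name hasQuotedContent is hypothetical:

```javascript
// Hypothetical helper: true when a quoted AttributeValue node spans more
// than its two quote characters, i.e. there is content to hand to the
// nested parser.
function hasQuotedContent(node) {
  return node.to - node.from > 2;
}

// An empty value "" covers just the two quote characters.
const emptyValue = { name: "AttributeValue", from: 9, to: 11 };
// A value such as "${x}" covers the quotes plus four characters.
const filledValue = { name: "AttributeValue", from: 9, to: 15 };

console.log(hasQuotedContent(emptyValue));  // false
console.log(hasQuotedContent(filledValue)); // true
```

In the parseMixed callback, returning null when the check fails leaves the empty value to the outer HTML parser.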