Add interpolation to JSON

Hi, I am extremely new to Lezer and to language parsing in general, so excuse me if this question has been asked before. My limited knowledge makes it hard to find problems similar to mine, so please point me to a thread that already answers this question if one exists.

I want to add interpolation to JSON. That means I want to add to the current grammar a term to match sequences of the kind ${something}.

Now since JSON is a fairly simple language, I managed to do it for the case:

{
  "key": ${my-variable}
}

By adding:

value { True | False | Null | Number | String | Object | Array | Interpolation }

InterpolationStart[closedBy=InterpolationEnd] { "${" }
InterpolationEnd[openedBy=InterpolationStart] { "}" }
Interpolation { InterpolationStart char* InterpolationEnd }

The problem comes when trying to do the same inside strings, since the current token for strings is defined as: string { '"' char* '"' }

I tried doing the following:

String { string | interpolationstring }

interpolationstring { '"' Interpolation '"' }
string { '"' char* '"' }

But as you might have guessed I received the following error: Overlapping tokens string and Interpolation used in same context (example: "\"${}\"")

The error makes sense and I understand why I get it, but I don’t know how to solve it.

The final thing I want to match is something like "${env1} some text ${env2}", but differentiating between “some string” and “${env}” would be a good start.

I think this shows the way towards the solution. Make the string rule use multiple tokens (one for the opening quote, followed by any number of string content or interpolation start tokens, closed by another quote), and allow both content tokens and interpolations inside of it.

Thanks for your help. I think I am very close now. I added the following based on the JavaScript template string:

@local tokens {
  stringEnd { '"' }
  stringEnv[@name='Env'] { "${" char* "}" }
  @else stringContent
}

@skip {} {
  String { '"' (stringContent | stringEnv)* stringEnd }
}

This makes it correctly match envs in strings, but it cannot differentiate the content between two envs.

What am I missing? Here is the full grammar:

@top JsonText { value }

value { True | False | Null | Number | String | Object | Array | Env }

@local tokens {
  stringEnd { '"' }
  stringEnv[@name='Env'] { "${" char* "}" }
  @else stringContent
}

@skip {} {
  String { '"' (stringContent | stringEnv)* stringEnd }
}

Object { "{" list<Property>? "}" }
Array  { "[" list<value>? "]" }

Property { PropertyName ":" value }
PropertyName { string }


@tokens {
  True  { "true" }
  False { "false" }
  Null  { "null" }

  Number { '-'? int frac? exp?  }
  int  { '0' | $[1-9] std.digit* }
  frac { '.' std.digit+ }
  exp  { $[eE] $[+\-]? std.digit+ }

  string { '"' char* '"' }
  char { $[\u{20}\u{21}\u{23}-\u{5b}\u{5d}-\u{10ffff}] | "\\" esc }
  esc  { $["\\\/bfnrt] | "u" hex hex hex hex }
  hex  { $[0-9a-fA-F] }

  InterpolationStart[closedBy=InterpolationEnd] { "${" }
  InterpolationEnd[openedBy=InterpolationStart] { "}" }
  Env { InterpolationStart char* InterpolationEnd }

  whitespace { $[ \n\r\t] }

  "{" "}" "[" "]"
}

@skip { whitespace }
list<item> { item ("," item)* }

@external propSource jsonHighlighting from "./highlight"

@detectDelim

char* will greedily match as much as possible. I would expect you’ll want to put the ${ token in the @local tokens block, and keep the interpolation rule as a whole outside of it, and not make it a single token, so you can do normal tokenizing of its content.
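
As an illustration of that suggestion, here is a minimal, untested sketch modelled on the template-string rules in the Lezer JavaScript grammar. Note that char also matches }, which is why a single "${" char* "}" token runs from the first ${ to the last } in the string. The names stringStart, Interpolation, and interpolationName, and the character set chosen for the interpolation content, are assumptions made for this example. It is a fragment rather than a complete grammar: escape sequences are left out, and it assumes the single-token string is no longer used in positions where String can appear.

```
// The string body is tokenized without skipping whitespace.
@skip {} {
  String { stringStart (stringContent | Interpolation)* stringEnd }
}

// An ordinary rule, not a token: its content is tokenized normally.
Interpolation { InterpolationStart interpolationName? InterpolationEnd }

// Inside a string only these tokens apply; @else catches everything else.
@local tokens {
  InterpolationStart[closedBy=InterpolationEnd] { "${" }
  stringEnd { '"' }
  @else stringContent
}

@tokens {
  stringStart { '"' }
  InterpolationEnd[openedBy=InterpolationStart] { "}" }
  // Hypothetical alphabet for names such as my-variable or env1.
  interpolationName { $[a-zA-Z0-9_\-]+ }
}
```

With this shape, "${env1} some text ${env2}" should come out as one String containing two Interpolation nodes with plain string content between them.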

Thanks. I got closer yet again, but now the closing bracket is not matched correctly anymore and I can’t seem to fix it. Do you have an idea where this could come from? Here is the current grammar:

@top JsonText { value }

value { True | False | Null | Number | String | Object | Array | Env }

@local tokens {
  InterpolationStart[closedBy=InterpolationEnd] { "${" }
  stringEnd { '"' }
  @else stringContent
}

InterpolationEnd[openedBy=InterpolationStart] { "}" }
StringEnv[@name=Env] { InterpolationStart interpolationContent* InterpolationEnd }

@skip {} {
  String { '"' (stringContent | StringEnv)* stringEnd }
}

Object { "{" list<Property>? "}" }
Array  { "[" list<value>? "]" }

Property { PropertyName ":" value }
PropertyName { string }


@tokens {
  True  { "true" }
  False { "false" }
  Null  { "null" }

  Number { '-'? int frac? exp?  }
  int  { '0' | $[1-9] std.digit* }
  frac { '.' std.digit+ }
  exp  { $[eE] $[+\-]? std.digit+ }

  string { '"' char* '"' }
  char { $[\u{20}\u{21}\u{23}-\u{5b}\u{5d}-\u{10ffff}] | "\\" esc }
  esc  { $["\\\/bfnrt] | "u" hex hex hex hex }
  hex  { $[0-9a-fA-F] }
  interpolationContent { @asciiLetter | $[_$\u{a1}-\u{10ffff}] | @digit }

  Env { "${" char* "}" }

  whitespace { $[ \n\r\t] }

  "{" "}" "[" "]"
}

@skip { whitespace }
list<item> { item ("," item)* }

@external propSource jsonHighlighting from "./highlight"

@detectDelim

Do you mean matched as in matchBrackets or as in highlighted? The former should work with the openedBy/closedBy attributes. The highlighting should just be a matter of assigning the correct highlighting tags (which you don’t show in your post).

The colors are:
not highlighted: grey
string: green
interpolation: orange

This is my highlighting:

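import {styleTags, tags} from "@lezer/highlight"

// Exported so the grammar's @external propSource declaration can import it.
export const jsonHighlighting = styleTags({
  String: tags.string,
  Number: tags.number,
  "True False": tags.bool,
  PropertyName: tags.propertyName,
  Null: tags.null,
  ",": tags.separator,
  "[ ]": tags.squareBracket,
  "{ }": tags.brace,
  // Interpolation nodes share the bool tag (shown in orange).
  Env: tags.bool,
  InterpolationStart: tags.bool,
  InterpolationEnd: tags.bool,
});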

This worked before, but it does not anymore. I also found another problem with the above solution: an interpolation inside an interpolation gets matched as a string, rather than just interpolationContent.

I think the main problem is that ${ on its own is already matched as an interpolation: it is highlighted in orange even though there is no closing brace. This does not happen for interpolations outside of strings.

So I did get it to work reasonably well. There is still a bug where the closing brace is not matched with the opening brace of the interpolation. To still make it work, I had to change the regular JSON braces to a different token, so that I can highlight all “}” the same way I highlight the interpolation content. To be honest, this is really confusing to me, as I basically ended up copying the JavaScript template literals.

I will leave the final grammar here as a reference anyway as some might benefit from it:

@top JsonText { value }

value { True | False | Null | Number | Object | Array | TemplateString | Env }

Object { openBrace list<Property>? closeBrace }
Array  { "[" list<value>? "]" }

Property { PropertyName ":" value }
PropertyName { PropertyTemplateString }

@skip {} {
  TemplateString {
    templateStart (templateEscape | templateContent | templateExpr)* templateEnd
  }
  PropertyTemplateString {
    templateStart (templateEscape | templateContent | templateExpr)* templateEnd
  }
}

templateExpr[@name=Interpolation] { InterpolationStart interpolationContent* InterpolationEnd }

templateStart { '"' }

InterpolationEnd[openedBy=InterpolationStart] { "}" }

@local tokens {
  InterpolationStart[closedBy=InterpolationEnd] { "${" }
  templateEnd { '"' }
  templateEscape[@name=Escape] { "\\" esc }
  @else templateContent
}

EnvEnd[openedBy=EnvStart] { "}" }
EnvStart[closedBy=EnvEnd] { "${" }
Env { EnvStart interpolationContent* EnvEnd }

@tokens {
  True  { "true" }
  False { "false" }
  Null  { "null" }

  Number { '-'? int frac? exp?  }
  int  { '0' | $[1-9] @digit* }
  frac { '.' @digit+ }
  exp  { $[eE] $[+\-]? @digit+ }

  string { '"' char* '"' }
  char { $[\u{20}\u{21}\u{23}-\u{5b}\u{5d}-\u{10ffff}] | "\\" esc }
  esc  { $["\\\/bfnrt] | "u" hex hex hex hex }
  hex  { $[0-9a-fA-F] }
  interpolationContent { $[\u{21}-\u{7a}] }

  whitespace { $[ \n\r\t] }

  openBrace { "{" }
  closeBrace { "}" }

  "{" "}" "[" "]"
}

@skip { whitespace }
list<item> { item ("," item)* }

@external propSource jsonHighlighting from "./highlight"

@detectDelim

And here is the highlighting I used:

import {styleTags, tags} from "@lezer/highlight"

// Exported so the grammar's @external propSource declaration can import it.
export const jsonHighlighting = styleTags({
  String: tags.string,
  Number: tags.number,
  "True False": tags.bool,
  PropertyName: tags.propertyName,
  Null: tags.null,
  TemplateString: tags.string,
  PropertyTemplateString: tags.propertyName,
  ",": tags.separator,
  "[ ]": tags.squareBracket,
  StringEnv: tags.bool,
  Env: tags.moduleKeyword,
  Interpolation: tags.moduleKeyword,
  "InterpolationStart InterpolationEnd EnvStart EnvEnd": tags.moduleKeyword,
  // NOTE: This is a hack to get the highlighting to work
  "}": tags.moduleKeyword,
  Escape: tags.number,
});

I am sure there is an easy way to fix this bug, but I cannot find it and this is a state that I can work with.

NOTE: Interpolation inside an interpolation is also not supported by this grammar.

The brackets do not light up from one side because Lezer turns the } into InterpolationEnd { "}" }, that is, an InterpolationEnd node wrapping an inner } node, and the bracket matching looks at the inner }, not at InterpolationEnd, to see what should be highlighted.
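
To make that concrete, these are the two declarations from the grammar above that interact. The comments are an interpretation of the explanation rather than output from the parser.

```
@tokens {
  // Listing a literal here gives it a node of its own in the syntax tree.
  "{" "}" "[" "]"
}

// This rule matches the same "}" literal, so the tree contains an
// InterpolationEnd node that wraps an inner "}" node.
InterpolationEnd[openedBy=InterpolationStart] { "}" }
```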

You can fix this by removing "}" from the list of literal tokens at the end of the @tokens block, changing

  "{" "}" "[" "]"

into

  "{" "[" "]"