Exactly. I think the ambiguity markers may make the set of possible parse paths too deep, so the recovery function can’t reach the best routes.
Sure, this is the best I could come up with:
@top ROOT { MainBlock }
MainBlock { SecondLayerBlock "\n" | If }
If { IfKW identifier "\n" MainBlock EndIfKW }
SecondLayerBlock { ( Declaration | ~a "\n" )* ~a }
Declaration { String ":" identifier "," }
String { '"' char* '"' }
@skip { limited_whitespace }
@tokens{
IfKW { "#if" }
EndIfKW { "#endIf" }
identifier { $[a-zA-Z_] $[a-zA-Z0-9_]*}
limited_whitespace {$[ \r\t] }
char { $[\u{20}\u{21}\u{23}-\u{5b}\u{5d}-\u{10ffff}] }
@precedence { identifier, limited_whitespace }
@precedence { char, limited_whitespace }
}
With the following input:
Correct:
#if condition
"name": value,
#endIf
Incorrect:
#if condition
"name: value,
#endIf
If I reduce SecondLayerBlock so that it has no ambiguities, Lezer can recover the best possible tree for the incorrect example. Unfortunately, I have to deal with empty lines in SecondLayerBlock somehow, and I can’t handle the newlines in a separate skip block, because the original language has recursive calls from SecondLayerBlock to MainBlock and I have to look for newline characters explicitly in MainBlock.
If I increase recoverDist to 10, Lezer can recover the best tree for the incorrect example even with the ambiguous grammar.
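In case it helps with comparing the recovered trees, here is a rough sketch of how the error nodes in a tree could be listed. The helper name errorRanges and the two small interfaces are made up for this post; they only mirror the parts of a Lezer tree that the helper touches (iterate, type.isError, from, to), so treat it as an illustration under those assumptions rather than a finished tool.

```typescript
// Structural stand-ins for the parts of a Lezer Tree used below.
// These are assumptions for this sketch, not Lezer's own type definitions.
interface NodeLike {
  type: { isError: boolean };
  from: number;
  to: number;
}
interface TreeLike {
  iterate(spec: { enter(node: NodeLike): void }): void;
}

// Hypothetical helper: collect the [from, to) range of every error node.
function errorRanges(tree: TreeLike): Array<{ from: number; to: number }> {
  const ranges: Array<{ from: number; to: number }> = [];
  tree.iterate({
    enter(node) {
      // Copy the positions instead of keeping the node object itself,
      // since the iterator may reuse it between calls.
      if (node.type.isError) ranges.push({ from: node.from, to: node.to });
    },
  });
  return ranges;
}
```

With the real packages, the idea would be to call this on the result of buildParser(grammarText).parse(input) (buildParser comes from @lezer/generator), once with the ambiguous grammar and once with the reduced one, and compare where the error nodes end up for the incorrect input.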