I’m trying to write a parser for human-friendly relative dates and times, and I’m running into the following issue in this heavily simplified example:
@top DateTimeExpression {
Date (space ~s ('at' space ~s)? Time)?
| Time
}
Date { FullYear ~n (space ~s Month (space ~s DayOfMonth)?)? }
Time { Hours ~n (space? AmPm)? }
FullYear { Digit ~n Digit ~n Digit ~n Digit ~n }
DayOfMonth { '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' | '13' | '14' | '15' | '16' | '17' | '18' | '19' | '20' | '21' | '22' | '23' | '24' | '25' | '26' | '27' | '28' | '29' | '30' | '31' ~n }
Month { 'january' | 'february' | 'march' | 'april' | 'may' | 'june' | 'july' | 'august' | 'september' | 'october' | 'november' | 'december' }
Digit { '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ~n }
Hours { ('0' | '1')? ~n Digit | '2' ~n ('0' | '1' | '2' | '3' | '4') ~n }
AmPm { ('a' 'm'? | 'p' 'm'?) }
@tokens {
space { @whitespace+ }
}
shift/reduce conflict between
Date -> FullYear space Month · space DayOfMonth
and
Date -> FullYear space Month
With input:
FullYear space Month · space …
Shared origin: @top -> · Date
The space after the FullYear or Month could be followed by either a DayOfMonth or 'at'… I thought annotating the spaces with ~s would tell the parser to try both branches in GLR, but seemingly not.
Btw, is (Foo ~marker)? equivalent to Foo? ~marker or not?
Notes: I’ve avoided tokenizing numbers so that I have some hope of interpreting 32/08/14 as yy/MM/dd and 14/08/32 as dd/MM/yy, though I guess I could just leave that for code that runs after the parser.
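For what it's worth, the "leave it for code that runs after the parser" option could look roughly like the sketch below. It assumes the grammar hands back the raw text of a numeric date as three two-digit fields separated by slashes. Everything here (interpretNumericDate, Candidate, isValidMonthDay) is a hypothetical name of mine, not anything Lezer provides, and the validity check is deliberately crude.

```typescript
// Hypothetical post-parse helper; it works on text the parser already matched.
type DateOrder = "yy/MM/dd" | "dd/MM/yy";

interface Candidate {
  order: DateOrder;
  year: number; // two-digit year exactly as written
  month: number;
  day: number;
}

// Simplification: accepts day 31 for every month; real code would check month lengths.
function isValidMonthDay(month: number, day: number): boolean {
  return month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

// Return every field order under which "a/b/c" is a plausible date.
function interpretNumericDate(text: string): Candidate[] {
  const match = /^(\d{2})\/(\d{2})\/(\d{2})$/.exec(text);
  if (match === null) return [];
  const a = Number(match[1]);
  const b = Number(match[2]);
  const c = Number(match[3]);
  const candidates: Candidate[] = [];
  // Read as yy/MM/dd: the middle field is the month, the last field is the day.
  if (isValidMonthDay(b, c)) candidates.push({ order: "yy/MM/dd", year: a, month: b, day: c });
  // Read as dd/MM/yy: the middle field is the month, the first field is the day.
  if (isValidMonthDay(b, a)) candidates.push({ order: "dd/MM/yy", year: c, month: b, day: a });
  return candidates;
}
```

With this, 32/08/14 only survives as yy/MM/dd and 14/08/32 only as dd/MM/yy, while something like 14/08/12 comes back with both readings, so the caller still has to pick one (by locale, for instance).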
Also I’ve had to rely on significant whitespace for many of my rules.
I’m starting to wonder if I should just hand-write a parser with infinite lookahead for this use case, since the expressions are never going to be very long… but Lezer is awesome! I’m hoping I can use it!
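To make the hand-written alternative concrete, here is a minimal sketch of what I have in mind for the simplified grammar above. It is a "list of successes" parser: every rule returns all the ways it can match from a position, which gives the effect of unlimited lookahead by simply keeping every branch alive. All names (token, seq, alt, opt, parseAll) are hypothetical, the word-boundary lookaheads are my own addition, and hours are assumed to be 0 to 23.

```typescript
type Fields = { year?: number; month?: string; day?: number; hour?: number; amPm?: "am" | "pm" };
type Result = { fields: Fields; end: number };
type Rule = (input: string, pos: number) => Result[];

// Match a regular expression exactly at `pos` (sticky flag), yielding zero or one result.
function token(pattern: RegExp, toFields: (text: string) => Fields = () => ({})): Rule {
  const sticky = new RegExp(pattern.source, "y");
  return (input, pos) => {
    sticky.lastIndex = pos;
    const m = sticky.exec(input);
    return m === null ? [] : [{ fields: toFields(m[0]), end: pos + m[0].length }];
  };
}

// Run the rules one after another, continuing from every position the previous ones reached.
function seq(...rules: Rule[]): Rule {
  return (input, pos) => {
    let states: Result[] = [{ fields: {}, end: pos }];
    for (const rule of rules) {
      const next: Result[] = [];
      for (const state of states) {
        for (const r of rule(input, state.end)) {
          next.push({ fields: { ...state.fields, ...r.fields }, end: r.end });
        }
      }
      states = next;
    }
    return states;
  };
}

// Collect the results of every alternative instead of committing to the first one.
function alt(...rules: Rule[]): Rule {
  return (input, pos) => {
    const out: Result[] = [];
    for (const rule of rules) out.push(...rule(input, pos));
    return out;
  };
}

// Optional rule: either the rule matches, or nothing is consumed.
const opt = (rule: Rule): Rule => alt(rule, (_input, pos) => [{ fields: {}, end: pos }]);

const space = token(/\s+/);
const at = token(/at(?![a-z])/);
const fullYear = token(/\d{4}(?!\d)/, t => ({ year: Number(t) }));
const month = token(
  /(?:january|february|march|april|may|june|july|august|september|october|november|december)(?![a-z])/,
  t => ({ month: t }),
);
const dayOfMonth = token(/(?:0?[1-9]|[12]\d|3[01])(?!\d)/, t => ({ day: Number(t) }));
const hours = token(/(?:[01]?\d|2[0-3])(?!\d)/, t => ({ hour: Number(t) }));
const amPm = token(/[ap]m?(?![a-z])/, t => ({ amPm: t.charAt(0) === "a" ? "am" : "pm" }));

// Same shape as the Date, Time and DateTimeExpression rules in the grammar above.
const date = seq(fullYear, opt(seq(space, month, opt(seq(space, dayOfMonth)))));
const time = seq(hours, opt(seq(opt(space), amPm)));
const expression = alt(seq(date, opt(seq(space, opt(seq(at, space)), time))), time);

// Keep only the parses that consume the whole input.
function parseAll(input: string): Fields[] {
  return expression(input, 0).filter(r => r.end === input.length).map(r => r.fields);
}
```

Calling parseAll("2024 january 5 pm") yields a single reading where the 5 is the hour, while parseAll("2024 january 5") yields two readings (5 as the day, or 5 as the hour), which is the same ambiguity the shift/reduce conflict is pointing at. The cost is potentially exponential work on long inputs, which seems acceptable for expressions this short.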