Length of the expression?

Hello, we’re writing a PostgreSQL-esque grammar for Lezer. We noticed that the grammar doesn’t compile (the compilation gets stuck) if an expression is rather lengthy.

So for example, if we have an expression like the one below,

Create whitespace Account (whitespace)? Identifier Admin "-" Name (whitespace)? "=" (whitespace)? Identifier whitespace Admin Dash Password (whitespace)? "=" (whitespace)? "'" Identifier "'" whitespace Email (whitespace)? "=" (whitespace)? Identifier whitespace Edition (whitespace)? "=" (whitespace)? (Standard|Enterprise|Business "-" Critical) (First "-" Name (whitespace)? "=" (whitespace)? Identifier)? (Last "-" Name (whitespace)? "=" (whitespace)? Identifier)? (Must "-" Change "-" Password (whitespace)? "=" (whitespace)? (True|False))? ";"

it gets stuck. But if we move the latter half of the expression into a separate rule whose body is the rest of the original expression,

AccountOptional { (First "-" Name (whitespace)? "=" (whitespace)? Identifier)? (Last "-" Name (whitespace)? "=" (whitespace)? Identifier)? (Must "-" Change "-" Password(whitespace)? "=" (whitespace)? (True|False))? }

and then the previous expression becomes:

Create whitespace Account (whitespace)? Identifier Admin "-" Name (whitespace)? "=" (whitespace)? Identifier whitespace Admin Dash Password (whitespace)? "=" (whitespace)? "'" Identifier "'" whitespace Email (whitespace)? "=" (whitespace)? Identifier whitespace Edition (whitespace)? "=" (whitespace)? (Standard|Enterprise|Business "-" Edition) AccountOptional ";"

it does compile after this substitution. So, could the length of the expression be causing this? Is it inadvisable to have lengthy expressions (provided that they are not reused a lot)?

The ? operator is resolved by expanding the rule into one instance with the optional expression and one without it. For example, a? b c? expands to b | a b | b c | a b c. So if you have a lot of them (or nested | operators) in a single rule, you will get an exponential number of rules, which I guess is what’s causing the slowness in this case.
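
To make the blow-up concrete, here is a minimal Python sketch of that expansion. It is only a simplified model, not Lezer's actual implementation: it assumes a flat sequence in which every ? independently doubles the number of variants, and the helper names expand and count_variants are made up for illustration.

```python
from itertools import product


def expand(items):
    """Expand a flat sequence into every concrete variant.

    `items` is a list of (name, optional) pairs. An optional item
    contributes two choices (absent or present), a required item one.
    This is a simplified model of the expansion described above.
    """
    choices = [
        ([], [name]) if optional else ([name],)
        for name, optional in items
    ]
    return [
        [token for part in combo for token in part]
        for combo in product(*choices)
    ]


def count_variants(n_optionals):
    """Number of variants for a sequence of n independent optional items."""
    return len(expand([(f"x{i}", True) for i in range(n_optionals)]))


# The example from the reply: a? b c?
variants = expand([("a", True), ("b", False), ("c", True)])
for variant in variants:
    print(" ".join(variant))
# prints: b, b c, a b, a b c (one per line)

# Each additional `?` doubles the count.
for n in (1, 3, 9):
    print(n, count_variants(n))
```

Under this model, the nine (whitespace)? items at the top level of the original rule already give 2^9 = 512 variants, and each optional group after them multiplies that further. Moving the groups into AccountOptional means their variants are generated once in that rule rather than multiplied into every variant of the outer one, which is presumably why the split version compiles.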


Thank you!