Conflicting precedence

dxparker · February 27, 2022, 2:44am

I'm parsing a language that describes addition of integers and variables, with optional parentheses nested to any depth. Variables can be chained using dot notation.

// Example 1
1 + (2 + ((3 + 4))) + a + 5 + b.c

// Example 2
a.(b+c) => Chain(Var, Add(Var, Var))

// Example 3
(a+b).c => Chain(Add(Var, Var), Var)   // Desired tree
(a+b).c => Add(Var, Var)⚠(".")⚠(Var) // Actual tree

The following grammar works for all permutations I’ve come up with except for Example 3 above.

@precedence { boost @left }
@top Program { expr }

// Chain and Var in expr are competing with Chain and Var in chainOperand
expr { maybeTree<!boost Chain | !boost Var | Int> }
Add<t> { maybeGrouped<t !boost plus t> }
Chain { maybeGrouped<chainOperand !boost dot chainOperand> }
chainOperand { maybeTree<Chain | Var> }

maybeGrouped<t> {
  (!boost open maybeGrouped<t> !boost close) | t
}
maybeTree<t> {
  Add<maybeTree<t>> | maybeGrouped<t>
}

@tokens {
  Int { std.digit+ }
  Var { std.asciiLetter+ }
  plus { '+' }
  dot { '.' }
  ws { std.whitespace+ }
  open { '(' }
  close { ')' }
}
@skip { ws }

Moving the precedence !boost for Chain and Var from expr to chainOperand fixes the parse issue in Example 3 but causes many other permutations to fail, e.g. a + b => Chain(Add(Var,Var),⚠("")).

Omitting !boost in expr and chainOperand creates a reduce/reduce error.

Any advice on how to resolve this precedence problem or correctly describe this language a different way?

marijn · February 27, 2022, 2:56pm

Are you intentionally forbidding add expressions inside grouped expressions? Doesn’t the more traditional (and readable) format of a single recursive expr rule that describes all the various expressions with their appropriate precedences work here? Also, it’s usually a good idea to use separate, distinct precedence names for different constructs, so that you clearly define their relative precedence.
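For illustration, a minimal sketch of what a single recursive expr rule with distinct precedence names could look like. The node names Add and Chain come from the grammar above; the precedence names chain and add are placeholders chosen for this example, and the tokens are written with explicit character sets and double-quoted literals rather than the std. names used in the question. Precedences listed earlier bind tighter, so the dot is declared before the plus.

```
@precedence { chain @left, add @left }
@top Program { expr }

// One recursive rule covers every kind of expression,
// including parenthesized ones.
expr { Int | Var | Add | Chain | open expr close }

// Each operator gets its own precedence marker,
// placed directly before the operator token.
Add { expr !add plus expr }
Chain { expr !chain dot expr }

@tokens {
  Int { $[0-9]+ }
  Var { $[a-zA-Z]+ }
  plus { "+" }
  dot { "." }
  ws { $[ \t\n\r]+ }
  open { "(" }
  close { ")" }
}
@skip { ws }
```

With this shape, (a+b).c is expected to come out as Chain(Add(Var, Var), Var) and a.(b+c) as Chain(Var, Add(Var, Var)). Note that the sketch also accepts a dot applied to integer operands (for example 1.a), so the rules would need to be split further if that has to be rejected.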

dxparker · February 27, 2022, 8:37pm

marijn wrote: "Doesn’t the more traditional (and readable) format of a single recursive expr rule that describes all the various expressions with their appropriate precedences work here?"

Just the advice I needed. Here’s a working grammar that I’m much happier with. This is my first time doing this, so any other suggestions are welcome.

@precedence { paren @left, dot @left, plus @left }
@top Program { expr }

expr { open expr close | varExpr | nonVarExpr }

varExpr { open varExpr !paren close | AddVars | Chain | Var }
Chain { varExpr !dot dot varExpr }
AddVars { varExpr !plus plus varExpr }

nonVarExpr { open nonVarExpr !paren close | Add | Int }
Add {
  nonVarExpr !plus plus (nonVarExpr | varExpr)
  | (nonVarExpr | varExpr) !plus plus nonVarExpr
}

@tokens {
  Int { std.digit+ }
  Var { std.asciiLetter+ }
  plus { '+' }
  dot { '.' }
  ws { std.whitespace+ }
  open { '(' }
  close { ')' }
}
@skip { ws }

marijn wrote: "Are you intentionally forbidding add expressions inside grouped expressions?"

The dot operator cannot be applied to add expressions that contain integers.