Implicit Multiplication xyz -> ((x*y)*z)

I’ve been trying to create a grammar to parse mathematical expressions such as -ab^{2}-c.

In my case, I need to support implicit multiplication, so that an expression like abc is parsed as ((a*b)*c). and also some juxtapositions like sin xy should be “juxtapose(sin, x*y)”

This, of course, must coexist with all the standard operations. For example, the expression -abc^{2} should be parsed as ((-a)*b)*(c^2).

The issue I’m facing is that I haven’t been able to make my implicit multiplication rule compatible with all other rules. When I fix one case, I end up breaking two others.

The closest I’ve come to a working grammar is shown below. However, it has one issue:
abc^{2} is currently parsed as ((a*b)*c)^2 instead of the correct (a*b)*(c^2).

I’m turning to the forum because I haven’t been able to solve this issue using precedence rules alone and it’s clear I’m missing something that I think will be easy for someone else to spot.

Here are some test cases that I think if they work, everything will work:

  • -x^{2}-y^{2}Sub(Negate(Pow(x,2)), Pow(y,2))
  • -a-b-c-dSub(Sub(Sub(Negate(a), b), c), d)
  • abcMul(Mul(a, b), c)
  • \sinabcJuxtapose( FunctionName, Mul (...)))
  • -xyz^{2}-abc^{2}Sub(Negate(Mul(Mul(x, y), Pow(z, 2))), Mul(Mul(a, b), Pow(c, 2))) (currently does not work)

Minimal working example grammar:

@top Program { expression* }

@precedence { 
  pow @right
  times @left 
  unary @left
  juxt @left
  div @left
  plus @left
  minus @left
}


expression {
  juxtaposeExpression | secondaryExpression
}

secondaryExpression {
  unaryExpression | primaryExpression
}

primaryExpression {
  leaf | binaryExpression 
}

leaf {
  Variable | Constant
}

Pow { expression !pow "^{" expression "}"}
Add { expression !plus "+" expression}
Sub { expression !minus "-" expression}
Mul { expression !times ("*"|'\\cdot') expression}
Div { expression !div "/" expression | !div '\\frac{'expression'}{'expression'}'}

binaryExpression { 
   Add | Sub | Mul | Div | Pow
}

unaryExpression {
  Negate
}
Negate { !unary "-" expression}


juxtaposeExpression {
  Mul { leaf !juxt leaf } |
  Mul { juxtaposeExpression !juxt leaf } |
  Juxtapose { FunctionName !juxt expression }
}

@skip { '$' | space | '\\ ' }

@tokens {
  space { @whitespace }
  Variable { ('_')*$[A-Za-z]("_{"($[A-Za-z]|@digit+)*"}")*}
  FunctionName { ("\\sin" | "\\cos" | "\\tan")}
  Constant { 
    (
      (std.digit+ ("." std.digit*)?)
    ) |
    std.digit+ $[uU]? |
    "0x" (std.digit | $[a-fA-F])+ $[uU]?
  }
}

I think the issue is that Mul { leaf !juxt leaf } doesn’t allow a Pow on the right hand, so the tree you’re looking for isn’t expressed by this grammar.

With precedence operators, it’s usually better to put every type of expression in one big expression rule rather than having subcategories like secondaryExpression, primaryExpression, juxtaposeExpression, etc. It makes the grammar easier to read, faster to parse, and avoids issues like this.

Thanks for your reply.

Yes. With this grammar, whatever rule I came up with to allow Pow on the rhs would fail some of the other tests, so I left the grammar as it is on purpose…

With precedence operators, it’s usually better to put every type of expression in one big expression rule rather than having subcategories like secondaryExpression , primaryExpression , juxtaposeExpression , etc. It makes the grammar easier to read, faster to parse, and avoids issues like this.

That’s how I structured my grammar in the full version (with one big expression). This was my attempt at avoiding some expressions to be valid, like -a-b being parsed as Mul( Negate(a), Negate(b) ) through the implicit multiplication. I separated like this to try to be more strict in some rules.

Below is another grammar where I bunched up things in a big expression and made the Juxtapose more general. In this one, stuff like abc^{2} works as expected, but -a-b-c fails as it is parsed as the implicit multiplication of (-a) * (-b) * (-c). This was my original problem. I got the abc^{2} problem only after trying to fix the -a-b-c one.

@top Program { expression* }

@precedence { 
  highest @left
  pow @right
  times @left 
  unary @left
  juxt @left
  juxtfunc @left
  implicitmul @left
  div @left
  plus @left
  minus @left
  lowest @left
}


expression {
  Variable | Constant | unaryExpression | binaryExpression | juxtaposeExpression
}

Pow { expression !pow "^{" expression "}"}
Add { expression !plus "+" expression}
Sub { expression !minus "-" expression}
Mul { expression !times ("*"|'\\cdot') expression}
Div { expression !div "/" expression | !div '\\frac{'expression'}{'expression'}'}

binaryExpression { 
   Add | Sub | Mul | Div | Pow
}

Negate { !unary "-" expression}
unaryExpression {
  Negate
}

juxtaposeExpression {
  Mul { (Variable | Constant) !implicitmul expression } | // Specific one I want to catch here
  FunctionCall { FunctionName !juxtfunc expression } | // Specific one I want to catch here
  // I'll have a bunch of other specific ones...
  Juxtapose { expression !juxt expression } // General juxtaposition that I deal with it later
}

@skip { '$' | space | '\\ ' }

@tokens {
  space { @whitespace }
  Variable { ('_')*$[A-Za-z]("_{"($[A-Za-z]|@digit+)*"}")*}
  FunctionName { ("\\sin" | "\\cos" | "\\tan")}
  Constant { 
    (
      (std.digit+ ("." std.digit*)?)
    ) |
    std.digit+ $[uU]? |
    "0x" (std.digit | $[a-fA-F])+ $[uU]?
  }
}