Example of using Lezer to generate a traditional AST

Sharing because folks might find this useful, and folks might tell me I’m doing it all wrong and there’s a better way for me to do this :slight_smile:

I’m teaching a compilers course this quarter where we’re using Python as a source language and WebAssembly as a target. I wanted an off-the-shelf parser for Python since I don’t think writing parsers is useful for the course, and also wanted to learn to use Lezer’s interface. I know that generating ASTs isn’t really the primary goal of Lezer, but given the reasonably complete lezer-python, I figured we’d go for it.

Here’s the kind of pattern we ended up writing to go from Lezer Tree to our AST:

This is a small subset of Python I was using to illustrate classes limited to two “field declarations” in lecture, it’s clearly quite incomplete, but enough to get the point across.

The main idea (probably unsurprising if you’re used to Cursor-like traversals, but maybe useful to see) is to write a recursive traversal that matches firstChild/parent calls, and navigates around a bit of the tree to collect the substrings needed to build the AST node.

This is a bit error prone – missing a navigation call like parent() before returning can be pretty rough as the traversal runs off into other parts of the tree. Other than that this seems to work reasonably well, and it’s not too onerous to write another clause for a new type of syntax we want to handle as we build up the language we’re covering.

Cool tool, and thanks for having lezer-python available!

1 Like

Nice! I’ve spent some time thinking about providing a structured way to do this, but it’s hard to do in a way that doesn’t bloat the original tree (which is not acceptable for the main use case).

Did you see the (somewhat) new getChild method and group prop? Those were designed to somewhat help applications like this.

Oh I haven’t tried getChild, that could be useful for skipping some of the repeated nextSibling() calls to jump past uninteresting keywords and delimiters. Thanks!