Is there any ongoing work on lezer-parser for SQL?

Hello,

We were wondering if there is any ongoing work on lezer-parser for SQL? If not, how can we start building one in terms of adding rules to the grammar?

Also curious to know – where did you get the other grammar rules from?

Thank you!

I am personally not interested in maintaining a detailed grammar of SQL dialects; there’s just too much syntax, and each dialect tends to be completely different from the others. So the @codemirror/lang-sql package, which is about on par with the old SQL support in 5.x, is going to be pretty much as good as it gets from my side. It doesn’t really define the SQL grammar, just the tokens and some crude structure used for indentation.

That being said, writing more external packages for specific dialects and maintaining them separately would be a great idea.

@marijn We’re interested in building a lang-snowsql npm package that works with CodeMirror 6.

  1. We’re new to lezer-parser and want to understand how to build a grammar from scratch. We have read through the “Writing a Grammar” section of the docs. Could you share any notes on how you went about building the Python or JavaScript grammar, and on how we could go about building a grammar for a base dialect of SQL?
  2. Also, we were curious what your source for the Python/JavaScript production rules was. Are you porting the rules from a different grammar format, like lex or yacc, or just writing them on your own?

Some parsers (like C++) are based on the corresponding tree-sitter grammars (see also this tool). Others, like Python and CSS, are based on the grammars provided by the language docs. And with JavaScript and HTML I was just working from my own understanding of the languages.

We were able to create a grammar for Snowflake SQL. It is very similar to PostgreSQL. Does the grammar look okay to you? We’d appreciate any advice you may have for us. We used existing LR parser grammars as references to learn from while writing this grammar:

https://github.com/Snowflake-Labs/lezer-snowsql/blob/master/src/snowsql.grammar

I think that repository is not public (or was deleted in the meantime).

I’m so sorry! My bad. I’m working with our admins to fix permissions and share it with you.

Hey @marijn, sorry for the delay. The link below should work. We’d really appreciate any feedback/advice you have on this : )

What is the role of all these specialized keywords at the bottom when most aren’t used in the grammar? Or are they going to be used at some point in the future?

Also, for small punctuation-style tokens, it’s often convenient to just use literals (";") rather than named tokens. You can still make them appear in the tree by mentioning the literals in your @tokens block.
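
A minimal sketch of what that could look like, assuming a toy grammar (the rule and token names below are made up for illustration and not taken from the Snowflake grammar, and keyword matching is simplified to lowercase only):

```
@top Script { Statement* }

// ";" is written as a literal instead of a named Semicolon token
Statement { SelectStatement ";" }

SelectStatement { kw<"select"> Identifier }

// Keywords are specialized from the Identifier token
kw<term> { @specialize[@name={term}]<Identifier, term> }

@skip { whitespace }

@tokens {
  whitespace { @whitespace+ }
  Identifier { @asciiLetter (@asciiLetter | @digit | "_")* }
  // Mentioning the literal here makes it show up as a node in the tree
  ";"
}
```

Without the `";"` line in the @tokens block, the grammar would still parse the same input; the semicolon just wouldn’t get a node of its own.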

Yes, they are going to be used for other definitions in the future.

Ah, okay. Will keep this in mind. Thanks for the feedback!