I have been working on adding language support for an in-house SQLish language. There already exists a grammar for it written in ANTLR4, as well as a lot of custom logic around parsing and tokenization that I will be able to leverage, so going the typical Lezer route doesn’t make much sense for my use case.
I attempted to write a Stream Parser, but discovered that the parsing/tokenization utilities I am using are incompatible with the way stream parsing works: they parse the entire document at once and do not support a line-by-line approach. (I realize that this does not scale; however, the language is extremely limited and even the most complex queries possible are fairly small.)
From this document, it seems my only other option is to write a custom parser. I’ve been exploring what that would look like, using the markdown package as an example and reading the docs for CodeMirror and Lezer.
I’m looking to validate my approach and make sure I am not missing anything obvious; I am also starting this thread as a place to share what I learn. Here are the initial steps from what I’ve gathered:
- Subclass `Language`, found in `@codemirror/language`
- Subclass the `Parser` class, found in `@lezer/common`
- Implement the methods for `Parser` (`createParse`, `startParse`, `parse`)
- Conform all of the types of my existing language utilities to match the types that `Parser` expects
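To make the last step more concrete, here is a rough sketch of the conversion I have in mind. It is dependency-free on purpose and is not a working CodeMirror integration. `Token`, `nodeNames`, `tokenizeWholeDocument`, and `tokensToBuffer` are all hypothetical names, and the whitespace-splitting tokenizer is only a stand-in for the real ANTLR-backed utilities. The output layout (four numbers per node: type id, start offset, end offset, size) follows my reading of the flat buffer format that `Tree.build` in `@lezer/common` accepts, so treat that part as an assumption to check against the docs.

```typescript
// Hypothetical token shape produced by an existing whole-document tokenizer.
interface Token {
  type: string; // e.g. "Keyword", "Identifier", "Number"
  from: number; // start offset in the document
  to: number;   // end offset (exclusive)
}

// Node type names; each name's index doubles as its numeric type id.
// Index 0 is reserved for the top-level node, which is not written to the buffer.
const nodeNames: string[] = ["Query", "Keyword", "Identifier", "Number"];

// Stand-in for the real tokenizer: splits on whitespace and classifies each word.
function tokenizeWholeDocument(doc: string): Token[] {
  const tokens: Token[] = [];
  const re = /\S+/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(doc)) !== null) {
    const text = m[0];
    const type = /^\d+$/.test(text)
      ? "Number"
      : /^(select|from|where)$/i.test(text)
        ? "Keyword"
        : "Identifier";
    tokens.push({ type, from: m.index, to: m.index + text.length });
  }
  return tokens;
}

// Flatten tokens into [typeId, from, to, size] quadruples.
// Every token is a leaf here, so its size is always 4 (one node, four values).
function tokensToBuffer(tokens: Token[]): number[] {
  const buffer: number[] = [];
  for (const t of tokens) {
    const id = nodeNames.indexOf(t.type);
    if (id < 1) continue; // skip unknown types and the reserved top-level id
    buffer.push(id, t.from, t.to, 4);
  }
  return buffer;
}

const buffer = tokensToBuffer(tokenizeWholeDocument("select x 42"));
console.log(buffer);
```

If I understand the Lezer side correctly, a buffer like this would be handed to `Tree.build` together with a `NodeSet` built from the same names, and the resulting tree returned from the `advance()` method of the object that `createParse` produces. As far as I can tell, `createParse` is the only abstract method on `Parser`; `startParse` and `parse` appear to be implemented in the base class on top of it.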
The questions that I’m attempting to answer:
- Is this a valid approach?
- Is there anything obvious I’m missing?
Thanks in advance!