Performance vs. Tree-sitter for non-web-based use

Jason1923 · June 28, 2021, 10:00pm

Hi, I’m developing a text editor written mainly in C++ and I’m deciding between using Tree-sitter and Lezer to handle syntax highlighting.

In the xi editor retrospective, Raph stated,

If I were trying to create the best possible syntax highlighting experience today, I’d adapt Marijn Haverbeke’s Lezer.

Also, Lezer’s grammars/parsers are very compact in file size, which is a plus.

I’m almost sold, but I was wondering how Lezer’s performance compares to Tree-sitter’s, since Lezer is designed for the web and is written in JS instead of C. I’m not even sure how JS would interop with my C++ program. Would you suggest I use Tree-sitter, use Lezer (somehow), or roll my own non-web-based version of Lezer? Thanks for reading.

marijn · June 29, 2021, 8:06am

In a C++ program, I’d, at a glance, recommend Tree-sitter, because it will indeed be a lot easier to interface with than a JavaScript system. I haven’t run any direct comparison benchmarks. Tree-sitter generates somewhat more heavyweight trees, but because it is written in C it likely still has the speed advantage. Its grammar-writing language is indeed a bit less ergonomic, but should be about equally expressive.