Full document parsing

kbell · December 15, 2022, 4:49pm

Hello,
I would like to extract all the highlighted text from the syntax tree, but ensureSyntaxTree does not give me the full tree. Is there another way to achieve this? We would like to parse the full document and extract all highlighted values matched by our StreamLanguage.

    const tree: Tree | null = ensureSyntaxTree(view.state, view.state.doc.length, 5000);
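A note on reading the result of that call: as far as I know, ensureSyntaxTree returns null when it cannot parse up to the requested position within the timeout (third argument, 50 ms if omitted), so a null result and a tree that is missing nodes are different symptoms. Below is a minimal sketch for telling those cases apart. `TreeLike`, `ParseStatus` and `parseStatus` are hypothetical names for illustration, not part of CodeMirror; `TreeLike` stands in for the `length` field of Lezer's `Tree`.

```typescript
// Hypothetical structural stand-in for the one field of Lezer's Tree read here.
interface TreeLike {
  length: number;
}

// Hypothetical labels for the three situations worth distinguishing.
type ParseStatus = "complete" | "partial" | "timed-out";

// Classify a parse result against the document length.
// null      -> the parse did not reach the requested position in time
// too short -> the tree (e.g. one taken from syntaxTree(state)) stops early
// otherwise -> the tree spans the whole document
function parseStatus(tree: TreeLike | null, docLength: number): ParseStatus {
  if (tree === null) return "timed-out";
  return tree.length >= docLength ? "complete" : "partial";
}
```

If this reports "complete" and nodes are still missing, the timeout is probably not the cause and the problem lies in how the tree was produced.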

marijn · December 15, 2022, 5:03pm

What makes you conclude that?

kbell · December 15, 2022, 5:43pm

This is what I am trying. I do not get the full list of matches unless I scroll through the entire document and then make an edit.

    import { EditorView, ViewUpdate } from '@codemirror/view';
    import { ensureSyntaxTree } from '@codemirror/language';
    import { Tree, SyntaxNodeRef } from '@lezer/common';
    import { debounce } from 'lodash';
    import { IScrapeEditorChangeEvent } from '../ScrapeViewTextEditor';
    import { Subject } from 'rxjs';

    /**
     * This extension iterates over the Tree created by CodeMirror/Lezer, collects the list of
     * extracted values, and then triggers onChange to notify the consumer of the editor.
     */
    export const tracker = (onChange: Subject<IScrapeEditorChangeEvent>) => {
      // Debounced so that a burst of edits triggers only one extraction pass.
      const extractMatches = debounce((view: EditorView) => {
        // Ask for a tree covering the whole document (default timeout applies).
        const tree: Tree | null = ensureSyntaxTree(view.state, view.state.doc.length);
        const matches: any[] = [];
        // Visit every node and record its range, text, and node type name.
        tree?.iterate({
          from: 0,
          to: view.state.doc.length,
          enter: ({ type, from, to }: SyntaxNodeRef) => {
            matches.push({ from, to, value: view.state.doc.sliceString(from, to), type: type.name });
          }
        });
        console.log('matches ', matches);
        onChange.next({ editing: false, text: view.state.doc.toString(), matches });
      }, 500);

      // Re-run the extraction whenever the document content changes.
      return EditorView.updateListener.of((v: ViewUpdate) => {
        const { view } = v;
        if (v.docChanged) {
          extractMatches(view);
        }
      });
    };
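For anyone adapting this, the collection step can be pulled out into a pure function so it can be exercised without an editor. The following is only a sketch under assumptions: `NodeTypeLike`, `NodeRefLike`, `IterableTree`, `Match` and `collectMatches` are hypothetical names, the interfaces are structural stand-ins for the Lezer types used above, and skipping the top node and empty nodes is a design choice of this sketch, not something the original code does.

```typescript
// Hypothetical structural stand-ins for the Lezer types used by the extension.
interface NodeTypeLike { name: string; isTop: boolean }
interface NodeRefLike { type: NodeTypeLike; from: number; to: number }
interface IterableTree {
  iterate(spec: {
    from?: number;
    to?: number;
    enter(node: NodeRefLike): boolean | void;
  }): void;
}

interface Match { from: number; to: number; value: string; type: string }

// Collect one Match per node, given the tree and the document text as a string.
// Assumption of this sketch: the top-level node and zero-length nodes are
// skipped, since they carry no highlighted text of their own.
function collectMatches(tree: IterableTree | null, text: string): Match[] {
  const matches: Match[] = [];
  tree?.iterate({
    from: 0,
    to: text.length,
    enter: ({ type, from, to }) => {
      // Returning nothing (rather than false) keeps descending into children.
      if (type.isTop || from === to) return;
      matches.push({ from, to, value: text.slice(from, to), type: type.name });
    },
  });
  return matches;
}
```

Inside the debounced callback this would be called with the tree and `view.state.doc.toString()`, and a null tree simply yields an empty list.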

marijn · December 16, 2022, 7:42am

It seems this was a bad interaction between an optimization in the stream language and the way ensureSyntaxTree works. This patch should help.

kbell · December 16, 2022, 4:24pm

Thank you @marijn, this helps; I am seeing the full list now. Thanks for the speedy response. It has been a pleasure working with this awesome library.