Full document parsing

I would like to extract all the highlighted text from the syntax tree, but ensureSyntaxTree does not give me the full tree. Is there another way to achieve this? We would like to parse the full document and extract all highlighted values matched by our StreamLanguage.

    const tree: Tree | null = ensureSyntaxTree(view.state, view.state.doc.length, 5000);

What makes you conclude that?

This is what I am trying. I do not get the full list unless I scroll through the entire document and then make an edit.

import { EditorView, ViewUpdate } from '@codemirror/view';
import { ensureSyntaxTree } from '@codemirror/language';
import { Tree, SyntaxNodeRef } from '@lezer/common';
import { debounce } from 'lodash';
import { IScrapeEditorChangeEvent } from '../ScrapeViewTextEditor';
import { Subject } from 'rxjs';

/**
 * This extension iterates over the Tree created by CodeMirror/Lezer, collects the list of extracted values, and then triggers onChange to notify the consumer of
 * the editor.
 */
export const tracker = (onChange: Subject<IScrapeEditorChangeEvent>) => {
  const extractMatches = debounce((view: EditorView) => {
    const tree: Tree | null = ensureSyntaxTree(view.state, view.state.doc.length);
    const matches: any[] = [];
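    // Walk every node in the (possibly null) tree and record its range, text, and type name.
    tree?.iterate({
      from: 0,
      to: view.state.doc.length,
      enter: ({ type, from, to }: SyntaxNodeRef) => {
        matches.push({ from, to, value: view.state.doc.sliceString(from, to), type: type.name });
      },
    });
    console.log('matches ', matches);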
    onChange.next({ editing: false, text: view.state.doc.toString(), matches });
  }, 500);

  return EditorView.updateListener.of((v: ViewUpdate) => {
    const { view } = v;
    if (v.docChanged) {
      // Re-run the debounced extraction whenever the document changes.
      extractMatches(view);
    }
  });
};

It seems this was a bad interaction between an optimization in the stream language and the way ensureSyntaxTree works. This patch should help.
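For anyone who wants to check the collection step on its own, here is a minimal, dependency-free sketch of it. The `NodeRefLike` and `TreeLike` interfaces, `collectMatches`, and `fakeTree` are hypothetical names made up for illustration; they are only meant to mirror the parts of Lezer's `Tree.iterate` used in the snippet above, and the fake tree stands in for whatever `ensureSyntaxTree` returns.

```typescript
// Hypothetical structural types mirroring the parts of Lezer's
// Tree.iterate that the tracker above relies on (an assumption, not the library's types).
interface NodeRefLike {
  type: { name: string };
  from: number;
  to: number;
}

interface TreeLike {
  iterate(spec: {
    from?: number;
    to?: number;
    enter(node: NodeRefLike): boolean | void;
  }): void;
}

interface Match {
  from: number;
  to: number;
  value: string;
  type: string;
}

// Collect one entry per node the tree reports within the document range.
// A null tree (no parse available) yields an empty list.
function collectMatches(tree: TreeLike | null, text: string): Match[] {
  const matches: Match[] = [];
  if (!tree) return matches;
  tree.iterate({
    from: 0,
    to: text.length,
    enter: ({ type, from, to }) => {
      matches.push({ from, to, value: text.slice(from, to), type: type.name });
    },
  });
  return matches;
}

// Hypothetical stand-in for a parsed tree, used only to exercise the function.
const fakeTree: TreeLike = {
  iterate({ from = 0, to = Infinity, enter }) {
    const nodes: NodeRefLike[] = [
      { type: { name: "keyword" }, from: 0, to: 3 },
      { type: { name: "variableName" }, from: 4, to: 5 },
    ];
    // Report only the nodes that overlap the requested range.
    for (const n of nodes) {
      if (n.to >= from && n.from <= to) enter(n);
    }
  },
};

console.log(collectMatches(fakeTree, "let x"));
```

In the editor, the tree would come from `ensureSyntaxTree` and the text from `view.state.doc`; keeping the collection step separate like this makes it easy to see whether a short list comes from the tree itself or from the iteration code.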

Thank you @marijn, this helps. I am seeing the full list now. Thanks for the speedy response. It has been a pleasure working with this awesome library :clap: