HOWTO: Scan entire document to gather tokens upon first load


Hello World

I am working on a new mode for a language that has many quirks.

In the token() function, I’m gathering meta information into the state object and this object is nicely available with the lineInfo() method; however, it is not available until the line has been “seen” by CM.

I have been searching for a way to make the entire document go through the token() process so that the meta information is collected. And, NO, I do not wish to tell my users to scroll to the bottom and then run the indentation routine!

I am not certain that our good friend Marijn had BASIC in mind when creating this wonderful product. QMBASIC has a challenging syntax for indentation.

Any help is always appreciated!

Vielen Danke
Merci Beaucoup

(yes, I am American…) :)


The getStateAfter method with true as second argument is probably what you’re looking for.
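
As a minimal sketch of that suggestion, assuming `cm` is a CodeMirror 5 editor instance: `getStateAfter`, `lastLine` and the `true` flag are the editor's own API, while the wrapper name `parseWholeDocument` is made up for illustration.

```javascript
// Sketch: make the mode tokenize the whole document once, e.g. right after load.
// `cm` is assumed to be a CodeMirror 5 editor instance.
function parseWholeDocument(cm) {
  // Passing true as the second argument asks for a precise state, so the
  // editor has to run the mode over the lines leading up to the given one.
  // Asking for the last line therefore covers the entire document.
  return cm.getStateAfter(cm.lastLine(), true);
}
```

Called once after the content is set, for example `var finalState = parseWholeDocument(editor);`, this should remove the need to scroll to the bottom first.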


What I need is a way to store metadata on each line. I had been trying to put it into the state variable, but this has not been a good approach because it mutates.

For now, I think I’ll just use an external array until I get this sorted out.

The methodology will work once the control arrays are stable.


If your mode needs to store something, the state is the proper place for that. I’m not sure what you mean by ‘it mutates’.


I need to store 2 arrays for each line. Some of the indentation rules depend upon what was in the preceding lines.

Some statements go to multiple lines:

D = ''

In this example, we know that we wish to add to the indentation level after the ELSE clause, reduce it before END THEN, add again after it, and then reduce it again before the final END.

This is only one of the forms that the syntax allows.

If I capture the keywords on the line in an array:

horizontal[0] = 'READ'
horizontal[1] = 'ELSE'
vertical[0] = 'READ'

I can now determine when to increase the indentation level.

The problem has been the pushing and popping of these arrays when they are attached to the state object and that object is passed into the indent() function.
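
One possible cause of this kind of trouble is shared arrays between saved copies of the state. A CodeMirror 5 mode object may define `copyState` next to `startState`, `token` and `indent`, and the editor uses it when it snapshots the state. The sketch below assumes a state holding just the two arrays described above. The field names follow the post, and the rest is illustrative rather than the actual QMBASIC mode.

```javascript
// Hypothetical state shape: the two keyword arrays from the post.
function startState() {
  return { horizontal: [], vertical: [] };
}

// Return an independent copy, so that pushes and pops made while
// tokenizing a later line cannot change a snapshot taken for an earlier one.
function copyState(state) {
  return {
    horizontal: state.horizontal.slice(),
    vertical: state.vertical.slice()
  };
}
```

It is probably also safest to treat the state as read-only inside indent() and to do all pushing and popping in token().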

It almost works.


I’ve just been reading the documentation and other forum topics.

Apparently, it is improper to try to do anything with line numbers as the data must be a stream.

It would be possible to push and pop the control arrays inside of the CM object during the tokenization phase, but it would require:

  1. The ability to compute the present indentation level (so far so good)
  2. The ability to execute the indentLine inside the special indentation function.

Thus, we start at stream.sol(), and the tokens are scanned and stored. At stream.eol(), we can determine whether we should indent and how far.

I do not see a state.indent() method. What I will try tomorrow is setting state.myIndent to an integer at stream.eol(), just before returning. I can then loop through the lines and indent each one.
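
A rough sketch of that plan, assuming CodeMirror 5: the mode's `indent(state, textAfter)` hook returns a number of spaces, and `cm.indentLine(n, "smart")` asks the editor to apply that hook to line `n`. The `myIndent` field comes from the post. The helper names and the rule that a line starting with END closes one level are assumptions made for illustration.

```javascript
// Build a hypothetical indent hook for the mode object.
// `state.myIndent` is the integer nesting level maintained in token().
function makeIndent(indentUnit) {
  return function indent(state, textAfter) {
    var level = state.myIndent || 0;
    // Assumed rule: a line that starts with END closes one level.
    if (/^\s*END\b/i.test(textAfter)) level -= 1;
    return Math.max(0, level) * indentUnit;
  };
}

// Re-indent every line through the mode's indent hook.
// `cm` is assumed to be a CodeMirror 5 editor instance.
function reindentAll(cm) {
  // Wrapping the loop in operation() batches the changes into one update.
  cm.operation(function () {
    for (var n = cm.firstLine(); n <= cm.lastLine(); n++) {
      cm.indentLine(n, "smart");
    }
  });
}
```

With this arrangement the mode never needs line numbers: token() keeps the level up to date in the state, and the editor decides which state to hand to indent() for each line.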

Something for tomorrow


OK. I’m stuck yet again.

I’m doing some external operations in JS where I need to process the lines inside the CM object. The problem is that, until the lines have been viewed by the user by scrolling, they do not have data set up in the state object for that line.

For example, I have a 1700-line program. In order to get all the lines tokenized and a state set up, I must first scroll to the bottom of the document, then call the external function.

Is there not a method to have CM scan all lines upon the initial load?


OK. Setting the viewportMargin option to a high number seems to have fixed it.
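
For reference, a minimal sketch of this workaround, assuming `cm` is a CodeMirror 5 editor instance. `viewportMargin` is a real editor option and may be set to `Infinity`; the function name is made up for illustration.

```javascript
// Sketch: ask the editor to keep every line rendered, not just those near
// the visible area. `cm` is assumed to be a CodeMirror 5 editor instance.
function renderAllLines(cm) {
  cm.setOption("viewportMargin", Infinity);
}
```

Rendering every line is presumably what makes the per-line state available here, but it can slow the editor down noticeably on large documents, so it is a trade-off rather than a free fix.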


Yes, there is, as I mentioned before: getStateAfter with true as second argument.