Dynamic token types for ambiguous languages

Hi there,

I’m wondering if CodeMirror supports defining token types more dynamically. For example, if I were to use CodeMirror to create English language sentences, a word in a sentence may be a verb or a noun in a way that a parser cannot infer statically; the information must be supplied by a user externally:

We saw her duck.

I’ve been pointed to markText, but it feels like the wrong place to start. Even if I ended up using it, I might have to build much of the use case myself. Note that I need a token to keep its type until it’s modified. Optionally, I would also like to force the token to be deleted or replaced all at once, which means disallowing edits inside the token.


That’s the atomic option to markText.
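
A minimal sketch of that approach, assuming CodeMirror 5's `markText(from, to, options)` with its `className` and `atomic` options. The helper name `tagWord` and the `pos-...` class names are made up for illustration and are not part of CodeMirror:

```javascript
// Hypothetical helper: mark a range on one line as a given part of speech.
// `cm` is assumed to be a CodeMirror 5 editor (or Doc) instance.
function tagWord(cm, line, fromCh, toCh, partOfSpeech) {
  return cm.markText(
    { line: line, ch: fromCh },
    { line: line, ch: toCh },
    {
      className: "pos-" + partOfSpeech, // CSS hook, e.g. a .pos-verb rule
      atomic: true // the cursor cannot be placed inside the marked range
    }
  );
}
```

For "We saw her duck.", the word "duck" spans characters 11 to 15 on line 0, so `tagWord(editor, 0, 11, 15, "verb")` would mark it. The call returns a marker whose `clear()` method removes the mark, so changing a word from verb to noun would mean clearing the old marker and adding a new one.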

However, isn’t the mode the place to decide on this and represent the differences as different token types? Is there a best practice for doing it at the mode level? Any reason not to do it like that?

Nope, that’s not what CodeMirror modes do – they provide tokenizing information, but don’t directly change the state of the editor.
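
To make the distinction concrete, here is a rough sketch of what a mode can do: its `token` function reads text from the stream and returns a style name, and that is all. The factory `makeSentenceMode` and the `annotations` lookup table are hypothetical names for illustration; only the stream methods (`eatSpace`, `match`, `current`, `next`) are assumed from CodeMirror 5's mode interface.

```javascript
// Hypothetical mode factory. `annotations` is an externally maintained
// lookup from lowercase word to part of speech, e.g. { duck: "verb" }.
function makeSentenceMode(annotations) {
  return {
    token: function (stream) {
      // Whitespace carries no style.
      if (stream.eatSpace()) return null;
      // A run of letters is treated as one word.
      if (stream.match(/^[A-Za-z']+/)) {
        var word = stream.current().toLowerCase();
        var known = Object.prototype.hasOwnProperty.call(annotations, word);
        // CodeMirror turns a returned style "verb" into the class "cm-verb".
        return known ? annotations[word] : null;
      }
      // Punctuation or anything else: consume one character, unstyled.
      stream.next();
      return null;
    }
  };
}
```

It could be registered with something like `CodeMirror.defineMode("sentence", function () { return makeSentenceMode(userAnnotations); })`, where `userAnnotations` is again an assumed name. Note the limitation: this sketch keys on the word's text, not on a particular occurrence, so every "duck" in the document gets the same type. Tying a type to one specific word is what a marked range does.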

But modes can indirectly change the state of the editor: they can change their mind about a token’s type. If a noun changes to a verb, isn’t that a change in the token type? If we don’t use the mode, aren’t we in effect dumbing down the mode and pushing knowledge that should belong in the mode elsewhere?