SimpleMode regex alternative for property paths

Hello Marijn!

I’m using CodeMirror for a project that requires editing expressions: basically simple one-liners with no complicated syntax, which is why I decided to go with simple mode.

The backend code has some integration with CM: type information and variable availability are passed to the editor. This allows type-hinting and on-the-fly code validation in the editor (e.g. the user will know when a variable is not supported).

For example, let’s say there are 3 available variables: user_id, name and address. The state would look like this:

states.start.push({
    regex: new RegExp('user_id|name|address'),
    token: 'variable'
});

This works well for scalar types, but when it comes to objects things start getting messy:

    regex: new RegExp('user_id|name|address\\.house|address\\.street|address\\.city|address\\.country|address'),

For one thing, I don’t know how to make a usable regex for recursive references (e.g. a variable called “root” of class “node”, which defines two properties: name (string) and parent (node)):

    regex: new RegExp('root\\.parent|root\\.name|parent\\.parent|parent\\.name'),

Assuming it’s somehow possible, it would be massive though - basically, all possible combinations.


So the question is, what options do I have? Should I go back to writing a “regular” mode?


Edit: One option is to use a generic pattern that matches any syntactically valid path (e.g. ([A-Za-z_][\w.]*)) and then pass the matches through a further filtering mechanism (perhaps a callback that returns true/false):

states.start.push({
    regex: /[A-Za-z_][\w.]*/,
    token: 'variable',
    // proposed option, not something simple mode supports today
    validator: function(token) { return typesys.verifyPath(token); } // or just typesys.verifyPath
});

I think this would make sense, since the token matcher itself probably shouldn’t care whether the token is valid, as long as it has the correct syntax. It would also be backward-compatible with the current implementation.
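
To make the filtering idea concrete, here is a minimal sketch of what a check like typesys.verifyPath might do. The `types` and `variables` tables, their layout, and the standalone `verifyPath` function are all assumptions made up for illustration; the real type information would come from the backend.

```javascript
// Hypothetical type table: type name -> { property name -> type name }.
// Scalar types have no properties.
var types = {
  string: {},
  number: {},
  address: { house: "string", street: "string", city: "string", country: "string" },
  node: { name: "string", parent: "node" }
};

// Hypothetical variable table: variable name -> type name.
var variables = { user_id: "number", name: "string", address: "address", root: "node" };

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns true when every segment of a dotted path resolves in the tables above.
function verifyPath(path) {
  var parts = path.split(".");
  if (!has(variables, parts[0])) return false;
  var type = variables[parts[0]];
  for (var i = 1; i < parts.length; i++) {
    var props = types[type];
    if (!props || !has(props, parts[i])) return false;
    type = props[parts[i]]; // step into the property's type
  }
  return true;
}
```

Because the walk follows one property at a time, a recursive type such as node needs no special handling: root.parent.parent.name resolves the same way as root.name.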

You could try representing the different types as states (in the simple mode sense). Or, if that doesn’t work, it might indeed make more sense to write a regular mode.

Regarding the types-as-states approach: that means a state for every object.property combination, which results in a large list of rules.
I’ll try it out and see what happens, since it’s the easiest fix anyway.

If this still fails, I’m considering adding a “post-process token” feature to simple mode, since the rest of that mode works pretty well anyway (in principle, the same as what I suggested in my edit above).
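
A rough sketch of that post-processing idea, done outside simple mode by wrapping the mode object it produces. The names `withPathCheck` and `isValidPath` are hypothetical, and the sketch assumes the simple mode emits the "variable" style for path-like tokens.

```javascript
// Wraps a mode object so that "variable" tokens failing a check are
// restyled as "error". `isValidPath` is a hypothetical callback that
// receives the token text and returns true/false.
function withPathCheck(baseMode, isValidPath) {
  // Inherit startState, copyState, indent and so on from the base mode.
  var wrapped = Object.create(baseMode);
  wrapped.token = function (stream, state) {
    var style = baseMode.token(stream, state);
    // stream.current() is the text consumed for the token just read.
    if (style === "variable" && !isValidPath(stream.current())) return "error";
    return style;
  };
  return wrapped;
}
```

Assuming the simple mode was registered under a name such as "expressions", the wrapped version could probably be registered with CodeMirror.defineMode("checked", function (config) { return withPathCheck(CodeMirror.getMode(config, "expressions"), typesys.verifyPath); }). I have not tried this against a real editor instance.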

No, I was thinking your values represent different types, each of which has a different set of properties, so you’d have only one state per type. Definitely not one state per combination – that’d completely defeat the purpose.

I’m a bit lost, can you show me an example with two different states (for two different types)?
Remember that a property can also be of a type with further sub-properties (possibly ad infinitum).

Something like this (only has a single type, node):

CodeMirror.defineSimpleMode("something", {
  start: [
    // string and number and such go here
    {regex: /\d+/, token: "number"},
    {regex: /root|parent|other_node_variable/, token: "variable", next: "node_type"}
  ],
  node_type: [
    {regex: /\.(?:name|parent)/, token: "property", next: "start"},
    // Anything else falls through to the start state without consuming anything
    {next: "start"}
  ]
});
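
If the type information comes from the backend, the states could be generated rather than written by hand. The following is only a sketch under assumptions: `buildStates`, `groupByNextState`, and the shape of the `variables` and `types` tables are invented for illustration. It differs from the example above in one respect: a property whose type is itself an object type moves to that type's state instead of returning to start, so chains like root.parent.parent.name stay within one rule set per type.

```javascript
// Escapes regex metacharacters in a name before it is joined into a pattern.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Groups names by the state that should follow them: "<type>_type" for
// object types listed in `types`, "start" for everything else (scalars).
function groupByNextState(members, types) {
  var groups = {};
  Object.keys(members).forEach(function (name) {
    var type = members[name];
    var isObject = Object.prototype.hasOwnProperty.call(types, type);
    var next = isObject ? type + "_type" : "start";
    (groups[next] = groups[next] || []).push(escapeRegex(name));
  });
  return groups;
}

// `variables`: variable name -> type name.
// `types`: object type name -> { property name -> type name }.
function buildStates(variables, types) {
  var states = { start: [{ regex: /\d+/, token: "number" }] };
  var vars = groupByNextState(variables, types);
  Object.keys(vars).forEach(function (next) {
    states.start.push({
      regex: new RegExp("(?:" + vars[next].join("|") + ")\\b"),
      token: "variable",
      next: next
    });
  });
  // One state per object type, however deep the paths can get.
  Object.keys(types).forEach(function (type) {
    var props = groupByNextState(types[type], types);
    states[type + "_type"] = Object.keys(props).map(function (next) {
      return {
        regex: new RegExp("\\.(?:" + props[next].join("|") + ")\\b"),
        token: "property",
        next: next
      };
    }).concat([{ next: "start" }]); // anything else falls back to start
  });
  return states;
}
```

The result would be passed to CodeMirror.defineSimpleMode("something", buildStates(variables, types)). The number of states grows with the number of types, not with the number of possible paths.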

Nice! I didn’t know it can be used this way…
I’ll see if this can work for my scenario.

Thanks!