I am working on a tokenizer for the Mathematica language using the legacy approach.
The idea is to highlight built-in (standard library) functions differently from user-defined symbols. In modes/mathematica.js,
the tokenizer uses the following check:
// Literals like variables, keywords, functions
if (stream.match(reIdInContext, true, false)) {
return "function";
}
I added my own check to match built-in symbols, like this:
var reKeywords = new RegExp(
"StringQ|Null|NullQ|Do|Block|Module|With|While|Sqrt|Switch|Which|... a lot"
);
// Literals like variables, keywords, functions
if (stream.match(reKeywords, true, false)) {
return "keyword";
}
I strongly suspect that this is extremely slow.
Is there a more efficient yet reliable way to do this?
I checked the source code of StreamParser, and it seems that I could simply do:
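let arrayOfFunctions = ["Table", "While", "Do", ... a lot];
// for...of instead of forEach: a return inside a forEach callback
// would not return from the enclosing tokenizer function
for (const key of arrayOfFunctions) {
if (stream.match(key, true, false)) {
return "keyword";
}
}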
But the overhead of calling stream.match once per keyword is also quite large…
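One direction that might avoid both costs, as a minimal sketch rather than a tested implementation: match the identifier once with an anchored regex, then look the matched text up in a Set. As far as I can tell from the StringStream source, stream.match with a regex slices the rest of the line and rejects any match that does not start at index 0, so an unanchored alternation can end up scanning the whole remaining line on every call. The names `builtins`, `reIdentifier`, `tokenIdentifier` and `FakeStream` below are hypothetical; `reIdentifier` is an assumed simplification of the mode's `reIdInContext`, and `FakeStream` is only a stand-in so the snippet runs outside the editor.

```javascript
// Hypothetical, abbreviated list; the real Set would hold every built-in name.
const builtins = new Set([
  "StringQ", "Null", "Do", "Block", "Module", "With",
  "While", "Sqrt", "Switch", "Which", "Table",
]);

// Anchored identifier pattern (an assumption: the mode's own reIdInContext
// is more elaborate, e.g. it may allow context marks such as System`Sqrt).
const reIdentifier = /^[a-zA-Z$][a-zA-Z0-9$]*/;

// Classify the identifier at the current position with one regex match
// plus one Set lookup, instead of one stream.match call per keyword.
function tokenIdentifier(stream) {
  const m = stream.match(reIdentifier, true, false);
  if (!m) return null; // not an identifier; let other rules handle it
  return builtins.has(m[0]) ? "keyword" : "function";
}

// Minimal stand-in for the editor's string stream, mimicking how its
// match() treats a regex: only a match at the current position counts.
class FakeStream {
  constructor(string) {
    this.string = string;
    this.pos = 0;
  }
  match(pattern, consume) {
    const m = this.string.slice(this.pos).match(pattern);
    if (m && m.index > 0) return null;
    if (m && consume !== false) this.pos += m[0].length;
    return m;
  }
}

console.log(tokenIdentifier(new FakeStream("Do[x, 10]")));  // keyword
console.log(tokenIdentifier(new FakeStream("Double[x]")));  // function
```

Because the whole identifier is consumed before the lookup, a short name such as Do cannot match as a prefix of a longer user symbol such as Double, which a plain `Do|Block|...` alternation would do.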