From reading the documentation, I understand that the InputStream pos is not global but relative to the fragment the parser is currently parsing (Lezer Reference Manual). Is this correct?
Is it possible to obtain the global position (relative to the complete input) to avoid having to work with this interface?
Why do you want the global position? The amount of things you can do in a tokenizer without breaking incremental re-parsing is rather limited, and the input stream abstraction tries to prevent you from going wrong there.
I was wondering because we have a complex tokenizer that we would prefer not to modify much. It takes plain JavaScript strings as input, and this new interface changes things.
We also make use of input.substring(start) (copying all the input from start onward), which is not efficient but is part of our current tokenizer. We would prefer to use plain strings instead of the interface (InputStream will force us to write a while loop with String.fromCharCode()).
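For readers in the same situation, here is a minimal sketch of the loop described above: collecting a bounded window of upcoming characters into a plain string using only peek and String.fromCharCode. The CharStream interface and the StringStream class are hypothetical stand-ins written for this example so that it runs on its own; they only mirror the pos, peek and advance members mentioned in this thread and are not the real Lezer classes. The sketch assumes peek returns a negative number when it runs past the end of the input.

```typescript
// Hypothetical minimal shape of the stream, limited to the members
// discussed in this thread. Not the actual Lezer type.
interface CharStream {
  readonly pos: number;
  peek(offset: number): number;
  advance(n?: number): number;
}

// Hypothetical string-backed stand-in, used only to try the helper out.
class StringStream implements CharStream {
  pos = 0;
  private text: string;
  constructor(text: string) {
    this.text = text;
  }
  // Code unit at pos + offset, or -1 when out of range (an assumption).
  peek(offset: number): number {
    const i = this.pos + offset;
    return i >= 0 && i < this.text.length ? this.text.charCodeAt(i) : -1;
  }
  // Move forward by n code units and return the code unit now at pos.
  advance(n: number = 1): number {
    this.pos = Math.min(this.text.length, this.pos + n);
    return this.peek(0);
  }
}

// Build a plain string from up to maxLength upcoming code units,
// without moving the stream. Stops early at the end of the input.
function lookAhead(input: CharStream, maxLength: number): string {
  const codes: number[] = [];
  for (let i = 0; i < maxLength; i++) {
    const code = input.peek(i);
    if (code < 0) break;
    codes.push(code);
  }
  return String.fromCharCode(...codes);
}
```

Keeping maxLength small matters here: unlike input.substring(start), a bounded window limits how far the tokenizer's result depends on later content, which is the incremental re-parsing concern raised earlier in the thread.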
InputStream.pos does refer to a global position in the whole input. It sounds like the docs you linked explicitly mention this.
Hello Marijn, would you explain why the InputStream interface includes pos at all? What would be a possible use for it, given that we do not have the original input, and the only means of getting anything from the input is to use peek and advance?
thanks
konstantin
The XML mode uses it as a cache key to avoid re-reading tag names, and some other external tokenizers use it to check whether they’ve moved the input stream, but indeed, it’s not super useful.
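To illustrate the cache-key idea, here is a rough sketch of how a tokenizer helper might remember the last name it read, keyed on the stream and the absolute position. This is not the XML mode's actual code. CharStream, CountingStream, isNameChar and NameCache are all hypothetical names invented for this example, and the stand-in stream exists only so the sketch runs on its own and can show that the second lookup does no reading.

```typescript
// Hypothetical minimal shape of the stream; not the actual Lezer type.
interface CharStream {
  readonly pos: number;
  peek(offset: number): number;
}

// Hypothetical string-backed stand-in that counts peek calls.
class CountingStream implements CharStream {
  pos = 0;
  peekCount = 0;
  private text: string;
  constructor(text: string) {
    this.text = text;
  }
  peek(offset: number): number {
    this.peekCount++;
    const i = this.pos + offset;
    return i >= 0 && i < this.text.length ? this.text.charCodeAt(i) : -1;
  }
}

// Simplified name characters: ASCII letters, digits, '-' and '_'.
function isNameChar(code: number): boolean {
  return (
    (code >= 97 && code <= 122) ||
    (code >= 65 && code <= 90) ||
    (code >= 48 && code <= 57) ||
    code === 45 ||
    code === 95
  );
}

// Remembers the most recent name, keyed on (stream, absolute position).
class NameCache {
  private input: CharStream | null = null;
  private pos = -1;
  private name = "";

  nameAfter(input: CharStream, offset: number): string {
    const key = input.pos + offset;
    // Same stream and same absolute position: reuse the earlier result.
    if (this.input === input && this.pos === key) return this.name;
    let name = "";
    for (let i = offset; ; i++) {
      const code = input.peek(i);
      if (!isNameChar(code)) break;
      name += String.fromCharCode(code);
    }
    this.input = input;
    this.pos = key;
    this.name = name;
    return name;
  }
}
```

The other use mentioned above follows the same pattern: a tokenizer can record pos before calling a helper and compare it afterwards to see whether the helper moved the stream.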