Characters vs CodePoints

It seems that the editor works entirely using code points, i.e., 𐌐 would start at position 0 and end at 1. Is there any way of treating the text in the editor as characters and not code points, i.e., that 𐌐 would be 2 characters, 0-2 (its two surrogate halves), instead?

This is entirely inaccurate (as a trivial experiment or a good look at the docs would have told you). The positioning system works in UTF-16 code units, like JavaScript strings do.
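
For readers who want to check the code unit claim without an editor, here is a minimal sketch using only plain JavaScript strings. The variable names are illustrative and nothing here is CodeMirror API.

```javascript
// An astral character (outside the Basic Multilingual Plane).
const text = "𐌐";

// String length and indexing count UTF-16 code units.
const codeUnits = text.length;

// String iteration goes by code point, so spreading yields one entry.
const codePoints = [...text].length;

// The two code units are a high surrogate followed by a low surrogate.
const high = text.charCodeAt(0);
const low = text.charCodeAt(1);

console.log(codeUnits, codePoints); // 2 1
console.log(high.toString(16), low.toString(16));
```

So a position system based on code units puts this character at 0-2, which is what the question asked for.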

I'm unsure how it's entirely inaccurate. Trying to decorate the string 𐌐 with two decorations, one from position 0-1 and one from 1-2, yields only one decoration covering the range 0-1.

Splitting an astral character in two by putting a decoration boundary in its middle sounds like a really bad idea. But even so, a decoration going to position 2 is definitely not going to end at position 1. Can you show what you are doing in a minimal piece of code?

Apologies, my testing was incorrect. The decorations work as intended, and the astral character is split, but the UI still only shows it as 1 character. The first decoration has the full character width, while the second has 0px width.

I suppose that is how the browser displays spans that have their boundaries inside characters.
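
As an aside, if you would rather avoid boundaries inside a character altogether, a check along these lines could be applied to positions before building decorations. This is a rough sketch in plain JavaScript; `splitsSurrogatePair` and `snapToCodePoint` are hypothetical helpers, not part of any library.

```javascript
// Hypothetical helper: true when `pos` falls between the two halves
// of a surrogate pair in `text`.
function splitsSurrogatePair(text, pos) {
  if (pos <= 0 || pos >= text.length) return false;
  const before = text.charCodeAt(pos - 1);
  const after = text.charCodeAt(pos);
  const isHigh = before >= 0xd800 && before <= 0xdbff;
  const isLow = after >= 0xdc00 && after <= 0xdfff;
  return isHigh && isLow;
}

// Hypothetical helper: move a boundary out of a surrogate pair,
// toward the start of the text unless `forward` is true.
function snapToCodePoint(text, pos, forward = false) {
  if (!splitsSurrogatePair(text, pos)) return pos;
  return forward ? pos + 1 : pos - 1;
}

console.log(splitsSurrogatePair("𐌐", 1)); // true
console.log(snapToCodePoint("a𐌐b", 2)); // 1
```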

I can split them up using a decoration that adds a non-joiner after, but the cursor (using the arrow keys) treats both characters as one, and effectively jumps two steps when navigating back and forth. You can still place the cursor between them with the mouse, but keyboard navigation becomes limited.
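
The two-step jump can be illustrated with a simplified model of cursor stops that only allows positions on code point boundaries. This is a sketch of the idea, not the editor's actual cursor logic (which is assumed here to be at least this coarse); `codePointStops` is a hypothetical name.

```javascript
// Hypothetical helper: list the positions (in UTF-16 code units) that
// lie on a code point boundary in `text`.
function codePointStops(text) {
  const stops = [0];
  let pos = 0;
  // Iterating a string yields whole code points; each is 1 or 2 units long.
  for (const ch of text) {
    pos += ch.length;
    stops.push(pos);
  }
  return stops;
}

console.log(codePointStops("a𐌐b")); // [ 0, 1, 3, 4 ]
```

Position 2, the middle of the astral character, never appears in the list, so stepping from 1 lands on 3.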