Zsh Mailing List Archive

Support for combining characters



I thought a bit more about how to support combining characters in ZLE,
and since I'm still trying to avoid having to understand phrases like
[1], here are a few tentative conclusions[2].

- Not all terminals[3] support combining characters, and we may not
be able to rely on full support for those that do.  So I think we
need an option like ZLE_COMBINING_CHARS.  Possibly we can probe for
this eventually:
 - read the cursor position from the terminal
 - output a base character
 - output a zero-width accent character
 - read the cursor position again
 - see if it's moved by only the original character width
but that's for the future.
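
A rough sketch of the checking half of such a probe, in C.  It assumes
the terminal answers the usual ESC [ 6 n query with a report of the form
ESC [ row ; col R; the function names are made up for illustration, and
the actual terminal I/O (raw mode, timeouts, cleaning up the test output)
is left out.

```c
#include <stdio.h>

/*
 * Parse a cursor position report of the form ESC [ row ; col R.
 * Returns 1 on success, 0 if the buffer is not a well-formed report.
 * (Hypothetical helper, not an existing zsh function.)
 */
static int parse_cursor_report(const char *buf, int *row, int *col)
{
    char end = '\0';

    if (sscanf(buf, "\033[%d;%d%c", row, col, &end) != 3)
        return 0;
    return end == 'R';
}

/*
 * Given the column before and after printing "base + accent", decide
 * whether the accent was combined: the cursor should have advanced by
 * the width of the base character only.
 */
static int combining_supported(int col_before, int col_after, int base_width)
{
    return col_after - col_before == base_width;
}
```

The probe would read one report, print the two characters, read a second
report and pass both columns to combining_supported().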

- As far as I understand it, any Unicode character that claims to be
both printable and zero-width is to be treated as a combining character.
It needs to follow a real character for that to happen, so I would
propose to continue handling any that don't in the current fashion
(highlighted <FFFE> etc.) for safety.
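
To make the rule concrete, here is a minimal sketch in C.  The
zero-width test is a stand-in that only knows the U+0300..U+036F block
so the example is independent of the locale; real code would more likely
ask the C library, for example with iswprint() and wcwidth().  All names
are hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Stand-in for "printable and zero-width" (assumption, illustration only). */
static int is_zero_width_printable(wchar_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/*
 * For each character in line[0..len), set combines[i] to 1 if it should
 * be drawn in the same cell as the preceding base character, else 0.
 * A zero-width character with no base before it gets 0, so it can keep
 * the current highlighted treatment.
 */
static void mark_combining(const wchar_t *line, size_t len, int *combines)
{
    int have_base = 0;

    for (size_t i = 0; i < len; i++) {
        if (is_zero_width_printable(line[i])) {
            combines[i] = have_base;
        } else {
            combines[i] = 0;
            have_base = 1;
        }
    }
}
```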

- Within zle_refresh.c, it would be best to continue with the
one-entry-per-screen-cell line format (unless anyone is volunteering to
do a wholesale rewrite).  This causes difficulties since we need
multiple characters in the same entry, implying some form of
indirection.  On 64-bit systems, using a real pointer for this will
double the size even for wide characters.  I think we can make use of[4]
the extra flags I added for highlighting.  We could add a flag that
indicates the character is actually an index into an auxiliary array.
This is a bit like how option arguments for builtins are handled.  It's
not particularly efficient, but I hope it won't be grotesquely slow with
some optimisation of reallocation.
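
A sketch of what the flag-plus-index idea might look like in C.  The
struct layout, the CELL_MULTI flag and the function names are all
assumptions for illustration and do not describe the existing code in
zle_refresh.c.  The auxiliary array is a flat pool that doubles in size
when it fills up, which is one simple way to keep reallocation cheap.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define CELL_MULTI 0x1u  /* hypothetical flag: chr is an index, not a character */

struct cell {
    wchar_t chr;         /* the character, or an offset into the pool */
    unsigned atr;        /* highlighting attributes plus the flag above */
};

/* Auxiliary pool: each entry is a length followed by that many characters. */
struct pool {
    wchar_t *data;
    size_t used, size;
};

/* Store a base character plus its combining characters in the pool.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int cell_set_multi(struct cell *c, struct pool *p,
                          const wchar_t *s, size_t n)
{
    if (p->used + n + 1 > p->size) {
        size_t newsize = p->size ? p->size * 2 : 32;
        wchar_t *nd;

        while (newsize < p->used + n + 1)
            newsize *= 2;
        nd = realloc(p->data, newsize * sizeof *nd);
        if (!nd)
            return -1;
        p->data = nd;
        p->size = newsize;
    }
    c->chr = (wchar_t)p->used;
    c->atr |= CELL_MULTI;
    p->data[p->used] = (wchar_t)n;
    memcpy(p->data + p->used + 1, s, n * sizeof *s);
    p->used += n + 1;
    return 0;
}

/* Fetch the characters for one cell; *n receives how many there are. */
static const wchar_t *cell_chars(const struct cell *c, const struct pool *p,
                                 size_t *n)
{
    if (c->atr & CELL_MULTI) {
        *n = (size_t)p->data[c->chr];
        return p->data + c->chr + 1;
    }
    *n = 1;
    return &c->chr;
}
```

Ordinary cells stay the same size and pay only for the flag test; the
pool could be emptied by resetting "used" to zero at the start of each
refresh, assuming no cell from the previous screen is still needed.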

- Outside zle_refresh.c I think this scheme would be too messy.  In that
case we will need to handle moving and deleting by carefully taking
account of combining characters:  moving left until we reach a
non-zero-width character or the start of line, or moving right over any
trailing zero-width characters.
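
The two movements could be sketched in C roughly as follows, treating
the line as a plain array of wide characters.  The zero-width test is
again a stand-in limited to U+0300..U+036F, and the function names are
hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Stand-in zero-width test (assumption, illustration only). */
static int is_zero_width(wchar_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Move left one displayed character: step back, then keep going until
 * we reach a non-zero-width character or the start of the line. */
static size_t move_left(const wchar_t *line, size_t pos)
{
    if (pos == 0)
        return 0;
    pos--;
    while (pos > 0 && is_zero_width(line[pos]))
        pos--;
    return pos;
}

/* Move right one displayed character: step over the base character,
 * then over any trailing zero-width characters. */
static size_t move_right(const wchar_t *line, size_t len, size_t pos)
{
    if (pos >= len)
        return len;
    pos++;
    while (pos < len && is_zero_width(line[pos]))
        pos++;
    return pos;
}
```

Deleting a character would then remove everything between the position
before and after such a move.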

I'm not sure what to do in the main shell.  The most important thing
here is to be able to handle combining characters in editor widgets.
${(m)#...} will help with this.  I don't know whether we need any more
support than that.

I will try and look at this[5] over the next few weeks.

pws


[1] A beamformer shall set the response type format indicated in the
CSI/steering field of the HT Control field of any sounding frame
excluding the NDP and of any PPDU with the NDP sounding announcement
field set to 1 to one of the non-zero values (CSI, Compressed
Beamforming or Non-compressed Beamforming) that corresponds to a type
that is supported by the beamformee.

[2] "creating a context for change" for all you corporate types out there

[3] (and certainly not Terminal 5)

[4] "leverage" for all you corporate types

[5] "implement change within that context"


