Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: UNICODE Private Use Area characters in BUFFER
- X-seq: zsh-workers 50826
- From: Mikael Magnusson <mikachu@xxxxxxxxx>
- To: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- Cc: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: UNICODE Private Use Area characters in BUFFER
- Date: Mon, 24 Oct 2022 03:27:50 +0200
- Archived-at: <https://zsh.org/workers/50826>
- In-reply-to: <CAH+w=7bJxkKxB2jL0jqxdg0-eeb16u1MCcHnMndT9aDdBDwMpw@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <CAN=4vMowyKmrQtQb=QTxiVzQJXRubz-o2T12=6aQBHSpkKwOig@mail.gmail.com> <CAHYJk3SWfX7ZaFA=WgDBtSPZD0isV5OUHWgf3ienhzhzK+9xQw@mail.gmail.com> <CAN=4vMoLQBt8ST7E3EachnLra05ENPOiY0nDOC0Z_=a=8Mg4SA@mail.gmail.com> <CAHYJk3Qi+DEGBYZvwXrqehzjbHHunnVfx6dhJB7hJpjM9GWHiQ@mail.gmail.com> <CAN=4vMo1m5O4M72qqpQfu3hC4-FkW0PN4o_eVt_O=-yL0Qx8Sg@mail.gmail.com> <CAH+w=7YCtLoqhx-WmGUKxCCbkGX_5Z1jfGmcd8-59b-iptjOew@mail.gmail.com> <CAH+w=7bJxkKxB2jL0jqxdg0-eeb16u1MCcHnMndT9aDdBDwMpw@mail.gmail.com>
On 10/24/22, Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> On Sun, Oct 23, 2022 at 4:35 PM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
> wrote:
>>
>> Asserting that zsh "handles" those characters in other
>> contexts isn't indicative of anything beyond demonstrating that
>> terminal "handling" is a special case.
>
> Seems to me we've got the following options:
>
> 1. Do nothing.
> 2. Presume Roman is correct that these characters can always be
> treated as printable and narrow. (Still no answer as to how best to
> change this?)
> 3. Add an option UNICODE_PRINTABLE_NARROW that when set, asserts all
> these characters to be printable and narrow. Default ... on?
> 4. Add special variable(s) (perhaps via module?) to allow remapping
> the wcwidth9.h lookup tables to make individual characters printable
> and set their width.
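For illustration only, options 2 and 3 above might be sketched in C roughly as follows. This is not code from zsh or wcwidth9.h: the names is_private_use() and pua_width_override() are hypothetical, and the only facts relied on are the three Private Use ranges defined by Unicode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of zsh): true if cp lies in one of the
 * three Unicode Private Use ranges. */
static bool is_private_use(uint32_t cp)
{
    return (cp >= 0xE000u   && cp <= 0xF8FFu)    /* BMP Private Use Area */
        || (cp >= 0xF0000u  && cp <= 0xFFFFDu)   /* Supplementary PUA-A  */
        || (cp >= 0x100000u && cp <= 0x10FFFDu); /* Supplementary PUA-B  */
}

/* Hypothetical override for "printable and narrow": if cp is a Private
 * Use character, store a width of 1 in *width and return true.
 * Otherwise return false and leave *width untouched, so the caller can
 * fall back to its usual lookup (libc wcwidth() or a bundled table). */
static bool pua_width_override(uint32_t cp, int *width)
{
    if (!is_private_use(cp))
        return false;
    *width = 1;
    return true;
}
```

Under option 3, a caller would presumably consult pua_width_override() only while the proposed option is set, and use the normal width lookup in every other case.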
I think if we do anything with wcwidth9.h, it should be to remove it.
Since it was added there have been 6 subsequent Unicode standards, the
latest one adding over 4000 ideographs alone[1] (I don't know what
width the version 9 wcwidth gives for this range). It is probably
returning wrong values for many thousands more characters on
systems where the libc has newer tables than Unicode 9. I suppose it
could be useful to enable when remoting into old systems from a modern
one.
We should probably at least mark it as deprecated. glibc 2.26 added
support for Unicode 9 and was released in August 2017, and the Unicode
9 wcwidth9.h was added to zsh in November 2016, a rather small window
where it mattered. What happened in Unicode 9 was that the
presentation width for all emoji was changed to 2[2]. I'm not sure how
this motivated people to add custom tables to every program they used
instead of simply updating glibc and having every program be correct at
once...
[1] https://home.unicode.org/announcing-the-unicode-standard-version-15-0/
[2] I couldn't find a more official reference than this atm,
https://github.com/irssi/irssi/issues/720
--
Mikael Magnusson