Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: multibyte backwarddeletechar
- X-seq: zsh-workers 16093
- From: Clint Adams <clint@xxxxxxx>
- To: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- Subject: Re: multibyte backwarddeletechar
- Date: Sun, 21 Oct 2001 14:21:06 -0400
- Cc: zsh-workers@xxxxxxxxxx
- In-reply-to: <1011021171339.ZM14059@xxxxxxxxxxxxxxxxxxxxxxx>; from schaefer@xxxxxxxxxxxxxxxx on Sun, Oct 21, 2001 at 05:13:38PM +0000
- Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm
- References: <20011021114254.A17952@xxxxxxxx> <1011021171339.ZM14059@xxxxxxxxxxxxxxxxxxxxxxx>
> I'm a bit surprised that this wouldn't cause significant confusion in
> the ZLE display code. How did the multi-byte character get input in
> the first place? Is it displayed as occupying one character position
> on the screen, or several? If only one, doesn't the cursor end up in
> the wrong place on most word- or line-oriented motions that cross it?
That depends on the terminal emulator and font. If I run
LANG=zh_TW.Big5 crxvt -ls -fm taipei16 -fn 8x16 -km big5 ,
each BIG5 character (2 octets) appears to take up the
vertical space of one ASCII character, and the horizontal space
of two ASCII characters. If I run
LANG=zh_TW.Big5 crxvt -ls -fm taipei14 -fn 8x16 -km big5 ,
each BIG5 character (2 octets) appears to take up the
vertical space of one ASCII character, and the horizontal space
of two and a half (2.5) ASCII characters, although crxvt
does some ugly overlapping, so ZLE does not get confused.
If I run LANG=ja_JP.UTF-8 xterm -class UXTerm ,
each UTF-8 Kanji character (3 octets) appears to take up
the same (2 horizontal, 1 vertical) space. In this case,
ZLE does get horribly confused. If I run
LANG=ru_RU.UTF-8 xterm -class UXTerm ,
each UTF-8 Cyrillic character (2 octets) appears to take
up the horizontal and vertical space of one ASCII character.
This also makes ZLE horribly confused. If I run
LANG=fr_FR.UTF-8 xterm -class UXTerm ,
each UTF-8 French non-ASCII character (2 octets)
appears to take up the horizontal and vertical space of one
ASCII character. Again, this confuses ZLE.
I imagine that 6-byte characters will generally take up
less horizontal space than 6 ASCII characters as well.
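
For reference, the octet counts above follow directly from the code point ranges. The sketch below is only an illustration, and utf8_len is a hypothetical helper name, not anything in zsh. It uses the original six-byte definition of UTF-8 (RFC 2279); RFC 3629 later limited the encoding to four bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of zsh: number of octets needed to
 * encode code point cp in UTF-8 as originally defined (RFC 2279,
 * up to 6 octets).  Returns 0 if cp is out of range.
 */
static size_t utf8_len(unsigned long cp)
{
    if (cp < 0x80UL)        return 1;   /* ASCII */
    if (cp < 0x800UL)       return 2;   /* e.g. U+00E9, U+0416 */
    if (cp < 0x10000UL)     return 3;   /* e.g. U+6F22 */
    if (cp < 0x200000UL)    return 4;
    if (cp < 0x4000000UL)   return 5;
    if (cp <= 0x7FFFFFFFUL) return 6;
    return 0;
}
```

Note that the octet count says nothing about how many columns the terminal uses to draw the character, which is the part that confuses ZLE.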
> If we're going to support wide and/or multi-byte characters, I think we
> should Do It Right, not by pasting a zillion workarounds into individual
> editor functions.
I suspect that Doing It Right involves changing char *line to
wchar_t *wline, and modifying all dependencies accordingly.
Additionally, we'd need to figure out how much space each
individual character consumes.
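
To illustrate why a plain char *line is awkward here, this is a rough sketch of what a multibyte-aware backward delete has to do while the buffer is still bytes. It is not the zsh code; prev_char_bytes and backward_delete_char are hypothetical names, and it assumes the caller has already set LC_CTYPE with setlocale(). Because the buffer cannot be decoded backwards in general, it rescans from the start of the line with mbrtowc() to find where the previous character begins.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of zsh: return the number of bytes
 * occupied by the character that ends at byte offset `cursor` in
 * `line`, or 0 if the cursor is at the start of the line.  Decoding
 * follows the current LC_CTYPE locale.
 */
static size_t prev_char_bytes(const char *line, size_t cursor)
{
    mbstate_t st;
    size_t pos = 0, last = 0;

    memset(&st, 0, sizeof st);
    while (pos < cursor) {
        size_t n = mbrtowc(NULL, line + pos, cursor - pos, &st);
        if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
            /* Invalid, truncated, or NUL: fall back to one byte. */
            memset(&st, 0, sizeof st);
            n = 1;
        }
        last = n;
        pos += n;
    }
    return last;
}

/*
 * Delete the whole character before the cursor from a buffer of
 * *len bytes.  Updates *len and returns the new cursor offset.
 */
static size_t backward_delete_char(char *line, size_t *len, size_t cursor)
{
    size_t n = prev_char_bytes(line, cursor);

    memmove(line + cursor - n, line + cursor, *len - cursor);
    *len -= n;
    return cursor - n;
}
```

With a wchar_t *wline the rescan disappears, since the previous character is always one element back, but the display width would still have to come from something like wcwidth() for each character.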