Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: PATCH: (large) initial support for combining characters in ZLE.
- X-seq: zsh-workers 24833
- From: "Jun T." <takimoto-j@xxxxxxxxxxxxxxxxx>
- To: zsh-workers@xxxxxxxxxx
- Subject: Re: PATCH: (large) initial support for combining characters in ZLE.
- Date: Fri, 18 Apr 2008 03:33:36 +0900
- In-reply-to: <20080413175442.0e95a241@pws-pc>
- Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm
- References: <20080413175442.0e95a241@pws-pc>
Thank you for starting the combining character support!
At 17:54 +0100 08.4.13, Peter Stephenson wrote:
>the base character must be an alphanumeric (and
>I'm not sure about the numeric, I need to find a better definition),
>and
I think this is too restrictive, because in some Asian languages
(Japanese, Korean, Thai, etc.) the base character can be non-alphabetic.
For example, in Japanese, Hiragana/Katakana can be combined with
U+3099 (VOICED SOUND MARK) or U+309A (SEMI-VOICED SOUND MARK).
Example: U+3057 U+3099 = "じ"
Here the base character U+3057 = "し" is not alphanumeric.
>the zero-width characters afterwards (I haven't imposed a limit on how
>many there are) must be punctuation.
I guess this is also too restrictive. I ran code like the following
on Fedora 7:
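#define _XOPEN_SOURCE 700  /* exposes wcwidth() */
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>
int main(void) {
    setlocale(LC_ALL, "");  /* pick up the user's locale, e.g. UTF-8 */
    for (wchar_t w = 1; w < 0x2ffff; ++w) {
        /* zero width, yet not classified as punctuation */
        if (wcwidth(w) == 0 && iswpunct(w) == 0)
            printf("%05x: %lc\n", (unsigned)w, (wint_t)w);
    }
    return 0;
}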
It listed 166 characters, all of which seem to be combining chars in
Thai or Korean (U+0e4e and U+1160 may not be combining, I'm not sure).
I think strictly defining a combined character is virtually impossible,
because there are so many "nonsensical" combinations like
"Hiragana + umlaut". Even within the alphabet, a combination like
"x + U+0318" is almost as strange as "space + grave".
How about accepting any combination?
If the terminal emulator displays garbage, the user can turn off the
option COMBINING_CHARS to see the hex code.
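To make "accepting any combination" concrete, here is a minimal sketch of what such a rule might look like: a base character of non-zero width followed by however many zero-width characters come after it, with no alphanumeric or punctuation test on either side. This is only an illustration, not the code in the patch. The names combined_len, width_fn and demo_width are hypothetical, and demo_width is a stand-in for wcwidth() that covers just a few ranges so the sketch does not depend on the locale.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical width callback: returns the display width of c,
 * using the same convention as wcwidth() (0 means zero-width). */
typedef int (*width_fn)(wchar_t c);

/* Illustrative stand-in for wcwidth(), an assumption of this sketch:
 * treats only U+0300..U+036F and U+3099/U+309A as zero-width. */
static int demo_width(wchar_t c)
{
    if (c >= 0x0300 && c <= 0x036F) return 0;
    if (c == 0x3099 || c == 0x309A) return 0;
    return 1;
}

/* Length, in wide characters, of the combined character starting at
 * s[0]: one base character of non-zero width plus every zero-width
 * character that follows it, with no test on the class of either.
 * A leading zero-width character is returned on its own (length 1). */
static size_t combined_len(const wchar_t *s, size_t n, width_fn width)
{
    size_t i;

    if (n == 0)
        return 0;
    if (width(s[0]) <= 0)
        return 1;               /* no usable base: do not combine */
    for (i = 1; i < n && width(s[i]) == 0; ++i)
        ;                       /* absorb all trailing zero-width chars */
    return i;
}
```

With this rule "し" + U+3099, "x" + U+0318 and "space" + U+0300 are all treated the same way, as one base plus one combining character. On a system that provides wcwidth(), it could presumably be passed as the width argument after setlocale(LC_ALL, "").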