Zsh Mailing List Archive
Re: Unable to input multibyte characters.
On Wed, 4 Nov 2009, Ian-Xue Li wrote:
> Hi,
> my problem is that input "你好嗎" but appears some weird code like
> "ä½ å¥½å".
That appears to be the UTF-8 sequence 你好嗎 interpreted as some other
character set (e.g. ISO-8859-1):
$ echo 你好嗎 | iconv -f ISO-8859-1 -t UTF-8
ä½ å¥½å
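To see the same effect at the byte level, here is a minimal sketch. It assumes iconv and od are installed, and it uses only U+4F60, the first code point of the 'ni hao ma' example. The bytes are written as octal escapes so the result doesn't depend on your terminal's encoding. The names utf8_bytes, garbled_hex and restored_hex are just illustrative.

```shell
# UTF-8 encoding of U+4F60 (bytes e4 bd a0), written as octal escapes.
utf8_bytes() { printf '\344\275\240'; }

# Misread the three bytes as ISO-8859-1 and re-encode them as UTF-8:
# every original byte turns into a separate two-byte character.
garbled_hex=$(utf8_bytes | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

# Converting back from UTF-8 to ISO-8859-1 undoes the damage.
restored_hex=$(utf8_bytes | iconv -f ISO-8859-1 -t UTF-8 \
    | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "garbled:  $garbled_hex"
echo "restored: $restored_hex"
```

One character becoming several is the typical sign that UTF-8 bytes are being decoded with a single-byte character set somewhere between the input method, the terminal and the shell.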
So, I suspect it's a locale issue. What do you get from the 'locale'
command? For me, under Gentoo Linux, with working UTF-8 support, I get:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=POSIX (<-- personal preference... shouldn't matter here)
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
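A quicker check than reading the whole listing is 'locale charmap', which prints the character encoding of the current locale. The sketch below is only an illustration: is_utf8_charmap is a made-up helper name, and the spellings it accepts are my assumption about what different systems print.

```shell
# Return success if the given charmap name looks like UTF-8.
# The accepted spellings are an assumption; glibc prints "UTF-8".
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the system which encoding the current locale uses.
current=$(locale charmap 2>/dev/null || true)

if is_utf8_charmap "$current"; then
    echo "charmap is $current"
else
    echo "charmap is ${current:-unknown}: check LANG, LC_CTYPE and LC_ALL"
fi
```

If this reports something other than UTF-8 inside Zsh but UTF-8 inside bash, the two shells are probably being started with different environment settings.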
> I tried setting the multibyte option by "setopt multibyte", but after
> that, "setopt" outputs no multibyte flag in its listing. So I figure
> there might be something wrong with the version 4.3.10 ?
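If Zsh is built with multibyte support, multibyte is on by default. A bare
'setopt' lists only options that differ from their defaults, so multibyte
won't show up there even when it is on.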
For me (with multibyte working):
$ set -o | grep multibyte
nomultibyte off
(Options that are on by default are listed with a 'no' prefix, so
'nomultibyte off' means multibyte is on.)
> Terminal is urxvt and xterm, both were unable to input Chinese and
> Japanese characters with SCIM. (doable in bash, nothing else is
> changed.)
With rxvt-unicode (urxvt), the following works for me:
$ scim -d
# SCIM starts as daemon
$ XMODIFIERS=@im=SCIM GTK_IM_MODULE=scim QT_IM_MODULE=scim urxvt
(... new urxvt starts ...)
$ <Ctrl+Space>
(...activates scim-pinyin... and I'm able to enter things via pinyin.)
To be sure Zsh itself is okay, you can try the following:
$ autoload insert-unicode-char
$ zle -N insert-unicode-char
$ bindkey "^U" insert-unicode-char
Then, to type your 'ni hao ma' from before, where '^U' represents Ctrl+U:
^U 4f60 ^U ^U 597d ^U ^U 55ce ^U
(The first '^U' tells Zsh to expect a hex-coded Unicode code point. The
second '^U' tells Zsh you've finished entering the hex digits, and it then
inserts the character.)
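If you want to know which bytes those code points should produce in a UTF-8 locale, the following sketch does the encoding arithmetic by hand. utf8_hex is a hypothetical helper name, and for brevity it only handles the three-byte range (U+0800 to U+FFFF) that covers the characters above. Surrogate code points are not rejected.

```shell
# Print the UTF-8 encoding (as hex) of a code point given in hex.
# Only the three-byte range U+0800..U+FFFF is handled in this sketch.
utf8_hex() {
    cp=$((0x$1))
    if [ "$cp" -lt 2048 ] || [ "$cp" -gt 65535 ]; then
        echo "utf8_hex: only U+0800..U+FFFF supported" >&2
        return 1
    fi
    # Layout: 1110xxxx 10xxxxxx 10xxxxxx
    printf '%02x%02x%02x\n' \
        $(( 0xE0 | (cp >> 12) )) \
        $(( 0x80 | ((cp >> 6) & 0x3F) )) \
        $(( 0x80 | (cp & 0x3F) ))
}

for code in 4f60 597d 55ce; do
    echo "U+$code -> $(utf8_hex "$code")"
done
```

You can compare the result with what Zsh actually inserted by piping the typed character through 'od -An -tx1'.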
>
> I've also recompiled Zsh with an explicit "--enable-multibyte" and
> started Zsh with the --multibyte flag; neither helped.
>
> (This is vital because numerous files are named with these
> characters, and I use the shell tools to manage them. So please help!)
>
Best,
Ben