Zsh Mailing List Archive
Re: Possible multibyte issue do to missing check for 'utf8' in the source code
- X-seq: zsh-workers 54039
- From: Oliver Kiddle <opk@xxxxxxx>
- To: linuxtechguy@xxxxxxxxx
- Cc: devs <zsh-workers@xxxxxxx>
- Subject: Re: Possible multibyte issue do to missing check for 'utf8' in the source code
- Date: Wed, 05 Nov 2025 20:03:22 +0100
- Archived-at: <https://zsh.org/workers/54039>
- In-reply-to: <CA+rB6GJKdZV3dqKkjmXH3-LNrR=o0sBcnos1jb-z56oWb-qRpA@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <CA+rB6GJKdZV3dqKkjmXH3-LNrR=o0sBcnos1jb-z56oWb-qRpA@mail.gmail.com>
Jim wrote:
> virtual terminal(xfcer-terminal). Both LANG and LC_ALL are set to 'en_US.utf8'.
>
> Using a gentoo system, so naming is 'en_US.utf8' not 'en_US.UTF-8' as is with
> most distributions.
>
> I did a grep of the zsh repository and found the following:
>
> Src/compat.c: if (!strcmp(nl_langinfo(CODESET), "UTF-8"))
>
> Src/utils.c: if (!strcmp(nl_langinfo(CODESET), "UTF-8")) {
If you want to know what your system is comparing against in that case,
do:
  zmodload zsh/langinfo
  echo $langinfo[CODESET]
I would also suspect that on a Gentoo system, both the lines you list
are skipped by the C preprocessor conditionals around them.
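As a rough cross-check outside zsh, the sketch below prints the locale name next to the codeset string reported by `locale charmap`, which typically matches what nl_langinfo(CODESET) returns. The helper `is_exact_utf8` is hypothetical and not part of zsh; it only mimics the exact string comparison shown in the grep output, and the assumption that `locale` is installed may not hold everywhere.

```shell
# Hypothetical helper, not part of zsh: mimics the exact string
# comparison against "UTF-8" seen in the quoted grep output.
is_exact_utf8() {
  [ "$1" = "UTF-8" ]
}

# The locale *name*, taken from LC_ALL, LC_CTYPE or LANG in that order.
locale_name=${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}

# The codeset the C library reports for that locale (empty if the
# locale command is unavailable).
codeset=$(locale charmap 2>/dev/null) || codeset=

printf 'locale name: %s\n' "$locale_name"
printf 'codeset:     %s\n' "${codeset:-unknown}"

if is_exact_utf8 "$codeset"; then
  echo 'codeset matches "UTF-8" exactly'
else
  echo 'codeset does not match "UTF-8" exactly'
fi
```

The comparison is against the codeset string, not the locale name, so the two lines of output need not be spelled the same way.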
What exactly do you mean by "output some special characters" and by
"multi-byte"?
If all you're doing is echo or print then it is down to
your terminal. Are you using \u escapes? Or are you editing text with
said multi-byte characters in the zsh line editor?
And by multi-byte, do you just mean characters that use more than one
byte in a UTF-8 encoding, characters that are composed with combining
characters (try setopt COMBININGCHARS), or double-width characters that
are wider when displayed?
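To make the three cases concrete, here is a small sketch that builds one sample of each from raw UTF-8 bytes and counts the bytes. The helper `byte_count` and the sample characters are my own choices for illustration, not anything from the original report.

```shell
# Hypothetical helper: count the bytes in a string.
# wc -c counts bytes, not characters; tr strips any padding spaces.
byte_count() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

# Octal escapes keep the script independent of the editor's encoding.
precomposed=$(printf '\303\251')   # U+00E9: e with acute, one code point
combining=$(printf 'e\314\201')    # U+0065 + U+0301: base letter plus combining acute
wide=$(printf '\344\270\255')      # U+4E2D: a CJK character, usually two columns wide

printf 'precomposed: %s bytes\n' "$(byte_count "$precomposed")"
printf 'combining:   %s bytes\n' "$(byte_count "$combining")"
printf 'wide:        %s bytes\n' "$(byte_count "$wide")"
```

All three are multi-byte in UTF-8 (2, 3 and 3 bytes), but only the second involves a combining character and only the third is normally double width, so knowing which kind misbehaves narrows things down.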
Problems can occur where libc (as used by zsh) doesn't agree with GUI
framework libraries on the classification of Unicode characters. So
it makes a big difference whether your problems are with a few basic
accented characters or with the latest emoji from the newest
Unicode spec.
Oliver