Zsh Mailing List Archive
Re: UTF-8 input [was Re: PATCH: zle_params.c]
- X-seq: zsh-workers 20762
- From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxxxxx>
- Subject: Re: UTF-8 input [was Re: PATCH: zle_params.c]
- Date: Mon, 31 Jan 2005 16:18:26 +0000
- In-reply-to: <200501311146.j0VBki1g028832@xxxxxxxxxxxxxx>
- Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm
- References: <200501261806.j0QI6Q2d021854@xxxxxxxxxxxxxx> <20050129034740.GA21742@xxxxxxxxxxx> <20050130010754.6F985863A@xxxxxxxxxxxxxxxxxxxxxxxx> <1050130063525.ZM24312@xxxxxxxxxxxxxxxxxxxxxxx> <200501311146.j0VBki1g028832@xxxxxxxxxxxxxx>
On Jan 31, 11:46am, Peter Stephenson wrote:
} Subject: Re: UTF-8 input [was Re: PATCH: zle_params.c]
}
} > Otherwise don't you have issues if what the user really means to
} > bind to self-insert is a single-byte character that happens to have
} > the high bit set?
}
} Hmmm... you mean that on a system where mbrtowc() reports that a
} single-byte character is incomplete, the user might nonetheless want to
} insert a single-byte character onto the command line?
No. I mean, suppose the user uses the same .zshrc in both an iso-8859-*
and a UTF-8 locale, and has an explicit bindkey command which is intended
to work only in the iso-8859-* locale. That bindkey happens to use a
character for which, in the UTF-8 locale, mbrtowc() reports incomplete.
This was in part why I added the footnote asking about plans for UTF-8
in shell scripts; is it even possible to have the same .zshrc in these
cases?
However, I wasn't thinking very clearly, since mbrtowc() won't report
incomplete for an iso-8859-* character if LC_CTYPE is set correctly.
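To make the mbrtowc() point concrete, here is a minimal C sketch, not taken from the zsh sources. The helper names probe_byte() and probe_in_locale() are hypothetical, and the locale names passed in are assumptions about what is installed on a given system. It feeds one byte to mbrtowc() with a fresh conversion state and reports whether the current LC_CTYPE treats it as complete, invalid, or the start of an unfinished multibyte sequence.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: hand a single byte to mbrtowc() using a fresh
 * conversion state and fold the result into an int:
 *    1  the byte is a complete character on its own
 *    0  the byte is the null character
 *   -1  the byte is invalid in the current LC_CTYPE encoding
 *   -2  the byte begins a multibyte sequence that is still incomplete */
int probe_byte(unsigned char byte)
{
    mbstate_t state;
    wchar_t wc;
    char buf[1];
    size_t r;

    memset(&state, 0, sizeof state);   /* start from the initial shift state */
    buf[0] = (char)byte;
    r = mbrtowc(&wc, buf, 1, &state);

    if (r == (size_t)-2)
        return -2;
    if (r == (size_t)-1)
        return -1;
    return (int)r;
}

/* Hypothetical helper: switch LC_CTYPE to the named locale, then probe.
 * Returns -3 if that locale is not available on this system. */
int probe_in_locale(const char *locale, unsigned char byte)
{
    if (setlocale(LC_CTYPE, locale) == NULL)
        return -3;
    return probe_byte(byte);
}
```

With a UTF-8 locale selected, probe_in_locale() on byte 0xE9 should yield -2, because 0xE9 is the lead byte of a three-byte UTF-8 sequence. With an ISO-8859-1 locale selected, the same byte should yield 1, since it is a complete character there (e with acute accent). The exact locale names, such as "C.UTF-8" or "en_US.ISO-8859-1", vary between systems.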
I'm still worried about the case where that bindkey exists but is for a
function other than self-insert. If multibyte translation is handled by
a widget at the same priority as all other widgets, that "stray" bindkey
can mess up the whole scheme.
} In other words, are you supposing this is some kind of fallback in
} case the locale isn't set correctly, e.g. it's set to UTF-8 but on an
} xterm with character set ISO-8859-1?
That was probably what was in my head, but on reflection it's not really
something that the shell can deal with.