Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: vcs_info and locales
- X-seq: zsh-workers 27903
- From: Phil Pennock <zsh-workers+phil.pennock@xxxxxxxxxxxx>
- To: Frank Terbeck <ft@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: vcs_info and locales
- Date: Sun, 25 Apr 2010 06:19:44 -0700
- Cc: zsh-workers@xxxxxxx
- In-reply-to: <87aassncyk.fsf@xxxxxxxxxxxxxxxxxxxxxx>
- List-help: <mailto:zsh-workers-help@zsh.org>
- List-id: Zsh Workers List <zsh-workers.zsh.org>
- List-post: <mailto:zsh-workers@zsh.org>
- Mail-followup-to: Frank Terbeck <ft@xxxxxxxxxxxxxxxxxxx>, zsh-workers@xxxxxxx
- Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
- References: <20100424234017.776ae0ea@coriolan> <87aassncyk.fsf@xxxxxxxxxxxxxxxxxxxxxx>
On 2010-04-25 at 10:38 +0200, Frank Terbeck wrote:
> Anyway, could you try the following patch for the locale problem? I
> think it should solve the issue once and for all.
I have one concern, which leads to the question: is it really necessary
to set LC_ALL instead of LC_MESSAGES?
The main problem is that when you override LC_CTYPE to C, you lose any
potential UTF-8 support, unless the tool just passes through the binary
data.
I think the safest algorithm is not to set LC_ALL but instead:
* set LC_MESSAGES=C
* if LC_ALL is set and is not C, set LANG=$LC_ALL, unset LC_ALL
Make sense?
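To make the two steps above concrete, here is a minimal sketch in POSIX
shell. The wrapper name run_with_c_messages is made up for illustration
and is not part of VCS_Info or of Frank's patch; it only shows the
ordering of the steps, run in a subshell so the caller's environment is
left alone.

```shell
# Hypothetical helper (not part of VCS_Info): run a command with
# untranslated messages while leaving character handling untouched.
run_with_c_messages() {
    (
        # LC_ALL overrides every other LC_* variable, so LC_MESSAGES=C
        # would have no effect while it is set. Move its value into
        # LANG, the lowest-priority fallback, and drop LC_ALL.
        if [ -n "${LC_ALL:-}" ] && [ "$LC_ALL" != "C" ]; then
            LANG=$LC_ALL
            unset LC_ALL
        fi
        LC_MESSAGES=C
        export LANG LC_MESSAGES
        "$@"
    )
}
```

Usage would be something like { run_with_c_messages svn info }. One thing
the sketch does not try to handle: if LC_CTYPE was set separately to
something different from LC_ALL, that LC_CTYPE value takes effect once
LC_ALL is gone.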
Rest of this email is just some exploration and skippable.
I don't have NLS support on my main box, or I could do more testing
myself; with { svn log }, where most of my UTF-8 shows up, LC_CTYPE=C leads
to the content being printed with escapes instead of cleanly. I know
VCS_Info doesn't use that; I mention it by way of example. { svn info }
by contrast always percent-encodes those characters; this works anyway,
because VCS_Info walks back up the dir-tree to find the svn co dir, so
has the relative info by comparing the FS realpath'd root of the repo to
the current dir.
URL: https://svn.spodhuis.org/ksvn/scratch/Fran%C3%A7ois
-> VCS_Info %S == François
(yes, I picked the OP's name as example testdata)
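The directory walk described above can be sketched roughly as follows.
This is an illustration only, not the actual VCS_INFO_get_data_svn code:
the function name svn_subdir is invented, and it assumes the working-copy
layout where every checked-out directory carries its own .svn directory.

```shell
# Hypothetical sketch: find the subdirectory relative to the top of an
# svn checkout by comparing filesystem paths, never looking at the URL.
svn_subdir() (
    dir=$(cd "${1:-.}" && pwd -P) || exit 1
    [ -d "$dir/.svn" ] || exit 1       # not inside a working copy
    top=$dir
    # Climb while the parent directory is also part of the working copy.
    while [ "$top" != "/" ] && [ -d "$(dirname "$top")/.svn" ]; do
        top=$(dirname "$top")
    done
    # Strip the checkout root; what is left is the relative path.
    case $dir in
        "$top") printf '.\n' ;;
        *)      printf '%s\n' "${dir#"$top"/}" ;;
    esac
)
```

Because the result comes from the filesystem names, it is unaffected by
how { svn info } chooses to encode the URL.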
For experimentation, I created a repo with a UTF-8 character in its
name. Apache/mod_dav_svn won't serve it:
(20014)Internal error: Can't convert string from 'UTF-8' to native encoding: [...]
but I can use file:/// access instead. A repo named foo-☺ appears in my
prompt as <foo-%E2%98%BA:0> (<name:version>).
And still VCS_Info works:
URL: file:///home/pdp/tmp/T/ROOT/foo-%E2%98%BA/fred
Repository Root: file:///home/pdp/tmp/T/ROOT/foo-%E2%98%BA
-> VCS_Info %S == fred
pwd -> ..../T/foo-☺/fred
URL: file:///home/pdp/tmp/T/ROOT/foo-%E2%98%BA/%E2%99%A1
Repository Root: file:///home/pdp/tmp/T/ROOT/foo-%E2%98%BA
-> VCS_Info %S == ♡
pwd -> ..../T/foo-☺/♡
I ♡ VCS_Info for just working, but it's still juju. It also works as
an accidental artifact of the VCS_INFO_get_data_svn implementation. I
get to say "accidental" because apparently I wrote that code.
*scratches head*
(Through all this, cd gets interesting when xtitle updates to iTerm
silently drop the UTF-8 characters through to the display)