Zsh Mailing List Archive
Re: [PATCH] [[:blank:]] only matches on SPC and TAB
- X-seq: zsh-workers 42779
- From: Daniel Tameling <tamelingdaniel@xxxxxxxxx>
- To: zsh-workers@xxxxxxx
- Subject: Re: [PATCH] [[:blank:]] only matches on SPC and TAB
- Date: Mon, 14 May 2018 21:52:14 +0200
- In-reply-to: <CAH+w=7YTbR8pTm7rdeFVTCHT1Xk7fJJAJB2zi0j6fH9L4P4ULQ@mail.gmail.com>
- References: <CAKc7PVDyrTMsmBSEDcMC=CNVCjOnEDVtywRYA0=UnNCBpF=7JQ@mail.gmail.com> <20180514063611.GA7263@chaz.gmail.com> <CGME20180514064505epcas3p1b2f178c595fc9bb962e4094e296ba699@epcas3p1.samsung.com> <20180514064431.GB7263@chaz.gmail.com> <20180514094733.308bff1a@camnpupstephen.cam.scsc.local> <20180514123425.GA19631@chaz.gmail.com> <20180514145056.3eedaea9@camnpupstephen.cam.scsc.local> <20180514155131.GC7263@chaz.gmail.com> <CAKc7PVACLxCpp4XEoizYzJLg9_qMFhrwJvZF3J+fkMudx3q+rg@mail.gmail.com> <CAH+w=7YTbR8pTm7rdeFVTCHT1Xk7fJJAJB2zi0j6fH9L4P4ULQ@mail.gmail.com>
Stephane already quoted some man pages, but here is what the C99/C11
standards say:
"The isblank function tests for any character that is a standard blank
character or is one of a locale-specific set of characters for which
isspace is true and that is used to separate words within a line of
text. The standard blank characters are the following: space (' '),
and horizontal tab ('\t'). In the "C" locale, isblank returns true
only for the standard blank characters."
And POSIX seems to say the same: it defines blank for the C locale
and states that in other locales it should at least encompass space
and tab.
So in other locales, what counts as a blank beyond space and tab seems
to be left to the implementation, and everybody does what they think
is a good choice. Thus the mess
Stephane observed. In fact, I looked at the musl library and found
this code:
    int isblank(int c)
    {
        return (c == ' ' || c == '\t');
    }

    int __isblank_l(int c, locale_t l)
    {
        return isblank(c);
    }
So they completely ignore the locale and just use the bare minimum
required by the standard. So after the patch, zsh would not only
behave differently on different platforms but would also change its
behavior if you link with a different libc.
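To see what a given libc actually does, something like the following
could be used. It is only a rough sketch: blanks_in_locale is a
hypothetical helper name, not anything from zsh or musl, and it only
probes single-byte values, so in a multibyte locale such as UTF-8 it
says nothing about wide characters (iswblank would be needed for
those).

```c
#include <ctype.h>
#include <limits.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switch LC_CTYPE to the named locale and collect
 * every single-byte value for which isblank() returns true.
 * Stores at most cap values in out, in ascending order.
 * Returns the total number of blanks found, or -1 if the locale
 * could not be set. */
int blanks_in_locale(const char *name, unsigned char *out, size_t cap)
{
    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;

    int n = 0;
    for (int c = 0; c <= UCHAR_MAX; c++) {
        if (isblank(c)) {
            if ((size_t)n < cap)
                out[n] = (unsigned char)c;
            n++;
        }
    }
    return n;
}
```

In the "C" locale this should report exactly two blanks, tab and
space, on any conforming libc. Calling it with other locale names (or
with "" for the locale from the environment) on different platforms
is where the results may start to differ.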
Nevertheless, I'm slightly in favour of the patch. While defining our
own :blank: for other locales might give us consistency across
platforms, I think it will end up being different from what everybody
else does and will thus lead to unexpected results for users -- in
particular if the libcs start to agree on isblank for different
locales. And at that point, it might be difficult to change the
behavior if it breaks backward compatibility.
In fact, it's the hope that the situation will improve in the future
that sways me towards the patch over the status quo. But seeing
the mess Stephane uncovered made it a very tight race.
Finally, whether the patch gets applied or not, the documentation
should definitely be updated to reflect the issues around :blank:.
--
Daniel