Zsh Mailing List Archive
Re: [PATCH] PCRE/NUL: pass NUL in for text, handle NUL out
- X-seq: zsh-workers 41314
- From: Stephane Chazelas <stephane.chazelas@xxxxxxxxx>
- To: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- Subject: Re: [PATCH] PCRE/NUL: pass NUL in for text, handle NUL out
- Date: Sat, 17 Jun 2017 07:31:28 +0100
- Cc: zsh-workers@xxxxxxx
- In-reply-to: <170616201049.ZM28016@torch.brasslantern.com>
- References: <20170615204050.GA27003@breadbox.private.spodhuis.org> <20170616064129.GA19469@chaz.gmail.com> <170616201049.ZM28016@torch.brasslantern.com>
2017-06-16 20:10:49 -0700, Bart Schaefer:
> On Jun 16, 7:41am, Stephane Chazelas wrote:
> }
> } Solution for now in zsh is to escape like:
> }
> } [[ $x =~ "\b\Q${word//\\E/\\E\\\\E\\Q}\E" ]]
>
> Hmm, wouldn't "\b\Q${(b)word}\E" be sufficient there? In fact if
> you've applied ${(b)word} do you even need \E and \Q ?
Not really.
Inside \Q...\E in PCRE, only \E is special, and there is no
way to escape it. It's like strong quotes. Changing ? to \?
would change the meaning of the regexp (the backslash would be
matched literally), and it wouldn't help for \E.
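
A minimal sketch of the \E workaround quoted above, written in portable shell (it should also run in zsh) and assuming sed is available. The name pcre_literal is hypothetical, not something zsh provides. It closes the quoted span at each \E, emits an escaped backslash followed by E, and reopens the span:

```sh
# Hypothetical helper (not part of zsh): wrap a literal string in \Q...\E.
# Caveat: trailing newlines in the input are dropped by the command substitution.
pcre_literal() {
  # Inside \Q...\E only \E is special, so each \E in the data becomes
  #   \E   (close the quoted span)
  #   \\E  (an escaped backslash, then a plain E)
  #   \Q   (reopen the quoted span)
  printf '\\Q%s\\E\n' "$(printf '%s\n' "$1" | sed 's/\\E/\\E\\\\E\\Q/g')"
}
```

For instance, pcre_literal 'x\Ey' prints \Qx\E\\E\Qy\E, which PCRE reads as the literal text x\Ey.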
Outside of \Q...\E, where what needs to be escaped depends on whether
the regexp has a (?x), there are things like . or $ (or blanks with
(?x)) that ${(b)word} would still leave unescaped.
PCREs (as opposed to some ERE implementations that have things
like \<, \=) are good though in that, AFAICT, the only \x
operators are those where x is an ASCII alnum, so adding a \ in
front of every ASCII non-alnum should be enough, I would think (as
long as we're not inside [...] or things like \g{...}). So an
equivalent of ${(b)var} for PCRE should not be too difficult.
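
As a rough illustration of that idea (not an actual zsh feature), here is a sketch in portable shell that puts a \ in front of every ASCII non-alphanumeric byte. The name pcre_quote is hypothetical. It assumes that in the C locale the punct, space and cntrl classes cover only ASCII bytes, so bytes >= 0x80 pass through untouched, and like the text above it only applies outside [...] and \g{...}:

```sh
# Hypothetical helper: backslash-escape every ASCII non-alphanumeric byte.
# The body is a subshell so the LC_ALL change does not leak to the caller.
pcre_quote() (
  LC_ALL=C                  # work byte by byte; ranges below are ASCII-only
  rest=$1
  out=
  while [ -n "$rest" ]; do
    tail=${rest#?}          # everything after the first byte
    c=${rest%"$tail"}       # the first byte itself
    case $c in
      [a-zA-Z0-9]) out="$out$c" ;;                        # ASCII alnum: keep
      [[:punct:][:space:][:cntrl:]]) out="$out\\$c" ;;    # ASCII non-alnum: escape
      *) out="$out$c" ;;    # assumed to be bytes >= 0x80: keep as-is
    esac
    rest=$tail
  done
  printf '%s\n' "$out"
)
```

For example, pcre_quote 'a.b$c' prints a\.b\$c. Because it works on bytes, this is only reasonable for charsets such as UTF-8, where ASCII bytes never occur inside a multibyte character.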
Quoting for both ERE and PCRE is a problem in theory because of (?x)
and blanks, since "\ " is unspecified in ERE, but in practice I
don't think any ERE implementation would ever have "\ " as a
special operator. So I think it should be a matter of quoting
only (and not more than):
ASCII [[:space:]]
$^*()+[]{}.?\|
(and again, from a security standpoint at least, that quoting
could be fooled in some locales, like those that use BIG5-HKSCS
or GB18030 as the charset, where the encoding of some characters
contains the encoding of other characters, including ASCII ones).
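
A sketch of quoting only that set, in portable shell (it should also run in zsh). The name ere_pcre_quote is hypothetical. It works byte by byte in the C locale, so it has exactly the weakness mentioned above for charsets like BIG5-HKSCS or GB18030, where an ASCII byte can be part of a multibyte character:

```sh
# Hypothetical helper: escape only ASCII whitespace and $^*()+[]{}.?\|
# The body is a subshell so the LC_ALL change does not leak to the caller.
ere_pcre_quote() (
  LC_ALL=C                  # [[:space:]] then matches ASCII whitespace only
  rest=$1
  out=
  while [ -n "$rest" ]; do
    tail=${rest#?}          # everything after the first byte
    c=${rest%"$tail"}       # the first byte itself
    case $c in
      [[:space:]]|'$'|'^'|'*'|'('|')'|'+'|'['|']'|'{'|'}'|'.'|'?'|'\'|'|')
        out="$out\\$c" ;;   # in the list: put a backslash in front
      *)
        out="$out$c" ;;     # anything else is left alone
    esac
    rest=$tail
  done
  printf '%s\n' "$out"
)
```

For example, ere_pcre_quote '1+1=2?' prints 1\+1=2\?, while x_y-z comes back unchanged. The result can be used as a pattern with something like grep -E -x "$(ere_pcre_quote 'a.b')", which matches the line a.b but not axb.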
--
Stephane