Zsh Mailing List Archive
Re: should we use PCRE2_MATCH_INVALID_UTF ?
- X-seq: zsh-workers 54709
- From: Stephane Chazelas <stephane@xxxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: should we use PCRE2_MATCH_INVALID_UTF ?
- Date: Mon, 8 Jun 2026 21:09:29 +0100
- Archived-at: <https://zsh.org/workers/54709>
- In-reply-to: <aiZp1-R2pI4l-4Wk@chazelas.org>
- List-id: <zsh-workers.zsh.org>
- Mail-followup-to: Zsh hackers list <zsh-workers@xxxxxxx>
- References: <aiZp1-R2pI4l-4Wk@chazelas.org>
2026-06-08 19:33:07 +0100, Stephane Chazelas:
[...]
> PCRE2 has a:
>
> > PCRE2_MATCH_INVALID_UTF Enable support for matching invalid UTF
>
> flag.
>
> That would not make "." match that $'\x80' byte, but would at
> least align the behaviour with that of GNU EREs.
>
> Would it be worth adding? Patch below.
[...]
Argh! "make test" fails with it:
Testing PCRE multibyte with locale en_US.UTF-8
Test ./V07pcre.ztst failed: bad status 1, expected 0 from:
  pcre_compile 'cat(er(pillar)?)?'
  pcre_match -d 'the caterpillar catchment' && print $match
Error output:
(eval):pcre_match:2: error in pcre matching for the caterpillar catchment: PCRE2_MATCH_INVALID_UTF is not supported for DFA matching
Was testing: pcre_match -d
So maybe not an option if we care about "-d"/DFA matching.
--
Stephane