Zsh Mailing List Archive
Re: [PATCH?] Re: [BUG] `$match` is haunting my regex’s trailing, optional, capture
- X-seq: zsh-workers 52405
- From: Oliver Kiddle <opk@xxxxxxx>
- To: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- Cc: chris0e3@xxxxxxxxx, Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: [PATCH?] Re: [BUG] `$match` is haunting my regex’s trailing, optional, capture
- Date: Tue, 12 Dec 2023 00:49:50 +0100
- Archived-at: <https://zsh.org/workers/52405>
- In-reply-to: <CAH+w=7bSrq8p8-LNbn-M-Fkigo1GP3S=5+uXho5zw3bJxXBbBQ@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <A231AE39-13BE-487E-AE31-AF35F2891A8C@gmail.com> <CAH+w=7b8tF16GhZvpcF8urVV-tAAY6DHFRwp=7QNUfA27QxJhA@mail.gmail.com> <CAH+w=7bSrq8p8-LNbn-M-Fkigo1GP3S=5+uXho5zw3bJxXBbBQ@mail.gmail.com>
Bart Schaefer wrote:
> On Fri, Dec 8, 2023 at 10:23 PM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Dec 8, 2023 at 9:14 PM <chris0e3@xxxxxxxxx> wrote:
> >
> > setopt rematch_pcre
> > [[ 'REQUIRE. OPT' =~ 'REQUIRE.(\s*OPT)?' ]] && printf '\tA. ‹%s›\n' $match
> > [[ 'REQUIRE.' =~ 'REQUIRE.(\s*OPT)?' ]] && printf '\tB. ‹%s›\n'
Without rematchpcre and with \s changed to just a space, this will set
match=( '' ), which seems the most logical result to me.
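
The non-PCRE behaviour described above can be reproduced outside zsh with
the system regex library. The following is a minimal sketch in C using the
POSIX <regex.h> interface. It is not zsh's code: the helper name
ere_captures and the fixed buffer sizes are assumptions made for
illustration. It sizes the result by re_nsub, the number of groups in the
pattern, so a trailing optional group that took no part in the match still
produces an empty element.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 8   /* arbitrary cap for this sketch */
#define MAX_LEN    64  /* arbitrary per-capture buffer size */

/*
 * Hypothetical helper: match `subject` against the POSIX extended regex
 * `pattern` and copy capture group i into out[i - 1].  Groups that did
 * not take part in the match (rm_so == -1) become empty strings.
 * Returns the number of groups in the pattern, or -1 on a compile error
 * or a failed match.
 */
static int ere_captures(const char *pattern, const char *subject,
                        char out[][MAX_LEN])
{
    regex_t re;
    regmatch_t pm[MAX_GROUPS + 1];
    int n = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    assert(re.re_nsub <= MAX_GROUPS);

    if (regexec(&re, subject, re.re_nsub + 1, pm, 0) == 0) {
        /* Size the result by groups in the pattern, not groups set. */
        n = (int) re.re_nsub;
        for (int i = 1; i <= n; i++) {
            size_t len = 0;
            if (pm[i].rm_so != -1) {
                len = (size_t) (pm[i].rm_eo - pm[i].rm_so);
                if (len > MAX_LEN - 1)
                    len = MAX_LEN - 1;
                memcpy(out[i - 1], subject + pm[i].rm_so, len);
            }
            out[i - 1][len] = '\0';
        }
    }
    regfree(&re);
    return n;
}
```

With a `char out[MAX_GROUPS][MAX_LEN]` buffer, calling
ere_captures("REQUIRE.( *OPT)?", "REQUIRE.", out) should return 1 and
leave out[0] empty, which corresponds to match=( '' ).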
> Is "unset match" OK here? There doesn't seem to be an obvious way to
> distinguish "there are capture expressions, but none matched anything" from
> "there were no capture expressions". Maybe Oliver has a better clue.
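pcre2_get_ovector_count() gives the number of offset pairs in the match
data. When the match data is created from the pattern, that is one more
than the number of capture expressions the pattern contains, because
pair 0 holds the whole match. The following:
[[ 'REQUIRE.1' =~ 'REQUIRE.(\s*O(P)T)?(1)' ]]
results in match=( '' '' 1 ). So adding empty elements at the end too is
consistent with that. pcre2_match's return status only tells us the
last capture element that was set (it is one more than the
highest-numbered group that was set).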
I didn't find anything in the documentation to confirm that later
elements of the ovector will have been initialised empty but they do
appear to be. If you get garbage instead of empty elements, that'll be
the cause.
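
To make the difference between the two counts concrete, here is a small
self-contained model in C. It does not call PCRE2: the offset arrays are
written out by hand, UNSET is a stand-in for PCRE2_UNSET, and fill_match
is a hypothetical helper rather than zsh's zpcre_get_substrings. It only
illustrates how the limit passed in decides how many elements end up in
the array, on the assumption that unset pairs carry the sentinel value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define UNSET   ((size_t) -1)  /* stand-in for PCRE2_UNSET in this model */
#define MAX_LEN 32             /* arbitrary per-capture buffer size */

/*
 * Hypothetical helper: `ovector` holds (start, end) offset pairs, with
 * pair 0 describing the whole match.  Copies groups 1 .. limit - 1 into
 * out[], turning unset groups into empty strings, and returns the number
 * of elements stored.
 */
static int fill_match(const char *subject, const size_t *ovector,
                      int limit, char out[][MAX_LEN])
{
    int n = 0;

    for (int i = 1; i < limit; i++) {
        size_t start = ovector[2 * i];
        size_t end = ovector[2 * i + 1];
        size_t len = 0;

        if (start != UNSET && end != UNSET && end >= start)
            len = end - start;
        if (len > MAX_LEN - 1)
            len = MAX_LEN - 1;
        if (len > 0)
            memcpy(out[n], subject + start, len);
        out[n][len] = '\0';
        n++;
    }
    return n;
}
```

For 'REQUIRE.' against 'REQUIRE.(\s*OPT)?' the hand-written offsets would
be { 0, 8, UNSET, UNSET }, the match return value 1 and the pair count 2.
Using the return value as the limit stores no elements at all, while
using the pair count stores one empty element.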
Oliver
diff --git a/Src/Modules/pcre.c b/Src/Modules/pcre.c
index e48ae3ae5..a49d1a307 100644
--- a/Src/Modules/pcre.c
+++ b/Src/Modules/pcre.c
@@ -391,6 +391,8 @@ bin_pcre_match(char *nam, char **args, Options ops, UNUSED(int func))
     pcre_mdata = pcre2_match_data_create_from_pattern(pcre_pattern, NULL);
     ret = pcre2_match(pcre_pattern, (PCRE2_SPTR) plaintext, subject_len,
         offset_start, 0, pcre_mdata, mcontext);
+    if (ret > 0)
+        ret = pcre2_get_ovector_count(pcre_mdata);
 }
 
 if (ret==0) return_value = 0;
@@ -479,7 +481,8 @@ cond_pcre_match(char **a, int id)
         break;
     }
     else if (r>0) {
-        zpcre_get_substrings(pcre_pat, lhstr_plain, pcre_mdata, r, svar, avar,
+        uint32_t ovec_count = pcre2_get_ovector_count(pcre_mdata);
+        zpcre_get_substrings(pcre_pat, lhstr_plain, pcre_mdata, ovec_count, svar, avar,
                 ".pcre.match", 0, isset(BASHREMATCH), !isset(BASHREMATCH));
         return_value = 1;
         break;