Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: Issue with ${var#(*_)(#cN,M)}
On Tue, 27 Oct 2015 10:00:34 +0000
Peter Stephenson <p.stephenson@xxxxxxxxxxx> wrote:
> Original problem
> > } ~$ a='1_2_3_4_5_6'
> > } ~$ echo ${a#(*_)(#c2)}
> > } 2_3_4_5_6
>
> On Tue, 20 Oct 2015 16:04:22 -0700
> Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> > What's messing it up is the "*" operator and the backtracking that is
> > implied because * can match anything.
>
> Exactly. What's backtracking over what in what order here is a bit of a
> nightmare, and I'm not sure I'm likely to get my mind round it.
>
> Unless someone does, you'll be better off sticking to
>
> % a='1_2_3_4_5_6'
> % echo ${a#([^_]#_)(#c2)}
> 3_4_5_6
>
> and then we don't have the "*" within the group to worry about.
Indeed, I've just noticed that with
% egrep --version
egrep (GNU grep) 2.8
the following:
% egrep '^(*_){2}$' <<<'1_2_'
fails to match completely, i.e. the backtracking is too complicated
to handle, whereas
% egrep '^([^_]+_){2}$' <<<'1_2_'
succeeds. At this point, I'm going to document the difficulty and
slowly retreat backwards from the dark corner.
pws
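
For comparison, here is a minimal sketch of the regex side, assuming a grep that supports -E. Two caveats apply to the egrep test above: the regex spelling of the glob wildcard `*` is `.*`, and POSIX leaves the meaning of a `*` placed directly after `(` undefined, so that test may not be exercising backtracking at all. The helper name `matches` and the variable names are illustrative only.

```shell
#!/bin/sh
# Input from the egrep examples above: two "<char>_" groups.
input='1_2_'

# Succeeds if the extended regular expression in $1 matches $input.
matches() {
    printf '%s\n' "$input" | grep -Eq "$1"
}

# Restricted group, as in the second egrep test above.
if matches '^([^_]+_){2}$'; then restricted=yes; else restricted=no; fi

# Regex equivalent of the glob wildcard: ".*" rather than a bare "*".
if matches '^(.*_){2}$'; then wildcard=yes; else wildcard=no; fi

echo "restricted=$restricted wildcard=$wildcard"
```

Both forms should report a match here. That only describes the regex engine in grep; it says nothing about zsh's own pattern matcher, for which the advice in the patch below still stands.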
diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 5ea8610..49a0f0d 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2192,6 +2192,16 @@ inclusive. The form tt(LPAR()#c)var(N)tt(RPAR()) requires exactly tt(N)
matches; tt(LPAR()#c,)var(M)tt(RPAR()) is equivalent to specifying var(N)
as 0; tt(LPAR()#c)var(N)tt(,RPAR()) specifies that there is no maximum
limit on the number of matches.
+
+Note that if the previous group of characters contains wildcards,
+results can be unpredictable to the point of being logically incorrect.
+It is recommended that the pattern be trimmed to match the minimum
+possible. For example, to match a string of the form `tt(1_2_3_)', use
+a pattern of the form `tt(LPAR()[[:digit:]]##_+RPAR()LPAR()#c3+RPAR())', not
+`tt(LPAR()*_+RPAR()LPAR()#c3+RPAR())'. This arises from the
+complicated interaction between attempts to match a number of
+repetitions of the whole pattern and attempts to match the wildcard
+`tt(*)'.
)
vindex(MATCH)
vindex(MBEGIN)