Zsh Mailing List Archive

Re: cp file with a filter



On Fri, 15 Jan 2010 17:17:59 +0100
"Yuri D'Elia" <wavexx@xxxxxxxxxxxx> wrote:
> Is there a way to rewrite the number without having to type again?
> As in re-executing the last command after a regular expression
> substitution?

You can do it with a zsh-style pattern substitution by some variant of
the following (there are lots of possible ways of changing this):

  subprev() {
    print -z ${history[$(($HISTCMD-1))]//${~1}/$2}
  }

  % print This line contains foo
  This line contains foo
  % subprev 'f?o' bar
  % print This line contains bar

Note I deliberately made this bring the line up for verification, so you
have to hit Enter on the new line.  This seems to me a very sensible
precaution with patterns, but there are ways round if you feel
particularly gung ho.

Hitting up-arrow and using the widget function replace-string with the
name replace-pattern is a more interactive alternative.
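
A minimal sketch of that setup, assuming the contributed function
replace-string is on your $fpath (as in a standard installation); the key
binding here is an arbitrary assumption, not something the function
requires:

```shell
autoload -Uz replace-string
# Installing the same function under the name replace-pattern makes it
# treat the text to be replaced as a zsh pattern rather than a literal string.
zle -N replace-pattern replace-string
# Assumed binding (Esc followed by %); any unused key sequence will do.
bindkey '\e%' replace-pattern
```

With that in ~/.zshrc, hit up-arrow to recall the line, invoke the widget,
and answer the prompts for the pattern and its replacement.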

If you're dead set on regular expressions you can use the
[[ ... =~ ... ]] syntax.  Unfortunately I've just noticed this is a bit
broken for substitutions: although it sets the variable MATCH, it doesn't
set the variables MBEGIN and MEND.  That is both annoyingly inconsistent
with variable substitution and makes it hard to decide which bit of the
line you're replacing.  The patch below fixes that omission.  It's an
exercise for the reader to use this to replace the part of the history line
from $MBEGIN to $MEND.

Comments on the patch should go to zsh-workers.

Index: Doc/Zsh/cond.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/cond.yo,v
retrieving revision 1.6
diff -u -r1.6 cond.yo
--- Doc/Zsh/cond.yo	15 Jan 2009 09:49:06 -0000	1.6
+++ Doc/Zsh/cond.yo	17 Jan 2010 20:46:27 -0000
@@ -117,13 +117,28 @@
 extended regular expression using the tt(zsh/regex) module.
 Upon successful match, some variables will be updated; no variables
 are changed if the matching fails.
+
+If the option tt(BASH_REMATCH) is not set the scalar parameter
+tt(MATCH) is set to the substring that matched the pattern and
+the integer parameters tt(MBEGIN) and tt(MEND) to the index of the start
+and end, respectively, of the match in var(string), such that if
+var(string) is contained in variable tt(var) the expression
+`${var[$MBEGIN,$MEND]}' is identical to `$MATCH'.  The setting
+of the option tt(KSH_ARRAYS) is respected.  Likewise, the array
+tt(match) is set to the substrings that matched parenthesised
+subexpressions and the arrays tt(mbegin) and tt(mend) to the indices of
+the start and end positions, respectively, of the substrings within
+var(string).  For example, if the string `tt(a short string)' is matched
+against the regular expression `tt(s(...)t)', then (assuming the option
+tt(KSH_ARRAYS) is not set) tt(MATCH), tt(MBEGIN)
+and tt(MEND) are `tt(short)', 3 and 7, respectively, while tt(match),
+tt(mbegin) and tt(mend) are single entry arrays containing
+the strings `tt(hor)', `tt(4)' and `tt(6)', respectively.
+
 If the option tt(BASH_REMATCH) is set the array
 tt(BASH_REMATCH) is set to the substring that matched the pattern
 followed by the substrings that matched parenthesised
-subexpressions within the pattern; otherwise, the scalar parameter
-tt(MATCH) is set to the substring that matched the pattern and
-and the array tt(match) to the substrings that matched parenthesised
-subexpressions.
+subexpressions within the pattern.
 )
 item(var(string1) tt(<) var(string2))(
 true if var(string1) comes before var(string2)
Index: Src/Modules/regex.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Modules/regex.c,v
retrieving revision 1.5
diff -u -r1.5 regex.c
--- Src/Modules/regex.c	19 Jan 2009 08:26:21 -0000	1.5
+++ Src/Modules/regex.c	17 Jan 2010 20:46:27 -0000
@@ -108,11 +108,65 @@
 	    if (isset(BASHREMATCH)) {
 		setaparam("BASH_REMATCH", arr);
 	    } else {
+		zlong offs;
+		char *ptr;
+
 		m = matches;
 		s = ztrduppfx(lhstr + m->rm_so, m->rm_eo - m->rm_so);
 		setsparam("MATCH", s);
-		if (nelem)
+		/*
+		 * Count the characters before the match.
+		 */
+		ptr = lhstr;
+		offs = 0;
+		MB_METACHARINIT();
+		while (ptr < lhstr + m->rm_so) {
+		    offs++;
+		    ptr += MB_METACHARLEN(ptr);
+		}
+		setiparam("MBEGIN", offs + !isset(KSHARRAYS));
+		/*
+		 * Add on the characters in the match.
+		 */
+		while (ptr < lhstr + m->rm_eo) {
+		    offs++;
+		    ptr += MB_METACHARLEN(ptr);
+		}
+		setiparam("MEND", offs + !isset(KSHARRAYS) - 1);
+		if (nelem) {
+		    char **mbegin, **mend, **bptr, **eptr;
+		    bptr = mbegin = (char **)zalloc((nelem+1) * sizeof(char *));
+		    eptr = mend = (char **)zalloc((nelem+1) * sizeof(char *));
+
+		    for (m = matches + start, n = start;
+			 n <= (int)re.re_nsub;
+			 ++n, ++m, ++bptr, ++eptr)
+		    {
+			char buf[DIGBUFSIZE];
+			ptr = lhstr;
+			offs = 0;
+			/* Find the start offset */
+			MB_METACHARINIT();
+			while (ptr < lhstr + m->rm_so) {
+			    offs++;
+			    ptr += MB_METACHARLEN(ptr);
+			}
+			convbase(buf, offs + !isset(KSHARRAYS), 10);
+			*bptr = ztrdup(buf);
+			/* Continue to the end offset */
+			while (ptr < lhstr + m->rm_eo) {
+			    offs++;
+			    ptr += MB_METACHARLEN(ptr);
+			}
+			convbase(buf, offs + !isset(KSHARRAYS) - 1, 10);
+			*eptr = ztrdup(buf);
+		    }
+		    *bptr = *eptr = NULL;
+
 		    setaparam("match", arr);
+		    setaparam("mbegin", mbegin);
+		    setaparam("mend", mend);
+		}
 	    }
 	}
 	else
Index: Test/C02cond.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/C02cond.ztst,v
retrieving revision 1.23
diff -u -r1.23 C02cond.ztst
--- Test/C02cond.ztst	26 Nov 2008 10:50:07 -0000	1.23
+++ Test/C02cond.ztst	17 Jan 2010 20:46:27 -0000
@@ -251,6 +251,39 @@
   fi
 0:regex tests shouldn't crash
 
+  if zmodload -i zsh/regex 2>/dev/null; then
+    string="this has stuff in it"
+    bad_regex=0
+    if [[ $string =~ "h([a-z]*) s([a-z]*) " ]]; then
+      if [[ "$MATCH $MBEGIN $MEND" != "has stuff  6 15" ]]; then
+	print -r "regex variables MATCH MBEGIN MEND:
+  '$MATCH $MBEGIN $MEND'
+  should be:
+  'has stuff  6 15'" >&2
+        bad_regex=1
+      else
+	results=("as 7 8" "tuff 11 14")
+	for i in 1 2; do
+	  if [[ "$match[$i] $mbegin[$i] $mend[$i]" != $results[i] ]]; then
+	    print -r "regex variables match[$i] mbegin[$i] mend[$i]:
+  '$match[$i] $mbegin[$i] $mend[$i]'
+  should be
+  '$results[$i]'" >&2
+	    break
+	  fi
+	done
+      fi
+    else
+      print -r "regex failed to match '$string'" >&2
+    fi
+    (( bad_regex )) || print OK
+  else
+    # if it didn't load, tough, but not a test error
+    print OK
+  fi
+0:MATCH, MBEGIN, MEND, match, mbegin, mend
+>OK
+
 %clean
   # This works around a bug in rm -f in some versions of Cygwin
   chmod 644 unmodish
-- 
Peter Stephenson <p.w.stephenson@xxxxxxxxxxxx>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


