Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

[PATCH] hist: fix :h and :t inconsistencies



i noticed some inconsistencies with :h and :t

  % echo Doc///(:h1)
  Doc///

it seems to me that trailing slashes should be omitted since they're not
path components nor necessary to distinguish them

as i was changing that i also discovered the // exception, discussed in
workers/13273 etc:

  % echo /bin(:h) ///bin(:h) ////bin(:h) but //bin(:h)
  / / / but //

granted posix leaves it up to the implementation as to whether to do
this, but both macos and glibc dirname({1,3}) return / in all cases,
and even 25 years ago (!) it didn't seem like anyone had actually
encountered a platform that treated // specially, other than cygwin. so
i propose that only cygwin should get this behaviour

at first glance it seems like :t has a similar issue to the :h1 example,
in that ///bin(:t2) gives ///bin instead of /bin. but i think this makes
sense because the /s there are acting as separators between bin and an
invisible root component, and these functions never collapse separators
between two components

however, i noticed that /(:t) and /(:tn) with any n give an empty
string, which i don't think is right. posix says basename({1,3}) should
return / for /, and again if we think of it like there's an invisible
root component you can't go past it feels like that makes sense

some other languages, like python and php, disagree with this. but the
documentation for :t specifically says it should work like basename, so?

as far as the empty-string case -- ${:t} -- posix says basename(3) must
return '.' but basename(1) can return either '.' or ''. we return ''. i
think when our documentation refers to basename it probably means
basename(1) so i guess that's fine

these changes are three separate patches below, tell me if you don't
like any of them

(and again unless i'm advised otherwise i'm holding back changes to
existing functionality until after 5.10)

dana


diff --git a/Src/hist.c b/Src/hist.c
index 00bdbb2b8..162253d6d 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2091,7 +2091,11 @@ remtpath(char **junkptr, int count)
 	    ++strp;
 	}
 
-	/* Full string needed */
+	/* All components needed, but leave off any trailing slashes */
+	while (*strp && !IS_DIRSEP(*strp))
+	    ++strp;
+	*strp = '\0';
+
 	return 1;
     }
 
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index ed25fd7a9..a8e6fbd18 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -2692,6 +2692,16 @@ F:behavior, see http://austingroupbugs.net/view.php?id=888
 >removed
 >are/removed
 
+  for p in a/ a/b/ a/b/c/ a/b//// a//b////; do
+    print ${p:h} ${p:h1} ${p:h2} ${p:h3}
+  done
+0:Modifier :h suppresses trailing slashes
+>. a a a
+>a a a/b a/b
+>a/b a a/b a/b/c
+>a a a/b a/b
+>a a a//b a//b
+
  foo=global-value
  fn() {
     local foo=function-value


diff --git a/Src/hist.c b/Src/hist.c
index 162253d6d..cb902901b 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2106,10 +2106,12 @@ remtpath(char **junkptr, int count)
     if (str == *junkptr) {
 	++str;
 	/* Leading doubled slashes (`//') have a special meaning on cygwin
-	   and some old flavor of UNIX, so we do not assimilate them to
-	   a single slash.  However a greater number is ok to squeeze. */
+	   (and some old flavor of UNIX), so we do not assimilate them to
+	   a single slash there.  However a greater number is ok to squeeze. */
+#ifdef __CYGWIN__
 	if (IS_DIRSEP(*str) && !IS_DIRSEP(str[1]))
 	    ++str;
+#endif
     }
     *str = '\0';
     return 1;
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index a8e6fbd18..baae23bf6 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -2702,6 +2702,26 @@ F:behavior, see http://austingroupbugs.net/view.php?id=888
 >a a a/b a/b
 >a a a//b a//b
 
+  # We deliberately omit the //a case here -- see below
+  for p in /a ///a ////a ////a/b; do
+    print ${p:h} ${p:h1} ${p:h2} ${p:h3}
+  done
+0:Modifier :h collapses leading slashes (usually)
+>/ / /a /a
+>/ / ///a ///a
+>/ / ////a ////a
+>////a / ////a ////a/b
+
+  # Any test that uses two leading slashes like this will have
+  # platform-dependent behaviour. So probably best to just not use them
+  # anywhere else
+  for p in //a //a/b; do
+    print ${(M)OSTYPE:#cygwin*} ${p:h} ${p:h1} ${p:h2} ${p:h3}
+  done
+0:Modifier :h leaves exactly two leading slashes only on Cygwin
+*>(cygwin* // // //a //a|/ / //a //a)
+*>(cygwin* //a // //a //a/b|//a / //a //a/b)
+
  foo=global-value
  fn() {
     local foo=function-value


diff --git a/Src/hist.c b/Src/hist.c
index cb902901b..5bf453ea1 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2155,10 +2155,13 @@ remlpaths(char **junkptr, int count)
 
     if (IS_DIRSEP(*str)) {
 	/* remove trailing slashes */
-	while (str >= *junkptr && IS_DIRSEP(*str))
+	while (str > *junkptr && IS_DIRSEP(*str))
 	    --str;
 	str[1] = '\0';
     }
+    /* If we're at the root, we're finished */
+    if (IS_DIRSEP(*str))
+	return 1;
     for (;;) {
 	for (; str >= *junkptr; --str) {
 	    if (IS_DIRSEP(*str)) {
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index baae23bf6..265b7a8a5 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -2722,6 +2722,16 @@ F:behavior, see http://austingroupbugs.net/view.php?id=888
 *>(cygwin* // // //a //a|/ / //a //a)
 *>(cygwin* //a // //a //a/b|//a / //a //a/b)
 
+  for p in / /// //// /a ///a; do
+    print ${p:t} ${p:t1} ${p:t2} ${p:t3}
+  done
+0:Modifier :t returns collapsed / for root result
+>/ / / /
+>/ / / /
+>/ / / /
+>a a /a /a
+>a a ///a ///a
+
  foo=global-value
  fn() {
     local foo=function-value




Messages sorted by: Reverse Date, Date, Thread, Author