Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: A solution to fix hidden references in reference chains



Here is an updated version, now built on top of workers/54261, to which I have added tests for the various corner cases.

Fix the handling of dangling references so that it works for all of them.

Philippe


On Mon, Mar 23, 2026 at 11:21 PM Philippe Altherr <philippe.altherr@xxxxxxxxx> wrote:
I don't recall all the details (again, other readers?) but my
recollection is that there's a POSIXy issue here.  E.g., an exported
parameter remains exported even if unset (so re-assigning it updates
the environment), and I believe numeric types are supposed to retain
their properties across unset.  This is why I wonder whether
references ought also to retain reference-ness across unset.

I have always considered it a bug that references remain references across unset, because in Zsh only local references exhibit this behavior (global ones do not) and no other type (integer, float, array, ...) is retained across unset. I have now confirmed that no type is retained across unset in either ksh or bash. See details below. Given all that, I think that workers/54236 is correct.

Regarding exported parameters, references can't be exported; "typeset -n -x ref=var" raises an error. Therefore, I don't think we need to look at what happens with exported parameters. That being said, even with exported parameters I couldn't find a case where a type is retained across unset.

Below are functions and their output that I have used to confirm that no types are retained across unset. In each output, I have highlighted in red the result of "typeset -p ..." after the parameters were resurrected.

Zsh

The following function makes it possible to test all parameter types for local (zsh-test-unset), global (G=-g zsh-test-unset), exported global (G=-g X=-x zsh-test-unset) and exported local (G=+g X=-x zsh-test-unset) parameters:

zsh-test-unset() {
  typeset $G $X    s=foo
  typeset $G $X -i i=42
  typeset $G $X -E E=4.2
  typeset $G $X -F F=4.2
  typeset $G $X -a a=(foo1 foo2)
  typeset $G $X -A A=([fooK]=fooV)
  typeset $G    -n n=foo

  PS4="# "; set -x
  typeset -p s i E F a A n
  unset   -n s i E F a A n
  typeset -p s i E F a A n
  typeset $G s i E F a A n
  typeset -p s i E F a A n
}


Here is the output for local parameters:

$ zsh-test-unset
# typeset -p s i E F a A n
# typeset -p s i E F a A n
typeset s=foo
typeset -i i=42
typeset -E E=4.200000000e+00
typeset -F F=4.2000000000
typeset -a a=( foo1 foo2 )
typeset -A A=( [fooK]=fooV )
typeset -n n=foo
# unset -n s i E F a A n
# typeset -p s i E F a A n
# typeset s i E F a A n
n=''
# typeset -p s i E F a A n
typeset s=''
typeset i=''
typeset E=''
typeset F=''
typeset a=''
typeset A=''
typeset -n n=''

Only the reference type is retained across unset.

Ksh (Version AJM 93u+m/1.0.10 2024-08-01)

Here is a similar function for ksh. Note that with ksh, "typeset var" doesn't create/resurrect any parameter; a type and/or a value has to be provided to trigger the creation/resurrection of a parameter.

function ksh_test_unset {
  typeset $G $X    s=foo
  typeset $G $X -i i=42
  typeset $G $X -E E=4.2
  typeset $G $X -F F=4.2
  typeset $G $X -a a=(foo1 foo2)
  typeset $G $X -A A=([fooK]=fooV)
  typeset $G    -n n=foo

  PS4="# "; set -x
  typeset -p s i E F a A n
  unset   -n s i E F a A n
  typeset -p s i E F a A n
  typeset $G s i E F a A n
  typeset -p s i E F a A n
  typeset $G s=bar i=bar E=bar F=bar a=bar A=bar n=bar
  typeset -p s i E F a A n
}


Here is the output for local parameters:

$ ksh_test_unset
# typeset -p s i E F a A n
s=foo
typeset -i i=42
typeset -E E=4.2
typeset -F F=4.2000000000
typeset -a a=(foo1 foo2)
typeset -A A=([fooK]=fooV)
typeset -n n=foo
# unset -n s i E F a A n
# typeset -p s i E F a A n
# typeset s i E F a A n
# typeset -p s i E F a A n
# s=bar
# i=bar
# E=bar
# F=bar
# a=bar
# A=bar
# n=bar
# typeset s i E F a A n
# typeset -p s i E F a A n
s=bar
i=bar
E=bar
F=bar
a=bar
A=bar
n=bar


No type is retained across unset.

Bash (5.2.21(1)-release)

Here is the function for bash, which doesn't have support for float parameters:

bash-test-unset() {
  typeset $G $X    s=foo
  typeset $G $X -i i=42
  typeset $G $X -a a=(foo1 foo2)
  typeset $G $X -A A=([fooK]=fooV)
  typeset $G $X -n n=foo

  PS4="# "; set -x
  typeset -p s i a A n
  unset      s i a A
  unset   -n         n
  typeset -p s i a A n
  typeset $G s i a A n
  typeset -p s i a A n
}


Here is the output for local parameters:

$ bash-test-unset
# typeset -p s i a A n
declare -- s="foo"
declare -i i="42"
declare -a a=([0]="foo1" [1]="foo2")
declare -A A=([fooK]="fooV" )
declare -n n="foo"
# unset s i a A
# unset -n n
# typeset -p s i a A n
declare -- s
declare -- i
declare -- a
declare -- A
declare -- n
# typeset s i a A n
# typeset -p s i a A n
declare -- s
declare -- i
declare -- a
declare -- A
declare -- n


No type is retained across unset.

Philippe
 

On Mon, Mar 23, 2026 at 3:21 AM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
On Sun, Mar 22, 2026 at 6:15 PM Philippe Altherr
<philippe.altherr@xxxxxxxxx> wrote:
>>
>> But you can't change a named reference into something else that way.
>> Even "unset -n" doesn't remove the nameref-ness of the surrounding
>> scope parameter.
>
> Yep but I was working on a fix for this issue. I have now sent the patch, see workers/54236.

This is exactly the thing I meant was a conflict.  I'm not convinced
workers/54236 is correct in this regard.

>> > - Ref added to a scope list is now an integer with base=N
>>
>> Per above, I think this is actually impossible, at least at present?
>
> Once workers/54236 is committed all of that should become possible.

Again, should it?  (Asking others, not Philippe.)  Related:

> - In workers/54236, I had to add many checks for PM_UNSET and PM_DECLARED. Most of these would not be needed if whenever a local variable is unset all its flags were cleared and replaced with PM_UNSET. This would guarantee that if a parameter has a type flag (one of PM_ARRAY, PM_INTEGER, PM_NAMEREF, ...) then it's for sure a still alive parameter of that type. Currently, you should always check for the presence of the type flag and the absence of PM_UNSET or the presence of PM_DECLARED, which is rather verbose and very error prone. Do you see any reason why flags should NOT be cleared when a parameter is unset? Afaik, there exists no mechanism that allows undeleting an unset parameter. So I don't see any reason why flags would need to be kept after a parameter is unset.

I don't recall all the details (again, other readers?) but my
recollection is that there's a POSIXy issue here.  E.g., an exported
parameter remains exported even if unset (so re-assigning it updates
the environment), and I believe numeric types are supposed to retain
their properties across unset.  This is why I wonder whether
references ought also to retain reference-ness across unset.
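
To make the two positions in this exchange easier to compare, here is a minimal standalone sketch, not zsh code. The SK_* flags, struct sk_param and all function names are hypothetical stand-ins; zsh's real PM_* flags have different values and more complicated semantics. The liveness check follows the wording of the quoted paragraph (type flag present, and either not unset or declared), which is only one reading of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; not zsh's PM_* definitions. */
enum {
    SK_INTEGER  = 1 << 0,
    SK_NAMEREF  = 1 << 1,
    SK_UNSET    = 1 << 2,
    SK_DECLARED = 1 << 3
};

/* Hypothetical stand-in for a parameter; only the flags matter here. */
struct sk_param {
    int flags;
};

/* Verbose check needed when type flags may survive an unset:
 * the type flag alone does not prove the parameter is still alive. */
static bool sk_is_live_nameref(const struct sk_param *pm)
{
    assert(pm != NULL);
    return (pm->flags & SK_NAMEREF) &&
           (!(pm->flags & SK_UNSET) || (pm->flags & SK_DECLARED));
}

/* Policy A: unset marks the parameter but keeps its type flags. */
static void sk_unset_keep_type(struct sk_param *pm)
{
    assert(pm != NULL);
    pm->flags = (pm->flags & ~SK_DECLARED) | SK_UNSET;
}

/* Policy B: unset clears every flag and leaves only SK_UNSET, so a
 * type flag on its own would imply a live parameter of that type. */
static void sk_unset_clear_all(struct sk_param *pm)
{
    assert(pm != NULL);
    pm->flags = SK_UNSET;
}
```

Under policy A a caller has to go through sk_is_live_nameref(); under policy B testing SK_NAMEREF alone would be enough. Whether policy B is acceptable is exactly the open question about attributes such as export surviving an unset.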

> - I wonder whether we would be better served if PM_DECLARED was replaced with a PM_NULL

We had that argument at some length before PM_DECLARED was introduced,
and decided against PM_NULL, and I'm not excited about rehashing the
topic.

> - Bart, you once suggested that dereferencing not-yet-initialized references ought to trigger an error. Currently such references need to be handled in several places and, when they are part of assignments, they trigger various kinds of errors/warnings that may not necessarily make much sense to end users. My impression is that things could be simpler and more uniform if dereferencing a not-yet-initialized reference would always trigger an error. I will try to write a patch that does that,

There, we need to consider what happens with ksh.
diff --git a/Src/params.c b/Src/params.c
index afc67eb14..4f5454abb 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -485,6 +485,24 @@ static initparam argvparam_pm = IPDEF9("", &pparams, NULL, \
 
 static Param argvparam;
 
+/*
+ * Lists of references to nested variables ("Param" instances) indexed
+ * by scope. Whenever the "base" scope of a named reference is set to
+ * refer to a variable more deeply nested than the reference itself
+ * ("base > level"), the "base" scope has to be updated once the
+ * "base" scope ends. The "scoperefs" lists keep track of these
+ * references. Since "Param" instances get reused when variables with
+ * the same name are redefined in the same scope, listed "Param"
+ * instances may no longer be references when the scope ends or may
+ * refer to a different "base" scope. A given "Param" instance may
+ * also be included in multiple lists at the same time or multiple
+ * times in the same list. None of that is harmful as long as only
+ * instances that are still references referring to the ending scope
+ * are updated when the scope ends.
+ */
+static LinkList *scoperefs = NULL;
+static int scoperefs_num = 0;
+
 /* "parameter table" - hash table containing the parameters
  *
  * realparamtab always points to the shell's global table.  paramtab is sometimes
@@ -5855,6 +5873,7 @@ static int lc_update_needed;
 mod_export void
 endparamscope(void)
 {
+    LinkList refs = locallevel < scoperefs_num ? scoperefs[locallevel] : NULL;
     queue_signals();
     locallevel--;
     /* This pops anything from a higher locallevel */
@@ -5882,6 +5901,13 @@ endparamscope(void)
 	clear_mbstate();    /* LC_CTYPE may have changed */
     }
 #endif /* USE_LOCALE */
+    /* Reset scope of namerefs that refer to dead variables */
+    for (Param pm; refs && (pm = (Param)getlinknode(refs));) {
+	if ((pm->node.flags & PM_NAMEREF) && !(pm->node.flags & PM_UNSET) &&
+	    !(pm->node.flags & PM_UPPER) && pm->base > locallevel) {
+	    setscope_base(pm, locallevel);
+	}
+    }
     unqueue_signals();
 }
 
@@ -5890,9 +5916,7 @@ static void
 scanendscope(HashNode hn, UNUSED(int flags))
 {
     Param pm = (Param)hn;
-    Param hidden = NULL;
     if (pm->level > locallevel) {
-	hidden = pm->old;
 	if ((pm->node.flags & (PM_SPECIAL|PM_REMOVABLE)) == PM_SPECIAL) {
 	    /*
 	     * Removable specials are normal in that they can be removed
@@ -5956,14 +5980,6 @@ scanendscope(HashNode hn, UNUSED(int flags))
 		export_param(pm);
 	} else
 	    unsetparam_pm(pm, 0, 0);
-	pm = NULL;
-    }
-    if (hidden)
-	pm = hidden;
-    if (pm && (pm->node.flags & PM_NAMEREF) &&
-	       pm->base >= pm->level && pm->base >= locallevel) {
-	/* Should never get here for a -u reference */
-	pm->base = locallevel;
     }
 }
 
@@ -6405,7 +6421,7 @@ setscope(Param pm)
 	    (basepm = (Param)gethashnode2(realparamtab, refname)) &&
 	    (basepm = (Param)loadparamnode(realparamtab, basepm, refname)) &&
 	    (basepm != pm || !basepm->old || (basepm = basepm->old))) {
-	    pm->base = basepm->level;
+	    setscope_base(pm, basepm->level);
 	}
 	if (pm->base > pm->level) {
 	    if (EMULATION(EMULATE_KSH)) {
@@ -6431,6 +6447,25 @@ setscope(Param pm)
     unqueue_signals();
 }
 
+/**/
+static void
+setscope_base(Param pm, int base)
+{
+    if ((pm->base = base) > pm->level) {
+	LinkList refs;
+	if (base >= scoperefs_num) {
+	    int old_num = scoperefs_num;
+	    int new_num = scoperefs_num = MAX(2 * base, 8);
+	    scoperefs = zrealloc(scoperefs, new_num * sizeof(refs));
+	    memset(scoperefs + old_num, 0, (new_num - old_num) * sizeof(refs));
+	}
+	refs = scoperefs[base];
+	if (!refs)
+	    refs = scoperefs[base] = znewlinklist();
+	zpushnode(refs, pm);
+    }
+}
+
 /**/
 static Param
 upscope(Param pm, const Param ref)
diff --git a/Test/K01nameref.ztst b/Test/K01nameref.ztst
index fb27e7261..82ccbfb89 100644
--- a/Test/K01nameref.ztst
+++ b/Test/K01nameref.ztst
@@ -1247,8 +1247,8 @@ F:previously this could create an infinite recursion and crash
 0:Transitive references with scoping changes
 >f4: ref1=f4 ref2=XX ref3=f4
 >f3: ref1=f3 ref2=XX ref3=f3
->g5: ref1=f3 ref2=XX ref3=g4
->g4: ref1=f3 ref2=XX ref3=g4
+>g5: ref1=f3 ref2=XX ref3=f3
+>g4: ref1=f3 ref2=XX ref3=f3
 >f3: ref1=f3 ref2=XX ref3=f3
 >f2: ref1=f1 ref2=XX ref3=f1
 >f1: ref1=f1 ref2=f1 ref3=f1
@@ -1885,4 +1885,87 @@ F:converting from association/array to string should work here too
 ># d:reference to not-yet-defined - local - ref1
 >typeset -i var=42
 
+ typeset -n ref1
+ typeset -n ref2
+ typeset -n ref3=ref2
+ typeset var=aaa
+ () {
+   typeset -i ref2=123 # Hides the reference ref2 in this scope and nested scopes
+   typeset var=bbb
+   () {
+     typeset var=ccc
+     ref1=var
+     ref3=var # From now on ref1 and ref3 should always refer to the same variable
+     echo A:ref1=$ref1 ref2=$ref2 ref3=$ref3
+   } # Both top-level references ref1 and ref2 should be rebound
+   echo B:ref1=$ref1 ref2=$ref2 ref3=$ref3
+   () {
+     typeset var=ddd # No reference should refer to this variable
+     echo C:ref1=$ref1 ref2=$ref2 ref3=$ref3
+   }
+   echo D:ref1=$ref1 ref2=$ref2 ref3=$ref3
+ } # Both top-level references ref1 and ref2 should be rebound
+ echo E:ref1=$ref1 ref2=$ref2 ref3=$ref3
+ () {
+   typeset var=eee # No reference should refer to this variable
+   echo F:ref1=$ref1 ref2=$ref2 ref3=$ref3
+ }
+ echo G:ref1=$ref1 ref2=$ref2 ref3=$ref3
+0:hidden reference refers to a nested variable
+>A:ref1=ccc ref2=123 ref3=ccc
+>B:ref1=bbb ref2=123 ref3=bbb
+>C:ref1=bbb ref2=123 ref3=bbb
+>D:ref1=bbb ref2=123 ref3=bbb
+>E:ref1=aaa ref2=aaa ref3=aaa
+>F:ref1=aaa ref2=aaa ref3=aaa
+>G:ref1=aaa ref2=aaa ref3=aaa
+
+ typeset ref
+ typeset var1=var1@scope1
+ typeset var2=var2@scope1
+ () { # enter scope 2
+   typeset var1=var1@scope2
+   typeset var2=var2@scope2
+   typeset -g -n ref=var1; echo A:$ref # ref added to scope 2
+   typeset -g -n ref=var2; echo B:$ref # ref added to scope 2
+   () { # enter scope 3
+     typeset var1=var1@scope3
+     typeset -g -n ref=var1; echo C:$ref # ref added to scope 3
+     () { # enter scope 4
+       typeset var1=var1@scope4
+       typeset var2=var2@scope4
+       typeset -g -n ref=var1; echo D:$ref # ref added to scope 4
+       typeset -g -n ref=var2; echo E:$ref # ref added to scope 4
+     } # leave scope 4: ref rebound to var2 in scope 2 and added to scope 2
+     echo F:$ref
+   } # leave scope 3: ref remains bound to var2 in scope 2
+   echo G:$ref
+   unset -n ref # ref is unset
+   echo H:$ref
+ } # leave scope 2: ref remains unset
+ echo I:$ref
+0:reference refers successively to multiple variables in multiple nested scopes
+>A:var1@scope2
+>B:var2@scope2
+>C:var1@scope3
+>D:var1@scope4
+>E:var2@scope4
+>F:var2@scope2
+>G:var2@scope2
+>H:
+>I:
+
+ typeset ref
+ () { # enter scope 2
+   typeset var=var
+   typeset -g -n ref=var; echo A:$ref # ref added to scope 2
+   unset -n ref
+   typeset -g -i16 ref=255; echo B:$ref # ref becomes an integer in base 16
+ } # leave scope 2: ref remains an integer in base 16
+ echo C:$ref
+0:reference referring to a nested variable becomes an integer
+>A:var
+>B:16#FF
+>C:16#FF
+
 %clean
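
As a reading aid for the Src/params.c part of the patch above, here is a simplified standalone sketch of the per-scope bookkeeping it describes. It is not the patch itself: struct sk_ref, struct sk_node and the sk_* functions are hypothetical stand-ins for Param, LinkList, setscope_base() and the loop added to endparamscope(). The is_ref field stands in for "PM_NAMEREF set and PM_UNSET clear", the PM_UPPER check is omitted, and plain malloc/realloc replace zsh's own allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a named reference parameter. */
struct sk_ref {
    int level;   /* scope in which the reference itself lives */
    int base;    /* scope of the variable it refers to */
    int is_ref;  /* nonzero while it is still a live reference */
};

struct sk_node {
    struct sk_ref *ref;
    struct sk_node *next;
};

static struct sk_node **sk_scoperefs = NULL; /* one list head per scope */
static int sk_scoperefs_num = 0;

/* Set ref->base. If the target is more deeply nested than the reference,
 * record the reference in the list of the target scope.
 * Returns 0 on success, -1 if memory allocation fails. */
static int sk_set_base(struct sk_ref *ref, int base)
{
    struct sk_node *node;

    assert(ref != NULL && base >= 0);
    ref->base = base;
    if (base <= ref->level)
        return 0;
    if (base >= sk_scoperefs_num) {
        int new_num = 2 * base > 8 ? 2 * base : 8;
        struct sk_node **grown =
            realloc(sk_scoperefs, (size_t)new_num * sizeof(*grown));
        if (!grown)
            return -1;
        /* New list heads start out empty. */
        memset(grown + sk_scoperefs_num, 0,
               (size_t)(new_num - sk_scoperefs_num) * sizeof(*grown));
        sk_scoperefs = grown;
        sk_scoperefs_num = new_num;
    }
    node = malloc(sizeof(*node));
    if (!node)
        return -1;
    node->ref = ref;
    node->next = sk_scoperefs[base];
    sk_scoperefs[base] = node;
    return 0;
}

/* Scope "ending" ends and "ending - 1" becomes the current scope. */
static void sk_end_scope(int ending)
{
    struct sk_node *node;
    int now = ending - 1;

    if (ending <= 0 || ending >= sk_scoperefs_num)
        return;
    node = sk_scoperefs[ending];
    sk_scoperefs[ending] = NULL;
    while (node) {
        struct sk_node *next = node->next;
        struct sk_ref *ref = node->ref;
        free(node);
        /* Stale entries (no longer a reference, or already rebound to
         * an outer scope) are skipped. */
        if (ref->is_ref && ref->base > now)
            sk_set_base(ref, now);
        node = next;
    }
}
```

A reference registered for scope 3 is moved down one scope each time a scope ends, until its base no longer exceeds its own level. An entry whose is_ref has been cleared in the meantime, as in the last test case where the reference becomes an integer, is dropped from the list without being modified.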

