Zsh Mailing List Archive
Re: All the way up or current scope
On Fri, May 16, 2025 at 4:38 PM Philippe Altherr
<philippe.altherr@xxxxxxxxx> wrote:
>
> I just did the minimum to get the behavior I expected from loose references. I didn't have time to find and remove all the code that is no longer needed nor did I update the tests.
Comments in no particular order.
You mention ksh compatibility when discussing subscripts, but ksh
doesn't support subscript references -- that idea was entirely an
enhancement that arose from the zsh-workers discussion. There were
several revisions to block security issues, etc. -- in particular
command substitution does not work in reference subscripts. I don't
believe "loose" references (I think "floating" might be a better word)
simplify the implementation in any significant way (more on subscripts
below); in fact, because (with your "loose" experiment patch) a
reference can "float" back into its own scope, it's trivially easy to
cause a reference to loop back to itself. This situation can be
contrived with nested functions in the current implementation
(zsh-5.9.0.2-test-27-gf24958a) but it falls out of the simplest case
with "loose".
I think this means you haven't fully thought through all the cases of
parameter hiding. See K01nameref.ztst "up-reference part 3, hidden
global". There's a somewhat related problem with "up-reference part
6, stacked namerefs, end is in scope" where the floating scope
actually assigns a new target to one of the named references and
completely alters the way the chain is resolved. There are also
name-hiding issues with both of the "local reference points to
same-name global reference" tests, and in the interaction with
parameters declared "private" via the module. I acknowledge you "did
[not] update the tests" but several of these are cases where I believe
the existing tests are revealing of issues that can't be resolved just
by changing the expected test output.
Similarly, three of the four tests fail when using a reference as the
"for" loop parameter. In the "part 3" test one reference is actually
tagged as a cycle (PM_SELFREF, shows up in "typeset -p" output as a
-Un flag) but this is never reported and apparently the test happens
not to resolve that reference so the infinite loop is avoided.
Your patch also changes "Order of evaluation with ${(P)...}" in a way
that I don't understand, which makes me suspect it may be a "just the
minimum" problem, but it's odd.
Further I don't believe "loose" is actually any easier to implement.
The "base" field of Param can't be removed -- it's the integer base
for declaring binary, octal, etc., so it's wasted space in non-numeric
types including namerefs. Consequently (and demonstrably by a bit of
output instrumentation), your implementation of assigning 1 or 0 to a
new "upper" field and then later using (pm->level - pm->upper) in
scope resolution is exactly equivalent to assigning (pm->base =
pm->level - (1 or 0)) and then using pm->base directly at scope
resolution. The only cases that differ are those where pm->base is
not initialized, modulo the differences between upscope() and your
upscope2(). I suspect, but haven't confirmed, that the 4 places in
typeset_single() where you initialize pm->upper could be factored out
to a place where pm->base could be initialized. I believe the upscope
difference to be related to the name-hiding issues already mentioned.
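
The equivalence described above can be checked with a small standalone
sketch. The struct here is a hypothetical stand-in, not the real Param
from the zsh source; only the field names level, base and upper come
from the discussion, and every function name is invented for
illustration. It assumes pm->base is always initialized, which is
exactly the case the paragraph says is equivalent.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant fields; NOT zsh's struct param. */
struct sketch_param {
    int level;  /* scope depth at which the reference was declared */
    int base;   /* numeric base for integers; otherwise unused by a nameref */
    int upper;  /* field added by the "loose" patch: 1 or 0 */
};

/* "upper" variant: keep the flag, subtract each time the scope is resolved. */
static int resolve_with_upper(const struct sketch_param *pm)
{
    return pm->level - pm->upper;
}

/* "base" variant: do the subtraction once, when the reference is declared. */
static void init_base(struct sketch_param *pm, int is_upper)
{
    pm->base = pm->level - (is_upper ? 1 : 0);
}

/* "base" variant: scope resolution just reads the stored value. */
static int resolve_with_base(const struct sketch_param *pm)
{
    return pm->base;
}

/* Returns 1 if both variants agree for every level in [0, max_level]. */
static int variants_agree(int max_level)
{
    assert(max_level >= 0);
    for (int level = 0; level <= max_level; level++) {
        for (int up = 0; up <= 1; up++) {
            struct sketch_param pm = { level, 0, up };
            init_base(&pm, up);
            if (resolve_with_upper(&pm) != resolve_with_base(&pm))
                return 0;
        }
    }
    return 1;
}
```

Calling variants_agree() with any non-negative depth compares the two
computations for both flag values. The sketch says nothing about
upscope() vs. upscope2(), which is where any real difference would have
to come from.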
Returning to subscripts: I assert that your "upper" field is
therefore exactly the same "additional field in the variable
descriptor of the named reference to remember where the main variable
is to be found" and thus does not simplify "handle subscripts in the
same way as the main variable". Subscripts are for practical purposes
the same as any other anonymous function call (almost literally a math
function for ordinary arrays) and there's no way to evaluate them
anywhere but in the current scope at the time the reference is
resolved. Any attempt to alter this would still require either
re-scoping every parameter mentioned in the subscript, or somehow
fully evaluating the subscript before setting it as the target of the
reference, which would potentially have unwanted side-effects. Like
"base", the "width" field is wasted space in parameters other than
those with padding specified, and is thus overloaded in subscript
namerefs as an optimization to avoid having to string-search for the
'[' each time the nameref is resolved. This in turn is necessary
because the parameter table can only be searched for the name of the
parameter (not a name plus its subscript) before resolving to the
correct scope.
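
To illustrate the kind of optimization meant by overloading "width",
here is a minimal sketch. It assumes the cached value is the offset of
the '[' within the target string; that detail, the struct, and all
function names are hypothetical and are not taken from the zsh source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a subscripted nameref; NOT zsh's struct param. */
struct sketch_ref {
    const char *target; /* e.g. "avar[xyz]" or plain "avar" */
    int width;          /* overloaded: offset of '[' in target, 0 if none */
};

/* Search for '[' once, when the target is assigned.
 * An offset of 0 doubles as "no subscript", since a parameter name
 * cannot be empty. */
static void set_target(struct sketch_ref *ref, const char *target)
{
    assert(target != NULL);
    const char *bracket = strchr(target, '[');
    ref->target = target;
    ref->width = bracket ? (int)(bracket - target) : 0;
}

/* Length of the bare parameter name, i.e. the only part that can be
 * looked up in a table keyed by name. No string search is needed when
 * the offset was cached. */
static size_t name_length(const struct sketch_ref *ref)
{
    return ref->width ? (size_t)ref->width : strlen(ref->target);
}

/* Subscript text including the brackets, or NULL if there is none.
 * It is returned unevaluated: evaluation happens later, in whatever
 * scope is current when the reference is resolved. */
static const char *subscript_of(const struct sketch_ref *ref)
{
    return ref->width ? ref->target + ref->width : NULL;
}
```

With a target of "avar[xyz]", name_length() yields the length of
"avar" for the table lookup and subscript_of() points at "[xyz]",
without rescanning the string on each resolution.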
One other remark RE ksh: You can declare (nameref ref='avar[xyz]')
but $ref then expands as (${avar}'[xyz]'). Zsh does this too under
"emulate ksh".
The upshot of all the foregoing is that at this point I don't see any
significant "code that is no longer needed" nor an operational benefit
to "loose" references that outweighs the cases that currently appear
to need additional new code to make sensible. Further I think any
significant differences with namerefs as compared to new-style ksh
functions (for which, thanks for the comparisons) are likely
resolvable by some isolated tests for ksh emulation mode, which can
wait as a future enhancement. (Old-style is obviously entirely
another matter.)