So, speed/space tradeoff. I guess the question would be how often we
need to find a named reference 100 scopes away vs. how often we need
100 scoperefs at all, and whether the never-freed 800 bytes (?) is
worth worrying about.
Afaik, by default the call stack is limited to 500. With that default, the absolute worst case is a very long running script that at the very beginning creates named references to nested scopes at all levels from 1 to 500 and then never again uses named references. This would create an array of 500 empty linked lists that never get freed. I would argue that nowadays, even such a large waste is not worth worrying about, especially given the fact that it would only occur in an extremely unlikely corner case.
For scripts that don't use references, the overhead is limited to a null array and a size variable. For most scripts, the array would probably never grow beyond one or two dozen elements. To me, this definitely looks like something we don't need to worry about. It's probably only the very rare scripts that use both recursion and references for which the array may grow to larger sizes. However, that's also the scripts where the improved performance of the array of linked lists (vs the linked list of linked lists) would come in handy. Another advantage of the array solution is that it's more straightforward; inserting a linked list in the middle is trivial while with a linked list of linked lists it would require more involved code.
For all these reasons, I would rather stick with the array of linked lists.
> You can "unset" an enclosing parameter and then redefine/reuse it with "typeset -g".
But you can't change a named reference into something else that way.
Even "unset -n" doesn't remove the nameref-ness of the surrounding
scope parameter.
Yep but I was working on a fix for this issue. I have now sent the patch, see
workers/54236.
Aside #1: Should this code in bin_unset have changed along with other
parts of array-initializer removal in workers/53776?
if ((pm->node.flags & PM_NAMEREF) &&
(!(pm = resolve_nameref(pm)) || pm->width)) {
/* warning? */
continue;
}
(That is, pm->width now irrelevant here?)
Yes, if/once
workers/54198 is committed, the check for pm->width can (and should) be dropped.
Aside #2, noting here so I don't have to refresh my memory again:
% typeset -n positional=3
This used to produce
zsh: invalid variable name: 3
but as of workers/54064 this is allowed:
% typeset -p positional
typeset -n positional=3
with an intent to fix later to be a reference to $3 (currently
$positional always returns empty).
I have a patch that fixes references to positional parameters. I'll try to send it in the next few days.
> I should probably add some explicit tests for the following cases:
> - Ref added to multiple scope lists
> - Ref added multiple times to the same scope list
> - Ref added to a scope list is now an integer with base=N
Per above, I think this is actually impossible, at least at present?
Once
workers/54236 is committed all of that should become possible. I will try to rebase the patch onto
workers/54236 and add tests for these cases.
Fyi, here are a few other things I intend to look into:
- In
workers/54236, I had to add many checks for PM_UNSET and PM_DECLARED. Most of these would not be needed if whenever a local variable is unset all its flags were cleared and replaced with PM_UNSET. This would guarantee that if a parameter has a type flag (one of PM_ARRAY, PM_INTEGER, PM_NAMEREF, ...) then it's for sure a still alive parameter of that type. Currently, you should always check for the presence of the type flag and the absence of PM_UNSET or the presence of PM_DECLARED, which is rather verbose and very error prone. Do you see any reason why flags should NOT be cleared when a parameter is unset? Afaik, there exists no mechanism that allows undeleting an unset parameter. So I don't see any reason why flags would need to be kept after a parameter is unset.
- Currently PM_UNSET can mean two very different things. If PM_DECLARED is also present, then PM_UNSET means that the parameter was declared with no explicit value (needed when the option TYPESET_TO_UNSET is enabled). Otherwise, PM_UNSET means that the parameter was unset, for example with a call to the builtin "unset". This is rather confusing. I wonder whether we would be better served if PM_DECLARED was replaced with a PM_NULL that means that the parameter has a null value and PM_UNSET would always mean that the parameter was unset. I will give it a try to see what code change this would entail.
- Bart, you once suggested that dereferencing not-yet-initialized references ought to trigger an error. Currently such references need to be handled in several places and, when they are part of assignments, they trigger various kinds of errors/warnings that may not necessarily make much sense to end users. My impression is that things could be simpler and more uniform if dereferencing a not-yet-initialized reference would always trigger an error. I will try to write a patch that does that,
- Some time ago, I started to wonder whether the reference resolution implemented in resolve_nameref_rec should instead be part of fetchvalue. My impression is that it would avoid a number of redundancies. I think that it would allow for a much simpler fix for references to positional parameters than the one that I currently have. I also think that it would fix or make it much easier to fix many of the bugs related to subscripted references (however, I would still vote for not supporting them for now). This would be a rather large refactoring but my impression is that many things would be simpler afterwards. However, that's just my gut feeling. I haven't started working on this and maybe once I do, I will quickly find out that something doesn't work.
Philippe