Zsh Mailing List Archive

Re: All the way up or current scope

X-seq: zsh-workers 53603
From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
To: Philippe Altherr <philippe.altherr@xxxxxxxxx>
Cc: Zsh hackers list <zsh-workers@xxxxxxx>
Subject: Re: All the way up or current scope
Date: Mon, 12 May 2025 22:59:14 -0700
Archived-at: <https://zsh.org/workers/53603>
In-reply-to: <CAGdYchsm6=NiahMZj1W7KWDnLbeBQv7WW_y--+3E8g4T1jSagA@mail.gmail.com>
List-id: <zsh-workers.zsh.org>
References: <CAGdYchv84+isv2Y5B+WJtfz7cGtX-L7br+ZUgqshZC2Tc1OoOA@mail.gmail.com> <CAH+w=7aZjPT6xr5uEUL_s1PmuRh9hjZKZ=YvCuCJyE6Ye9iurw@mail.gmail.com> <CAGdYchswJc0CrMm2dbhFRgu6F=8wRiWnjNff0FhqYacDH_KJLA@mail.gmail.com> <CAGdYchsS2_QuHWuSWU1661K5Z8JY4k0p_QAYygdPRu8-Dr3jUA@mail.gmail.com> <CAH+w=7amjvo42qqO-b3+4_bcKo4VJodJEzZe9EYih3Mmo8gLUg@mail.gmail.com> <CAGdYchuEL6cs=ACG08txwYP+4BJ_gFZp6hFyTq3K+dbbaPfVfA@mail.gmail.com> <CAH+w=7ZiwkMEQfdK=dpDMzPf9yy0yEZbUQ0J=EyUYVGk_tBJ1A@mail.gmail.com> <CAGdYchsm6=NiahMZj1W7KWDnLbeBQv7WW_y--+3E8g4T1jSagA@mail.gmail.com>

On Mon, May 12, 2025 at 9:19 AM Philippe Altherr
<philippe.altherr@xxxxxxxxx> wrote:
>
> I disagree on the "with no real benefit". Currently, adding a local "var" in a nested function F can change what a "ref" initialized with "var" refers to in an unrelated nested function G where neither F nor F's "var" were ever visible. That doesn't sound good to me.

I say there's no real benefit to treating it differently because I
think it's bad programming "style" in the first place.  It's what the
warn_nested_var option is designed to help you keep out of your code.

> I guess you mean one pass to drop the dead variables and one to update the dangling named references.

Yes.

> My understanding is that the parameter table maps variable names (strings) to variable descriptors (instances of Param). Each Param has a field that contains the depth of the scope in which the variable was defined. Whenever a variable is defined with typeset, the table is looked up. If a Param is found and its scope is the current one, then it's reused (i.e., the new variable overwrites the old Param rather than creating a new Param to replace the old one). Otherwise, if the found Param is from an enclosing scope then a new Param is created with a link to the old Param and inserted into the table to replace the old Param. If no Param is found, a new one is created with no parent Param and inserted into the table.

There are nuances, but that is close enough for this discussion.
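
As a rough illustration of the table described in the quoted paragraph,
here is a toy Python model.  It is only a sketch of that description,
not zsh's C implementation; the names ParamTable, declare, enter, leave,
level and old are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Param:
    name: str
    value: str
    level: int                     # depth of the scope that declared it
    old: Optional["Param"] = None  # descriptor hidden by this one, if any


class ParamTable:
    def __init__(self) -> None:
        self.table: Dict[str, Param] = {}
        self.level = 0  # current scope depth; 0 stands for the global scope

    def declare(self, name: str, value: str = "") -> Param:
        found = self.table.get(name)
        if found is not None and found.level == self.level:
            # Same scope: reuse the existing descriptor in place.
            found.value = value
            return found
        # Enclosing scope, or nothing found: create a new descriptor that
        # links to whatever it hides (None when nothing was found).
        pm = Param(name, value, self.level, old=found)
        self.table[name] = pm
        return pm

    def enter(self) -> None:
        self.level += 1

    def leave(self) -> None:
        # Drop descriptors of the closing scope and restore what they hid.
        for name, pm in list(self.table.items()):
            if pm.level == self.level:
                if pm.old is None:
                    del self.table[name]
                else:
                    self.table[name] = pm.old
        self.level -= 1
```

In this model, declaring "var" twice inside one function yields the same
Param object both times, while the first declaration inside a nested
scope creates a new Param whose old field points at the outer one.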

> When a function is exited, you look for named references that need to be updated, namely all the ones whose lookup scope has a greater depth than the current scope (the one from the calling function). For all these named references you update the lookup scope to be equal to the new current scope.

So far this is what already happens.
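
The exit-time update described in the quoted paragraph can be sketched
as follows.  This is a toy Python model under the assumption that each
surviving named reference records a lookup depth; NamedRef, lookup_level
and rescope_on_return are hypothetical names, not zsh internals.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class NamedRef:
    name: str          # name of the reference itself
    target: str        # name of the variable it refers to
    lookup_level: int  # scope depth at which the target is looked up


def rescope_on_return(refs: Iterable[NamedRef], caller_level: int) -> None:
    """Clamp lookup depths that point into the scope being closed."""
    for ref in refs:
        if ref.lookup_level > caller_level:
            ref.lookup_level = caller_level


refs = [NamedRef("r0", "var", 0), NamedRef("r1", "var", 2)]
rescope_on_return(refs, caller_level=1)
# r0 keeps depth 0; r1 is moved from depth 2 to the caller's depth 1.
```

Note that the update only uses depths: it never searches the table for
the target name, which is what the "all the way up" proposal would add.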

> "All the way up" instead requires looking up var (in theory after all the variables from the closed scope have been removed, in practice you can just skip over them).

That's where the $ref -> $ref -> $var bit breaks down; just skipping
over the intervening reference isn't enough.

> If a Param is found, set the lookup scope to the scope of that Param, otherwise set it to the global scope.

That's where the other part of your specification gets in the way.
Unless I'm missing something, the parameter found has to have already
existed at the time the named reference was declared.  It's not
sufficient just to know the target parameter is there when the nested
function returns; you also have to know the order of events for
declaring the reference and declaring the target.  Suppose you declare
the "placeholder" reference, then call some function, then create a
target parameter, then call another function that finally does assign
the target name but in the presence of a same-named local.  Upon
finally returning from that second called function there's nothing to
say whether the target already existed at the point of declaration of
the reference.  You're only guaranteed it existed upon initialization
of the reference.

If you take away the "top target must exist before reference is
declared" clause, then I think you get the behavior we have now.

Declaring a reference where you expect it to (or don't know whether it
will) dynamically change referent is risky programming regardless of
how upward re-scoping is specified.

> If it is indeed correct that Param instances are reused rather than recreated when a variable is redefined in the same scope then "all the way up" also enables an optimization of named references.
>
> Each Param corresponding to a named reference stores the depth of the lookup scope. The lookup scope is initially determined by looking up var and using the (definition) scope of the found Param. Each Param corresponding to a named reference could instead store a pointer to the found Param and null if none was found.

As I said earlier in this thread, we tried a pointer implementation
and never got as far as what's now done.  It ended up requiring
reference counting.  I believe Oliver still has the partly-finished
code for that variation somewhere.
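
One way to see why a pointer-based reference tends to need reference
counting is the toy Python model below.  It is an assumption-laden
sketch, not the abandoned zsh code: Param, PointerRef, retain, release
and the freed flag are all hypothetical, and freed merely stands in for
the descriptor's memory being released.

```python
from typing import Optional


class Param:
    def __init__(self, name: str, value: str) -> None:
        self.name = name
        self.value = value
        self.refcount = 1   # one count held by the parameter table
        self.freed = False  # stands in for the memory being released

    def retain(self) -> "Param":
        self.refcount += 1
        return self

    def release(self) -> None:
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


class PointerRef:
    """A named reference that stores a pointer instead of a name and depth."""

    def __init__(self, target: Optional[Param]) -> None:
        # Null (None) when no target was found at declaration time.
        self.target = target.retain() if target is not None else None

    def drop(self) -> None:
        if self.target is not None:
            self.target.release()
            self.target = None


local_var = Param("var", "x")
ref = PointerRef(local_var)
local_var.release()  # the declaring scope exits; the table lets go
# The descriptor must outlive its scope because ref still points at it.
still_alive = not local_var.freed
ref.drop()           # only now can the descriptor really be freed
```

Without the count, the scope exit would free the descriptor and leave
the reference holding a dangling pointer.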

