Zsh Mailing List Archive
Re: [PATCH 1/3]: Add named references
- X-seq: zsh-workers 51390
- From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: [PATCH 1/3]: Add named references
- Date: Thu, 9 Feb 2023 15:07:11 -0800
- Archived-at: <https://zsh.org/workers/51390>
- In-reply-to: <66045-1675975796.128039@FBF_.0yMO.Y8fk>
- List-id: <zsh-workers.zsh.org>
- References: <CAH+w=7bd5tHQ8_ZFuyheUrTStm8pR826jH1LB-vMdEnv14nH0w@mail.gmail.com> <67689-1675827940.088548@BxvG.D9_b.7RzI> <CAH+w=7ZFq_MyNtPVetDt84Zp8dnCQXis3p=2sKP018GZ-VTd0g@mail.gmail.com> <12608-1675903622.800470@Xj82.e3y1.svhG> <CAH+w=7ZZUCqYe6w1ZqZZKR6iLsZH0SDDXyzwgTU93nxx6bmxjQ@mail.gmail.com> <66045-1675975796.128039@FBF_.0yMO.Y8fk>
On Thu, Feb 9, 2023 at 12:49 PM Oliver Kiddle <opk@xxxxxxx> wrote:
>
> The following is similar:
> var=hello
> typeset -n ref
> () {
> typeset var=x
> ref=var
> echo $ref
> }
> typeset -p ref
> echo $ref
>
> This creates a reference to a variable at a deeper local level.
Only in a pointers/refcounting implementation. With the assumption of
dynamic scoping, it creates a reference to a name, where the scope
search for the name starts at a lower level. If that name doesn't
exist at that level (because the whole level doesn't exist any more),
the search climbs up. So I get with set -x (and a couple of
in-progress patches for looping references):
+Src/zsh:15> var=hello
+Src/zsh:16> typeset -n ref
+Src/zsh:17> '(anon)'
+(anon):1> typeset var=x
+(anon):2> ref=var
+(anon):3> echo x
x
+Src/zsh:22> typeset -p ref
typeset -n ref=var
+Src/zsh:23> echo hello
hello
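For anyone following along without the patch applied, the basic "reference to a name, resolved when used" behaviour can be roughly approximated with eval. This is only a sketch: lookup and refname are hypothetical stand-ins for the nameref, and it always starts the search at the innermost scope, so it does not model the "search starts at the level where the reference was assigned" behaviour of the patch.

```shell
# Sketch only: resolve a variable by NAME at the time of use.
# "lookup" and "refname" are hypothetical; they are not part of the patch.
lookup() { eval "printf '%s\n' \"\${$1}\""; }

var=hello
refname=var            # stands in for: typeset -n ref; ref=var

inner() {
  local var=x          # local shadows the global while inner is running
  lookup "$refname"    # dynamic scoping finds the local: prints x
}

inner
lookup "$refname"      # the local level is gone, so the search reaches the global: prints hello
```

The second call prints hello for the same reason the trace above does: nothing was ever bound to the local variable itself, only to the name.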
If I add
() {
typeset var=y
echo $ref
() {
typeset var=z
echo $ref
}
}
I get
+Src/zsh:24> '(anon)'
+(anon):0> typeset var=y
+(anon):0> echo y
y
+(anon):1> '(anon)'
+(anon):0> typeset var=z
+(anon):0> echo y
y
> The best might be if ref returns to being
> unset when the function returns but an error like ksh is fine too.
I'm not sure how to do that without scanning the whole parameter
table, but I agree the above is a little puzzling.
> Ksh prints "global reference cannot refer to local variable".
At what point does that happen? Upon the assignment ref=var ?
Relatedly, what should happen on any failed assignment? E.g.
typeset -n xy yx
xy=yx # OK so far
yx=xy # Oops, invalid loop
Should yx become unset, or should it remain a nameref with no referent?
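To make the failure case concrete, here is a toy model of detecting such a loop at assignment time. It is a sketch under assumptions, not what the C code does: the referent of a reference NAME is kept in an ordinary variable target_NAME, creates_loop is a hypothetical helper, names are plain identifiers, and the existing chains are assumed to be loop-free. It only shows the detection; it says nothing about which state yx should be left in afterwards.

```shell
# Toy model (assumption): the referent of nameref NAME lives in $target_NAME.
# creates_loop NAME NEW succeeds if pointing NAME at NEW would close a cycle.
creates_loop() {
  local cur=$2 next
  while [ -n "$cur" ]; do
    [ "$cur" = "$1" ] && return 0    # walked back to NAME: this would be a loop
    eval "next=\${target_$cur-}"     # follow one link; empty if cur is not a reference
    cur=$next
  done
  return 1
}

target_xy=yx                         # xy=yx was accepted; yx has no referent yet
if creates_loop yx xy; then
  echo "yx=xy rejected: invalid loop"
fi
```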
> > The rule for a private should be that you always
> > pass its value.
>
> I hadn't really thought about it that way, perhaps because ksh only
> has private scoping and I'm used to writing in languages that only
> have lexical scoping. Certainly if it is hard to implement, I have no
> objection to this approach.
("This approach" meaning "no public refs to private vars"?) I haven't
tried anything else yet to see how hard it might be.
> We do lose some orthogonality in
> that you can use a private with builtins that take variable names like
> read, compadd and printf (-v). Wrappers of those would have an added
> limitation.
That's true of private already, isn't it?
> When relying only on dynamic scoping, it would be good practice to
> define all the namerefs to passed parameters as early as possible in a
> function to reduce the risk of a name clash.
If you were going to put that in the doc, where would it go?
> It isn't about the positionals being special but that it is useful to be
> able to write a function that exposes an interface similar to read where
> a variable can be named as a parameter. Ksh's making $1, $2.. special
> on the rhs of typeset -n really is very ugly.
This works for my code in current state:
% var=GLOBAL
% typeset -n ref=var
% f() {
function> typeset -n ref=$1
function> print $ref
function> ref=LOCAL
function> }
% f ref
GLOBAL
% print $ref
LOCAL
%
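For comparison, the usual way to get a read-like "caller names the variable" interface without namerefs is eval plus a check on the name. A minimal sketch, where assign_to is a hypothetical helper and the identifier check is my own assumption about what a careful function writer would do:

```shell
# assign_to NAME VALUE: store VALUE in the caller-visible variable NAME.
# Hypothetical helper for illustration; not part of zsh or of the patch.
assign_to() {
  case $1 in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 1 ;;   # refuse anything that is not a plain identifier
  esac
  eval "$1=\$2"                                   # $2 is expanded by eval itself, so the value is not re-parsed
}

var=GLOBAL
assign_to var LOCAL && echo "$var"                # prints LOCAL
assign_to 'x;echo oops' 1 || echo "rejected"      # prints rejected
```

Declaring typeset -n ref=$1 replaces both the check and the eval, which is most of the appeal.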
> > f2 \&var
>
> My intention with that suggestion is that you'd only do that to refer to
> $var from the scope of f1's caller. So in practice that'd sooner be
> something like \&$3. For this, it'd be just `f2 var` and f2() would
> declare `private -n mine=\&1`
Yeah, I don't like the idea that a called function can arbitrarily
grab parameters from its caller just by sticking & in front. Caller
needs to do something (even if only make "normal" use of dynamic
scope) to make the parameter "grabbable".
> > With a hash that's just:
> >
> > typeset -n ref
> > for ref in 'hash[(e)'${(k)^hash[(R)?*]}']'; do ...
>
> "just"!?
Hah! Point was that it's do-able without "for"-context-sensitive
special subscript semantics. I think it would be strange for
ary=( 1 2 3 4 5 )
typeset -n ref='ary[*]'
ref=( a b c)
to produce something different than
ary=( 1 2 3 4 5 )
typeset -n ref
for ref in 'ary[*]'; do ref=( a b c ); done
> > > And it could be wise to limit what can be done as part of the
> > > subscript evaluation to avoid a CVE similar to the last one.
> >
> > validate_refname() is calling parse_subscript() ... would further
> > examination of the accepted string be sufficient, or something more
> > needed? I think the greatest danger is of creating an infinite
> > recursion, which can't really be caught in advance.
>
> So if a function gets the target of a nameref from untrusted input the
> function writer needs to know to validate it
No, I meant, would examining the subscript string in the C code be sufficient.
> This should be an error perhaps:
>
> typeset -n ref=arr[1][2]
Why? ${ary[1][2]} isn't an error, it's the 2nd character of the first
word in $ary.
You can keep throwing subscripts on there as long as the resulting
substrings can be indexed-into.
print ${ary[1][3,9][4]} # etc.
> Currently it isn't possible to create a reference for $!, $?, $@, $+, $#
> and $$. If easy to add, there would be no harm in them.
You can make references to argv and ARGC, but they always refer to the
current argv/ARGC because of the aforementioned implementation of
positionals as C locals. $* $@ $# would have the same issues.
The others would all have to be special-cased individually. What is
$+ ? Do you mean $- ?