Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: BUG: Initializations of named references with an empty string should trigger an error



I still object to this.  It makes "typeset -n" magic with respect to
all other declarations and TYPESET_TO_UNSET.  At the very least it
means you can create a parameter that appears set to ${+ref} and has
an empty string value, but you can't actually declare that same
parameter that way.

Two orthogonal points: empty string references and ${+ref} behavior

Empty string references

Consider the following:

Definition: A named reference is a special type of variable that can only ever be initialized once with the name V of a variable. Once initialized, expanding/assigning the named reference expands/assigns the variable referred to by V at initialization time.
Corollary: The statement "typeset -n ref" defines a placeholder which can later be initialized with the name of the variable to refer to.

That's in my opinion a simple and easy to understand definition of named references. There is however no way to square this with the idea that "typeset -n ref" would somehow initialize ref with an empty string. Then you have to explain that initializing a named reference with an empty string is allowed and has no effect. Unfortunately that implies that a statement like "ref=$1" may or may not initialize the reference depending on the content of "$1".

In my opinion it's much simpler and more consistent to stick with the rule that references can only ever be initialized once. That implies the following:

- For a placeholder ref, "typeset -p ref" should print "typeset -n ref" (with no =) even when TYPESET_TO_UNSET is not set.

- Converting an empty string scalar variable into a reference should trigger an invalid variable name error. So "typeset str=; typeset -n str" would trigger an error. To be perfectly consistent, when TYPESET_TO_UNSET is not set, "typeset str; typeset -n str" should do the same. This is no different from "typeset -i int; typeset +i int; typeset -n int", which already triggers an error today even though there is no explicit initialization at all.

Put together, that produces a very simple and consistent system. No need to introduce a special case for empty variable names. A simple look at the code can tell you from which point on a reference is initialized (no dependency on the run-time value of assigned values).

I don't see what would be lost with this. It's still true that "typeset -n ref" initializes "ref" to a default state, namely a placeholder. It just happens that in this case, there exists no value "val" such that "typeset -n ref=val" produces the same result. And so what? Who cares about that and why? Since named references are such a special beast and given how special the first assignment to a named reference is, the inability to replicate "typeset -n ref" via some "typeset -n ref=val" looks to me like the least of all worries.

${+ref} behavior

The current behavior of ${+ref} (and ${ref+X}, ${ref-X}) for uninitialized references looks bogus to me. Given the following definitions

typeset v1=V v2 # no v3
typeset -n r1=v1 r2=v2 r3=v3 r4;


we get the following expansions

${v1}      =V   ${+v1}     =1   ${v1+X}    =X   ${v1:+X}   =X   ${v1-X}    =V   ${v1:-X}   =V  
${r1}      =V   ${+r1}     =1   ${r1+X}    =X   ${r1:+X}   =X   ${r1-X}    =V   ${r1:-X}   =V  
${v2}      =    ${+v2}     =1   ${v2+X}    =X   ${v2:+X}   =    ${v2-X}    =    ${v2:-X}   =X  
${r2}      =    ${+r2}     =1   ${r2+X}    =X   ${r2:+X}   =    ${r2-X}    =    ${r2:-X}   =X  
${v3}      =    ${+v3}     =0   ${v3+X}    =    ${v3:+X}   =    ${v3-X}    =X   ${v3:-X}   =X  
${r3}      =    ${+r3}     =0   ${r3+X}    =    ${r3:+X}   =    ${r3-X}    =X   ${r3:-X}   =X  
${r4}      =    ${+r4}     =1   ${r4+X}    =X   ${r4:+X}   =    ${r4-X}    =    ${r4:-X}   =X  
${(!)r4}   =    ${(!)+r4}  =1   ${(!)r4+X} =X   ${(!)r4:+X}=    ${(!)r4-X} =    ${(!)r4:-X}=X 

Since r1 refers to v1, r2 to v2 and r3 to v3, their respective results are expected to be the same, which is also the case. For r4, which refers to nothing, short of signaling an error like ksh does, the most logical thing to do in my opinion is to behave as if it was referring to a non-existent variable. So like r3. Instead it behaves as if there was an implicit (!) flag, which doesn't make any sense to me. Expansions of named references are supposed to expand the referred variable. Since there is no referred variable, it should behave as in the case where the referred variable doesn't exist.

With the current behavior you can do the following, where initializing a reference takes you into a world where fewer entities are set, which makes no sense to me.

$ typeset -n r; echo ${+r}; r=v; echo ${+r}; v=x; echo ${+r} 
1
0
1

 Philippe


On Fri, May 23, 2025 at 3:24 AM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
On Thu, May 22, 2025 at 5:14 PM Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
>
> On Thu, May 22, 2025 at 11:52 PM Bart Schaefer
> <schaefer@xxxxxxxxxxxxxxxx> wrote:
> >
> > I was thinking more of providing examples that someone might see in
> > (other people's) code and wonder what they meant.
>
> My thinking was that this documentation might cause a further
> proliferation of people adding these pointless quotes, thinking they
> make a difference.

That's also a valid point.  I wasn't so much objecting as explaining
why all three were there.

> It's certainly minor enough that I
> won't push back more on it.

Ditto, so, as you like.

Attachment: expansion-flags.zsh
Description: Binary data


