Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: more splitting



An array _is_ a data structure. A variable in the shell that holds an array is a variable holding a data structure.

The important thing is this: the value of an array IS NOT A STRING.  It's not stored as a string, and it never turns into a string, unless you do something in your code to turn it into one.
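
A minimal sketch of that distinction, using the three-element array from the quoted message below. The names joined and copy are made up for illustration.

```shell
var=('a b' $'c\nd e f g h' ij)

# Turning the array into a string is something you have to ask for.
# "${var[*]}" joins the elements with the first character of IFS
# (a space by default).
joined="${var[*]}"

# Copying it as an array keeps every element separate and intact.
copy=("${var[@]}")

printf '%d elements\n' "${#copy[@]}"    # prints "3 elements"
```

After this runs, joined is one scalar that has lost the element boundaries, while copy is still an array of three values.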

The thing stored in memory is not delimited. It has no parentheses around it, no quotes, no spaces between its elements. There isn't any "between its elements" because the elements are stored separately; it is treated as if it were a collection of individual variables each with their own value. Sure, for convenience they are grouped together under a single name so you can ask for $var[1] and $var[2]... and more importantly, use variables like $var[i] and $var[j]... but the elements still have their own separate identities.
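
To see those separate identities, here is a small sketch that walks the same example array one element at a time. The names count and elem are just for illustration.

```shell
var=('a b' $'c\nd e f g h' ij)

# Each element comes back whole, embedded spaces and newline included.
count=0
for elem in "${var[@]}"; do
  count=$((count + 1))
  printf 'element %d: <%s>\n' "$count" "$elem"
done
```

The loop runs three times, not once per word. In zsh with default options, $var[1] here is 'a b' and $var[2] is the element with the newline in it.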

The best way to deal with such collections of values is to avoid ever turning them into single strings that have to be parsed again to get the elements back out. If you want to pass an array as arguments to a command, use the "${array[@]}" syntax and the command gets each element of the array as a separate parameter. There's no command line that you have to eval because the array is never turned into a string. It stays an array the whole time.
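
As a rough illustration, assuming a made-up helper called show_args that just reports what it was given:

```shell
# show_args is a hypothetical helper, not a zsh builtin.
show_args() {
  printf '%d arguments\n' "$#"
  printf '<%s>\n' "$@"
}

var=('a b' $'c\nd e f g h' ij)

# Each element arrives as its own argument.
show_args "${var[@]}"    # first line printed: "3 arguments"

# For contrast: joining first hands over a single string.
show_args "${var[*]}"    # first line printed: "1 arguments"
```

No eval and no re-parsing are involved in the first call; the function sees exactly the three values that are in the array.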


On Wed, Apr 15, 2026 at 1:23 PM Ray Andrews <rayandrews@xxxxxxxxxxx> wrote:


On 2026-04-15 09:07, Bart Schaefer wrote:
>>      # (q+):quote the quotes ... sorta, -C1: columns split on spaces.
> Not exactly.  No splitting is done by "print".
Pardon.  Fixated on the word 'split' -- yes, of course print doesn't
modify the var.

> One further thought: This is going to block forever until the end of
> the stdin stream is reached, and ultimately use approximately twice as
> much memory as the size of that stream.  So you don't want to use this
> in a place where you don't completely control what's being fed to it.

Voodoo.  I'm not going to be playing with any such thing.

Now back to the hard stuff:

> When you write
>   var=("a b" c$'\n''d e f'' ''g h')
> the shell parses the quoted sections and builds a data structure,
> which zsh calls an array.  The quotes themselves are gone, they were
> only needed to tell how to build the array.

I can understand that various chars are only there to pass instructions
to the shell, like the ticks.  BUT ...

> The information is stored in a data
> structure in the shell.

... 'a data structure' ... not in the var itself?  This is the essential
point.  It seems to me almost incomprehensible that a variable would not
contain everything about itself -- be self-contained.  I could write my
var to a USB stick, take it to another computer, retrieve it into
another zsh environment and it is going to behave exactly as before --
nothing lost.  Yes?  No? That's sorta why I was expecting the pipe to
send *everything* about the var down the tube.  Tho ... it's not a
logical issue if it was a design decision that pipes would alter arrays
by deliberately remanufacturing them into strings.  It's hard to say
this exactly correctly but it's one thing to alter data as a matter of
design, it's another thing to send data intact and entire from here to
there and *yet* something is lost.  Again, there's this notion of
information about the array that's stored somewhere other than in the
array itself.  Get me?  Hard to say this accurately.

> In the case of $var, it's still in that array structure named "var".
> But the rules for what happens when you use $var to "output" that
> array depend on context.  For a simple usage like
>     print -rn $var
> the rule is to combine all the elements into a single string with
> spaces between them.

... That's no issue: One might choose to display data any number of ways.

> the (q+) tells zsh to rebuild something that has the same semantics as
> the original quoting.

... mmmmm ... Ok, sorta.

> If you don't want that structure information to be lost, you have to
> tell zsh to re-create it.

... this is getting close to what rots my socks:

% typeset -p var
typeset -a var=( 'a b' $'c\nd e f g h' ij )

% var2=$var

% typeset -p var2
typeset var2=$'a b c\nd e f g h ij'

.... Ooops!  I know better than that:

% var2=( $var )

% typeset -p var2
typeset -a var2=( 'a b' $'c\nd e f g h' ij )

... all good.  But 'recreate it'?  From where?  I can understand that
the various special characters -- ticks, backslashes -- would have to be
'recreated' as if one was going to build another array at CL via a
manufactured string that is not the array itself, but rather the
keystrokes needed to create it.  No problem.  '(q+)' yes?

Sorry for being so unteachable :(


--
Mark J. Reed <markjreed@xxxxxxxxx>

