Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: more splitting



On Wed 15 Apr 2026, at 12:21, Ray Andrews wrote:
> That's sorta why I was expecting the pipe to
> send *everything* about the var down the tube.  Tho ... it's not a
> logical issue if it was a design decision that pipes would alter arrays
> by deliberately remanufacturing them into strings.

pipes don't alter arrays. pipes are a way for one command or process to
feed data into another. the pipe is completely agnostic to exactly what
that data is. like bart said, it's just bytes. the command that's
writing into the pipe has to decide what those bytes will be

in the case of the print command, its purpose is to simply spit back out
all of the inputs it receives. by default it separates them by a space
and puts a \n at the very end. this is a fundamentally transformative
process. at the end of it, all you have is some text, which contains
zero information about how it was produced. were the original inputs
'a b' 'c d' or 'a b c d' or 'a' 'b' 'c' 'd'? there's no way to know. nor
is there any way to know what you did (parameter expansion, command
substitution, etc) to produce those inputs. all you have is the text.
that text is what print writes into the pipe, and what od reads out of
it
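
to make that ambiguity concrete, here's a small python sketch. it is not how zsh implements print; print_like is a made-up name that only models the default join-with-spaces behaviour and ignores print's options and escape handling. three differently structured argument lists collapse into the same text:

```python
def print_like(args):
    """Model print's default output: arguments joined by spaces, then a newline."""
    return " ".join(args) + "\n"


# Three different ways the caller might have structured the arguments.
candidates = [
    ["a b", "c d"],
    ["a b c d"],
    ["a", "b", "c", "d"],
]

outputs = {print_like(args) for args in candidates}

# Every candidate yields the identical text, so the text alone
# cannot tell us which argument list produced it.
print(outputs)
```

the set ends up with a single member, which is the point: once the arguments are flattened, the reader of the text can't pick between the candidates.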

---

so the full chain of events is:

  var=( 'a b' 'c d' 'e f' )

you tell the shell to create a data structure, an array, containing
three elements of three characters each. the shell does not store the
quotes or the spaces between these elements, it has its own way of doing
things internally that you don't need to worry about. the quotes and
spaces are just part of the language you use to talk to the shell

  print $var | od

you tell the shell to do several things all at once. first, the shell
needs to deal with $var. this is parameter-expansion syntax. it tells
the shell 'put the *contents* of the parameter called var here'. since
var is an array of multiple elements, the shell will *act like* you had
typed out each of these elements separately, effectively this:

  print 'a b' 'c d' 'e f' | od

then the shell creates the pipe

it runs od and connects its input to the pipe's output. od waits for
some data to appear

then it runs print, connects its output to the pipe's input, and passes
it the three elements/words produced by the parameter expansion as
arguments. again, the *syntax* of spaces and quotes isn't used for this
operation. basically the shell is doing something like this (python
pseudo-code):

  run_command(cmd="print", args=["a b", "c d", "e f"])

print does not know *anything* about where those inputs came from. it
doesn't know that you used parameter expansion, it wouldn't know how you
quoted them if you'd typed them by hand, etc. it doesn't even know that
var exists. all it knows is that it got three arguments and it needs to
print them
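
if you want to see the 'arguments arrive as a plain list' idea actually run, here's a rough python sketch. assumptions: CHILD_CODE is a hypothetical stand-in for the command being run (a tiny program that reports its own arguments), and it's launched as a separate process, whereas print is a zsh builtin that the shell handles internally. the way the arguments are handed over is the part that carries across:

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the command being run: a tiny child program
# that reports the arguments it was given, encoded as JSON.
CHILD_CODE = "import json, sys; print(json.dumps(sys.argv[1:]))"

elements = ["a b", "c d", "e f"]  # the three elements of var

# No shell is involved here: the list is handed to the child directly,
# one argument per element, with no quoting syntax anywhere.
result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE, *elements],
    capture_output=True,
    text=True,
    check=True,
)

received = json.loads(result.stdout)
print(len(received), received)
```

the child gets three strings and nothing else: no quotes, no variable name, no hint of how the list was built.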

so it essentially makes a string like this (pseudo-code again):

  output = " ".join(args) + "\n"

in other words

  output = "a b c d e f\n"

again, because of this transformation, not only can you not tell how the
arguments to print were produced, you can't even tell what the arguments
*were*. all of the information about how they were structured is gone,
it's just some random text now

print then writes this text to its output, which again is connected to
the pipe. (if it were not connected to the pipe, it would print it to
the screen -- again, as just text. that's print's job, to print text)

od sees it come out the other end and it does its thing with it
(dumping the bytes in a numeric representation, octal by default). od
has absolutely *zero* idea where this text came from or how it was
produced. it doesn't know about var, or about print, or about zsh
arrays, it just sees the bytes
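
the pipe part of this can be sketched in python too. this is a simplification under stated assumptions: both ends of the pipe live in one process here, whereas the shell gives the write end to print and the read end to od, and the reader below just lists the raw byte values rather than imitating od's output format:

```python
import os

args = ["a b", "c d", "e f"]

# Create the pipe: one end for reading, one for writing.
read_end, write_end = os.pipe()

# The writer's side (print's role): flatten the arguments to text and
# write the resulting bytes into the pipe.
os.write(write_end, (" ".join(args) + "\n").encode())
os.close(write_end)  # closing the write end signals end-of-input

# The reader's side (od's role): all it ever receives is bytes.
received = os.read(read_end, 1024)
os.close(read_end)

print(received)        # the raw bytes
print(list(received))  # the same bytes as plain numbers
```

nothing in what comes out of the read end says 'array' or 'three elements'. it's twelve bytes, and the spaces inside elements look exactly like the spaces between them.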

---

On Wed 15 Apr 2026, at 12:21, Ray Andrews wrote:
> % typeset -p var
> typeset -a var=( 'a b' $'c\nd e f g h' ij )
>
> % var2=$var
>
> % typeset -p var2
> typeset var2=$'a b c\nd e f g h ij'

as we established, $var means 'put the contents of var here', in this
case three elements because var is an array. the shell does that. but
because var2= without ( ) is a scalar assignment, the shell has no
choice but to take those three separate elements and join them into one
string (with a space between each, the first character of $IFS) so that
it can complete the assignment
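
a python analogue of the two kinds of assignment, as a sketch only. scalar_copy and array_copy are illustrative names, and the single-space separator is assumed because it matches the typeset output quoted above:

```python
var = ["a b", "c\nd e f g h", "ij"]

# Analogue of  var2=$var : a scalar can hold only one string, so the
# elements have to be joined into one.
scalar_copy = " ".join(var)

# Analogue of  var2=( $var ) : the target is itself a list, so the
# elements can be copied across one by one.
array_copy = list(var)

print(repr(scalar_copy))
print(array_copy)
```

the scalar copy is one string with the element boundaries gone. the array copy still has three elements, newline and all.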

On Wed 15 Apr 2026, at 12:21, Ray Andrews wrote:
> % var2=( $var )
>
> % typeset -p var2
> typeset -a var2=( 'a b' $'c\nd e f g h' ij )
>
> ... all good.  But 'recreate it'?  From where? ... '(q+)' yes?

(q+) controls how the parameter expansion happens. ${(q+)var} tells the
shell, 'put the contents of var here -- but as you're doing that, put
literal quotes around each element in the same style i might have done
if i had written them out myself in a command'. in other words, write
them in that human syntax

if you were to pass those literally quoted elements to print, it would
again transform them into plain text. but because the quotes are there,
it's possible to reverse-engineer the structure of the arguments. you
can't know they came from var, but you can know how many there were and
what they contained. and you can parse them back out of the text into a
new list of elements

  % var=( 'a b' 'c d' 'e f' )

  # there's no way to know how this text was produced, no way to turn it
  # back into the inputs we gave to print
  % print $var
  a b c d e f

  # the quotes in the text tell us the structure of the inputs
  % print ${(q+)var}
  'a b' 'c d' 'e f'

  # if we assign that text to a variable, we can parse it back into an
  # array
  % var2=$( print ${(q+)var} )

  # all one big string
  % typeset -p var2
  typeset var2=\''a b'\'' '\''c d'\'' '\''e f'\'

  # use the quoting syntax in the string to break it up into separate
  # elements, then strip off that syntax, then assign to a new array.
  # now we have our inputs back
  % var3=( ${(Q)${(z)var2}} )
  % typeset -p var3
  typeset -a var3=( 'a b' 'c d' 'e f' )
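
the same round trip can be sketched in python with the standard shlex module. treat it as an analogy and not as zsh's implementation: shlex follows posix-shell-style quoting rules, so shlex.join only roughly plays the part of (q+) and shlex.split only roughly plays the part of (z) plus (Q):

```python
import shlex

var = ["a b", "c d", "e f"]

# Plain join: the element boundaries are lost.
plain_text = " ".join(var)

# Quoted join: each element is quoted where needed, so the boundaries
# are written into the text itself.
quoted_text = shlex.join(var)

# Parsing the quoted text recovers the original elements;
# splitting the plain text on whitespace does not.
recovered = shlex.split(quoted_text)
not_recovered = plain_text.split()

print(quoted_text)
print(recovered)
print(not_recovered)
```

the quoted text survives a trip through 'just bytes' because the structure is written into the text itself. the plain text comes back as six one-letter words.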

dana



