Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

RE: Word splitting in zsh



Some general notes. The POSIX shell uses one-level textual substitution.
It does *not* know anything about the internal structure of variables. It does
*not* split anything in the middle of substitutions. It behaves damn simply:
it replaces the values *once* and then splits the whole line. That is the only
context where the term "word" makes sense, meaning exactly "positional parameter
passed to the command".
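
A minimal sketch of that "substitute once, then split" rule, assuming a POSIX-style sh such as dash or bash (zsh without SH_WORD_SPLIT behaves differently). The helper count_args is hypothetical; it only reports how many words the command received.

```shell
# Hypothetical helper: prints the number of positional parameters it was given.
count_args() { echo "$#"; }

v='a1 a2 a3'

# Unquoted: the value is substituted, then the resulting line is split on IFS.
unquoted=$(count_args $v)

# Quoted: the double quotes suppress field splitting of the substituted value.
quoted=$(count_args "$v")

echo "unquoted: $unquoted, quoted: $quoted"
```

With the default IFS, the unquoted call should see three words and the quoted call one.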

> }
> } I've come across this little problem in zsh when I run it under
> } setopt SHWORDSPLIT (not that this is something I normally do).
>
> There's definitely some kind of bug here.
>
> zagzig% echo $ZSH_VERSION
> 3.0.8
> zagzig% set "a1 a2 a3" b c
> zagzig% print -l ${1+"$@"}
> a1 a2 a3
> b
> c

That is correct and is how sh behaves.
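
For comparison, a small sketch of the same test in a POSIX-style sh (assumed here to be dash or bash, not zsh). The helper count_args is hypothetical and is used in place of zsh's print -l to count the resulting words.

```shell
# Hypothetical helper: prints the number of positional parameters it was given.
count_args() { echo "$#"; }

set -- "a1 a2 a3" b c

# The quoted "$@" inside ${1+...} keeps each positional parameter as one word,
# so the embedded spaces in the first parameter do not cause a split.
n=$(count_args ${1+"$@"})

echo "words: $n"
```

Under those assumptions the command receives three words, matching the first zsh transcript above.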

> zagzig% setopt shwordsplit
> zagzig% print -l ${1+"$@"}
> a1
> a2
> a3
> b
> c
> zagzig%
>

> Well, that's not quite right,

It is simply wrong.

> but 3.1.9-dev-8 is even worse:
>
> zagzig% echo $ZSH_VERSION
> 3.1.9-dev-8
> zagzig% set "a1 a2 a3" b c
> zagzig% print -l ${1+"$@"}
> a1 a2 a3 b c						<-- Yipes!

Well, this is "correct" *zsh* behaviour. The part after `+' is a word, not an
array. It is taken as a single word and is never split. What happens here
is:

- zsh evaluates "$@", which gives you an array with three elements
- but because of the "scalar context" in this case (the best I can call it), the
array is concatenated, forming the above value. Even worse, this is inconsistent
with everything else: array joining is supposed to use IFS, but it does not in
this case. We get (quoting the doc): "If NAME is an array parameter, and the
KSH_ARRAYS option is not set, then the value of each element of NAME is
substituted, one element per word."; these elements are then joined together
with a space, ignoring the actual IFS value.


> zagzig% setopt shwordsplit
> zagzig% print -l ${1+"$@"}
> a1
> a2
> a3
> b
> c
> zagzig%
>

That is just because of the above. The structure of WORD in ${name+WORD} is not
remembered. But note the same bug again:

bor@itsrm2% set 'a b c' 1 2
bor@itsrm2% IFS=: print -l ${1+"$@"}
a b c 1 2
bor@itsrm2% setopt shwordsplit
bor@itsrm2% IFS=: print -l ${1+"$@"}
a
b
c
1
2

The IFS value is silently ignored.
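
As a point of reference for what "joining with IFS" means, here is a hedged sketch in a POSIX-style sh (dash or bash assumed). It uses "$*", which joins the positional parameters with the first character of IFS, and sets IFS in a separate statement rather than as a command prefix. The variable names are illustrative only.

```shell
set -- 'a b c' 1 2

old_ifs=$IFS
IFS=:

# "$*" joins all positional parameters into one word, separated by the
# first character of IFS (here a colon).
joined="$*"

IFS=$old_ifs

echo "$joined"
```

Here the joined value should contain colons between the parameters, with the spaces inside the first parameter left intact.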

>
> I don't know exactly when this bug was introduced, though.
>

That is almost inevitable in the current implementation. I repeat: sh word
splitting is done exactly once on the line, after all substitutions have been
done. In zsh, word splitting happens at every level as part of every ${...}
substitution. I never liked it but could not find a good example. Thank you
for finding it :)

> } bruce ~ % args "$@"
> } # Acceptable, but Bourne sh would print a single blank entry here, since
> } # there's a pair of quotes.
>
> Actually, that's not quite true.  Some versions of Bourne sh expand "$@"
> to the empty string, and some expand it to no string at all.  The reason
> for the ${1+"$@"} hack in many shell scripts is so that you don't have
> to care which flavor of Bourne shell you have.  Zsh has always tried to
> be in the latter camp, e.g.,
>

Here saith SUS V2:

Expands to the positional parameters, starting from one. When the expansion
occurs within double-quotes, and where field splitting (see Field Splitting)
is performed, each positional parameter expands as a separate field, with the
provision that the expansion of the first parameter is still joined with the
beginning part of the original word (assuming that the expanded parameter was
embedded within a word), and the expansion of the last parameter is still
joined with the last part of the original word. If there are no positional
parameters, the expansion of "@" generates zero fields, even when "@" is
double-quoted.
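
The zero-fields rule quoted above can be checked with a short sketch, assuming a shell that follows that text (current dash and bash do; some historical Bourne shells did not, which is the reason for the ${1+"$@"} idiom). The helper count_args is hypothetical.

```shell
# Hypothetical helper: prints the number of positional parameters it was given.
count_args() { echo "$#"; }

# Clear all positional parameters.
set --

plain=$(count_args "$@")          # quoted "$@" with no parameters
guarded=$(count_args ${1+"$@"})   # the portable idiom from older scripts
empty=$(count_args "")            # an explicit empty string, for contrast

echo "plain: $plain, guarded: $guarded, empty: $empty"
```

In a conforming shell both "$@" forms pass no words at all, while the explicit "" passes one empty word.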



-andrej


