Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: SH_WORD_SPLIT, $* and null IFS



On Sep 30,  4:44pm, Paul Mertz wrote:
}
} When SH_WORD_SPLIT is set:
} - "$*" expand to a single field, with each parameter separated by the IFS
} (same as before).
} -  $* will not care about the IFS... EXCEPT if the IFS is null (set to no
} value)... then it will also expand to a single field without delimiters

I'm puzzled by what you mean by "will not care about the IFS".  (Skip
past the examples to see my guess at what you probably mean.)

torch% set - "a b" "c   d" e$'\0'f 'gxh'
torch% print -l $*
a b
c       d
ef
gxh
torch% setopt shwordsplit
torch% print -l $*
a
b
c
d
e
f
gxh
torch% IFS=x
torch% print -l $*
a b
c       d
ef
g
h
torch% unsetopt shwordsplit
torch% print -l $*
a b
c       d
ef
gxh
torch% 

Note that with IFS=x the string gxh was split into g and h, while "a b"
stayed whole because space is no longer in IFS.

Now here's zsh invoked as "sh":

$ set - "a b" "c   d" e$'\0'f 'gxh'
$ print -l $*
a
b
c
d
ef
gxh
$ IFS=x
$ print -l $*
a b
c   d
ef
g
h
$ 

Again gxh gets split.

} I really don't understand this last behavior...

Can you post a specific example showing what you're experiencing?  Are
you talking about what delimiter gets inserted INTO the string when
joining, as opposed to what delimiter is used for splitting the words
when expanding?

If you look at the zsh manual (info section "14.3.2 Rules" to be more
specific) you'll find (steps irrelevant to this thread skipped):

--- 8< ---
5. _Double-Quoted Joining_
     If the value after this process is an array, and the substitution
     appears in double quotes, and no (@) flag is present at the current
     level, the words of the value are joined with the first character
     of the parameter $IFS, by default a space, between each word
     (single word arrays are not modified).  If the (j) flag is
     present, that is used for joining instead of $IFS.

10. _Forced Joining_
     If the `(j)' flag is present, or no `(j)' flag is present but the
     string is to be split as given by rules 8. or 9., and joining did
     not take place at step 4., any words in the value are joined

[ASIDE: The references to rules 4, 8 and 9 are wrong here; a renumbering
has not been fully propagated into the cross-references.  I believe the
correct references are 5, 16, and 17 respectively.]

     together using the given string or the first character of $IFS if
     none.  Note that the `(F)' flag implicitly supplies a string for
     joining in this manner.

16. _Forced Splitting_
     If one of the `(s)', `(f)' or `(z)' flags are present, or the `='
     specifier was present (e.g. ${=VAR}), the word is split on
     occurrences of the specified string, or (for = with neither of the
     two flags present) any of the characters in $IFS.

17. _Shell Word Splitting_
     If no `(s)', `(f)' or `=' was given, but the word is not quoted
     and the option SH_WORD_SPLIT is set, the word is split on
     occurrences of any of the characters in $IFS.  Note this step, too,
     takes place at all levels of a nested substitution.

22. _Semantic Joining_
     In contexts where expansion semantics requires a single word to
     result, all words are rejoined with the first character of IFS
     between.  So in `${(P)${(f)lines}}' the value of ${lines} is split
     at newlines, but then must be joined again before the P flag can
     be applied.

     If a single word is not required, this rule is skipped.
--- 8< ---

Rule 10, possibly together with the last part of rule 22 (skipped when
a single word is not required), is what causes the behavior on joining.
When IFS is empty, the words are joined at step 10 but NOT split again
at 17, because there is no character to split on, and rule 22 doesn't
matter.  When IFS is non-empty, the words are joined at 10, then split
at 17, and then NOT joined at 22.
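
The join-then-split sequence can be imitated by hand.  What follows is
only a sketch in portable sh, not what zsh does internally: the function
name join_then_split is made up for illustration, and it assumes a POSIX
shell (or zsh invoked as "sh"), since native zsh will not split the
unquoted $joined unless SH_WORD_SPLIT is set.

```shell
# Hypothetical helper: imitate "join at rule 10, split at rule 17".
# Usage: join_then_split SEP word...
join_then_split() {
    sep=$1; shift
    old_ifs=$IFS
    IFS=$sep
    joined="$*"        # join with the first character of IFS (none if empty)
    set -f             # keep the unquoted expansion from globbing
    set -- $joined     # split on the characters of IFS (no split if empty)
    set +f
    IFS=$old_ifs
    printf '<%s>\n' "$@"
}

join_then_split x  "a b" "c   d" gxh   # four fields: <a b> <c   d> <g> <h>
join_then_split "" "a b" "c   d" gxh   # one field:   <a bc   dgxh>
```

With a non-empty separator the original word boundaries come back at the
split, along with extra breaks wherever a word happens to contain the
separator.  With an empty separator nothing can be split, so the joined
string stays a single field.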

My recollection (which may be wrong) is that POSIX leaves unspecified
(or makes implementation-defined) the order in which this occurs, and
requires that the double quotes be used to get defined behavior.
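
For comparison, here is a small sketch of the double-quoted case, which
as far as I know behaves the same in any POSIX shell: "$*" always yields
one field, joined with the first character of IFS, or with nothing at
all when IFS is null.  The variable names are just for illustration.

```shell
set -- "a b" "c   d" gxh
old_ifs=$IFS

IFS=' '; with_space="$*"   # joined with a space
IFS=x;   with_x="$*"       # joined with "x"
IFS=;    with_null="$*"    # joined with no delimiter at all

IFS=$old_ifs
printf '%s\n' "$with_space" "$with_x" "$with_null"
# prints:
#   a b c   d gxh
#   a bxc   dxgxh
#   a bc   dgxh
```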
