Zsh Mailing List Archive

'<<-' here-documents oddity with line continuation



zsh has an oddity with here-documents using the '<<-' operator.

(Note: below, <tab> represents a tab character, not the literal string
'<tab>'.)

POSIX says:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04
| If the redirection operator is "<<-", all leading <tab> characters
| shall be stripped from input lines and the line containing the
| trailing delimiter.

In a construct like

	cat <<-EOF
<tab>	one \
<tab>	two
<tab>	EOF

where the newline after "one \" is backslash-escaped (line
continuation), zsh outputs

one two

whereas all other shells (bash, dash, *ksh, yash, etc.) output

one <tab>two

Superficially, it looks like zsh is the only shell that actually
complies with POSIX, as it strips the leading <tab> characters from all
lines in the here-document, including lines that follow a line ending in
a backslash.

However, line continuation in POSIXy shells is parsed at a very early
stage, even before token recognition:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02_01
| A <backslash> that is not quoted shall preserve the literal value of
| the following character, with the exception of a <newline>. If a
| <newline> follows the <backslash>, the shell shall interpret this as
| line continuation. The <backslash> and <newline> shall be removed
| before splitting the input into tokens. Since the escaped <newline>
| is removed entirely from the input and is not replaced by any white
| space, it cannot serve as a token separator.

(One funny effect of this: reserved words such as 'while' or 'select'
are not recognised if any part of them is quoted, but they can still be
split over multiple lines using line continuation!)
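
A minimal sketch of that effect, assuming a POSIX-style shell (the
variable name split_result is only for illustration). The reserved word
'if' is split over two physical lines, yet it is still recognised
because the backslash-newline pair is removed before tokenisation:

```shell
# 'if' is split across two physical lines. The backslash-newline is
# removed before the input is split into tokens, so the shell still
# sees the reserved word 'if'.
i\
f true; then
    split_result=recognised
fi
echo "$split_result"
```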

So it would seem logical that the definition of "input line" used by
POSIX for here-documents is based on lines resulting *after* parsing
line continuation. That would then keep the <tab>s from being stripped
from "continued" lines.
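
For contrast, here is a sketch of the case where no line continuation
takes place. With a quoted delimiter (<<-'EOF') the here-document body
is not expanded, so the backslash and the newline stay literal and there
really are two input lines; both should then have their leading <tab>
stripped, and I would expect all shells to agree. The variable name
'quoted' is only for illustration, and the eval construction is used so
that the tabs survive copy and paste:

```shell
tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
# Quoted delimiter: backslash-newline is NOT line continuation here, so
# the here-document consists of two separate input lines.
eval "quoted=\$(cat <<-'EOF'${lf}${tab}1\\${lf}${tab}2${lf}${tab}EOF${lf})"
# Expected value: '1\', a newline, then '2' (no tabs left).
printf '%s\n' "$quoted"
```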

Here's a quick test script (compatible with all POSIX shells). It
outputs "zsh" on zsh and "ok" on all other shells.

tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
eval "foo=\$(cat <<-EOF${lf}${tab}1\\${lf}${tab}2${lf}${tab}EOF${lf})"
case $foo in
( 1${tab}2 ) echo ok ;;
( 12 )       echo zsh ;;
( * )        echo NEWBUG ;;
esac
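
To see the actual output rather than just a verdict, the following
sketch builds the same here-document as in the "one \ / two" example
above and prints the result with every tab shown as the literal string
'<tab>'. The variable names 'out' and 'visible' are only for
illustration; which of the two outputs you get depends on the shell:

```shell
tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
# Same here-document as in the example, built with eval so that the
# tabs survive copy and paste.
eval "out=\$(cat <<-EOF${lf}${tab}one \\${lf}${tab}two${lf}${tab}EOF${lf})"
# Replace each tab by the visible string '<tab>'.
visible=$(printf '%s\n' "$out" | sed "s/${tab}/<tab>/g")
printf '%s\n' "$visible"
```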

Since zsh's behaviour looks sensible on the face of it, I'm reluctant to
call it a bug, but it is certainly an incompatibility and seems to be
non-compliant with POSIX. Maybe something to fix in emulation?

Thanks,

- M.
