Zsh Mailing List Archive

Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode



brian m. carlson wrote on Sat, 06 Jun 2020 16:28 +0000:
> On 2020-06-06 at 04:33:50, Daniel Shahaf wrote:
> > brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:  
> > > On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:  
> > > > On 6/5/20, brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote:  
> > > > > zsh typically runs the final command in a pipeline in the main shell
> > > > > instead of a subshell.  However, POSIX requires that all commands in a
> > > > > pipeline run in a subshell, but permits zsh's behavior as an extension.  
> > > > 
> > > > What POSIX actually says is:
> > > > "each command of a multi-command pipeline is in a subshell
> > > > environment; as an extension, however, any or all commands in a
> > > > pipeline may be executed in the current environment"  
> > 
> > That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.
> > 
> > The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.
> >   
> > > > I.e., it does not say "shall", so it doesn't require a subshell at all, in
> > > > fact it explicitly does permit not using one as you also say. The  
> > 
> > This interpretation is analogous to how conforming C programs must
> > assume neither that «char» is signed nor that it is unsigned.  
> 
> Right.  That term in C is "implementation defined."  POSIX has that term
> as well, and it is not used here.  That term means that the
> implementation may pick a behavior, but must document its choice.
> 
> > The sentence preceding the one you quoted reads:
> > .
> >     Non-standard extensions, when used, may change the behavior of
> >     utilities, functions, or facilities defined by POSIX.1-2017.
> > 
> > I take this to mean non-standard extensions aren't bound by "shall"s.
> > 
> > As to why the passage Mikael quoted doesn't use the word "shall"… well,
> > presumably it doesn't use the word "shall" because it doesn't describe
> > "a feature or behavior that is mandatory"¹.  
> 
> Sure, but if the standard didn't want that behavior to be specified
> somehow, then it wouldn't have mentioned it.  Why wouldn't POSIX have
> just omitted that statement and said nothing about it?

Perhaps because POSIX tries to first describe how an abstract or common
implementation behaves, and then proceeds to describe a set of
alternative behaviours known to be used by some implementations.

For example, IIRC C doesn't specify the signedness of «char» because, at
the time C was standardized, some platforms used signed chars and others
used unsigned chars, and it was desired to make both kinds of platforms
conformant.

> POSIX also says[0] that "[w]hen data is transmitted over the network, it
> is sent as a sequence of octets (8-bit unsigned values)" and "16 and
> 32-bit values can be converted using the htonl(), htons(), ntohl(), and
> ntohs() functions."  I don't think we can argue that POSIX permits one
> to use 8-bit signed values or 9-bit values or that the implementation
> can fail to make those functions work this way just because they didn't
> use "shall".  The word "shall" is omitted (and "is" used) all over the
> shell definitions to describe syntax forms, and one isn't permitted to
> substitute some other syntax form in place of the standard one.

So you're saying that wherever POSIX says "is" it is to be read as
"shall", if I understand correctly?  That's a fair argument, but I'm not
sure whether I agree.

> > > What POSIX does say is that one “shall define an environment in which an
> > > application can be run with the behavior specified by POSIX.1-2017.”
> > > I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> > > that reason.  
> > 
> > What Mikael's saying is that zsh's incumbent behaviour is already
> > POSIX-conforming, but POSIX-conforming implementations have some leeway:
> > have a range of possible behaviours to choose from, just like conforming
> > C compilers can choose what signedness to give to «char».  
> 
> I don't agree.  That behavior is implementation defined, and that has a
> specific meaning.  Certainly implementations can implement additional
> extensions, provided they don't conflict with the behavior specified in
> POSIX.
> 

Could you please clarify what exactly is implementation-defined here,
according to your reading?  What decision in this are implementors
supposed to make for themselves and document for their users?

In any case, our readings of the standards differ.  How can we figure
out what the correct interpretation is?  Is there background information
on Austin Group's bug tracker or mailing lists, for example?  Or can we
just ask them?

> > The passage Mikael quoted specifies that running the last command in
> > a pipeline in a subshell by default is permitted in certain cases,
> > outlined by the phrases "as an extension" and "may".
> > 
> > The definition of "may"¹ says it's used to describe "optional" behaviours,
> > and that conforming applications should tolerate both presence and
> > absence of that behaviour.  
> 
> It says that an "application should not rely on the existence of the
> feature or behavior."  It doesn't say that we can't rely on the absence
> of that feature in a conforming environment.

If "may" describes a feature on whose _absence_ conforming applications
may rely, then what's the difference between "may" and "shall not"?  And
between their respective opposites, "need not" and "shall"?

For example, consider this bit from [2.3.1]: "Implementations also may
provide predefined valid aliases that are in effect when the shell is
invoked."  If conforming applications can rely on the absence of
predefined aliases, that would imply that conforming implementations
must not predefine aliases.

[2.3.1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01

> > To summarize, I don't see why behaviour specified with the phrases "as
> > an extension" and "may" should be off by default in a POSIX-conforming
> > mode.  Would you elaborate on this?  
> 
> Because the behavior materially differs between the behavior specified
> declaratively (albeit without "shall") and the extension.  If we were
> talking about situations where the behavior was a choice between
> producing an error (that is, just failing) and producing a useful
> output, then clearly nobody would care: just don't rely on the program
> failing if you give it the syntax specified in an extension.
> 
> For example, the shell is permitted to recognize additional arithmetic
> expressions as an extension.  It would be permissible for the shell to
> understand the legacy C-style expressions like =* (instead of *=), but
> when in POSIX mode, the following would need to print -4:
> 
>   sh -c 'x=2; : $((x =- 4)); echo $x'
> 
> For behaviors where there is no conflict, such as =*, then we could
> always print 8 here, even in a POSIX mode:
> 
>   sh -c 'x=2; : $((x =* 4)); echo $x'

Thanks.  I understand your argument; not sure yet whether I agree with it.

> > (On the other hand, I'm not sure why they bothered to write the words
> > "as an extension" there.  They don't seem to change the meaning one way
> > or the other.)  
> 
> In general, we have to assume standards authors (and legislators) wrote
> the text for a reason and not to be wasteful with words.  Therefore, we
> should assume there is a relevant difference in meaning.

Agreed.

That's exactly why I questioned whether the behaviour in question was
a "non-standard extension": I was trying to interpret the term
'non-standard' within the phrase 'non-standard extension' as
non-superfluous.

> > Well, perhaps there is something we can do to make their lives easier.
> > 
> > Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
> > flags to help unportable programs.  However, I hesitate to propose
> > adding an option just for this: adding options is always easy to
> > suggest, but not always a good idea.
> > 
> > Since zsh already incorporates a parser for sh scripts, perhaps we could
> > write a tool that automatically adds parentheses to the last element in
> > every pipeline.  That's not such a crazy idea: it already exists (in a
> > much more general form) for C: http://coccinelle.lip6.fr/  
> 
> I think if your goal is for people to change their code to work around
> this when zsh is sh, they will simply not do so, even if that's an
> option, because it doesn't work by default.  In Git alone, there are
> over 240,000 lines of shell between code and tests.  Debian must contain
> tens of millions more.  It's just not going to be achievable to get all
> of those lines changed to work this way.
> 
> If I were to add an option that were off by default for sh and on for
> zsh, then that would meet my needs, and I'd be happy to implement that.
> You seem to be unexcited about that possibility, though.

I'm not sure you understood my point of view precisely.

What I was saying [in my previous message, based on my understanding at
the time, not taking into account your latest reply] was:

- The long-term solution is for people to add parentheses around their
  pipeline elements.

- That solution can be implemented mechanically.

- As a stopgap measure, we can consider enabling the patch's behaviour
  in sh mode _as an opt-in_.
  
  Notwithstanding the opt-in aspect, I'm sure we can figure out a way to
  arrange things so random third party code that runs /bin/sh will be
  served by zsh in sh emulation mode with the patch's behaviour already
  on, if that's what the sysadmin or third-party maintainer wants.
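
To make the first bullet concrete, here is a minimal sketch (not taken
from the patch; the variable names x and y are made up for illustration)
of the difference under discussion, and of how explicit parentheses
remove it:

```shell
# Unparenthesized: the result depends on where the shell runs the
# last pipeline stage.
x=before
echo after | read x
echo "x=$x"   # "before" if `read` ran in a subshell,
              # "after" if it ran in the current shell

# Parenthesized: the last stage is explicitly a subshell, so the
# assignment made by `read` never reaches the calling shell.
y=before
echo after | (read y)
echo "y=$y"   # prints "y=before"
```

A script written in the second form should behave the same whichever
choice the shell makes for the last stage, which is what a mechanical
rewrite would aim for.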

> > > zsh is a very popular interactive shell, and allowing it to be used as a
> > > portable sh on systems where the system sh is less capable would be
> > > really beneficial.  
> > 
> > How would it be beneficial?  
> 
> It's already present on a lot of those systems and it avoids the need to
> build one shell for interactive use and another for portable scripting.
> zsh is also appealing as a portable sh because it has a pleasant
> interactive mode, whereas many sh implementations (e.g., dash) do not.
> 

Thanks.

> > > If your objection is to the wording, I'm happy to revise it to remove
> > > the word "requires", but I do think this provides a lot of benefits for
> > > the sh scripting case while not impacting users who are expecting
> > > different behavior for the zsh case.  
> > 
> > The patch would constitute a backwards-incompatible change to anyone who
> > uses zsh as sh today and relies on the current behaviour of pipelines.  
> 
> The thing is, I don't believe anyone does, except for the possibility of
> macOS[1].

https://en.wikipedia.org/wiki/No_true_Scotsman

> I have tried zsh as sh on Debian and many things are broken
> (including debconf).  I'm not aware of any other supported operating
> systems[2] where a user using zsh as /bin/sh is permitted as an option.

And I'm not aware of any regulars on this list who have symlinked
/bin/sh to zsh independently of their OS vendor's configuration options.

> I should also point out that when people write "emulate sh" that they
> probably very much want to emulate the behavior of /bin/sh on their
> system.

Personally, when I write «emulate sh» I would expect to get, not what
bash does as sh or what dash does as sh, but what POSIX specifies sh
should do.

> I'm not aware of any supported system in existence where the
> default /bin/sh (or the default POSIX sh, when /bin/sh is not
> POSIX-compatible) has the zsh behavior; they all run all pipeline stages
> in a subshell.
> 
> I want to be clear that I don't want to change the behavior of the zsh
> mode, where I agree a change would be undesirable and people are almost
> certainly relying on the current behavior.

Thanks for clarifying this.

> > This might have been acceptable if it were a question of changing
> > a non-conforming behaviour to a conforming behaviour.  However, the
> > current behaviour does appear to be conforming.  
> 
> I'm not in agreement that a shell which provides only zsh's behavior is
> conforming in this case.

Okay, so see above re how to resolve our differing interpretations.

Cheers,

Daniel

> [0] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html
> [1] And macOS users are not relying on this behavior from zsh as sh
>     because bash and dash are also valid sh options.
> [2] That is, operating systems in versions which still receive security
>     support from their vendor.


