Thanks Lawrence for the comparison. It's a bit disappointing to see that Zsh has the worst record :-(
Let me try to illustrate with a more concrete example why I think Zsh's behavior could/should be improved. Consider that a first developer wrote the following code:
#!/bin/zsh -e
# ...
function backup() {
  scp $1 $BACKUP_SERVER:
  echo $1 >> $BACKUP_LOG
}
# ...
backup $file; rm $file
It's arguably not the greatest code but assuming that it's only ever run with ERR_EXIT enabled, it behaves correctly. If "scp" fails, ERR_EXIT is triggered and nothing is logged or deleted.
Some time later a second developer comes along, sees the code "backup $file; rm $file", and thinks that this looks dangerous; the file should only be deleted if the backup was successful. Therefore they change the code to "backup $file && rm $file". Unfortunately, this has exactly the opposite of the intended effect. Because "backup" is now the left-hand side of "&&", ERR_EXIT is no longer enforced for anything executed inside it. From then on, if "scp" fails, ERR_EXIT is no longer triggered, the function carries on, and the file is logged and deleted.
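The effect of the "&&" change can be reproduced without a backup server. The following is only a minimal sketch: the "backup" function here is a hypothetical stand-in in which "false" plays the role of a failing "scp" and "echo" plays the role of the log entry; "demo" is a made-up file name.

```shell
# Hypothetical stand-in for the "backup" function above:
# "false" simulates a failing scp, "echo" simulates the log entry.
backup() {
  false
  echo "logged $1"
}

# Run the "fixed" call in a subshell with ERR_EXIT enabled.
# Because "backup" is the left operand of "&&", the failing "false"
# does not stop the function, which then returns the status of "echo".
out_and=$( set -e; backup demo && echo "deleted demo" )

echo "$out_and"
```

Both the "logged" and the "deleted" lines are printed, even though the simulated copy failed.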
On Tue, Nov 8, 2022, at 12:36 AM, Bart Schaefer wrote:
ERR_EXIT kicks in only when the result is not otherwise checked.
Awesome! That's exactly how I would like Zsh to behave. To be slightly more precise, I would formulate it as follows:
ERR_EXIT kicks in if and only if the result is not otherwise checked.
Unfortunately there are several cases where Zsh doesn't behave like that. For example, consider the following commands:
{ false; true } || true; echo $?
if false; true; then echo 0; else echo 1; fi
Both commands print "0". In both commands "false" is part of a condition. However, in both commands, "false" does NOT influence the result of the condition. In fact, in both commands, the result of "false" is NOT checked; replacing it with "true" leads to the exact same result. Therefore, given the specification above, ERR_EXIT should be triggered by "false" in both commands.
In the commands above the problem is that Zsh never triggers ERR_EXIT. Apparently, when Zsh starts evaluating a condition, it stops enforcing the ERR_EXIT option for the whole evaluation of the condition, even if the condition contains commands whose result is not checked, like the "false" commands in the examples above. This is also true if the commands are nested in called functions. The following code also prints "0" instead of exiting after the "false" command.
function foo() { false; true }
foo || true; echo $?
There are other cases where ERR_EXIT is triggered but fails to propagate. A major offender in that regard is the following code:
local var=$(false); echo $?
In this case "false" triggers an ERR_EXIT but it only exits the sub-shell of the command substitution. For some reason, the local variable assignment ignores the exit status of the command substitution and always returns a zero exit status. Therefore the main shell does NOT exit and the command prints "0".
Interestingly, in the following almost identical code, "false" triggers an ERR_EXIT that also exits the main shell:
local var; var=$(false); echo $?
However, having to systematically use this style is rather cumbersome. Furthermore it's not even foolproof. Indeed, if there are multiple command substitutions, the assignment returns the exit status of the last one. Thus, the following command does NOT exit and prints "0":
local var; var=$(false)$(true); echo $?
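One way to keep every result checked is to give each command substitution its own assignment. The sketch below is only an illustration: "first_part" and "second_part" are hypothetical helpers, and it runs without ERR_EXIT so that the statuses can be printed instead of ending the shell.

```shell
# Hypothetical helpers: the first one produces output but fails.
first_part()  { echo "A"; return 1; }
second_part() { echo "B"; }

# Combined form: the assignment reports the status of the LAST
# substitution only, so the failure of first_part goes unnoticed.
assign_combined() {
  local var
  var=$(first_part)$(second_part)
  echo "status=$? var=$var"
}

# Separate form: each substitution has its own assignment, so each
# status can be checked (or left to ERR_EXIT) individually.
assign_separately() {
  local a b status_a=0
  a=$(first_part) || status_a=$?
  b=$(second_part)
  echo "status_a=$status_a var=$a$b"
}

assign_combined
assign_separately
```

With ERR_EXIT enabled and without the explicit "||" check, the plain assignment "a=$(first_part)" would be expected to exit the shell, as in the "var=$(false)" case above. The price is one extra variable and one extra line per substitution.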
Since I really want Zsh to behave as described above, I implemented zabort, which configures a ZERR trap to exit the current shell and all parent shells whenever an ERR_EXIT is triggered. It also prints a stack trace leading to the command that failed. This fixes the problem for all cases where the ERR_EXIT isn't propagated, like in the variable assignments above. However, the problem of the conditional expressions remains because in that case no ERR_EXIT (and no ZERR trap) is ever triggered. Fixing this seems only possible by changing the implementation of Zsh.
Here is an example using zabort:
#!/bin/zsh
. zabort.zsh
function log() { echo $@ 1>&2 }
function f1() { false; log f1 }
function f2() { : $(f1); log f2 }
function f3() { local v3=$(f2); log f3 }
function f4() { v4=$(f3)$(true); log f4 }
function f5() { f4; log f5 }
f5
And here is what it prints:
Command unexpectedly exited with the non-zero status 1.
at abort-example.zsh:7(abort)
at abort-example.zsh:8(f1)
at abort-example.zsh:9(f2)
at abort-example.zsh:10(f3)
at abort-example.zsh:11(f4)
at abort-example.zsh:13(f5)
Is there any chance that Zsh could be changed to more closely follow the specification above?
I'm mainly interested in a fix for the conditional expressions but fixes for the other issues would also be nice. It would be awesome if "zsh -e" behaved as specified above in all cases.
If needed, I could look into implementing some of the fixes myself. However, before I invest time in this, I would prefer to know whether you would be open to such changes.
Philippe