Zsh Mailing List Archive
Re: expr length "$val" returns the wrong length for values containing NULL (\\0)
- X-seq: zsh-workers 37372
- From: D Gowers <finticemo@xxxxxxxxx>
- To: "Nikolay Aleksandrovich Pavlov (ZyX)" <kp-pav@xxxxxxxxx>
- Subject: Re: expr length "$val" returns the wrong length for values containing NULL (\\0)
- Date: Thu, 10 Dec 2015 15:30:03 +1030
- Cc: "zsh-workers@xxxxxxx" <zsh-workers@xxxxxxx>
- In-reply-to: <1926681449721747@web1m.yandex.ru>
- List-help: <mailto:zsh-workers-help@zsh.org>
- List-id: Zsh Workers List <zsh-workers.zsh.org>
- List-post: <mailto:zsh-workers@zsh.org>
- Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
- References: <CAMf8R07=5LKcg3f6VaFDKi9TBt855=t9J7tzqhDjQihB2ftEmg@mail.gmail.com> <2007121449719799@web8h.yandex.ru> <CAMf8R04CsWtB39TVW-0VHhu9dcfPNYyi6gLXi-QhrF8Vp7nRLQ@mail.gmail.com> <1926681449721747@web1m.yandex.ru>
I am aware of the prevalence of NUL-terminated strings, since I've coded in
C in the past; that's why I wrote 'considerable bother to fix it'.
Nevertheless, for a purpose such as argument passing, size + data is
clearly better (easier to secure and more flexible).
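
As a rough illustration of the point about argument passing: a pipe carries
raw bytes and is not NUL-terminated the way an argv entry is, so a value with
an embedded NUL survives when it is streamed instead of passed as an
argument. This is only a sketch; the helper name count_bytes is hypothetical
and does not come from this thread.

```shell
# Count the bytes arriving on stdin. Unlike an argv string, stdin is a plain
# byte stream, so an embedded NUL does not end the data early.
count_bytes() {
    # wc -c counts raw bytes; tr strips the padding some wc versions add
    wc -c | tr -d ' '
}

# printf's format string turns \0 into a real NUL byte: f o o NUL b a r
printf 'foo\0bar' | count_bytes
```

All seven bytes, including the NUL, reach wc this way, because the data never
has to fit into a NUL-terminated argument.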
On Thu, Dec 10, 2015 at 2:59 PM, Nikolay Aleksandrovich Pavlov (ZyX) <
kp-pav@xxxxxxxxx> wrote:
> 10.12.2015, 07:18, "D Gowers" <finticemo@xxxxxxxxx>:
> > Ah, okay. That (commandline arguments not being able to contain NUL)
> > seems.. a bit anachronistic. But I guess it's never been enough of a
> > problem to warrant the considerable bother to fix it. Fair enough.
>
> This has nothing to do with the commandline itself. In the very early
> days it was decided that strings would be NUL-terminated (instead of e.g.
> being structs with size_t size and char *data), and this convention sneaked
> into many parts of many standards. If you write C code, you will have
> problems when dealing with NUL-terminated strings, because every library
> function that accepts something other than a void* pointer to “generic
> data” assumes that a string terminates with NUL. Projects like zsh or
> almost every programming language have to write their own string
> implementations: in zsh it is C strings with escaped characters, in most
> other cases it is a length+data pair.
>
> The exec* function family, which is used to launch programs, follows the
> NUL convention, and so does main() on the other side, which accepts
> NUL-terminated strings. So you cannot really do anything to fix this:
> replacing one of the core conventions is *very* expensive, especially
> since you must do it in a backward-compatible way.
>
> > On Thu, Dec 10, 2015 at 2:26 PM, Nikolay Aleksandrovich Pavlov (ZyX) <kp-pav@xxxxxxxxx> wrote:
> >> 10.12.2015, 04:52, "D Gowers" <finticemo@xxxxxxxxx>:
> >>> Test case:
> >>>
> >>> v=$(printf foo\\0bar);expr length "$v";expr length $v
> >>>
> >>> alternatively:
> >>>
> >>> v=foo$'\0'bar;expr length "$v";expr length $v
> >>>
> >>> In zsh, the values returned are 3 and 3.
> >>> In dash and bash, the values returned are 6 and 6.
> >>>
> >>> Both of those results are wrong, AFAICS (foo$'\0'bar is 7 characters
> >>> long).
> >>> But the zsh result is more severely wrong. I could understand the
> >>> bash/dash result, at least, as 'NULL characters are not counted
> >>> towards length'.
> >>
> >> Both results are *right*. In both cases you ask the length of the
> >> string and you get it.
> >>
> >> In dash (also posh, bash and busybox ash) the zero byte is skipped when
> >> storing. So the length of $v *is* six. You may question whether storing
> >> without the zero byte is right, but the fact that all four shells have
> >> exactly the same behaviour makes me think this is part of the POSIX
> >> standard. In any case, non-C strings are not on the list of features of
> >> these shells, unlike zsh (it also internally uses C NUL-terminated
> >> strings, but zero bytes and some other characters are “metafied” (i.e.
> >> escaped) and unmetafied when passed to the outer world, e.g. by doing
> >> `echo $v` to pass the string to the terminal).
> >>
> >> As I said, in zsh the zero byte is stored. But C strings, which are the
> >> only ones that can be arguments to any program, are **NUL-terminated**.
> >> So what you do is pass the string "foo", because NUL terminates the
> >> string. Thus you cannot possibly get the answer you think is right here,
> >> unless you reimplement `expr` as a zsh function.
> >>
> >>>
> >>> In any case, it is easily demonstrated that the string is not 3
> >>> characters long, by running 'echo "$v"' or 'print "$v"' or
> >>> 'echo ${#v}'
> >>>
> >>> `zsh --version` = 'zsh 5.2 (x86_64-unknown-linux-gnu)'
>
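
The distinction drawn in the thread, between what the shell stores and what
an external program receives, can be sketched by measuring the length inside
the shell, where no exec and therefore no NUL-terminated argv copy is
involved. The function name strlen is hypothetical, and the per-shell results
in the comments are the ones reported in the messages above, not freshly
measured.

```shell
# Length as the shell itself sees it. Calling a shell function does not go
# through exec, so the argument is not cut at the first NUL byte.
strlen() {
    printf '%s\n' "${#1}"
}

v=$(printf 'foo\0bar')

# Per the thread: zsh keeps the NUL and should report 7 here, while
# bash and dash drop the NUL when storing and report 6.
strlen "$v"
```

Passing the same "$v" to an external program such as expr is what shortens
the zsh value to "foo": the truncation happens at the exec boundary, not in
the variable.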