Zsh Mailing List Archive

Re: bug report : printf %.1s outputting more than 1 character



On Tue, Mar 14, 2023 at 7:40 PM Jason C. Kwan <jasonckwan@xxxxxxxxx> wrote:
>
> I'm using the macOS 13.2.1 OS-provided zsh, version 5.8.1, which I understand isn't the latest and greatest of 5.9, so perhaps this bug has already been addressed.

A related case has been addressed by declaring it an intentional
divergence from POSIX; see
https://www.zsh.org/mla/workers/2022/msg00240.html

However ...

> In the 4-byte sequence as seen below ( defined via explicit octal codes ), under no Unicode scenario should 4 bytes be printed out via a command of printf %.1s, by design.
>
>  - The first byte of \377 \xFF is explicitly invalid under UTF-8 (even allowing up to 7-byte in the oldest of definitions).

This triggers a branch of the printf code introduced by this comment:
    /*
     * Invalid/incomplete character at this
     * point.  Assume all the rest are a
     * single byte.  That's about the best we
     * can do.
     */

Thus, you've deliberately invoked a case where zsh's response to
invalid input is to punt.  This dates back to the original
implementation in workers/23098,
https://www.zsh.org/mla/workers/2007/msg00019.html, January 2007.
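To make the "punt" concrete, here is a rough sketch in C of that kind of
fallback. It is not the zsh source: bytes_for_precision and utf8_seq_len are
hypothetical names, the UTF-8 check is a simplified hand-rolled one rather
than the locale's multibyte routines, and the "give up and take the whole
remainder" policy is only my reading of the comment above, chosen because it
reproduces the shape of the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length in bytes of the UTF-8 sequence starting at s (n bytes available),
 * or 0 if it is invalid or incomplete.  Simplified on purpose: overlong
 * forms and surrogates are not rejected.
 */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        len = 1;
    else if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;       /* 0x80..0xC1 and 0xF5..0xFF never start a sequence */

    if (len > n)
        return 0;       /* incomplete */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* not a continuation byte */
    return len;
}

/*
 * Hypothetical helper: number of bytes to emit for "%.<prec>s" when the
 * precision counts characters rather than bytes.
 */
static size_t bytes_for_precision(const char *str, size_t prec)
{
    const unsigned char *s;
    size_t total, off = 0, chars = 0;

    assert(str != NULL);
    s = (const unsigned char *)str;
    total = strlen(str);

    while (off < total && chars < prec) {
        size_t len = utf8_seq_len(s + off, total - off);

        if (len == 0)
            return total;   /* punt: stop counting, emit everything left */
        off += len;
        chars++;
    }
    return off;
}
```

With this sketch, a valid two-byte character followed by "abc" and a
precision of 1 yields 2 bytes, while a made-up input of \377 followed by
"abc" yields all 4 bytes, because the very first byte is invalid and the
helper stops counting characters at that point.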



