Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)



On 2021-04-22 15:59:34 +0200, Vincent Lefevre wrote:
> I would think that's intentional, at least for the precision
> (e.g. %.4s) in order to prevent buffer overflow.

The behavior with invalid UTF-8 sequences (the example with "\x84\x9d",
i.e. two continuation bytes without the leading "\xe2") is rather ugly:

zira% printf "%3s\n" $(printf "\xe2\x84\x9d") | hd
00000000  20 20 e2 84 9d 0a                                 |  ....|
00000006
zira% printf "%3s\n" $(printf "\x84\x9d") | hd
00000000  20 84 9d 0a                                       | ...|
00000004

zira% printf "%.1s\n" $(printf "\xe2\x84\x9d") | hd
00000000  e2 84 9d 0a                                       |....|
00000004
zira% printf "%.1s\n" $(printf "\x84\x9d") | hd
00000000  84 9d 0a                                          |...|
00000003

I think that only the POSIX behavior (width and precision counted in
bytes) makes sense, unless you consider that %s must only handle valid
characters, in which case it should fail with an error on any invalid
sequence. But I would say that character-based counting should use a
different conversion specifier, as an extension.
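
For comparison, here is a rough sketch of what byte-based padding would
produce for the same inputs. It assumes the byte-counting reading of
the POSIX spec discussed in this thread; byte_len and pad_bytes are
hypothetical helper names made up for illustration, not part of zsh or
any other shell.

```shell
# Hypothetical helper: print the length of $1 in bytes.
# wc -c counts bytes regardless of locale; $(( )) strips the leading
# blanks that some wc implementations emit.
byte_len() {
    echo $(( $(printf '%s' "$1" | wc -c) ))
}

# Hypothetical helper: pad_bytes WIDTH STRING
# Right-justify STRING in a field of WIDTH bytes (not characters),
# emulating a byte-based "%<WIDTH>s".
pad_bytes() {
    width=$1
    str=$2
    len=$(byte_len "$str")
    while [ "$len" -lt "$width" ]; do
        printf ' '
        len=$((len + 1))
    done
    printf '%s' "$str"
}

# Octal escapes are used because "\x" is not portable in printf(1).
pad_bytes 3 "$(printf '\342\204\235')" | od -An -tx1   # valid sequence
pad_bytes 3 "$(printf '\204\235')" | od -An -tx1       # invalid sequence
```

With byte counting, the valid 3-byte sequence gets no padding and the
2-byte invalid one gets a single space, so the result does not depend
on whether the bytes form a valid character.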

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



