Zsh Mailing List Archive
printf %s in UTF-8 is not POSIX-compliant
- X-seq: zsh-workers 24676
- From: Vincent Lefevre <vincent@xxxxxxxxxx>
- To: zsh-workers@xxxxxxxxxx
- Subject: printf %s in UTF-8 is not POSIX-compliant
- Date: Tue, 4 Mar 2008 02:29:17 +0100
- Mail-followup-to: zsh-workers@xxxxxxxxxx
- Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm
Hi,
Under UTF-8 locales:
vin:~> zsh-beta -f
vin% emulate sh
vin% printf ".%2s.\n" é
. é.
vin% /usr/bin/printf ".%2s.\n" é
.é.
vin%
As you can see, the zsh printf builtin doesn't behave like the
coreutils printf, and it is zsh that is wrong. Indeed, POSIX counts
the field width and the precision in bytes, not in characters.
http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html
says (in the extended description) that the "file format notation"
shall be used for the format (and %s isn't an exception).
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap05.html
(file format notation) says:
s
The argument shall be taken to be a string and bytes from the
string shall be written until the end of the string or the number
of bytes indicated by the precision specification of the argument
is reached. If the precision is omitted from the argument, it
shall be taken to be infinite, so all bytes up to the end of the
string shall be written.
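To make the byte-based counting concrete, here is a minimal sketch in POSIX sh. It assumes the file format notation pads a short value up to the field width in bytes, as it does for the precision quoted above. The helper names byte_len and pad_bytes are hypothetical and exist only for this illustration; they are not part of zsh, coreutils, or POSIX.

```shell
# byte_len STRING: print the number of bytes in STRING.
# wc -c counts bytes regardless of the locale.
byte_len() {
    # Some wc implementations pad their output with spaces; strip them.
    printf '%s' "$1" | wc -c | tr -d ' '
}

# pad_bytes WIDTH STRING: right-justify STRING in a field of WIDTH bytes,
# which is what a byte-counting printf does for "%<WIDTH>s".
pad_bytes() {
    width=$1
    len=$(byte_len "$2")
    while [ "$len" -lt "$width" ]; do
        printf ' '
        len=$((len + 1))
    done
    printf '%s' "$2"
}

# "é" written as its two UTF-8 bytes, so the script does not depend
# on the encoding of the file it is saved in.
e_acute=$(printf '\303\251')

byte_len "$e_acute"                           # two bytes
printf '.%s.\n' "$(pad_bytes 2 "$e_acute")"   # already 2 bytes wide: no padding
printf '.%s.\n' "$(pad_bytes 2 a)"            # 1 byte wide: one space of padding
```

Counted this way, a two-byte "é" already fills a field of width 2, which matches the /usr/bin/printf output in the transcript; a printf that counts characters sees only one and adds a space.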
Note: ksh93 has the same bug; pdksh and bash do not. However, bash
may change its behavior when not in POSIX compatibility mode; see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=459413
--
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)