Zsh Mailing List Archive

Re: printf %<n>s in UTF-8 is not always POSIX-compliant



On 2012-02-15 00:14:12 -0800, Bart Schaefer wrote:
> On Feb 15,  3:15am, Vincent Lefevre wrote:
> }
> } In UTF-8 locales:
> } 
> } xvii% printf ".%2s.\n" é
> } .é.
> 
> Am I understanding correctly that the intent here is that é is a two-
> byte character so %2s should print the two literal bytes, rather than
> print the single logical character in a field two logical characters
> wide?

Yes, the number is a size in bytes, not in characters; this applies
to both the field width and the precision. I think the intent is to
deal with fixed-layout data (e.g. file formats where some fields
have a fixed or limited size); the same syntax can be used in C to
avoid buffer overflows. Note that the precision has the same problem:

xvii% printf ".%.3s.\n" éabcd
.éab.
xvii% emulate ksh
xvii% printf ".%.3s.\n" éabcd
.éab.
xvii% emulate sh
xvii% printf ".%.3s.\n" éabcd
.éa.

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


