Zsh Mailing List Archive
Re: In POSIX mode, ${#var} measures length in bytes, not characters
ZyX wrote on 07-06-15 at 02:34:
> Do you have a reference where “character” is defined?
Yes:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_02
POSIX specifically allows any character encoding, including multibyte
characters, depending on the user's locale, and on the condition that
the portable character set (basically US-ASCII) is a subset of the
locale's character set.
Now that UTF-8 is the de facto standard encoding, and it includes
multibyte characters, it has become important for shells to get this right.
> This behaviour is the same in posh and dash:
Yes, dash and pdksh/mksh/posh unfortunately have this bug, too.
But bash, ksh93, and yash correctly measure characters, not bytes. (yash
is supposed to be the most POSIX-compliant of them all.)
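
To make the difference concrete, here is a minimal sketch in POSIX sh,
not a definitive test. The helper names byte_len and shell_len are made
up for this illustration. It builds "é" (U+00E9) from its two UTF-8
bytes and compares the byte count from wc -c with whatever the running
shell reports for ${#...}. The second number depends on the shell and on
the current locale, so no fixed result is claimed for it.

```shell
#!/bin/sh
# Build "é" (U+00E9) from its two UTF-8 bytes, so the encoding of this
# script file itself does not matter.
s=$(printf '\303\251')

# byte_len (hypothetical helper): wc -c counts bytes regardless of locale.
# tr strips the padding spaces that some wc implementations print.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# shell_len (hypothetical helper): the length as the running shell sees it.
# Whether this is a character count or a byte count is exactly the
# shell-dependent behaviour under discussion.
shell_len() {
    printf '%s\n' "${#1}"
}

printf 'bytes via wc -c: %s\n' "$(byte_len "$s")"
printf 'length via ${#}: %s\n' "$(shell_len "$s")"
```

The first line always reports 2. If the second line also reports 2 while
a UTF-8 locale is in effect, the shell is measuring bytes rather than
characters.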
Thanks,
- Martijn