Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: utf-8




18.12.2014, 21:41, "Ray Andrews" <rayandrews@xxxxxxxxxxx>:
> On 12/18/2014 10:05 AM, ZyX wrote:
>> It is permitted at least in variable and function names: though I cannot find anything relevant in
> ...
>
> Seems I can use unicode 'one way' but not the other:

You are missing the main point. Identifiers consist of the characters for which `iswalnum` is true. (As an implementation detail, zsh uses its own internal equivalent for ASCII characters, so glibc has no chance to claim that U+0041 LATIN CAPITAL LETTER A is not an alphanumeric character, which the manual page says it must not do in any locale anyway, or that U+003D EQUALS SIGN is one.) “☠” is U+2620 SKULL AND CROSSBONES, which does *not* have Unicode category “Letter” or “Number” and thus cannot be used in an identifier. To use it in an identifier you would have to create a custom libc locale (or even a custom libc) that returns true for `iswalnum(0x2620)`.

This is the usual behaviour for many languages that allow Unicode identifiers: they use Unicode character classes to decide which code points may form an identifier and which may not.

>
>> $ howdy=☠
>>
>> $ echo $howdy
>> ☠
>>
>> $ ☠=howdy
>> zsh: command not found: ☠=howdy
>>
>> $ var☠=howdy
>> zsh: command not found: var☠=howdy
>
> multibyte is on, all 'posix*' options are off.

Try testing with something like `ПЕРЕМЕННАЯ` (Russian translation of “VARIABLE”) or `αβγ` (first three Greek letters). They do work, at least on my system.
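
If you want to check a candidate name without involving zsh, here is a rough sketch in Python that approximates the rule by Unicode general category. It is only an approximation: zsh asks libc's `iswalnum` in the current locale, which need not match the Unicode categories exactly, and the helper names `is_identifier_char` and `first_bad_char` are made up for this illustration. It also ignores the rule that a name may not start with a digit.

```python
import unicodedata


def is_identifier_char(ch: str) -> bool:
    """Approximate iswalnum(): true for Unicode category Letter (L*) or Number (N*)."""
    return unicodedata.category(ch)[0] in ("L", "N")


def first_bad_char(name: str):
    """Return the first character that fails the check, or None if all pass.

    Underscore is accepted as well, since shell identifiers allow it.
    """
    for ch in name:
        if ch != "_" and not is_identifier_char(ch):
            return ch
    return None


if __name__ == "__main__":
    for name in ("howdy", "ПЕРЕМЕННАЯ", "αβγ", "☠", "var☠"):
        bad = first_bad_char(name)
        if bad is None:
            print(f"{name}: ok")
        else:
            print(
                f"{name}: rejected at U+{ord(bad):04X} "
                f"{unicodedata.name(bad)} (category {unicodedata.category(bad)})"
            )
```

Under that approximation the Cyrillic and Greek names pass, while anything containing U+2620 is rejected because its category is “So” (Symbol, other), which matches the `command not found` results quoted above.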


