Zsh Mailing List Archive
Re: zsh/pcre has errors with unicode bytes
정누리 wrote on Mon, 13 Jul 2020 11:53 +0900:
> $ LC_ALL='C'
> $ str='Hi😊'
> $ for (( i = 1; i <= ${#str}; ++i )); do                     
>       byte="$str[i]"                  
>       [[ $byte -pcre-match [a-zA-Z0-9] ]] && echo $byte || echo 'no match'
>   done
> >> H  
>    i
>    zsh: pcre_exec() error [-10]
From /usr/include/pcre.h on my system:
#define PCRE_ERROR_BADUTF8         (-10)  /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF16        (-10)  /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF32        (-10)  /* Same for 8/16/32 */
So pcre expects the subject string to be valid UTF-8, despite the locale.
Actually, wait.  We don't know what the locale is.  I don't build PCRE,
but could you try that again with «export LC_ALL='C'» at the start?
If that doesn't force it to use ASCII, try unsetting the MULTIBYTE
option.  See zpcre_utf8_enabled() (in Src/Modules/pcre.c).
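
For illustration, here is a minimal shell sketch (not from the original thread) of why a byte-wise loop would fail exactly four times on this string, assuming PCRE is validating each subject as UTF-8. The function name classify_bytes is hypothetical, and the string is built from octal escapes so the example does not depend on the file's encoding. It only inspects bytes; it does not call zsh/pcre.

```shell
# Hypothetical helper: print each byte of $1 in hex and say whether that
# single byte is a complete UTF-8 sequence on its own.
classify_bytes() {
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | sed '/^$/d' |
  while read -r b; do
    n=$((0x$b))
    if [ "$n" -lt 128 ]; then
      echo "$b: ASCII, complete by itself"
    elif [ "$n" -lt 192 ]; then
      echo "$b: continuation byte, invalid by itself"
    else
      echo "$b: start of a multibyte sequence, invalid by itself"
    fi
  done
}

# 'Hi' followed by U+1F60A encoded as UTF-8 (f0 9f 98 8a).
str=$(printf 'Hi\360\237\230\212')
classify_bytes "$str"
```

The two ASCII bytes correspond to the 'H' and 'i' matches in the quoted output, and the four remaining bytes line up with the four pcre_exec() errors.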
Cheers,
Daniel
>    no match
>    zsh: pcre_exec() error [-10]
>    no match
>    zsh: pcre_exec() error [-10]
>    no match
>    zsh: pcre_exec() error [-10]
>    no match
> 
> Thanks for reading.