Zsh Mailing List Archive

zsh/pcre has errors with unicode bytes



Hi,

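There appears to be a bug in the zsh/pcre module of the current release (5.8) when it is handed the individual bytes of a multibyte (UTF-8) character.
With the locale set to 'C', processing a Unicode string byte by byte gives the results below. The byte values are as expected, and -regex-match simply fails to match the non-ASCII bytes, but -pcre-match reports a pcre_exec() error for each of them: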

$ LC_ALL='C'
$ str='Hi😊'
$ for (( i = 1; i <= ${#str}; ++i )); do                     
      byte="$str[i]"
      ord=$(( [##16] #byte ))                           
      echo $ord
  done
>> 48
   69
   F0
   9F
   98
   8A
$ for (( i = 1; i <= ${#str}; ++i )); do                     
      byte="$str[i]"                  
      [[ $byte -regex-match [a-zA-Z0-9] ]] && echo $byte || echo 'no match'
  done
>> H
   i
   no match
   no match
   no match
   no match
$ for (( i = 1; i <= ${#str}; ++i )); do                     
      byte="$str[i]"                  
      [[ $byte -pcre-match [a-zA-Z0-9] ]] && echo $byte || echo 'no match'
  done
>> H
   i
   zsh: pcre_exec() error [-10]
   no match
   zsh: pcre_exec() error [-10]
   no match
   zsh: pcre_exec() error [-10]
   no match
   zsh: pcre_exec() error [-10]
   no match
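
For reference, the byte values printed by the first loop can be cross-checked outside zsh's own arithmetic. The following is only a minimal sketch, assuming a POSIX od, tr and sed are available and that the string is UTF-8 encoded; hex_bytes is a hypothetical helper name, not part of zsh. The last four bytes (F0 9F 98 8A) are the UTF-8 encoding of the emoji, and none of them is a valid UTF-8 sequence on its own. That would be consistent with error -10, which in the PCRE 8.x headers is PCRE_ERROR_BADUTF8, although this report does not establish why the module would be validating UTF-8 under the 'C' locale.

```shell
# hex_bytes (hypothetical helper): print each byte of its argument
# as two uppercase hex digits, one byte per line.
hex_bytes() {
    printf '%s' "$1" |
        od -An -v -tx1 |          # raw bytes as lowercase hex, no offsets
        tr -s ' \n' '\n\n' |      # put every hex pair on its own line
        tr 'a-f' 'A-F' |          # match the uppercase style of [##16]
        sed '/^$/d'               # drop the empty line left by od's padding
}

# The emoji is written with octal escapes so the result does not
# depend on how this file itself is encoded.
hex_bytes "$(printf 'Hi\360\237\230\212')"
# prints 48, 69, F0, 9F, 98, 8A, one per line
```

Because it goes through od rather than the shell's character handling, the output does not depend on the LC_ALL setting of the calling shell.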

Thanks for reading.

