This caught my attention:
static wchar_t
charref(char *x, char *y)
{
    wchar_t wc;
    size_t ret;

    if (!(patglobflags & GF_MULTIBYTE) || !(STOUC(*x) & 0x80))
        return (wchar_t) STOUC(*x);
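Well, this is definitely not valid for an arbitrary multibyte character set:
the shortcut treats any byte with the high bit clear as a complete character
whose wchar_t value equals the byte value. I am just curious whether it is
possible to consistently assume that UTF-8 is in use? That would definitely
simplify things.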
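
For comparison, here is a rough sketch of what a locale-agnostic version of that first step might look like if the high-bit shortcut were dropped and every byte went through mbrtowc(). This is not the zsh code: decode_one() is a hypothetical helper invented for illustration, and the fallback to the raw byte on an invalid sequence is my assumption about what would be wanted there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not zsh's charref()): decode one character from
 * the byte range [s, end) using the current locale's encoding.
 * Stores the character in *out and returns the number of bytes
 * consumed, or 0 if the range is empty.
 */
static size_t
decode_one(const char *s, const char *end, wchar_t *out)
{
    mbstate_t st;
    size_t ret;

    assert(s != NULL && out != NULL);
    if (s >= end)
        return 0;

    /* Start from the initial conversion state on every call. */
    memset(&st, 0, sizeof st);
    ret = mbrtowc(out, s, (size_t)(end - s), &st);

    if (ret == (size_t)-1 || ret == (size_t)-2) {
        /* Invalid or incomplete sequence: assumed fallback to the raw byte. */
        *out = (wchar_t)(unsigned char)*s;
        return 1;
    }
    if (ret == 0) {
        /* Decoded L'\0'; assumed here to occupy a single byte. */
        return 1;
    }
    return ret;
}
```

The price is one mbrtowc() call per character even for plain ASCII, which is presumably what the shortcut is there to avoid. Resetting the state on every call also means this sketch would still be wrong for stateful encodings; handling those would need the mbstate_t carried along with the string position.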