Peter Stephenson <p.w.stephenson@xxxxxxxxxxxx> wrote:
> I've still had no luck with Solaris, even Solaris 9.  It didn't help
> that the only UTF-8 locale around was ru_RU.UTF-8, but that isn't the
> basic problem and I don't know what is; it seems that even obvious
> multibyte strings like accented Latin characters aren't being
> recognised, even though all the functions are present, the terminal
> works fine with other multibyte systems, and LANG is set correctly.

On further investigation it seems that when given the first byte of a
multibyte character, mbrtowc() sometimes returns -1 (error) instead of -2
(incomplete).  Reading another byte and then passing both to mbrtowc()
worked.  Sometimes later in the line it works as expected, returning -2 for
an initial byte.  It doesn't seem to be tied to the mbstate_t parameter in
any obvious way; indeed, it wasn't obvious that the parameter was doing
anything at all.
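
For illustration, here is a minimal sketch of the kind of workaround this
suggests: feed the input a byte at a time, and treat a -1 return the same
way as -2 until MB_LEN_MAX bytes have accumulated.  This is not the zsh
code; count_chars_bytewise() and its retries argument are hypothetical
names, and restarting from a zeroed mbstate_t on every call is an
assumption made so that nothing depends on the state surviving a failed
conversion.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of zsh.  Counts the characters that
 * mbrtowc() decodes from buf[0..len) when the bytes arrive one at a
 * time.  If retries is non-NULL, it receives the number of times
 * mbrtowc() returned -1 and we carried on collecting bytes anyway.
 */
size_t count_chars_bytewise(const char *buf, size_t len, size_t *retries)
{
    char pending[MB_LEN_MAX];   /* bytes of the character in progress */
    size_t npending = 0, nchars = 0, i;

    if (retries)
        *retries = 0;

    for (i = 0; i < len; i++) {
        mbstate_t st;
        wchar_t wc;
        size_t ret;

        /* No valid character is longer than MB_LEN_MAX: give up on
         * the accumulated bytes and start again from this one. */
        if (npending == sizeof(pending))
            npending = 0;
        pending[npending++] = buf[i];

        /* Assumption: always convert the whole pending sequence from
         * the initial state rather than trusting a carried-over one. */
        memset(&st, 0, sizeof(st));
        ret = mbrtowc(&wc, pending, npending, &st);

        if (ret == (size_t)-2)
            continue;           /* incomplete: the documented case */
        if (ret == (size_t)-1) {
            /* Reported as an error; treat it as incomplete and retry
             * once the next byte has been appended. */
            if (retries)
                ++*retries;
            continue;
        }
        nchars++;               /* a complete character was decoded */
        npending = 0;
    }
    return nchars;
}
```

The caller needs to have called setlocale(LC_CTYPE, "") first for a UTF-8
locale to take effect.  A non-zero retries count on valid input would
indicate the -1-instead-of-(-2) behaviour described above.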

I had a go at rewriting getrestchar() to look for other queued bytes, but
that wasn't good enough and I haven't yet had a chance to look at why.
From the silence so far it seems like no one has encountered this before.

I wonder if it's some interaction between the library and gcc.

I won't have a chance to get any further before Christmas.  I'll be away
till the 3rd January.

