Zsh Mailing List Archive

Re: Better handling of wide glyphs (ask the terminal, not wcwidth)

X-seq: zsh-workers 39851
From: Daniel Hahler <genml+zsh-workers@xxxxxxxxxx>
To: "zsh-workers@xxxxxxx >> Zsh Hackers' List" <zsh-workers@xxxxxxx>
Subject: Re: Better handling of wide glyphs (ask the terminal, not wcwidth)
Date: Mon, 7 Nov 2016 02:09:28 +0100
In-reply-to: <161105153708.ZM19128@torch.brasslantern.com>
References: <3126f405-b1a0-b29c-df2b-a3376aabb702@thequod.de> <161105153708.ZM19128@torch.brasslantern.com>

On 05.11.2016 23:37, Bart Schaefer wrote:
> On Nov 5, 11:04pm, Daniel Hahler wrote:
> }
> } This method is then provided as a shared object, which allows it to be
> } LD_PRELOADed (overriding wcwidth and wcswidth). In this case Zsh
> } will use the same method, and everything is fine!
> }
> } But this shows that there is a problem between Zsh and the terminal,
> 
> Just to clarify, you mean the problem "between Zsh and the terminal" is
> present even *with* this LD_PRELOAD?

No.  With the LD_PRELOAD it works as expected.  But e.g. Vim already
copes better when it and the terminal disagree (Vim does not use
wcwidth(3) by default).

But I thought that this LD_PRELOAD hack might not be necessary after all.

> } So I wondered if Zsh could be smarter even without the custom
> } wcwidth(3) in LD_PRELOAD: there is CSI 6 n ('\e[6n'), which can be
> } used to ask the terminal about the current position.
>
> This is error-prone (network inefficiency/inconsistency may cause it to
> fail) and in most cases zsh internals will be asking for the width of
> a character that isn't on the screen yet at all, or at least is not in
> the position where the cursor is located.
>
> So for this to work we'd have to move the cursor to an innocuous spot
> (already difficult enough with terminal variances), print CSI 6 n, read
> the position, print the character we care about, print CSI 6 n again,
> read again, and finally erase what we just did (with no way to put
> back what was overwritten), all while hoping that the network didn't
> glitch on us in the meantime.  That's a lot of round trips to the
> terminal for what might be inside a loop over a long string.

I see, thanks for your explanation.  I was not taking network traffic
into account at all.

However, this would only be necessary for some special chars, and the
result could then be cached internally, although the terminal might
change its answer, e.g. when the font gets changed.

Where would I have to look / poke to do this for the prompt and ZLE only?
There it should be mostly about chars that are about to be displayed,
and in this case the "painting in an innocuous spot" is not required at
all (given that those chars are displayed one by one).
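
To make the idea concrete, here is a minimal sketch of the "ask the terminal, then cache" approach, written in Python rather than as Zsh code.  The names WidthCache, parse_cpr, write and read_reply are made up for illustration; only the CSI 6 n query and the "CSI row ; col R" reply format come from the terminal side.  It assumes the caller supplies one function to send bytes to the terminal and one to read the raw reply.

```python
import re

CPR_QUERY = b"\x1b[6n"                           # CSI 6 n: report cursor position
CPR_REPLY = re.compile(rb"\x1b\[(\d+);(\d+)R")   # reply: CSI row ; col R


def parse_cpr(reply):
    """Return (row, col) from a cursor position report, or None if absent."""
    m = CPR_REPLY.search(reply)
    return (int(m.group(1)), int(m.group(2))) if m else None


class WidthCache:
    """Hypothetical helper: measure each glyph once, then reuse the result."""

    def __init__(self, write, read_reply):
        self.write = write            # callable taking bytes (assumption)
        self.read_reply = read_reply  # callable returning reply bytes (assumption)
        self.widths = {}

    def _column(self):
        # One round trip to the terminal per call.
        self.write(CPR_QUERY)
        pos = parse_cpr(self.read_reply())
        if pos is None:
            raise RuntimeError("no cursor position report received")
        return pos[1]

    def width(self, ch):
        # Only glyphs not seen before cost two round trips.
        if ch not in self.widths:
            before = self._column()
            self.write(ch.encode("utf-8"))
            after = self._column()
            self.widths[ch] = after - before
        return self.widths[ch]

    def invalidate(self):
        # To be called when the font may have changed.
        self.widths.clear()
```

The sketch ignores line wrapping at the right margin, and on a real tty the reply can only be read reliably with the terminal in raw mode.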

> } What do you think?
> 
> I think unicode glyphs have been allowed to go entirely overboard.  I
> blame Sirius Cyberne -- er, I mean, Apple.

It's also a lot about Powerline, FontAwesome and its variants.  I agree
however that there are two worlds colliding and that it is difficult to
solve this using fixed tables of character widths, especially for
codepoints in the private use area.

I was using a hack with rxvt-unicode before already, which basically
required you to add spaces after wide glyphs.
A new approach is the one described here, which handles them as wide
chars internally, based on the result from the Xft font.  (The code is
at https://github.com/exg/rxvt-unicode/compare/master...blueyed:wcwidth-hack).

> A zsh module that reads glyph widths from a config file might be a way
> to approach this, plus a utility to generate such a configuration from
> the terminal -- sort of a termcap library for glyphs.

One of my initial ideas was also to just generate a custom wcwidth.so to
be used with LD_PRELOAD, but the result depends on the actual font being
used.
Since a terminal's font typically does not change often, that would be
feasible, but it still requires LD_PRELOAD (and programs that pick it
up), so not much is gained compared to the wcwidth(3) callback to the
terminal.

> a utility to generate such a configuration from the terminal

How would that work then?  Based on the method described above?
Then it would be a pre-generated cache basically?!
It might be hard to predict what glyphs are being used in the future
though, and it is probably rather big.  It's also basically a custom
wcwidth(3) implementation then, isn't it?
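
For illustration, a rough sketch of what such a config-file lookup could look like, again in Python.  The file format (hex codepoint or range, then a width, with # comments) and the names parse_width_table and glyph_width are invented here, not an existing Zsh feature.  The fallback for codepoints missing from the table only approximates wcwidth(3) using Python's unicodedata module.

```python
import unicodedata


def parse_width_table(text):
    """Parse lines like 'E0B0-E0B3 2' or 'F015 2' (hypothetical format)."""
    table = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        span, width = line.split()
        lo, _, hi = span.partition("-")
        table.append((int(lo, 16), int(hi or lo, 16), int(width)))
    return table


def glyph_width(ch, table):
    """Width from the table if listed, else a rough wcwidth-like guess."""
    cp = ord(ch)
    for lo, hi, width in table:
        if lo <= cp <= hi:
            return width
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
```

For example, a table containing "E0B0-E0B3 2" makes glyph_width("\ue0b1", table) return 2, while an unlisted private use codepoint falls back to 1.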

Thanks,
Daniel.
