Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [BUG] ZLE character width with emoji presentation variation selectors in Unicode



Agreed that the Unicode spec has no particular phrasing about the exact width difference between the two presentation forms. I believe that is largely left up to the renderer (web, mobile, desktop, etc.).

Given that, it seems like the best path forward might be to ask the terminal emulator for this information, so that the shell and the terminal agree on widths?
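To make "ask the terminal" a bit more concrete: one possible approach (just a sketch, not something Zsh does today as far as I know) is to send the standard cursor position query (ESC [ 6 n) before and after printing the sequence and compare the reported columns. The helper names below (parse_cpr, measured_width) are made up for illustration, and the code only parses replies; reading them from the tty in raw mode is left out.

```python
import re

# Device Status Report query for the cursor position (ECMA-48 / VT100).
# The terminal answers with ESC [ <row> ; <col> R.
CPR_QUERY = "\x1b[6n"
_CPR_REPLY = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cpr(reply):
    """Parse a cursor position reply into a 1-based (row, col) tuple."""
    match = _CPR_REPLY.search(reply)
    if match is None:
        raise ValueError("not a cursor position reply: %r" % reply)
    return int(match.group(1)), int(match.group(2))


def measured_width(reply_before, reply_after):
    """Cells the terminal advanced between two cursor position replies.

    Assumes the text printed between the two queries did not wrap,
    so both replies must report the same row.
    """
    row_before, col_before = parse_cpr(reply_before)
    row_after, col_after = parse_cpr(reply_after)
    if row_before != row_after:
        raise ValueError("cursor changed rows; width is ambiguous")
    return col_after - col_before
```

The obvious downside is a round trip to the terminal for every sequence that needs measuring, so any real use would presumably need to cache results.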

Gotcha re composing characters - that makes sense, thanks for explaining!

But yep, I've got a fallback mechanism in mind for Zsh (render as 2 cells wide but only reserve 1 cell, to match the shell, similar to iTerm). My goal with opening this issue was to kick off a discussion on the "correct" way to approach this in Zsh and how best to support it going forward, since the current experience I've got in mind is suboptimal for Zsh (compared to Bash/Fish) within Warp, for example, due to these limitations.

Best,
Advait

On Fri, May 10, 2024 at 2:57 PM Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
On Fri, May 10, 2024 at 7:12 PM Advait Maybhate <advait@xxxxxxxx> wrote:
>
> Gotcha, thanks for the context! Combining emojis are weird :)
>
> Hmm, agreed that it won't be possible to use the same standard across all terminals - hence, I was thinking terminfo would allow the terminal to indicate whether it supports these variation selectors with wide characters?
>
> Yep, I was referencing TR51 from Unicode as well (emoji presentation selectors).

From what I could tell (I'm not an expert), there is no phrasing that
implies the width should be different for the emoji presentation form
and the text presentation form.

> From looking a bit into wcwidth, it seems like it doesn't inherently support width for a sequence of code points. I just tried this out in C++ with ICU (International Components for Unicode library) and grapheme clusters to demonstrate the width calculation as 2 with this sequence: gist.github.com/Advait-M/a326cd2e474b9520dc893765ec4cb2c4.

Yes, normal compose sequences are a base character with a width, and
composing characters with 0 width (but effectively rendering to the
left of the insertion point, on top of the base character.)

--
Mikael Magnusson

