Zsh Mailing List Archive

Re: Some groundwork for Unicode in Zle

X-seq: zsh-workers 20711
From: François-Xavier Coudert <Francois-Xavier.Coudert@xxxxxx>
To: zsh-workers@xxxxxxxxxx
Subject: Re: Some groundwork for Unicode in Zle
Date: Fri, 14 Jan 2005 16:54:11 +0100
Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm

Hi all,

I'm new to the list, but I'm interested in UTF-8 support in Zle. My
question is the following: have you considered continuing to store
strings such as the edited line in arrays of char (rather than wide
chars), while using a few functions to handle the fact that one Unicode
character may be represented by several chars (and one glyph by several
Unicode characters, though I'm not sure how that can be handled)?
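
To make the idea concrete, here is a minimal sketch of the kind of
home-made helpers this would need, assuming the buffer always holds valid
UTF-8. The names utf8_is_cont, utf8_next and utf8_prev are hypothetical;
they are not part of zsh or glib. They work on byte offsets, so the line
can stay a plain char array.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of zsh or glib.
 * Assumption: buf[0..len) contains valid UTF-8. */

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int utf8_is_cont(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Return the byte offset of the character following the one at pos.
 * Never returns more than len. */
static size_t utf8_next(const char *buf, size_t len, size_t pos)
{
    if (pos < len)
        pos++;                      /* step over the lead byte */
    while (pos < len && utf8_is_cont((unsigned char)buf[pos]))
        pos++;                      /* skip continuation bytes */
    return pos;
}

/* Return the byte offset of the character preceding pos.
 * Returns 0 when pos is already at the start. */
static size_t utf8_prev(const char *buf, size_t pos)
{
    if (pos > 0)
        pos--;                      /* step onto the previous byte */
    while (pos > 0 && utf8_is_cont((unsigned char)buf[pos]))
        pos--;                      /* back up to the lead byte */
    return pos;
}
```

These only step over code points; combining sequences (one glyph made of
several Unicode characters) are not handled here.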

Using a few of the functions glib exports for Unicode (though zsh could
use home-made functions if need be), I hacked (and the result is nowhere
near pretty) some internals of Zle in the following way:

diff -r zsh-4.2.3/Src/Zle/zle_misc.c zsh-fx/Src/Zle/zle_misc.c
29a30
> #include <glib.h>
97,98c98,99
<       cs += zmult;
<       backdel(zmult);
---
>       cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
>       backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line + cs));
114a116,119
>     if (zmult > cs)
>       backdel (cs);
>     else
>       backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line + cs) - 1);

diff -r zsh-4.2.3/Src/Zle/zle_move.c zsh-fx/Src/Zle/zle_move.c
29c29
< 
---
> #include "glib.h"
162c162,167
<     cs += zmult;
---
>     cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
174c179
<     cs -= zmult;
---
>     cs = (char *) (g_utf8_prev_char (line + cs)) - (char *)line;

diff -r zsh-4.2.3/Src/Zle/zle_utils.c zsh-fx/Src/Zle/zle_utils.c
29a30
> #include <glib.h>
94a96,97
>     int next, i;
>     
101,102c104,107
<       line[to] = line[to + cnt];
<       to++;
---
>         next = (char *) (g_utf8_next_char (line + cnt)) - (char *)line - cnt;
>       for (i = to; i < to + next; i++)
>         line[i] = line[i + cnt];
>       to += next;

With this, one can correctly move around and delete (forward and
backward) Unicode characters with ease. Such modifications seem easy to
generalize. So the points I'd like to get your thoughts on are:

  1. is such an approach useful?
  2. what are the arguments against it? (it may need a wider rewrite of
some builtins than other approaches)
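
For discussion, the backward deletion above could also be sketched
without glib. This is only an illustration under stated assumptions: the
function utf8_backward_delete is hypothetical (it is not Zle's backdel),
the buffer is assumed to hold valid UTF-8, and *cursor is assumed to be
at most *len.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not Zle code.
 * Delete the UTF-8 character that ends just before byte offset *cursor
 * in buf[0..*len). Updates *len and *cursor, and returns the number of
 * bytes removed (0 if the cursor is at the start of the line).
 * Assumptions: valid UTF-8 in buf, and *cursor <= *len. */
static size_t utf8_backward_delete(char *buf, size_t *len, size_t *cursor)
{
    size_t start = *cursor;
    size_t removed;

    if (start == 0)
        return 0;

    /* Walk back over continuation bytes (10xxxxxx) to the lead byte. */
    do {
        start--;
    } while (start > 0 && ((unsigned char)buf[start] & 0xC0) == 0x80);

    removed = *cursor - start;

    /* Close the gap by shifting the tail of the line to the left. */
    memmove(buf + start, buf + *cursor, *len - *cursor);

    *len -= removed;
    *cursor = start;
    return removed;
}
```

The cursor stays a byte offset throughout, which is what lets the rest of
the line-handling code keep treating the line as an array of char.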

Thanks for your attention, and I hope I will be able to help make zsh
much more viable on UTF-8 systems!

FX

Follow-Ups:
- Re: Some groundwork for Unicode in Zle
  - From: Peter Stephenson
- Re: Some groundwork for Unicode in Zle
  - From: Clint Adams