Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: PATCH: autoconf test for multibyte support
- X-seq: zsh-workers 22587
- From: Peter Stephenson <pws@xxxxxxx>
- To: zsh-workers@xxxxxxxxxx
- Subject: Re: PATCH: autoconf test for multibyte support
- Date: Fri, 4 Aug 2006 15:50:31 +0100
- In-reply-to: <200608031806.k73I6Ea2017321@xxxxxxxxxxxxxx>
- Mailing-list: contact zsh-workers-help@xxxxxxxxxx; run by ezmlm
- Organization: Cambridge Silicon Radio
- References: <200608031806.k73I6Ea2017321@xxxxxxxxxxxxxx>
Peter Stephenson <pws@xxxxxxx> wrote:
> If this works I will need to change some of the installation
> documentation.
This changes some documentation.
I'm only guessing that it works on Cygwin; all I know is that it compiles
with the same code that works everywhere else.
Index: INSTALL
===================================================================
RCS file: /cvsroot/zsh/zsh/INSTALL,v
retrieving revision 1.25
diff -u -r1.25 INSTALL
--- INSTALL 16 Feb 2006 14:28:54 -0000 1.25
+++ INSTALL 4 Aug 2006 14:44:06 -0000
@@ -264,37 +264,32 @@
---------------------------
Support for multibyte character sets that extend ASCII, such as UTF-8, is
-under development but the code in the line editor is sufficiently stable to
-be turned on by default in environments that provide full ISO 10646 support
-including the preprocessor definition __STDC_ISO_10646__. In principle
-this definition does not guarantee the full environment, but in practice
-systems with this defined also provide suitable library support. The shell
-does not probe for all the features, so on other systems use of multibyte
-support must be explicitly enabled when it is available.
+now reasonably close to complete, except that combining characters are not
+handled properly (some assistance with this problem would be appreciated).
+The configuration script should turn on multibyte support on all systems
+where it can be compiled successfully.
The support can be explicitly enabled or disable with --enable-multibyte or
---disable-multibyte. Reports of systems where multibyte support was not
-enabled by default but --enable-multibyte resulted in a usable shell would
-be appreciated. The developers are not aware of any need to use
+--disable-multibyte. The developers are not aware of any need to use
--disable-multibyte and this should be reported as a bug. Currently
-multibyte mode is believed to work automatically on:
+multibyte mode is believed to work on at least the following:
- All(?) current GNU/Linux distributions
-
-and to work when configured with --enable-multibyte on:
-
- OS X 10.4.3 (problems have been reported with multibyte characters
in HFS file names)
- NetBSD 2.0.2
- Solaris 8+ (inputting multibyte characters from the keyboard doesn't
work in some installations).
+ - Cygwin (though use of multibyte characters is somewhat non-standard).
-The main shell is not yet aware of multibyte characters, so for example the
-length of a scalar parameter will return the number of bytes, not
-characters, and pattern tests likewise treat single bytes as if they were
-characters. This means that pattern tests such as ? and [[:alpha:]] do not
-work correctly with characters in multibyte character sets beyond the ASCII
-subset.
+The corresponding shell option MULTIBYTE is now on by default in all
+emulation modes when multibyte support is enabled. Turning it off is not
+recommended unless there is a particular need to examine single bytes
+regardless of the locale. As the line editor bases its behaviour on the
+locale regardless of the option (in order to correspond to the displayed
+character set), the option should be left on during the execution of
+user-defined editor and completion widgets so that the behaviour
+corresponds to that of builtin widgets.
See chapter 5 in the FAQ for some notes on multibyte input.
Index: MACHINES
===================================================================
RCS file: /cvsroot/zsh/zsh/MACHINES,v
retrieving revision 1.3
diff -u -r1.3 MACHINES
--- MACHINES 21 Mar 2006 19:19:07 -0000 1.3
+++ MACHINES 4 Aug 2006 14:44:07 -0000
@@ -180,9 +180,7 @@
SGI: IRIX 6.5
Should build `out-of-the-box'; however, if using the native
compiler, "cc" rather than "c99" is recommended. Compilation
- with gcc is also reported to work. Multibyte is supported,
- for example:
- CC=cc ./configure --enable-multibyte
+ with gcc is also reported to work. Multibyte is supported.
On 6.5.2, zsh malloc routines are reported not to work; also
full optimization (cc -O3 -OPT:Olimit=0) causes problems.
Index: NEWS
===================================================================
RCS file: /cvsroot/zsh/zsh/NEWS,v
retrieving revision 1.10
diff -u -r1.10 NEWS
--- NEWS 28 Feb 2006 12:20:43 -0000 1.10
+++ NEWS 4 Aug 2006 14:44:08 -0000
@@ -5,27 +5,31 @@
Major changes between versions 4.2 and 4.3
------------------------------------------
-- There is support for multibyte character sets in the line editor,
- though not the main shell. See Multibyte Character Support in INSTALL.
+- There is support for multibyte character sets. This is now reasonably
+ close to complete, although Unicode combining characters don't work
+ properly. See Multibyte Character Support in INSTALL.
- The shell can now run an installation function for a new user
- (one with no .zshrc, .zshenv, .zprofile or .zlogin file) without
- any additional setting up by the administrator.
+ (a user with no .zshrc, .zshenv, .zprofile or .zlogin file) without
+ any additional setting up by the administrator. See "THE ZSH/NEWUSER
+ MODULE" in the zshmodules manual page.
- The manual now has a Roadmap section (manual page zshroadmap) to
give new users an indication of the most interesting parts of the
manual.
-- New option PROMPT_SP, on by default, to work around the problem that the
- line editor can overwrite output with no newline at the end.
+- New option PROMPT_SP (on by default): works around the problem that the
+ line editor can overwrite output with no newline at the end. See the
+ zshoptions manual page.
- New option HIST_SAVE_BY_COPY (on by default): history is saved by
- copying and renaming instead of directly overwriting.
+ copying and renaming instead of directly overwriting. See the
+ zshoptions manual page.
- New redirection syntax e.g. {myfd}>file opens a new file descriptor
and stores the number in $myfd, so that >&$myfd will work. Chosen
not to break existing code (and to be compatible with proposals for the
- Korn shell).
+ Korn shell). See the section REDIRECTION in the zshmisc manual page.
- Substitutions of the form ${var:-"$@"}, ${var:+"$@"} and similar where
word-splitting is applied to the text after the :- or :+ (in particular,
@@ -36,20 +40,28 @@
- New Posix-style zsh-specific tests [[:IDENT:]], [[:IFS:]],
[[:IFSSPACE:]], [[:WORD:]] test if character can appear in identifier,
is an IFS character, is an IFS whitespace character, or is considered
- as part of a word (is alphanumeric or appears in $WORDCHARS). Note
- the pattern code doesn't yet handle multibyte characters.
+ as part of a word (is alphanumeric or appears in $WORDCHARS). These
+ work correctly on multibyte characters if the appropriate support
+ is present. See the section FILENAME GENERATION in the zshexpn
+ manual page.
- The idiom =(<<<...) is optimised so that the shell internally turns
the ... into the contents of a file whose name is then substituted.
+ The syntax has always been usable by means of the NULLCMD feature,
+ but previously it generated an intermediate process; it has now
+ been rewritten along the same lines as the optimisation for $(<...)
+ that inserts a file into the command line without the use of an
+ external programme.
- Supplied functions catch and throw provide limited support for
exception handling using the `{ ... } always { ... }' syntax.
+ See the section EXCEPTION HANDLING in the zshcontrib manual page.
- Signals now accept the SIG as part of the name for compatibility with
other shells.
- Editor function argument-base allows non-decimal arguments for
- editor widgets.
+ editor widgets. See the entry in the zshzle manual page.
- As always, there are many enhancements to completion functions.
Index: README
===================================================================
RCS file: /cvsroot/zsh/zsh/README,v
retrieving revision 1.35
diff -u -r1.35 README
--- README 2 Aug 2006 17:16:38 -0000 1.35
+++ README 4 Aug 2006 14:44:09 -0000
@@ -54,7 +54,8 @@
assumed all such octets were allowed in identifiers, however the POSIX
standard does not allow such characters in identifiers. The older
behaviour is still obtained with --disable-multibyte in effect.
-With --enable-multibyte set there are three possible cases:
+With --enable-multibyte in effect (this is now the default anywhere
+it is supported) there are three possible cases:
MULTIBYTE option unset: only ASCII characters are allowed; the
shell does not attempt to identify non-ASCII characters at all.
MULTIBYTE option set, POSIX_IDENTIFIERS option unset: in addition
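For anyone who wants to try the {myfd}>file redirection mentioned in the NEWS hunk above, here is a minimal sketch. It assumes a shell that accepts the {varname}> form (zsh 4.3 according to the NEWS text; recent versions of bash accept it too). The scratch file created with mktemp and the two messages are made up for illustration and are not part of the patch.

```shell
# Scratch file for the demonstration (hypothetical; any writable path would do).
logfile=$(mktemp)

# Open a new file descriptor for writing. The shell chooses the number
# and stores it in the parameter myfd.
exec {myfd}>"$logfile"

# Write through the descriptor using the >&$myfd form from the NEWS entry.
echo "first line" >&$myfd
echo "second line" >&$myfd

# Close the descriptor again once it is no longer needed.
exec {myfd}>&-

# Show what ended up in the file.
cat "$logfile"
```

The descriptor stays open across commands until it is closed explicitly, so the final exec line matters in long-running shells. The scratch file is left in place and can be removed with rm "$logfile".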
--
Peter Stephenson <pws@xxxxxxx> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
To access the latest news from CSR copy this link into a web browser: http://www.csr.com/email_sig.php