Zsh Mailing List Archive
Re: multibyte optimisations
- X-seq: zsh-workers 39908
- From: Sebastian Gniazdowski <psprint@xxxxxxxxxxxx>
- To: zsh-workers@xxxxxxx
- Subject: Re: multibyte optimisations
- Date: Thu, 10 Nov 2016 06:57:01 -0800
- In-reply-to: <20161110134722.06e6dc51@pwslap01u.europe.root.pri>
- List-help: <mailto:zsh-workers-help@zsh.org>
- List-id: Zsh Workers List <zsh-workers.zsh.org>
- List-post: <mailto:zsh-workers@zsh.org>
- Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
- References: <CGME20161110103845epcas3p3e7cabeffae723219daafa8d3e6b32f12@epcas3p3.samsung.com> <1478774232.2371010.783342705.69C81F52@webmail.messagingengine.com> <20161110134722.06e6dc51@pwslap01u.europe.root.pri>
On Thu, Nov 10, 2016, at 05:47 AM, Peter Stephenson wrote:
> On Thu, 10 Nov 2016 02:37:12 -0800
> Sebastian Gniazdowski <psprint@xxxxxxxxxxxx> wrote:
> > Other pointed functions seem to be very valid / expected – multibyte
> > functions. They can be optimized if a courageous decision will be made –
> > to do what charnext / pattern.c does:
> >
> >     if (!(patglobflags & GF_MULTIBYTE) || !(STOUC(*x) & 0x80))
> >         return x + 1;
> >
> > I.e. to optimize for ASCII as subset of UTF-8 also when calling
> > MB_METACHARLEN, not only for MB_METASTRLEN (recent change).
>
> These look straightforward and along the same lines as what we already
> do.
I was worried that the multibyte state might be unclear at the point
where the length of a character is requested, but that cannot really
happen. If it did, the loop that advances character by character would
already have a problem, because it would be left in an unclear state
after the previous advance. With this patch the parser runs in 1493 ms
instead of 2148 ms :)
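
To illustrate the idea, here is a minimal sketch of an ASCII fast path in
front of a character-length routine. It is not the actual patch and not
zsh's MB_METACHARLEN: the names char_len, slow_char_len and count_chars
are hypothetical, the input is assumed to be a plain NUL-terminated UTF-8
string, and the slow path is only a stand-in (it reads the length from
the UTF-8 lead byte) for whatever full multibyte conversion the real code
performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the full multibyte path: take the expected
 * sequence length from the UTF-8 lead byte, then accept only as many
 * continuation bytes (10xxxxxx) as are really present. A NUL is not a
 * continuation byte, so this never reads past the end of the string. */
static size_t slow_char_len(const unsigned char *s)
{
    size_t want = 1, n = 1;

    if ((*s & 0xE0) == 0xC0)
        want = 2;
    else if ((*s & 0xF0) == 0xE0)
        want = 3;
    else if ((*s & 0xF8) == 0xF0)
        want = 4;

    while (n < want && (s[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* Hypothetical helper: length in bytes of the character starting at s. */
static size_t char_len(const char *s)
{
    const unsigned char *u = (const unsigned char *)s;

    assert(*u != '\0');     /* caller must not pass the terminator */

    /* Fast path: high bit clear means a complete one-byte ASCII
     * character, so the slow path is skipped entirely. */
    if (!(*u & 0x80))
        return 1;
    return slow_char_len(u);
}

/* The char-by-char loop: every advance lands on a character boundary,
 * so each call to char_len starts from a clean position. */
static size_t count_chars(const char *s)
{
    size_t n = 0;

    while (*s) {
        s += char_len(s);
        n++;
    }
    return n;
}
```

For mostly-ASCII input, such as typical shell code, nearly every call
returns from the first test, which is where a saving of this kind would
come from.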
--
Sebastian Gniazdowski
psprint@xxxxxxxxxxxx