Zsh Mailing List Archive
Re: PATCH: parse from even deeper in hell
- X-seq: zsh-workers 34581
- From: Mikael Magnusson <mikachu@xxxxxxxxx>
- To: Peter Stephenson <p.w.stephenson@xxxxxxxxxxxx>
- Subject: Re: PATCH: parse from even deeper in hell
- Date: Fri, 20 Feb 2015 04:43:49 +0100
- Cc: "Zsh Hackers' List" <zsh-workers@xxxxxxx>
- In-reply-to: <CAHYJk3RPCXk=G1RQ9cDStP1wBhuy9GbQHA8X3GQE94fkDkQCqQ@mail.gmail.com>
- List-help: <mailto:zsh-workers-help@zsh.org>
- List-id: Zsh Workers List <zsh-workers.zsh.org>
- List-post: <mailto:zsh-workers@zsh.org>
- Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
- References: <20150219101315.477f7f95@pwslap01u.europe.root.pri> <CAHYJk3T4yw3cz8o8-EVF8gzN_hi+M4kc92UR-XTaFBsJtDD7qg@mail.gmail.com> <20150219220311.7dfdc4ec@ntlworld.com> <CAHYJk3RYztT7Urq08tysa-Cr0WgJf1Ehmrbingnbap=-eLWGdQ@mail.gmail.com> <CAHYJk3T9rJ1t7GJkRxgEO_q5JF_iFXkqKPwb-aSnZSEwhohD-A@mail.gmail.com> <CAHYJk3RPCXk=G1RQ9cDStP1wBhuy9GbQHA8X3GQE94fkDkQCqQ@mail.gmail.com>
On Fri, Feb 20, 2015 at 4:33 AM, Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
> On Fri, Feb 20, 2015 at 4:22 AM, Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
>> On Fri, Feb 20, 2015 at 4:16 AM, Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
>>> On Thu, Feb 19, 2015 at 11:03 PM, Peter Stephenson
>>> <p.w.stephenson@xxxxxxxxxxxx> wrote:
>>>> On Thu, 19 Feb 2015 22:47:12 +0100
>>>> Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
>>>>> I get a crapton of "bad(2) wordsplit reading history:" with this
>>>>> patch. It seems like all the failed lines have metafied characters in
>>>>> them, if that's a hint. Most don't contain any syntax characters at
>>>>> all, for example:
>>>>> hist.c:3499: bad(2) wordsplit reading history: mp3info 好きになり\M-c\M-^Aい.mp3
>>>>> at: 好きになり\M-c\M-^Aい.mp3s
>>>>> word: 好きになり\M-c\M-^Aい.mp3
>>>>
>>>> Unless I'm missing something, I don't think you've said what the real
>>>> characters you're expecting are. The broken ones aren't much use for
>>>> testing.
>>>>
>>>>> The (2) means it's the second of the two bad=1; assignments
>>>>> triggering.
>>>>
>>>> At line 3490?
>>>
>>> Yes.
>>>
>>>>> I'm also not sure why the utf8 is slightly mishandled in the output
>>>>> there. It has at least been unmetafied, the raw string in the history
>>>>> file is more or less:
>>>>> mp3info 好ぃ�になゃ�たぃ�.mp3
>>>>
>>>> So those aren't actually valid characters? Does that mean metafied
>>>> characters are getting into the history? I've made it necessary for two
>>>> more bytes to be metafied, so if the shell was expecting them to be
>>>> metafied in the history file they won't be. The bytes are 0x9e and
>>>> 0x9f. I guess we could special case those, but do we really output
>>>> metafied characters to the history file?
>>>
>>> The actual line in the history is
>>> mp3info 好きになりたい.mp3
>>> but in the history _file_, it's stored metafied, which is hard to
>>> paste into an email. I'm not sure why pasting the original string
>>> didn't occur to me. AFAIK, history files have always been metafied.
>>> I'm not sure why the た is mangled in the error message is what I tried
>>> to say originally. The final byte is 9f which I suppose is an esc with
>>> the 8th bit set. Maybe something is trying to double unmetafy? Running
>>> it through unmetafy() twice doesn't cause any problems though...
>>
>> Just looked at the debug code and found out about ZSH_DEBUG_LOG, turns
>> out there's also a 0x8A just before the \M-c\M-^
>
> Rerunning the original command seems to produce a different metafied
> string than what was in the history before. What's weird is that it
> does import correctly into the session from both lines... The line
> from running it again also does not cause the wordsplit error.
> grepping both of them into my unmetafy program also produces identical
> utf8 strings.
> This is the one causing a problem,
> mp3info M-eM-%M-=M-cM-^AM-^CM--M-cM-^AM-+M-cM-^AM-*M-cM-^BM-^CM-*M-cM-^AM-^_M-cM-^AM-^CM-$.mp3
> and this is what we store now which is fine,
> mp3info M-eM-%M-=M-cM-^AM-^CM--M-cM-^AM-+M-cM-^AM-*M-cM-^BM-^CM-*M-cM-^AM-^CM-?M-cM-^AM-^CM-$.mp3
>
> Any idea why I have a bunch of history entries stored differently,
> that do unmetafy to the correct string, but are parsed weirdly with a
> patch that changes how $(( is parsed? I don't quite see the connection
> yet :).
Oh I see, you renumbered a bunch of stuff in zsh.h, so text is
metafied differently now. But only metafy uses the table; unmetafy
doesn't. That explains why the strings are different, but not why the
old string causes an error, unless we are parsing it before
unmetafying it. In that case, any bytes that weren't special before
but are now would be interpreted specially. Can we unmetafy+metafy
the string before lexing? I guess that might be slower though.
(Sorry for the 500 mails).
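
To illustrate the asymmetry described above, here is a minimal sketch in
C, not the actual zsh code. It assumes the usual zsh convention that a
metafied byte is stored as the Meta byte (0x83) followed by the original
byte XOR 32. The names sketch_metafy, sketch_unmetafy and needs_meta are
hypothetical, and the "table" is reduced to a single upper bound,
last_tok, standing in for the real lookup built from zsh.h. The bounds
0x9d and 0x9f mentioned in the note below are assumptions picked to match
the two newly metafied bytes (0x9e and 0x9f) mentioned earlier in the
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Escape byte: a metafied byte is written as META, then (byte ^ 32). */
#define META 0x83

/* Hypothetical stand-in for the table lookup: NUL and every byte from
 * META up to last_tok (inclusive) need escaping. The real table is
 * derived from the token definitions in zsh.h. */
int needs_meta(unsigned char c, unsigned char last_tok)
{
    return c == 0 || (c >= META && c <= last_tok);
}

/* Table-driven direction. 'out' must have room for 2 * len bytes.
 * Returns the number of bytes written. */
size_t sketch_metafy(const unsigned char *in, size_t len,
                     unsigned char *out, unsigned char last_tok)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (needs_meta(in[i], last_tok)) {
            out[n++] = META;
            out[n++] = (unsigned char)(in[i] ^ 32);
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}

/* Table-free direction: only looks for META, so it decodes strings
 * written under either table. 'out' must have room for len bytes. */
size_t sketch_unmetafy(const unsigned char *in, size_t len,
                       unsigned char *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == META && i + 1 < len)
            out[n++] = (unsigned char)(in[++i] ^ 32);
        else
            out[n++] = in[i];
    }
    return n;
}
```

With the UTF-8 bytes of た (e3 81 9f), calling sketch_metafy with
last_tok = 0x9d leaves the three bytes untouched, while last_tok = 0x9f
turns the final byte into 83 bf. That matches the difference between the
two history lines quoted above (M-^_ versus M-^CM-?). sketch_unmetafy
gives back e3 81 9f for both, which is why both lines import correctly,
while anything that inspects the raw old bytes under the new table sees
0x9f as special.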
--
Mikael Magnusson