Zsh Mailing List Archive
Re: PATCH: parse from even deeper in hell
On Fri, 20 Feb 2015 11:12:39 +0100
Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
> > The question is where to put this in on history read. I think it's
> > going to affect non-lexical history, too, but the error on reading won't
> > be flagged up.
>
> I don't think so, unmetafy() doesn't care about the table. And as I
> checked earlier, both the old and new version of the string in my
> history file is unmetafied to the correct UTF-8 string. The 'only'
> problem is that the lexer is looking at some bytes before it's
> unmetafied and some stuff that should have been metafied to avoid
> being parsed as tokens, isn't, because they weren't special in the old
> version. That's why I think running unmetafy before lexing is
> needed... And if the lexer wants metafied text then we'd just have to
> metafy it again right away.
See if this fixes the problems, then.
Note we're almost out of meta characters with this limitation --- we
can't expand beyond the range of 32 we currently reserve if we need to
keep compatibility with history. We're only just getting away with it
with 0xa0 because 0x80 isn't a meta character, as for historical reasons
they start at 0x83.
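
To make the arithmetic concrete, here is a rough sketch of one reading of that constraint: a reserved byte is escaped as Meta followed by the byte XOR 32 (as in the patch below), so the escaped byte must not itself land in the reserved range. The helper names in_reserved() and count_clashes() are made up for illustration and are not zsh functions; the range of 32 values starting at 0x83 is taken from the paragraph above.

```c
/* Hypothetical helper (not part of zsh): is byte value c inside a
 * reserved run of `count` values beginning at `first`? */
int in_reserved(unsigned first, unsigned count, unsigned c)
{
    return c >= first && c < first + count;
}

/* Count the reserved bytes whose escaped form (byte ^ 32) would
 * itself be a reserved byte, i.e. would be ambiguous after Meta. */
int count_clashes(unsigned first, unsigned count)
{
    int clashes = 0;
    unsigned v;

    for (v = first; v < first + count; v++)
        if (in_reserved(first, count, v ^ 32u))
            clashes++;
    return clashes;
}
```

Under these assumptions count_clashes(0x83, 32) is 0, since 0xa0 ^ 32 is 0x80, which lies below the range. Adding a 33rd value, 0xa3, gives 2 clashes because 0xa3 ^ 32 is 0x83, and a range starting at 0x80 that also included 0xa0 would clash in the same way.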
pws
diff --git a/Src/hist.c b/Src/hist.c
index 381c7e2..acc4259 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -3377,11 +3377,45 @@ histsplitwords(char *lineptr, short **wordsp, int *nwordsp, int *nwordposp,
char *start = lineptr;
if (uselex) {
- LinkList wordlist = bufferwords(NULL, lineptr, NULL,
- LEXFLAGS_COMMENTS_KEEP);
+ LinkList wordlist;
LinkNode wordnode;
- int nwords_max;
+ int nwords_max, remeta = 0;
+ char *ptr;
+
+ /*
+ * Handle the special case that we're reading from an
+ * old shell with fewer meta characters, so we need to
+ * metafy some more. (It's not clear why the history
+ * file is metafied at all; some would say this is plain
+ * stupid. But we're stuck with it now without some
+ * hairy workarounds for compatibility).
+ *
+ * This is rare so doesn't need to be that efficient; just
+ * allocate space off the heap.
+ *
+ * Note that it's currently believed this all comes out in
+ * the wash in the non-uselex case owing to where unmetafication
+ * and metafication happen.
+ */
+ for (ptr = lineptr; *ptr; ptr++) {
+ if (*ptr != Meta && imeta(*ptr))
+ remeta++;
+ }
+ if (remeta) {
+ char *ptr2, *line2;
+ ptr2 = line2 = (char *)zhalloc((ptr - lineptr) + remeta + 1);
+ for (ptr = lineptr; *ptr; ptr++) {
+ if (*ptr != Meta && imeta(*ptr)) {
+ *ptr2++ = Meta;
+ *ptr2++ = *ptr ^ 32;
+ } else
+ *ptr2++ = *ptr;
+ }
+ lineptr = line2;
+ }
+ wordlist = bufferwords(NULL, lineptr, NULL,
+ LEXFLAGS_COMMENTS_KEEP);
nwords_max = 2 * countlinknodes(wordlist);
if (nwords_max > nwords) {
*nwordsp = nwords = nwords_max;
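
For anyone who wants to try the re-metafication pass outside the shell, the following is a standalone sketch of the same two-pass idea (count, then copy with escapes). It is not the zsh code: sketch_imeta() is a hypothetical stand-in for the table-driven imeta(), SKETCH_META and the 0x83 to 0xa2 range are assumptions based on the message above, and malloc() replaces zhalloc(), so the caller has to free the result. The sketch writes the terminating NUL explicitly.

```c
#include <stdlib.h>

/* Assumption: Meta is 0x83, the first of 32 reserved byte values. */
#define SKETCH_META 0x83

/* Hypothetical stand-in for zsh's imeta(); the real test is a table lookup. */
int sketch_imeta(unsigned char c)
{
    return c >= 0x83 && c <= 0xa2;
}

/*
 * Return a malloc'd copy of line in which every reserved byte other
 * than Meta itself is replaced by Meta followed by (byte ^ 32).
 * Existing Meta pairs are copied through unchanged.
 * Returns NULL if allocation fails.
 */
char *sketch_remetafy(const char *line)
{
    const unsigned char *p;
    unsigned char *out, *q;
    size_t len = 0, extra = 0;

    /* First pass: measure the line and count bytes needing an escape. */
    for (p = (const unsigned char *)line; *p; p++, len++)
        if (*p != SKETCH_META && sketch_imeta(*p))
            extra++;

    out = q = malloc(len + extra + 1);
    if (!out)
        return NULL;

    /* Second pass: copy, inserting Meta before each escaped byte. */
    for (p = (const unsigned char *)line; *p; p++) {
        if (*p != SKETCH_META && sketch_imeta(*p)) {
            *q++ = SKETCH_META;
            *q++ = *p ^ 32;
        } else
            *q++ = *p;
    }
    *q = '\0';
    return (char *)out;
}
```

With these assumptions, a line containing a raw 0xa0 byte comes back with that byte replaced by the pair 0x83 0x80, while an already metafied pair such as 0x83 0xa3 passes through untouched because 0xa3 is outside the assumed range.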