Zsh Mailing List Archive Messages sorted by: Reverse Date, Date, Thread, Author

Re: [BUG] Long line makes pattern matching (by //) hog Zsh

X-seq: zsh-workers 38621
From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
To: Zsh hackers list <zsh-workers@xxxxxxx>
Subject: Re: [BUG] Long line makes pattern matching (by //) hog Zsh
Date: Sun, 5 Jun 2016 17:07:33 -0700
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brasslantern-com.20150623.gappssmtp.com; s=20150623; h=from:message-id:date:in-reply-to:comments:references:to:subject :mime-version; bh=vkfokoWdfNja43yUaqwS+4XrmF8jPjlANgEN0TpJ0oE=; b=Dp0DWim45qkS+tXwXVZmAYzI31BHVMZRiml32n52yH1yftj5VL6MgHkc7exTJjhPYR vigDOJLQK2HUsMe/VHbvnsNnhf2GXQ4tXGoaE9pqmhF+fbMVuuk+8tWRgJHxxFOuevw6 p0rApxzFC2PSrK3jPDKn3vzGz5wOO7rEqJoBIaJMw3xJMz2ws0UqmpJGJ6hbr1YeRGoG rscQdo6FGkp+InkSxianvYlMQJAiq+cE799dTuAW1pk8BXgKGQivc3smTFQmeR2C+0JB 3WoM8nc0v+LcAg2vCSX7e0k0rLR5d42kqmMVhcJqIgMIWiA79bEVhIef3187pJv0sgT5 xtng==
In-reply-to: <20160605203708.3701c7a2@ntlworld.com>
List-help: <mailto:zsh-workers-help@zsh.org>
List-id: Zsh Workers List <zsh-workers.zsh.org>
List-post: <mailto:zsh-workers@zsh.org>
Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
References: <CAKc7PVC=AES1LhY7tYTXrPsefX3CXgtUsxiVbDaxmc5o2iHnVw@mail.gmail.com> <160605121020.ZM7727@torch.brasslantern.com> <20160605203708.3701c7a2@ntlworld.com>

On Jun 5,  8:37pm, Peter Stephenson wrote:
}
} > On Jun 5,  4:36pm, Sebastian Gniazdowski wrote:
} > }
} > } 1. not backslash nor slash nor space [^ /\\\\]##
} > } 2. not number, slash, backslash, space [^0-9/\\\\ ]##
} > } 3. not slash, backslash [^/\\\\]#
} > } 4. end of line (#e)
} 
} The problem is the patterns are pathological.  Each of them can match
} the same characters.

Indeed, given that the first sub-pattern is "at least one" and the
second sub-pattern includes all the same characters as the first, I
think the first ## can be removed without changing what is matched;
i.e.:

    NLIST_COLORING_PATTERN="([^ /\\\\][^0-9/\\\\ ]##[^/\\\\]#(#e))"

And with that pattern the whole test script completes in a couple of
seconds.

} So it's spending a lot of time repartitioning the
} mathches  between the possibilities of 1. and 2. and 3. in the above.
} That's not polynomially bounded.  I'm not sure if it's even
} exponentially bounded.

I tried to work that out and came up with O(n**((n-1)**(n-2))) before
deciding I didn't remember enough of that math to believe it.

The other problem, which I hadn't really thought about before, is that
this is all being applied inside ${str//pat/repl} expression, bounded
only on the right, which means it has to try every position in $str
even when the pattern does NOT match, which for a fair chunk of the
example input it does not.

There may be an optimization possible in the way that igetmatch()
interacts with pattrylen(), so as to skip forward over the longest
NON-match as well as skipping over the longest match.

References:
- [BUG] Long line makes pattern matching (by //) hog Zsh
  - From: Sebastian Gniazdowski
- Re: [BUG] Long line makes pattern matching (by //) hog Zsh
  - From: Bart Schaefer
- Re: [BUG] Long line makes pattern matching (by //) hog Zsh
  - From: Peter Stephenson

Messages sorted by: Reverse Date, Date, Thread, Author