Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: stack overflow in patmatch()



> On 11/05/2025 05:41 BST Mikael Magnusson <mikachu@xxxxxxxxx> wrote:
> A user on irc reported that zsh crashes when they invoked
> insert-unicode-char with zsh-syntax-highlighting.zsh [1] loaded. It
> turned out they pressed space instead of invoking the widget again
> after entering the unicode codepoint, which of course causes zsh to
> insert however many spaces correspond to the codepoint entered, which
> in this particular case was 8630 spaces (many).
> 
> Long story short, [1] has this line of code:
> [[ "$proc_buf" = (#b)(#s)(([[:space:]]|\\$'\n')#) ]]

There are two cases that look similar but are actually handled very
differently: something like

[[:space:]\\$'\n']#

would be easy --- you just count how many matches there are and then stop.
Unfortunately here one of the possibilities is a two-character sequence, so
there's no way of formulating it without requiring a full subpattern match.
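
As a rough illustration of why the single-character case is easy, here is
a minimal sketch in C.  It is not the code in pattern.c, and the helper
name count_space_run is made up for the example.  It handles only the
[[:space:]] part of the class and shows that counting needs a loop and no
recursion, so the stack use is constant however long the run is.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not taken from zsh: count how many leading
 * characters of s belong to a single-character class (here, whitespace).
 * One loop, no recursion, so stack use does not grow with the input. */
static size_t count_space_run(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0' && isspace((unsigned char)s[n]))
        n++;
    return n;
}
```

A two-character alternative such as backslash-newline does not fit this
shape, because the amount consumed per repetition is no longer fixed at
one character.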

As soon as you introduce a generalised expression, though, it becomes
much more complicated and you get pathological cases in backtracking,
since the matcher can't assume anything about the structure of what's
inside the parentheses with the '#' after them.  There's probably an
optimisation that can be done for the case (....)# that stops it being
recursive.  However, that's not a trivial fix --- without recursion,
where it all comes out automatically in local variables, you're passing
back state to the upper stack frame which has to be managed somehow.
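
To give an idea of what "managing the state" might look like, here is a
sketch under heavy assumptions, not a proposed patch.  It matches a toy
group (a space, tab or newline, or backslash-newline) repeated any number
of times and followed by a literal tail.  The names group_len and
match_closure_iter are invented for the example.  The backtrack points
that recursion would keep in local variables are stored in a heap array
instead.  The sketch relies on the toy group matching in at most one way
at each position; a generalised subpattern can match in several ways,
which is exactly what makes the real case hard.

```c
#include <stdlib.h>
#include <string.h>

/* Toy group: one of space/tab/newline, or the pair backslash-newline.
 * Returns the number of characters consumed, 0 if there is no match. */
static size_t group_len(const char *s)
{
    if (s[0] == '\\' && s[1] == '\n')
        return 2;
    if (s[0] == ' ' || s[0] == '\t' || s[0] == '\n')
        return 1;
    return 0;
}

/* Does s match (group)# followed by exactly the literal string tail?
 * Each possible end of the closure is recorded in a heap array and the
 * tail is then tried from the longest repetition down to zero. */
static int match_closure_iter(const char *s, const char *tail)
{
    size_t cap = 16, n = 0, pos = 0, len;
    size_t *ends = malloc(cap * sizeof *ends);
    int ok = 0;

    if (ends == NULL)
        return 0;
    ends[n++] = 0;                      /* zero repetitions */
    while ((len = group_len(s + pos)) > 0) {
        pos += len;
        if (n == cap) {                 /* grow the explicit stack */
            size_t *tmp = realloc(ends, 2 * cap * sizeof *ends);
            if (tmp == NULL) {
                free(ends);
                return 0;
            }
            ends = tmp;
            cap *= 2;
        }
        ends[n++] = pos;
    }
    while (n > 0 && !ok)                /* backtrack: longest first */
        ok = strcmp(s + ends[--n], tail) == 0;
    free(ends);
    return ok;
}
```

With this shape the 8630-space input costs a few kilobytes of heap rather
than thousands of stack frames, but every piece of state the recursive
version gets for free has to be carried explicitly.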

> which causes this backtrace (parts elided)
> (gdb) bt
> #0  0x000000000049d098 in patmatch (
>     prog=<error reading variable: Cannot access memory at address
> 0x7fffff7fef38>)
>     at pattern.c:2695
> #1  0x000000000049e673 in patmatch (prog=0x5073d0) at pattern.c:3252
> #2  0x000000000049dd8e in patmatch (prog=0x507398) at pattern.c:2978
> #3  0x000000000049e673 in patmatch (prog=0x507390) at pattern.c:3252
> #4  0x000000000049dcdc in patmatch (prog=0x507388) at pattern.c:2952
> #5  0x000000000049e673 in patmatch (prog=0x5073d0) at pattern.c:3252
> #6  0x000000000049dd8e in patmatch (prog=0x507398) at pattern.c:2978
> #7  0x000000000049e673 in patmatch (prog=0x507390) at pattern.c:3252
> #8  0x000000000049dcdc in patmatch (prog=0x507388) at pattern.c:2952
> ...
> #8179 0x000000000049e673 in patmatch (prog=0x507390) at pattern.c:3252
> #8180 0x000000000049dcdc in patmatch (prog=0x507388) at pattern.c:2952
> #8181 0x000000000049e673 in patmatch (prog=0x507370) at pattern.c:3252
> #8182 0x000000000049dcdc in patmatch (prog=0x507358) at pattern.c:2952
> #8183 0x000000000049ca21 in pattryrefs (prog=0x507320,

It should certainly be possible to limit the depth of calls to
patmatch().  The big problem is that there's never a good compromise for
the limit --- we've seen this in function recursion, where some people
expect something to work that crashes on another system.  But if we just
return failure for difficult cases we can at least avoid a crash.
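
The idea can be sketched as follows.  This is a toy recursive matcher,
not patmatch() itself; MAX_DEPTH, match_group and match_closure are names
and values invented for the example, and the limit of 1000 is arbitrary.
Each repetition of the group costs one level of recursion, as in the
backtrace above, and the function reports failure once the depth passes
the limit instead of running off the end of the stack.

```c
#include <stddef.h>

#define MAX_DEPTH 1000  /* arbitrary illustrative limit, not a zsh setting */

/* Toy group: one of space/tab/newline, or the pair backslash-newline.
 * Returns the number of characters consumed, 0 if there is no match. */
static size_t match_group(const char *s)
{
    if (s[0] == '\\' && s[1] == '\n')
        return 2;
    if (s[0] == ' ' || s[0] == '\t' || s[0] == '\n')
        return 1;
    return 0;
}

/* Match the whole of s against (group)#, one recursive call per
 * repetition.  Returns 1 on a match, 0 on no match OR when the depth
 * limit is exceeded, so a pathological input fails rather than crashes. */
static int match_closure(const char *s, int depth)
{
    size_t n;

    if (depth > MAX_DEPTH)
        return 0;               /* too deep: give up cleanly */
    if (*s == '\0')
        return 1;               /* consumed everything */
    n = match_group(s);
    if (n == 0)
        return 0;               /* group does not match here */
    return match_closure(s + n, depth + 1);
}
```

The cost is visible in the sketch: a string of 8630 spaces really does
match the pattern, but with this limit the function says it does not.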

pws



