Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: bufferwords() lexes a subshell in a shortloop repeat as a string
- X-seq: zsh-workers 37701
- From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
- To: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
- Subject: Re: bufferwords() lexes a subshell in a shortloop repeat as a string
- Date: Wed, 20 Jan 2016 07:47:54 +0000
- Cc: Peter Stephenson <p.stephenson@xxxxxxxxxxx>, Zsh hackers list <zsh-workers@xxxxxxx>
- In-reply-to: <CAH+w=7Z7d9Xc2ro9F1cMoyT_TeqmVNYzZc0vOnrCchtRi_4VDQ@mail.gmail.com>
- List-help: <mailto:zsh-workers-help@zsh.org>
- List-id: Zsh Workers List <zsh-workers.zsh.org>
- List-post: <mailto:zsh-workers@zsh.org>
- Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
- References: <20160115062648.GA14019@tarsus.local2> <20160115094117.5fcde75c@pwslap01u.europe.root.pri> <20160118022558.GC3979@tarsus.local2> <CAH+w=7Z7d9Xc2ro9F1cMoyT_TeqmVNYzZc0vOnrCchtRi_4VDQ@mail.gmail.com>
Bart Schaefer wrote on Mon, Jan 18, 2016 at 20:56:04 -0800:
> [Returning to the original topic of this thread ...]
>
> On Sun, Jan 17, 2016 at 6:25 PM, Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx> wrote:
> > What confuses me is that 'repeat 3 (x)' and 'repeat 3; do (x); done' are
> > split differently. ;-)
> >
> > Shouldn't both of them treat the "(x)" the same way [either both of
> > them considering it one unit, or both of them considering it three units]?
>
> As Peter said earlier, the (z) flag does nothing but break the string
> into syntactic shell words. With the exception of "for" loops, which
> are a weird special case because of "for ((...))", it does NOT
> interpret shell keywords to parse any corresponding loop structures.
> It knows a little about assignments and redirections but otherwise
> reads lexical tokens in their most generic possible context; you can
> think of it as having "lex" without "yacc" to drive it.
>
Okay; so what I was seeing was that bufferwords() knew that a DOLOOP token
is followed by a command position, but not that a REPEAT token is
followed by a token that's followed by a command position.

I think REPEAT is the only place where that happens: other reserved
words are followed immediately by a command position with no intervening
words. (Which is why get_comp_string() sets 'ins' to '2' only for
REPEAT tokens.)

Aside: bufferwords(), get_comp_string(), and z-sy-h's main loop have
something in common: they all drive the lexer and keep track of a little
bit of syntax. E.g., with this patch all of them keep track of "if the
command word is 'repeat', the word-after-next is a command word".
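
To illustrate that bookkeeping, here is a rough sketch in Python. It is not
the zsh C code: command_positions(), SEPARATORS and IMMEDIATE are names made
up for this example, the input is assumed to be already split into words,
and the rules are far simpler than what the real lexer drivers do. The only
point is the countdown: most reserved words put the very next word in command
position, while 'repeat' lets one word (the count) go by first.

```python
# Hypothetical, heavily simplified model; not zsh's actual implementation.

# Tokens after which the next word starts a new command.
SEPARATORS = {";", "&&", "||", "|", "&", "\n"}

# Reserved words (assumed subset) followed immediately by a command position.
IMMEDIATE = {"do", "then", "else", "elif", "if", "while", "until", "{", "(", "!"}


def command_positions(tokens):
    """Return the indices of the tokens that sit in command position."""
    positions = []
    # skip == 0:    the next word is a command word
    # skip  > 0:    that many words go by before the next command word
    # skip is None: we are among the arguments of a simple command
    skip = 0
    for i, tok in enumerate(tokens):
        if tok in SEPARATORS:
            skip = 0          # a separator always opens a command position
            continue
        if skip is None:
            continue          # plain argument
        if skip > 0:
            skip -= 1         # e.g. the count that follows 'repeat'
            continue
        positions.append(i)
        if tok == "repeat":
            skip = 1          # the word-after-next is a command word
        elif tok in IMMEDIATE:
            skip = 0          # the next word is a command word
        else:
            skip = None       # ordinary command: the rest are arguments
    return positions


short_form = ["repeat", "3", "(", "x", ")"]
long_form = ["repeat", "3", ";", "do", "(", "x", ")", ";", "done"]
print(command_positions(short_form))
print(command_positions(long_form))
```

In both forms the "(" ends up in command position in this model, which is
the property the two spellings of the loop should share.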
> (z) also does not expand aliases, which means that even if it did
> interpret keywords you could trivially break it by aliasing something
> else to expand as "repeat" or vice-versa. (In fact you can already
> break the magic "for" parsing the same way.)
Don't do that, then :-)
Cheers,
Daniel