Zsh Mailing List Archive Messages sorted by: Reverse Date, Date, Thread, Author

Re: Question about ingetc() vs. word-code

X-seq: zsh-workers 41405
From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
To: "zsh-workers@xxxxxxx" <zsh-workers@xxxxxxx>
Subject: Re: Question about ingetc() vs. word-code
Date: Sat, 8 Jul 2017 14:42:59 -0700
In-reply-to: <etPan.595ccf64.1e76a4fd.4e4d@zdharma.org>
List-help: <mailto:zsh-workers-help@zsh.org>
List-id: Zsh Workers List <zsh-workers.zsh.org>
List-post: <mailto:zsh-workers@zsh.org>
Mailing-list: contact zsh-workers-help@xxxxxxx; run by ezmlm
References: <etPan.595ccf64.1e76a4fd.4e4d@zdharma.org>

On Wed, Jul 5, 2017 at 4:37 AM, Sebastian Gniazdowski
<psprint@xxxxxxxxxxx> wrote:
> Hello,
> I noticed quite a large number of ingetc() calls

ingetc() is the central function used for reading any shell input that
has to undergo alias expansion or any other sort of lookahead -- stdio
only provides one byte of input "put-back" [ungetc()], but in order to
properly manage aliases and to differentiate things like "((..." [as
either two subshells or one math expression], the shell lexer may need
to read, put back, and then re-read an arbitrary amount of the input
stream.
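
To make the put-back idea concrete, here is a minimal sketch in C of a
reader with an arbitrary-depth put-back stack. This is not the actual
zsh code: struct reader, reader_getc(), reader_ungetc(),
reader_unget_str() and the PUSHBACK_MAX limit are all names invented
for illustration, and the real input layer is considerably more
involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit on how many bytes may be put back at once. */
#define PUSHBACK_MAX 256

/* Hypothetical reader: a source string plus a LIFO put-back stack. */
struct reader {
    const char *src;            /* remaining unread source text */
    char pushed[PUSHBACK_MAX];  /* bytes put back, most recent on top */
    size_t npushed;             /* number of bytes currently put back */
};

static void reader_init(struct reader *r, const char *src)
{
    r->src = src;
    r->npushed = 0;
}

/* Return the next byte, preferring bytes that were put back; -1 at end. */
static int reader_getc(struct reader *r)
{
    if (r->npushed > 0)
        return (unsigned char)r->pushed[--r->npushed];
    if (*r->src == '\0')
        return -1;
    return (unsigned char)*r->src++;
}

/* Put one byte back.  Unlike ungetc(), this may be repeated many times. */
static void reader_ungetc(struct reader *r, int c)
{
    assert(r->npushed < PUSHBACK_MAX);
    r->pushed[r->npushed++] = (char)c;
}

/* Put back a whole string so that it is re-read in its original order. */
static void reader_unget_str(struct reader *r, const char *s)
{
    size_t n = strlen(s);

    while (n > 0)
        reader_ungetc(r, s[--n]);
}
```

With something like this, a lexer could read "((" plus whatever
follows, decide which interpretation applies, put all of it back, and
re-read it under the other interpretation; an alias expansion could be
modelled as a reader_unget_str() of the replacement text.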

> Why does the compiled, non-eval source still show up in hunks in the ingetc() input, many times over? The eval code also appears, but this is probably expected.

The compiled wordcode includes all the original text of most strings
and identifiers, so that XTRACE and VERBOSE output can be properly
reproduced.  Only shell lexical tokens are turned into numeric codes.
An identifier that is referenced as well as assigned therefore appears
again, in full, at each $NAME expansion or function call.

A possible optimization for compiling whole digests of related
functions would be to build an identifier dictionary and refer to the
identifiers by a wordcode value followed by an offset into the
dictionary, but this would be wasteful for most small/single-function
compilations and would complicate the XTRACE playback.
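
A rough sketch of what such a dictionary might look like, in C. This
is purely hypothetical and does not reflect the existing wordcode
format: struct ident_dict, dict_intern(), dict_name() and DICT_MAX are
invented for illustration. Each distinct identifier is stored once,
and every later occurrence is represented by the same offset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capacity of the dictionary, in bytes. */
#define DICT_MAX 4096

/* Hypothetical dictionary: NUL-terminated names stored back to back. */
struct ident_dict {
    char buf[DICT_MAX];
    size_t used;        /* number of bytes of buf currently in use */
};

static void dict_init(struct ident_dict *d)
{
    d->used = 0;
}

/*
 * Return the offset of name within the dictionary, adding it on first
 * use.  Returns (size_t)-1 if the dictionary is full.
 */
static size_t dict_intern(struct ident_dict *d, const char *name)
{
    size_t len = strlen(name) + 1;  /* include the terminating NUL */
    size_t off = 0;

    /* Linear scan over existing entries; fine for a sketch. */
    while (off < d->used) {
        if (strcmp(d->buf + off, name) == 0)
            return off;
        off += strlen(d->buf + off) + 1;
    }
    if (len > DICT_MAX - d->used)
        return (size_t)-1;
    off = d->used;
    memcpy(d->buf + off, name, len);
    d->used += len;
    return off;
}

/* Recover the original text of an identifier, e.g. for XTRACE output. */
static const char *dict_name(const struct ident_dict *d, size_t off)
{
    return d->buf + off;
}
```

The extra indirection in dict_name() is the XTRACE complication
mentioned above: every playback of an identifier would need a lookup
instead of reading the text in place.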

Follow-Ups:
- Re: Question about ingetc() vs. word-code
  - From: Sebastian Gniazdowski

References:
- Question about ingetc() vs. word-code
  - From: Sebastian Gniazdowski
