Zsh Mailing List Archive

Re: Slurping a file (was: more spllitting travails)

X-seq: zsh-users 29477
From: Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx>
To: Zsh Users <zsh-users@xxxxxxx>
Subject: Re: Slurping a file (was: more spllitting travails)
Date: Sun, 14 Jan 2024 14:09:16 -0800
Archived-at: <https://zsh.org/users/29477>
In-reply-to: <CAN=4vMq=E4s2a0sDFq-Mc8=pVzPnYOM9NaTmesgXQqi+O+mHpw@mail.gmail.com>
List-id: <zsh-users.zsh.org>
References: <ca1761f1-6d8f-452a-b16d-2bfce9076e25@eastlink.ca> <CAH+w=7ZJsr7hGRvD8f-wUogPcGt0DMOcPyiYMpcwCsbBNkRwuQ@mail.gmail.com> <CAA=-s3zc5a+PA7draaA=FmXtwU9K8RrHbb70HbQN8MhmuXTYrQ@mail.gmail.com> <CAH+w=7bAWOF-v36hdNjaxBB-5rhjsp97mAtyESyR2OcojcEFUQ@mail.gmail.com> <205735b2-11e1-4b5e-baa2-7418753f591f@eastlink.ca> <CAH+w=7Y5_oQL20z7mkMUGSLnsdc9ceJ3=QqdAHVRF9jDZ_hZoQ@mail.gmail.com> <CAA=-s3x4nkLST56mhpWqb9OXUQR8081ew63p+5sEsyw5QmMdpw@mail.gmail.com> <CAH+w=7Yi+M1vthseF3Awp9JJh5KuFoCbFjLa--a22BGJgEJK_g@mail.gmail.com> <CAN=4vMpexntEq=hZcmsiXySy-2ptXMvBKunJ1knDkkS+4sYYLA@mail.gmail.com> <CAH+w=7aT-gbt7PRo=uvPK5=+rR3X-PE7nEssOkh+=fxwdeG_7w@mail.gmail.com> <CAN=4vMq=E4s2a0sDFq-Mc8=pVzPnYOM9NaTmesgXQqi+O+mHpw@mail.gmail.com>

On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
<roman.perepelitsa@xxxxxxxxx> wrote:
>
> On Sat, Jan 13, 2024 at 9:02 PM Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
> >
> >   IFS= read -rd '' file_content <file
>
> Besides being unable to read files with NUL bytes, this
> solution has further drawbacks:
>
> - It's impossible to distinguish EOF from I/O error.

Pretty sure you can do that by examining $ERRNO on nonzero status?

> - It's slow when reading from non-file file descriptors.
> - It's slower than the optimized sysread-based slurp (see below) for
> larger files.

I'm curious whether
  setopt nomultibyte
  read -u 0 -k 8192 ...
is actually that much slower in a slurp-like loop.

> Here's a version with linear time complexity:
>
>     function slurp() {
>       emulate -L zsh -o no_multibyte
>       zmodload zsh/system || return
>       local -a content
>       local -i i
>       while true; do
>         sysread 'content[++i]' && continue

Another thought:  Use sysread's -c count option to get the number of
bytes read and the -s $size option to set the buffer size.  If
(( count == size )), double $size for the next read.

>         (( $? == 5 )) || return
>         break
>       done
>       typeset -g REPLY=${(j::)content}

Why the typeset here?  Just assign?

>     }

Follow-Ups:
- Re: Slurping a file (was: more spllitting travails)
  - From: Roman Perepelitsa

References:
- more splitting travails
  - From: Ray Andrews
- Re: more splitting travails
  - From: Bart Schaefer
- Fwd: more splitting travails
  - From: Bart Schaefer
- Re: Fwd: more splitting travails
  - From: Ray Andrews
- Re: Fwd: more splitting travails
  - From: Bart Schaefer
- Re: Fwd: more splitting travails
  - From: Mark J. Reed
- Re: Fwd: more splitting travails
  - From: Bart Schaefer
- Re: Fwd: more splitting travails
  - From: Roman Perepelitsa
- Slurping a file (was: more spllitting travails)
  - From: Bart Schaefer
- Re: Slurping a file (was: more spllitting travails)
  - From: Roman Perepelitsa