Zsh Mailing List Archive
Messages sorted by:
Reverse Date,
Date,
Thread,
Author
Re: [PATCH] Abbreviate commit hash in prompt during git rebase
- X-seq: zsh-workers 50073
- From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
- To: Sam Bostock <sam.bostock@xxxxxxxxxxx>
- Cc: Lawrence Velázquez <larryv@xxxxxxx>, zsh-workers@xxxxxxx
- Subject: Re: [PATCH] Abbreviate commit hash in prompt during git rebase
- Date: Sun, 17 Apr 2022 17:52:28 +0000
- Archived-at: <https://zsh.org/workers/50073>
- In-reply-to: <CAHwnEohmcE7Wg6ZBdKmUgU-e7zryRgUXXWPJFRdxU9_nbZ=DpA@mail.gmail.com>
- List-id: <zsh-workers.zsh.org>
- References: <CAHwnEojkSP1hrNvBciwMaaPXre9r-Fs2W2F1dFnnMXuUR4aeaw@mail.gmail.com> <58222599-b634-401b-b70f-3e5fd5633839@www.fastmail.com> <CAHwnEohmcE7Wg6ZBdKmUgU-e7zryRgUXXWPJFRdxU9_nbZ=DpA@mail.gmail.com>
Sam Bostock wrote on Wed, Apr 13, 2022 at 13:53:40 -0600:
> Hi Lawrence,
>
> Thanks for the reply!
>
Have you seen my previous reply to you, <https://zsh.org/workers/50057>?
(For obvious reasons I'll re-send this question to you in a minute from
another address of mine. Apologies in advance for the duplicate.)
> Git's abbreviation mechanism ensures uniqueness, rather than strictly
> truncating. If I'm not mistaken, the core.abbrev setting is effectively
> just a minimum for how many characters to include, with git including as
> many additional characters as required to avoid colliding with another
> object in the repository. I believe the default is determined using a
> heuristic based on the number objects in the repository.
>
> So we could make it a setting and do truncation ourselves, but then we'd be
> displaying a value without uniqueness guarantees.
>
The operative word here is "guarantee". If we truncate to, say, 12 hex
digits, that's 48 bits (= 12 nibbles), so a repository would need to
have 2**24 commits (≈ 16M) in order to have a ≈50% chance of having two
commits with the same first 12 nibbles. (That's _commits_, not
_objects_, since IIRC «d41d8cd^{commit}» v. «d41d8cd^{blob}» is a valid
way to disambiguate.) Only the largest/oldest projects would have
history DAGs this large.¹
And if we truncate "too soon" and the resulting hash is ambiguous, then
what? Would that be a problem?
- It wouldn't break the use-case of looking at the prompt after «git
commit» or «git rebase --continue» to confirm the command worked.
- It wouldn't break the use-case of reviewing the scrollback and knowing
what the state was before each command.
- It _would_ break the use-case of copying the hash into the command
line, because then git would error out about the ambiguous prefix.
The workaround would be to extract the full hash from the reflogs or
from rebase's todo list.
This use-case is so common for me that I had my zshrc store the commit
hash in a global variable so I can easily use it in interactive
commands. (I do «typeset -g y=…» in zshrc and then «git show $y»
interactively.)
So, do we actually need to *guarantee* uniqueness?
¹ For instance, FreeBSD's ports repository, which started in 1994, has
0.6M commit objects (per «git rev-list --all --count») and autogenerates
12-nibble short hashes. (I didn't choose this repository because of
this property; it's just a repository I happened to have handy.)
> Given we're already in somewhat of an edge case (rebase in progress), is
> the penalty still a problem?
>
I don't consider rebases an edge case, so, "Yes.".
> Also, while I'm not familiar with the inner workings of vcs_info, I'm
> wondering if we recompute this every time the prompt is rendered, or if we
> lazily compute the values only when the rebase progresses?
>
Every time.
Cheers,
Daniel
> Sam
>
>
> On Tue., Apr. 12, 2022, 11:38 p.m. Lawrence Velázquez, <larryv@xxxxxxx>
> wrote:
>
> > Hi Sam,
> >
> > On Wed, Apr 13, 2022, at 12:41 AM, Sam Bostock wrote:
> > > First time contributor here
> >
> > Welcome!
> >
> > > diff --git a/Functions/VCS_Info/Backends/VCS_INFO_get_data_git
> > > b/Functions/VCS_Info/Backends/VCS_INFO_get_data_git
> > > index e45eebc8e..dd9c40ab4 100644
> > > --- a/Functions/VCS_Info/Backends/VCS_INFO_get_data_git
> > > +++ b/Functions/VCS_Info/Backends/VCS_INFO_get_data_git
> > > @@ -209,27 +209,15 @@ elif [[ -d "${gitdir}/rebase-merge" ]]; then
> > > return 0
> > > ;;
> > > (''(p|pick|e|edit|r|reword|f|fixup|s|squash)' '*)
> > > - # The line is of the form "pick $hash $subject".
> > > - # Just strip the verb and we're good to go.
> > > - p=${p#* }
> > > + # Typically, the line is of the form "pick $longhash
> > > $subject\n".
> > > # Special case: in an interactive rebase, if the user
> > > wrote "p $shorthash\n"
> > > # in the editor (without a description after the
> > > hash), then the .../done
> > > # file will contain "p $longhash $shorthash\n" (git
> > > 2.11.0) or "pick $longhash\n"
> > > # (git 2.19.0).
> > > - if [[ $p != *\ * ]]; then
> > > - # The line is of the form "pick $longhash\n"
> > > - #
> > > - # Mark the log message subject as unknown.
> > > - # TODO: Can we performantly obtain the subject?
> > > - p+=" ?"
> > > - elif (( ${#${p//[^ ]}} == 1 )) && [[ ${p%% *} == ${p#*
> > > }* ]]; then
> > > - # The line is of the form "p $longhash
> > > $shorthash\n"
> > > - #
> > > - # The shorthash is superfluous, so discard it,
> > > and mark
> > > - # the log message subject as unknown.
> > > - # TODO: Can we performantly obtain the subject?
> > > - p="${p%% *} ?"
> > > - fi
> > > + # Given we don't reliably have the subject, and the
> > > full hash is needlessly long
> > > + # for a prompt, we simply extract the hash (second
> > > word) and delegate to git to
> > > + # format it as "$shorthash $subject" according to
> > > git's hash abbreviation config.
> > > + p=$(${vcs_comm[cmd]} log -1 --pretty=format:'%h %s'
> > > $p[(w)2])
> > > ;;
> > > (''(x|exec) *)
> > > # The line is of the form 'exec foo bar baz' where
> > > 'foo bar
> >
> > It strikes me as a regression to run git if we already have the
> > subject; vcs_info is slow enough as it is. Could this be reserved
> > for the edge cases where we don't have the subject?
> >
> > (This would preclude what is arguably your primary goal, which is
> > using core.abbrev to choose between short and long hashes. But
> > that amounts to using git as a glorified string slicer. Perhaps
> > the choice could be determined by a style instead, and we could do
> > the truncation ourselves?)
> >
> > --
> > vq
> >
Messages sorted by:
Reverse Date,
Date,
Thread,
Author