On Wed, Nov 09, 2011 at 07:51:18PM -0800, Bart Schaefer wrote:
> This tends to imply that what's taking those 3-5 seconds is searching
> your history file for duplicate entries in order to enforce the
> hist_save_no_dups option. Also, inc_append_history may allow the file
> to grow to up to 10500 lines for a SAVEHIST of 10000, and those extra
> lines will be trimmed at shell exit.
>
> In more detail, when you use inc_append_history and/or share_history
> along with hist_save_no_dups, zsh re-reads and de-duplicates the entire
> file from disk after it has obtained a lock for it, rather than just
> dumping out the history that is already in memory, because it can't
> know if some other shell has appended something new to the file before
> the lock was obtained. The speed of the disk is inconsequential to the
> CPU expended doing the deduplication.

Okay, I wrote a simple, naive program to de-duplicate the lines in a
history file:

    import qualified Data.ByteString.Char8 as C8
    import Data.List (nub)
    import System.Environment (getArgs)
    import Control.Monad (liftM)

    -- Drop repeated lines, keeping the first occurrence of each.
    fixHist = nub . C8.lines

    main = do
        (fname:_) <- getArgs
        hl <- liftM fixHist $ C8.readFile fname
        C8.writeFile fname $ C8.unlines hl

The nub function is O(n^2) in time, and it still takes less than a
second to read the file, de-duplicate it and write it back:

    > ./hst .histfile.tst 0.89s user 0.01s system 99% cpu 0.901 total

I believe that using a linked hash table would improve performance
further.

Am I missing something if I say that history saving and de-duplication
is suboptimal?

-- 
Eugene N Dzhurinsky
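The remark about hash tables can be illustrated with a small variation on the program above. This is only a sketch, not what zsh does internally: it swaps nub for a single pass that tracks already-seen lines in a Data.Set, which brings the cost down to O(n log n) while still keeping the first occurrence of each line, as nub does. The name dedupKeepFirst is made up for this example, and Data.Set (from the containers package) stands in for the linked hash table mentioned above.

```haskell
import qualified Data.ByteString.Char8 as C8
import qualified Data.Set as Set
import System.Environment (getArgs)

-- Hypothetical helper: keep the first occurrence of each line and
-- preserve the original order. Each membership test and insert is
-- O(log n), so the whole pass is O(n log n) instead of nub's O(n^2).
dedupKeepFirst :: [C8.ByteString] -> [C8.ByteString]
dedupKeepFirst = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs

main :: IO ()
main = do
  args <- getArgs
  case args of
    (fname:_) -> do
      -- C8.readFile is strict, so the file is fully read before
      -- it is overwritten below.
      contents <- C8.readFile fname
      C8.writeFile fname (C8.unlines (dedupKeepFirst (C8.lines contents)))
    _ -> putStrLn "usage: hst HISTFILE"
```

Like the original program, this treats every line as a separate entry and makes no attempt to take a lock on the file, so it is a timing experiment rather than a replacement for what the shell does.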