Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

clobber, diff, and any other suggestions...



	some of the users at my site have been complaining about a very spotty 
newsfeed, so i complained in turn to our upstream provider, who responded that 
in order to do anything, i would need to provide them with a list of missing 
articles.  i whipped up this script to climb the directory tree, create a list 
of referenced articles and a list of message-ids for present articles, and find 
out which articles that are being referenced aren't actually present; i was 
wondering if any kind souls could help me tweak it some (and, in the spirit of 
tjl's function archive, let me and others learn by example).
	the first little block of script is, obviously, to create zero-length 
files (or zero out existing files), because the current settings on zsh on my 
machine don't let >> create a file, only append, and don't let > clobber a file, 
only create.  i could swear that the last zsh install i used had the opposite 
behavior for both, and that the option to set it was clobber/noclobber, but 
doing a search on noclobber in the man page returns nothing, while searching for 
clobber only returns HIST_ALLOW_CLOBBER.  so first, what is that switch?  is 
there some one-time only version of > and >> to toggle that behavior (akin to 
${=variable}, which turns on word splitting for that one substitution) that i 
could use here, since i have become used to not being able to clobber files with 
>, and now sometimes even use it to test for a file's existence?
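
A minimal sketch of the behaviour asked about above, not a definitive answer for every zsh version. The zsh manual lists the option as CLOBBER (NO_CLOBBER when unset), which may be why a search for lowercase "noclobber" comes up empty. `>|` is the one-shot form of `>` that ignores the option, and zsh also documents `>>|` for an append that is allowed to create the file. The scratch directory and the file name `list` are made up for the example. `set -o noclobber` is used instead of `setopt noclobber` only because other sh-style shells accept it too.

```shell
# sketch: noclobber behaviour and the one-shot override
tmp=$(mktemp -d)          # scratch directory, hypothetical
set -o noclobber          # same effect as 'setopt noclobber' in zsh

: >| "$tmp/list"          # >| creates or truncates regardless of noclobber
echo first >| "$tmp/list"

# a plain > onto an existing file is now refused; the subshell keeps the
# failure contained and the outer 2> hides the shell's complaint
if ( echo second > "$tmp/list" ) 2>/dev/null ; then
   clobbered=yes
else
   clobbered=no
fi

set +o noclobber          # back to the permissive behaviour
```

With this in hand, the `cp /dev/null` lines in the script could become `: >| $idfile` and so on, while the refusing `>` stays available as a crude existence test.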
	the second block should first make a list of every directory under the 
one in which the script is run, and then check for lines beginning with 
'Message-ID', 'Message-Id', or 'References' in all files in those directories, 
parsing out the appropriate field from such lines if they exist.  i stuck in the 
if...then loop because the script would hang if there was a directory that only 
contained other dirs, not files, and the 2> /dev/null in that test was to 
eliminate the error that ls would return when the test failed; the loop _does_ 
stop the script from hanging, but the error text still appears on-screen (which 
isn't a big problem, but which i wouldn't mind fixing somehow).  also, i'm sure 
that that loop could be cleaned up in a lot of different ways--maybe an awk 
statement to do all of the parsing?
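
One possible shape for the awk idea, as a sketch under stated assumptions rather than a drop-in replacement. The hang most likely comes from grep reading standard input when the backquoted `ls` expands to nothing. The leftover error text is probably zsh itself reporting the failed glob before `ls` ever runs, so the `2> /dev/null` never applies. Letting `find` hand only regular files to awk sidesteps both. The spool layout, the file contents and the `id`/`ref` tags on the output are invented for the example, and folded (multi-line) References headers are not handled.

```shell
# sketch: one pass over every regular file, no per-directory ls test needed
spool=$(mktemp -d)        # hypothetical stand-in for the news spool
mkdir -p "$spool/comp/a" "$spool/empty/sub"
printf 'Message-ID: <1@x>\nReferences: <0@x> <9@x>\n\nbody\n' > "$spool/comp/a/1"
printf 'Message-Id: <2@x>\nReferences: <1@x>\n\nbody\n' > "$spool/comp/a/2"

find "$spool" -type f -exec awk '
   FNR == 1                     { inhdr = 1 }   # new file: back in the headers
   inhdr && /^$/                { inhdr = 0 }   # first blank line ends the headers
   inhdr && /^Message-I[dD]: </ { print "id", $2 }
   inhdr && /^References: </    { for (i = 2; i <= NF; i++) print "ref", $i }
' {} + | sort -u > "$spool.out"
```

The tagged lines can then be split into the two lists with something like `grep '^id '` and `grep '^ref '`. Directories that hold only other directories simply contribute nothing.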
	the third block creates a new list of references, one-per-line, and 
sorts it and gets rid of duplicates.  i thought that enclosing the entire while 
loop in curly braces, getting rid of the >> for the output of the echo, and 
piping the output of the entire brace-enclosed block into the sort command would 
let me get rid of those temp files, but that didn't seem to work.  what was i 
missing there?
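
A small sketch of the pipe-the-whole-loop idea, assuming nothing about what went wrong in the brace version. A `while ... done` loop is itself a single command, so its input redirection and an output pipe can both hang off the `done`, with no braces or temp files. The file behind `$refs` and its contents are made up, and the splitting is done with `tr` rather than `${=line}` so the example does not depend on zsh's word-splitting rules.

```shell
# sketch: split references one-per-line and de-duplicate without temp files
refs=$(mktemp)            # hypothetical stand-in for $reffile
printf '<b@x> <a@x>\n<a@x>\t<c@x>\n' >| "$refs"

# the loop's combined output goes straight into sort -u
unique=$(
   while read -r line ; do
      printf '%s\n' "$line" | tr -s ' \t' '\n'   # one id per line
   done < "$refs" | sort -u
)

# the loop is not strictly needed: tr can read the file directly
unique2=$(tr -s ' \t' '\n' < "$refs" | sort -u)
```

Both forms leave the same three ids in `$unique` and `$unique2`. The second is the shorter route when nothing else has to happen per line.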
	finally, and this is the main reason why i'm writing, the last block 
should, as i see it, find any message-ids that occur in both the references list 
and the message-ids list, and output them to the file $matches.  the next line 
diffs $matches against the references list; since the only differences should be 
lines that appear in the references list but not in $matches, i would think that 
the diff output would be a list of message-ids for all of the missing articles, 
each prepended with a '>' (no quotes).  the actual output, however, has a few 
ids that are prepended with a '<', which would mean that these were articles 
that appeared in the match-list but not the original reference list; since the 
match-list can only contain lines that are in the original reference list, i 
have a feeling that something is seriously awry here.
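
One plausible source of the stray '<' lines, offered as a guess and not a diagnosis: `grep $line $idfile` prints every line of the id list that matches, so an id that occurs more than once there (a crossposted article stored in several directories, say) lands in $matches several times, and diff then sees lines on the left that have no partner on the right. grep also treats the id as a regular expression and a substring match. A sketch of the same comparison done with `comm` on two sorted, de-duplicated lists follows. The ids and file names are invented for the example.

```shell
# sketch: referenced-but-missing ids via comm instead of grep + diff
work=$(mktemp -d)         # scratch directory, hypothetical
# ids of articles that are present; <2@x> appears twice, as a crosspost would
printf '<1@x>\n<2@x>\n<2@x>\n' | sort -u > "$work/ids"
# ids that are referenced somewhere
printf '<0@x>\n<1@x>\n<2@x>\n<9@x>\n' | sort -u > "$work/refs"

# comm -13 drops lines unique to the first file and lines common to both,
# leaving only lines unique to the second file: referenced but not present
missing=$(comm -13 "$work/ids" "$work/refs")
# missing now holds <0@x> and <9@x>, one per line
```

Both inputs must be sorted the same way for `comm` to work, which the `sort -u` calls take care of here.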
	ideas?  suggestions?  comments?  a one-line obscure zsh command that 
would do all of this much more cleanly :) ?

	tia,
	sweth.
	
#!/usr/local/bin/zsh

outdir=$HOME/newses
idfile=$outdir/mesgids
reffile=$outdir/refids
matches=$outdir/matchids
misses=$outdir/missids

echo 'zeroing files'
cp /dev/null $idfile
cp /dev/null $reffile
cp /dev/null $misses
cp /dev/null $matches
cp /dev/null ${reffile}.list
cp /dev/null ${reffile}.list2

for dir in `find . -type d -name '*' -print` ; do
   if [[ -n `ls $dir/*(.) 2> /dev/null` ]] ; then
      echo "grepping mesg-ids in $dir..."
      grep '^Message-I[dD]: <.*>' `ls ${dir}/*(.)` | cut -d':' -f3- \
        | tr -d ' ' | sort >> $idfile
      echo "grepping refs in $dir..."
      grep '^References: <.*>' `ls ${dir}/*(.)` | cut -d':' -f3- \
        | sed 's/^ //' >> $reffile
   fi;
done 

echo "splitting words..."
while read line ; do
   items=(${=line})
   for item in $items ; do
      echo $item >> ${reffile}.list
   done ;
done < $reffile
sort -u ${reffile}.list >> ${reffile}.list2

echo 'finding matches...'
while read line ; do
   grep $line $idfile >> $matches
done < ${reffile}.list2

diff $matches ${reffile}.list2 > $misses


-- 
"Countin' on a remedy I've counted on before
Goin' with a cure that's never failed me
What you call the disease
I call the remedy"  -- The Mighty Mighty Bosstones


