Re: [htdig] update digging

Frank Guangxin Liu
Wed, 3 Mar 1999 08:22:25 -0500 (EST)

> On Tue, 2 Mar 1999, Frank Guangxin Liu wrote:
> > hmm. Maybe my apache 1.2.x server doesn't support If-Modified-Since header?
> Possibly. I guess the Apache people would be the ones to ask. ;-)
> > That is much better, though lots of network bandwidth is still wasted.
> > Is it safer to use two connections for each document (in case of update dig)?
> > HEAD and GET. Does the reply from HEAD provide more reliable information
> > and always give the last modification date?
> HEAD does supply the last modification date, but there's no benefit as far
> as reliability. Not all servers implement HEAD... Further,
> If-Modified-Since is nice since it does everything we want. If it isn't
> modified, it doesn't send it. If it is, it performs the usual GET. Using
> HEAD would *require* two connections, which increases network usage.
> (In other words, it's a big tradeoff. For any server that implements
> If-Modified-Since, that's the best solution.)
> However, this is something that I hope to test for the 3.2 series. I'm
> hoping to put in full HTTP/1.1 support, including Keep-Alives and
> persistent connections. This will *really* help network bandwidth.
> > hm.. I am running the latest version.
> > Both the initial db and the update db are created using htdig-3.1.1.
> > I had to re-create the initial db because the pdf_parser screwed up
> > in htdig-3.1.0.
> If you could send me info on how you're calling htdig and htmerge, I'd
> appreciate it. This should not be happening at all.
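
For reference, the If-Modified-Since exchange described in the quoted text can be sketched in shell. This is only an illustration of the protocol, not how htdig does it internally; the function names, host, path and date are made up for the example.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of ht://Dig.

# Build the text of a conditional GET request.
#   $1 = host, $2 = path, $3 = HTTP-date of the previous dig
conditional_get_request() {
    printf 'GET %s HTTP/1.0\r\n' "$2"
    printf 'Host: %s\r\n' "$1"
    printf 'If-Modified-Since: %s\r\n' "$3"
    printf '\r\n'
}

# Map the response status code to what an update dig would do.
#   304 = not modified, no body is sent
#   200 = modified, the document follows as in a normal GET
update_action() {
    case "$1" in
        304) echo "skip" ;;
        200) echo "reindex" ;;
        *)   echo "other" ;;
    esac
}
```

One connection covers both cases, which is the point made above: a HEAD followed by a GET would need two. If curl is installed, something like `curl -s -o /dev/null -w '%{http_code}\n' -H 'If-Modified-Since: Tue, 02 Mar 1999 00:00:00 GMT' http://www.example.com/` should print the status code a given server returns, which would show whether it honours the header.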

ls -l .....htdig/db
-rw-r--r-- 1 root root 544903168 Feb 23 22:02 db.docdb
-rw-r--r-- 1 root root 544903168 Feb 23 21:41
-rw-r--r-- 1 root root 15255552 Feb 23 21:41
-rw-r--r-- 1 root root 653474521 Feb 23 21:11
-rw-r--r-- 1 root root 523611136 Feb 23 21:11 db.words.db

I modified the original rundig script for the update dig task.
After rundig is done, I will manually copy/move .work files ...
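
The manual copy/move step could be scripted roughly like this. It is a sketch that assumes the -a runs leave their output next to the live databases with a ".work" suffix; the function name is made up, and the directory is whatever your db directory is.

```shell
#!/bin/sh
# Sketch only: promote_work_files is a hypothetical helper.
# Renames every DIR/*.work file to the same name without the suffix,
# overwriting the old file of that name if there is one.
promote_work_files() {
    dir=$1
    for f in "$dir"/*.work; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        mv -f "$f" "${f%.work}"
    done
}
```

It would be called as `promote_work_files /path/to/db` once htdig and htmerge have both finished. Copying instead of moving would keep the .work files around for the next update, at the cost of the extra disk space.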

Here is the rundig.update:


# rundig
# $Id: rundig,v 1.7 1999/01/31 04:27:02 ghutchis Exp $
# This is a sample script to create a search database for ht://Dig.
export TMPDIR
$BINDIR/htdig -s -a
$BINDIR/htmerge -s -a

This archive was generated by hypermail 2.0b3 on Thu Mar 04 1999 - 09:09:19 PST