Re: [htdig] update digging


Frank Guangxin Liu (frank@ctcqnx4.ctc.cummins.com)
Wed, 3 Mar 1999 08:22:25 -0500 (EST)


> On Tue, 2 Mar 1999, Frank Guangxin Liu wrote:
>
> > hmm. Maybe my apache 1.2.x server doesn't support If-Modified-Since header?
>
> Possibly. I guess the Apache people would be the ones to ask. ;-)
>
> > That is much better, though lots of network bandwidth is still wasted.
> > Is it safer to use two connections for each document (in case of update dig)?
> > HEAD and GET. Does the reply from HEAD provide more reliable information
> > and always give the last modification date?
>
> HEAD does supply the last modification date, but there's no benefit as far
> as reliability. Not all servers implement HEAD... Further,
> If-Modified-Since is nice since it does everything we want. If it isn't
> modified, it doesn't send it. If it is, it performs the usual GET. Using
> HEAD would *require* two connections, which increases network usage.
> (In other words, it's a big tradeoff. For any server that implements
> If-Modified-Since, that's the best solution.)
>
> However, this is something that I hope to test for the 3.2 series. I'm
> hoping to put in full HTTP/1.1 support, including Keep-Alives and
> persistent connections. This will *really* help network bandwidth.
>
> > hm.. I am running the latest version.
> > Both the initial db and the update db are created using htdig-3.1.1.
> > I had to re-create the initial db because the pdf_parser screwed up
> > in htdig-3.1.0.
>
> If you could send me info on how you're calling htdig and htmerge, I'd
> appreciate it. This should not be happening at all.
>
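
For reference, the If-Modified-Since exchange described above can be checked by hand against a single URL. This is only a minimal sketch, assuming curl is installed; check_doc and classify_status are hypothetical helper names, not part of ht://Dig, and the URL and date in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch only: check_doc and classify_status are hypothetical helpers,
# not part of ht://Dig.

# Map an HTTP status code to what an update dig would do with the document.
classify_status() {
    case "$1" in
        304) echo "unchanged" ;;   # Not Modified: the server sent no body
        200) echo "refetched" ;;   # modified, or header ignored: body was sent
        *)   echo "other" ;;       # redirects, errors, and so on
    esac
}

# Issue one conditional GET.
#   $1 = URL, $2 = HTTP date of the previous dig, $3 = file for the body
check_doc() {
    status=$(curl -s -o "$3" -w '%{http_code}' \
                  -H "If-Modified-Since: $2" "$1") || return 1
    classify_status "$status"
}
```

Usage would look like: check_doc http://www.example.com/ 'Tue, 23 Feb 1999 21:41:00 GMT' /tmp/body. A server that ignores the header should report "refetched" for every document, even ones that have not changed since that date.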

ls -l .....htdig/db
-rw-r--r--   1 root     root     544903168 Feb 23 22:02 db.docdb
-rw-r--r--   1 root     root     544903168 Feb 23 21:41 db.docdb.work
-rw-r--r--   1 root     root      15255552 Feb 23 21:41 db.docs.index
-rw-r--r--   1 root     root     653474521 Feb 23 21:11 db.wordlist.work
-rw-r--r--   1 root     root     523611136 Feb 23 21:11 db.words.db

I modified the original rundig script to do the update dig.
After rundig finishes, I manually copy/move the .work files ...

Here is rundig.update:

#!/bin/sh

#
# rundig
#
# $Id: rundig,v 1.7 1999/01/31 04:27:02 ghutchis Exp $
#
# This is a sample script to create a search database for ht://Dig.
#
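DBDIR=/home/httpd/htdig/db
COMMONDIR=/home/httpd/htdig/common
BINDIR=/home/httpd/htdig/bin

# Put temporary files in the database directory.
TMPDIR=$DBDIR
export TMPDIR

# Update dig: no -i, so the existing databases are reused.
# -s prints statistics; -a writes to alternate (.work) files.
"$BINDIR/htdig" -s -a
"$BINDIR/htmerge" -s -a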
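
The manual copy/move of the .work files mentioned above could be scripted roughly as follows. This is a sketch under one assumption: that every *.work file should simply replace the file of the same name without the suffix. Which files actually get moved by hand is not spelled out above, and promote_work_files is a hypothetical helper name, not part of ht://Dig.

```shell
#!/bin/sh
# Sketch only: promote_work_files is a hypothetical helper, not part of ht://Dig.
# Assumption: each NAME.work in directory $1 should replace NAME.
promote_work_files() {
    for f in "$1"/*.work; do
        [ -f "$f" ] || continue       # the glob matched nothing
        mv -f "$f" "${f%.work}"       # strip the .work suffix
    done
}
```

With the directories from rundig.update this would be called as: promote_work_files /home/httpd/htdig/db. It overwrites the live databases, so it should only be run after htmerge has finished successfully.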



This archive was generated by hypermail 2.0b3 on Thu Mar 04 1999 - 09:09:19 PST