htdig: Updating and merging


Clyde Brown (WEBMASTER@hsc.edu)
Tue, 15 Dec 1998 15:34:33 -0500


The "How it works" page in the ht://dig documentation says, "if you want to
only update changed documents, these changes have to be merged into the
searchable database." I'm not clear on how that is accomplished. Do I use
the config file to limit digging to directories that I know have been
updated?

I have been running htdig with the -a option to use alternate work files.
I found that new digs and merges didn't update the databases that htsearch
was using. In other words, I had a folder full of 'db.docdb',
'db.wordlist', etc. as well as 'db.docdb.work', 'db.wordlist.work', etc.
Only the '.work' files were being updated. So I have been using the
following 'rundig' script:

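  #!/bin/sh
  # Dig and merge into the alternate '.work' files (-a) so the live
  # databases used by htsearch are not touched while this runs.
  /home/httpd/htdig/bin/htdig -v -a -s -t
  /home/httpd/htdig/bin/htmerge -v -a -s
  # Replace each live database file with its freshly built '.work' copy.
  mv /home/httpd/htdig/db/db.wordlist.work \
     /home/httpd/htdig/db/db.wordlist
  mv /home/httpd/htdig/db/db.words.db.work \
     /home/httpd/htdig/db/db.words.db
  mv /home/httpd/htdig/db/db.docs.index.work \
     /home/httpd/htdig/db/db.docs.index
  mv /home/httpd/htdig/db/db.docdb.work \
     /home/httpd/htdig/db/db.docdb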

The effect is a completely new dig and merge into the '.work' files, which
then overwrite the old database. My impression from the documentation is
that there is a better way to do updates. Any suggestions on how to use my
work files?
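
For illustration, the overwrite step could be sketched as a small shell
function that moves the four '.work' files only when the whole set is
present, so a failed dig or merge does not leave htsearch with a
half-replaced database. The function name promote_work_files is made up
for this sketch and is not part of ht://dig; the file names and the
directory are the ones from the script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ht://dig): promote every '.work' file
# in a database directory to its live name, but only if the complete set
# is present and non-empty.
promote_work_files() {
    db_dir=$1
    # File names taken from the rundig script above.
    names="db.wordlist db.words.db db.docs.index db.docdb"

    # First pass: refuse to touch anything if a work file is missing
    # or empty.
    for name in $names; do
        if [ ! -s "$db_dir/$name.work" ]; then
            echo "missing or empty: $db_dir/$name.work" >&2
            return 1
        fi
    done

    # Second pass: replace each live file with its work copy.
    for name in $names; do
        mv "$db_dir/$name.work" "$db_dir/$name" || return 1
    done
}

# Example call (path from the script above):
#   promote_work_files /home/httpd/htdig/db
```

Checking first and moving second keeps the old files in place whenever
the new set is incomplete. It does not make the swap atomic: htsearch
could still see a mix of old and new files for a moment while the moves
run.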

Clyde C. Brown
Webmaster, Hampden-Sydney College
(804) 223-6856
<mailto:webmaster@hsc.edu>
<http://www.hsc.edu/>
