Re: htdig: non-destructive updates


webmaster@www.nisu.flinders.edu.au
Wed, 17 Jun 1998 14:13:47 +0930 (CST)


On 16 Jun, Michael Graff wrote:
> Is there any reasonable way to do this?
>
> (1) index a group of sites.
> (2) nightly, re-index them, updating the current database as
> needed.
> (3) keep searches up and running while the indexing is taking
> place.
> (4) not keep 2 copies of the database. It is already large.
>
> --Michael

Short answer: the -a argument to htdig

Longer:

I've put together a script, modified from rundig, which uses the -a
argument as shown below. I run it as needed, but you could run it from
cron. The comments should make it clear. You'll probably want to skip the
bit where I keep the old versions of the databases.

#! /bin/sh

#
# updatedig
#
# This is a script to update the search database for ht://Dig.
#
if [ "$1" = "-v" ]; then
    verbose=-v
fi

# -a: use alternate work files so searches can still run during indexing
# -t: create an ASCII version of the document database in doc_list, as
#     specified in the config file
# -s: print stats after completion
/web/webdocs/htdig/bin/htdig -a -t $verbose -s
/web/webdocs/htdig/bin/htmerge -a $verbose -s
/web/webdocs/htdig/bin/htnotify $verbose

# Because the -a switch creates alternate work files, but doesn't seem to move
# them into the correct place, we will do it here.
mv /web/webdocs/htdig/db/db.docdb /web/webdocs/htdig/db/db.docdb.old
mv /web/webdocs/htdig/db/db.docdb.work /web/webdocs/htdig/db/db.docdb

mv /web/webdocs/htdig/db/db.docs.index /web/webdocs/htdig/db/db.docs.index.old
mv /web/webdocs/htdig/db/db.docs.index.work /web/webdocs/htdig/db/db.docs.index

mv /web/webdocs/htdig/db/db.wordlist /web/webdocs/htdig/db/db.wordlist.old
mv /web/webdocs/htdig/db/db.wordlist.work /web/webdocs/htdig/db/db.wordlist

mv /web/webdocs/htdig/db/db.words.gdbm /web/webdocs/htdig/db/db.words.gdbm.old
mv /web/webdocs/htdig/db/db.words.gdbm.work /web/webdocs/htdig/db/db.words.gdbm

#
# Only create the endings database if it doesn't already exist.
# This database is static, so even if pages change, this database will not
# need to be rebuilt.
#
if [ ! -f /web/webdocs/htdig/common/word2root.gdbm ]
then
    /web/webdocs/htdig/bin/htfuzzy $verbose endings
fi

# The next step needs to be rerun whenever synonyms are added, modified or
# removed. I guess the simplest way is to delete synonyms.gdbm before
# running this script.

if [ ! -f /web/webdocs/htdig/common/synonyms.gdbm ]
then
    /web/webdocs/htdig/bin/htfuzzy $verbose synonyms
fi
# end updatedig
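
If you do skip the .old copies, the renaming step could be collapsed into a
loop. This is only a rough sketch: the function name promote_work_files is
made up for illustration, and it assumes the same four database file names
and the same db directory as the script above.

```shell
#!/bin/sh
# Sketch only: move the .work files produced by "htdig -a" / "htmerge -a"
# into place without keeping .old copies.
# ASSUMPTIONS: the function name is hypothetical; the default directory and
# the four file names are taken from the updatedig script above.
promote_work_files() {
    dbdir=${1:-/web/webdocs/htdig/db}
    for name in db.docdb db.docs.index db.wordlist db.words.gdbm
    do
        # Only rename when a work file actually exists, so a failed
        # index run leaves the live database untouched.
        if [ -f "$dbdir/$name.work" ]; then
            mv -f "$dbdir/$name.work" "$dbdir/$name"
        fi
    done
}
```

Calling promote_work_files with no argument uses the db directory from the
script. Each mv within one filesystem is a single rename, so a search sees
either the old or the new file, but the four files are not swapped as one
unit, so there is still a brief window where they are out of step.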

Hope that helps...
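
P.S. On the synonyms question in the script comments: instead of deleting
synonyms.gdbm by hand, the script could compare timestamps. A minimal
sketch, assuming the plain-text synonyms list sits next to the database as
/web/webdocs/htdig/common/synonyms (check your config file, that path is a
guess), with a made-up helper name synonyms_stale:

```shell
#!/bin/sh
# Sketch only: decide whether the synonyms database needs rebuilding.
# ASSUMPTIONS: the function name is hypothetical, and the location of the
# plain-text synonyms list is a guess; adjust both paths to your setup.
synonyms_stale() {
    src=$1   # plain-text synonyms list
    db=$2    # generated synonyms.gdbm
    # Stale if the database does not exist yet...
    [ -f "$db" ] || return 0
    # ...or if the source list was modified more recently than the database.
    [ -n "$(find "$src" -newer "$db" 2>/dev/null)" ]
}

# Possible use inside updatedig (commented out here):
# if synonyms_stale /web/webdocs/htdig/common/synonyms \
#                   /web/webdocs/htdig/common/synonyms.gdbm
# then
#     /web/webdocs/htdig/bin/htfuzzy $verbose synonyms
# fi
```

That keeps the "only build it when needed" behaviour of the original test
while also catching edits to the synonyms list.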

-- 
David Robley

WEBMASTER                          | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet                           | http://auseinet.flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA
Visit the PHP mirror at http://au.php.net:81/



