Re: [htdig3-dev] Htdig database backend


From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Tue Dec 14 1999 - 11:54:54 PST


On Tue, 14 Dec 1999 loic@ceic.com wrote:

> I think URL state descriptions, robots.txt content, and cookies are all
> candidates to be stored on disk. One *very* interesting feature would
> be a restartable crawler: run htdig, hit ^C, run htdig again, and it
> restarts where it stopped. Once you store the state of your crawler in
> a database, you get that advantage.

You can already get a limited form of restart with the -l flag contributed
by Didier Gautheron. It stores the current state of the retriever in a
file, which it re-reads on the next invocation.
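
To make the idea concrete, here is a rough sketch of the general
save-on-interrupt, reload-on-start pattern. It is not htdig's code and
says nothing about how -l is actually implemented: the state file name,
the JSON format, and the functions load_state, save_state, crawl and
fetch_links are all made up for illustration, and a plain file stands in
for the database backend under discussion.

```python
import json
import os
from collections import deque

STATE_FILE = "crawl_state.json"  # hypothetical name, not htdig's actual state file


def load_state(path, seeds):
    """Return (pending, visited), resuming from `path` if it exists."""
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
        return deque(data["pending"]), set(data["visited"])
    return deque(seeds), set()


def save_state(path, pending, visited):
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"pending": list(pending), "visited": sorted(visited)}, f)
    os.replace(tmp, path)


def crawl(seeds, fetch_links, path=STATE_FILE):
    """Crawl until the queue is empty. On ^C, save state and return False."""
    pending, visited = load_state(path, seeds)
    try:
        while pending:
            url = pending[0]  # peek only, so an interrupt mid-fetch keeps the URL queued
            if url not in visited:
                for link in fetch_links(url):  # caller-supplied: URL -> list of URLs
                    if link not in visited:
                        pending.append(link)
                visited.add(url)
            pending.popleft()
    except KeyboardInterrupt:
        save_state(path, pending, visited)
        return False
    if os.path.exists(path):
        os.remove(path)  # finished cleanly, nothing left to resume
    return True
```

Calling crawl() a second time with the same path picks up the saved
queue and ignores the seeds. A URL that was being fetched when ^C
arrived is fetched again on restart, which is the price of only saving
state at interrupt time.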

-Geoff

