Re: [htdig] update digging


Geoff Hutchison (ghutchis@wso.williams.edu)
Tue, 2 Mar 1999 22:49:27 -0500 (EST)


On Tue, 2 Mar 1999, Frank Guangxin Liu wrote:

> Will htdig follow those new URLs though they are not in the original
> db file?

Yes! It uses the old database to speed up reindexing. It checks the dates
in the database so that it can skip as much work as possible. :-) As I
said earlier, it tries not to download documents already in the database.
And if the server sends a document anyway and it turns out not to have
changed, htdig won't bother parsing it.

But if it HAS changed, it goes about its normal business: it re-parses
the document and adds the URLs it finds to the list to be checked. So new
URLs will be added to the database.
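
To make that concrete, here is a rough sketch of the logic in Python. It is
only an illustration of the behaviour described above, not htdig's actual
code. The names update_dig, fetch and parse_links are hypothetical, the
database is reduced to a dict mapping each URL to its modification time, and
seeding the queue with every URL from the old database is an assumption of
this sketch.

```python
from collections import deque


def update_dig(start_urls, old_db, fetch, parse_links):
    """Toy model of an update dig (hypothetical, not htdig's real code).

    old_db:      dict mapping url -> modification time from the last dig
    fetch:       callable(url, since) -> (modified, body); body is None
                 when the document has not changed since `since`
    parse_links: callable(body) -> list of URLs found in the document
    """
    new_db = {}
    reparsed = []  # URLs whose documents were actually parsed this run
    # Assumption: every URL we already know about gets rechecked.
    queue = deque(list(start_urls) + list(old_db))
    while queue:
        url = queue.popleft()
        if url in new_db:
            continue  # already handled during this dig
        since = old_db.get(url)  # None for a URL we have never seen
        modified, body = fetch(url, since)
        new_db[url] = modified
        if body is None:
            continue  # unchanged: skip the parsing work entirely
        reparsed.append(url)
        # Changed or brand-new document: its URLs join the list to check,
        # so new URLs get followed even though the old db never had them.
        queue.extend(parse_links(body))
    return new_db, reparsed
```

The thing to notice is that a URL missing from old_db has no stored date, so
it is always fetched and parsed like in a normal dig.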

> Yes, it can find new URLs, but will it follow those URLs and add
> the new stuff in the db?

Yup. The point of "update" digs isn't only to ensure the docs in the db
are up to date. The point is to speed up the indexing. If you already have
the information, why bother to collect it again? :-)

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

This archive was generated by hypermail 2.0b3 on Thu Mar 04 1999 - 09:09:18 PST