Re: [htdig] update digging


Joseph Cheek (joseph@cheek.com)
Tue, 02 Mar 1999 20:57:58 -0800


is there any easy way to combine update digging with an alternate work-file
database [-a option]? i have a db that i don't want to take offline each
time i update, so i use the -a option. unfortunately, that seems to require
re-fetching every url, since the dig then uses the .work filenames. can i copy
the original files to their .work counterparts before re-digging, so that the
update dig still works with -a?
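
to make the idea concrete, here is a minimal sketch of the copy step i have in mind. it is only an illustration: the function name, the directory argument and the assumption that every database file has a counterpart named "<file>.work" are mine, and whether htdig will actually reuse files seeded this way is exactly what i'm asking.

```python
import shutil
from pathlib import Path


def seed_work_files(db_dir, suffix=".work"):
    """Copy each existing database file to a '<name>.work' counterpart.

    Assumption (not confirmed by htdig): the -a option reads and writes
    files named like the originals with '.work' appended.
    """
    copied = []
    # sorted() materialises the listing first, so files created during
    # the loop are not visited again.
    for src in sorted(Path(db_dir).iterdir()):
        # Skip sub-directories and files that already are work copies.
        if not src.is_file() or src.name.endswith(suffix):
            continue
        dst = src.with_name(src.name + suffix)
        shutil.copy2(src, dst)  # copy2 also preserves modification times
        copied.append(dst.name)
    return copied
```

the live files are only read, never modified, so searches keep working while the copies are made.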

thanks!

joe

Geoff Hutchison wrote:

> On Tue, 2 Mar 1999, Frank Guangxin Liu wrote:
>
> > Will htdig follow those new URLs though they are not in the original
> > db file?
>
> Yes! It uses the old database to speed up reindexing. It checks the dates
> in the database so that it can skip as much work as possible. :-) As I
> said earlier, it tries not to download documents already in the database.
> And if the server sends it and it hasn't changed, it won't bother parsing
> it.
>
> But if it HAS changed, it goes about its normal business. It will re-parse
> the document, add the URLs to the list to be checked. So new URLs will be
> added to the database.
>
> > Yes, it can find new URLs, but will it follow those URLs and add
> > the new stuff in the db?
>
> Yup. The point of "update" digs isn't to only ensure the docs in the db
> are up to date. The point is to speed up the indexing. If you already have
> the information, why bother to collect it again! :-)
>
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig@htdig.org containing the single word "unsubscribe" in
> the SUBJECT of the message.
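
for reference, this is how i read the update-dig behaviour Geoff describes above, as a rough Python sketch. it is not htdig's code: the dict-based "database", the fetch callback and its return convention are all hypothetical, and reusing the stored links of an unchanged document is my assumption about how new urls stay reachable.

```python
from collections import deque


def update_dig(start_urls, old_db, fetch):
    """Re-index, skipping work for documents that have not changed.

    old_db maps url -> {"modified": timestamp, "links": [urls]}.
    fetch(url, since) is a hypothetical callback: it returns None when
    the document has not changed since `since`, otherwise a tuple
    (modified, links) obtained by downloading and parsing it.
    """
    new_db = {}
    seen = set()
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        old = old_db.get(url)
        result = fetch(url, old["modified"] if old else None)
        if result is None:
            if old is None:
                continue  # nothing stored and nothing fetched: skip
            record = old  # unchanged: reuse stored record, no re-parse
        else:
            modified, links = result
            record = {"modified": modified, "links": links}
        new_db[url] = record
        # Assumption: stored links of unchanged documents are followed
        # too, so urls missing from the old database are still found.
        queue.extend(record["links"])
    return new_db
```

in this sketch a url that is not in the old database is always fetched and parsed, which matches the point that new urls end up in the database.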

--
 Joseph Cheek, Director of Cheek Consulting in Seattle, WA
  Cheek Consulting provides Linux and Internet technology solutions
   We support Linux in business, marketing and commerce on the Internet
    Email joseph@cheek.com, visit www.cheek.com, or call (206) 282-2892
     Search the Linux KB technical archives at http://linuxkb.cheek.com/




This archive was generated by hypermail 2.0b3 on Thu Mar 04 1999 - 09:09:18 PST