Re: [htdig] indexing multiple web sites


Subject: Re: [htdig] indexing multiple web sites
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Wed May 03 2000 - 09:28:51 PDT


On Wed, 3 May 2000, atta dubson wrote:

> now *that* sounds more like it. you mean -m on htdig, right? will this
> override start_url and set limit_urls_to to -m as well? what would be the

This sets start_url. It also sets max_hop_count to 0. It does not set
limit_urls_to.

> proper way for me to update in full all the files at foo.org on a database
> with foo.org and bar.org pages, leaving bar.org untouched?

I would probably do something like this:
htstat -u | grep foo.org >foo.urls
htdig -m foo.urls
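
One caveat with the plain grep: the pattern foo.org also matches URLs such as http://notfoo.org/ or http://bar.org/foo.org.html, so some bar.org pages could end up in the list. A stricter filter might look something like the sketch below. The function name filter_site is made up for illustration, and it assumes htstat -u prints one URL per line and that the site is served over http or https, with or without a leading "www.".

```shell
# filter_site is a hypothetical helper, not part of ht://Dig.
# It reads URLs on stdin (one per line) and prints only those whose
# host is exactly the given site, optionally prefixed with "www.".
filter_site() {
    # Escape the dots in the host name so they only match literal dots.
    site_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -E "^https?://(www\.)?${site_re}(:[0-9]+)?(/|\$)"
}

# Intended use (assumes htstat -u prints one URL per line):
#   htstat -u | filter_site foo.org >foo.urls
#   htdig -m foo.urls
```

If the pages live on other host names as well (say, a different subdomain), the pattern would need to be loosened accordingly.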

> and when is 3.2 expected to be stable? :)

I think I should make this a FAQ. The short answer is "real soon now." The
long answer is "the more help we get, the faster it goes." Right now the
snapshots should be fairly stable, but we may have to break the databases
one last time before we release the final version. (This is not as bad as
it sounds. With the new htdump/htload utilities, you can dump the
databases, delete them, then load them into a new version. :-)

The biggest snag is the code is not feature complete yet. Since several of
the major contributors are very busy right now, it may take a while if no
one steps forward to help with things like duplicate detection.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
