Re: [htdig] indexing multiple web sites


Subject: Re: [htdig] indexing multiple web sites
From: atta dubson (dubson@palicanon.org)
Date: Wed May 03 2000 - 09:17:16 PDT


On Wed, 3 May 2000, Geoff Hutchison wrote:

> As they would say in New England "yah can't get there from here."
>
> In 3.1.5 if you want one database, you'll have to accept that updates
> will occur on all URLs.

doesn't sound like fun when indexing 400+ sites, most of which are very
static, while some update weekly.

> However, you could have separate databases for each group you want to
> update and then use htmerge -m to merge them two-at-a-time until you
> have the one database for searching.

still doesn't sound like fun, but i could keep separate weekly and monthly
databases and merge them once a week.
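
to make the merge idea concrete, here is a rough sketch of how i picture it.
it only builds the htmerge command lines and does not run anything. the config
file names (monthly.conf, weekly.conf) are made up, and i am assuming that
-c names the config of the database that receives the merge and -m names the
config of the database being merged in, one per call:

```python
# Sketch only: builds htmerge command lines, it does not execute them.
# Assumptions: -c = config of the target database, -m = config of the
# database merged into it. The config file names are hypothetical.

def merge_plan(target_conf, other_confs):
    """Return one htmerge command per database folded into the target."""
    return [["htmerge", "-c", target_conf, "-m", conf] for conf in other_confs]


if __name__ == "__main__":
    # Two-at-a-time merging: each command merges one more database in.
    for cmd in merge_plan("monthly.conf", ["weekly.conf"]):
        print(" ".join(cmd))
```

with more than two groups the list just grows by one command per extra
database, which is the "two-at-a-time" part.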

> In 3.2 there are a few ways of doing this, by specifying the list of
> URLs through standard input, or by using -m and a file of URLs
> specifying the exact list of URLs to update/index. (The latter was
> broken until this last snapshot.)

now *that* sounds more like it. you mean -m on htdig, right? will the list
given to -m override start_url, and will limit_urls_to be restricted to that
list as well? what would be the proper way for me to fully update all the
files at foo.org in a database that holds both foo.org and bar.org pages,
leaving bar.org untouched?
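
for the url file itself, this is roughly what i would generate. it is only a
sketch of the filtering step: the function names and the output file name are
my own, and i am assuming the file is plain text with one url per line. how
htdig actually treats that file is exactly what i am asking about.

```python
from urllib.parse import urlparse


def urls_for_host(urls, host):
    """Keep only URLs whose hostname is `host` or a subdomain of it."""
    keep = []
    for url in urls:
        name = (urlparse(url).hostname or "").lower()
        if name == host or name.endswith("." + host):
            keep.append(url)
    return keep


def write_url_file(urls, path):
    """Write one URL per line (assumed format, not confirmed)."""
    with open(path, "w") as fh:
        for url in urls:
            fh.write(url + "\n")


if __name__ == "__main__":
    known = [
        "http://foo.org/index.html",
        "http://www.foo.org/docs/",
        "http://bar.org/index.html",
    ]
    # Hypothetical output file name; only the foo.org URLs end up in it.
    write_url_file(urls_for_host(known, "foo.org"), "foo-urls.txt")
```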

and when is 3.2 expected to be stable? :)

thanks...
atta

from the dhammapada:
 
  A good awakening have ever Gotama's disciples, whose recollection is
  always established, day and night on the Buddha. 296
 
http://www.palicanon.org/
