Re: [htdig] Indexing scope


Subject: Re: [htdig] Indexing scope
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Sun Apr 16 2000 - 18:36:48 PDT


At 11:14 AM -0700 4/16/00, Dave Lers wrote:
> > include: htdig.conf
> > limit_urls_to:
> > max_hop_count: 1
>
>Meaning *all* links are indexed and *no* links are indexed recursively ?

Right, this would mean that it should just go one hop from the URLs
in the database.

> > If you run htdig again with this config file, you'll add in all the
>
>htdig -c my_second_configfile ?

Yes.

> > links to external sites and go no further. The snag is that this will
> > only work when the base database consists *only* of local URLs. So
>
>Because those external links will expand the next time through?

Right, so you'd effectively be going two deep.

> > your best bet is to have each config file specify a separate
> > database and copy over the second database to ensure it's "fresh."
>
>htmerge -m my_second_configfile ?

No, more like:
cp database-one/* database-two/
htdig -c my_second_configfile
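
As a rough sketch of that sequence, assuming the first config keeps its
database in database-one/ and my_second_configfile points at database-two/
(for instance through its own database_dir setting), a small wrapper could
look like the following. The function name, its arguments, and the HTDIG
override variable are illustrative and not part of ht://Dig.

```shell
#!/bin/sh
# Sketch only: refresh the second database from the first, then run the
# one-hop dig against it. HTDIG is a hypothetical override for choosing
# a specific htdig binary; it defaults to whatever "htdig" is on PATH.
HTDIG=${HTDIG:-htdig}

refresh_and_dig() {
    src_dir=$1      # database built by the first (local-only) config
    dst_dir=$2      # database the second config writes to
    second_conf=$3  # e.g. my_second_configfile

    mkdir -p "$dst_dir" || return 1
    # Start from a fresh copy so the base set holds only local URLs.
    cp "$src_dir"/* "$dst_dir"/ || return 1
    # -c selects the config file, as in the thread.
    "$HTDIG" -c "$second_conf"
}

# Example: refresh_and_dig database-one database-two my_second_configfile
```

Copying before every run is what keeps the second pass at one hop: without
it, the external pages picked up last time become starting points themselves.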

>I just realized most/all my external URL's are in a flatfile db, can I do
>something like:
>
>limit_urls_to: ${start_url} `/path/to/my.db`
>(assuming they are all links to specific pages, no http://foo/ URL's)?.

Oh sure. As long as there's whitespace between each URL, that'll work
just fine.
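
One thing worth checking before pointing limit_urls_to at the file: as far
as I understand the ht://Dig documentation, limit_urls_to entries are matched
as patterns against each URL, so a bare site URL such as http://foo/ would
let the whole site in. A minimal sketch for spotting such entries, assuming
the flat file holds nothing but whitespace-separated URLs (the function name
is made up for illustration):

```shell
#!/bin/sh
# Sketch only: print entries from a whitespace-separated URL list that
# name a whole site (http://foo/ or http://foo) rather than a specific page.
list_whole_site_urls() {
    # Put every URL on its own line, then keep those with no path
    # beyond an optional trailing slash after the host name.
    tr -s '[:space:]' '\n' < "$1" | grep -E '^https?://[^/]+/?$'
}

# Example: list_whole_site_urls /path/to/my.db
```

If this prints nothing, every entry points at a specific page and the
backquoted include should restrict the dig the way you expect.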

>Dave (who wishes he could figure these things out as fast as you responded :).

It helps knowing some of the code and answering a few of these before. ;-)

Cheers,

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


This archive was generated by hypermail 2b28 : Sun Apr 16 2000 - 16:45:04 PDT