Re: [htdig] Indexing scope


Subject: Re: [htdig] Indexing scope
From: Dave Lers (dave@dalrun.com)
Date: Sun Apr 16 2000 - 11:14:41 PDT


On Sun, 16 Apr 2000, Geoff Hutchison wrote:
> At 10:08 AM -0700 4/16/00, Dave Lers wrote:
> >I'm indexing local pages fine but I would also like to index all external
> >links (not recursive, just the page linked to), is this possible?
>
> This isn't very easy to do at the moment. One way to do it is to have
> your regular config file that will index your local pages. Then you
> have an additional config that looks something like this:
>
> include: htdig.conf
> limit_urls_to:
> max_hop_count: 1

Meaning *all* links are indexed and *no* links are followed recursively?

> If you run htdig again with this config file, you'll add in all the

htdig -c my_second_configfile ?

> links to external sites and go no further. The snag is that this will
> only work when the base database consists *only* of local URLs. So

Because those external links will expand the next time through?

> your best bet is to have each config file specify a separate
> database and copy over the second database to ensure it's "fresh."

htmerge -m my_second_configfile ?
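
If I'm reading the suggestion right, the second config file might look
roughly like the sketch below. The file name and the database_base value
are my own placeholders, and I haven't tested this:

# my_second_configfile (hypothetical name)
# pull in everything from the regular config first
include: htdig.conf
# keep this run's database apart from the local-only one (path is a guess)
database_base: ${database_dir}/external
# empty value, as in the example above, so external URLs are not excluded
limit_urls_to:
# go one hop out from the start pages and no further
max_hop_count: 1

My understanding is that the local-only database would need to be copied
to that second location before each run, so the second pass always starts
from local URLs only.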

I just realized most or all of my external URLs are in a flat-file db. Can I
do something like:

limit_urls_to: ${start_url} `/path/to/my.db`

(assuming they are all links to specific pages, with no http://foo/ URLs)?

Dave (who wishes he could figure these things out as fast as you responded :).

This archive was generated by hypermail 2b28 : Sun Apr 16 2000 - 12:00:43 PDT