Subject: RE: [htdig] puzzled by htdig
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Thu Oct 05 2000 - 15:31:54 PDT
On Thu, 5 Oct 2000, GYGAX,OTTO (HP-Corvallis,ex1) wrote:
> My limit_urls_to key is set as you have it below (default).
> My start_url is currently set to a list of urls such as http://
> http://
> http://
> pointer to http://
> that contains links to every single mailing archive page.
OK, but then ~arch won't fall within the limits as you've set them (since
it doesn't match any of the patterns in start_url). If you want to index all
documents on the server, you may want a more liberal limit_urls_to
directive, e.g.
limit_urls_to: http://
> Before I extended the start_url key attr., I only had http://
> http://
> server's index.html file, missing all other directories at the root. At one
OK, that was one of my points--it will follow the links it sees. So if you
index starting with http://server/ then it will follow links from
index.html. Unless you add those directories (as you did) to start_url, it
won't even know they're there.
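To make the interaction between start_url and limit_urls_to concrete, here is a rough Python sketch of the behaviour described above. It is not htdig's actual code. The link graph, the paths under http://server/ and the crawl() helper are all made up for illustration, and it assumes that each limit pattern is treated as a plain substring of the URL and that the start URLs themselves are checked against the limits.

```python
from collections import deque

# Hypothetical link graph: page URL -> URLs that page links to.
# Note that nothing reachable from http://server/ links to ~arch.
LINKS = {
    "http://server/": ["http://server/docs/"],
    "http://server/docs/": [],
    "http://server/pointer.html": ["http://server/~arch/"],
    "http://server/~arch/": ["http://server/~arch/msg1.html"],
    "http://server/~arch/msg1.html": [],
}


def crawl(start_urls, limit_patterns):
    """Follow links from start_urls, keeping only URLs that contain
    at least one of limit_patterns (assumed substring match)."""

    def allowed(url):
        return any(pattern in url for pattern in limit_patterns)

    seen = set()
    queue = deque(url for url in start_urls if allowed(url))
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        # Only links that are actually seen on a page can be followed.
        for link in LINKS.get(url, []):
            if allowed(link) and link not in seen:
                queue.append(link)
    return seen


# Case 1: a single start URL, with the limit defaulting to start_url.
root_only = crawl(["http://server/"], ["http://server/"])

# Case 2: extended start_url list, with the limit still equal to that list.
starts = ["http://server/docs/", "http://server/pointer.html"]
narrow = crawl(starts, starts)

# Case 3: same start_url list, but a more liberal limit.
liberal = crawl(starts, ["http://server/"])
```

In this toy graph, case 1 never reaches ~arch because nothing under index.html links to it. Case 2 sees the link to ~arch on the pointer page but rejects it because it matches none of the start_url patterns. Case 3 picks up the archive pages once the limit covers the whole server.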
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/