Re: [htdig] Quick question


Subject: Re: [htdig] Quick question
From: Glenn J. Rowe (glenn.rowe@OttawaComputer.Com)
Date: Sun Mar 05 2000 - 12:18:34 PST


It does (I think) index only the sites I want, but during the indexing process
it prints lines like...

New server: www.cnn.com, 80
+
New server: desertnews.com, 80
+
New server: www.latimes.com, 80

I haven't told it to index any of those pages. It shows hundreds of pages I
didn't specify. For this reason it takes forever. I want to know if there
is a way to stop it.

Glenn
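
A minimal sketch of the filtering Jim describes in the quoted reply below, assuming limit_urls_to acts as a plain substring test against each link's full URL. This is not ht://Dig's actual code: the function names (is_within_limits, filter_links) and the sample page paths are hypothetical and only illustrate the idea. The two limit patterns come from Jim's example, and the off-site hosts come from the log output above.

```python
# Illustrative model only: assumes a link is followed when any
# limit_urls_to pattern appears as a substring of its full URL.

def is_within_limits(url, limit_urls_to):
    """Return True if any limit pattern occurs as a substring of url."""
    return any(pattern in url for pattern in limit_urls_to)


def filter_links(links, limit_urls_to):
    """Keep only the links a crawler restricted this way would follow."""
    return [url for url in links if is_within_limits(url, limit_urls_to)]


# Patterns taken from the example config in the quoted reply.
LIMIT_URLS_TO = [
    "http://www.somesite1.com/stuff/",
    "http://www.somesite2.com/otherstuff/",
]

# Hypothetical links found on an indexed page; the external hosts are
# the ones shown in the "New server" lines.
DISCOVERED_LINKS = [
    "http://www.somesite1.com/stuff/page.html",
    "http://www.cnn.com/",
    "http://www.somesite2.com/otherstuff/index.html",
    "http://www.latimes.com/",
]

if __name__ == "__main__":
    for url in filter_links(DISCOVERED_LINKS, LIMIT_URLS_TO):
        print(url)
```

Under that assumption the external links never match a pattern, so they would be dropped before any connection to those servers is opened.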

Jim Cole wrote:

> In the config file, are you setting the limit_urls_to attribute to match
> the start_url attribute? Something like...
>
> start_url: http://www.somesite1.com/stuff/ \
> http://www.somesite2.com/otherstuff/
>
> limit_urls_to: http://www.somesite1.com/stuff/ \
> http://www.somesite2.com/otherstuff/
>
> This should cause htdig to index only pages whose full URL includes either
> http://www.somesite1.com/stuff/ or http://www.somesite2.com/otherstuff/.
>
> Jim
>
> Glenn J. Rowe's bits of Sun, 5 Mar 2000 translated to:
>
> >Pardon me. I just started using htdig and just now joined this mailing
> >list. I have a question which I am sure someone will be able to answer.
> >
> >I have specified a rather small list of sites that should be indexed.
> >htdig does only index those sites; however, when indexing it follows
> >links to sites that aren't in the list. This poses a problem because a
> >few sites have a large number of external links on them and htdig
> >follows every one of those links. It doesn't index them but it follows
> >them, thus making the indexing process take FOREVER. Is there a way to
> >stop that?
> >
> >Glenn Rowe
> >OttawaComputer.Com
> >
> >
> >------------------------------------
> >To unsubscribe from the htdig mailing list, send a message to
> >htdig-unsubscribe@htdig.org
> >You will receive a message to confirm this.



This archive was generated by hypermail 2b28 : Sun Mar 05 2000 - 12:20:38 PST