Re: [htdig] Quick question


Subject: Re: [htdig] Quick question
From: Glenn J. Rowe (glenn.rowe@OttawaComputer.Com)
Date: Sun Mar 05 2000 - 13:21:16 PST


And the prize goes to Geoff Hutchison!

I had a URL in the sites.txt file that had a space between the http:// and the
www.
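
A typo like this can be caught before running the indexer. The following is a minimal sketch, assuming the entries in sites.txt are separated by whitespace, so a stray space splits one URL into two entries. The function name suspicious_patterns and the example.com / example.org hosts are hypothetical and are not part of ht://Dig.

```python
from urllib.parse import urlsplit


def suspicious_patterns(text):
    """Return whitespace-separated entries that are not complete http(s) URLs."""
    bad = []
    for token in text.split():
        parts = urlsplit(token)
        host = parts.hostname or ""
        # A usable entry needs an http(s) scheme and a host with a dot
        # between two labels ("http://" and "http://www." both fail).
        if parts.scheme not in ("http", "https") or "." not in host.strip("."):
            bad.append(token)
    return bad


# Hypothetical file contents with a stray space after "http://".
sample = "http://www.example.com/ http:// www.example.org/"
print(suspicious_patterns(sample))
```

Any entry the check reports is worth a look, since it would be far too broad when reused as a limit_urls_to pattern.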

Thank you, everyone, for all your help. It is indexing now, and so far so good!

Glenn

Geoff Hutchison wrote:

> At 3:18 PM -0500 3/5/00, Glenn J. Rowe wrote:
> >I haven't told it to index any of those pages. It shows hundreds of pages I
> >didn't specify. For this reason it takes forever. I want to know if there
> >is a way to stop it.
>
> At 3:09 PM -0500 3/5/00, Glenn J. Rowe wrote:
> >start_url: `${common_dir}/sites.txt`
> >limit_urls_to: ${start_url}
>
> I think you need to look very carefully through your sites.txt file.
> It's entirely possible you have some small typo in there. Remember
> that while start_url simply ignores an invalid URL, limit_urls_to
> treats every entry as a pattern and accepts any URL that matches at
> least one of them.
>
> For example, let's say the file contains a line:
>
> http://www.
>
> Well, this will be ignored pretty quickly by start_url, but as a
> limit_urls_to pattern it will match almost every server on the web.
>
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig-unsubscribe@htdig.org
> You will receive a message to confirm this.
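
The effect Geoff describes can be reproduced in a few lines. This is only a rough model, assuming limit_urls_to accepts a URL when any one of its patterns occurs as a substring of that URL. It is not ht://Dig's actual code, and within_limits and the example hosts are hypothetical.

```python
def within_limits(url, patterns):
    """True if any pattern occurs as a substring of url (an OR over patterns)."""
    return any(p in url for p in patterns)


# Intended list: one complete URL.
good = ["http://www.example.com/"]
# Same entry with a stray space: it splits into "http://" and "www.example.com/".
broken = "http:// www.example.com/".split()

print(within_limits("http://www.example.com/docs/", good))   # site we asked for
print(within_limits("http://other.example.net/", good))      # unrelated site
print(within_limits("http://other.example.net/", broken))    # unrelated site, broken list
```

Under that assumption, the bare "http://" entry matches every HTTP URL the indexer finds, which is consistent with hundreds of unrequested pages being indexed.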

This archive was generated by hypermail 2b28 : Sun Mar 05 2000 - 13:23:21 PST