Re: [htdig] Quick question


Subject: Re: [htdig] Quick question
From: Jim Cole (greyleaf@yggdrasill.net)
Date: Sun Mar 05 2000 - 12:52:43 PST


As far as I know, what you are doing should be fine. I even ran a quick
test using the same settings as you, and no extra sites were picked up.
It really sounds like your limit_urls_to setting is not being recognized
for some reason. Did you paste what you show below directly into the
message? If not, are you sure there isn't a typo somewhere in your config
file? Something commented out? More than one occurrence of one of those
attributes? Have you tried recreating the databases from scratch?
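
If it helps, here is a rough Python sketch of the "commented out" and
"more than once" checks. It is only an illustration and not part of
ht://Dig: the helper names find_attributes and report_problems are made
up, and it assumes the simple "name: value" layout with "#" comments
shown in this thread.

```python
import re

# Hypothetical helpers for illustration; they are not part of ht://Dig.
# Matches "name:" at the start of a line, optionally behind "#" comment marks.
ATTR_LINE = re.compile(r"^\s*(#*)\s*([A-Za-z_][A-Za-z0-9_]*)\s*:")


def find_attributes(config_text, names=("start_url", "limit_urls_to")):
    """Map each attribute name to a list of (line_number, is_commented)."""
    hits = {name: [] for name in names}
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = ATTR_LINE.match(line)
        if match and match.group(2) in hits:
            hits[match.group(2)].append((lineno, bool(match.group(1))))
    return hits


def report_problems(config_text, names=("start_url", "limit_urls_to")):
    """Return messages for attributes that are missing or defined twice."""
    problems = []
    for name, found in find_attributes(config_text, names).items():
        # Only lines that are not commented out count as real definitions.
        active = [lineno for lineno, commented in found if not commented]
        if not active:
            problems.append(f"{name}: no active definition")
        elif len(active) > 1:
            problems.append(f"{name}: defined more than once (lines {active})")
    return problems
```

Passing the text of your config file to report_problems should give an
empty list when each of the two attributes has exactly one active line.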

Jim

Glenn J. Rowe's bits of Sun, 5 Mar 2000 translated to:

>No - It looks like this...
>
>start_url: `${common_dir}/sites.txt`
>limit_urls_to: ${start_url}
>
>Am I doing it right? Thanks for trying to help me.
>
>Glenn
>
>
>
>
>Jim Cole wrote:
>
>> In the config file, are you setting the limit_urls_to attribute to match
>> the start_url attribute? Something like...
>>
>> start_url: http://www.somesite1.com/stuff/ \
>> http://www.somesite2.com/otherstuff/
>>
>> limit_urls_to: http://www.somesite1.com/stuff/ \
>> http://www.somesite2.com/otherstuff/
>>
>> This should cause htdig to only index pages that include either
>> http://www.somesite1.com/stuff/ or http://www.somesite2.com/otherstuff/
>> in their full URL.
>>
>> Jim
>>
>> Glenn J. Rowe's bits of Sun, 5 Mar 2000 translated to:
>>
>> >Pardon me. I just started using htdig and just now joined this mailing
>> >list. I have a question which I am sure someone will be able to answer.
>> >
>> >I have specified a rather small list of sites that should be indexed.
>> >htdig does only index those sites; however, when indexing it follows
>> >links to sites that aren't in the list. This poses a problem because a
>> >few sites have a large number of external links on them and htdig
>> >follows every one of those links. It doesn't index them, but it follows
>> >them, which makes the indexing process take FOREVER. Is there a way to
>> >stop that?
>> >
>> >Glenn Rowe
>> >OttawaComputer.Com
>> >
>> >
>> >------------------------------------
>> >To unsubscribe from the htdig mailing list, send a message to
>> >htdig-unsubscribe@htdig.org
>> >You will receive a message to confirm this.
>> >
>> >
>> >

This archive was generated by hypermail 2b28 : Sun Mar 05 2000 - 12:52:12 PST