Re: [htdig] Newbie indexing problems


Subject: Re: [htdig] Newbie indexing problems
From: Sara Rudd (cuddg@csv.warwick.ac.uk)
Date: Fri Oct 27 2000 - 04:12:11 PDT


Adam and Gilles,

Thank you both for taking the time and trouble to respond
to my query. I have put in your suggestions and it
seems to have done the trick! (I think server_aliases
was the key thing.)

Many, many thanks

A happy bunny!
Sara

On Thu, Oct 26, 2000 at 12:12:50PM -0500, Gilles Detillieux wrote:
> According to Sara Rudd:
> > I am having some trouble configuring htdig. It
> > seems to be ignoring some of the servers that I have
> > listed using limit_urls_to, and I don't know why. But if I
> > include the main server name under limit_urls_to, it does
> > index them:
> >
> > www.csv.warwick.ac.uk
> >
> > but of course I get a double count.
>
> If by double count, you mean some or all URLs are repeated, then you may
> need to use server_aliases as Adam suggested. If you mean something else,
> could you be more precise?
>
> > Here's hopefully enough of the conf file to
> > give a clue as to what's going wrong:
> >
> >
> > start_url: http://www.warwick.ac.uk/
> >
> > limit_urls_to: ${start_url} http://www.astro.warwick.ac.uk/ \
> > http://www.bio.warwick.ac.uk/ http://www.dcs.warwick.ac.uk/ \
>
> This may be part of your problem. You have a space character after the
> backslash on the line above, which would prevent the backslash from being
> used as a continuation character. In general, the backslash changes
> the meaning of the character immediately following it, so if the newline
> character doesn't immediately follow it, the newline isn't suppressed, so
> the definition ends there. However, in the htdig.conf file you included
> at the end of your message, I didn't see an extra space, so I'm not sure
> which version to trust. Check your htdig.conf file to be sure.
>
> > http://www.eng.warwick.ac.uk/ http://law.bio.warwick.ac.uk/ \
> > http://www.maths.warwick.ac.uk/ http://www.phys.warwick.ac.uk/ \
> > http://www.wbs.warwick.ac.uk/ http://www.hosp.warwick.ac.uk/ \
> > http://www.conferences.warwick.ac.uk/ http://www.unitemps.warwick.ac.uk/ \
> > #http://www.csv.warwick.ac.uk/ \
> >
> > exclude_urls: /cgi-bin/ .cgi /warnes/ /templates/ /server_stats \
> > /logs/ /img/ \
>
> Here's another potential problem, which also appears in your attached
> htdig.conf file. You shouldn't have a backslash on the last line of a
> definition, because you're not continuing the definition on the following
> line. Note that the backslash will cause the following line to be
> appended to the current one even if the following line begins with a "#"!
>
> --
> Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
> Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
> Dept. Physiology, U. of Manitoba Phone: (204)789-3766
> Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
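
The two continuation pitfalls Gilles describes in the quoted reply (whitespace after a trailing backslash, and a backslash on the last line of a definition) are easy to miss by eye. Below is a minimal sketch of a checker, assuming the config is plain text with one attribute per logical line. The function name check_continuations and the example.org URLs are made up for illustration; this is a heuristic and not ht://Dig's own parser.

```python
import re

# Hypothetical helper, not part of ht://Dig: flags two backslash
# continuation pitfalls in htdig.conf-style text.
def check_continuations(text):
    problems = []
    lines = text.split("\n")
    for number, line in enumerate(lines, start=1):
        # Pitfall 1: whitespace after the backslash, so the newline is
        # not escaped and the definition ends on this line.
        if re.search(r"\\[ \t]+$", line):
            problems.append((number, "whitespace after trailing backslash"))
            continue
        # Pitfall 2: a working continuation whose next line is blank or
        # starts with "#", which suggests a stray final backslash.
        if line.endswith("\\"):
            following = lines[number] if number < len(lines) else ""
            if following.strip() == "" or following.lstrip().startswith("#"):
                problems.append(
                    (number, "backslash continues onto a blank or comment line")
                )
    return problems


# Placeholder config text (assumed URLs), mirroring the layout in the thread.
sample = (
    "limit_urls_to: http://www.example.org/ \\ \n"  # space after the backslash
    "    http://docs.example.org/ \\\n"             # next line is a '#' line
    "#http://old.example.org/ \\\n"                 # next line is blank
    "\n"
    "exclude_urls: /cgi-bin/ .cgi\n"
)

for number, message in check_continuations(sample):
    print(f"line {number}: {message}")
```

Any line it reports is worth inspecting by hand; the fix in each case is to delete the trailing space or the final backslash.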

-- 
Sara Rudd, Webmaster, IT Services 
For useful information on publishing web pages at warwick go to
http://www.warwick.ac.uk/web
Subscribe to my email list for useful web updates              web-contacts
Email    majordomo@warwick.ac.uk  with message body  subscribe web-contacts




This archive was generated by hypermail 2b28 : Fri Oct 27 2000 - 04:18:33 PDT