Re: [htdig] Newbie indexing problems


Subject: Re: [htdig] Newbie indexing problems
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Thu Oct 26 2000 - 10:12:50 PDT


According to Sara Rudd:
> I am having some trouble with configuring htdig it
> seems to be ignoring some of the servers that I have
> listed using limit_urls dont know why. But if I
> include the main server name, under limit_urls is does
> index them
>
> www.csv.warwick.ac.uk
>
> but of course I get a double count.

If by "double count" you mean that some or all URLs are repeated, then
you may need to use server_aliases, as Adam suggested. If you mean
something else, could you be more precise?

> Heres hopefully enough of the conf file to maybe
> give a clue as to whats going wrong?
>
>
> start_url: http://www.warwick.ac.uk/
>
> limit_urls_to: ${start_url} http://www.astro.warwick.ac.uk/ \
> http://www.bio.warwick.ac.uk/ http://www.dcs.warwick.ac.uk/ \

This may be part of your problem. You have a space character after the
backslash on the line above, which prevents the backslash from acting
as a continuation character. In general, the backslash changes the
meaning of the character immediately following it. If the newline
character doesn't immediately follow the backslash, the newline isn't
suppressed and the definition ends there. However, in the htdig.conf
file you included at the end of your message, I didn't see an extra
space, so I'm not sure which version to trust. Check your htdig.conf
file to be sure.
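
Trailing spaces are hard to spot by eye, so a quick script may help.
The following is only a rough sketch in Python, not anything that ships
with ht://Dig. The function name find_broken_continuations is made up
for illustration, and the check assumes the only thing to look for is a
backslash followed by spaces or tabs at the end of a line.

```python
import re

# A backslash followed only by spaces/tabs before the end of the line.
# Such a backslash no longer escapes the newline, so the definition ends here.
_BACKSLASH_THEN_WS = re.compile(r"\\[ \t]+$")


def find_broken_continuations(text):
    """Return 1-based numbers of lines where whitespace follows the backslash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _BACKSLASH_THEN_WS.search(line)
    ]


# Illustrative input: line 1 has a stray space after the backslash.
sample = (
    "limit_urls_to: http://www.warwick.ac.uk/ \\ \n"
    "http://www.bio.warwick.ac.uk/ \\\n"
    "http://www.dcs.warwick.ac.uk/\n"
)
print(find_broken_continuations(sample))  # [1]
```

To check a real file, read it with open("htdig.conf").read() (the path
is whatever yours happens to be) and pass the text to the function.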

> http://www.eng.warwick.ac.uk/ http://law.bio.warwick.ac.uk/ \
> http://www.maths.warwick.ac.uk/ http://www.phys.warwick.ac.uk/ \
> http://www.wbs.warwick.ac.uk/ http://www.hosp.warwick.ac.uk/ \
> http://www.conferences.warwick.ac.uk/ http://www.unitemps.warwick.ac.uk/ \
> #http://www.csv.warwick.ac.uk/ \
>
> exclude_urls: /cgi-bin/ .cgi /warnes/ /templates/ /server_stats \
> /logs/ /img/ \

Here's another potential problem, which also appears in your attached
htdig.conf file. You shouldn't have a backslash on the last line of a
definition, because you're not continuing the definition on the following
line. Note that the backslash will cause the following line to be
appended to the current one even if the following line begins with a "#"!
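
The same kind of check can be sketched for this second problem. Again,
this is a hypothetical helper in Python (find_dangling_backslashes is
an invented name), and it only assumes what is described above: a line
ending in a backslash pulls in the next line, so it flags any such line
whose next line is blank, starts with "#", or doesn't exist.

```python
def find_dangling_backslashes(text):
    """Return 1-based numbers of lines whose trailing backslash looks unintended.

    A line is flagged when it ends in a backslash and the line it would pull
    in is blank, starts with '#', or is missing (end of file).
    """
    lines = text.splitlines()
    flagged = []
    for index, line in enumerate(lines):
        if not line.endswith("\\"):
            continue
        # The line that the backslash would append to the current one.
        following = lines[index + 1].strip() if index + 1 < len(lines) else ""
        if following == "" or following.startswith("#"):
            flagged.append(index + 1)
    return flagged


# Illustrative input modelled on the quoted configuration.
sample = (
    "limit_urls_to: http://www.warwick.ac.uk/ \\\n"
    "http://www.unitemps.warwick.ac.uk/ \\\n"
    "#http://www.csv.warwick.ac.uk/ \\\n"
    "\n"
    "exclude_urls: /cgi-bin/ .cgi \\\n"
    "/logs/ /img/ \\\n"
)
print(find_dangling_backslashes(sample))  # [2, 3, 6]
```

Line 2 is flagged because it pulls in the "#" line, line 3 because it
is followed by a blank line, and line 6 because nothing follows it.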

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930




This archive was generated by hypermail 2b28 : Thu Oct 26 2000 - 10:18:52 PDT