Subject: Re: [htdig] Rundig
From: Gilles Detillieux (firstname.lastname@example.org)
Date: Thu Nov 25 1999 - 11:29:49 PST
According to Jason Carvalho:
> When I run 'rundig', it crawls my web site then when it comes to the
> merge stage, it outputs:
> Deleted, no excerpt :2156 http://ww...etc. for loads of my pages.
> All in all, it found about 9500 pages but only merged 7500, giving the
> above message for the rest.
> What does this mean?
The two most common causes are: a) the document contained no text, or
the text was excluded by noindex meta tags, or b) the document was
disallowed by the server's robots.txt file. If you ran htdig or rundig
with -vvv, then htdig's output should give you more of an indication of
which situation arose with these pages.
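
As a rough aid for sorting the affected pages, here is a minimal Python sketch, not part of ht://Dig itself. It assumes the rundig output was saved to a file and that each relevant line looks like the one quoted above ("Deleted, no excerpt :<id> <url>"); the exact log format, the helper names, and the "htdig" robot name are all assumptions. It pulls out the deleted URLs and reports which of them a given robots.txt would disallow. Whatever is left over is more likely a page with no indexable text or a noindex meta tag.

```python
import re
from urllib.robotparser import RobotFileParser

# Pattern modelled on the single line quoted in this thread;
# the real output format may differ (assumption).
DELETED_RE = re.compile(r"Deleted, no excerpt\s*:\s*(\d+)\s+(\S+)")


def deleted_urls(log_lines):
    """Return (doc_id, url) pairs for every 'Deleted, no excerpt' line."""
    found = []
    for line in log_lines:
        match = DELETED_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found


def blocked_by_robots(urls, robots_txt, agent="htdig"):
    """Return the URLs that robots_txt disallows for the given agent.

    robots_txt is the text of the server's robots.txt file.
    The agent name "htdig" is an assumption; use whatever name
    your configuration sends.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]
```

To use it, read the saved log with open(...).readlines(), pass the lines to deleted_urls(), and hand the resulting URLs plus the contents of the site's robots.txt to blocked_by_robots().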
-- 
Gilles R. Detillieux              E-mail: <email@example.com>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930