Re: [htdig] Deleted, no excerpt (???)


Subject: Re: [htdig] Deleted, no excerpt (???)
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue Jan 18 2000 - 07:42:33 PST


According to Paul COURBIS:
> When I run htmerge, I get a lot of messages like:
> Deleted, no excerpt: xxx/http...
>
> What does it mean? Why does htmerge remove so many documents from the
> database? As far as I understand the English, it seems to mean that
> there are no keywords for these pages, even though there's a lot of
> text when I connect to them...

The most common causes of this are:
 - a noindex directive somewhere in the document
 - the document was disallowed by robots.txt
 - the server_max_docs limit was reached before this document could be parsed
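For the first two causes, a rough check can be done outside of htdig.
The Python sketch below is not part of ht://Dig. It only looks for the
standard robots META tag (not any other noindex mechanism), and it
assumes the robot name is "htdig"; adjust the agent argument if your
configuration uses a different name. The function names are made up
for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains "noindex"."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if (attr.get("name", "").lower() == "robots"
                and "noindex" in attr.get("content", "").lower()):
            self.noindex = True


def has_noindex(html_text):
    """True if the page carries a robots META tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return finder.noindex


def disallowed_by_robots(robots_txt, url, agent="htdig"):
    """True if robots.txt (given as text) disallows url for this agent.

    The default agent name "htdig" is an assumption, not read from any
    ht://Dig configuration file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)
```

Feed has_noindex() the HTML of one of the deleted pages, and
disallowed_by_robots() the text of the server's robots.txt plus the
page's URL. If both return False, the server_max_docs limit is the
next thing to look at.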

You'd need to correlate the htmerge -v output back to the htdig -v (or -vv)
output to see which of these conditions occurred.
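As a starting point for that correlation, here is a small Python
sketch. It assumes both outputs were saved to text files, that the
URL in each "Deleted, no excerpt:" line starts at the first "http"
(as in the message quoted above), and that the same URL appears
verbatim somewhere in the htdig output. It makes no other assumption
about the htdig log format; the function names are made up for
illustration.

```python
import re

# Matches the htmerge message quoted above and captures what follows it.
DELETED = re.compile(r"Deleted, no excerpt:\s*(\S+)")


def deleted_urls(merge_lines):
    """Return the URLs named in 'Deleted, no excerpt' lines."""
    urls = []
    for line in merge_lines:
        match = DELETED.search(line)
        if not match:
            continue
        rest = match.group(1)
        # Assumption: anything before the first "http" is a prefix
        # (the "xxx/" in the quoted message), not part of the URL.
        start = rest.find("http")
        urls.append(rest[start:] if start >= 0 else rest)
    return urls


def correlate(merge_lines, dig_lines):
    """Map each deleted URL to the htdig output lines that mention it."""
    dig_lines = [line.rstrip("\n") for line in dig_lines]
    return {
        url: [line for line in dig_lines if url in line]
        for url in deleted_urls(merge_lines)
    }
```

Typical use would be to open the two saved logs, pass the file
objects to correlate(), and print the matching htdig lines for each
URL. A URL with no matching lines at all was probably never fetched.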

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



