[htdig3-dev] Re: max_keywords patch (was Re: [htdig3-dev] Re: htdig-3.1.4 prerelease)


Subject: [htdig3-dev] Re: max_keywords patch (was Re: [htdig3-dev] Re: htdig-3.1.4 prerelease)
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Wed Dec 08 1999 - 10:11:12 PST


According to me:
> This undocumented and untested patch adds the max_keywords attribute to
> htdig, to index only as many keywords in meta tags, per document, as is
> specified in the attribute value. A value of 0 means no limit. This
> helps combat meta keyword spamming, but the first n spam keywords in a
> document still get indexed, so searches for these words will still pull
> up the spamming documents.

Hmm. I guess the patch overlooks keywords that come in via an external
parser. Those would need a patch of their own too, unless the limit were
enforced in Retriever::got_word() instead.
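
To illustrate the idea of enforcing the limit in one place, here is a rough
sketch, not the actual patch. KeywordLimiter and index_keywords() are
hypothetical names that do not exist in htdig; the sketch only assumes that
every parser, internal or external, reports meta keywords through a single
hook, and that a max_keywords value of 0 means no limit, as described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-document counter; NOT part of htdig's Retriever class.
class KeywordLimiter {
public:
    // max_keywords == 0 means "no limit", as in the patch description.
    explicit KeywordLimiter(int max_keywords) : max_keywords_(max_keywords) {
        assert(max_keywords >= 0);
    }

    // Reset the counter at the start of each document.
    void start_document() { seen_ = 0; }

    // Returns true if the next meta keyword should still be indexed.
    bool accept() {
        if (max_keywords_ != 0 && seen_ >= max_keywords_) {
            return false;  // limit reached for this document
        }
        ++seen_;
        return true;
    }

private:
    int max_keywords_;
    int seen_ = 0;
};

// Hypothetical single entry point: any parser hands its meta keywords to
// this function, so the limit applies no matter where the words came from.
std::vector<std::string> index_keywords(KeywordLimiter& limiter,
                                        const std::vector<std::string>& words) {
    limiter.start_document();
    std::vector<std::string> indexed;
    for (const auto& word : words) {
        if (limiter.accept()) {
            indexed.push_back(word);
        }
    }
    return indexed;
}
```

With a limit of 2, only the first two keywords of each document get through,
which is exactly the remaining weakness mentioned above: the first n spam
keywords are still indexed.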

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



