Re: [htdig] Bug in 3.1.4


Subject: Re: [htdig] Bug in 3.1.4
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Thu Jan 27 2000 - 09:29:53 PST


According to D.J.Adams@soton.ac.uk:
> I'm testing htdig 3.1.4 in preparation for upgrading from version 3.1.2.
>
> It looks good, but a bug I reported to this list months ago is still
> there. No one commented on it at the time, so I guess it got overlooked.
>
> A match-all search for "the problem", for example, works as expected:
> "the" is a bad word, and pages containing "problem" are found. However,
> a search for "The problem" or "THE problem" finds nothing.
>
> I can't believe I'm the only one seeing this.

Whoops! I guess most users never bother to type capitals in search
engines. The problem is that the "bad_words" test requires words in
lowercase, and htsearch wasn't lowercasing the word before calling that
function (only before adding it to its internal list of database search
words). So htsearch didn't reject "The", even though htdig had rejected
"the" when building the database, and the match-all search then failed
because "the" isn't in the database. Sorry I missed your earlier report.
Here's the fix:

--- htsearch/htsearch.cc.ucbug Thu Dec 9 18:28:49 1999
+++ htsearch/htsearch.cc Thu Jan 27 11:15:33 2000
@@ -408,6 +408,7 @@ setupWords(char *allWords, List &searchW
                 }
 
                 pos--;
+                word.lowercase();
                 if (boolean && mystrcasecmp(word.get(), "and") == 0)
                 {
                     tempWords.Add(new WeightWord("&", -1.0));
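
To illustrate the ordering problem outside of htsearch, here is a minimal,
self-contained sketch. It is not the ht://Dig code: kBadWords, isBadWord,
toLower and keptWords are hypothetical names, and the bad-word list is a
made-up stand-in. It only assumes what is described above, namely that the
bad-word test is case-sensitive against lowercase entries and that words
are lowercased before they go into the search word list.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the bad_words list; entries are lowercase.
static const std::set<std::string> kBadWords = {"the", "a", "of"};

// Case-sensitive lookup: only matches words that are already lowercase.
bool isBadWord(const std::string &word) {
    return kBadWords.count(word) > 0;
}

// Return a lowercase copy of the word (ASCII only, for illustration).
std::string toLower(std::string word) {
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return word;
}

// Split the query on whitespace and return the words that would be searched.
// lowercaseFirst == false models the old order (test first, lowercase later);
// lowercaseFirst == true models the patched order (lowercase, then test).
std::vector<std::string> keptWords(const std::string &query, bool lowercaseFirst) {
    std::vector<std::string> kept;
    std::istringstream in(query);
    std::string word;
    while (in >> word) {
        if (lowercaseFirst)
            word = toLower(word);
        if (!isBadWord(word))
            kept.push_back(toLower(word));  // search words are stored lowercase
    }
    return kept;
}
```

With the old order, keptWords("The problem", false) keeps both "the" and
"problem", so a match-all search also demands "the", which was never
indexed. With the patched order, keptWords("The problem", true) keeps only
"problem", the same as for the all-lowercase query.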

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



