htdig: Updated patch for htsearch.cc; now also fixes "dangling or"


Hans-Peter Nilsson (hans-peter.nilsson@axis.com)
Tue, 12 Jan 1999 03:37:23 +0100


This is an update to my recent patch for setupWords() in
htsearch.cc, because I identified some more goo.

It seems it is always wrong to remove words in "boolean" search,
as this will leave a dangling "or", "and" or "not" operator:
if you badword "cat", your "boolean" search for "cat or dog"
will just say "or dog" in $(WORDS).
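
To make the "cat or dog" case concrete, here is a small stand-alone sketch. It is not the ht://Dig code: isValid(), isOperator(), wordsBefore() and wordsAfter() are hypothetical names, the bad-word check is assumed to be a plain set lookup, and operators are matched in lower case only. It just models the two filtering rules side by side.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for badWords.IsValid(): a word counts as valid
// when it is not in the bad-word set. The real check may do more.
static bool isValid(const std::set<std::string> &bad, const std::string &w)
{
    return bad.count(w) == 0;
}

// Assumption: operators are compared exactly and in lower case only.
static bool isOperator(const std::string &w)
{
    return w == "or" || w == "and" || w == "not";
}

// Simplified model of the old rule: bad words are dropped, operators stay.
std::string wordsBefore(const std::string &query,
                        const std::set<std::string> &bad, bool boolean)
{
    std::istringstream in(query);
    std::string w, out;
    while (in >> w) {
        if (isValid(bad, w) || (boolean && isOperator(w)))
            out += w + ' ';
    }
    return out;
}

// Model of the patched rule: in a "boolean" search nothing is dropped;
// otherwise bad words are still filtered out.
std::string wordsAfter(const std::string &query,
                       const std::set<std::string> &bad, bool boolean)
{
    std::istringstream in(query);
    std::string w, out;
    while (in >> w) {
        if (boolean)
            out += w + ' ';
        else if (isValid(bad, w))
            out += w + ' ';
    }
    return out;
}
```

With "cat" as the only bad word, wordsBefore("cat or dog", bad, true) gives "or dog " (the dangling operator), while wordsAfter() with the same arguments keeps "cat or dog " intact. A non-boolean search still drops the bad word in both versions.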

This may still be incomplete; I would rather remove this
"filtering" of what to keep in $(WORDS) entirely; it only
half-heartedly removes badworded words and tries to skip the
"hidden" on-the-fly modifiers (those the user wrote inline in
the query such as "hidden:" and "exact:", see the
code and Mr Scherpbier's recent mail with the message-id
<369A37D0.C4C390AF@contigo.com>, not in the archive yet).

This is done for no good reason IMHO -- I think $(WORDS) should be
kept unmodified as the user wrote it; only for the *user* to
modify.

But that would be a change in function more than a fix for an
abnormal situation, so I will not make a patch for it until I
know if that's acceptable. (So? ;-)

This patch is a *replacement* for my recent patch (it was
easiest for me this way, as that one wasn't in CVS yet. :-)

By the way, is this address (htdig@sdsu.edu) really appropriate
for patches? <URL:http://dev.htdig.org/patches.html> says they
should go here ("the htdig mailing list"), but I think
htdig3-dev would be better. Thoughts?

Sun Jan 11 02:42:51 1999 Hans-Peter Nilsson <hp@axis.se>

        * htsearch/htsearch.cc (setupWords): Do not skip words
        if "boolean" search.

*** /tmp/htsearch.cc.orig Sat Dec 19 17:55:11 1998
--- ./htsearch.cc Tue Jan 12 02:13:18 1999
*************** setupWords(char *allWords, List &searchW
*** 417,427 ****
              i++;
              continue;
          }
!         if (badWords.IsValid(p))
              parsedWords << p << ' ';
!         if (boolean && ((mystrncasecmp(p, "or", 2) == 0) ||
!                         (mystrncasecmp(p, "and", 3) == 0) ||
!                         (mystrncasecmp(p, "not", 3) == 0)))
              parsedWords << p << ' ';
      }
  
--- 450,458 ----
              i++;
              continue;
          }
!         if (boolean)
              parsedWords << p << ' ';
!         else if (badWords.IsValid(p))
              parsedWords << p << ' ';
      }
  
brgds, H-P


