htdig: Patch for (better) explaining the "not" search-operator, suggestions for fix.


Hans-Peter Nilsson (hans-peter.nilsson@axis.com)
Mon, 11 Jan 1999 00:54:28 +0100


Is it just me, or is the "boolean"-search operator "not" in
htsearch confusing?

I think it is mislabeled; it should be called "without", since
it is a binary operator with that effect. In htsearch you cannot
say "not dog" or "cat and not dog" or even "cat and (not dog)";
you *do* say "cat not dog". For users of AltaVista (among other
implementations of logical search expressions) this comes as an
illogical surprise.
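
To make the "without" reading concrete, here is a minimal sketch in
Python. It is not htsearch code; the index contents and the helper
name `without` are made up for illustration. It only shows that the
binary "not" behaves like set difference on the match sets.

```python
# Hypothetical inverted index: word -> set of matching document ids.
index = {
    "cat": {1, 2, 3},
    "dog": {2, 4},
}

def without(left, right):
    """Binary 'not' read as 'without': matches of left minus matches of right."""
    return left - right

# "cat not dog": documents that contain "cat" but do not contain "dog".
cat_not_dog = without(index["cat"], index["dog"])
print(sorted(cat_not_dog))
```

Note that a lone "not dog" has no left-hand set to subtract from,
which is why the binary form cannot express it.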

If it's ok, I would like to submit a patch that adds the word
"without" as a synonym operator for the current behavior of
"not", then later, hopefully, a patch to make "not" valid as a
unary operator as well as a binary one (much like minus in
arithmetic expressions).
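
A rough sketch of what "unary as well as binary" could look like, in
Python rather than htsearch's own code. Everything here is an
assumption for illustration: the sample index, the set of all
documents, the function names, and the choice to evaluate operators
left to right with no precedence. A leading "not" complements against
all documents; a "not" between two operands subtracts, as today.

```python
import re

# Hypothetical data, for illustration only.
INDEX = {"cat": {1, 2, 3}, "dog": {2, 4}, "nose": {3, 4}}
ALL_DOCS = {1, 2, 3, 4, 5}

def tokenize(query):
    # Split into parentheses and words; operators are plain words.
    return re.findall(r"\(|\)|[^\s()]+", query.lower())

def evaluate(query):
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def term():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of expression")
        if tok == "not":
            # Unary "not": complement against every known document.
            return ALL_DOCS - term()
        if tok == "(":
            result = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return result
        if tok in (")", "and", "or"):
            raise SyntaxError("unexpected %r" % tok)
        return set(INDEX.get(tok, set()))

    def expr():
        result = term()
        # Binary operators, applied left to right (assumed, no precedence).
        while peek() in ("and", "or", "not"):
            op = advance()
            right = term()
            if op == "and":
                result = result & right
            elif op == "or":
                result = result | right
            else:
                # Binary "not": the current 'without' behavior.
                result = result - right
        return result

    result = expr()
    if peek() is not None:
        raise SyntaxError("unexpected %r" % peek())
    return result
```

With this sketch, `evaluate("cat not dog")` and
`evaluate("cat and not dog")` give the same set, and
`evaluate("not dog")` is accepted instead of being a syntax error.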

If it's *not* ok, then at least the documentation and
syntax.html need to be fixed (see the patches below) to mention
"not" and how it works; there's only a spurious note in
RELEASE.html that '"and", "or" and "not" [are fixed]'.

I see some messages about this in the mailing list archives, but
IMHO (only) explaining to *each* user what they misunderstood is
not a good investment of your time.

For your amusement, here are patches for syntax.html and
hts_method.html to explain better (needed, as was requested in
<URL:http://www.htdig.org/mail/1998-10/0046.html>):

Mon Jan 11 00:42:51 1999 Hans-Peter Nilsson <hp@axis.se>

        * htdoc/hts_method.html: Add explanation of operator "not".

        * installdir/syntax.html: Added examples of correct logical
        expressions.

Index: installdir/syntax.html
===================================================================
RCS file: /opt/htdig/cvs/htdig3/installdir/syntax.html,v
retrieving revision 1.5
diff -p -c -r1.5 syntax.html
*** syntax.html 1998/10/12 02:10:51 1.5
--- syntax.html 1999/01/10 23:48:20
*************** Error in Boolean search for '$(LOGICAL_W
*** 5,11 ****
  <hr noshade size=4>
  Boolean expressions need to be 'correct' in order for the search
  system to use them.
! The expression you entered has errors in it.<br>
  <blockquote><b>
  $(SYNTAXERROR)
  </b></blockquote>
--- 5,14 ----
  <hr noshade size=4>
  Boolean expressions need to be 'correct' in order for the search
  system to use them.
! The expression you entered has errors in it.<p>
! Examples of correct expressions are: <b>cat and dog</b>, <b>cat
! not dog</b>, <b>cat or (dog not nose)</b>.<br>Note that
! the operator <b>not</b> has the meaning of 'without'.
  <blockquote><b>
  $(SYNTAXERROR)
  </b></blockquote>
Index: htdoc/hts_method.html
===================================================================
RCS file: /opt/htdig/cvs/htdig3/htdoc/hts_method.html,v
retrieving revision 1.2
diff -p -c -r1.2 hts_method.html
*** hts_method.html 1998/09/08 03:29:10 1.2
--- hts_method.html 1999/01/10 23:48:21
***************
*** 53,60 ****
      <p>
        The boolean expression parser is a simple recursive descent
        parser with an operand stack. It knows how to deal with
! "and", "or" and parenthesis. The result of the parser will be
! one set of matches.
      </p>
      <p>
        At this point, the matches are ranked. The rank of a match is
--- 53,63 ----
      <p>
        The boolean expression parser is a simple recursive descent
        parser with an operand stack. It knows how to deal with
! "not", "and", "or" and parenthesis. The result of the parser
! will be one set of matches.<br>
! Note that the operator "not" is used as the word 'without' and
! is binary: You can not write "cat and not dog" or just "not
! dog" but you can write "cat not dog".
      </p>
      <p>
        At this point, the matches are ranked. The rank of a match is

brgds, H-P


