[htdig3-dev] feedback on ParseTree


Subject: [htdig3-dev] feedback on ParseTree
From: Quim Sanmarti (qss@gtd.es)
Date: Wed Aug 23 2000 - 03:54:55 PDT


I've been playing a bit with the ParseTree framework (the version I found
in the 082000 snapshot).

1. I was particularly interested in how the boolean queries are parsed. The
first thing I tried was:

a or b and c and d

This parsed without errors, yielding as LogicalWords for boolean parsing:

: (a or b and c and d)

Nice, but this gives no clue about the structure of the
resulting tree. Consequently, I slightly modified AndParseTree and
OrParseTree to insert parens around the string returned by GetLogicalWords.
Retrying parsetest, I got the surprising result below:

: ((((a or b) and c) and d))

Hmm. Does this imply that the parser generates *binary* children when
parsing boolean queries? Trying with only 'and' operators does much the same...
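
To illustrate what I think is happening, here is a minimal Python sketch. It is not the ht://Dig code; the names fold_left and show are made up for this mail. It folds the operators strictly left to right into binary nodes and prints parens around every node, which reproduces the nesting above (minus the outermost pair, which was already present before my change):

```python
def fold_left(tokens):
    """Fold 'w1 op w2 op w3 ...' strictly left to right into binary nodes."""
    tree = tokens[0]
    # Each step wraps everything parsed so far as the left child of a new node.
    for i in range(1, len(tokens) - 1, 2):
        op, word = tokens[i], tokens[i + 1]
        tree = (op, tree, word)
    return tree


def show(tree):
    """Render a tree with explicit parens around every operator node."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "(" + show(left) + " " + op + " " + show(right) + ")"


print(show(fold_left("a or b and c and d".split())))
# prints: (((a or b) and c) and d)
```

An n-ary AndParseTree would instead keep c and d as siblings of a single 'and' node rather than nesting them.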

2. Next, I tried

a and b or c
and
a or b and c

Yielding respectively

: (((a and b) or c))
: (((a or b) and c))

Hmm. No precedence seems to be defined between 'or' and 'and': in both
cases the operators are simply grouped left to right.
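
For comparison, this is roughly what I would have expected, sketched in Python. It assumes the conventional rule that 'and' binds tighter than 'or' (my assumption, not something the framework documents), and the names parse_or, parse_and and show are invented for the example:

```python
def parse_or(tokens, pos=0):
    """or_expr := and_expr ('or' and_expr)*"""
    left, pos = parse_and(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "or":
        right, pos = parse_and(tokens, pos + 1)
        left = ("or", left, right)
    return left, pos


def parse_and(tokens, pos):
    """and_expr := word ('and' word)*  (binds tighter than 'or')"""
    left, pos = tokens[pos], pos + 1
    while pos < len(tokens) and tokens[pos] == "and":
        left = ("and", left, tokens[pos + 1])
        pos += 2
    return left, pos


def show(tree):
    """Render a tree with explicit parens around every operator node."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "(" + show(left) + " " + op + " " + show(right) + ")"


for query in ("a and b or c", "a or b and c"):
    tree, _ = parse_or(query.split())
    print(show(tree))
# prints: ((a and b) or c)
# prints: (a or (b and c))
```

With one parsing function per precedence level, the second query groups 'b and c' first instead of being folded left to right.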

3. Next, I tried to force precedence by inserting parens in the query. So

a or (b and c)

yields

: (((a or b) and c)

It seems to be ignoring the parens.
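
A small Python sketch of the behaviour I was hoping for: operators are still folded left to right (as the parser appears to do now), but a parenthesized group is parsed as a unit. This is only an illustration with made-up names (tokenize, parse_expr, parse_primary, show), not a patch against ParseTree:

```python
import re


def tokenize(query):
    """Split a query into '(' , ')' and bare words."""
    return re.findall(r"[()]|[^\s()]+", query)


def parse_expr(tokens, pos=0):
    """expr := primary (('and' | 'or') primary)*, folded left to right."""
    left, pos = parse_primary(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("and", "or"):
        op = tokens[pos]
        right, pos = parse_primary(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


def parse_primary(tokens, pos):
    """primary := word | '(' expr ')'"""
    if pos >= len(tokens) or tokens[pos] == ")":
        raise ValueError("expected a word or '('")
    if tokens[pos] == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing ')'")
        return tree, pos + 1
    return tokens[pos], pos + 1


def show(tree):
    """Render a tree with explicit parens around every operator node."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "(" + show(left) + " " + op + " " + show(right) + ")"


print(show(parse_expr(tokenize("a or (b and c)"))[0]))
# prints: (a or (b and c))
print(show(parse_expr(tokenize("a or b and c"))[0]))
# prints: ((a or b) and c)
```

The two queries now produce different trees, which is the whole point of writing the parens.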

4. I tried to include a phrase:

a or "b c"

with the result

Parsing as a boolean query FAILED
[silence, infinite loop]

A single phrase, "a b c", does the same.
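
I have not traced where the loop occurs, so the following is only a guess at the kind of safeguard that would help, written in Python with invented names (tokenize and the "word"/"phrase" tags are not from the ht://Dig sources). The idea is that every branch of the scanning loop must advance the position, and an unterminated quote must raise an error instead of being retried:

```python
def tokenize(query):
    """Split a query into ("word", str) and ("phrase", [str, ...]) tokens."""
    tokens, pos = [], 0
    while pos < len(query):
        ch = query[pos]
        if ch.isspace():
            pos += 1                      # skip whitespace, always advances
        elif ch == '"':
            end = query.find('"', pos + 1)
            if end == -1:
                # Fail loudly rather than re-reading the same quote forever.
                raise ValueError("unterminated phrase")
            tokens.append(("phrase", query[pos + 1:end].split()))
            pos = end + 1                 # continue after the closing quote
        else:
            end = pos
            while end < len(query) and not query[end].isspace() and query[end] != '"':
                end += 1
            tokens.append(("word", query[pos:end]))
            pos = end                     # end > pos here, so this advances too
    return tokens


print(tokenize('a or "b c"'))
# prints: [('word', 'a'), ('word', 'or'), ('phrase', ['b', 'c'])]
```

A boolean parser could then treat a ("phrase", ...) token as a single operand, the same way it treats a plain word.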

// Joaquim Sanmarti
// GTD Ingenieria de sistemas y software industrial, S.A.
// c/Rosa Sensat 9-11
// 08005 Barcelona SPAIN
// Tel. +34 93 225 77 00
// Fax. +34 93 225 77 08
// mailto:qss@gtd.es
// http://www.gtd.es
