Re: htdig: Searching for adjacent words


Geoff Hutchison (Geoffrey.R.Hutchison@williams.edu)
Tue, 24 Nov 1998 11:08:59 -0500 (EST)


> Can you add support for the boolean "NEAR" and specify
> in the config file how many words NEAR should examine to
> spit out a correct result? The default could be something
> like 2 or 3, but a person could change it if they'd like.

Phrase searching may be both the most requested feature and the most
important one to add. But at the moment, I don't know of a way to add it
without really blowing up the size of the databases. And it would still
require some code changes.

The problem is this--right now we essentially don't store the location of
words. But if we want to implement phrase or proximity searching, we need
to store the location of *every* word in *every* document. Ouch. The word
database would be huge.
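
To make the size concern concrete, here is a minimal sketch of a positional
index with a NEAR check on top of it. This is an illustration in Python, not
ht://Dig code: the names build_positional_index, near, and max_distance are
hypothetical, and the in-memory dict stands in for the real word database.
The point is only that each occurrence of each word gets its own entry,
which is where the growth comes from.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions]} (hypothetical layout)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # One stored position per word occurrence: this is the extra cost.
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def near(index, w1, w2, max_distance=3):
    """Return sorted doc ids where w1 and w2 occur within max_distance words.

    max_distance plays the role of the configurable NEAR window from the
    request above; it is an assumed parameter, not an existing attribute.
    """
    p1 = index.get(w1, {})
    p2 = index.get(w2, {})
    hits = []
    for doc_id in p1.keys() & p2.keys():
        if any(abs(a - b) <= max_distance
               for a in p1[doc_id] for b in p2[doc_id]):
            hits.append(doc_id)
    return sorted(hits)


docs = {
    1: "the quick brown fox jumps over the lazy dog",
    2: "the fox slept while the dog barked all night long at the quick cat",
}
index = build_positional_index(docs)
```

With this layout, near(index, "quick", "fox") only matches documents where
the two words fall inside the window, and a phrase search is the same idea
with the positions required to be consecutive and in order.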

If someone can find me a paper or reference or bit of code that implements
phrase or proximity searching, I'll take a look.

If anyone wants to take this on as a project, I do have ideas on
implementation.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



