Re: htdig: Prefix Matching?


J. op den Brouw (MSQL_User@st.hhs.nl)
Wed, 09 Dec 1998 10:14:27 +0100


plucas@frost.com wrote:
>
> I have been reading the recent messages about prefix searching and it looks
> like it could be very useful, so I tried it on our site but I have been
> unable to make it work. I am using htdig 3.1.0b2 on a Solaris 2.6 Sparc
> box.
>
> After reading through the mail archives I have added and/or changed the
> following lines in our configuration file:
>
> search_algorithm: exact:1 prefix:1 endings:0.8
> prefix_match_character: "*"
> max_prefix_matches: 100
> minimum_prefix_length: 3
>
> I tried running 'htfuzzy prefix' in case that was needed to prepare a
> special prefix-endings database but it returned "htfuzzy: 'prefix' is not a
> supported algorithm" so I guess I don't need to do that.

You don't need to run htfuzzy for prefix matching. Unlike 'htfuzzy endings',
it doesn't build a database. Just set search_algorithm: to prefix:1 and it
should work.

Note that the '*' must not be listed in valid_punctuation: !!!
That is what caused the trouble at my place.
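
As a rough sketch, the relevant part of the configuration file could look
like the fragment below. The search_algorithm and prefix settings are copied
from your message. The valid_punctuation value is only an illustrative
assumption, not a default or a recommendation; the one thing that matters
is that it contains no '*'.

```
# Prefix settings as quoted in the original message
search_algorithm:        exact:1 prefix:1 endings:0.8
prefix_match_character:  *
max_prefix_matches:      100
minimum_prefix_length:   3

# Hypothetical value for illustration: whatever characters you list here,
# leave out the character used as prefix_match_character ('*').
valid_punctuation:       .-_/
```

With something like this in place, a search for doc* should be handled as a
prefix search, without running htfuzzy first.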

> When I search for document I get 838 matches for "document or documented or
> documenting or documenter or documents or documenters".
>
> If I search for doc I get 17 matches for "doc".
>
> If I search for doc* or docu* or documen* I get no matches at all.
>
> Am I missing something obvious?
>
> Many thanks,
>
> Paul Lucas
> Frost & Sullivan
>
> ----------------------------------------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig-request@sdsu.edu containing the single word "unsubscribe" in
> the body of the message.



This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:29:49 PST