htdig: can't get prefix matching to work


Jerry Preeper (preeper@cts.com)
Tue, 22 Sep 1998 09:45:30 -0500


I have just updated htdig so that I can use prefix matching (and filesystem
digging), but I can't seem to get prefix matching to work. I am running
FreeBSD 2.2.6 and Apache, using htdig 3.0.0b1 (updated from 3.0.8b2).

I compiled and installed following the instructions on www.htdig.org
with no problems. Before indexing the site, I backed up the db directory
and then removed all the files in it. I have added the following to the
standard htdig.conf file included in the distribution:

search_algorithm: exact:1 prefix:0.5
match_method: and
matches_per_page: 10
max_prefix_matches: 500
minimum_prefix_length: 3

(I have not added the local_urls directive yet, so that I can deal with
each change separately.)

I have not included endings in the search algorithm because searches on
things like New York Rangers match too many other words, such as news and
newness. I have also tried bumping the prefix weight up to 1, with no
success.

In my search.html page I have the following code for the search:

<form method="post" action="/cgi-bin/htsearch">
Search:
<input type="text" size="30" name="words" value="">
<br>
<font size=-1>
Match: <select name=method>
<option value=and>All
<option value=or>Any
</select>
Format: <select name=format>
<option value=builtin-long>Long
<option value=builtin-short>Short
</select>
</font>
<input type="submit" value="Search">
</form>
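
To take the browser out of the picture, the same submission can be rebuilt
by hand. This is only a rough Python sketch: build_query is a made-up helper
name, and the field names are simply the ones from the form above. It prints
the URL-encoded body that the form would POST to /cgi-bin/htsearch.

```python
from urllib.parse import urlencode, parse_qs


def build_query(words, method="and", fmt="builtin-long"):
    """Return the URL-encoded body for the search form's three fields.

    build_query is a hypothetical helper; only the field names
    (words, method, format) come from the form itself.
    """
    return urlencode({"words": words, "method": method, "format": fmt})


# The failing search from this message, including the trailing asterisk.
body = build_query("Cobi Jone*")
print(body)

# Decoding the body again shows exactly what the CGI would receive.
print(parse_qs(body))
```

Comparing the decoded fields for "Cobi Jones" and "Cobi Jone*" at least
confirms that the search words reach htsearch unchanged.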

If I search on something like Cobi Jones I get a bunch of matches.
However, if I search on Cobi Jone I get nothing. If I change it to Cobi
Jone* I still get nothing.
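
To spell out what I am expecting, here is a rough Python sketch of prefix
expansion as I understand it. This is not htdig's code: the function name,
the word list, and the way a trailing * is stripped are my own assumptions.
The two limits are only meant to mirror minimum_prefix_length and
max_prefix_matches from my config.

```python
def prefix_matches(term, vocabulary, min_prefix_length=3, max_matches=500):
    """Return the indexed words that start with `term`.

    Hypothetical illustration only, not htdig's implementation.
    A trailing '*' on the term is ignored, and matching is case-insensitive.
    """
    prefix = term.rstrip("*").lower()
    # Prefixes shorter than the minimum are not expanded at all.
    if len(prefix) < min_prefix_length:
        return []
    hits = sorted(w for w in vocabulary if w.lower().startswith(prefix))
    # Cap the number of expansions, as max_prefix_matches is meant to.
    return hits[:max_matches]


# Made-up word list standing in for the words in the index.
vocabulary = ["cobi", "jones", "jonestown", "rangers", "news"]

print(prefix_matches("Jone*", vocabulary))  # ['jones', 'jonestown']
print(prefix_matches("Jo", vocabulary))     # [] (shorter than 3 characters)
```

In other words, I would expect Cobi Jone or Cobi Jone* to return at least the
pages that Cobi Jones returns.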

Does anyone have an idea of what I'm missing?

Jerry Preeper
preeper@cts.com
