Esa Ahola (esa@cyclone.mindspring.com)
Thu, 11 Dec 1997 18:14:29 -0500 (EST)
I haven't heard back from you, which is quite okay; I just wanted to make
sure mail was not getting lost in one direction or another.
I discovered that the prefix algorithm is pretty overbearing in complex
queries unless there is a mechanism to request it explicitly for specific
words.
I did a quick hack to use a trailing '*' to indicate prefix matching; e.g.
foo or bar*
My test page mentioned below now uses that syntax.
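To illustrate the syntax, here is a minimal sketch in Python (not the actual ht://Dig change, which is in C++). The function name parse_terms and the operator set are assumptions made up for this example; it only shows how a trailing '*' could mark a word for prefix matching while other words stay exact.

```python
def parse_terms(query):
    """Split a query into (word, is_prefix) pairs.

    Boolean operators are returned with is_prefix set to None.
    Hypothetical helper for illustration; not part of ht://Dig.
    """
    operators = {"and", "or", "not"}  # assumed operator set
    terms = []
    for token in query.lower().split():
        if token in operators:
            # Operator, not a search word.
            terms.append((token, None))
        elif token.endswith("*") and len(token) > 1:
            # Trailing '*': prefix matching requested for this word only.
            terms.append((token[:-1], True))
        else:
            # Plain word: exact match.
            terms.append((token, False))
    return terms


print(parse_terms("foo or bar*"))
```

With this, "foo" is matched exactly and only "bar" is expanded by prefix.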
Do you think this is a worthwhile enhancement to ht://Dig?
-- Esa Ahola esa@cyclone.mindspring.com

---------- Forwarded message ----------
Date: Fri, 28 Nov 1997 01:50:33 -0500 (EST)
From: Esa Ahola <esa@cyclone.mindspring.com>
To: Andrew Scherpbier <andrew@contigo.com>
Subject: Re: Prefix algorithm and other tweaks
> 1. Yank GDBM and substitute Berkeley DB in Btree mode. Random index
> and sorted index in one!
This was easier than I thought, and I don't even speak C++. Kudos aplenty to your exceptionally clear code!
I have implemented a prototype "prefix" fuzzy algorithm. Works wonders so far in limited testing; see
http://mercedes.mindspring.com/mercedes/archives/prefix.html
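As a rough sketch of why a sorted index makes prefix matching cheap: all words sharing a prefix sit next to each other in key order, so one seek plus a short forward scan finds them. The Python below is an illustration only, not the prototype itself; a sorted list stands in for the Btree, bisect stands in for positioning a cursor at the first key not less than the prefix, and the name prefix_matches is hypothetical.

```python
import bisect


def prefix_matches(sorted_words, prefix):
    """Return every word in sorted_words that starts with prefix.

    sorted_words must already be in ascending order; it stands in for
    the keys of a sorted (Btree-style) index in this sketch.
    """
    # Seek to the first word that is >= prefix.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for i in range(start, len(sorted_words)):
        word = sorted_words[i]
        if not word.startswith(prefix):
            # Past the block of words sharing the prefix; stop scanning.
            break
        matches.append(word)
    return matches


words = sorted(["foo", "bar", "barn", "bark", "base"])
print(prefix_matches(words, "bar"))
```

The scan touches only the matching words plus one extra, regardless of how large the rest of the index is.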
It seems that additional configuration variables are in order, such as a maximum number of prefix matches and a minimum prefix length (one- or two-character prefixes will be rather hopeless with large databases).
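A minimal Python sketch of how two such limits might behave. The parameter names min_prefix_length and max_prefix_matches and their defaults are hypothetical, not existing ht://Dig attributes, and returning no matches for a too-short prefix is just one possible policy (falling back to an exact match would be another).

```python
from itertools import islice


def limited_prefix_matches(candidates, prefix,
                           min_prefix_length=3, max_prefix_matches=100):
    """Expand prefix against candidates, subject to two limits.

    Both limits are assumed names and values chosen for illustration.
    """
    if len(prefix) < min_prefix_length:
        # Too short to be useful on a large database: expand nothing.
        return []
    hits = (word for word in candidates if word.startswith(prefix))
    # Stop after max_prefix_matches words instead of collecting them all.
    return list(islice(hits, max_prefix_matches))


index_words = ["bar", "bark", "barn", "base"]
print(limited_prefix_matches(index_words, "ba"))
print(limited_prefix_matches(index_words, "bar", max_prefix_matches=2))
```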
This is exciting; I think prefix matching is by far the most useful fuzzy algorithm.
-- Esa Ahola esa@cyclone.mindspring.com
This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:25:24 PST