Re: Prefix algorithm and other tweaks

Esa Ahola
Thu, 11 Dec 1997 18:14:29 -0500 (EST)

Haven't heard back from you; that's quite okay, just wanted to make sure
mail was not getting lost in one direction or another.

I discovered that the prefix algorithm is pretty overbearing in complex
queries without a mechanism to request it explicitly for specific words.
I did a quick hack to use a trailing '*' to indicate prefix matching; e.g.

    foo or bar*

My test page mentioned below now uses that syntax.
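
A minimal sketch of how such a trailing '*' could be recognized when a query is split into terms. This is not the actual ht://Dig parser (which is C++); the function name `parse_query`, the operator set, and the token labels are assumptions made for illustration only.

```python
# Hypothetical illustration: tag each query term as an operator, an exact
# word, or a prefix term (a word ending in '*').
OPERATORS = {"and", "or", "not"}  # assumed operator set, for the sketch only


def parse_query(query):
    """Return a list of (kind, text) pairs for a whitespace-separated query."""
    tokens = []
    for raw in query.split():
        term = raw.lower()
        if term in OPERATORS:
            tokens.append(("op", term))
        elif term.endswith("*") and len(term) > 1:
            # Strip the marker; only this word is matched by prefix.
            tokens.append(("prefix", term[:-1]))
        else:
            tokens.append(("word", term))
    return tokens


print(parse_query("foo or bar*"))
```

With this split, "foo" would be looked up exactly and only "bar" would be expanded by the prefix algorithm.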

Do you think this is a worthwhile enhancement to ht://Dig?

Esa Ahola

---------- Forwarded message ----------
Date: Fri, 28 Nov 1997 01:50:33 -0500 (EST)
From: Esa Ahola <>
To: Andrew Scherpbier <>
Subject: Re: Prefix algorithm and other tweaks

> 1. Yank GDBM and substitute Berkeley DB in Btree mode. Random index
> and sorted index in one!

This was easier than I thought, and I don't even speak C++. Kudos aplenty to your exceptionally clear code!

I have implemented a prototype "prefix" fuzzy algorithm. Works wonders so far in limited testing; see

Seems that additional configuration variables are in order, such as a maximum number of prefix matches and a minimum prefix length (one- or two-character prefixes will be rather hopeless with large databases).
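
A rough sketch of why a sorted index makes prefix matching cheap, and where the two proposed limits would apply. It uses a sorted Python list and `bisect` as a stand-in for a Btree cursor positioned at the first key not less than the prefix; it is not the ht://Dig code. The names `prefix_matches`, `min_prefix_length`, and `max_matches`, and their default values, are hypothetical.

```python
import bisect


def prefix_matches(sorted_words, prefix, min_prefix_length=3, max_matches=10):
    """Return up to max_matches words from sorted_words that start with prefix."""
    # Very short prefixes would expand to a huge number of words; refuse them.
    if len(prefix) < min_prefix_length:
        return []
    # Jump to the first word >= prefix, as a Btree range lookup would.
    index = bisect.bisect_left(sorted_words, prefix)
    matches = []
    # In sorted order all words sharing the prefix are contiguous, so the
    # scan can stop at the first word that does not match.
    while index < len(sorted_words) and len(matches) < max_matches:
        word = sorted_words[index]
        if not word.startswith(prefix):
            break
        matches.append(word)
        index += 1
    return matches


words = sorted(["bar", "bare", "barn", "baron", "base", "foo", "food"])
print(prefix_matches(words, "bar"))                 # ['bar', 'bare', 'barn', 'baron']
print(prefix_matches(words, "bar", max_matches=2))  # ['bar', 'bare']
print(prefix_matches(words, "ba"))                  # [] (below the minimum length)
```

The same lookup against an unsorted hash-style index would need a full scan of every key, which is presumably what makes the Btree switch attractive here.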

This is exciting; I think prefix matching is by far the most useful fuzzy algorithm.

Esa Ahola

This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:25:24 PST