[htdig] Precise Fuzziness


Subject: [htdig] Precise Fuzziness
From: Dave Melton (dmelton@blarg.net)
Date: Mon Nov 22 1999 - 03:46:54 PST


Hi all,

I've just added a bunch of content to a site that I'm indexing with
htdig 3.1.2.

The site is mainly in English, but contains a large number of Japanese
names and terminology. My problem is that there are two different
transcription conventions for writing Japanese words in the Latin
alphabet. The main difference is that one system uses doubled vowels
to indicate certain sounds, so the same word can be spelled two ways.
I've been experimenting with the soundex and metaphone fuzzy
algorithms, but if I turn their weighting up enough to have any
effect, I get far too many bogus matches for the results to be useful.

Is there any way to manually define a specific set of matching rules?
If search strings could contain regular expressions, I could do what
I want by modifying the search string before htsearch sees it. Are
there any other ways to accomplish this kind of thing?
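
To make the "modify the search string before htsearch sees it" idea
concrete, here is a rough sketch in Python of what such a pre-processing
step might look like. It is only an illustration under several
assumptions: the function names (spelling_variants, or_query) are made
up for this example, the only spelling difference handled is a single
versus a doubled vowel, and the result is only useful if the search is
set up so that the word "or" between terms is treated as a boolean
operator.

```python
import itertools
import re


def spelling_variants(word, limit=64):
    """Return spellings of `word` with every vowel run written single or doubled.

    Hypothetical helper: it assumes the two transcription conventions
    differ only in single versus doubled vowels.
    """
    pieces = []
    # Split the word into runs of one repeated vowel and runs of anything else.
    for match in re.finditer(r"([aeiou])\1*|[^aeiou]+", word.lower()):
        vowel = match.group(1)
        if vowel:
            # A vowel run may be written either single or doubled.
            pieces.append((vowel, vowel * 2))
        else:
            # Consonants and other characters are kept as they are.
            pieces.append((match.group(0),))

    variants = []
    for combo in itertools.product(*pieces):
        variants.append("".join(combo))
        if len(variants) >= limit:
            # Cap the list so long words cannot blow up the query.
            break
    return variants


def or_query(word):
    """Join all variants of one search word with "or" (assumed boolean syntax)."""
    return " or ".join(spelling_variants(word))


if __name__ == "__main__":
    print(or_query("oosaka"))
```

A wrapper script placed in front of the search program could pass each
word the user typed through or_query() and hand the result on. Most of
the generated spellings will not exist in the index and so should match
nothing, but the number of variants doubles with every vowel in the
word, which is why the sketch caps the list.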

Thanks,

  Dave Melton
