ht://Dig Copyright © 1995-2004 The ht://Dig Group
Please see the file COPYING for license information.


htfuzzy [-c configfile][-v] algorithm ...


Htfuzzy creates indexes for different "fuzzy" search algorithms. These indexes can then be used by the htsearch program.


-c configfile
Use the specified configuration file instead of the default.
-v
Verbose mode. Used once, it provides progress feedback; used more than once, it will overflow even the biggest buffers. :-)


Indexes for the following search algorithms can currently be created:
soundex
Creates a slightly modified soundex key database. A soundex key encodes letters as digits, with similar-sounding letters (c, k, q) given the same digit. Vowels are not coded. Differences from the standard soundex algorithm are:
  • Keys are 6 digits.
  • The first letter is also encoded.
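As a rough illustration of the key format described above, here is a minimal Python sketch, not the htfuzzy implementation. The letter groups are taken from the classic soundex table and the zero padding is an assumption; the name soundex_key is hypothetical.

```python
# Letter groups follow the classic soundex table (an assumption; the
# htfuzzy source may group letters differently).
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex_key(word, length=6):
    """Return a digit-only key: the first letter is encoded too, vowels are skipped."""
    digits = []
    previous = None
    for ch in word.lower():
        code = CODES.get(ch)  # None for vowels and other uncoded characters
        if code is not None and code != previous:
            digits.append(code)
        previous = code
    # Truncate or pad with zeros (padding is an assumption) to a fixed length.
    return "".join(digits)[:length].ljust(length, "0")


print(soundex_key("cat"), soundex_key("kat"))
```

Because c and k share a digit, "cat" and "kat" end up with the same key, which is what lets a search for one find the other.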
metaphone
Creates a metaphone key database. This algorithm is more specific to English, but produces fewer "weird" matches than the soundex algorithm.
accents
Creates an accents key database. This algorithm maps all accented letters to their unaccented counterparts, so that a search for the unaccented word yields all accented variations of that word.
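The mapping can be pictured with a small Python sketch. This is only an illustration of the idea, not how htfuzzy does it: it uses Unicode decomposition from the standard unicodedata module as a stand-in for whatever character table htfuzzy uses, and the function names and sample words are hypothetical.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    # Decompose each character (NFD), then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_accents_index(words):
    """Map each unaccented key to the indexed words that reduce to it."""
    index = defaultdict(set)
    for word in words:
        index[strip_accents(word)].add(word)
    return index


# Hypothetical sample words, for illustration only.
index = build_accents_index(["résumé", "resume", "resumé", "café"])
print(sorted(index["resume"]))
```

A lookup on the unaccented key "resume" returns every stored variant, accented or not.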
endings
Creates two databases which can be used to match common word endings. Creating these databases requires a list of affix rules and a dictionary that uses those affix rules. Both files use the format of the ispell program. The distribution includes the affix rules for English and a fairly small English dictionary. Other languages can be supported by obtaining the appropriate affix rules and dictionaries. These are available for many languages; check the ispell distribution for more details.
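To show why two databases are useful, here is a hedged Python sketch of a pair of lookups, one from root to word forms and one from word form back to root. It assumes the affix rules have already been applied to the dictionary (that expansion step is not shown), and the function names and sample data are hypothetical.

```python
def build_ending_tables(expansions):
    """Build root -> words and word -> root lookups from expanded forms."""
    root2word = {}
    word2root = {}
    for root, forms in expansions.items():
        root2word[root] = sorted(set(forms))
        for form in forms:
            word2root[form] = root
    return root2word, word2root


def related_words(word, root2word, word2root):
    # Find the root of the search word, then list every form of that root.
    root = word2root.get(word, word)
    return [root] + [w for w in root2word.get(root, []) if w != root]


# Hypothetical, pre-expanded sample data.
expansions = {"walk": ["walks", "walked", "walking"]}
root2word, word2root = build_ending_tables(expansions)
print(related_words("walked", root2word, word2root))
```

With both directions available, a search for any one form can be widened to the root and all of its other forms.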
synonyms
Creates a database of synonyms for words. It reads a text database of synonyms and creates a database that htsearch can then use. On each line of the text database, the first word has the other words on that line as synonyms.
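The line format described above can be read with a few lines of Python. This is a minimal sketch that assumes words are separated by whitespace; the function name and the sample lines are hypothetical and do not come from the distributed synonyms file.

```python
def parse_synonyms(lines):
    """Map the first word of each line to the remaining words on that line."""
    synonyms = {}
    for line in lines:
        words = line.split()
        if len(words) < 2:  # skip blank lines and words with no synonyms
            continue
        head, *others = words
        synonyms.setdefault(head, []).extend(others)
    return synonyms


# Hypothetical sample lines in the "word synonym synonym ..." layout.
sample = [
    "car automobile auto",
    "quick fast rapid",
]
table = parse_synonyms(sample)
print(table["car"])
```

Note that the relation is one-directional in this reading: "car" lists "automobile", but "automobile" gets an entry only if it starts a line of its own.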


The default configuration file.
(Output) Maps between characters with and without accents for accents fuzzy rule
(Output) Database of similar-sounding words for metaphone fuzzy rule
(Output) Database of similar-sounding words for soundex fuzzy rule
COMMON_DIR/english.0, COMMON_DIR/english.aff
(Input) List of words and affix rules used to generate endings
COMMON_DIR/root2word.db, COMMON_DIR/word2root.db
(Output) Database used for endings fuzzy rule
(Input) List of groups of words considered synonymous
(Output) Database used for synonyms fuzzy rule

See Also

htdig, htmerge, htsearch, Configuration file format, and ispell.

Last modified: $Date: 2004/06/14 08:49:46 $