Subject: RE: [htdig] Accent problem.
From: NEPOTE Charles (Neuilly Gestion) (firstname.lastname@example.org)
Date: Mon May 15 2000 - 06:50:11 PDT
I am sorry but I think the accent patch won't solve my problem because it is
an "after-merge solution".
Without the accent patch, if I manually search for "tue or tué" it still finds only
one document... The problem is in the database.
Extract from db.wordlist (read carefully -- I hope you can read accented
search "trie" will find 0 1 2
search "trié" will find 3 4 5
search "trié or trie" will find 0 1 2 3 4 5
search "tue" will find 2
search "tué" will find 2
search "tue or tué" will find 2
=> There is a problem... and other people should be able to reproduce it (anyone down
Many thanks for your help.
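
To make the observation above easy to repeat, here is a minimal Python sketch (not part of ht://Dig) that flags word pairs where the accented and unaccented forms return exactly the same documents. The `observed` table is copied from the search results in this message; `strip_accents` and `suspicious_pairs` are hypothetical helper names. Identical result sets are only a hint: they would also occur if a document really contains both words.

```python
import unicodedata


def strip_accents(word):
    """Return word with combining marks removed, e.g. 'tué' -> 'tue'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Search term -> document ids, as reported in the message above.
observed = {
    "trie": {0, 1, 2},
    "trié": {3, 4, 5},
    "tue": {2},
    "tué": {2},
}


def suspicious_pairs(results):
    """List (plain, accented) pairs whose result sets are identical."""
    pairs = []
    for word, docs in results.items():
        plain = strip_accents(word)
        # Only compare accented words against their unaccented form.
        if plain != word and plain in results and results[plain] == docs:
            pairs.append((plain, word))
    return pairs


print(suspicious_pairs(observed))
```

With the results above, only the "tue"/"tué" pair is flagged, while "trie"/"trié" is not, which matches what I see by hand.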
> -----Original Message-----
> From: Geoff Hutchison [mailto:email@example.com]
> Sent: Monday, 15 May 2000 15:16
> To: NEPOTE Charles (Neuilly Gestion)
> Cc: 'firstname.lastname@example.org'
> Subject: Re: [htdig] Accent problem.
> At 12:51 PM +0200 5/15/00, NEPOTE Charles (Neuilly Gestion) wrote:
> >(only the file which corresponds to "i:2" will be found).
> >Can this be solved?
> >(Note: I have in htdig.conf:
> >locale: fr_FR
> You probably want to try the accents fuzzy patch at
> (Thanks to Joe Jah for archiving patches.)
> This works along the lines of the soundex or metaphone fuzzy
> algorithms. You run it after running htmerge and it will add
> alternative accented or unaccented words to the query (with lesser
> weight as determined by the search_algorithm attribute).
> See my other message just now about the +/- of this approach or
> simply stripping accented words. As you noted in your message, the
> two words do not mean the same thing!
> -Geoff Hutchison
> Williams Students Online
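
The query-time expansion described in the quoted reply can be sketched roughly as follows. This is an illustration in Python of the general idea only, not the patch's actual code: `build_accent_index`, `expand_query`, and the 0.5 weight are assumptions made up for the example (in ht://Dig the weight would come from the configuration).

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    """Return word with combining marks removed, e.g. 'trié' -> 'trie'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_accent_index(words):
    """Map each unaccented key to every indexed word that reduces to it.

    This stands in for the extra database built after the merge step.
    """
    index = defaultdict(set)
    for word in words:
        index[strip_accents(word)].add(word)
    return index


def expand_query(word, index, fuzzy_weight=0.5):
    """Return {term: weight}: the exact word at 1.0, variants at fuzzy_weight."""
    expanded = {word: 1.0}
    for variant in index.get(strip_accents(word), ()):
        # setdefault keeps the full weight for the word the user typed.
        expanded.setdefault(variant, fuzzy_weight)
    return expanded


index = build_accent_index(["trie", "trié", "tue", "tué"])
print(expand_query("tué", index))
```

Because the variants are added when the query is run, this kind of expansion leaves the word database itself unchanged, which is why it would not help if the database already stores the two forms incorrectly.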