RE: [htdig] Accent problem.


Subject: RE: [htdig] Accent problem.
From: NEPOTE Charles (Neuilly Gestion) (charles.nepote@cetelem.fr)
Date: Mon May 15 2000 - 06:50:11 PDT


I am sorry, but I think the accent patch won't solve my problem because it is
an "after-merge solution".
Without the accent patch, if I manually search for "tue or tué", it still finds
only one document... The problem is in the database.

Extract from db.wordlist (read carefully -- I hope you can read accented
chars):

trie i:0
trie i:1
trie i:2
trié i:3
trié i:4
trié i:5
tue i:0
tué i:0
tue i:1
tué i:1
tue i:2
tué i:2
...

search "trie" will find 0 1 2
search "trié" will find 3 4 5
search "trié or trie" will find 0 1 2 3 4 5
search "tue" will find 2
search "tué" will find 2
search "tue or tué" will find 2

=> there is a problem... and other people should be able to reproduce it
(anyone out there?)...
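
For anyone trying to reproduce this, here is a minimal sketch in Python (not ht://Dig code) of what the extract above implies. It assumes each db.wordlist line is "word i:N" with N being the document id, and it only uses the lines quoted above; the real file continues past the "...". The names build_index and search_or are made up for this illustration.

```python
from collections import defaultdict

# Lines copied from the db.wordlist extract above ("word i:docid").
# Assumption: the number after "i:" is the document id.
EXTRACT = """\
trie i:0
trie i:1
trie i:2
trié i:3
trié i:4
trié i:5
tue i:0
tué i:0
tue i:1
tué i:1
tue i:2
tué i:2
"""

def build_index(text):
    """Map each word to the set of document ids listed for it."""
    index = defaultdict(set)
    for line in text.splitlines():
        word, ref = line.split()
        index[word].add(int(ref.split(":")[1]))
    return index

def search_or(index, *words):
    """Return the sorted ids of documents containing at least one word."""
    hits = set()
    for w in words:
        hits |= index.get(w, set())
    return sorted(hits)

index = build_index(EXTRACT)
for query in (["trie"], ["trié"], ["trié", "trie"],
              ["tue"], ["tué"], ["tue", "tué"]):
    print(" or ".join(query), "->", search_or(index, *query))
```

Going only by the extract, "tue", "tué" and "tue or tué" should each return documents 0, 1 and 2, whereas the searches above return only document 2.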

Many thanks for your help.
Charles Népote.

> -----Original Message-----
> From: Geoff Hutchison [mailto:ghutchis@wso.williams.edu]
> Sent: Monday, May 15, 2000 15:16
> To: NEPOTE Charles (Neuilly Gestion)
> Cc: 'htdig@htdig.org'
> Subject: Re: [htdig] Accent problem.
>
>
> At 12:51 PM +0200 5/15/00, NEPOTE Charles (Neuilly Gestion) wrote:
> >(only the file which corresponds to "i:2" will be found).
> >
> >
> >Can this be solved?
> >(Note: I have in htdig.conf:
> >locale: fr_FR
>
> You probably want to try the accents fuzzy patch at
> <ftp://sol.ccsf.cc.ca.us//htdig-patches/3.1.5/accents.5>
>
> (Thanks to Joe Jah for archiving patches.)
>
> This works along the lines of the soundex or metaphone fuzzy
> algorithms. You run it after running htmerge and it will add
> alternative accented or unaccented words to the query (with lesser
> weight as determined by the search_algorithms attribute).
>
> See my other message just now about the +/- of this approach or
> simply stripping accented words. As you noted in your message, the
> two words do not mean the same thing!
>
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
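
The fuzzy approach described in the quoted message (adding accented or unaccented alternatives to the query with a lesser weight) can be pictured roughly as follows. This is only a sketch in Python, not the code of the accents patch: the function names, the use of Unicode decomposition to strip accents, and the 0.5 weight are all assumptions made for illustration.

```python
import unicodedata
from collections import defaultdict

def strip_accents(word):
    """Return the word with combining accent marks removed (tué -> tue)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def build_accent_map(words):
    """Group indexed words by their unaccented form."""
    groups = defaultdict(set)
    for w in words:
        groups[strip_accents(w)].add(w)
    return groups

def expand_query(word, groups, fuzzy_weight=0.5):
    """Return (term, weight) pairs: the exact word first, variants after."""
    terms = [(word, 1.0)]
    for variant in sorted(groups.get(strip_accents(word), ())):
        if variant != word:
            # Hypothetical lesser weight for the accent variants.
            terms.append((variant, fuzzy_weight))
    return terms

groups = build_accent_map(["trie", "trié", "tue", "tué"])
print(expand_query("tué", groups))
```

Since this expansion happens at query time, it would not change what is stored in db.wordlist, which is why it is an "after-merge solution".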



This archive was generated by hypermail 2b28 : Mon May 15 2000 - 04:39:21 PDT