Re: [htdig] problems with the "accent" patch


Subject: Re: [htdig] problems with the "accent" patch
From: Robert Marchand (robert.marchand@UMontreal.CA)
Date: Fri Mar 03 2000 - 07:22:34 PST


Hi,

   the problem here is that words are truncated in the word database, and
the accent algorithm uses those truncated words to build its own database.
But when the user makes a search, the words are not truncated before the
corresponding accent keys are calculated, so the lookup does not find them.
So the generateKey method must truncate before calculating the key, because
it can't know whether it is called by htfuzzy or htsearch.

When a word does not exceed the maximum length, there is no problem.
For the other similar algorithms (soundex and metaphone) this doesn't
seem to be a problem, because they restrict themselves to a key of
6 characters.
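
To make the mismatch concrete, here is a rough sketch, not the actual
htdig code. foldKey and this two-argument generateKey are made-up names
for illustration, and the folding step only lowercases ASCII instead of
going through the real MinusculeISOLAT1 table (it also skips the leading
'0' of the real key). The limit of 12 is the default used in the patch
quoted below.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the accent folding step. The real code maps
// each byte through the MinusculeISOLAT1 table; this only lowercases ASCII.
std::string foldKey(const std::string &word) {
    std::string key;
    for (unsigned char c : word)
        key += static_cast<char>(std::tolower(c));
    return key;
}

// Sketch of the fix: truncate to maxLen first, then fold, so the key is
// the same whether the caller passes a truncated or an untruncated word.
std::string generateKey(const std::string &word, std::size_t maxLen) {
    return foldKey(word.substr(0, maxLen));
}
```

With a limit of 12, the word database stores "Internationa" for
"Internationalization". foldKey gives different keys for the two forms,
which is the failed lookup, while generateKey gives the same key for both.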

Thanks.
Also thanks to Gilles for his patch.

At 23:49 00-03-02 +0100, Eric van der Vlist wrote:
>But, the accents keys are already affected by maximum_word_length !
>
>What is exactly the functionality of this patch ?
>
>Eric (confused :=)
>
>Gilles Detillieux wrote:
>>
>> According to Robert Marchand:
>> > I will add a correction to have accents keys in sync with the
>> > maximum_word_length parameter.
>>
>> Here's a kludgy fix, which I haven't tested yet, but I think will work.
>> I'm not wild about the external reference to config, while other methods
>> in this class have the config object passed to them, but it should get
>> the job done. Obviously, this patch should be applied after the one
>> I posted earlier today, still using patch -p1.
>>
>> --- htdig-3.1.5.accents/htfuzzy/Accents.cc.notrunc Thu Mar 2 11:25:42 2000
>> +++ htdig-3.1.5.accents/htfuzzy/Accents.cc Thu Mar 2 16:33:10 2000
>> @@ -134,10 +134,16 @@ Accents::writeDB(Configuration &config)
>> void
>> Accents::generateKey(char *word, String &key)
>> {
>> + extern Configuration config;
>> + static int maximum_word_length = config.Value("maximum_word_length", 12);
>>
>> if (!word || !*word)
>> return;
>>
>> + String temp(word);
>> + if (temp.length() > maximum_word_length)
>> + temp.chop(temp.length()-maximum_word_length);
>> + word = temp.get();
>> key = '0';
>> while (*word) {
>> key << MinusculeISOLAT1[ (unsigned char) *word++ ];
>>
>> --
>> Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
>> Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
>> Dept. Physiology, U. of Manitoba Phone: (204)789-3766
>> Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
>
>--
>------------------------------------------------------------------------
>Eric van der Vlist Dyomedea
>
>http://www.dyomedea.com http://www.ducotede.com
>------------------------------------------------------------------------
>
-------
Robert Marchand tél: 343-6111 poste 5210
DiTER-SDI e-mail: marchanr@diter.umontreal.ca
Université de Montréal Montréal, Canada




This archive was generated by hypermail 2b28 : Fri Mar 03 2000 - 07:27:00 PST