Re: [htdig] No more weird endings problem


Subject: Re: [htdig] No more weird endings problem
From: Alexey Rodriguez (alexey@dicyt.umss.edu.bo)
Date: Fri Jun 09 2000 - 03:18:44 PDT


On Fri, 9 Jun 2000, Alexey Rodriguez wrote:

> Good morning everyone. I finally managed to get some time to look
> at my problem. I discovered that htfuzzy has a small bug when parsing
> *.aff files. If you have the following rule:
>
> Z > -Z, CES # audaz audaces
>        ^
>        |
> htfuzzy stops parsing the line after this space, so it cuts the word
> ending but does not add the later part. This caused a lot of repetitions
> among the generated words.
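
To make the effect concrete, here is a minimal sketch in Python (not htfuzzy's actual C++ code). All function names are made up for illustration, and it only understands rules of the form COND > -STRIP, ADD. The "buggy" variant merely imitates the behaviour described above by ending the new suffix at the first space after the comma; the tolerant variant trims the whitespace first.

```python
def split_rule(rule):
    """Split 'COND > -STRIP, ADD # comment' into (cond, strip, raw_add)."""
    body = rule.split("#", 1)[0]            # drop the trailing comment
    condition, change = body.split(">", 1)  # left: condition, right: change
    strip_part, _, add_part = change.strip().partition(",")
    return condition.strip(), strip_part.lstrip("-").strip(), add_part


def parse_rule_buggy(rule):
    """Imitate the reported truncation: stop at the first space after the comma."""
    _cond, strip, raw_add = split_rule(rule)
    add = raw_add.split(" ", 1)[0]          # " CES" becomes an empty ending
    return strip, add


def parse_rule_tolerant(rule):
    """Skip whitespace around the new ending instead of stopping at it."""
    _cond, strip, raw_add = split_rule(rule)
    return strip, raw_add.strip()


def apply_rule(word, strip, add):
    """Remove the old ending (if present) and append the new one."""
    if word.upper().endswith(strip):
        return word[:len(word) - len(strip)] + add.lower()
    return word


RULE = "Z > -Z, CES # audaz audaces"
print(apply_rule("audaz", *parse_rule_buggy(RULE)))     # ending is cut only
print(apply_rule("audaz", *parse_rule_tolerant(RULE)))  # ending is replaced
```

With the rule above, the buggy variant turns audaz into auda, while the tolerant one gives audaces.
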
> I fixed the problem with a lazy script that removed spaces after
> the comma. Even the "DB2 problem..." messages stopped appearing.
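
A sketch of that kind of cleanup in Python, not the original script. The function names are hypothetical, and it assumes that everything after the first "#" on a line is a comment that should be left alone.

```python
import re


def strip_spaces_after_comma(line):
    """Remove blanks that follow a comma in the rule part of one .aff line."""
    rule, sep, comment = line.partition("#")   # keep the comment untouched
    rule = re.sub(r",[ \t]+", ",", rule)
    return rule + sep + comment


def fix_aff_text(text):
    """Apply the cleanup to every line, preserving the line endings."""
    return "".join(strip_spaces_after_comma(line)
                   for line in text.splitlines(keepends=True))


sample = "Z > -Z, CES # audaz, audaces\n"
print(fix_aff_text(sample), end="")
```

To use it on a real file, read the .aff file with its own encoding (often Latin-1 for Spanish dictionaries), pass the text through fix_aff_text, and write the result to a new file before running htfuzzy.
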
> Maybe this is an issue that has already been addressed. IMHO it is
> not a good idea to (only) strip the spaces from the aff file; it would be
> better to fix the parsing code in EndingsDB.cc so that people with similar

hehe, the file which needs some fixing is SuffixEntry.cc and not EndingsDB.cc

> aff files won't have that problem. I can make the patch if you consider
> that necessary (Gilles? Geoff?).
> Another issue that I encountered is that mungeWord doesn't handle
> accented words ('abaco -> ábaco). Is this normal, or must I fix the aff
> file (or source for instance) ?
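
For reference, the conversion I have in mind would look something like the sketch below. This is Python for illustration only, not what mungeWord does today; the table and function name are hypothetical, and it covers only the apostrophe-plus-vowel notation from the example above.

```python
# Hypothetical table: an apostrophe followed by a vowel stands for that
# vowel with an acute accent, as in 'abaco -> ábaco.
ACUTE = {
    "a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú",
    "A": "Á", "E": "É", "I": "Í", "O": "Ó", "U": "Ú",
}


def munge_accents(word):
    """Replace each 'x pair (x a vowel) with the accented vowel."""
    out = []
    i = 0
    while i < len(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if word[i] == "'" and nxt in ACUTE:
            out.append(ACUTE[nxt])   # consume the apostrophe and the vowel
            i += 2
        else:
            out.append(word[i])      # copy everything else unchanged
            i += 1
    return "".join(out)


print(munge_accents("'abaco"))
```

Note that this would also rewrite a genuine apostrophe in front of a vowel, so it only makes sense for dictionaries that use this notation throughout.
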
> Thanks for reading.
> Alexey

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig-unsubscribe@htdig.org
You will receive a message to confirm this.


