Subject: Re: [htdig] No more weird endings problem
From: Gilles Detillieux (firstname.lastname@example.org)
Date: Fri Jun 09 2000 - 08:14:13 PDT
According to Alexey Rodriguez:
> Good morning everyone, I finally managed to get some time to look
> at my problem. I discovered that htfuzzy has a small bug while parsing
> *.aff files. If you have the following rule:
> Z > -Z, CES # audaz audaces
> htfuzzy will stop parsing the line at the space that follows the comma,
> so it cuts the word ending but doesn't add the later part. This caused
> a lot of repetitions among the generated words.
> I fixed the problem with a lazy script that removed spaces after
> the comma. Even the "DB2 problem..." messages stopped appearing.
> Maybe this is an issue that has already been addressed. IMHO it is
> not a good idea to (only) strip the spaces from the aff file; it would be
> better to fix the parsing code in EndingsDB.cc so that people with similar
> aff files won't have that problem. I can make the patch if you consider
> that necessary (Gilles? Geoff?).
That would be great. Ideally, htfuzzy should take any aff file that
ispell allows, without having to preprocess it. If you can get a patch
to us, it would be appreciated. (I'll be away for the next 3 weeks,
but I'm sure Geoff could port it to the 3.2 development code, and Joe
can archive the 3.1.5 version of the patch for others to use.)
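
As an illustration of the whitespace-tolerant parsing discussed above, here is a minimal sketch in Python. It is not the EndingsDB.cc code and not a patch for it. The function names parse_affix_rule and apply_rule are hypothetical, and the rule layout "condition > -strip,append # comment" is assumed from the single example quoted above. Other forms that ispell accepts are not handled.

```python
def parse_affix_rule(line):
    """Parse one suffix rule such as 'Z > -Z, CES # audaz audaces'.

    Returns (condition, strip, append), or None if the line holds no rule.
    Hypothetical helper: the rule layout is assumed from the quoted example.
    """
    # Drop a trailing comment introduced by '#'.
    body = line.split("#", 1)[0]
    if ">" not in body:
        return None
    condition, replacement = body.split(">", 1)
    # Remove all whitespace so that '-Z, CES' and '-Z,CES' parse identically.
    condition = "".join(condition.split())
    replacement = "".join(replacement.split())
    strip, append = "", replacement
    if replacement.startswith("-"):
        # '-STRIP,APPEND': remove STRIP from the word end, then add APPEND.
        strip, _, append = replacement[1:].partition(",")
    return condition, strip, append


def apply_rule(word, rule):
    """Apply a parsed rule to a word; the condition is not checked here."""
    _condition, strip, append = rule
    if not word.endswith(strip):
        return None
    return word[: len(word) - len(strip)] + append


rule = parse_affix_rule("Z > -Z, CES # audaz audaces")
print(rule)
print(apply_rule("AUDAZ", rule))
```

Because the whitespace is removed before the replacement is split at the comma, an aff file with spaces after the comma would not need to be preprocessed.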
> Another issue that I encountered is that mungeWord doesn't handle
> accented words ('abaco -> ábaco). Is this normal, or must I fix the aff
> file (or the source, for instance)?
Again, I'm not familiar enough with affix files to say for sure, but it
seems to me that htfuzzy should map the two-character accent notation
to the corresponding ISO accented letter. Maybe some of the affix file
parsing code should be borrowed from ispell.
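
The mapping suggested above could look roughly like the following Python sketch. It is not what mungeWord or ispell actually does. The name expand_accents and the ACUTE table are hypothetical, and only the apostrophe-plus-vowel notation from the quoted example ('abaco -> ábaco) is covered. Any other two-character notations an affix file may define are left out.

```python
# Assumed table: apostrophe + vowel stands for the acute-accented vowel.
ACUTE = {
    "a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú",
    "A": "Á", "E": "É", "I": "Í", "O": "Ó", "U": "Ú",
}


def expand_accents(word):
    """Replace each two-character accent notation with one accented letter."""
    out = []
    i = 0
    while i < len(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if word[i] == "'" and nxt in ACUTE:
            # Consume the apostrophe and the vowel together.
            out.append(ACUTE[nxt])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


print(expand_accents("'abaco"))
```

An apostrophe that is not followed by a vowel in the table is copied through unchanged, so words without the notation are unaffected.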
-- 
Gilles R. Detillieux              E-mail: <email@example.com>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930