Re: [htdig] ASCIIfy patch


Subject: Re: [htdig] ASCIIfy patch
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Wed Oct 04 2000 - 10:35:53 PDT


According to Andoni Ayala:
> Does anybody know where I can find the ASCIIfy patch?
>
> I would like the engine to search for both café and cafe when people search for cafe (for example).

There are a couple of different patches for dealing with accents. The
one I think is preferable adds an "accents" fuzzy match method to
htsearch while still retaining the accents in the database. This is
also the method that was added to the 3.2.0b2 beta.

    ftp://ftp.ccsf.org/htdig-patches/3.1.5/accents.5
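
To illustrate the general idea of matching at search time while leaving
the stored words untouched, here is a minimal Python sketch. It is not
the code from the patch (ht://Dig itself is C++). The function names
strip_accents and accents_match, and the sample word list, are made up
for this example.

```python
import unicodedata


def strip_accents(word):
    """Return word with combining accent marks removed (e.g. é becomes e)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accents_match(query, indexed_words):
    """Return the indexed words that equal the query once accents are ignored.

    The indexed words are returned unchanged, so accents stay in the "database".
    """
    key = strip_accents(query).lower()
    return [w for w in indexed_words if strip_accents(w).lower() == key]


# Hypothetical index contents, for illustration only.
index = ["café", "cafe", "cafétéria", "carafe"]
matches = accents_match("cafe", index)
print(matches)
```

A search for "cafe" then finds both the accented and unaccented forms,
and excerpts still show the words as they were originally written.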

There is another patch which essentially strips all accents, leaving
only ASCII characters, which is a bad approach in my opinion, but others
feel differently.

    ftp://ftp.ccsf.org/htdig-patches/3.1.5/accents.zip
    ftp://ftp.ccsf.org/htdig-patches/3.1.5/accents.zip.README

Someone else made a script that builds a synonyms dictionary from the
words database by stripping accents, to essentially use the synonyms
fuzzy algorithm in place of the accents algorithm. I don't think this
was ever posted to the list or archive. Some hacks have been made to
earlier versions of htdig as well, but are probably not worth trying.
(One of them involved a program that pretty indiscriminately zapped the
db.docdb to strip accents from excerpts. Yeesh!)
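
Since that script was apparently never posted, the following is only a
rough Python sketch of how such a synonyms list could be built from a
word list. The function names, the sample words, and the output layout
(one line per group, unaccented form first, words separated by spaces)
are all assumptions for illustration; check the ht://Dig documentation
for the actual synonyms file format before using anything like this.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    """Return word with combining accent marks removed (e.g. é becomes e)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_synonym_lines(words):
    """Group words that differ only by accents into space-separated lines."""
    groups = defaultdict(set)
    for word in words:
        groups[strip_accents(word)].add(word)

    lines = []
    for base in sorted(groups):
        # Always include the unaccented form so a plain query can match.
        variants = groups[base] | {base}
        if len(variants) > 1:
            others = sorted(variants - {base})
            lines.append(" ".join([base] + others))
    return lines


# Hypothetical word list standing in for a dump of the words database.
lines = build_synonym_lines(["café", "cafe", "résumé", "resume", "table"])
for line in lines:
    print(line)
```

Words with no accented variant (like "table" here) produce no line, so
the resulting file only grows with the number of accented words.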

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



