Re: [htdig] A Suggestion on Accents

Subject: Re: [htdig] A Suggestion on Accents
Date: Tue May 16 2000 - 01:09:14 PDT

> >Rather than a fuzzy accents search method, why not make the htdig database
> >accent independent? After all, it is case independent already!
> >For example:
> >
> >Garçon -> Garçon -> garçon -> garcon
> I would make the analogy to word suffixes rather than to case. There
> is an endings fuzzy rather than a general stemming step during
> indexing. IMHO, this makes searches a bit more precise because the
> alternatives will get less weight than what the user actually
> entered. (Remember the old maxim "the customer is always right"?)
> Besides, there are some situations where the unaccented word and the
> accented word do *not* mean the same thing.

Yes, and when I search for 'garçon', am I looking for a waiter or a schoolboy?
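
To make the weighting argument concrete, here is a rough Python sketch of a query-time accents fuzzy. It is not ht://Dig code: the helper names (strip_accents, build_accent_map, expand_query) and the 0.5 weight are assumptions chosen purely for illustration. The index keeps words exactly as written; the query is expanded to every indexed word sharing the same unaccented form, and the spelling the user typed keeps the highest weight.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    """Return the word with combining accent marks removed."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_accent_map(indexed_words):
    """Map each unaccented key to the set of indexed words sharing it."""
    accent_map = defaultdict(set)
    for word in indexed_words:
        accent_map[strip_accents(word)].add(word)
    return accent_map


def expand_query(word, accent_map, fuzzy_weight=0.5):
    """Return {term: weight}; the user's own spelling scores highest."""
    weights = {word: 1.0}
    for variant in accent_map.get(strip_accents(word), ()):
        # setdefault leaves the exact term at 1.0 and adds the others lower.
        weights.setdefault(variant, fuzzy_weight)
    return weights


accent_map = build_accent_map(["garçon", "garcon", "maïs", "mais"])
print(expand_query("garçon", accent_map))
```

With this approach a search for 'mais' still finds 'maïs', but ranks it below the exact match, which is the precision benefit described in the quoted text.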

> (BTW, the 3.2 code isn't completely case independent. It stores a
> flag when the word is capitalized. My feeling is that user queries
> with capitals should return capitals preferentially.)

Neat idea.

> All that said, it would be possible to patch the code in
> and remove accents before storing the word.

I'll take a look at the 3.1.5 code, but don't hold your breath.
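
For comparison, the index-time alternative (removing accents before the word is stored) would amount to something like the following Python sketch. This is only an illustration of the idea, not the ht://Dig source, which is C++; the function name fold_word is hypothetical.

```python
import unicodedata


def fold_word(word):
    """Lowercase a word and strip its accents before it is stored."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


print(fold_word("Garçon"))  # prints "garcon"
```

The same folding would have to be applied to each query word, and once it is done the index can no longer tell an accented word from its unaccented twin.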

> --
> -Geoff Hutchison
> Williams Students Online

David J Adams
Computing Services
University of Southampton


This archive was generated by hypermail 2b28 : Mon May 15 2000 - 22:57:16 PDT