[htdig] A Suggestion on Accents

Subject: [htdig] A Suggestion on Accents
From: D.J.Adams@soton.ac.uk
Date: Mon May 15 2000 - 04:34:05 PDT

Our web pages are overwhelmingly in English, but we do have academics who
put up pages in other languages and would like them to be searchable. I'm
sure that this is quite common.

Rather than adding a fuzzy search method for accents, why not make the htdig
database itself accent-independent? After all, it is already case-independent!
For example:

Gar&ccedil;on -> Garçon -> garçon -> garcon

and 'garcon' goes into the database.
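
A minimal sketch of that chain, written in Python for illustration only; it
is not htdig code and does not reflect how htdig handles words internally.
The function name normalize_word is hypothetical, and the sketch assumes the
first step in the chain is HTML entity decoding and that accents can be
removed by Unicode decomposition rather than by locale tables.

```python
import html
import unicodedata


def normalize_word(word: str) -> str:
    """Reduce a word to a lowercase, accent-free key (hypothetical helper)."""
    # Step 1 (assumed): decode HTML entities, e.g. "Gar&ccedil;on" -> "Garçon".
    decoded = html.unescape(word)
    # Step 2: fold case, matching the case independence the database already has.
    lowered = decoded.lower()
    # Step 3: split accented letters into base letter + combining mark (NFD),
    # then drop the combining marks, e.g. "garçon" -> "garcon".
    decomposed = unicodedata.normalize("NFD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Every spelling variant collapses to the same key.
for form in ("Gar&ccedil;on", "Garçon", "garçon", "Garcon", "garcon"):
    print(form, "->", normalize_word(form))
```

If the same function were applied to each search term before lookup, a query
typed with or without the cedilla would map to the same key. Letters that have
no Unicode decomposition, such as 'ø' or 'ß', pass through this sketch
unchanged and would need an explicit mapping.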

Is this a sensible suggestion? Typing 'garcon' into an English-language
version of (say) Netscape is a lot easier than typing 'garçon', and it
seems reasonable to me that a search for 'garcon' should find not only
'garcon' and 'Garcon' but also 'garçon' and 'Garçon'.

I would even volunteer to work on a patch myself, but I lack knowledge of
locales, and anything I wrote would probably cause more problems than it
would solve.

David J Adams
Computing Services
University of Southampton

