Re: [htdig] problems with accents

Torsten Neuer (
Thu, 20 May 1999 11:59:51 +0200

According to Philippe Riviere:
>* two problems arise with accents :
>1) some browsers don't like URLs containing accentuated letters (it would
>be better to have them escaped). This happens in the results page when your
>search of an accentuated word yields many results : the 1 2 3 4 5 next
>links contain accents

It would certainly be better not to have accented letters in URLs
in general. IMHO this is more a matter of naming document files
properly than of having search engines recognize them. I'd bet you'll
run into trouble with that in more than just ht://Dig.
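
For the escaping itself, here is a minimal sketch of percent-encoding a
query value before it is written into a link. The helper name
percent_encode is made up for illustration and is not part of ht://Dig;
it assumes the input is a plain byte string (e.g. ISO-8859-1) and simply
escapes every byte that is not an RFC 3986 unreserved character.

```cpp
#include <string>

// Hypothetical helper, not ht://Dig code: percent-encode every byte that
// is not an RFC 3986 "unreserved" character (A-Z a-z 0-9 - . _ ~).
std::string percent_encode(const std::string& in)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

With ISO-8859-1 input, the byte 0xE9 (é) comes out as %E9, which any
browser should accept in a link.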

>2) searching "étude" does not yield "etudes" and vice-versa. I'd prefer it to.

Look at the ht://Dig documentation, set your locale to a proper value
(probably fr_FR), get a French dictionary and affix rule file for
the endings algorithm, and re-index your site.
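
As a rough sketch, the corresponding configuration entries might look
like the fragment below. The attribute names are the ones listed in the
ht://Dig 3.1.x attribute documentation as far as I know; the paths are
placeholders, and the exact locale name depends on what your system
provides, so treat all values as assumptions to check.

```
# Sketch only: paths are placeholders for wherever your French files live
locale:               fr_FR
endings_affix_file:   /path/to/french.aff
endings_dictionary:   /path/to/french.0
search_algorithm:     exact:1 endings:0.5
```

After changing these, the endings databases need to be rebuilt (htfuzzy
endings) in addition to re-indexing.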

>* I patched for a presentation glitch (in my view) : the 1 2 3 4
>5 images are separated by a space which can be ugly (depending on the
>images you use) ; I've suppressed that space in the code, but it would be
>nicer to have this as an option (and this is something I don't know how to
>-- lines ( v 3.1.2) were
>504: *str << p << ' ';
>514: *str << tmp << "\">" << p << "</a> ";
>504: *str << p ;
>514: *str << tmp << "\">" << p << "</a>";

This patch will mess up the displayed search results on non-graphical
browsers like Lynx. The space emitted by htsearch certainly has its
purpose.

Many people nowadays are unaware that non-graphical browsers are still
in use. In fact they are widely used, mainly for administrative purposes
on intranets, and that is also where ht://Dig is used to build search
engines for local documentation.

As htsearch cannot determine beforehand which browser is sending the
query, the space character is necessary.
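
If the separator were to become an option, it could keep the space as
its default so that text browsers are unaffected. The following is only
a sketch of that idea and not htsearch's actual code: the function
build_page_list and its parameters are hypothetical, and how the
separator would be read from the configuration is left open.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch, not htsearch code: build the "1 2 3 ..." page list
// with the separator passed in instead of a hard-coded ' '.
std::string build_page_list(const std::vector<std::string>& labels,
                            const std::vector<std::string>& urls,
                            std::size_t current,
                            const std::string& separator = " ")
{
    std::ostringstream str;
    for (std::size_t i = 0; i < labels.size() && i < urls.size(); ++i) {
        if (i == current) {
            // The current page is shown without a link.
            str << labels[i] << separator;
        } else {
            str << "<a href=\"" << urls[i] << "\">" << labels[i] << "</a>"
                << separator;
        }
    }
    return str.str();
}
```

Called without the last argument, this emits the same trailing space as
the original lines 504 and 514; passing an empty string gives the
behaviour of the patch for sites that only use images.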


