Re: [htdig] Problem with german umlauts

Gilles Detillieux
Wed, 2 Jun 1999 08:40:17 -0500 (CDT)

According to Norbert Hartl:
> Yesterday I discovered a strange problem. I am indexing German pages
> with htdig. After setting the locale to de_DE.ISO-8859-1 in
> htdig.conf and using a German endings db everything works fine.
> In the search form I can use all of the german umlauts and htdig
> finds the documents for it.
> This works for the search form but not for the $(PAGELIST). When I am
> typing an umlaut into a form it will be converted to %E4 (for ä) in order
> to pass it via URL.
> In the PAGELIST there are URLs with an unconverted umlaut. This leads
> to misbehaviour in Netscape on the Mac. With the URLs containing the
> unconverted umlaut, this browser returns no search results and shows a
> scrambled umlaut in the following search form.
> Netscape on Linux and Windows works with it (the versions I
> have for testing).
> Is this misinterpreted by the Macintosh? Any ideas?
> Is there a workaround for converting these entities for the URLs in

We uncovered a bug in the encodeURL() function back on May 20. This
function should encode all non-ASCII characters, but right now it passes
some of them through unencoded, which is why the umlauts end up raw in
the PAGELIST URLs. Here's the fix:

--- htlib/ Tue Feb 16 23:03:56 1999
+++ htlib/ Wed Jun 2 08:29:05 1999
@@ -75,7 +75,7 @@ void encodeURL(String &str, char *valid)
     for (p = str; p && *p; p++)
-        if (isdigit(*p) || isalpha(*p) || strchr(valid, *p))
+        if (isascii(*p) && (isdigit(*p) || isalpha(*p) || strchr(valid, *p)))
             temp << *p;
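
For anyone who wants to see the intended behaviour outside of htdig, here is a
minimal standalone sketch of the same idea. It is not the htdig code: the name
encode_url_sketch, the use of std::string instead of htdig's String class, and
the uppercase %XX output format are all assumptions made for illustration. The
point is only that a byte is copied through when it is ASCII and either
alphanumeric or in the "valid" set, and is percent-encoded otherwise.

```cpp
#include <cctype>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for encodeURL(); not the actual htdig implementation.
// Returns a copy of `in` where every byte is percent-encoded unless it is an
// ASCII letter, an ASCII digit, or one of the characters listed in `valid`.
std::string encode_url_sketch(const std::string &in, const char *valid)
{
    std::string out;
    for (unsigned char c : in) {
        // Reject NUL and anything outside 7-bit ASCII before consulting the
        // ctype functions, so they are never called with a high-bit byte.
        bool keep = c != 0 && c < 0x80 &&
                    (std::isalnum(c) || std::strchr(valid, c) != nullptr);
        if (keep) {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

With a "valid" set of "-_." (again, just an example value), a Latin-1 string
containing byte 0xE4 comes out with %E4 in its place, which is the form the
browser itself produces when the umlaut is typed into the search form.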

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2.0b3 on Wed Jun 02 1999 - 06:04:25 PDT