Re: [htdig] Problem with german umlauts

Torsten Neuer
Wed, 2 Jun 1999 15:53:16 +0200

According to Norbert Hartl:
>Let me first say that I love the way htdig works and all its features.
>Yesterday I discovered a strange problem. I am indexing German pages
>with htdig. After setting the locale to de_DE.ISO-8859-1 in
>htdig.conf and using a German endings db, everything works fine.
>In the search form I can use all of the German umlauts and htdig
>finds the matching documents.
>This works for the search form but not for the $(PAGELIST). When I
>type an umlaut into a form, it is converted to %E4 (for ä) in order
>to pass it via the URL.
>In the PAGELIST there are URLs with an unconverted umlaut. This leads
>to misbehaviour in Netscape on the Mac. With the URLs containing the
>unconverted umlaut, this browser gets no search results and shows a
>scrambled umlaut in the following search form.
>Netscape on Linux and Windows works with it (the versions I
>have for testing).
>Is this misinterpreted by the Macintosh? Any ideas?

It may also be that the Mac version is strictly conformant whereas
the other versions are more relaxed. URLs should not contain any
special characters, AFAIK.

>Is there a workaround for converting these entities for the URLs in

Yep. You can wrap htsearch in a little CGI or server-side script
that does the magic for you with some regexp (a simple shell script
with a sed(1) command should work).
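
For illustration, here is a rough sketch of the filtering step of such a
wrapper, written in Python rather than as the sed(1) one-liner mentioned
above. It assumes the htsearch output is ISO-8859-1 (as in the locale
quoted in the question) and that the links use double-quoted href
attributes. The names escape_non_ascii and fix_links are made up for
this example and are not part of htdig.

```python
import re

# Assumption: htsearch output is ISO-8859-1, matching the quoted locale.
ENCODING = "iso-8859-1"

# Matches double-quoted href attribute values only (kept simple on purpose).
HREF_RE = re.compile(r'(href=")([^"]*)(")', re.IGNORECASE)


def escape_non_ascii(url, encoding=ENCODING):
    """Percent-encode every non-ASCII character in url; ASCII is left as-is."""
    out = []
    for ch in url:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # One %XX escape per byte of the character in the page encoding.
            out.extend("%{:02X}".format(b) for b in ch.encode(encoding))
    return "".join(out)


def fix_links(html):
    """Return html with the non-ASCII characters in href values escaped."""
    return HREF_RE.sub(
        lambda m: m.group(1) + escape_non_ascii(m.group(2)) + m.group(3),
        html,
    )
```

A wrapper CGI would run htsearch, decode its output as ISO-8859-1, pass
the text through fix_links, and send the result on to the browser. Only
the URLs are touched, so umlauts in the visible page text stay as they
are. Characters that do not exist in ISO-8859-1 would raise an error in
this sketch.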


InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606

