Subject: Re: [htdig] Problem with umlauts in HTML documents
From: Torsten Neuer (email@example.com)
Date: Tue Nov 30 1999 - 01:26:52 PST
Jens Moellenhoff wrote:
> Currently we're testing the usage of ht://Dig version 3.2. We have
> managed to index several directories. We even managed to install the
> German dictionary and grammar, so that it gives several alternative
> search words.
> But now when we search for a German word containing a German umlaut
> (e.g. "Überfall"), it gives no match. We even tried to transcribe it as
> "Ueberfall", but to no avail. A search for "&Uuml;berfall" also showed
> no result, because it split the search term at the ";".
> However, when we searched for "berfall" or for 'U"berfall', it found the
> document containing the word "Überfall", but it highlighted only the
> string "berfall" in the result list.
> The most interesting thing is that these difficulties only occurred with
> HTML and TXT files. PDF files do recognize all umlauts. We can index
> these files, search for "Überfall", and the search result is displayed
> We also tried to change the language declaration in the config file
> according to the FAQs, using "locale: de_DE.ISO_8859-1", but that didn't
> work either.
The FAQ contains an *example* of using the locale directive. The actual
setting of this directive depends upon the locale database installed on
your system (usually in "/usr/share/locale" or "/usr/lib/locale").
See locale(5) for more information.
For a German system, de_DE.ISO_8859-1 *may* work (if this is the name of
the installed German locale).
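
One rough way to find out which spelling your system accepts is to ask the
C library directly. The following is only a minimal Python sketch, not
anything shipped with ht://Dig: the helper name locale_is_available and the
list of candidate spellings are my own assumptions, and which names succeed
(if any) depends entirely on the locale database installed on your machine.

```python
import locale


def locale_is_available(name: str) -> bool:
    """Return True if the C library accepts `name` as an LC_CTYPE locale."""
    # Calling setlocale() without a value only queries the current setting.
    previous = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        # Raised when the requested locale is not installed or not recognized.
        return False
    finally:
        # Always restore the setting that was active before the probe.
        locale.setlocale(locale.LC_CTYPE, previous)


# Hypothetical candidate spellings of a German Latin-1 locale; these are
# guesses to try, not a list of names guaranteed to exist anywhere.
candidates = ["de_DE.ISO_8859-1", "de_DE.ISO8859-1", "de_DE.iso88591", "de_DE"]

if __name__ == "__main__":
    for name in candidates:
        status = "available" if locale_is_available(name) else "not installed"
        print(name, "->", status)
```

Whichever name is reported as available is the one to try in the "locale:"
line of the config file. On most Unix systems, "locale -a" prints the same
information as a plain list of installed locale names.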
> I am sorry if this has been described elsewhere before, but I'd be very
> glad if you could point me to that resource then.
In fact, the FAQ "4.10. How do I index documents in other languages?"
does not contain the line "locale: de_DE.ISO8859-1", but instead says
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14        Tel: +49-4101-403605
D-25474 Ellerbek        Fax: +49-4101-403606
E-Mail: firstname.lastname@example.org
Internet: http://www.inwise.de