Re: [htdig3-dev] Bug#56721: htdig and locale de_DE peculiarities. (fwd)

From: Geoff Hutchison
Date: Thu Feb 03 2000 - 18:21:05 PST

At 5:21 PM +0100 1/31/00, Gergely Madarasz wrote:
>I use htdig with a locale: de_DE setting. It seems unable to find
>occurrences of words containing non-ascii characters that are part of
>titles, <Hn> or emphasis elements. Say, if i look for "bg" in my
>data, it finds an index.html document that contains the line

This is rather odd. You see, the HTML parser doesn't pay much
attention to emphasis tags like <strong> or <em> and doesn't really
do anything different about <Hn> tags as far as recording words.

However, Marc Pohl found a problem with the handling of 8-bit
characters. I don't know whether it would fix this problem, but the
patch is attached.
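
For illustration only, here is a minimal Python sketch of how an 8-bit handling problem in a word splitter could produce this kind of symptom. It is not ht://Dig code and does not reflect the contents of the patch; the function name `split_words`, its `eight_bit_clean` flag, and the sample text are all hypothetical, and the input is assumed to be ISO-8859-1.

```python
def split_words(data, eight_bit_clean):
    """Split ISO-8859-1 bytes into words (hypothetical helper, not ht://Dig code)."""
    words = []
    current = []
    for byte in data:
        # Latin-1 byte values map one-to-one onto the first 256 Unicode code points.
        ch = chr(byte)
        if eight_bit_clean:
            # Treat accented letters such as 0xE4 ('ä') as word characters.
            is_word_char = ch.isalnum()
        else:
            # Assumed failure mode: any byte with the high bit set ends the word.
            is_word_char = byte < 0x80 and ch.isalnum()
        if is_word_char:
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


sample = "Die Bäume".encode("latin-1")
broken = split_words(sample, eight_bit_clean=False)
fixed = split_words(sample, eight_bit_clean=True)
```

With the high bit treated as a separator, the sample is indexed as fragments rather than as the whole word, so a search for the full word would find nothing.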

Please let me know if this helps,

-Geoff Hutchison
Williams Students Online

This archive was generated by hypermail 2b28 : Thu Feb 03 2000 - 18:23:26 PST