[htdig3-dev] Bug#56721: htdig and locale de_DE peculiarities. (fwd)


Subject: [htdig3-dev] Bug#56721: htdig and locale de_DE peculiarities. (fwd)
From: Gergely Madarasz (gorgo@sztaki.hu)
Date: Mon Jan 31 2000 - 08:21:43 PST


I just got this bug report in the Debian BTS.

-- 
Madarasz Gergely           gorgo@sztaki.hu           gorgo@linux.rulez.org
     It's practically impossible to look at a penguin and feel angry.
         Egy pingvinre gyakorlatilag lehetetlen haragosan nezni.
                   HuLUG: http://mlf.linux.rulez.org/

---------- Forwarded message ----------
Date: Mon, 31 Jan 2000 17:13:54 +0100
From: Florian Hars <florian@hars.de>
To: submit@bugs.debian.org
Subject: Bug#56721: htdig and locale de_DE peculiarities.
Resent-Date: Mon, 31 Jan 2000 16:18:02 +0000 (GMT)
Resent-From: Florian Hars <florian@hars.de>
Resent-To: debian-bugs-dist@lists.debian.org
Resent-cc: Gergely Madarasz <gorgo@sztaki.hu>

Package: htdig
Version: 3.1.4-1

This is probably for upstream.

I use htdig with a locale: de_DE setting. It seems unable to find occurrences of words containing non-ASCII characters that are part of titles, <Hn> or emphasis elements. For example, if I look for "bég" in my data, it finds an index.html document that contains the line

<a href="beg-islamabad-1990.html">B&eacute;g 1991: From the Quark Model to the Stand...</a>

but not the document beg-islamabad-1990.html itself, that starts with:

<html><head><title>B&eacute;g 1991: From the Quark Model to the Stand...</title>
<body>
<h1>Mirza Abdul Baqi <strong>B&eacute;g</strong>: From the Quark Model to the Standard Model: Ten Fateful Years in Particle Physics (1964--74 C.\,E.).</h1>
<p>Mirza Abdul Baqi <strong>B&eacute;g</strong> (1991): <em>From the Quark Model to the Standard Model: Ten Fateful Years in Particle Physics (1964--74 C.\,E.).</em>

It also doesn't find another document containing

<p><a href="beg-islamabad-1990.html">Mirza Abdul Baqi <strong>B&eacute;g</strong>: <em>From the Quark Model to the Stand...</em> 221-284</a></p>

although it finds both documents if I look for "Mirza".
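To make the expected behaviour concrete, here is a minimal sketch in Python of what I would expect an indexer to do with these fragments: strip the tags, decode the entity, and index the lowercased word regardless of which element it sits in. This is only an illustration and not htdig's code; the function name index_words is made up for this example, and it says nothing about where the actual problem in htdig lies.

```python
import html
import re


def index_words(fragment):
    """Return the set of lowercased words in an HTML fragment.

    Hypothetical helper for illustration only; not part of htdig.
    """
    # Replace tags with spaces so words in adjacent elements stay separate.
    text = re.sub(r"<[^>]+>", " ", fragment)
    # Decode character entities such as &eacute; into real characters.
    text = html.unescape(text)
    # \w is Unicode-aware for str patterns, so accented letters count as word characters.
    return {word.lower() for word in re.findall(r"\w+", text)}


title = "<title>B&eacute;g 1991: From the Quark Model to the Stand...</title>"
emphasis = "<p>Mirza Abdul Baqi <strong>B&eacute;g</strong> (1991):"

query = "b\u00e9g"  # the search term "bég", written with an escape for clarity

print(query in index_words(title))
print(query in index_words(emphasis))
print("mirza" in index_words(emphasis))
```

With this approach the word is indexed the same way whether it appears in a title, a heading, an emphasis element or link text, which is the behaviour I expected from the search.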

Yours, Florian.
-- 
+ when hideous hordes of web designers will leave ripped bloodless bodies of hosts they parasited upon and convulsively start tearing limbs of each other in agony illuminated by artificial light [...], then we know that time has come for dêë|||zêïñe++++ >>>> Å.Ñ.Ñ.Ï.H.Î.L.Ä.T.Î.Ö.Ñ -- www.absurd.org




This archive was generated by hypermail 2b28 : Mon Jan 31 2000 - 08:23:21 PST