Re: [htdig] Problem with &..; entities in meta tags


Lennart Almkvist (la@nrm.se)
Fri, 30 Jul 1999 09:09:43 +0200


The 'þ' problem.
We are running htdig 3.1.2 on a PC with Red Hat Linux 6.0
and an Apache 1.3.6 web server.

Locale is sv_SE in the htdig config file.

There are no thorns in the body text, only in the meta tags.

One example:
<META NAME="other" CONTENT="Stemorsblom Natt og dag Almindelig
Stedmorsblomst Keto-orvokki
&thorn;renningarfj&oacute;la Wild Pansy Wildes Stiefm&uuml;tterchen">
where the German and Icelandic names contain '&...;' entity encodings.

In the .wordlist and .words.db file the words 'stiefmuuml;t' and
'thorn;rennin' are found.

Shouldn't such words be stored in these files in their decoded form, that is,
as one 8-bit character instead of five or six letters?
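
To illustrate what I mean by "decoded form", here is a minimal Python sketch.
It is not htdig code, and the helper name decode_entities is made up for this
example. It only assumes that each named entity should map to its single
ISO-8859-1 character before the words are indexed.

```python
import html

# CONTENT value from the META tag quoted above, joined onto one line.
CONTENT = (
    "Stemorsblom Natt og dag Almindelig Stedmorsblomst Keto-orvokki "
    "&thorn;renningarfj&oacute;la Wild Pansy Wildes Stiefm&uuml;tterchen"
)


def decode_entities(text):
    """Hypothetical helper: replace named entities with their characters."""
    return html.unescape(text)


decoded = decode_entities(CONTENT)

# Lower-cased words that contain a non-ASCII character after decoding.
accented = [w for w in decoded.lower().split() if not w.isascii()]

for word in accented:
    # Each decoded character occupies exactly one byte in ISO-8859-1.
    print(word, len(word), len(word.encode("latin-1")))
```

With this kind of decoding the index would hold the word starting with a real
thorn rather than the literal letters "thorn;".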

Lennart Almkvist
Museum of Natural History, Stockholm
