Re: [htdig] Problem with &..; entities in meta tags

Gilles Detillieux
Thu, 29 Jul 1999 13:53:23 -0500 (CDT)

According to Lennart Almkvist:
> In one database we index only the words in a couple of meta tags, containing
> e.g. flower names in various languages. These names sometimes contain some
> odd letters, as icelandic th. These letters have been coded as '&thorn;'
> etcetera.
> In the config file, text_factor is 0, and keywords_meta_tag_names is used to
> define the actual meta tags. The locale is set.
> However, in such words the '&...;' entity seems to be kept in the database.
> Searching for 'thorn*' gets many hits, but not words entered with the actual
> letter, as Alt+0222 for the icelandic thorn.
> In other databases, without this meta tag limitation, there are no problems
> with the &...; entities. What can the explanation be ?

I don't know, but if you mention which version of htdig you're running,
which operating system (and which OS version) it's running on, and what
your locale setting is in your htdig.conf, it may help narrow things down.

Are the hits on "thorn" really happening because of the &thorn; entities
in your documents, or do you have documents that contain the word "thorn"
in them? After all, every rose has its thorns! (Sorry, I couldn't resist.)
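
To show what an undecoded entity would look like in a word list, here is
a rough sketch in Python. It is not htdig's parser, just an illustration:
the function name index_words and the sample keyword are made up, and the
word splitting is a plain regular expression.

```python
import html
import re

def index_words(text, decode_entities):
    """Split text into lowercase word tokens.

    When decode_entities is False, a sequence such as '&thorn;' is left
    in place, so the entity name itself ends up as a word.
    """
    if decode_entities:
        # html.unescape turns named entities into their characters.
        text = html.unescape(text)
    return [word.lower() for word in re.findall(r"\w+", text)]

# Hypothetical meta tag keyword, written with an entity for the first letter.
keyword = "&thorn;istill"

print(index_words(keyword, decode_entities=False))
print(index_words(keyword, decode_entities=True))
```

If the entity is left undecoded, the word list contains "thorn" as a word
of its own, which would explain hits on 'thorn*' without any document
actually containing that word. If it is decoded, only the word with the
real letter is indexed.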

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

