[htdig3-dev] htdig 3.1.4 is not 8-bit-clean on solaris


Subject: [htdig3-dev] htdig 3.1.4 is not 8-bit-clean on solaris
From: Marc Pohl (Marc.Pohl@wdr.de)
Date: Mon Jan 03 2000 - 15:08:53 PST


Hi,

For the last few weeks I have wondered why htdig doesn't accept any words containing the German u-umlaut (char 252) on my Solaris server. All locale settings were correct, and the same configuration runs on a Linux box without any problems.

Today I discovered the reason: WordList::valid_word() is not 8-bit-clean on Sun Solaris 2.6!
(iscntrl() called on the plain char holding byte 252 returns 1, but iscntrl((unsigned char)252) returns 0)

The patch in htdig-3.1.4/htcommon/WordList.cc is easy:

111c111
<       if (HtIsStrictWordChar((unsigned char)*word) && !isdigit(*word))
---
>       if (HtIsStrictWordChar((unsigned char)*word) && !isdigit((unsigned char)*word))
116c116
<       else if (allow_numbers && isdigit(*word))
---
>       else if (allow_numbers && isdigit((unsigned char)*word))
122c122
<       else if (iscntrl(*word))
---
>       else if (iscntrl((unsigned char)*word))
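
The reason the casts matter can be sketched as follows. Where plain char is signed, byte 252 is passed to the <ctype.h> functions as a negative int, and the C standard leaves the result undefined for any argument that is neither EOF nor representable as unsigned char. This is only a minimal illustration, not htdig code: the helper names is_cntrl_byte, is_digit_byte and has_control_char are hypothetical and exist only to keep the cast in one place.

```cpp
#include <cctype>

// Hypothetical helpers (not part of htdig). Each one casts the byte to
// unsigned char before calling the <cctype> function, so bytes above 127
// are passed as values in 128..255 instead of as negative ints.
inline bool is_cntrl_byte(char c)
{
    return std::iscntrl(static_cast<unsigned char>(c)) != 0;
}

inline bool is_digit_byte(char c)
{
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Returns true if the NUL-terminated string contains a control character,
// using the 8-bit-clean check above for every byte.
inline bool has_control_char(const char *word)
{
    for (; *word; ++word)
        if (is_cntrl_byte(*word))
            return true;
    return false;
}
```

With helpers like these, a word such as "für" (byte 252 in ISO-8859-1) is classified by its byte value rather than by whatever the platform returns for a negative argument.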

Marc

Marc Pohl, Online-Service-Center, Westdeutscher Rundfunk, D-50600 Koeln marc.pohl@wdr.de, +49 221 220 8618, http://www.wdr.de/

This archive was generated by hypermail 2b28 : Mon Jan 03 2000 - 15:24:55 PST