Re: htdig: [Patch] non english text parser broken

Vadim Chekan
Mon, 9 Nov 1998 14:32:15 +0200

>No, but the encoding I'm using is iso-latin1, which is the default.
>The non-ascii characters are all in iso-latin1, not html entities.
>I'm using the same configuration file I used for 3.1.0b1, and things
>worked fine in 3.1.0b1.

Do you have non-Latin words in the debug output?
./htdig -isvvvvvvv > ttt
tail ttt
word: Ukraine*@2
word: Kiev*@2
word: Reitarskaya,@2
word: 8-a@2
word: ?????@993
word: ?????????????@994
word: ??@996
word: ??????????@998
word: ????????.@1000
 size = 23040
pick:, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: 10 documents

I have Russian words in the parser debug output.
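
The same check can be scripted instead of read by eye. Below is a minimal
sketch in Python, assuming the debug log uses the "word: <text>@<position>"
lines shown above and that non-English words show up as bytes above 0x7F.
The function name non_ascii_words is made up for illustration and is not
part of htdig.

```python
import re

# Matches debug lines such as b"word: Kiev*@2". The format is taken from the
# output quoted above and may differ in other htdig versions (assumption).
WORD_LINE = re.compile(rb"^word: (.*)@(\d+)\s*$")


def non_ascii_words(lines):
    """Return (word_bytes, position) pairs for words with a byte above 0x7F."""
    found = []
    for line in lines:
        match = WORD_LINE.match(line)
        if match is None:
            continue  # not a parser "word:" line
        word, position = match.group(1), int(match.group(2))
        # Work on raw bytes so no character set has to be guessed.
        if any(byte > 0x7F for byte in word):
            found.append((word, position))
    return found
```

Called as non_ascii_words(open("ttt", "rb")), an empty result would suggest
that the parser is dropping 8-bit words before they reach the word list.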

Vadim Chekan.

> 3. You can check whether it works by looking in db.wordlist.
> As long as there are no non-English words in it, you have a problem.
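
A rough way to run that check from a script, assuming db.wordlist is a plain
text file in which every line starts with the indexed word followed by
whitespace. That layout is an assumption, so compare it with your own file
first. The helper name count_words is illustrative only.

```python
def count_words(lines):
    """Count first-field words, split into (ascii_only, non_ascii)."""
    ascii_count = 0
    non_ascii_count = 0
    for line in lines:
        fields = line.split()  # bytes.split() splits on ASCII whitespace only
        if not fields:
            continue  # skip blank lines
        word = fields[0]  # assumed: the word is the first field on the line
        if any(byte > 0x7F for byte in word):
            non_ascii_count += 1
        else:
            ascii_count += 1
    return ascii_count, non_ascii_count
```

With the file opened in binary mode, count_words(open("db.wordlist", "rb"))
returning zero for the second number after indexing non-English pages points
to the problem described above.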

This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:28:46 PST