Re: [htdig] re: Problems with iso characters

From: Petri Lankoski
Date: Tue Oct 24 2000 - 02:52:13 PDT

Peter Peltonen writes:
> Petri Lankoski wrote:
> > > I have a bit of a problem with htdig and ISO characters, and I can't find
> > > a solution to my problem in the FAQ. The htdig DB contains 8-bit
> Here's how I got htdig working in Finnish (with ISO characters, that is):
> 1. Configured my htdig.conf:
> locale: fi_FI.ISO-8859-1
> lang_dir: /var/lib/htdig/common/finnish
> bad_word_list: ${lang_dir}/bad_words
> endings_affix_file: ${lang_dir}/finnish.aff
> endings_dictionary: ${lang_dir}/finnish.0
> endings_root2word_db: ${lang_dir}/root2word.db
> endings_word2root_db: ${lang_dir}/word2root.db
> 2. Hunted the web and finally found a finnish.dict file. I copied the file
> as finnish.0 to the directory I specified in my htdig.conf (I also created
> that directory :). Copied finnish.aff there too. (If you cannot find these
> files, I can send them to you).
> 3. I put a list of bad words in the file bad_words
> I'm not sure whether the machine running htdig has to be configured to use
> the fi locale. I don't think so, but I changed that just to be sure.

I followed the instructions above, and htsearch still doesn't find words
with accented characters. As far as I can see, the db contains 8-bit characters.

[12:31] xcalibur /var/lib/htdig/db > /www/cgi-bin/htsearch
Enter value for words: mäyrä
Enter value for format: long
Content-type: text/html

<h1>No matches were found for 'mäyrä'</h1>
Check the spelling of the search word(s) you used.
If the spelling is correct and you only used one word,
try using one or more similar search words with "<b>Any</b>."


[12:32] xcalibur /var/lib/htdig/db > grep mäyrä db.wordlist
mäyrä i:165 l:561 w:439 a:1
mäyrä i:170 l:64 w:936
mäyrä i:259 l:123 w:877
mäyrä i:260 l:208 w:792
mäyrä i:269 l:263 w:2146 c:7
mäyrä i:270 l:237 w:3902 c:13
mäyrä i:272 l:595 w:405
mäyrä i:405 l:0 w:250895 c:3
mäyrä i:406 l:862 w:138
mäyrä i:418 l:26 w:974
mäyrä i:84 l:742 w:258
mäyrä i:85 l:117 w:883
mäyrä i:90 l:626 w:374
mäyräkoira i:421 l:697 w:303
mäyrälle i:405 l:958 w:42
mäyrältä i:170 l:203 w:797
mäyrän i:269 l:247 w:1050 c:2
mäyrän i:270 l:615 w:695 c:3
mäyrän i:86 l:341 w:1507 c:5
mäyrän i:89 l:667 w:333
mäyrää i:405 l:944 w:56
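
The grep above matches only if the terminal and db.wordlist use the same
byte encoding for 'ä'. A minimal sketch for checking the bytes directly,
assuming the word list is plain text with one word at the start of each
line; the function names are made up for illustration and are not part of
htdig:

```python
import sys


def classify_bytes(data: bytes) -> str:
    """Roughly classify how non-ASCII characters in data are encoded."""
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so most likely a single-byte charset
        # such as ISO-8859-1 (this is a guess, not a proof).
        return "8-bit"


def classify_words(path: str, limit: int = 20) -> None:
    """Print the first non-ASCII words of a word list with their encoding."""
    shown = 0
    with open(path, "rb") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            kind = classify_bytes(fields[0])
            if kind != "ascii":
                print(fields[0], kind)
                shown += 1
                if shown >= limit:
                    break


if __name__ == "__main__" and len(sys.argv) > 1:
    classify_words(sys.argv[1])
```

Run with db.wordlist as the only argument, this would print entries such as
the 'mäyrä' lines together with "8-bit" or "utf-8", which tells whether the
indexed words really are single-byte ISO-8859-1.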

locale: fi_FI
lang_dir: /var/lib/htdig/common/finnish
bad_word_list: ${lang_dir}/bad_words
endings_affix_file: ${lang_dir}/finnish.aff
endings_dictionary: ${lang_dir}/finnish.0
endings_root2word_db: ${lang_dir}/root2word.db
endings_word2root_db: ${lang_dir}/word2root.db
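
Whether the name given on the locale line is actually installed on the
machine can be checked separately. A minimal sketch, on the assumption that
htdig passes the value to the C library's setlocale(); the helper name is
hypothetical:

```python
import locale


def locale_is_available(name: str) -> bool:
    """Return True if the C library accepts name for LC_CTYPE."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # The two spellings that appear in this thread.
    for name in ("fi_FI", "fi_FI.ISO-8859-1"):
        print(name, locale_is_available(name))
```

If one of the two names is rejected, that would be worth ruling out before
looking further at htsearch itself.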

[12:38] xcalibur /var/lib/htdig/db > locale

The system is Red Hat 6.2 and htdig is htdig-3.1.5-0glibc21.

  Petri Lankoski   Yeah you wanna go out 'cause it's raining and blowing * You can't go out cause your roots are showing * dye em black
  PGP:             type o negative


This archive was generated by hypermail 2b28 : Tue Oct 24 2000 - 02:57:52 PDT