Re: SV: [htdig] Foreign chars (Swedish)


Subject: Re: SV: [htdig] Foreign chars (Swedish)
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Mon Nov 29 1999 - 11:54:05 PST


According to Philippe Ramkvist-henry:
> On Fri, 26 Nov 1999, Gilles Detillieux wrote:
> > That all looks the way it should, as far as I'm concerned. I guess we
> > need to focus on htsearch, as it appears to be the culprit. (Either that
> > or htmerge.) Could you try running htsearch from the command line,
> > and searching first for ANLÄNDE, and then for anlände? I'd like to see
> > what it finds in both cases.
>
> I get exactly the same problem when searching directly from the shell as
> when searching from an HTML form.

Just a hunch, but you wouldn't happen to have an ä in valid_punctuation,
would you? In any case, could you run htsearch -vvv twice, searching
first for ANLÄNDE, and then for anlände? How do the initial debugging
messages differ? What's happening to the ä - is it getting stripped
out or changed to another character? Is the upper case Ä getting changed
to an ä, or to another character? Are you using the exact same config
file for htdig, htmerge and htsearch?
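
To illustrate the valid_punctuation hunch, here is a minimal sketch of what
would happen to the search word if ä were listed there. It assumes that
characters in valid_punctuation are simply dropped from a word; the function
strip_listed_chars is hypothetical and is not taken from the ht://Dig
sources. The strings are Latin 1 bytes, so ä is written as \xE4.

```cpp
#include <string>

// Hypothetical helper, not ht://Dig code: drop every byte of `word` that
// also appears in `listed`, mimicking (by assumption) how a character
// listed in valid_punctuation is removed from a word.
std::string strip_listed_chars(const std::string& word,
                               const std::string& listed) {
    std::string out;
    for (char c : word) {
        if (listed.find(c) == std::string::npos) {
            out += c;  // keep bytes that are not in the list
        }
    }
    return out;
}
```

With ä in the list, strip_listed_chars("anl\xE4nde", ".-_/\xE4") yields
"anlnde", while the upper case "ANL\xC4NDE" passes through unchanged, so the
two queries would no longer look up the same word.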

> > Yes, and your db.wordlist looks fine (at least what you showed me),
> > so it should work, as long as you're also feeding Latin 1 characters
> > into htsearch. If you are, then it's a bug or a corrupt database (the
> > index, not the word list).
>
> Hmm, maybe. Is there any way to force the input in the HTML form to be
> "uppercase"? It's a dirty solution but it would work.

Not that I know of, but you could put an originalWords.uppercase(); right
after the originalWords.chop(" \t\r\n"); in htsearch/htsearch.cc. If the
htsearch -vvv above doesn't get to the root of the problem, it might be
interesting to see if this hack has any effect.
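
For reference, this is roughly what upper-casing has to do with Latin 1
input for the hack to help. It is only a sketch of the ISO-8859-1 mapping
(ä, 0xE4, becomes Ä, 0xC4), not the String::uppercase() code from ht://Dig,
whose behaviour may depend on the C library and locale. The names
latin1_toupper and latin1_uppercase are made up for this example.

```cpp
#include <string>

// Illustrative ISO-8859-1 upper-casing of a single byte.
unsigned char latin1_toupper(unsigned char c) {
    if (c >= 'a' && c <= 'z') {
        return static_cast<unsigned char>(c - 0x20);
    }
    // 0xE0..0xFE are the lower case accented letters, except 0xF7,
    // which is the division sign and has no case.
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7) {
        return static_cast<unsigned char>(c - 0x20);
    }
    // Everything else, including 0xDF and 0xFF, is left alone because
    // Latin 1 has no single-byte upper case form for them.
    return c;
}

// Upper-case a whole Latin 1 encoded string, byte by byte.
std::string latin1_uppercase(std::string s) {
    for (char& ch : s) {
        ch = static_cast<char>(latin1_toupper(static_cast<unsigned char>(ch)));
    }
    return s;
}
```

Here latin1_uppercase("anl\xE4nde") gives "ANL\xC4NDE". If the debugging
output shows the ä coming through unchanged after the uppercase() call, the
case conversion is probably not treating bytes above 0x7F as letters.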

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



