htdig: Double Byte Technology...


Maren S. Leizaola (leizaola@unitedmta.com)
Mon, 20 Jul 1998 18:58:39 +0800 (CST)


Hi,
        Does anyone know whether there are any copies of HTDig out there
that handle double-byte text such as Japanese and Chinese? I've been trying
it out and I can get Chinese indexed up to a point.

        If 4 Chinese characters (8 bytes) have no spaces between them,
they are put into the dig file as one single 8-byte word, when they are
actually 4 words of 2 bytes each.

        Unfortunately the search unit has to be a single 2-byte entity,
i.e. if the characters are in the double-byte range they have to be split
into 2-byte units and then indexed. htmerge and htsearch do their job
properly and can search the 4 words in Chinese, as long as you enter the
exact sequence that they were indexed as.
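
To make the splitting concrete, here is a rough sketch in Python (not
HTDig code; the function name split_words is made up for illustration).
It assumes that any byte with the high bit set is the lead byte of a
2-byte character, which is a simplification of encodings such as Big5 or
GB2312, and that plain ASCII words are still delimited by non-alphanumeric
bytes. The sample input uses eight arbitrary bytes standing in for four
double-byte characters.

```python
def split_words(data: bytes) -> list[bytes]:
    """Split raw bytes into index words.

    ASCII alphanumeric runs become one word each. Any byte >= 0x80 is
    ASSUMED to be the lead byte of a 2-byte character, and that character
    is emitted as a word of its own.
    """
    words = []
    ascii_run = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b >= 0x80 and i + 1 < len(data):
            # Assumed double-byte character: flush any pending ASCII word,
            # then emit lead byte + trail byte as one 2-byte word.
            if ascii_run:
                words.append(bytes(ascii_run))
                ascii_run.clear()
            words.append(data[i:i + 2])
            i += 2
            continue
        if b < 0x80 and chr(b).isalnum():
            ascii_run.append(b)
        elif ascii_run:
            # Space, punctuation, or a stray lead byte ends the ASCII word.
            words.append(bytes(ascii_run))
            ascii_run.clear()
        i += 1
    if ascii_run:
        words.append(bytes(ascii_run))
    return words


# Hypothetical sample: an ASCII word followed by 4 double-byte characters.
sample = b"htdig \xd6\xd0\xce\xc4\xcb\xd1\xcb\xf7"
print(sample.split())       # whitespace only: the 8 bytes stay one word
print(split_words(sample))  # one 2-byte word per character
```

With whitespace splitting alone the eight bytes come out as a single word,
which matches what ends up in the dig file now. With the per-character
split, each 2-byte unit would be indexed separately, so a query would no
longer need to match the full original sequence.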

Does anyone have this working? Or does anyone have any input?

Maren.




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:26:53 PST