Re: [htdig] Problem writing an external parser


Subject: Re: [htdig] Problem writing an external parser
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Mon Jan 24 2000 - 08:40:24 PST


According to Bernard T. Higonnet:
> I am trying to write an external parser which seems to work partially
...
> 1) the words which seem to be in db.docdb are not to be found in
> db.wordlist (there's lots of stuff there, including names in the indexed
> files' full path names, but not these words)
> 2) searching on these words produces no documents (otherwise search
> seems to work normally)
>
> If this helps:
> My external parser returns records with first character t,h, and w but
> not u,a,i, or m

If the words appear correctly in db.docdb, it must mean that the h
record contains all the words you want, which is good. That they don't
appear in db.wordlist would explain why the search doesn't find them -
db.wordlist is used to build the word database for htsearch.

The most likely cause is a problem with the format of the w records your
parser puts out. Each w record must have 4 fields, separated by tab
characters: a w, then the word, then the location (0-1000), and finally
the heading level (0-11). If a field is missing or a value is out of
range, the record will be rejected.
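
As a rough illustration of that layout, a parser written in Python could
build each w record with a small helper like the one below. The name
format_w_record is hypothetical, and the range limits are taken from the
description above, not from the htdig source.

```python
def format_w_record(word, location, heading):
    """Build one w record: w<TAB>word<TAB>location<TAB>heading level."""
    # A tab or newline inside the word would break the field layout.
    if not word or "\t" in word or "\n" in word:
        raise ValueError("word must be non-empty, with no tab or newline")
    # Ranges as described in this message (assumed, not checked against htdig).
    if not 0 <= location <= 1000:
        raise ValueError("location must be in 0-1000")
    if not 0 <= heading <= 11:
        raise ValueError("heading level must be in 0-11")
    return "w\t%s\t%d\t%d" % (word, location, heading)


if __name__ == "__main__":
    # Emit one record for the word "parser", a quarter of the way into
    # the document, with heading level 0.
    print(format_w_record("parser", 250, 0))
```

Raising an error on a bad value, instead of writing the record anyway,
makes a formatting mistake show up in the parser itself rather than as
words silently missing from db.wordlist.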

Have a look at the output of your parser as you run it manually on one
of your documents, to see what's wrong with the w records.
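
If the output is too long to check by eye, a short script can flag the
suspect lines. This is only a sketch: the function names are made up for
illustration, and the checks mirror the field rules described in this
message, which may not match exactly what htdig itself enforces.

```python
import re
import sys


def _is_int_in_range(text, low, high):
    """True if text is a plain decimal integer within [low, high]."""
    return bool(re.fullmatch(r"[0-9]+", text)) and low <= int(text) <= high


def w_record_problem(line):
    """Return a description of what is wrong with a w record, or None."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 4:
        return "expected 4 tab-separated fields, found %d" % len(fields)
    kind, word, location, heading = fields
    if kind != "w":
        return "first field is %r, not a lone w" % kind
    if not word:
        return "empty word"
    if not _is_int_in_range(location, 0, 1000):
        return "location %r is not an integer in 0-1000" % location
    if not _is_int_in_range(heading, 0, 11):
        return "heading level %r is not an integer in 0-11" % heading
    return None


def check_output(lines):
    """Yield (line_number, problem) for each suspect w record."""
    for number, line in enumerate(lines, start=1):
        # Only lines whose first character is w are examined here.
        if line.startswith("w"):
            problem = w_record_problem(line)
            if problem:
                yield number, problem


if __name__ == "__main__":
    # Reads the parser's output from standard input.
    for number, problem in check_output(sys.stdin):
        print("line %d: %s" % (number, problem))
```

Run the parser by hand on one document and pipe its output into the
script. A common catch is a record that uses spaces where tabs are
required, which shows up as a line with only one field.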

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
