Re: htdig: Searching PDF or Word Files


Geoff Hutchison (ghutchis@wso.williams.edu)
Mon, 11 Jan 1999 13:53:02 -0400


At 6:30 AM -0400 1/11/99, Shyam B S wrote:
>I am trying to index and Search MS Word and PDF files. I am using catdoc
>and acrobat as the external parsers for these documents. htsearch finds

Do you mean that you've specified catdoc and acrobat in the external parser
attribute? If so, it's not going to work reliably (if at all). The external
parser support expects output to follow certain guidelines documented in
http://www.htdig.org/attrs.html#external_parser so you can't just plug any
program in.
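
To give a rough idea of what a wrapper around a converter like catdoc
involves, here is a minimal Python sketch. It is not taken from the ht://Dig
distribution. It assumes the parser is invoked with the input file,
content type, URL and config file as arguments, and that output records are
tab-separated lines starting with a one-letter type ("t" for the title, "w"
for a word with a 0-1000 location and a heading level, "h" for the excerpt).
Those details are from memory, and the function name to_parser_records is
hypothetical, so check the attribute documentation before relying on them.

```python
import re
import sys


def to_parser_records(text, title=""):
    """Turn already-converted plain text into tab-separated records.

    The record layout (t / w / h lines) is an assumption about the
    external parser format, not a verified specification.
    """
    records = []
    if title:
        records.append("t\t" + title)

    length = max(len(text), 1)
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        # Relative position of the word in the document, scaled to 0..1000.
        location = match.start() * 1000 // length
        # Heading level 0 is assumed to mean ordinary body text.
        records.append("w\t%s\t%d\t0" % (match.group().lower(), location))

    # Short excerpt: whitespace collapsed, first 80 characters.
    records.append("h\t" + " ".join(text.split())[:80])
    return records


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed argument order: input file, content type, URL, config file.
    # The input is expected to be plain text already, e.g. the output of
    # a converter run beforehand by a wrapper script.
    with open(sys.argv[1], errors="replace") as handle:
        body = handle.read()
    print("\n".join(to_parser_records(body)))
```

The point is only that the converter's raw text has to be reshaped into
records before htdig can use it, which is why pointing the attribute
directly at catdoc or acrobat doesn't do what you want.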

If you're running one of the 3.1.0bX betas, it includes a PDF parser that
works with acrobat (and should work out of the box). More recent betas
also include scripts to handle Word documents using catdoc.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



