Re: htdig: Searching PDF or Word Files

Shyam B S
Tue, 12 Jan 1999 04:37:20 PST

>At 6:30 AM -0400 1/11/99, Shyam B S wrote:
>>I am trying to index and search MS Word and PDF files. I am using catdoc
>>and acrobat as the external parsers for these documents. htsearch
>Do you mean that you've specified catdoc and acrobat in the external_parsers
>attribute? If so, it's not going to work reliably (if at all). The external
>parser support expects output to follow certain documented guidelines, so
>you can't just plug any program in.
>If you're running any of the 3.1.0bX series, they include a PDF parser that
>works with acrobat (and should work out of the box). More recent betas
>include scripts to handle Word documents using catdoc.

Thanks. I am using htparsedoc as the external parser, which calls catdoc
for Word docs. I was able to solve the problem by modifying htparsedoc to
return record type h (head) along with record types t (title) and w (word).
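
For illustration only, here is a rough sketch of the kind of output involved.
It is not the actual htparsedoc change. It assumes the external parser
protocol is one tab-separated record per line, with "t" carrying the title,
"h" carrying the head text used for excerpts, and "w" carrying a word, a
location from 0 to 1000, and a heading level. That field layout is recalled
from the ht://Dig documentation and should be checked against your version.
The function name make_records and the excerpt_len parameter are made up
for this example.

```python
import re


def make_records(title, text, excerpt_len=80):
    """Build external-parser records as tab-separated lines.

    Assumed layout (verify against your ht://Dig version):
      t <title>
      h <head text for excerpts>
      w <word> <location 0-1000> <heading level>
    """
    records = ["t\t" + title]

    # The 'h' record: collapse whitespace and keep the leading text only.
    head = " ".join(text.split())[:excerpt_len]
    records.append("h\t" + head)

    # One 'w' record per word; location is the word's relative position
    # in the document scaled to 0-1000, heading level 0 for body text.
    words = re.findall(r"[A-Za-z0-9]+", text)
    total = max(len(words), 1)
    for i, word in enumerate(words):
        location = i * 1000 // total
        records.append("w\t%s\t%d\t0" % (word.lower(), location))
    return records
```

A wrapper script would run catdoc on the input file, pass the resulting text
to make_records, and print the records one per line on standard output.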




This archive was generated by hypermail 2.0b3 on Wed Jan 13 1999 - 09:13:05 PST