Subject: Re: [htdig] PDF indexing problem
From: Gilles Detillieux (email@example.com)
Date: Tue Aug 08 2000 - 12:40:37 PDT
According to Justin Hopkins:
> I'm trying to index a dozen or so pdf files on my intranet,
> and both parse_doc.pl w/xpdf and acrobat (3 and 4) choke
> on the .pdf files.
> When acroread chokes, it gives me several of these
> sorts of errors:
> PDF::parse: cannot open acroread output from
> When parse_doc.pl chokes, it gives several:
> sh: /usr/local/bin/parse_doc.pl: No such file or directory
Well, parse_doc.pl can't exactly choke if it doesn't even run. You'd
need to make sure you installed the script where you tell htdig you
put it, and make sure it's executable.
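
As a rough illustration of those two checks (the file exists at the configured path, and it is executable), here is a minimal Python sketch. The function name check_parser_script is hypothetical and not part of ht://Dig; the path is simply the one from the error message above, so substitute whatever path your htdig configuration actually names.

```python
import os

def check_parser_script(path):
    """Return a list of reasons why a shell could not run the script at `path`.

    Hypothetical helper for illustration only; an empty list means both
    checks passed.
    """
    problems = []
    # Check 1: is there a regular file at the configured location?
    if not os.path.isfile(path):
        problems.append(f"{path}: no such file")
        return problems
    # Check 2: is it executable by the current user?
    if not os.access(path, os.X_OK):
        problems.append(f"{path}: not executable (try chmod +x)")
    return problems

if __name__ == "__main__":
    # Path taken from the "No such file or directory" error quoted above.
    for problem in check_parser_script("/usr/local/bin/parse_doc.pl"):
        print(problem)
```

Run it as the same user that runs htdig, since the executable check depends on who is asking.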
When you said it worked after implementing the fix to max_doc_size, as
David suggested, I assume you mean it now works with acroread. If you
find that the results with acroread are unsatisfactory (which many of
us have), you may want to switch to doc2html, in the contrib section
of the htdig.org web site. It's now the PDF converter of choice for
-- 
Gilles R. Detillieux              E-mail: <firstname.lastname@example.org>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
This archive was generated by hypermail 2b28 : Tue Aug 08 2000 - 02:40:30 PDT