Re: [htdig] PDF indexing problem


Subject: Re: [htdig] PDF indexing problem
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue Aug 08 2000 - 12:40:37 PDT


According to Justin Hopkins:
> I'm trying to index a dozen or so pdf files on my intranet,
> and both parse_doc.pl w/xpdf and acrobat (3 and 4) choke
> on the .pdf files.
>
> When acroread chokes, it gives me several of these
> sorts of errors:
>
> PDF::parse: cannot open acroread output from
> http://omniweb/resmis/docs/PMSs/lib
> ica/userguide/LTCONFIG.pdf
>
> When parse_doc.pl chokes, it gives several:
> sh: /usr/local/bin/parse_doc.pl: No such file or directory

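Well, parse_doc.pl can't exactly choke if it doesn't even run. That
"No such file or directory" error comes from the shell, which means
the script isn't at /usr/local/bin/parse_doc.pl. Make sure you
installed the script at the path you gave htdig in your configuration,
and make sure it's executable.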
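
A quick way to rule out both problems is to test the path directly.
The following is only a minimal sketch in Python, not part of htdig;
the helper name check_parser_script is hypothetical, and the path is
the one from the error message above.

```python
import os


def check_parser_script(path):
    """Return a list of problems that would stop the shell from running path.

    Hypothetical helper for illustration; htdig itself does not provide it.
    """
    problems = []
    if not os.path.isfile(path):
        # Matches the "No such file or directory" case reported by sh.
        problems.append("not found: " + path)
    elif not os.access(path, os.X_OK):
        # The file exists but the current user cannot execute it.
        problems.append("not executable: " + path)
    return problems


if __name__ == "__main__":
    # Path taken from the error message in the original report.
    for problem in check_parser_script("/usr/local/bin/parse_doc.pl"):
        print(problem)
```

An empty result only means the file exists and is executable by the
user running the check; run it as the same user that runs htdig.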

When you said it worked after implementing the fix to max_doc_size, as
David suggested, I assume you mean it now works with acroread. If you
find the results with acroread unsatisfactory (as many of us have),
you may want to switch to doc2html, in the contrib section of the
htdig.org web site. It's now the PDF converter of choice for htdig.
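
For anyone hitting the same max_doc_size problem: htdig reads at most
max_doc_size bytes of each document, and a PDF cut off at that point
generally can't be parsed. The sketch below, in Python, lists which
files would be cut off. It is only an illustration: the function name
and the sample sizes are made up, and the 100000 default is an
assumption about the stock setting, so check your own htdig.conf.

```python
def exceeds_max_doc_size(sizes, max_doc_size=100000):
    """Return the names of documents larger than max_doc_size bytes, sorted.

    sizes maps a document name to its size in bytes. The default limit is
    an assumed value; use whatever max_doc_size is set to in htdig.conf.
    """
    return sorted(name for name, size in sizes.items() if size > max_doc_size)


if __name__ == "__main__":
    # Hypothetical file sizes, for illustration only.
    sample = {"small.pdf": 40000, "large.pdf": 750000}
    for name in exceeds_max_doc_size(sample):
        print(name, "would be truncated")
```

Any file it lists needs a larger max_doc_size before a PDF converter
gets a complete document to work with.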

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



