Re: [htdig] PDF indexing


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Fri, 13 Aug 1999 13:04:08 -0500 (CDT)


According to burditt@okstate.edu:
> I just installed htdig and it is working great for HTML and text files but
> when I run the htdig program I get the following error message.
>
> sh:/usr/local/bin/acroread: no such file or directory
>
> I have Acrobat 4.0 installed but /usr/local/bin/ is empty. I'm sure it's
> something simple but how do I get it to index these files correctly?

A few points to consider:
1) The ht://Dig configure script currently expects to find acroread in
/usr/local/bin, and it sets that path as the default even if it doesn't
find acroread there. You need to set pdf_parser in your htdig configuration
file to point to the actual location, and you must include the two options
acroread needs in that setting.
2) A bug in Acrobat 4 makes it a poor choice as a PDF parser for htdig.
Its -pairs option causes a segmentation violation, so you'd need to kludge
up a wrapper script for it so that it could work without that option.
3) A lot of users have had more success using pdftotext as part of an
external parser, rather than using acroread.
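
As a rough sketch of points 1 and 3, the relevant configuration lines might
look something like the fragment below. The paths are placeholders, not
anything from your system. The two options shown (-toPostScript and -pairs)
are what I understand the documented default for pdf_parser to use, and the
parse_doc.pl name assumes the contributed external parser script. Check all
of these against the attribute documentation for your version before relying
on them.

```
# htdig.conf fragment (a sketch; every path here is a placeholder)

# Point 1: tell htdig where acroread really lives, keeping both options.
# /opt/Acrobat4/bin/acroread is a hypothetical install location.
pdf_parser: /opt/Acrobat4/bin/acroread -toPostScript -pairs

# Point 3 (alternative): hand PDFs to an external parser that calls
# pdftotext. The script name and its path are assumptions.
# external_parsers: application/pdf /usr/local/bin/parse_doc.pl
```

Only one of the two approaches should be needed. With Acrobat 4 the first
line would still hit the -pairs crash described in point 2, so the external
parser route is probably the less painful one.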

See the following for details:

http://www.htdig.org/attrs.html#pdf_parser
http://www.htdig.org/FAQ.html#q5.2
http://www.htdig.org/FAQ.html#q4.9

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



