Re: [htdig] PDF parser errors


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Wed, 30 Jun 1999 09:49:40 -0500 (CDT)


According to Evan Taylor:
> We have htdig-3.1.1 running on a redhat 5.2 and 6.0 machine for some
> time now without any problems. Recently we upgraded to htdig-3.1.2 on
> both the 5.2 and 6.0 servers. Now we are getting errors with PDFs. We
> are using the following line in the config file -
>
> pdf_parser: /usr/local/Acrobat3/bin/acroread -toPostScript -pairs
>
> This worked fine with 3.1.1 but with 3.1.2, we are getting the following
> error message.......
> /home/httpd/htdig/db/htdig29111.pdf: An unrecognized token '%s' was
> found.

The error above is generated by acroread, not htdig, so I suspect that
more changed on your system than just the htdig version. Either you have
a new acroread that has problems with files the old one didn't complain
about, or this is a new PDF that you weren't indexing before. I don't see
how the htdig version change would have caused this problem. Very little
changed between 3.1.1 and 3.1.2, apart from the HTML parsing.
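
One way to narrow this down is to run the configured parser command by
hand on the PDF in question, outside htdig, and see whether acroread
itself complains. Below is a minimal sketch in Python. The function name
run_pdf_parser is hypothetical, and it assumes the input and output paths
are simply appended as a pair after the configured command, as the -pairs
option suggests. That may not match exactly how htdig invokes the parser.

```python
import os
import shlex
import subprocess


def run_pdf_parser(parser_cmd, pdf_path, out_path):
    """Run a pdf_parser-style command on one PDF, outside htdig.

    Returns (exit_code, stderr_text, output_exists).
    """
    # Assumption: the PDF path and the output path follow the configured
    # command as one input/output pair.
    argv = shlex.split(parser_cmd) + [pdf_path, out_path]
    result = subprocess.run(argv, capture_output=True, text=True)
    # A missing output file would be consistent with the
    # "cannot open acroread output" message reported by htdig.
    return result.returncode, result.stderr, os.path.exists(out_path)


# Example call (both file paths are placeholders):
# run_pdf_parser(
#     "/usr/local/Acrobat3/bin/acroread -toPostScript -pairs",
#     "factsheet3.pdf",
#     "/tmp/factsheet3.ps",
# )
```

If the "unrecognized token" message shows up in the captured stderr here
as well, the problem is between acroread and that particular PDF, and
htdig is only reporting the missing output.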

> PDF::parse: cannot open acroread output from
> http://###.####.###/text/community_services/your_environment/natenv/factsheet3.pdf
>
> Has anyone seen this before??

--
Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig@htdig.org containing the single word "unsubscribe" in
the SUBJECT of the message.




This archive was generated by hypermail 2.0b3 on Wed Jun 30 1999 - 11:10:05 PDT