Re: [htdig] Still more info on pdf conversion problems.

From: Gilles Detillieux
Date: Mon Feb 07 2000 - 13:25:11 PST

According to Stan Brown:
> Sorry to keep posting bits of this, but it's an ongoing battle :-(

Unfortunately, in all of these bits you never mentioned which version
of htdig you're running. This is extremely relevant, especially now that
the first 3.2 beta has been released.

> If I put the following in the htdig.conf file:
> external_parsers: application/msword /usr/local/bin/ \
> application/postscript /usr/local/bin/ \
> application/pdf /usr/local/bin/
> Then the parse_doc script is called, resulting in the error I posted in
> my previous message.

That error definitely came from pdftotext. Running on the
same PDF will give you the same error message. If you're sure the PDF
file is correct, and pdftotext is in error, then you should report this
to Derek Noonburg, maintainer of the xpdf package. See the xpdf docs for
contact info.

> If however, I put the following in there:
> external_parsers: application/msword->text/html /usr/local/bin/ \
> application/postscript->text/html /usr/local/bin/ \
> application/pdf->text/html /usr/local/bin/
> The acroread is called, just as though I had no external converters
> defined.
> What am I doing wrong?

If htdig is defaulting to acroread, it's because it didn't recognise your
external_parsers definition for application/pdf. That means either you're
running a version of htdig older than 3.1.4, which doesn't handle external
converters, or you have a typo in your external_parsers definition.
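
To rule out the typo case, a quick sanity check of the definition can help.
The following is only a minimal Python sketch, not anything that ships with
htdig. It assumes the value is a whitespace-separated list of
"content-type program" pairs, where the content type may be written as
"from->to" to request an external converter. The function name and the
/path/to/converter placeholder are made up for illustration, and htdig's
own parsing may well differ in the details.

```python
# Illustrative checker only; this is NOT part of htdig.
def parse_external_parsers(value):
    """Split an external_parsers value into (source, target, program) entries.

    Assumption: the value is a whitespace-separated list of
    "content-type program" pairs, and a content type of the form
    "from->to" marks an external converter rather than a parser.
    """
    # Join backslash-continued lines the way they appear in the config file.
    tokens = value.replace("\\\n", " ").split()
    if len(tokens) % 2 != 0:
        raise ValueError("odd number of tokens: a type or a program is missing")

    entries = []
    for mime, program in zip(tokens[0::2], tokens[1::2]):
        source, arrow, target = mime.partition("->")
        # Both sides of the arrow should look like type/subtype.
        if "/" not in source or (arrow and "/" not in target):
            raise ValueError(f"malformed content type: {mime!r}")
        entries.append((source, target or None, program))
    return entries


if __name__ == "__main__":
    # Hypothetical definition; /path/to/converter is a placeholder.
    sample = ("application/msword->text/html /path/to/converter \\\n"
              "application/pdf->text/html /path/to/converter")
    for source, target, program in parse_external_parsers(sample):
        kind = "converter" if target else "parser"
        print(source, kind, program)
```

If an entry for application/pdf comes back as a parser when you meant a
converter, or the function raises an error, the definition itself is the
first thing to fix. If it looks fine, the htdig version is the more likely
culprit.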

Gilles R. Detillieux
Spinal Cord Research Centre
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

