[htdig] pdf indexing question

Subject: [htdig] pdf indexing question
From: Matthew R. MacIntyre (matthew.macintyre@newlix.com)
Date: Tue Jul 25 2000 - 08:42:03 PDT

Hello all,

I'm having a problem indexing PDF files. The htdig phase seems to work
fine and produces no errors, but when the htmerge phase runs, this error
always shows up:

Deleted, no excerpt: 17/http://svr-newlix/products/technical/faq.pdf

I'm not really sure how to go about fixing this problem. Here's what I have
in my configuration file:

external_parsers: application/msword->text/html /usr/local/htdig/bin/conv_doc.pl \
/usr/local/htdig/bin/conv_doc.pl \
               application/pdf->text/html /usr/local/htdig/bin/conv_doc.pl

I tried using the parse_doc.pl script instead of conv_doc.pl for a
little while, but I kept getting many errors about acroread not being
found and about the PDF files not being repairable.

Any help with how to fix this would be greatly appreciated.



Matthew R. MacIntyre
Webmaster, Newlix Corporation
Tel: 613.225.0516
Fax: 613.225.5625
