Well, the big project that required indexing PDFs hasn't progressed since
I promised to provide an update, so there is no update yet. What I know
comes from using htdig 3.1.0b4 (with minor modifications) and pdftops
(from xpdf 0.80) to index ONE PDF document. That worked fine for me!

The change I made to htdig was in

    // acroread << " -toPostScript " << pdfName << " " << tmpdir << " 2>&1";
    acroread << " " << pdfName << " " << psName << " 2>&1";

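The commented-out line is the original one. The replacement drops the
-toPostScript flag and passes an output file name (psName) instead of an
output directory (tmpdir). This appears to make pdftops happy.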
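
To make the difference between the two lines concrete, here is a minimal
standalone sketch of the two command strings. It is not htdig's code:
htdig builds the string with its own String class, while this uses
std::ostringstream, and the function names and the explicit converter
argument are made up for illustration. Only pdfName, psName, tmpdir and
the argument order come from the lines above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Original form: flag, PDF file, then an output DIRECTORY.
// (Hypothetical helper; "converter" stands in for the configured program path.)
std::string acroreadCommand(const std::string &converter,
                            const std::string &pdfName,
                            const std::string &tmpdir)
{
    assert(!converter.empty());
    std::ostringstream cmd;
    cmd << converter << " -toPostScript " << pdfName << " " << tmpdir << " 2>&1";
    return cmd.str();
}

// Modified form: no flag, PDF file, then the output PostScript FILE,
// which matches pdftops' "pdftops <PDF-file> <PS-file>" usage.
std::string pdftopsCommand(const std::string &converter,
                           const std::string &pdfName,
                           const std::string &psName)
{
    assert(!converter.empty());
    std::ostringstream cmd;
    cmd << converter << " " << pdfName << " " << psName << " 2>&1";
    return cmd.str();
}
```

With made-up paths, pdftopsCommand("/usr/bin/pdftops", "/tmp/doc.pdf",
"/tmp/doc.ps") yields "/usr/bin/pdftops /tmp/doc.pdf /tmp/doc.ps 2>&1".
The trailing "2>&1" just folds stderr into stdout when the command is run
through a shell.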

I've put my test document back on my web site and re-indexed the site so
that you can see that it works yourself. If you go to and search for 'isp' you will see
'' listed as the second hit. I've placed
a PostScript copy at

I hope this helps. Please let me know what else you discover or if you
would like any additional information...


