Re: [htdig] pdf parser: No error;) Search: No results;(


Rick Wiggins (wiggins@gwis.com)
Tue, 23 Feb 1999 16:31:43 -0500 jjah@cloud.ccsf.cc.ca.us (Joe R. Jah)


Well, the big project we were planning, which required indexing PDFs,
hasn't progressed since I promised to provide an update, so there is no
update yet. My information is based on using htdig 3.1.0b4 (with minor
modifications) and pdftops (from xpdf 0.80) to index ONE PDF document.
This worked fine for me!

The change I made to htdig was in PDF.cc:

    // acroread << " -toPostScript " << pdfName << " " << tmpdir << " 2>&1";
    acroread << " " << pdfName << " " << psName << " 2>&1";

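This appears to make pdftops happy: it takes the input PDF name and the
output PostScript file name directly, with no -toPostScript flag and no
output directory.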
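
For anyone who wants to see the two command lines side by side, here is a
minimal standalone sketch of what the change amounts to. It is not the
actual htdig code: the helper buildConvertCommand, the Converter enum and
the use of std::ostringstream are my own assumptions for illustration
(PDF.cc appends to its own command string). Only the argument order is
taken from the two lines above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper, not part of htdig: builds the shell command that
// converts a PDF to PostScript, mirroring the two lines from PDF.cc.
enum class Converter { Acroread, Pdftops };

std::string buildConvertCommand(Converter kind,
                                const std::string &program,
                                const std::string &pdfName,
                                const std::string &psName,
                                const std::string &tmpdir)
{
    // Precondition for this sketch: a program and an input file are required.
    assert(!program.empty() && !pdfName.empty());

    std::ostringstream cmd;
    cmd << program;
    if (kind == Converter::Acroread) {
        // Original form: acroread is given a directory to write into.
        cmd << " -toPostScript " << pdfName << " " << tmpdir;
    } else {
        // Modified form: pdftops is given the input and output file names.
        cmd << " " << pdfName << " " << psName;
    }
    cmd << " 2>&1";  // fold stderr into stdout, as in the original line
    return cmd.str();
}
```

Called with Converter::Pdftops, "pdftops", "test.pdf" and "test.ps", this
returns "pdftops test.pdf test.ps 2>&1"; the tmpdir argument is ignored in
that case.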

I've put my test document back on my web site and re-indexed the site so
that you can see that it works yourself. If you go to
http://www.gwis.com/search and search for 'isp' you will see
'http://www.gwis.com/help/test.pdf' listed as the second hit. I've placed
a PostScript copy at http://www.gwis.com/help/test.ps.

I hope this helps. Please let me know what else you discover or if you
would like any additional information...

Rick



