[htdig] Adding anchors to PDF files.


From: Gaillard Pierre-Olivier (po.gaillard@free.fr)
Date: Tue Mar 28 2000 - 13:06:51 PST


Hello,

 I have devised a simple way to open a PDF file right at the page of a
match found by htsearch.
 I have described this solution on my web page
(http://po.gaillard.free.fr).
Basically, it consists of a simple external parser for PDF, written in
Python, and a two-line Netscape helper application. Together they take
advantage of the "add_anchors_to_excerpt: true" option of htsearch.

The external parser:
  - uses pdftotext from Xpdf to extract the text;
  - adds an anchor "PAGE_XXX" at the beginning of each page, where XXX
is the page number.
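
The parser's page-splitting step could look roughly like the sketch below
(Python 3). It relies on pdftotext separating pages with a form feed
character. Everything else is an assumption made for illustration: the
function names are hypothetical, the anchor number is written without
zero padding, and the tab-separated "a" (anchor) and "w" (word) record
layout should be checked against the ht://Dig external parser
documentation before use.

```python
import re

FORM_FEED = "\f"  # pdftotext ends each page with a form feed


def split_pages(text):
    """Return the text of each page, in document order."""
    pages = text.split(FORM_FEED)
    # A trailing form feed leaves one empty chunk after the last page.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


def parser_records(text):
    """Yield one anchor record per page, followed by that page's words.

    Assumed record layout (not confirmed by the post):
      a<TAB>anchor
      w<TAB>word<TAB>location (0-999)<TAB>heading level
    """
    pages = split_pages(text)
    total_words = sum(len(re.findall(r"\w+", p)) for p in pages) or 1
    seen = 0
    for number, page in enumerate(pages, start=1):
        yield "a\tPAGE_%d" % number
        for word in re.findall(r"\w+", page):
            # Spread word locations evenly over the whole document.
            location = seen * 1000 // total_words
            seen += 1
            yield "w\t%s\t%d\t0" % (word.lower(), location)
```

The input would be the text that pdftotext produced for the PDF; printing
each record on its own line gives the parser output.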

The helper:
  - takes the URL of the PDF file to open;
  - extracts the anchor from the URL to get the page number (e.g.
http://htdig.org/htdig.pdf#PAGE_122 => 122);
  - runs xpdf on that document and tells it to open at that page.
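
The helper's steps can be sketched as follows (Python 3). This is only an
illustration under stated assumptions: the function names and the local
path are hypothetical, a URL without a PAGE_ anchor falls back to page 1,
and xpdf is assumed to accept the page number as a second argument after
the file name.

```python
import re
from urllib.parse import urlsplit

# Anchors written by the parser are assumed to look like PAGE_122.
ANCHOR_RE = re.compile(r"^PAGE_(\d+)$")


def page_from_url(url, default=1):
    """Return the page number encoded in the URL fragment, or the default."""
    fragment = urlsplit(url).fragment
    match = ANCHOR_RE.match(fragment)
    return int(match.group(1)) if match else default


def xpdf_command(pdf_path, url):
    """Build the argument list that opens pdf_path at the matched page."""
    return ["xpdf", pdf_path, str(page_from_url(url))]
```

For the example URL above, page_from_url returns 122. The list built by
xpdf_command can be handed to subprocess.run to launch the viewer.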

 The process can be used for Word documents and others too (with catdoc,
for instance).

     I thought some people could use this simple trick. Of course,
comments are welcome.

        P.O. Gaillard
