Re: [htdig] Indexing PDF Files


Subject: Re: [htdig] Indexing PDF Files
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Wed Nov 01 2000 - 14:26:50 PST


On Wed, 1 Nov 2000, Roy Stephane wrote:

> When I run rundig in verbose mode, I find that htdig recognises all my
> PDF files; it shows their size. After that, when htmerge finds a PDF, it says
> that there is no excerpt, so the file (temporary file) is deleted.

You haven't told us which verbosity level you're using. Using -v is less
verbose than -vvvv, for example.

Also, you can test the parsing script itself by calling it from the
command-line with a filename. You want to make sure it's actually parsing
the PDF files correctly, e.g.:

parse_doc.pl file.pdf application/pdf http://www.foo.com/file.pdf /etc/htdig.conf

(that should all be on one line.)
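
If you want to run that check over several files, a small wrapper along these lines might help. This is only a sketch: check_parser is a made-up helper name, not something shipped with ht://Dig, and the record counting assumes the parser prints tab-separated lines whose first field is a one-letter type ("w" for a word, "h" for the excerpt text), which is my understanding of the external parser output format. Adjust it if your script prints something different.

```sh
#!/bin/sh
# Sketch only: run an external parser by hand and summarise what it printed.
# check_parser is a hypothetical helper, not part of ht://Dig.
check_parser() {
    parser=$1; file=$2; ctype=$3; url=$4; conf=$5

    # Same argument order as the command above: file, content type, URL, config.
    out=$("$parser" "$file" "$ctype" "$url" "$conf" 2>/dev/null)

    # No output at all means the parser found nothing it could index.
    if [ -z "$out" ]; then
        echo "$file: no output from parser"
        return 1
    fi

    # Assumed format: tab-separated records, first field is the record type.
    words=$(printf '%s\n' "$out" | awk -F'\t' '$1 == "w" { n++ } END { print n + 0 }')
    heads=$(printf '%s\n' "$out" | awk -F'\t' '$1 == "h" { n++ } END { print n + 0 }')
    echo "$file: $words word records, $heads excerpt records"
}

# Example call (paths and URL are placeholders):
# check_parser parse_doc.pl file.pdf application/pdf http://www.foo.com/file.pdf /etc/htdig.conf
```

If a file comes back with no output, or with word records but zero excerpt records, that would be consistent with the "no excerpt" message you are seeing, and the problem is more likely in the parser or the PDF converter it calls than in htmerge.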

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



