Re: [htdig] Still no luck with indexing PDF's


Subject: Re: [htdig] Still no luck with indexing PDF's
From: Anthony Peacock (a.peacock@chime.ucl.ac.uk)
Date: Mon Feb 07 2000 - 08:42:50 PST


> On Mon Feb 7 10:07:32 2000 Anthony Peacock wrote...
> >
> >
> >Can you try converting a single PDF file by running either Acroread or
> >pdftotext from the command line, and send the results to the list?
>
> pdftotext is the source of the "No current point in closepath" errors.
>
> Does this help?

Even when you run it from the command line?

In that case I think some of your PDF files are broken. I had this problem on
my site a few months ago: the PDF files had been created with some bad
constructs in them. When I got the user to recreate them, the errors went
away.
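
A minimal sketch of how the affected files could be tracked down, in Python.
It is an illustration only: find_broken_pdfs is a hypothetical helper name,
and the sketch assumes pdftotext is on the PATH and reports problems such as
"No current point in closepath" on stderr or through a non-zero exit status.

```python
import subprocess
import sys


def find_broken_pdfs(paths, converter=("pdftotext",)):
    """Return {path: error text} for every PDF the converter complains about."""
    broken = {}
    for path in paths:
        # "-" asks pdftotext to write the extracted text to stdout,
        # which is discarded here; only the diagnostics are of interest.
        result = subprocess.run(
            [*converter, path, "-"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
            text=True,
            errors="replace",
        )
        # Treat a non-zero exit status or any stderr output as a problem.
        if result.returncode != 0 or result.stderr.strip():
            broken[path] = result.stderr.strip()
    return broken


if __name__ == "__main__":
    # Assumed usage: pass the PDF paths to check as command-line arguments.
    for path, message in find_broken_pdfs(sys.argv[1:]).items():
        print(f"{path}: {message}")
```

Any file listed by such a check is a candidate for being recreated from its
original source document.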

>
> Running conv_doc.pl by hand on a specific file does seem to get a usable
> looking result.
>
> Might this be the way to go?
>
> Any ideas on why acroread is still called instead of it, when I put
> what I believe to be the correct thing in htdig.conf?
>
> external_parsers: apliaction/pdf->text.html /usr/local/bin/conv_doc.pl

I don't use conv_doc.pl; I use parse_doc.pl.

I have sent you the relevant lines from my configuration file.
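
For comparison, the external_parsers forms described in the usage comments of
the two scripts look roughly like the sketch below. These are not the
configuration lines mentioned above, and the /usr/local/bin paths are
assumptions. The MIME types have to be spelled exactly, application/pdf and
text/html; a misspelled type would presumably not match, in which case the
built-in PDF handling (acroread) stays in effect.

```
# Alternative 1: conv_doc.pl converts the PDF to text/html,
# which htdig then parses itself.
external_parsers: application/pdf->text/html /usr/local/bin/conv_doc.pl

# Alternative 2: parse_doc.pl acts as the external parser directly.
# Use only one external_parsers attribute in htdig.conf.
external_parsers: application/pdf /usr/local/bin/parse_doc.pl
```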

---
Fare Thee Well
Anthony Peacock       
CHIME, Royal Free & University College Medical School
WWW:    http://www.chime.ucl.ac.uk/~rmhiajp/
"The social dynamics of the net are a direct consequence of the fact that
nobody has yet developed a Remote Strangulation Protocol." --Larry Wall



