RE: htdig: PDF Parsing errors


Frost, Timothy E (timothy.frost@nz.eds.com)
Thu, 24 Sep 1998 09:19:26 +1200


What happens if you open that PDF file with the installed acroread on
the system where you run the dig? (I.e., is the PDF file actually
corrupt?)
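
If acroread is awkward to run by hand, a rough first check is to look at
the file itself: a PDF normally starts with a "%PDF-" header and ends
with a "%%EOF" marker, so a copy cut short in transit will usually be
missing the trailer. The sketch below is only a heuristic, not a
validator, and opening the file in acroread is still the real test. The
function name and the 1024-byte window are my own choices for
illustration.

```python
import os


def looks_like_complete_pdf(path, window=1024):
    """Heuristic check: return (has_header, has_trailer, size_in_bytes).

    has_header  -- "%PDF-" appears in the first `window` bytes
    has_trailer -- "%%EOF" appears in the last `window` bytes
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(window)
        # Jump to the last `window` bytes (or the start, for small files).
        f.seek(max(0, size - window))
        tail = f.read()
    return b"%PDF-" in head, b"%%EOF" in tail, size
```

Run it on a saved copy of the document. A result such as
(True, False, ...) suggests the file was truncated; (True, True, ...)
with the expected size points to some other cause.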

T

> -----Original Message-----
> From: Anat Rozenzon [SMTP:Anat.Rozenzon@telrad.co.il]
> Sent: Wednesday, September 23, 1998 9:26 PM
> To: Geoff Hutchison
> Cc: jaym@aztech-cs.com; htdig@sdsu.edu
> Subject: Re: htdig: PDF Parsing errors
>
> Geoff Hutchison wrote:
>
> > At 3:10 PM -0400 9/18/98, J.A. MacDonald wrote:
> > >I'm trying to index a site that has a bunch of PDFs in it. I've set the
> > >pdf_parser in the conf file to point to the installed acroread and am
> > >getting the following errors while running rundig:
> > >
> > >/tmp/htdig351.pdf: Could not repair file.
> > >PDF::parse: cannot open acroread output
> >
> > These kinds of errors are usually caused by "max_doc_size" being set too
> > low. I believe the default is 100K, since HTML files don't usually come
> > larger.
> >
> > I guess this is FAQ material now. :-)
> >
> > -Geoff Hutchison
> > Williams Students Online
> > http://wso.williams.edu/
> >
> >
> > ----------------------------------------------------------------------
> > To unsubscribe from the htdig mailing list, send a message to
> > htdig-request@sdsu.edu containing the single word "unsubscribe" in
> > the body of the message.
> >
> >
>
> I too got this error even when setting:
> max_doc_size: 50000000
>
> and the PDF is not that large (about 300K).
> What else could be wrong?
>
> TIA
>
> --
> Anat Rozenzon
>
> `,,`,,`,,`,
> API/Intranet team Tel: +972-8-9134480
> Telrad Ltd. Fax: +972-8-9133487
> P.O.B. 50, Lod, Israel Email: anat.rozenzon@telrad.co.il
> ,,`,,`,,`,,`
>
>
>



This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:27:51 PST