Re: htdig: PDF Parsing errors


Anat Rozenzon (Anat.Rozenzon@telrad.co.il)
Thu, 24 Sep 1998 08:41:48 +0200


"Frost, Timothy E" wrote:

> What happens if you attempt to access that PDF file using the installed
> acroread on the system that you run the dig on? (I.e. is the pdf file
> actually corrupt?).

No, it opens without any problems. This happens with all the PDF
files.

>
>
> T
>
> > -----Original Message-----
> > From: Anat Rozenzon [SMTP:Anat.Rozenzon@telrad.co.il]
> > Sent: Wednesday, September 23, 1998 9:26 PM
> > To: Geoff Hutchison
> > Cc: jaym@aztech-cs.com; htdig@sdsu.edu
> > Subject: Re: htdig: PDF Parsing errors
> >
> > Geoff Hutchison wrote:
> >
> > > At 3:10 PM -0400 9/18/98, J.A. MacDonald wrote:
> > > >I'm trying to index a site that has a bunch of pdf's in it. I've set the
> > > >pdf_parser in the conf file to point to the installed acroread and am
> > > >getting the following errors while running rundig:
> > > >
> > > >/tmp/htdig351.pdf: Could not repair file.
> > > >PDF::parse: cannot open acroread output
> > >
> > > These kinds of errors are usually caused by the "max_doc_size" being set
> > > too low. I believe the default is 100K since HTML files don't usually come
> > > larger.
> > >
> > > I guess this is FAQ material now. :-)
> > >
> > > -Geoff Hutchison
> > > Williams Students Online
> > > http://wso.williams.edu/
> > >
> > >
> > > ----------------------------------------------------------------------
> > > To unsubscribe from the htdig mailing list, send a message to
> > > htdig-request@sdsu.edu containing the single word "unsubscribe" in
> > > the body of the message.
> > >
> > >
> >
> > I too got this error, even when setting:
> > max_doc_size: 50000000
> >
> > and the PDF is not that large, about 300k.
> > What else could be wrong?
> >
> > TIA
> >
> > --
> > Anat Rozenzon
> >
> > `,,`,,`,,`,
> > API/Intranet team Tel: +972-8-9134480
> > Telrad Ltd. Fax: +972-8-9133487
> > P.O.B. 50, Lod, Israel Email: anat.rozenzon@telrad.co.il
> > ,,`,,`,,`,,`
> >

--
        Anat Rozenzon

`,,`,,`,,`,
API/Intranet team Tel: +972-8-9134480
Telrad Ltd. Fax: +972-8-9133487
P.O.B. 50, Lod, Israel Email: anat.rozenzon@telrad.co.il
,,`,,`,,`,,`




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:27:51 PST