Re: htdig: Another PDF and ht3.1.0b1 question


Anat Rozenzon (Anat.Rozenzon@telrad.co.il)
Thu, 01 Oct 1998 10:02:54 +0200


GREGORY_GUERIN@HP-FtCollins-om5.om.hp.com wrote:

> Hi Anat,
>
> I was wondering if you found the answer to your question below. I am
> having the same problem that you have described. It doesn't fail to
> parse all the pdf files but it does with a good percentage of them.
> If you have any input I'd appreciate it.
>
> Thanks Anat,
>
> Greg Guerin
> Phone: 1(970)898-6139
> Hewlett-Packard Company
> Fort Collins, CO 80528
>
> ______________________________________________________________________
> Hi all,
>
> I have a Solaris 2.6 machine on which I've managed to install
> htdig3.1.0b1.
> I set this in the conf file:
> pdf_parser: /tools/Acrobat/bin/acroread
>
>
> but I get these messages while digging:
> ... /tmp/htdig11117.pdf: Could not repair file.
> PDF::parse: cannot open acroread output
>
>
> Has anyone seen this, or does anyone know what's wrong?
> Thanks
>
>
>
> --
> Anat Rozenzon
> `o,,,,o`o,,,,o`o,,,,o`o,,,
> API/Intranet team Tel: +972-8-9134480
> Telrad Ltd. Fax: +972-8-9133487
> P.O.B. 50, Lod, Israel Email: anat.rozenzon@telrad.co.il
> o,,,,o`o,,,,o`o,,,,o`o,,,,o`
>

Hi,
We have a solution that seems to work. First, you must set the environment
variable 'TMPDIR' (with something like "setenv TMPDIR /tmp").
Then, and I think this is the main thing, we enlarged the swap space to about
3 times the size of the physical memory.
We now have:
Memory 0.5G
Swap 1.5G

It now seems to be working OK. Our PDF files are about 1-5 MB each and they
are all indexed.
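
For anyone who wants to check the same two things before digging, here is a
rough sketch in Bourne shell (not the csh syntax quoted above). It assumes
/tmp as the temporary directory, which is only an example, and it only
reports the swap situation; it does not change it. The variable name
tmpdir_status is made up for this illustration.

```shell
# Sketch only: /tmp is an assumed default; any writable directory will do.
TMPDIR="${TMPDIR:-/tmp}"
export TMPDIR

# Confirm the directory exists and is writable before starting the dig.
if [ -d "$TMPDIR" ] && [ -w "$TMPDIR" ]; then
    tmpdir_status="writable"
else
    tmpdir_status="not writable"
fi
echo "TMPDIR=$TMPDIR ($tmpdir_status)"

# On Solaris, `swap -s` prints a swap summary and `swap -l` lists swap areas.
# Guarded so the script does nothing here on systems without that command.
if command -v swap >/dev/null 2>&1; then
    swap -s || true
    swap -l || true
fi
```

Adding swap is a root-level task (on Solaris usually done with mkfile and
swap -a), so it is left out of the sketch.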
bye

--
        Anat Rozenzon

`,,`,,`,,`,
API/Intranet team            Tel: +972-8-9134480
Telrad Ltd.                  Fax: +972-8-9133487
P.O.B. 50, Lod, Israel       Email: anat.rozenzon@telrad.co.il
,,`,,`,,`,,`




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:28:27 PST