Re: [htdig] strange problems with files to be parsed externally

Subject: Re: [htdig] strange problems with files to be parsed externally
From: Gergely Madarasz
Date: Tue Feb 29 2000 - 09:29:26 PST

On Tue, 29 Feb 2000, Gilles Detillieux wrote:

> According to Gergely Madarasz:
> > I've just got a bugreport from the debian BTS saying that indexing pdf
> > files causes htdig to hang. First I thought it might be the bit modified
> > which is included in the .deb package, but it seems it is
> > not. The following happens: htdig only downloads the first 200000 bytes of
> > the pdf:
> > -rw-r--r-- 1 root root 200000 Feb 29 18:12 /tmp/htdext.28167
> > Of course the parser can't handle this, since the file is originally much
> > larger and the parser expects additional data. What might cause this?

Yeah, I already realized that the max_doc_size limit is the cause, so I'll
comment out the default external_parsers line in the htdig .deb package.
Anyone who wants to index pdf files should then both uncomment that line
and raise the max_doc_size value.
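
As a rough sketch, the two htdig.conf settings involved would look something
like the fragment below. The parser path and the 5000000 byte limit are only
assumptions for illustration, not the values shipped in the package; pick a
limit larger than the biggest pdf you expect to index.

```
# Hypothetical values: raise the limit so whole pdf files are fetched
# instead of being cut off at the current 200000 bytes.
max_doc_size: 5000000

# Content type followed by the program that parses it.
# The script path here is an assumption; use wherever your parser is installed.
external_parsers: application/pdf /usr/share/htdig/parse_doc.pl
```

With only the external_parsers line enabled and the limit left unchanged, any
pdf larger than the limit is handed to the parser truncated, which is the
hang described in the bug report.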

Madarasz Gergely  
     It's practically impossible to look at a penguin and feel angry.
         Egy pingvinre gyakorlatilag lehetetlen haragosan nezni.


This archive was generated by hypermail 2b28 : Tue Feb 29 2000 - 09:33:40 PST