Re: [htdig] Pages still not all indexing.

Subject: Re: [htdig] Pages still not all indexing.
From: Torsten Neuer
Date: Tue Sep 05 2000 - 08:48:23 PDT

Karen Reardon wrote:
> If I take out the extra '0' that I added in the max_doc_size, I do not get
> the segmentation fault.... but I also do not get the entire page...
> -kmr
> > > I upped the max_doc_size to 50000000 and got this response:
> > >

There is quite a difference between 5 M and 50 M. I assume the indexer
simply tries to allocate that much memory and does not check whether
new/malloc returned NULL.
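
To illustrate the kind of check I mean, here is a minimal C++ sketch. It is
not taken from the ht://Dig sources; allocate_doc_buffer and
can_buffer_document are hypothetical names, and the sketch only assumes that
the indexer wants one buffer of max_doc_size bytes.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical helper, not from ht://Dig: request a buffer of
// max_doc_size bytes and report failure instead of crashing.
std::unique_ptr<char[]> allocate_doc_buffer(std::size_t max_doc_size) {
    // new (std::nothrow) returns nullptr on failure instead of
    // throwing std::bad_alloc, so the caller can test the result.
    char* raw = new (std::nothrow) char[max_doc_size];
    return std::unique_ptr<char[]>(raw);
}

// Returns true only if a non-empty buffer could actually be obtained.
bool can_buffer_document(std::size_t max_doc_size) {
    if (max_doc_size == 0) {
        return false;  // nothing useful can be stored
    }
    std::unique_ptr<char[]> buf = allocate_doc_buffer(max_doc_size);
    if (!buf) {
        return false;  // allocation failed; caller should report an error
    }
    // Touch both ends so the memory is really usable.
    buf[0] = '\0';
    buf[max_doc_size - 1] = '\0';
    return true;
}
```

A caller would test the returned pointer (or the bool) and print an error
rather than dereferencing a null pointer, which is what typically produces a
segmentation fault.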

Most probably, the largest page you have to index is much smaller than
the 50 M you specified as max_doc_size, and equally probably, the AIX
box the indexer runs on has trouble allocating a 50 M memory block for
the indexer.

Sadly, the server where the pages reside does not reply with a
Content-Length header, so it is hard to estimate the best value for
max_doc_size.

Perhaps you could simply increase the value in steps of 5 M until the
indexer manages to fetch the entire page. Hopefully, you won't run out
of memory on the AIX box before you get there.



InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606


This archive was generated by hypermail 2b28 : Tue Sep 05 2000 - 08:51:46 PDT