Re: [htdig] Pages still not all indexing.


Subject: Re: [htdig] Pages still not all indexing.
From: Torsten Neuer (tneuer@inwise.de)
Date: Tue Sep 05 2000 - 08:48:23 PDT


Karen Reardon wrote:
>
> If I take out the extra '0' that I added in the max_doc_size, I do not get
> the segmentation fault.... but I also do not get the entire page...
>
> -kmr
>
> > > I upped the max_doc_size to 50000000 and got this response:
> > >

There is quite a difference between 5 M and 50 M. I assume the indexer
simply tries to allocate the memory and does not check whether the
new/malloc call failed, which would explain the segmentation fault.
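
To illustrate the kind of check I mean, here is a minimal sketch in C. It
is not taken from the ht://Dig sources; the function name alloc_doc_buffer
and the error message are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not ht://Dig code: allocate a buffer for one
 * document and report failure instead of handing back an unusable
 * pointer that would later be dereferenced. */
char *alloc_doc_buffer(size_t max_doc_size)
{
    char *buf;

    /* Refuse a size for which "+ 1" below would wrap around to 0. */
    if (max_doc_size == (size_t)-1)
        return NULL;

    buf = malloc(max_doc_size + 1);          /* +1 for a terminating NUL */
    if (buf == NULL) {
        fprintf(stderr, "cannot allocate %lu bytes for document buffer\n",
                (unsigned long)(max_doc_size + 1));
        return NULL;                         /* caller must handle this */
    }
    buf[0] = '\0';
    return buf;
}
```

A caller would test the result against NULL and skip the document (or
abort cleanly) rather than writing into the buffer unconditionally.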

Most probably, the largest page you have to index is much smaller than
the 50 M you specified as max_doc_size, and the AIX box the indexer
runs on most probably has trouble allocating a 50 M memory block for
the indexer.

Sadly, the server where the pages reside does not reply with a
Content-Length header, so it is hard to estimate the best value for
max_doc_size.
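
One way around the missing header is to measure the page yourself. The
following is only a sketch, assuming you have already saved a copy of the
largest page to a file by some other means; count_bytes is a name I made
up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes on a stream up to EOF, e.g. a
 * saved copy of the page, to get a lower bound for max_doc_size. */
unsigned long count_bytes(FILE *in)
{
    char chunk[8192];
    unsigned long total = 0;
    size_t n;

    /* fread returns 0 at EOF or on error, which ends the loop. */
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0)
        total += (unsigned long)n;
    return total;
}
```

Opening the saved file with fopen() in binary mode and passing it to
count_bytes() gives the size in bytes; max_doc_size would need to be at
least that large.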

Perhaps you could simply increase the value in steps of 5 M until the
indexer manages to fetch the entire page. Hopefully, you won't run out
of memory on the AIX box before that happens.
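
The stepping itself is plain arithmetic; a small sketch of it in C, with
a made-up function name and an upper limit that you would pick yourself:

```c
#include <stdio.h>

/* Hypothetical helper: fill out[] with candidate max_doc_size values,
 * beginning at start and adding step each time, never exceeding limit
 * and never writing more than out_cap entries.
 * Returns the number of values written. */
size_t doc_size_steps(unsigned long start, unsigned long step,
                      unsigned long limit, unsigned long *out,
                      size_t out_cap)
{
    size_t n = 0;
    unsigned long v;

    if (step == 0)
        return 0;                    /* a zero step would never advance */

    for (v = start; v <= limit && n < out_cap; v += step) {
        out[n++] = v;
        if (limit - v < step)        /* next value would pass the limit */
            break;
    }
    return n;
}
```

With start and step both 5000000 and a limit of 50000000, the candidates
are 5000000, 10000000, 15000000 and so on; you would try each one as
max_doc_size in turn and stop at the first that fetches the whole page.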

hth,

  Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606
E-Mail: info@inwise.de            Internet: http://www.inwise.de




This archive was generated by hypermail 2b28 : Tue Sep 05 2000 - 08:51:46 PDT