Subject: Re: [htdig] Re: 3.1.5 strange freeze problem (fwd)
From: Peter L. Peres (firstname.lastname@example.org)
Date: Fri Oct 13 2000 - 23:48:17 PDT
On Mon, 9 Oct 2000, Andrew Scherpbier wrote:
>"Peter L. Peres" wrote:
>Does the htdig process get very big?
I have re-run the application overnight and have come to the following
conclusions:
1. The application swaps heavily while 'frozen', but the swapping is
confined to a very small area, so it takes place mostly in the drive's
hardware cache (the disk light is mostly off). This is also very fast
(66 MB/sec measured from disk cache to CPU; the cache is 2 MB, I think).
Swap usage is about 8 MB for this application, and its memory footprint
is 15 MB + 4 MB shared. System responsiveness stays good while this
happens (CPU load of about 1.3).
2. The problem occurs after htdig indexes some very large tabulated data
files (which it should not be indexing). Note: after, not during. These
are 15 MB text files containing up to 20 floating point numbers per
line. I will remove these from the search and re-run htdig. Since I have
index_numbers turned on, it is just possible that htdig is trying to
sort through the millions of numbers. This is still to be checked.
3. I will run the htdig again without the offending data sets in the input
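
For anyone who wants to locate data files like the ones in item 2 before
excluding them, here is a rough sketch in Python. It is not part of htdig;
the function names, the 1 MB size cutoff, the 200-line sample and the 90%
"mostly numbers" threshold are all assumptions chosen for illustration.

```python
import os
import re
import sys

# Matches a plain or scientific-notation number such as 12, -3.5 or 1.0e-4.
NUMBER_RE = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


def looks_numeric(path, sample_lines=200, threshold=0.9):
    """Return True if most whitespace-separated tokens in the first
    sample_lines lines of the file are numbers (assumed heuristic)."""
    total = numeric = 0
    with open(path, "r", errors="replace") as handle:
        for i, line in enumerate(handle):
            if i >= sample_lines:
                break
            for token in line.split():
                total += 1
                if NUMBER_RE.match(token):
                    numeric += 1
    return total > 0 and numeric / total >= threshold


def find_large_numeric_files(root, min_bytes=1_000_000):
    """Yield (path, size) for files of at least min_bytes that look like
    tables of numbers."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if size >= min_bytes and looks_numeric(path):
                yield path, size


if __name__ == "__main__":
    # Usage: python find_numeric.py /path/to/document/root
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in sorted(find_large_numeric_files(root)):
        print(f"{size:>12}  {path}")
```

The paths it prints are only candidates; they would still have to be
checked by hand and then kept out of the index by whatever exclusion
mechanism the local htdig configuration uses.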
>It would probably help to run htdig with a single '-v' option so that you can
>see what page it is working on when it stops. (Make sure you either redirect
>the output or run htdig from within 'script' because there will be a single
>line per web page and it sounds like you have a lot of pages...)
I did run htdig from a rundig script under nohup as usual, with one -v.
For reference, the resulting nohup.out is only 1.4 MB.
>If all else fails, you could also run indexes on smaller chunks of your tree
>and merge them together to get the full index.
I have yet to try this. Fortunately I run it on a spare set of database
files so I can still use the search feature with the old databases...