Subject: Re: [htdig] Suse long run: done, problem solved
From: David Robley (email@example.com)
Date: Wed Apr 26 2000 - 18:32:21 PDT
On 27 Apr, Peter L. Peres wrote:
> the machine finished ! The loop was in the java api docs. There were no
> other loops. There is no bug in htdig wrt. this problem (looping).
> Here are some stats from the end:
> 27425.60user 10781.29system 43:11:27elapsed 24%CPU (0avgtext+0avgdata
> 0inputs+0outputs (18429778major+3453038minor)pagefaults 2501532swaps
> htdig ran with a niceness 18 for the last 25% of the indexing. Load was
> 0.8 or so during this time. docdb is about 200MB. My input was about
> The loop problem was in the tree:
> which has more than 500 entries.
> System: i486/100MHz/24MB RAM 4.3+2.8 GB EIDE disks (not UDMA), headless
> (ethernet only) Suse 6.2 Linux (w. modified html documentation system - by
> me). As you can see the machine was swapping like crazy. I think I'd need
> a machine with 256MB RAM to avoid serious swapping. Not likely anytime
> thank you all for the ideas,
I think I've come across this sort of problem when trying to index a
series of documents that have a lot of internal references
(<A HREF="#target">). htdig tries to follow each of these links and ends
up going in ever-decreasing circles until....
My solution was to add something like html# to the exclude_urls list.
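
To illustrate why a pattern like html# helps: as far as I know, htdig
rejects any URL that contains one of the space-separated exclude_urls
patterns as a substring. The sketch below only mimics that rule in Python;
it is not htdig's code. The function names and sample URLs are made up for
the example, and the pattern list assumes the usual "/cgi-bin/ .cgi"
default with html# appended.

```python
# Assumed exclude_urls value: the usual default plus the "html#" entry.
EXCLUDE_PATTERNS = "/cgi-bin/ .cgi html#".split()


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if any pattern occurs anywhere in the URL (substring match)."""
    return any(pattern in url for pattern in patterns)


def filter_links(links):
    """Keep only the links that a crawler using these patterns would follow."""
    return [url for url in links if not is_excluded(url)]


# Hypothetical links extracted from one page.
links = [
    "http://localhost/docs/api/index.html",
    "http://localhost/docs/api/index.html#target",
    "http://localhost/cgi-bin/search",
]

kept = filter_links(links)
print(kept)
```

Note that a plain substring rule like this only catches fragments that
follow ".html"; a link such as page.htm#top would still get through, so
the pattern may need adjusting to match the file extensions on your site.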
--
David Robley | WEBMASTER & Mail List Admin
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet | http://auseinet.flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA