Re: [htdig] htdig 3.2b2 performance

Subject: Re: [htdig] htdig 3.2b2 performance
From: Marcel Hicking
Date: Thu Jun 15 2000 - 04:17:35 PDT

Is there any chance of having htsearch run reniced
to a lower priority? I quite often have 5 or more
htsearch processes running at the same time, and
since the machine is already very busy with about
200-300 HTTP and FTP connections, this pushes
the load well over the limit.
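
To illustrate the kind of thing I mean, here is a minimal sketch of a
wrapper that starts a program at reduced priority. It is only an
illustration, not something ht://Dig ships: the wrapper is written in
Python, the HTSEARCH path and the names run_niced and main are
hypothetical, and it assumes a Unix-like system. Note that a positive
nice increment lowers the priority; a negative one raises it and
normally requires root.

```python
import os
import subprocess
import sys

# Hypothetical location of the real htsearch binary; adjust for your install.
HTSEARCH = "/usr/local/apache/cgi-bin/htsearch.real"


def run_niced(argv, increment=10, **run_kwargs):
    """Run argv as a child process with its nice value raised by `increment`.

    Only the child is reniced; the calling process keeps its priority.
    Extra keyword arguments are passed through to subprocess.run().
    """
    def lower_priority():
        # Runs in the child between fork() and exec().
        os.nice(increment)

    return subprocess.run(argv, preexec_fn=lower_priority, **run_kwargs)


def main():
    # The child inherits stdin, stdout and the CGI environment unchanged.
    result = run_niced([HTSEARCH] + sys.argv[1:])
    return result.returncode
```

Installed in place of the CGI program, with a call to main() at the
bottom, this would let the web and FTP servers win the scheduler's
attention over a burst of searches. It does not reduce the total work
each search does.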


On 12 Jun 00, at 10:04, Geoff Hutchison wrote:

> At 10:38 AM -0700 6/11/00, Ravindra Wankar wrote:
> >Phrase match seems very very slow (as compared to "all words" and "any
> >words").
> Strange. I notice a small slowdown, but not much.
> >Also, when running htdig, initially htdig takes up 97-98% of CPU time.
> >Memory usage is high but I don't see swapping. After a while the cpu
> >usage drops to around 40%. Mem is still fine.
> Yes, the word database code still needs some optimization. Profiling
> the code has shown that this is the major bottleneck. If you fiddle
> with the cache size, performance improves, but it's silly to cache
> the whole database. ;-)
> >Similarly when htsearch is run I see almost 90-95% CPU usage. What
> >happens if there are 10 simultaneous searches?
> Right, but you see high CPU usage when you run htsearch in previous
> versions too. Basically all of the programs are designed to run with
> as much CPU as you give them... When I actually finish rewriting the
> htsearch backend, it will be possible to cache search results
> and intermediate results (i.e. part of a query). You *could* do it
> now, but the code would be a total mess.
> >Would moving to MySQL help? I don't see a patch for 3.2 versions.
> Not really. A SQL database might help speed up the document indexes
> slightly, but the word database in SQL would be massive. So you may
> or may not have a performance increase for the word database, but I'm
> very confident you'd have a much bigger database.
> >Does anyone know what is/are the bottlenecks? Disk/Mem/CPU? e.g. given
> >the above configuration, what can be changed to speed things up?
> You will get better disk performance if you use a SCSI disk. This is
> a significant bottleneck however you cut it and will probably remain
> one. The fewer times you need to hit the disk, the better.
> --
> -Geoff Hutchison
> Williams Students Online
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> You will receive a message to confirm this.
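
Geoff's remark about caching search results and intermediate results
(part of a query) can be sketched roughly as follows. This is a toy
illustration, not ht://Dig code: the in-memory INDEX, the function
names and the LOOKUPS counter are all made up for the example, and it
assumes a long-running search process, whereas htsearch as a CGI
program starts fresh for every request and would need a cache that
outlives the process.

```python
from functools import lru_cache

# Hypothetical toy index: word -> ids of documents containing it.
INDEX = {
    "htdig": {1, 2, 3},
    "performance": {2, 3, 4},
    "cache": {3, 5},
}

# Records which words actually reached the (pretend) word database.
LOOKUPS = []


@lru_cache(maxsize=1024)
def docs_for_word(word):
    """Intermediate result: the set of documents containing one word."""
    LOOKUPS.append(word)
    return frozenset(INDEX.get(word, ()))


@lru_cache(maxsize=256)
def search_all(words):
    """Full result for an "all words" query; `words` is a sorted tuple."""
    sets = [docs_for_word(w) for w in words]
    return frozenset.intersection(*sets) if sets else frozenset()


def search(query):
    # Normalise the query so equivalent searches share one cache entry.
    return search_all(tuple(sorted(set(query.lower().split()))))
```

A repeated query is answered from the outer cache, and a new query that
shares a word with an earlier one reuses that word's cached document
set instead of hitting the word database again.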

This archive was generated by hypermail 2b28 : Thu Jun 15 2000 - 02:08:07 PDT