Re: htdig: 3.1.b2 -> 3.1.b3 performance degradation +

Gilles Detillieux
Mon, 21 Dec 1998 13:13:25 -0600 (CST)

According to Michael Pfennich:
> I installed 3.1.b3 yesterday on a solaris 2.5.1 machine.
> --> I also noticed a massive performance degradation!!
> it takes more than 20 times longer to search a Database with 70k
> Documents!!

That seems to agree with what Joe Jah said last Thursday. There are
a couple of possible reasons for this, both relating to the score
calculation enhancements introduced in 3.1.0b3.

First of all, there is a memory leak in htsearch, as I mentioned to
Rodger Zeisler this morning. That could chew up a lot of VM, leading
to a lot of paging/swapping. Please try this patch and see what impact
it has on performance.

--- htsearch/ Tue Dec 15 10:58:13 1998
+++ htsearch/ Mon Dec 21 10:11:07 1998
@@ -852,6 +852,7 @@
               links = 1; // It's a hack, but it helps...
             score += config.Double("backlink_factor")
               * (thisRef->DocBackLinks() / (double)links);
+            delete thisRef;
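
For readers who want to see why that one line matters, here is a minimal,
self-contained sketch of the ownership pattern involved. It is not the
htsearch source: DocRecord, FakeDocDB, lookup() and scoreMatches() are
hypothetical names, and the zero-links guard is an assumption based only on
the assignment visible in the patch context. The sketch assumes the lookup
hands back a heap-allocated copy that the caller owns.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a document record. It counts live instances
// so that a leak would show up as a non-zero count.
struct DocRecord {
    static int live;
    int backlinks;
    int links;
    DocRecord(int b, int l) : backlinks(b), links(l) { ++live; }
    ~DocRecord() { --live; }
};
int DocRecord::live = 0;

// Hypothetical database: lookup() returns a freshly allocated record
// (or a null pointer for an unknown URL) that the caller must delete.
struct FakeDocDB {
    std::map<std::string, std::pair<int, int> > rows;  // url -> (backlinks, links)
    DocRecord *lookup(const std::string &url) const {
        std::map<std::string, std::pair<int, int> >::const_iterator it = rows.find(url);
        if (it == rows.end())
            return 0;
        return new DocRecord(it->second.first, it->second.second);
    }
};

// Sums the backlink contribution over all matches, freeing each record
// as soon as it has been used.
double scoreMatches(const FakeDocDB &db, const std::vector<std::string> &urls,
                    double backlinkFactor) {
    double score = 0.0;
    for (std::size_t i = 0; i < urls.size(); ++i) {
        DocRecord *rec = db.lookup(urls[i]);
        if (!rec)
            continue;
        int links = rec->links;
        if (links == 0)   // assumed guard; the patch context shows only the assignment
            links = 1;
        assert(links != 0);
        score += backlinkFactor * (rec->backlinks / (double)links);
        delete rec;       // without this, one record leaks per match
    }
    return score;
}
```

With the delete in place, DocRecord::live is back to zero once scoreMatches()
returns; without it, the count would grow by one for every match scored.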

Secondly, even with this fix, htsearch now reads DocumentRef records from
the docdb for all matches, not just the ones shown on the current page,
so that will cause a lot more disk I/O, especially when a search has a
lot of matches! I'd be interested in hearing from those who've noticed
a substantial performance degradation. (I think Geoff would be too.)
I'm curious how much of that degradation is due to the memory leak,
corrected with the patch above, and how much is due to extra database I/O.
My own database is too small to notice any appreciable difference.
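
As a rough way to reason about the extra I/O, the sketch below models one
docdb read per record fetched and compares reading a record for every match
against reading records only for the page being displayed. The cost model,
ReadCounts and estimateReads() are assumptions made for illustration, not
measurements or htsearch code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical cost model: one docdb read per record fetched.
struct ReadCounts {
    std::size_t scoreAll;  // every match needs its record for scoring
    std::size_t pageOnly;  // only the matches on the displayed page need records
};

ReadCounts estimateReads(std::size_t matches, std::size_t perPage) {
    assert(perPage > 0);
    ReadCounts r;
    r.scoreAll = matches;
    r.pageOnly = std::min(matches, perPage);
    return r;
}
```

Under this model, a search with 5000 matches shown 10 per page needs 5000
reads when every match is scored from its record, against 10 when only the
current page is fetched, so the gap widens with the number of matches.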

If the degradation is still unacceptable, even with the patch, I've been
toying with a small change that would avoid the extra I/O when you set
date_factor and backlink_factor to 0. You'd lose the scoring enhancements
that way, but you'd also lose the performance hit. If it's worth it,
I'll post a patch - but let me know soon because I'm off on holidays
from Wednesday until the new year.
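
The idea of skipping the docdb when both factors are zero could look
something like the sketch below. This is not the patch mentioned above,
which had not been posted; DocInfo, CountingDocDB, Match and rescore() are
hypothetical names, and the read counter exists only to make the saving
visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-document data that would come from the docdb.
struct DocInfo {
    double dateScore;
    double backlinkScore;
};

// Hypothetical docdb wrapper that counts how many records were read.
class CountingDocDB {
public:
    explicit CountingDocDB(const std::vector<DocInfo> &docs)
        : docs_(docs), reads_(0) {}
    const DocInfo &fetch(std::size_t id) {
        assert(id < docs_.size());
        ++reads_;
        return docs_[id];
    }
    std::size_t reads() const { return reads_; }
private:
    std::vector<DocInfo> docs_;
    std::size_t reads_;
};

struct Match {
    std::size_t id;
    double score;  // starts out as the word-based score
};

// Adds the date and backlink terms, but touches the docdb only when at
// least one of the two factors is non-zero.
void rescore(std::vector<Match> &matches, CountingDocDB &db,
             double dateFactor, double backlinkFactor) {
    if (dateFactor == 0.0 && backlinkFactor == 0.0)
        return;  // nothing in the record can change the score, so skip the reads
    for (std::size_t i = 0; i < matches.size(); ++i) {
        const DocInfo &info = db.fetch(matches[i].id);
        matches[i].score += dateFactor * info.dateScore
                          + backlinkFactor * info.backlinkScore;
    }
}
```

With both factors at 0 the function returns before any fetch, so the scores
stay purely word-based and the read counter stays at zero.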

By the way, here's the related memory leak patch for htnotify:

--- htnotify/ Tue Dec 15 10:58:13 1998
+++ htnotify/ Mon Dec 21 13:11:03 1998
@@ -138,6 +138,7 @@
         ref = docdb[str->get()];
+        delete ref;
     delete docs;
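
As an aside for anyone applying the same idea in newer code: a different
technique from the manual delete in the patch is to let a smart pointer own
each record, so it is freed at the end of every loop iteration. The sketch
below uses std::unique_ptr, which requires C++11 and so is not an option for
the code being patched here; Ref, lookup() and visitAll() are hypothetical
names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical record type that counts live instances.
struct Ref {
    static int live;
    std::string url;
    explicit Ref(const std::string &u) : url(u) { ++live; }
    ~Ref() { --live; }
};
int Ref::live = 0;

// Hypothetical lookup that returns an owning pointer to a new record.
std::unique_ptr<Ref> lookup(const std::string &url) {
    return std::unique_ptr<Ref>(new Ref(url));
}

// Visits every URL and counts the records with a non-empty URL.
std::size_t visitAll(const std::vector<std::string> &urls) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i < urls.size(); ++i) {
        std::unique_ptr<Ref> ref = lookup(urls[i]);
        assert(ref != nullptr);
        if (!ref->url.empty())
            ++visited;
    }  // ref goes out of scope here, which deletes the record
    return visited;
}
```

No explicit delete appears in the loop, yet Ref::live is zero again after
visitAll() returns.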

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:29:55 PST