htdig: Re: 3.1.b2 -> 3.1.b3 performance degradation +

Geoff Hutchison
Thu, 17 Dec 1998 14:12:17 -0500

At 1:45 PM -0500 12/17/98, Joe R. Jah wrote:
>1. It takes considerably longer to search ( 10 to 20 times) than
> 3.1.b2

I'm not surprised that it's slower to search, but a factor of 10 to 20 seems
pretty big. Perhaps inefficient memory usage in the new search code is pushing
your searches into virtual memory (swap). I haven't noticed a big speed hit
myself. The discussion about sorting results exposed some really poor use of
memory in 3.1.0b3, which will be fixed.
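
One way to check whether memory is the culprit is to time a search and look at
its peak memory use under both versions. What follows is only a rough sketch,
not part of ht://Dig: the helper name `measure` is made up for illustration,
it relies on Python's Unix-only `resource` module, and you would have to supply
the actual search command line for your own installation.

```python
import resource
import subprocess
import time


def measure(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, peak_rss).

    peak_rss comes from ru_maxrss for waited-for child processes. It is the
    largest value seen among all children run so far by this process, and its
    unit is platform dependent (kilobytes on Linux, bytes on macOS).
    """
    start = time.perf_counter()
    # Output is discarded; only timing and memory are of interest here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_rss
```

Running the same query through each version in a separate run of the script
and comparing the two pairs of numbers should show whether the slow version
also has a much larger memory footprint. If peak memory is close to the
machine's physical RAM, swapping is a plausible explanation.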

>2. Many of the pages present in 3.1.b2 results, are absent in
> 3.1.b3 results.

I'm guessing that they're *there*, but the ranking has changed. If you look at
the release notes, you'll see the ranking changes, which I expect to generally
produce more accurate results. But the new ranking seems to pick a few pages
that it feels are really good and give them significantly higher rankings.
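
To tell "missing" apart from "moved", you could save the full list of result
URLs for the same query from each version and compare them. Here is a small
sketch of that comparison. The function name `compare_results` and the idea of
feeding it two plain lists of URLs are assumptions for illustration; how you
extract the URLs from your result pages is up to you.

```python
def compare_results(old, new):
    """Compare two ranked lists of result URLs (best match first).

    Returns (missing, moved):
      missing - URLs in old that do not appear in new at all
      moved   - dict mapping URL -> (old_rank, new_rank) for URLs present in
                both lists at different 1-based positions
    Assumes each URL appears at most once per list.
    """
    new_rank = {url: rank for rank, url in enumerate(new, start=1)}
    missing = [url for url in old if url not in new_rank]
    moved = {
        url: (rank, new_rank[url])
        for rank, url in enumerate(old, start=1)
        if url in new_rank and new_rank[url] != rank
    }
    return missing, moved
```

If `missing` is empty once you compare the complete result sets rather than
just the first page, the pages are still indexed and only their ranking has
changed.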

>3. I can not explain the size changes of the db.wordlist and db.words.db
> files.

The db.wordlist file *should* be a little smaller in 3.1.0b3. Thanks to a
patch by Didier Gautheron, we remove repetitive information in that file.
However, this version records the text of links pointing to a page, so
there are usually more word entries. So the db.words.db file, which can't
benefit from the patch, is likely to be larger.

In short, this is to be expected.

-Geoff Hutchison
Williams Students Online


This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:29:53 PST