Re: [htdig] scalability of htdig

Barry Zubel
Tue, 16 Mar 1999 09:50:06 -0000

I have a site that currently has, uhh, (quick peek) 90k documents on it
(which increases by up to 1000 a day), and htdig is admirable. I run an update
every hour (admittedly the site is on a remote machine: the search machine
is a dedicated PII-400 with 256MB RAM) and it loves it. I've never had it
crash yet.

If I ever start having problems with volumes (unlikely, plenty of HDD space
and memory) I know that I'll get a quick response from the htdig developers:
that's what it's all about.

Barry Zubel
Technical Manager
City Mutual Ltd

-----Original Message-----
From: Philip Jenkins <>
To: <>
Date: Tuesday, March 16, 1999 6:11 AM
Subject: [htdig] scalability of htdig

>I was wondering if you could answer a couple of questions for me.
>I am looking for a search engine for a site that I am putting up that
>will cover a specific subject and I want the search engine to search
>certain sites that are either submitted or linked from related sites. I
>will be indexing about 1,000 to 3,000 remote sites, and probably around
>to 25,000 documents. I have been looking at SWISH-E, SWISH++ and
>ht://dig search engines. I have been more impressed by far by ht://dig than
>any others that I have seen, and I am trying to stay with GNU software.
>I noticed that some sites that use ht://dig have
>over 5,000 items indexed. I was wondering if you could tell me how well it
>scales to larger sites, and whether you think the engine could handle as
>many documents as I need. Does ht://dig handle both indexes like Yahoo and
>normal searching?
>How well does it crawl sites to index them, does it crash on large
>One last question: I wanted to add a link to
>have people submit their own sites; if I do this, does ht://dig
>automatically index them?
>Thank you for your time.
>To unsubscribe from the htdig mailing list, send a message to
> containing the single word "unsubscribe" in
>the SUBJECT of the message.


This archive was generated by hypermail 2.0b3 on Wed Mar 17 1999 - 10:05:13 PST