Re: [htdig] Going for the big dig

Subject: Re: [htdig] Going for the big dig
From: Geoff Hutchison
Date: Tue Dec 19 2000 - 10:02:19 PST

On Tue, 19 Dec 2000, David Gewirtz wrote:

> on something. I attempted to index a remote site, in this case
> Now, I have no idea how many pages that is. But I let the index process run

If you have no idea how many pages will be on a server, I'd start by
setting a max_hop_count or server_max_docs limit and go from there. These
attributes are meant to keep the dig from spiralling out of control (or,
in this case, beyond the limits of your server).
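To make that concrete, a first dig at an unknown site might use a config
fragment along these lines in your htdig.conf (the exact values here are
just illustrative starting points, not recommendations):

```
# Stop following links more than a few hops from the start_url
max_hop_count: 4

# Cap the number of documents retrieved from any one server
server_max_docs: 1000
```

Once you see how many documents that pulls in, you can raise the limits
in steps rather than letting an unbounded dig fill your disk.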


> handle it. Right now, I'm thinking the process is too big. Can htdig and/or
> htmerge running on a 258MB or 384MB machine handle indexing/merging sites

This question is a bit hard to answer. From what you've said, the answer
is "no," but I can't give a better answer without at least an estimate of
the number of URLs, as I mentioned earlier.

There are also simple "link checker" scripts which can give you a count
of the URLs on a site.
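The core of such a script is just counting distinct href targets in each
page's HTML. A minimal sketch of that step in Python (my own illustration,
not part of any particular link-checker tool; a full checker would also
fetch pages and recurse):

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Collect distinct href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)

def count_links(html):
    """Return the number of distinct URLs linked from one page."""
    parser = LinkCounter()
    parser.feed(html)
    return len(parser.links)

page = '<a href="/a.html">A</a> <a href="/b.html">B</a> <a href="/a.html">A again</a>'
print(count_links(page))
```

Summing this over a crawl gives a rough URL count you can use to size
max_hop_count and server_max_docs before a full dig.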

-Geoff Hutchison
Williams Students Online


This archive was generated by hypermail 2b28 : Tue Dec 19 2000 - 10:13:05 PST