Subject: Re: [htdig] Going for the big dig
From: Geoff Hutchison (email@example.com)
Date: Tue Dec 19 2000 - 10:02:19 PST
On Tue, 19 Dec 2000, David Gewirtz wrote:
> on something. I attempted to index a remote site, in this case Lotus.com.
> Now, I have no idea how many pages that is. But I let the index process run
If you have no idea how many pages will be on a server, I'd start by
setting a max_hop_count or server_max_docs limit and go from there. These
attributes are meant to keep the dig from spiralling out of control (or in
this case, out of the limits of your server).
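For illustration, a conservative first pass in the htdig configuration file might look like the fragment below. The attribute names are the ones mentioned above plus start_url and limit_urls_to; the URL and the numeric values are assumptions picked only for the example, not recommendations, and should be raised once a trial dig shows how large the site really is.

```
# Sketch only: the URL and limits below are illustrative assumptions.
start_url:        http://www.lotus.com/
limit_urls_to:    ${start_url}

# Stop following links more than a few hops away from start_url.
max_hop_count:    3

# Cap the number of documents retrieved from any one server.
server_max_docs:  5000
```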
> handle it. Right now, I'm thinking the process is too big. Can htdig and/or
> htmerge running on a 258MB or 384MB machine handle indexing/merging sites
This question is a bit hard to answer. From what you said, the answer is
"no," but I can't give a better answer unless there's at least an estimate
of the number of URLs, as I mentioned earlier.
There are also simple "link checker" scripts which can give you a count of
the number of URLs on a site.
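As a rough idea of what such a script does, here is a minimal sketch in Python, not any particular existing link checker. The names count_urls, fetch_html, max_hops and max_docs are hypothetical, chosen for this example. It only follows <a href> links on the starting host, ignores robots.txt, and stops at the given caps, so treat the result as a lower-bound estimate rather than an exact count.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Fetch a page over HTTP; return "" on errors or non-HTML content."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def count_urls(start_url, max_hops=3, max_docs=1000, fetch=fetch_html):
    """Breadth-first count of same-host URLs reachable from start_url."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, hops = queue.popleft()
        if hops >= max_hops:
            continue  # the URL is counted, but its links are not followed
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc != host or link in seen:
                continue
            if len(seen) >= max_docs:
                return len(seen)  # hit the cap; the site is at least this big
            seen.add(link)
            queue.append((link, hops + 1))
    return len(seen)
```

Calling count_urls("http://www.example.com/", max_hops=2) with a small hop limit first, and then raising it, gives a feel for how quickly the URL count grows before committing to a full dig.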
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/