Re: htdig: indexing a site

Colin Viebrock
Wed, 09 Sep 1998 10:17:18 -0400

Also sprach Adam Crews (at 07:00 PM 9/8/98 -0700) ...
>I bill my clients by the amount of bandwidth that they use. I have one
>site that is about 40mb or so.. The indexing of the site is causing a
>large skew in the actual bandwidth that they use. The search engine is a
>"free" benefit of their site. I would like to be able to have htdig simply
>read the html pages from a starting directory and then go from there...

Not in the docs (I don't think) but doesn't htdig identify itself to the
server when requesting files?

If I look at my access logs, I see entries like:

- - [09/Sep/1998:10:20:35 -0400] "GET /downloads/app98.pdf HTTP/1.0" 200 74091
- - [09/Sep/1998:10:20:44 -0400] "GET /plab-sally.php3 HTTP/1.0" 200 1806
- - [09/Sep/1998:10:20:44 -0400] "GET /plab-ken.php3 HTTP/1.0" 200 3762

So, technically, I could exclude all the bandwidth used by the indexing host
to get a count of "real" usage.
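
As a rough sketch of that idea (not something htdig provides), a short Python
script could total the bytes in a Common Log Format access log and subtract
whatever was served to the indexing host. The host names, the sample lines,
and the function name billable_bytes are placeholders of mine, and the
pattern assumes the usual "host ident user [time] "request" status bytes"
layout; adjust it if your server logs differently.

```python
import re

# Assumed layout (Common Log Format):
#   host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def billable_bytes(lines, indexer_host):
    """Return (total, indexer, billable) byte counts for the given log lines."""
    total = 0
    indexer = 0
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not match the assumed layout
        # A "-" in the bytes field means no body was sent
        size = 0 if m.group("bytes") == "-" else int(m.group("bytes"))
        total += size
        if m.group("host") == indexer_host:
            indexer += size
    return total, indexer, total - indexer

# Made-up sample lines; the host names are hypothetical
sample = [
    'indexer.example.com - - [09/Sep/1998:10:20:35 -0400] "GET /a.pdf HTTP/1.0" 200 74091',
    'indexer.example.com - - [09/Sep/1998:10:20:44 -0400] "GET /b.php3 HTTP/1.0" 200 1806',
    'visitor.example.net - - [09/Sep/1998:10:21:02 -0400] "GET /c.php3 HTTP/1.0" 200 3762',
]

total, indexer, billable = billable_bytes(sample, "indexer.example.com")
print(total, indexer, billable)
```

To run it against a real log, pass an open file instead of the sample list,
e.g. billable_bytes(open("access_log"), "your.indexer.host").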

Colin Viebrock Creative Director Private World Communications

                                                   Your mouse has moved.
                                           Windows must be restarted for
                                              the change to take effect.

