Re: htdig: indexing a site


Adam Crews (doo@divaweb.com)
Wed, 9 Sep 1998 09:45:39 -0700 (PDT)


I was hoping not to have to filter my log files before other programs
use them to generate the stats.
If I need to, I can just grep -v out all the "htdig/3.0.8b1" entries
from the log before I run it through other programs. I use the Apache
combined log format, so I get the "browser type" (user agent) on the
same line as the requested file and byte count.
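
As a rough sketch of that filtering step (not anything htdig or Apache
ships), the Python below adds up the byte counts in a combined-format
log while skipping entries whose user agent contains "htdig". The
function name billable_bytes, the regular expression, and the
robot_marker parameter are illustrative assumptions; the regex also
assumes no embedded quotes inside the request, referer, or agent fields.

```python
import re

# Apache combined log format (assumed layout):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def billable_bytes(lines, robot_marker="htdig"):
    """Sum byte counts for entries whose user agent lacks robot_marker.

    Hypothetical helper: lines that do not parse as combined format
    are skipped rather than counted.
    """
    total = 0
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # not a combined-format entry
        if robot_marker.lower() in match.group("agent").lower():
            continue  # request made by the indexer, do not bill it
        if match.group("bytes") != "-":  # "-" means no body was sent
            total += int(match.group("bytes"))
    return total
```

Calling billable_bytes(open("access_log")) on a log file (the file name
here is only a placeholder) would give the byte total with the indexer's
requests left out, so the log itself never has to be rewritten before
the other stats programs read it.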

Like I said, I would rather not need to pre-process my logs before making
stats, but it looks like I need to.

Thanks for the input everyone.

-Adam

On Wed, 9 Sep 1998, Colin Viebrock wrote:

 | Also sprach Adam Crews (at 07:00 PM 9/8/98 -0700) ...
 | >I bill my clients by the amount of bandwidth that they use. I have one
 | >site that is about 40 MB or so. The indexing of the site is causing a
 | >large skew in the actual bandwidth that they use. The search engine is a
 | >"free" benefit of their site. I would like to be able to have htdig simply
 | >read the HTML pages from a starting directory and then go from there...
 |
 | Not in the docs (I don't think) but doesn't htdig identify itself to the
 | server when requesting files?
 |
 | If I look at my access logs for www.summerworks.on.ca, I see entries like:
 |
 | www.summerworks.on.ca - - [09/Sep/1998:10:20:35 -0400] "GET /downloads/app98.pdf HTTP/1.0" 200 74091
 | www.summerworks.on.ca - - [09/Sep/1998:10:20:44 -0400] "GET /plab-sally.php3 HTTP/1.0" 200 1806
 | www.summerworks.on.ca - - [09/Sep/1998:10:20:44 -0400] "GET /plab-ken.php3 HTTP/1.0" 200 3762
 |
 | So, technically, I could exclude all the bandwidth used by
 | www.summerworks.on.ca to get a count of "real" usage.
 |
 |
 | ________________________________________________________________________
 | Colin Viebrock Creative Director
 | cmv@privateworld.com Private World Communications
 | http://www.privateworld.com
 |
 | Your mouse has moved.
 | Windows must be restarted for
 | the change to take effect.
 |
 | ----------------------------------------------------------------------
 | To unsubscribe from the htdig mailing list, send a message to
 | htdig-request@sdsu.edu containing the single word "unsubscribe" in
 | the body of the message.
 |




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:27:47 PST