htdig: Indexing a site-log analysis


Brian Litke (blitke@sedl.org)
Thu, 10 Sep 1998 13:21:44 -0500


Adam,

You may not have to pre-process your log files depending on the features of
your log analysis program. I use WUSAGE (http://www.boutell.com/wusage/ )
to run my site's stats, and it allows me to specify individual sites to
exclude from the totals.

For example, we want to count outside visitors hitting our pages, not
employees using a browser on our LAN to look at the pages, so I've already
got WUSAGE set to ignore hits from my server. That same server also runs
the htdig indexing program, my link checking program, and glimpse (another
search tool I use on the site, which indexes pages periodically).

So, depending on what other tools you're using on your site, you may already
have a problem with inflated hit totals. Or, if your log analysis program
has its exclusions set, it may already be ignoring hits from people and
site management tools on your server.
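
If your log analysis program has no exclusion feature, a small filter along
these lines could do the same job before the stats are run. This is only a
rough sketch, assuming the Apache combined log format described in the quoted
message below. The address 192.0.2.10, the names keep_line and filter_lines,
and the choice to match "htdig" anywhere in the user-agent field are
placeholders made up for illustration; they are not part of WUSAGE or htdig.

```python
import re

# Hypothetical values for illustration; substitute your own.
EXCLUDED_HOSTS = {"192.0.2.10"}   # e.g. the server running htdig, link checker, etc.
EXCLUDED_AGENTS = ("htdig",)      # lowercase substrings to look for in the user agent

# Apache combined log format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
# This simple pattern does not handle escaped quotes inside a field.
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \S+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def keep_line(line):
    """Return True if the log line should be counted in the stats."""
    m = COMBINED.match(line)
    if m is None:
        # Not recognizable as combined format: pass it through untouched.
        return True
    if m.group("host") in EXCLUDED_HOSTS:
        return False
    agent = m.group("agent").lower()
    return not any(s in agent for s in EXCLUDED_AGENTS)


def filter_lines(lines):
    """Yield only the lines that keep_line accepts."""
    for line in lines:
        if keep_line(line):
            yield line
```

To use it as a pre-processing step, feed it the log and write out the result,
for example with sys.stdout.writelines(filter_lines(sys.stdin)), then point
the stats program at the filtered file. Matching on the client host covers
every tool running on that machine; matching on the user agent covers htdig
even if it runs somewhere else.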

Brian

>I was hoping to not have to filter my log files... before other programs
>use them to make the stats...
>if I need to I can just grep -v out all the "htdig/3.0.8b1" entries
>from the log before I run it through other programs. I use an apache
>combined log format so I get the "browser type" on the same line as the
>requested file and byte count.
>
>Like I said, I would rather not need to pre-process my logs before making
>stats, but looks like I need to...
>
>Thanks for the input everyone.
>
>-Adam




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:27:42 PST