[htdig] Parsing 3.1.3 log files

Daniel MacKay (Daniel.MacKay@Dal.Ca)
Fri, 24 Sep 1999 13:09:55 -0300


One of the things I'd like to get out of ht://Dig is a list of dead pages
at my institution.

Getting the dead pages is easy; in the log they're marked "not found".
Getting their sources is a little harder, but with 3.1.2 all you had to
do was run htdig in -vv mode and remember which page was currently being
scanned: every URL logged as "pushing" underneath it "belonged" to that
page.

I have Perl code to do this and will share it with anyone once I get it
working again.
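
A minimal sketch of that bookkeeping, in Python rather than Perl, and not
the script mentioned above. It rests on two assumptions about the -vv
output: that each page being scanned starts a line shaped like the
"14:2:0:<url>: not found" sample, and that each discovered link appears on
a line containing the word "pushing" followed by the URL. Both patterns
and the name find_dead_links are hypothetical and would need adjusting
against a real log.

```python
import re
from collections import defaultdict

# ASSUMPTION: a line of three numbers, a URL, a colon and an optional
# status marks the page currently being scanned. Modelled only on the
# sample "14:2:0:<url>: not found"; the real format may differ.
STATUS_RE = re.compile(r"^(\d+):(\d+):(\d+):(\S+?):(?:\s+(.*))?$")

# ASSUMPTION: a discovered link is logged as "pushing <url>".
PUSH_RE = re.compile(r"\bpushing\s+(\S+)")


def find_dead_links(lines):
    """Map each "not found" URL to the sorted list of pages that pushed it."""
    referrers = defaultdict(set)  # pushed URL -> pages it was pushed from
    dead = []                     # "not found" URLs, in log order
    current = None                # page currently being scanned

    for raw in lines:
        line = raw.strip()
        status = STATUS_RE.match(line)
        if status:
            # A new page header: later "pushing" lines belong to it.
            current = status.group(4)
            if "not found" in (status.group(5) or ""):
                dead.append(current)
            continue
        push = PUSH_RE.search(line)
        if push and current is not None:
            referrers[push.group(1)].add(current)

    return {url: sorted(referrers.get(url, ())) for url in dead}
```

Passing an open log file (or any iterable of lines) to find_dead_links
would return a dictionary keyed by dead URL. An empty list for a URL means
no "pushing" line for it was seen anywhere in that log.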

I just brought up 3.1.3 this morning and am getting the usual number of
"not found" errors, but there are no matching "pushing" lines for them.

My guesses:
 - htdig is reading something out of existing database files?
 - 3.1.3 is not logging all of the pages that it's pushing

Also: is there any documentation for the format of the log file? What are
the three numbers at the beginning of the line, e.g.

        14:2:0:<url>: not found

Network Operations Centre Manager                   902 494-danm
Dalhousie University, Halifax, Nova Scotia, Canada.

