Geoff Hutchison (firstname.lastname@example.org)
Fri, 24 Sep 1999 11:25:58 -0500
>Getting the dead pages is easy; in the log they're marked "not found".
>Getting their sources is a little harder, but with V3.1.2 all you had to
Actually, it's a *lot* easier than this. Run htdig with the -s flag. At
the end of the dig, it will print the broken URLs and their referers.
There's even a contributed script in the archive that will help you do
various things with the list.
>Also: is there any documentation for the format of the log file? what are
>the three numbers at the beginning of the line, e.g.
> 14:2:0:<url>: not found
In order, they are: Index #, DocID, Hopcount
where Index # is incremented every step during that indexing run,
DocID is the internal database ID #, and hopcount is the number of
hops from the start_url.
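
If you want to pull those fields apart yourself, something like the following Python sketch might do. It is not part of ht://Dig; it only assumes the line shape quoted above (three colon-separated numbers, the URL, then ": " and a status such as "not found"). The names LogEntry and parse_log_line and the example.com URL are made up for illustration, and other kinds of log lines are simply skipped.

```python
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    index: int      # incremented every step during the indexing run
    doc_id: int     # internal database ID #
    hopcount: int   # number of hops from the start_url
    url: str
    status: str     # trailing text, e.g. "not found"


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Parse 'Index:DocID:Hopcount:<url>: status'; return None if it doesn't fit."""
    # Split off only the first three fields, because the URL itself
    # contains colons (http://...).
    parts = line.strip().split(":", 3)
    if len(parts) != 4:
        return None
    index, doc_id, hopcount, rest = parts
    # Assumption: the status is whatever follows the last ": " on the line.
    url, sep, status = rest.rpartition(": ")
    if not sep:
        return None
    try:
        return LogEntry(int(index), int(doc_id), int(hopcount), url, status)
    except ValueError:
        # One of the three leading fields was not a number.
        return None


if __name__ == "__main__":
    # Hypothetical log line in the format quoted above.
    entry = parse_log_line(" 14:2:0:http://www.example.com/missing.html: not found")
    if entry is not None and entry.status == "not found":
        print(entry.url, "hops:", entry.hopcount)
```

Filtering on status == "not found" gives you the dead pages from the log, but not their referers; for those the -s output is still the easier route.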
Williams Students Online