[htdig] A Perl program to make pages out of "Not found: " data


Daniel MacKay (Daniel.MacKay@Dal.Ca)
Mon, 27 Sep 1999 11:25:14 -0300


Howdy.

Below is a small Perl program that makes pages of dead links, arranged by
server, so that each of your web managers can easily find the dead links on
their machines. It reads the log produced by running htdig with the -s flag.
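
For anyone who wants to try the parsing step on its own, here is a minimal
sketch. The log line in it is made up, shaped only to match the pattern the
script looks for ("Not found: URL Ref: URL"); real htdig -s output may differ
in detail. host_key is a hypothetical helper name that does not appear in the
script itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce a referring URL to its host name,
# the same way the script derives $key.
sub host_key {
    my ($url) = @_;
    $url =~ s{^http://}{};   # drop the scheme
    $url =~ s{/.*$}{};       # drop everything after the host
    return $url;
}

# Made-up sample line, in the shape the script's regex expects.
my $line = "Not found: http://www.example.com/gone.html "
         . "Ref: http://web.example.org/links/index.html";

if ($line =~ m/^Not found:\s+(.*) Ref: (.*)$/) {
    my ($bad, $ref) = ($1, $2);
    # host, referring page, dead link: the record the script sorts on
    print join("\t", host_key($ref), $ref, $bad), "\n";
}
```

The host comes from the referring page, not from the dead link, which is why
the pages end up grouped by the server that needs fixing.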

It writes a whole bunch of files into the directory $prefix and puts $title
at the top of each page.

The one for my site is at http://noc.dal.ca/search/deadlinks/ and has about
8000 dead links on it (my site has 86,000 pages).

#!/local/bin/perl

# showdead.pl Daniel MacKay Daniel.MacKay@Dal.Ca
# 990922 DEM Scan the log from a "htdig -s" run and produce pages listing
# all the dead links for your web managers to browse.

$prefix = "/local/www/search/deadlinks/" ;
$title = "Dead links found on 990924 dig\n";

while (<>) {
   chomp;   # strip only the trailing newline; chop would eat any last character
   #print "$_|\n";
   s/\s*$//;
   if (m/^Not found:\s+(.*) Ref: (.*)$/) {
      ($bad,$ref) = ($1,$2) ;
      $key = $ref ;
      # print "$_\n" ;
      $key =~ s/^http:\/\///;
      $key =~ s/\/.*$//;
      push(@bad,"$key\t$ref\t$bad") ;
      }
   } ;

open (SERV,">$prefix/index.html") || die "can't open dead index file: $!" ;
chmod (0644,"$prefix/index.html");   # mode must be octal, not decimal 644
print SERV "<html><head><title>$title</title></head>\n";
print SERV "<body><h1 align=\"center\">$title</h1><ul>\n";

$okey = "" ;
foreach $_ (sort(@bad)) {
   ($key,$ref,$bad) = split /\t/ ;
   if ($okey ne $key) {
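      if ($okey ne "") {
         # finish the previous server's page and record it in the index;
         # skipped on the first pass, when no page is open yet
         print OUT "</ul></body></html>\n";
         close OUT ;
         print SERV "<li>$count bad links on <a href=\"bad_$okey.html\">$okey</a></li>\n" ;
         }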
      $count=0;
      print "Now writing to $key\n" ;
      open (OUT,">$prefix/bad_$key.html") ||
        die "can't open file for $key: $!" ;
      chmod (0644,"$prefix/bad_$key.html");   # octal mode
      print OUT "<html><head><title>Bad links on $key</title></head>\n";
      print OUT "<body><h1 align=\"center\">$title<br>$key</h1><ul>" ;
      $okey = $key ;
      };
   $count++;
   print OUT "<li>$bad<br><a href=\"$ref\">$ref</a></li>\n" ;
   } ;

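if ($okey ne "") {
   # finish the last server's page and record it in the index;
   # the loop only writes an index entry when the server changes
   print OUT "</ul></body></html>\n";
   close OUT;
   print SERV "<li>$count bad links on <a href=\"bad_$okey.html\">$okey</a></li>\n" ;
   }
print SERV "</ul></body></html>\n" ;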
close SERV ;
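
If you adapt this, one alternative to the sort-and-compare loop is to collect
the dead links in a hash keyed by host, so each server's count is known before
any page is written. This is only a sketch of that idea, not what the script
above does; group_by_host and the sample records are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group "host\tref\tbad" records by host.
# Returns a reference to a hash of arrays of [ref, bad] pairs.
sub group_by_host {
    my (@records) = @_;
    my %by_host;
    for my $rec (@records) {
        my ($host, $ref, $bad) = split /\t/, $rec;
        push @{ $by_host{$host} }, [$ref, $bad];
    }
    return \%by_host;
}

# Made-up records in the same tab-separated form the script builds.
my @records = (
    "a.example.org\thttp://a.example.org/x.html\thttp://gone.example.com/1",
    "b.example.org\thttp://b.example.org/y.html\thttp://gone.example.com/2",
    "a.example.org\thttp://a.example.org/z.html\thttp://gone.example.com/3",
);

my $groups = group_by_host(@records);
for my $host (sort keys %$groups) {
    my $count = scalar @{ $groups->{$host} };
    print "$count bad links on $host\n";
}
```

With the groups in hand, the index line and the per-server page can be written
in the same pass, with no need to track the previous key.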

--
Daniel.MacKay@Dal.Ca
Network Operations Centre Manager                   902 494-danm
Dalhousie University, Halifax, Nova Scotia, Canada.
