Subject: Re: [htdig] Identifying non-indexed URLs
From: Geoff Hutchison (firstname.lastname@example.org)
Date: Tue Mar 14 2000 - 07:50:46 PST
On Tue, 14 Mar 2000, Bigler, Tyson MT SSI wrote:
> knowing which URLs were seen but not indexed because they weren't
> "parsable". Is this easily done?
I'm not quite sure what you mean. I assume you want a listing of the
malformed URLs found in <a href="..."></a> tags?
For better or worse, the URL-parsing code doesn't reject malformed URLs.
So you should see them rejected by the normal means. Granted, I haven't
run it through every possible URL-ish input (malformed or not), so it's
possible there are bugs.
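If the goal is just a rough list of suspicious links, one option is to check the pages outside of ht://Dig. The following is a minimal sketch in Python, not ht://Dig code: the names HrefCollector, looks_malformed and malformed_hrefs are made up for illustration, and the test for "malformed" (whitespace in the URL, or an http/https URL with no host) is only an assumption about what you might want to flag.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect the raw href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def looks_malformed(href):
    """Hypothetical check; adjust to whatever 'malformed' means for you."""
    # Unescaped whitespace is not allowed inside a URL.
    if any(ch.isspace() for ch in href):
        return True
    try:
        parts = urlsplit(href)
    except ValueError:
        # urlsplit raises ValueError for some inputs, e.g. a bad IPv6 host.
        return True
    # An absolute http(s) URL needs a host part.
    if parts.scheme in ("http", "https") and not parts.netloc:
        return True
    return False


def malformed_hrefs(html):
    """Return the href values in html that fail the check above."""
    collector = HrefCollector()
    collector.feed(html)
    return [h for h in collector.hrefs if looks_malformed(h)]
```

Relative links such as "/relative" pass this check on purpose, since they are valid once resolved against the page's own URL.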
Remember, if you want to take a look at every URL seen, you can set
Williams Students Online