Re: [htdig] Identifying non-indexed URLs

Subject: Re: [htdig] Identifying non-indexed URLs
From: Geoff Hutchison
Date: Tue Mar 14 2000 - 07:50:46 PST

On Tue, 14 Mar 2000, Bigler, Tyson MT SSI wrote:

> knowing which URLs were seen but not indexed because they weren't
> "parsable". Is this easily done?

I'm not quite sure what you mean. I'm assuming you want some listing of
URLs included in <a href="..."></a> tags that are malformed?

For better or worse, the URL-parsing code doesn't reject malformed URLs.
So you should see them rejected by the normal means. Granted, I haven't
run it through every possible URL-ish input (malformed or not), so it's
possible there are bugs.
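
If what you are after is a list of suspicious-looking links, one option is to check the pages outside of ht://Dig. The following is only a rough sketch in Python and is not ht://Dig's own parsing code. The names HrefCollector, looks_malformed and suspicious_links are made up for illustration, and the "malformed" test (resolve against the page's base URL, then require an http or https scheme, a host, and a numeric port) is an assumption about what you would want flagged.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class HrefCollector(HTMLParser):
    """Collect the raw href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # value is None for a bare attribute such as <a href>
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def looks_malformed(href, base):
    """Assumed definition of 'malformed': cannot be resolved to an
    http(s) URL with a host and a numeric port."""
    try:
        parts = urlsplit(urljoin(base, href.strip()))
        parts.port  # accessing .port raises ValueError for a non-numeric port
    except ValueError:
        return True
    return parts.scheme not in ("http", "https") or not parts.hostname


def suspicious_links(html, base):
    """Return the href values in html that fail the check above."""
    collector = HrefCollector()
    collector.feed(html)
    return [h for h in collector.hrefs if looks_malformed(h, base)]
```

Calling suspicious_links(page_text, page_url) on each page gives you the links to look at by hand. Note that this check also flags non-HTTP schemes, so adjust it if those are legitimate on your site.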

Remember, if you want to take a look at every URL seen, you can set

-Geoff Hutchison
Williams Students Online

