Re: [htdig] relative URL retrieval infinite recursive loop


Subject: Re: [htdig] relative URL retrieval infinite recursive loop
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Tue Jan 04 2000 - 14:59:41 PST


At 6:31 AM -0600 12/28/99, Glenn Nielsen wrote:
>PROBLEM
>-------
>
>The following is a valid URL for a document...
>
><a href="/parent/parent.html/index.html">Parent Page</a>
>
>where "/parent/parent.html" is a file on the server that is
>returned by the webserver from the above URL.

Is it valid? Yes.
Is it a good URL? No.

Now this has come up recently on the bug report list. But when I
tried this at "home" so to speak, the server returned a 404. (IMHO,
if parent.html is NOT server-parsed, this is the Right Thing To Do
TM.)

>A possible solution would be to compare the contents of the parent and
>child documents when the child comes from a relative URL. If the
>document contents for the parent and child are identical and have the
>same last modification date stamp, ignore the child document and report
>an error. Then continue, digging the next href in the parent.

Maybe. This is a bit of a pain though since you have to "remember"
that it came from a relative URL. The whole problem is resolved when
you have duplicate-document detection, which has been on the plate
for a while. Unless someone volunteers to do it, it may be some time
before it sees the light of day, though.
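
To make the duplicate-document idea concrete, here is a rough sketch in
Python. It is not ht://Dig code: the DuplicateDetector class, its check()
method, and the choice of SHA-256 are all assumptions made up for
illustration. The point is only that keying on a digest of the document
body catches the parent/child case without having to remember whether a
link was relative.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return a hex digest of the raw document body (hypothetical helper)."""
    return hashlib.sha256(body).hexdigest()


class DuplicateDetector:
    """Hypothetical sketch: remembers the first URL seen for each body digest."""

    def __init__(self):
        self._seen = {}  # digest -> first URL that returned this content

    def check(self, url: str, body: bytes):
        """Return the earlier URL with identical content, or None if new."""
        digest = content_digest(body)
        first = self._seen.get(digest)
        if first is None:
            # First time this content has been seen: record it and index it.
            self._seen[digest] = url
            return None
        # Same bytes were already retrieved under another URL.
        return first


detector = DuplicateDetector()
page = b"<html><body>Parent Page</body></html>"

# The real file is new content, so it would be indexed.
first_result = detector.check("/parent/parent.html", page)

# The bogus child URL returns the same bytes, so it is flagged as a
# duplicate and the digger would skip it rather than follow its links.
second_result = detector.check("/parent/parent.html/index.html", page)
```

A crawler using something like this would skip any document for which
check() returns a URL, which ends the recursion after one extra fetch. An
exact digest only catches byte-identical responses; pages that differ in
even a timestamp would need a fuzzier comparison.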

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
