[htdig] Re: Problems with htdig and 301 redirects via apache mod_speling


Neil Prockter (n.prockter@lse.ac.uk)
Mon, 08 Feb 1999 15:22:39 +0000


I sent this earlier. I think I've sorted it out myself. The problem appears to be
that the case of a URL is thrown away by the Need2Get function.

I'll report this as a bug (if it hasn't been already).
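
The suspected behaviour can be sketched roughly as follows. This is a minimal
Python model, not htdig's actual Need2Get code: the Crawler class and the
need_to_get, push and simulate names are all hypothetical, and it assumes that
with case-insensitive matching the "already seen" check folds URLs to lower
case before comparing them.

```python
# Hypothetical model of a crawler's "have I seen this URL?" check.
# This is NOT htdig's real implementation; it only illustrates the
# suspected effect of discarding URL case before the lookup.

class Crawler:
    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self.visited = set()   # keys of URLs already accepted
        self.queue = []        # URLs waiting to be retrieved

    def _key(self, url):
        # Assumption: case-insensitive mode lowercases the whole URL.
        return url if self.case_sensitive else url.lower()

    def need_to_get(self, url):
        # Returns True only the first time a given key is seen.
        key = self._key(url)
        if key in self.visited:
            return False
        self.visited.add(key)
        return True

    def push(self, url):
        # Queue the URL if it has not been seen; report whether it was queued.
        if self.need_to_get(url):
            self.queue.append(url)
            return True
        return False


def simulate(start_url, redirect_target, case_sensitive=False):
    """Push start_url, then the redirect target; return whether the
    redirect target was queued for retrieval."""
    crawler = Crawler(case_sensitive)
    crawler.push(start_url)
    return crawler.push(redirect_target)
```

Under these assumptions, simulate("http://www.lse.ac.uk/Depts",
"http://www.lse.ac.uk/Depts/") queues the redirect target, because the two URLs
differ by the trailing slash, while simulate("http://www.lse.ac.uk/depts/",
"http://www.lse.ac.uk/Depts/") does not, because the two URLs differ only in
case. That matches the two traces quoted below.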

Neil Prockter

> I'm having a few problems with htdig and 301 redirects via apache mod_speling (I
> can't see how mod_speling is to blame but I thought I'd mention it anyway)
>
> My main problem is due to case sensitivity.
>
> I'm using mod_speling (which sends back 301 redirects to agents if it can find an
> easy spelling correction), so in the examples I'm about to give, a request to
> /depts receives a 301 redirect to /Depts, and as Depts is a directory, a request
> to /Depts gets a 301 redirect to /Depts/.
>
> I've added BrowserMatch "htdig/3\.1\.0b4" force-response-1.0 to Apache to force
> HTTP/1.0 responses for htdig, as it was previously receiving HTTP/1.1 redirects, but
> this hasn't made much difference as I get the same behaviour with or without it.
>
> I've got limit_urls_to: http://www.lse.ac.uk/ so that shouldn't be affecting
> htdig
>
> I've got case_sensitive: false but that doesn't help. (I've got other problems with
> that, which I will post separately later.)
>
> If I give the 'correct' spelling of Depts
> start_url: http://www.lse.ac.uk/Depts
>
> I get redirected to http://www.lse.ac.uk/Depts/ and this IS pushed onto the
> stack for htdig to retrieve.
>
> 0:0:0:http://www.lse.ac.uk/Depts: Retrieval command for
> http://www.lse.ac.uk/Depts: GET /Depts HTTP/1.0
> User-Agent: htdig/3.1.0b4 (n.prockter@lse.ac.uk)
> Host: www.lse.ac.uk
>
> Header line: HTTP/1.0 301 Moved Permanently
> Header line: Date: Mon, 08 Feb 1999 14:13:11 GMT
> Header line: Server: Apache/1.3.4 (Unix)
> Header line: Location: http://www.lse.ac.uk/Depts/
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 3
> redirect
> redirect: http://www.lse.ac.uk/Depts/
> resolving 'http://www.lse.ac.uk/Depts/'
> pushing http://www.lse.ac.uk/Depts/
> pick: www.lse.ac.uk:80, # servers = 1
> 1:1:-1:http://www.lse.ac.uk/Depts/: Retrieval command for
> http://www.lse.ac.uk/Depts/: GET /Depts/ HTTP/1.0
>
> and everything follows as expected (although links to /depts/ still don't get
> followed so this ain't much good)
> ----------
>
> However, with start_url: http://www.lse.ac.uk/depts/
> I get redirected to http://www.lse.ac.uk/Depts/ and this IS NOT pushed onto
> the stack for htdig to retrieve.
>
> 0:0:0:http://www.lse.ac.uk/depts/: Retrieval command for
> http://www.lse.ac.uk/depts/: GET /depts/ HTTP/1.0
> User-Agent: htdig/3.1.0b4 (n.prockter@lse.ac.uk)
> Host: www.lse.ac.uk
>
> Header line: HTTP/1.0 301 Moved Permanently
> Header line: Date: Mon, 08 Feb 1999 14:18:46 GMT
> Header line: Server: Apache/1.3.4 (Unix)
> Header line: Location: http://www.lse.ac.uk/Depts/
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 3
> redirect
> redirect: http://www.lse.ac.uk/Depts/
> resolving 'http://www.lse.ac.uk/Depts/'
> pick: www.lse.ac.uk:80, # servers = 1
> -----------
