[htdig] Problems with htdig and 301 redirects via apache mod_speling


Neil Prockter (n.prockter@lse.ac.uk)
Mon, 08 Feb 1999 14:44:34 +0000


I'm having a few problems with htdig and 301 redirects via apache mod_speling (I
can't see how mod_speling is to blame but I thought I'd mention it anyway)

My main problem is due to case sensitivity.

I'm using mod_speling, which sends back a 301 redirect to the agent if it can find
an easy spelling correction. So in the examples I'm about to give, a request to
/depts receives a 301 redirect to /Depts and, as Depts is a directory, a request
to /Depts gets a 301 redirect to /Depts/.

I've added BrowserMatch "htdig/3\.1\.0b4" force-response-1.0 to Apache to force
HTTP 1.0 responses for htdig, as it was previously receiving HTTP 1.1 redirects, but
this hasn't made any difference: I get the same behaviour with or without it.

I've got limit_urls_to: http://www.lse.ac.uk/ so that shouldn't be affecting
htdig

I've got case_sensitive: false but that doesn't help. (I've got other problems with
that, which I will post separately later.)

If I give the 'correct' spelling of Depts
start_url: http://www.lse.ac.uk/Depts

I get redirected to http://www.lse.ac.uk/Depts/ and this IS pushed onto the
stack for htdig to retrieve.

0:0:0:http://www.lse.ac.uk/Depts: Retrieval command for
http://www.lse.ac.uk/Depts: GET /Depts HTTP/1.0
User-Agent: htdig/3.1.0b4 (n.prockter@lse.ac.uk)
Host: www.lse.ac.uk

Header line: HTTP/1.0 301 Moved Permanently
Header line: Date: Mon, 08 Feb 1999 14:13:11 GMT
Header line: Server: Apache/1.3.4 (Unix)
Header line: Location: http://www.lse.ac.uk/Depts/
Header line: Connection: close
Header line: Content-Type: text/html
Header line:
returnStatus = 3
 redirect
redirect: http://www.lse.ac.uk/Depts/
resolving 'http://www.lse.ac.uk/Depts/'
   pushing http://www.lse.ac.uk/Depts/
pick: www.lse.ac.uk:80, # servers = 1
1:1:-1:http://www.lse.ac.uk/Depts/: Retrieval command for
http://www.lse.ac.uk/Depts/: GET /Depts/ HTTP/1.0

and everything follows as expected (although links to /depts/ still don't get
followed so this ain't much good)
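
To compare runs like this one, the status and Location can be pulled out of the "Header line:" entries with a short script. This is only a minimal sketch in Python, assuming the log format shown in this message; the function name parse_header_lines is made up for illustration and is not part of htdig.

```python
def parse_header_lines(log_text):
    """Return (status_code, location) from 'Header line:' log entries.

    Assumes the debug output format quoted in this message; either value
    is None if the corresponding header is not present.
    """
    status = None
    location = None
    prefix = "Header line:"
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        header = line[len(prefix):].strip()
        if header.startswith("HTTP/"):
            # Status line, e.g. "HTTP/1.0 301 Moved Permanently"
            parts = header.split()
            if len(parts) >= 2 and parts[1].isdigit():
                status = int(parts[1])
        elif header.lower().startswith("location:"):
            # Split only on the first colon so the URL stays intact
            location = header.split(":", 1)[1].strip()
    return status, location


sample = """\
Header line: HTTP/1.0 301 Moved Permanently
Header line: Location: http://www.lse.ac.uk/Depts/
Header line:
"""
print(parse_header_lines(sample))  # (301, 'http://www.lse.ac.uk/Depts/')
```

Running it over each log excerpt makes it easy to confirm that both requests receive the same redirect target, so any difference lies in what htdig does afterwards.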
----------

However, with start_url: http://www.lse.ac.uk/depts/
I get redirected to http://www.lse.ac.uk/Depts/ and this IS NOT pushed onto
the stack for htdig to retrieve.

0:0:0:http://www.lse.ac.uk/depts/: Retrieval command for
http://www.lse.ac.uk/depts/: GET /depts/ HTTP/1.0
User-Agent: htdig/3.1.0b4 (n.prockter@lse.ac.uk)
Host: www.lse.ac.uk

Header line: HTTP/1.0 301 Moved Permanently
Header line: Date: Mon, 08 Feb 1999 14:18:46 GMT
Header line: Server: Apache/1.3.4 (Unix)
Header line: Location: http://www.lse.ac.uk/Depts/
Header line: Connection: close
Header line: Content-Type: text/html
Header line:
returnStatus = 3
 redirect
redirect: http://www.lse.ac.uk/Depts/
resolving 'http://www.lse.ac.uk/Depts/'
pick: www.lse.ac.uk:80, # servers = 1
-----------
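
One possible explanation, which has not been checked against the htdig source, is that with case-insensitive matching the redirect target /Depts/ is treated as the same URL as the already-visited /depts/ and so is never queued, whereas /Depts and /Depts/ differ by the trailing slash and are treated as distinct. The Python sketch below is a toy model of that guess only; crawl_order and its "seen" set are hypothetical and do not reflect htdig's actual data structures.

```python
def crawl_order(start_url, redirects, case_sensitive):
    """Toy crawler: follow redirects, but skip any URL already seen.

    `redirects` maps a requested URL to its Location target.
    This models a guess about the behaviour, not htdig's real code.
    """
    def key(url):
        # Assumption: case-insensitive mode compares lowercased URLs
        return url if case_sensitive else url.lower()

    seen = {key(start_url)}
    queue = [start_url]
    fetched = []
    while queue:
        url = queue.pop(0)
        fetched.append(url)
        target = redirects.get(url)
        if target is not None and key(target) not in seen:
            seen.add(key(target))
            queue.append(target)  # the "pushing ..." step in the log
    return fetched


# The two redirects observed in the logs above
redirects = {
    "http://www.lse.ac.uk/Depts": "http://www.lse.ac.uk/Depts/",
    "http://www.lse.ac.uk/depts/": "http://www.lse.ac.uk/Depts/",
}

print(crawl_order("http://www.lse.ac.uk/Depts", redirects, False))
print(crawl_order("http://www.lse.ac.uk/depts/", redirects, False))
```

In this model the first call fetches both /Depts and /Depts/, while the second stops after /depts/, which matches the two logs. Passing case_sensitive=True makes the second call follow the redirect as well.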



