htdig: Bug: redirects


John Goerzen (jgoerzen@southwind.net)
05 Jun 1998 17:27:17 -0500


Hi,

First, please CC me on any reply, since I'm not subscribed.

I downloaded HTDIG -- it looks very good and has a much more agreeable
license than WebGlimpse! However, I've noticed two problems.

First, let me lay out the situation -- I have a server named
gesundheit.cs.twsu.edu, and there is another name, happy.cs.twsu.edu,
that is a CNAME for it, added because a lot of people misspell
gesundheit. So all URLs given out reference happy.cs.twsu.edu.

The first problem is that HTDIG doesn't speak HTTP/1.1. For clients
that do, the server uses happy.cs.twsu.edu in all redirects --
preserving the name the client thought it was looking up. For HTTP/1.0
clients the server can't do that, since the request need not carry a
Host header saying which name was used, so it falls back to its own
name. I have references to things like
http://happy.cs.twsu.edu/aclug/events, which gets redirected to
http://gesundheit.cs.twsu.edu/aclug/events/ for HTTP/1.0 clients.
That seems to confuse HTDIG, and leads to confusing results for
viewers.
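
To make the difference concrete, here is a minimal Python sketch of the
two request forms. The helper name build_request is hypothetical and is
not part of HTDIG. The sketch assumes the server builds its redirect
from the Host header when one is present; HTTP/1.1 requires that
header, while this HTTP/1.0 request omits it.

```python
def build_request(path, host, version="1.1"):
    """Return the raw text of a minimal GET request (illustration only)."""
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        # HTTP/1.1 requires a Host header, so the server learns which
        # name (happy vs. gesundheit) the client actually asked for.
        lines.append(f"Host: {host}")
    # A blank line terminates the request headers.
    lines.extend(["", ""])
    return "\r\n".join(lines)


# The HTTP/1.0 form carries no host name at all.
print(repr(build_request("/aclug/events", "happy.cs.twsu.edu", "1.0")))
# The HTTP/1.1 form names the host the client used.
print(repr(build_request("/aclug/events", "happy.cs.twsu.edu", "1.1")))
```

With the first form the server has nothing to go on except its own
configured name, which would explain the redirect to gesundheit.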

Problem 2: Here are two lines from my config:

start_url: http://happy.cs.twsu.edu/aclug/
limit_urls_to: .cs.twsu.edu/aclug

With this setup, HTDIG skips the /events URL mentioned above, even
though it gets redirected to something that IS in the allowed set.
Changing the start_url to http://gesundheit.cs.twsu.edu/aclug/
magically fixes the problem, but the two settings should behave the
same.
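
As a sanity check on the claim that the redirect target is in the
allowed set, here is a small Python sketch. It assumes the limit
pattern from the config above is applied as a plain substring match;
the function name in_allowed_set is hypothetical and this is not
HTDIG's own code.

```python
# The limit pattern from the config above, assumed to be a substring match.
LIMIT_PATTERNS = [".cs.twsu.edu/aclug"]


def in_allowed_set(url, patterns=LIMIT_PATTERNS):
    """Return True if the URL contains at least one limit pattern."""
    return any(pattern in url for pattern in patterns)


original = "http://happy.cs.twsu.edu/aclug/events"
redirected = "http://gesundheit.cs.twsu.edu/aclug/events/"

# Under this assumption both the original URL and the redirect target
# match, so neither host name by itself should cause a skip.
print(in_allowed_set(original), in_allowed_set(redirected))
```

If that assumption holds, the pattern is not what excludes the
redirected URL, and something else must tie the crawl to the host
named in start_url.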

Suggestions appreciated.

Thanks,
John

-- 
John Goerzen                              Southwind Internet Access, Inc.
E-mail: Business, jgoerzen@southwind.net; Personal, jgoerzen@complete.org
Computer Science Dept., Wichita State University,    jgoerzen@cs.twsu.edu
Developer, Debian GNU/Linux                       <http://www.debian.org>


