htdig: htdig-3.0.8b2: Suggested BUG in URL.cc for http:/ URLs


Barry Cornelius (Barry.Cornelius@durham.ac.uk)
Fri, 4 Sep 1998 17:45:31 +0100 (BST)


I have just started to use htdig. I have htdig-3.0.8b2, and have had a
problem with URLs like:
      http:/ITS
In directory htlib, lines 151 to 152 of URL.cc are:
      if (hasService && ((strncmp(ref, "http:/", 6) == 0) ||
           (strncmp(ref, "http:", 5) != 0)))
and this condition is satisfied for the above URL. As a result, parse is
called, and it then executes lines 302 to 306 of URL.cc:
      _host = 0;
      _port = 0;
      _url = 0;
      _path = p;
      _normal = 1;
      return;
which leaves the URL with no host or port, and so leads to nonsense.

If you change line 151 to:
      if (hasService && ((strncmp(ref, "http://", 7) == 0) ||
then a URL like:
      http:/ITS
makes the condition on line 151 false. Execution then eventually
reaches lines 171 to 178 of URL.cc, which are:
 if (*ref == '/')
 {
     //
     // The reference is on the same server as the parent, but
     // an absolute path was given...
     //
     _path = ref;
 }
which I believe is exactly what's required.
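
To make the difference concrete, here is a minimal sketch that isolates
the two versions of the condition as standalone functions. The function
names are hypothetical (they do not appear in URL.cc), and hasService is
simply passed in as a parameter, since the code that computes it is not
quoted above; it is assumed to be true for references that carry a
service prefix such as "http:".

```cpp
#include <cstring>

// Condition as quoted from lines 151-152 of URL.cc in htdig-3.0.8b2.
// Returns true when the reference would be handed straight to parse().
// (Hypothetical helper name; hasService is an assumed input.)
bool sendsToParseOriginal(const char *ref, bool hasService)
{
    return hasService && ((std::strncmp(ref, "http:/", 6) == 0) ||
                          (std::strncmp(ref, "http:", 5) != 0));
}

// The same condition with the suggested change: compare against
// "http://" (7 characters) instead of "http:/" (6 characters).
bool sendsToParseSuggested(const char *ref, bool hasService)
{
    return hasService && ((std::strncmp(ref, "http://", 7) == 0) ||
                          (std::strncmp(ref, "http:", 5) != 0));
}
```

With hasService true, "http:/ITS" gives true from the first function and
false from the second, whereas "http://www.durham.ac.uk/" gives true
from both, so only the single-slash form is treated differently.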

In the cases where ref starts with "http" (which is most common), I think
it's possible to prove that (with the current code) the condition on line
171, i.e., *ref == '/', is never true, and so the statement on line 177 is
unreachable for such references.

I would appreciate someone confirming the above. If it's true, I'm
puzzled as to why it hasn't been detected previously. Is the "http:/"
code on line 151 new?

--
Barry Cornelius                      Telephone: (0191 or +44 191) 374 4717
User Services, Information Technology Service,            Office: 374 2892   
Science Site, University of Durham, Durham, DH1 3LE, UK      Fax: 374 7759
http://www.durham.ac.uk/~dcl0bjc       mailto:Barry.Cornelius@durham.ac.uk



