Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Fri, 16 Jul 1999 13:10:09 -0500 (CDT)
According to loic@ceic.com:
> > > 8:12:1:http://www.senga.org/uri/html>: not found
> ..
> >
> > The CVS version is much more lenient about URLs. If you read the
> > messages, it's trying to connect to the URLs
> > "http://www.senga.org/uri/html>" or "http://www.senga.org/support.html>"
> > which are incorrect links.
>
> I think this is because the quotes are missing:
>
> <a href=uri/html>uri</a>
>
> Do you think htdig should permanently consider this an incorrect href?
> If so, it will have trouble with a lot of existing web sites.
I had a feeling this might crop up after the changes to HTML.cc. Here's
the fix, which I just committed to the CVS source tree:
Fri Jul 16 13:04:27 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
* htdig/HTML.cc(parse): fix to prevent closing ">" from being passed
to do_tag().
Index: htdig/HTML.cc
===================================================================
RCS file: /opt/htdig/cvs/htdig3/htdig/HTML.cc,v
retrieving revision 1.48
diff -u -p -r1.48 HTML.cc
--- htdig/HTML.cc 1999/07/13 20:58:06 1.48
+++ htdig/HTML.cc 1999/07/16 17:19:49
@@ -276,9 +276,9 @@ HTML::parse(Retriever &retriever, URL &b
q = (unsigned char*)strchr((char *)position, '>');
if (!q)
break; // Syntax error in the doc. Tag never ends.
- tag = 0;
- tag.append((char*)position + 1, q - position);
position++;
+ tag = 0;
+ tag.append((char*)position, q - position);
while (isspace(*position))
position++;
if (!in_space && spacebeforetags.CompareWord((char *)position)
@@ -328,8 +328,9 @@ HTML::parse(Retriever &retriever, URL &b
q = (unsigned char*)strchr((char *)position, '>');
if (q)
{
+ position++;
tag = 0;
- tag.append((char*)position + 1, q - position);
+ tag.append((char*)position, q - position);
do_tag(retriever, tag);
position = q+1;
}
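
For anyone who wants to see the off-by-one in isolation, here is a minimal sketch of the two variants. It is not the htdig code itself: it assumes that "position" points at the opening "<" when the tag is extracted, it uses std::string in place of htdig's String class, and the names extract_tag_old and extract_tag_new are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for the tag extraction in HTML::parse().
// Both assume `position` points at the '<' that opens the tag.

// Old behaviour: start copying one byte past '<' but still copy
// (q - position) bytes, so the copy runs one byte too far and
// picks up the closing '>'.
std::string extract_tag_old(const char *position)
{
    const char *q = std::strchr(position, '>');
    if (!q)
        return std::string();   // tag never ends
    return std::string(position + 1, static_cast<std::size_t>(q - position));
}

// Fixed behaviour: advance past '<' first, then copy (q - position)
// bytes, which now stops just before the closing '>'.
std::string extract_tag_new(const char *position)
{
    const char *q = std::strchr(position, '>');
    if (!q)
        return std::string();   // tag never ends
    position++;
    return std::string(position, static_cast<std::size_t>(q - position));
}
```

With the input <a href=uri/html>uri</a>, the old variant yields "a href=uri/html>" and the fixed one yields "a href=uri/html". With an unquoted href, nothing else delimits the attribute value, so the stray ">" ends up as part of the URL, which matches the "http://www.senga.org/uri/html>" in the error message.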
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930