htdig: Bug in handling wrong HTML

Frank Richter
Fri, 7 Aug 1998 19:26:02 +0200 (MET DST)

I think there is a bug in htdig (3.0.8b2, Solaris 2.6). When it parses a
document containing invalid HTML - in my example an unclosed comment - it
stores the beginning of the content of the previously parsed document
instead. Of course, invalid HTML is a bad thing, but I think htdig should
store no content (or a warning) for the broken page rather than content
taken from another page.
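
To illustrate the behaviour I would expect, here is a minimal sketch in
Python. It is not htdig code; the function name extract_text and the sample
inputs are made up for illustration. All state is local to one call, and an
unclosed comment yields empty content plus a warning rather than text from
another page:

```python
import re

COMMENT_OPEN = "<!--"
COMMENT_CLOSE = "-->"


def extract_text(html):
    """Return (text, warning) for a single document.

    Hypothetical helper: every variable is local to this call, so nothing
    can leak in from a previously parsed document.
    """
    pieces = []
    pos = 0
    while True:
        start = html.find(COMMENT_OPEN, pos)
        if start == -1:
            # No further comments: keep the rest of the document.
            pieces.append(html[pos:])
            break
        pieces.append(html[pos:start])
        end = html.find(COMMENT_CLOSE, start + len(COMMENT_OPEN))
        if end == -1:
            # Unclosed comment: store no content for this page, only a warning.
            return "", "unclosed comment at offset %d" % start
        pos = end + len(COMMENT_CLOSE)
    # Replace tags with spaces and collapse whitespace into a plain excerpt.
    text = re.sub(r"<[^>]*>", " ", "".join(pieces))
    return " ".join(text.split()), None


if __name__ == "__main__":
    # Made-up inputs: one valid page, one with an unclosed comment.
    print(extract_text("<h1>HEAD 1</h1> <!-- note --> some text"))
    print(extract_text("<h1>HEAD 2</h1> <!-- never closed <p>more</p>"))
```

With this approach the broken page ends up with an empty excerpt, and the
warning can be logged or stored alongside the entry.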


It contains a link to .../t2.html, which has invalid HTML.
The result is:
0 u: t:Title 1 a:0
m:902509994 s:130 h: HEAD 1 Link to t2 some text l:902509999 L:1
I:130 d: A:
1 u: t:Title 2 a:0
m:902509903 s:183 h: HEAD 1 Link to t2 some text l:902509999 L:0
I:183 d:Link to t2 A: ^^^^^^^^^^^^^^^^^^^^^^^ that's wrong!

In the second entry, for t2.html, you see the content of t1.html.
Does anyone have a fix or a suggestion where to look in the code?

        - Frank

Work:  Computing Services, Technical University, 09107 Chemnitz, Germany

