Re: [htdig] Still Having problem with &amp


Subject: Re: [htdig] Still Having problem with &
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Fri Aug 11 2000 - 11:45:25 PDT


According to Gilles Detillieux:
> OK, I found a bug in htdig/HTML.cc, which I think causes it to think the
> "&amp;" isn't translated to "&", so it copies the whole entity through.
> I'll see if I can find a sensible fix. Thanks for persisting.

Could you please give this patch a try and tell me if it fixes this
problem, without breaking anything else? It should do for 3.2.0b1,
3.2.0b2, and any recent 3.2.0bx snapshot.

--- htdig/HTML.cc.orig Wed May 24 07:42:43 2000
+++ htdig/HTML.cc Fri Aug 11 13:41:32 2000
@@ -259,8 +259,8 @@ HTML::parse(Retriever &retriever, URL &b
                 scratch = 0;
                 scratch.append((char*)position, q+1 - position);
                 textified = HtSGMLCodec::instance()->encode(scratch);
-                if (textified[0] != '&') // it was decoded, copy it
-                  {
+                if (textified[0] != '&' || textified.length() == 1)
+                  { // it was decoded, copy it
                     position = (unsigned char *)textified.get();
                     while (*position)
                       {
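
To see why the extra length test matters, here is a minimal standalone
sketch of the two conditions. It is not the htdig code: decode_entity() is
a hypothetical stand-in for the codec call in the patch, it uses
std::string rather than htdig's own String class, and the small entity
table is assumed purely for illustration.

```cpp
#include <string>

// Hypothetical stand-in for the codec call in the patch: it maps a few
// known entities to their text and returns anything else unchanged.
static std::string decode_entity(const std::string &entity)
{
    if (entity == "&amp;") return "&";
    if (entity == "&lt;")  return "<";
    if (entity == "&gt;")  return ">";
    return entity;  // unknown entity: passed through untouched
}

// Old condition: any result that starts with '&' counts as "not decoded".
static bool was_decoded_old(const std::string &textified)
{
    return textified[0] != '&';
}

// Patched condition: a result that is just "&" also counts as decoded.
static bool was_decoded_new(const std::string &textified)
{
    return textified[0] != '&' || textified.length() == 1;
}
```

With these helpers, "&amp;" decodes to a lone "&", which the old condition
rejects and the patched one accepts. "&lt;" passes both, and an unknown
entity such as "&bogus;" is still rejected by both because it comes back
unchanged and longer than one character.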

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930




This archive was generated by hypermail 2b28 : Fri Aug 11 2000 - 01:45:09 PDT