Re: [htdig] URLs parsing problem


Torsten Neuer (tneuer@inwise.de)
Fri, 15 Oct 1999 09:39:36 +0200


Gilles Detillieux wrote:
>
> --- htdig/SGMLEntities.cc.orig Wed Sep 22 11:18:41 1999
> +++ htdig/SGMLEntities.cc Thu Oct 14 15:08:31 1999
> @@ -280,5 +280,11 @@ SGMLEntities::translateAndUpdate(unsigne
>
>      if (*entityStart == ';')
>          entityStart++; // A final ';' is used up.
> -    return translate(entity);
> +    unsigned char e = translate(entity);
> +    if (e == ' ' && strncmp((char *)orig, "&#32", 4) != 0)
> +    {
> +        entityStart = orig + 1; // Catch unrecognized entities...
> +        return '&';
> +    }
> +    return e;
>  }

Thanks for the improvements, but shouldn't the test for the space
character include &nbsp; and &#160; as well?
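
A minimal sketch of what such a wider test might look like. It assumes the
entities in question are &nbsp; and &#160;, and that translate() maps them to
a plain space; neither is confirmed by the patch above. The helper name
isSpaceEntity is hypothetical and not part of ht://Dig.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not ht://Dig code): returns true when the text at
// `orig` begins with an entity that is meant to decode to a space.
// The list of spellings is an assumption made for illustration.
static bool isSpaceEntity(const char *orig)
{
    static const char *const spellings[] = { "&#32", "&nbsp", "&#160" };
    const std::size_t count = sizeof(spellings) / sizeof(spellings[0]);

    for (std::size_t i = 0; i < count; i++)
    {
        const std::size_t n = std::strlen(spellings[i]);
        if (std::strncmp(orig, spellings[i], n) != 0)
            continue;
        // Require the entity to end here, so that "&#320;" is not
        // mistaken for "&#32" followed by more text.
        const unsigned char next = static_cast<unsigned char>(orig[n]);
        if (!std::isalnum(next))
            return true;
    }
    return false;
}
```

With a helper like this, the condition in the patch could read
`if (e == ' ' && !isSpaceEntity((char *)orig))`, leaving the rest unchanged.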

cheers,
  Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606
E-Mail: info@inwise.de            Internet: http://www.inwise.de
