Geoff Hutchison (ghutchis@wso.williams.edu)
Fri, 30 Jul 1999 13:55:40 -0400
Gilles Detillieux wrote:
> OK, we do clearly have a problem with SGML entities in 3.1.2, as well
> as 3.2. (3.2 has some more serious problems, which I was hoping to
> tackle, but that's another story.) So, right now, it only translates
> &foo; entities outside of any HTML tags. I think there are reasons
Unfortunately we also need to translate URLs in an HTML context. It has
become a "standard" to include escapes such as &amp; and &copy; in the
URL text itself. This is not forbidden in the RFC on URIs, but for
obvious reasons it's not always supported by the webserver. Furthermore
we need to normalize URLs anyway.
I was initially thinking this would need to be placed in the URL code,
but it strikes me that this really only needs to happen in the HTML
parser itself.
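
To make that concrete, here is a rough sketch of what I mean by doing it in the parser. It is in Python rather than our C++, it is not ht://Dig code, and the names (HREF_RE, extract_links) are made up for illustration. The assumption is simply that the parser decodes entities in the attribute value first, and only then hands the result on to be resolved and normalized as a URL:

```python
import re
from html import unescape
from urllib.parse import urljoin

# Hypothetical, deliberately naive pattern: double-quoted href values only.
HREF_RE = re.compile(r'href\s*=\s*"([^"]*)"', re.IGNORECASE)


def extract_links(html_text, base_url):
    """Return absolute URLs for every double-quoted href in html_text."""
    links = []
    for raw_value in HREF_RE.findall(html_text):
        # Step 1: decode entities such as &amp; inside the attribute value.
        decoded = unescape(raw_value)
        # Step 2: resolve against the base URL; the URL code never sees entities.
        links.append(urljoin(base_url, decoded))
    return links


if __name__ == "__main__":
    page = '<a href="search.cgi?q=a&amp;b=c">results</a>'
    print(extract_links(page, "http://example.com/dir/"))
    # prints ['http://example.com/dir/search.cgi?q=a&b=c']
```

The point of the ordering is that the URL code stays ignorant of HTML: anything that reaches it is already plain URL text.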
(And yes, I'm aware the 3.2 HtSGMLCodec has problems, but I've been a
bit pre-occupied.)
-- 
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/