RE: htdig: HTML special characters in HREF


Frost, Timothy E (timothy.frost@nz.eds.com)
Mon, 28 Sep 1998 13:54:47 +1200


> >Given the proliferation of generated HTML these days, I suspect that
> >htdig needs to put *everything* through SGMLEntity replacement. This
> >means *every* place where a URL can occur, not just the HREF= clause.
>
> Say it isn't so! :-)
>
FrontPage seems to encode using &name; entities rather than %xx escapes
for most things. One user of FrontPage seems to like document and anchor
names with spaces, and some of those are encoded as %20, others as
&nbsp;, and yet others
are not encoded at all. The cases that struck me last week were
ampersands in file names. It does seem to depend on the tag, to some
extent. Consider the following excerpts:

From pme.htm:
 
<p><a name="PME Risk Process &amp; Templates"><font size="3" face="Comic
Sans MS"><b>PME
Risk Process &amp; Templates</b></font></a></p>

From the frame:
        <td valign="top" nowrap width="2%"><font color="#CE1818"
        size="2" face="Comic Sans MS"><strong>*</strong></font></td>
        <td width="50%"><a
        href="pme.htm#PME Risk Process &amp; Templates"
        target="main"><font color="#CE1818" size="2"
        face="Comic Sans MS"><strong>PME Risk Process &amp;
        Templates</strong></font></a></td>

And another, very ugly HREF:
 
href="DROLINKS/PS%20Training/ps2_2%20(7)%20Exporting%20&amp;%20Importing%20Processes.ppt"

Looking at the references, I find that FrontPage is converting spaces to
%20 in file names, but not in targets within a page. However, it is
consistently using &amp; for all ampersands, wherever they appear.
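
To make the two layers of encoding concrete, here is a minimal sketch in
Python (not htdig's own code, which is C++). It assumes the order
suggested in this thread: SGML entity replacement first, then URL
percent-decoding. The function name normalize_href is hypothetical.

```python
from html import unescape
from urllib.parse import unquote


def normalize_href(raw_href):
    """Hypothetical helper: undo HTML-level, then URL-level, encoding.

    Returns a (path, fragment) tuple of decoded strings.
    """
    # Step 1: entity replacement, e.g. "&amp;" becomes "&".
    decoded = unescape(raw_href)
    # Step 2: split off the in-page target at the first "#".
    path, _, fragment = decoded.partition("#")
    # Step 3: percent-decoding, e.g. "%20" becomes a space.
    return unquote(path), unquote(fragment)


if __name__ == "__main__":
    # The two HREF values quoted above, as they appear in the HTML source.
    print(normalize_href("pme.htm#PME Risk Process &amp; Templates"))
    print(normalize_href(
        "DROLINKS/PS%20Training/ps2_2%20(7)%20Exporting%20&amp;"
        "%20Importing%20Processes.ppt"
    ))
```

Doing the entity step first matters: the &amp; in the .ppt link has to
become a plain ampersand before the result is treated as a URL at all.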

> This isn't the best of news. I looked through the RFC for URLs last
> night. While I couldn't find anything forbidding &amp; instead of
> %xxx, it did mention that special characters were to be encoded as
> %xxx and "&" was a special character. <chuckle>
>
> Hmm. I guess we should just run through a document first and do
> SGMLEntity replacement. Then we can parse to our heart's content.
> This is probably what Netscape and IE do.
>
>
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
>
> ----------------------------------------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig-request@sdsu.edu containing the single word "unsubscribe" in
> the body of the message.



This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:27:52 PST