Gilles Detillieux (email@example.com)
Mon, 19 Jul 1999 14:14:09 -0500 (CDT)
According to Cam Proctor:
> > Please think about this a little more. In essence you will have to
> > proxy/filter the retrieved HTML page. This means that the base URL will
> > be different so all relative links and URLs for any components
> > referenced from that page (images, activeX controls, applets, etc.) will
> > have to be modified. The proxy will have to interpret the HTML and
> > modify the right tags.
> > etc...
> for the project that this will be used (at least one part of it) there
> will be a set of files (pure html, no scripts) that will be indexed
> (about 1.5 Gb of data currently). these files will be used only
> for this search engine. this particular instance should be ok for this
> solution (once i get the spaces thing working right).
For pure HTML, a <base> tag should handle the problem with relative hrefs
(and other relative URLs, such as image srcs). It should be easy to
generate this from the document's URL and insert it into the output
stream at the appropriate spot (right after the <head> tag, I think).
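
Something along these lines might do it. This is only a rough sketch in Python, not anything that exists in ht://Dig: the function name insert_base_tag and the example URL are made up for illustration, and it assumes the page is already in memory as a string. If there is no <head> tag at all, it simply puts the <base> tag at the very start, which is also an assumption on my part.

```python
import html
import re

# Matches an opening <head> tag, with optional attributes, in any letter case.
# It does not match <header>, because only whitespace or ">" may follow "head".
_HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def insert_base_tag(page: str, document_url: str) -> str:
    """Return page with <base href="..."> inserted right after the <head> tag.

    insert_base_tag is a hypothetical helper, named here for illustration only.
    """
    # Escape &, <, > and quotes so the URL is safe inside an attribute value.
    base_tag = '<base href="%s">' % html.escape(document_url, quote=True)
    match = _HEAD_OPEN.search(page)
    if match is None:
        # Assumption: with no <head> tag, put the <base> tag at the very start.
        return base_tag + page
    return page[:match.end()] + base_tag + page[match.end():]


if __name__ == "__main__":
    sample = ('<html><HEAD><title>Doc</title></HEAD>'
              '<body><a href="b.html">b</a></body></html>')
    # The URL below is a placeholder, not a real document location.
    print(insert_base_tag(sample, "http://www.example.com/docs/a.html"))
```

With the <base> tag in place, the browser resolves b.html against the original document's URL rather than the URL of whatever script served the page.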
--
Gilles R. Detillieux              E-mail: <firstname.lastname@example.org>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
email@example.com containing the single word "unsubscribe" in
the SUBJECT of the message.