Re: [htdig] Preprocessing of HTML pages


Subject: Re: [htdig] Preprocessing of HTML pages
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Wed Jun 14 2000 - 07:41:45 PDT


At 6:48 PM +0200 6/13/00, Reich, Stefan wrote:
>So is there a way to tell the internal HTML parser to parse something
>different than text/html, so I could use text->myhtml for the external
>converter and then tell the internal one to parse myhtml.

Well, if you're willing to edit the code, this is fairly easy. You'll
want to make changes in htdig/Document.cc and
htdig/ExternalParser.cc. The parser code in HTML.cc doesn't actually
care what MIME type the document is labeled with--it just expects the
raw data from the file.
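
To illustrate the idea, here is a rough sketch of the kind of dispatch
change involved. It is not the actual code from Document.cc or
ExternalParser.cc; ParserKind, normalize_mime and choose_parser are
hypothetical names, and "text/myhtml" simply stands in for whatever
custom type the external converter emits. The point is only that a new
type can be routed to the same HTML parser as text/html.

```cpp
#include <cctype>
#include <string>

// Hypothetical parser kinds for illustration; not ht://Dig's real types.
enum class ParserKind { Html, PlainText, Unknown };

// Lowercase a Content-Type value and drop any parameters
// such as "; charset=iso-8859-1", along with whitespace.
std::string normalize_mime(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        if (c == ';') break;  // parameters start here; ignore the rest
        unsigned char uc = static_cast<unsigned char>(c);
        if (!std::isspace(uc)) out += static_cast<char>(std::tolower(uc));
    }
    return out;
}

// Pick a parser from the MIME type. The assumed custom type
// "text/myhtml" is treated exactly like "text/html", since the HTML
// parser only needs the raw document data.
ParserKind choose_parser(const std::string& content_type) {
    const std::string mime = normalize_mime(content_type);
    if (mime == "text/html" || mime == "text/myhtml") return ParserKind::Html;
    if (mime == "text/plain") return ParserKind::PlainText;
    return ParserKind::Unknown;
}
```

With something along these lines, output from an external converter
labeled text/myhtml would end up in the internal HTML parser, while
other types are handled as before.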

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



