Re: htdig: htdig, muffin and javascript


Geoff Hutchison (Geoffrey.R.Hutchison@williams.edu)
Thu, 17 Sep 1998 13:37:25 -0400 (EDT)


> > Well, if someone would volunteer to poke into muffin and look for the
> > JavaScript filtering code, we can always look at that (I'm assuming it's
> > GPL?). If it looks reasonable, I'm sure a patch for htdig/HTML.cc can be
> > made.
>
> Yes, Muffin is GPL. If you guys can get me more information about the
> problem I'll do my best to fix it.

I guess the "problem" is this: ht://Dig treats JavaScript embedded in HTML
files as ordinary text. So if we can take the code Muffin uses to strip
JavaScript and run it as a "remove JavaScript" pass over the HTML files
before ht://Dig begins the real indexing, we'd be set.

This could be pretty simple. If Muffin's JavaScript-filtering code is in one
or two files and has a high-level function (something that returns an HTML
buffer without JavaScript), then it would almost be a drop-in. If it's not
quite that simple, then we can extract what we need into a separate file.
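
To make the idea concrete, here is a rough sketch of what such a
high-level function might look like in C++. This is not Muffin's code and
not anything currently in htdig/HTML.cc; the names strip_script_blocks and
find_ci are made up for illustration. It assumes the whole document is
already in memory as a string, and it only handles <script>...</script>
blocks (no event-handler attributes or javascript: URLs). It also does not
check for a tag-name boundary, so a tag that merely starts with "script"
would be matched as well.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive search for an ASCII needle
// (given in lowercase) in hay, starting at offset pos.
static std::size_t find_ci(const std::string& hay, const std::string& needle,
                           std::size_t pos) {
    if (needle.empty()) return std::string::npos;
    for (std::size_t i = pos; i + needle.size() <= hay.size(); ++i) {
        std::size_t j = 0;
        while (j < needle.size() &&
               std::tolower(static_cast<unsigned char>(hay[i + j])) == needle[j]) {
            ++j;
        }
        if (j == needle.size()) return i;
    }
    return std::string::npos;
}

// Hypothetical pre-indexing pass: return a copy of html with every
// <script ...> ... </script> block replaced by a single space, so that
// the words on either side of the block do not run together.
std::string strip_script_blocks(const std::string& html) {
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        std::size_t open = find_ci(html, "<script", pos);
        if (open == std::string::npos) {
            out.append(html, pos, std::string::npos);  // no more scripts
            break;
        }
        out.append(html, pos, open - pos);  // keep text before the block
        out += ' ';
        std::size_t close = find_ci(html, "</script", open);
        if (close == std::string::npos) break;  // unterminated: drop the rest
        std::size_t end = html.find('>', close);
        if (end == std::string::npos) break;
        pos = end + 1;  // resume just past the closing tag
    }
    return out;
}
```

The parser would then be handed the result of strip_script_blocks(buffer)
instead of the raw buffer. Replacing each block with a space rather than
nothing is a deliberate choice so that adjacent words stay separate.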

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



