Thus spake Geoff Hutchison (at 01:37 PM 9/17/98 -0400) ...
>I guess the "problem" is this: ht://Dig interprets JavaScript in HTML
>files as text. So if we can take the code Muffin uses to strip JavaScript
>and add it to a "remove JavaScript" pass over the HTML files before
>ht://Dig begins the real indexing, we'd be set.

What about the "problem" of people using JS to pop up windows, load other
URLs and such? If you simply strip all the JS code from a document, you'll
lose these links (and the info in them).

And I haven't even mentioned JS that creates URL references on the fly, or
based on other variables. Good luck coding a parser for that!
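
To make the trade-off concrete, here is a rough sketch of such a "remove
JavaScript" pass that at least saves the literal URLs before throwing the
script away. This is only an illustration under my own assumptions: it is
not Muffin's code or anything in ht://Dig, the names strip_scripts,
SCRIPT_RE and URL_LITERAL_RE are made up, and the regexes are a crude
heuristic rather than a real HTML or JS parser.

```python
import re

# Hypothetical helper, not part of ht://Dig or Muffin.
# Matches a whole <script>...</script> block, capturing its body.
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script\s*>",
                       re.IGNORECASE | re.DOTALL)

# Quoted string literals that look like URLs: they start with http://,
# https:// or "/", or they end in .htm/.html. Purely a heuristic.
URL_LITERAL_RE = re.compile(
    r"""(["'])((?:https?://|/)[^"'\s]+|[^"'\s]+\.html?)\1""",
    re.IGNORECASE,
)


def strip_scripts(html):
    """Return (html_without_scripts, url_literals_found_in_scripts)."""
    urls = []

    def harvest(match):
        # Collect URL-looking literals from the script body, in order,
        # skipping duplicates.
        for _quote, url in URL_LITERAL_RE.findall(match.group(1)):
            if url not in urls:
                urls.append(url)
        # Replace the script block with a space so words on either
        # side of it do not get glued together.
        return " "

    cleaned = SCRIPT_RE.sub(harvest, html)
    return cleaned, urls


page = (
    '<p>Hello</p>'
    '<script language="JavaScript">'
    'function go(){ window.open("http://example.com/popup.html");'
    ' var next = "/docs/" + page; }'
    '</script>'
    '<p>World</p>'
)
cleaned, urls = strip_scripts(page)
print(cleaned)
print(urls)
```

Note what happens to the second URL in the sample page: only the "/docs/"
prefix is recovered, because the rest is computed at run time. A pass like
this can save the pop-up style links, but it cannot see anything the script
builds from variables.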

The only complete solution I can see is to write a program that emulates a
browser and follows every link, button, image map, etc. reachable from
that page.

[or do the digging on the server side ... but then what URL do you present
to the user?]

