Re: [htdig] Stripping java script from pages


denis filipetti (denis@world.std.com)
Fri, 12 Feb 1999 14:44:43 -0500


This sounds interesting for my application as well. How does one use
<script> and </script> as noindex_start and noindex_stop? I didn't
realize that the tags were mappable.

Denis
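
A minimal sketch of how that mapping might look in an htdig
configuration file, based on the 3.1.0 options described in the quoted
message below. Two assumptions to check against your release: the
message calls the closing option noindex_stop, but as far as I know the
ht://Dig attribute reference spells it noindex_end; and if the strings
are matched literally, <SCRIPT> would miss tags that carry attributes
(such as <SCRIPT LANGUAGE="JavaScript">), so dropping the closing
bracket from the start string may be worth trying.

```
# htdig.conf fragment (sketch; verify attribute names for your version)
# Ignore everything from an opening script tag to its closing tag.
noindex_start:	<SCRIPT>
noindex_end:	</SCRIPT>
```

Only one start/end pair is shown here, so this replaces the default
comment markers rather than adding to them.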

At 05:22 PM 2/12/99 +0100, Marjolein Katsma wrote:
>
>Hugh,
>
>The new 3.1.0 release has the options noindex_start and noindex_stop to
>delimit sections of HTML documents to be ignored.
>
>If you can't edit the pages to enclose the scripts in (default) comment
>markers for noindex_start and noindex_stop, you could try to set
>noindex_start to <SCRIPT> and noindex_stop to </SCRIPT>. This would cause
>htdig to ignore all script sections (but not scripts embedded in tags).
>
>Hope this helps.
>
>At 17:34 1999-02-12 +1100, you wrote:
>>
>>Hi all,
>>
>>I have a growing number of sites that I need to index that have JavaScript
>>in them. I need a way to strip out the JavaScript before it is
>>indexed by htdig.
>>
>>In the archive someone suggested the use of muffin (a proxy server), which
>>would be fine; however, it seems to require the presence of an X Windows
>>system, which I will not install.
>>
>>I'm running FreeBSD, so if someone can suggest a Perl script or some other
>>way of doing this I would much appreciate it.
>>
>>Regards,
>>
>>Hugh Blandford.
>>------------------------------------
>>To unsubscribe from the htdig mailing list, send a message to
>>htdig@htdig.org containing the single word "unsubscribe" in
>>the SUBJECT of the message.
>>
>
>Marjolein Katsma webmaster@javawoman.com
>Java Woman - http://javawoman.com/
>
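
For the original request (stripping scripts before htdig sees a page),
a small filter along the following lines is one possibility. This is a
sketch in Python rather than Perl; strip_scripts and SCRIPT_RE are names
made up for illustration, and how the filter is placed in front of
htdig is left open. It is a regular-expression pass, not a real HTML
parser, and like the noindex approach it leaves scripts embedded in tag
attributes alone.

```python
import re
import sys

# Matches an opening script tag (with or without attributes), its body,
# and the closing tag; case-insensitive and spanning multiple lines.
SCRIPT_RE = re.compile(
    r"<script\b[^>]*>.*?</script\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_scripts(html: str) -> str:
    """Return html with every <script>...</script> section removed."""
    return SCRIPT_RE.sub("", html)


if __name__ == "__main__":
    # Read HTML on standard input, write the stripped HTML to standard output.
    sys.stdout.write(strip_scripts(sys.stdin.read()))
```

Saved under a hypothetical name such as strip_scripts.py, it could be
run as: python strip_scripts.py < page.html > stripped.html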



This archive was generated by hypermail 2.0b3 on Wed Feb 17 1999 - 10:10:03 PST