Re: [htdig] How can I remove some keywords, from search results?

Subject: Re: [htdig] How can I remove some keywords, from search results?
From: Torsten Neuer
Date: Sat May 06 2000 - 12:49:39 PDT

> Tasos Angelis wrote:
> I noticed that in the search results only html tags are removed.
> But, how can I remove some other unwanted words from the results of
> htsearch?
> For example in a site that htdig indexes there are some javascript
> functions.
> I want to selectively remove those functions.
> e.g. .... something something something .... findit();.... something
> to be:
> .... something something something .... .... something
> without findit(); in it.
> There are 10 or a few more of those functions.

You should put anything inside <SCRIPT> tags in SGML comments.
This holds not only for Ht://Dig but also for most other robot
software, which will otherwise interpret the contents of <SCRIPT> as
normal text. Another way to leave it out is to put the JavaScript
code in an external file (<SCRIPT SRC="file.js">). I don't know of any
crawler that follows the SRC attribute of the <SCRIPT> tag.
You can also configure Ht://Dig to ignore anything inside <SCRIPT>
by using the noindex_start/noindex_end configuration attributes.
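
As a rough sketch of the first two approaches: findit() is the function
name from your example, its body is only a placeholder, and file.js is
a made-up file name, so adjust both to your pages.

```html
<!-- Approach 1: hide inline JavaScript inside an SGML comment -->
<SCRIPT LANGUAGE="JavaScript">
<!--
function findit() {
    /* placeholder body; your real code goes here */
}
// -->
</SCRIPT>

<!-- Approach 2: move the code into an external file
     (file.js is a hypothetical name) -->
<SCRIPT LANGUAGE="JavaScript" SRC="file.js"></SCRIPT>
```

The "// -->" line keeps the JavaScript interpreter from tripping over
the closing comment marker. Note that the commented code should not
contain "--" itself, since that may end the comment early for a strict
parser.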


InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606

