Subject: Re: [htdig] excluding file trees from indexing process
From: Torsten Neuer (email@example.com)
Date: Tue Nov 30 1999 - 05:17:25 PST
Jens Moellenhoff wrote:
> This may be just another one of these newbie questions, but how can I
> exclude virtual file trees from being indexed? Whenever I enter the
> keyword "index" in my search form, it returns a lot of hits like
> "Index of folder1/folder2/folder3/" and shows the folder's index when I
> click on one of these hits.
> I know this can be avoided (e. g. by using "exclude_urls" or
> "bad_extensions"?), but not how exactly. I searched the mailing list
> database for I don't know how long and read the FAQ, but I don't have a
> clue yet.
If you need the virtual trees to be walked by the indexer (e.g. in order
to fetch some non-HTML documents from them), you cannot use the
directive of Ht://Dig. Since the index is generated automatically by
the web server, you need to add some indexer control information to the
generated index documents.
A portable approach would be to back off from automatic index generation
by the web server and switch to some server-side scripting (server-parsed
HTML, PHP, ASP or some CGI) that produces the directory listings (this
would also allow you to add some design to them). These listings should
include a proper "robots" meta tag (or Ht://Dig-specific indexer control)
to control the dig process.
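
As a rough illustration of such a script, here is a minimal sketch in
Python. The function name render_listing and its two parameters are my
own invention for this example, not part of Ht://Dig or any web server.
It assumes the standard "robots" meta tag with "noindex, follow" is what
you want: the listing page itself stays out of the index, but the links
on it are still followed, so the documents in the tree get fetched.

```python
import html
import os
from urllib.parse import quote


def render_listing(fs_path, url_path):
    """Return an HTML directory listing that asks robots not to index it.

    fs_path  -- directory on disk to list (hypothetical parameter)
    url_path -- URL path shown in the page title (hypothetical parameter)
    """
    rows = []
    for name in sorted(os.listdir(fs_path)):
        # Give subdirectories a trailing slash so relative links resolve.
        if os.path.isdir(os.path.join(fs_path, name)):
            name += "/"
        rows.append('<li><a href="%s">%s</a></li>'
                    % (quote(name), html.escape(name)))

    title = html.escape("Index of " + url_path)
    return "\n".join([
        "<html><head>",
        "<title>%s</title>" % title,
        # Keep this page out of the index, but let the indexer follow links.
        '<meta name="robots" content="noindex, follow">',
        "</head><body>",
        "<h1>%s</h1>" % title,
        "<ul>",
    ] + rows + [
        "</ul>",
        "</body></html>",
    ])
```

Called from a CGI wrapper, you would print a "Content-Type: text/html"
header, a blank line, and then the string this function returns. How the
script is mapped onto the directory URLs depends on your server setup.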
For the Apache web server, you could also hack mod_autoindex so that it
includes robots control in the listings it generates.
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14            Tel: +49-4101-403605
D-25474 Ellerbek            Fax: +49-4101-403606
E-Mail: firstname.lastname@example.org
Internet: http://www.inwise.de