Subject: Re: [htdig] excluding file trees from indexing process
From: Torsten Neuer (email@example.com)
Date: Wed Dec 01 1999 - 07:54:59 PST
Jens Moellenhoff wrote:
> firstname.lastname@example.org wrote:
> > As you said, you have no access to the web server itself, but maybe to
> > the documents served? In this case you could place your own default
> > document in the directory which should then produce the directory
> > listings.
> But I fear that if I place such a default index.htm in these
> directories, the subfolders of these directories won't be indexed,
> because they don't appear as links on this index.htm file.
That only holds as long as you don't place a link to the directory in
that document.
> They should not appear in that file by any means, because I don't want
> the user to have an overview of the subdirectories, but I want these
> subdirectories to be indexed.
So what's the difference? Right now, you have automatically generated
index documents which allow everyone to have an overview on the
> As I stated at the beginning of this thread, I want absolutely no search
> result showing a directory tree.
That's where the robots exclusion standard comes in and that's why you
need to customize this default document.
Of course, you can also have another tool gather the URLs (i.e. the
documents) to be indexed from the directory structure and include
this URL list in the start_urls directive of your ht://Dig configuration.
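
A minimal sketch of such a tool, in Python. It assumes the documents sit
under a local directory that maps one-to-one onto a base URL. The function
name collect_urls and the example host www.example.com are made up for
illustration and are not part of ht://Dig.

```python
import os
from urllib.parse import quote


def collect_urls(doc_root, base_url, skip_names=("index.htm", "index.html")):
    """Walk doc_root and return one URL per file found below it.

    Index documents are skipped so that no directory overview ends up
    in the URL list. The mapping doc_root -> base_url is an assumption
    about how the web server is set up.
    """
    urls = []
    prefix = base_url.rstrip("/") + "/"
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if name.lower() in skip_names:
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
            # Use forward slashes and percent-encode unsafe characters.
            urls.append(prefix + quote(rel.replace(os.sep, "/")))
    return urls
```

The resulting list could be written to a file, one URL per line, and that
file referenced from start_urls. As far as I recall, ht://Dig can read an
attribute value from a file named in backquotes, but check the
documentation for your version before relying on that.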
But I'm not sure if this is really required, since any auto-index
document which has a <META NAME="robots" CONTENT="noindex,follow">
in its header should do just that automatically.
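
If you end up writing the default document yourself, something along these
lines might do. It is only a sketch: the function name build_index and the
bare-bones HTML layout are my own assumptions, and the page still lists the
entries as links, since the robot needs them to follow.

```python
import html
import os
from urllib.parse import quote

# The tag quoted above: do not index this page, but follow its links.
ROBOTS_META = '<META NAME="robots" CONTENT="noindex,follow">'


def build_index(directory):
    """Return HTML for a default document that links to every entry."""
    links = []
    for name in sorted(os.listdir(directory)):
        if name.lower() in ("index.htm", "index.html"):
            continue  # do not link the index document to itself
        href = quote(name)
        if os.path.isdir(os.path.join(directory, name)):
            href += "/"  # trailing slash so the robot descends into it
        links.append('<a href="%s">%s</a><br>' % (href, html.escape(name)))
    return (
        "<html>\n<head>\n" + ROBOTS_META + "\n<title>Index</title>\n"
        "</head>\n<body>\n" + "\n".join(links) + "\n</body>\n</html>\n"
    )
```

The output would be saved as the default document (e.g. index.htm) in each
directory whose listing should be followed but not shown in search results.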
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14, D-25474 Ellerbek
Tel: +49-4101-403605
Fax: +49-4101-403606
E-Mail: email@example.com
Internet: http://www.inwise.de