Re: [htdig] Going for the big dig

Subject: Re: [htdig] Going for the big dig
From: Terry Collins
Date: Tue Dec 19 2000 - 12:20:25 PST

Gilles Detillieux wrote:


> I think you misunderstood. htdig already does read the robots.txt file
> and skips all disallowed documents.

Whoops, my apologies for that gaffe; my brain has started the holiday
season without me {:-).
Actually, I have given up remembering how you do/I did anything under Linux -
with new versions every three months, it is all different every time I look
at something.

You are correct about that. I now remember having to look at this in
detail, because my robots.txt excludes all the lists I archive on site from
indexing bots, and htdig very obediently acted on this. I wanted htdig to
actually index the contents of these lists, but exclude everything else,
which it now does quite nicely.
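
One way such a split could be expressed is with per-agent records in
robots.txt. This is only a sketch, not necessarily the setup used here: the
/lists/ path and the bot name "SomeOtherBot" are hypothetical, and it assumes
htdig identifies itself to robots.txt as "htdig". Python's standard
urllib.robotparser is used purely to check how the rules are read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: htdig may fetch everything (an empty Disallow
# means "nothing is disallowed"), all other robots are kept out of /lists/.
ROBOTS_TXT = """\
User-agent: htdig
Disallow:

User-agent: *
Disallow: /lists/
"""


def can_fetch(agent, path):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("htdig", "SomeOtherBot"):
        for path in ("/lists/htdig/0001.html", "/index.html"):
            print(agent, path, can_fetch(agent, path))
```

Note that this only lets htdig into the list archives; keeping it out of
everything else would still have to be done on the htdig side, for example
with limit_urls_to in the htdig configuration.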

> Actually, on my site I don't bother with exclude_urls at all, and use the
> robots.txt file instead. This way, anything that I don't want indexed by
> htdig won't be indexed by any other search engine either.

I wish all search engines obeyed robots.txt.

Thanks for the development effort with htdig. Very useful app.

   Terry Collins {:-)}}} Ph(02) 4627 2186 Fax(02) 4628 7861  
   WOA Computer Services <lan/wan, linux/unix, novell>

"People without trees are like fish without clean water"


This archive was generated by hypermail 2b28 : Tue Dec 19 2000 - 13:31:43 PST