[htdig] Ignore robots.txt and META-Tags.

Subject: [htdig] Ignore robots.txt and META-Tags.
From: Sven Hartge (hartge@ds9.argh.org)
Date: Mon Jan 24 2000 - 18:24:16 PST


We want to index our local server using htdig. There is only one slight
problem: we have several META tags in our files to prevent external
search engines from crawling through the whole server. Now, htdig honors
these tags too and indexes only some pages. I used an old release and
manually patched the check for no-index out of the source, but this is
way too much work to repeat whenever I need to upgrade the htdig version
(and is definitely not a right thing [tm]). I've read the documentation
which comes with htdig and also searched through the website, but ... I am
_sure_ I am missing something here.

Oh, and another question: is it possible to search for words containing
umlauts if there are _no_ locales installed? I do not have root access, so
I won't be able to install them in the right places. Will htdig work if
they are installed somewhere under the user's /home directory?


Sv'pardon my bad english'en

Last words of the biologist: "I know that snake, it's not venomous."

