Re: htdig: Protected server support

heddy Boubaker
09 Apr 1998 11:53:22 +0200

 "Andrew" == Andrew Scherpbier writes:

Andrew> There is actually another method that may or may not be easier to
Andrew> maintain. htdig looks for the HTML meta tag name "htdig-noindex".
Andrew> So if the documents you do not want to cover in a search contain
Andrew> "<meta name=htdig-noindex value=foo>", they will not be found in
Andrew> a search.
Andrew> Unfortunately, this only covers HTML documents.

 Hi Andrew,
 OK, but that is not enough. Let me elaborate a little: suppose we want to
 have two databases, one for our intranet that will index everything that is
 accessible from our local net, and the other for the `Externet' (the
 Internet) covering only what is accessible from the outside. Your solution
 does not work in this case because htdig will skip the document in both
 cases. The only solution we have for now is to run htdig under IP addresses
 matching the local/external setup (as explained in my previous message).
 Maybe a new META tag would help. BTW, it would be nice if htdig took a few
 other meta tags into account (this could go on the TODO list):
 new: name=DISTRIBUTION content="(external|extern)|(internal|intern|intranet|local)"
      Tells htdig what the distribution of the document is (htdig should know
      in which mode it is running; this would be a new option to add).
 - the following two are often used by other search engines -
 use: DESCRIPTION, htsearch should use what is in the description meta tag
      instead of the title of the document.
 use: ROBOTS - there is a patch for that, I think!
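
 To make the DISTRIBUTION idea concrete, here is a rough sketch in Python of
 how an indexer might decide whether to keep a document. None of this is
 htdig code: the function name should_index, the "internal"/"external" mode
 values and the rule that untagged documents go into both databases are all
 assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Values proposed for "internal" documents in the DISTRIBUTION meta tag.
INTERNAL = {"internal", "intern", "intranet", "local"}


class DistributionParser(HTMLParser):
    """Remembers the content of a <meta name=DISTRIBUTION ...> tag, if any."""

    def __init__(self):
        super().__init__()
        self.distribution = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "distribution":
            self.distribution = (attrs.get("content") or "").strip().lower()


def should_index(html, mode):
    """Return True if the document belongs in the database built in `mode`.

    `mode` is the hypothetical new run-time option: "internal" or "external".
    """
    if mode not in ("internal", "external"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "internal":
        # Assumption: the intranet database covers everything reachable locally.
        return True
    parser = DistributionParser()
    parser.feed(html)
    parser.close()
    # Assumption: untagged or externally tagged documents are public.
    return parser.distribution not in INTERNAL
```

 With this, a page carrying <meta name=DISTRIBUTION content="internal"> would
 be kept when building the intranet database and skipped when building the
 external one, which is what the noindex tag alone cannot do.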
 Another thing that could be changed IMHO is what is displayed in long format
 when no keyword is found in the description stored for a doc: the current
 "none of the keywords was found in the top of this document" is very
 confusing for users; they often think that ht://Dig is buggy and that it
 shows documents not matching the request. We have to find a new message for
 that; maybe "keywords were found in this document but no description is
 available" would be clearer? What do you think?

 Lastly, another thing to add could be general support for regexps instead
 of substrings (exclude-url, limit-urls-to ...).
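
 A small Python sketch of why regexps would help here. The two helper
 functions and the example URLs are hypothetical and only illustrate the
 difference in matching; they are not how htdig implements its exclude list.

```python
import re


def excluded_by_substring(url, patterns):
    """Substring matching: exclude the URL if any pattern occurs anywhere in it."""
    return any(p in url for p in patterns)


def excluded_by_regex(url, patterns):
    """Regexp matching: exclude the URL if any pattern matches somewhere in it."""
    return any(re.search(p, url) for p in patterns)


urls = [
    "http://www.example.com/a.pdf",
    "http://www.example.com/a.pdf.html",
]

for url in urls:
    print(url,
          excluded_by_substring(url, [".pdf"]),
          excluded_by_regex(url, [r"\.pdf$"]))
```

 The substring ".pdf" excludes both URLs, while the regexp \.pdf$ excludes
 only the one that really ends in .pdf.
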
 BTW, where can we find soundex, metaphone and endings rules for French (any
 other froggies out there? ;-))
 ht://Dig is a very useful and very well written tool, but it still needs a
 few small improvements to become the perfect search engine of our dreams.
 Thanks a lot for it, Andrew - I hope you'll have time to work on it again.


- heddy -
