[htdig] META tags not working

Brett Hansen (brett@annis.com)
Mon, 24 May 1999 16:14:05 -0400

We are currently having an issue with HTDIG where the robots meta tags
just don't seem to work.
This is what we are trying to use: <meta name="robots" content="noindex,follow">

We run htdig with the correct parameters, and:
        1st -- The page itself gets indexed (even though we said noindex)
        2nd -- The links on the page are not followed
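
To spell out what we expect from the directive (content="noindex,follow", shown in full in indexme.html further down), here is a minimal Python sketch of the conventional reading of a robots meta tag: "noindex" keeps the page out of the index and "follow" still lets the crawler follow its links. This is only our understanding of the convention, not how htdig implements it; the names RobotsMetaParser and robots_policy are made up for the illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None for valueless attributes, hence "or".
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_policy(html):
    """Return (index_page, follow_links); both default to True when unspecified."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    index_page = not ({"noindex", "none"} & found)
    follow_links = not ({"nofollow", "none"} & found)
    return index_page, follow_links


# Hypothetical check against the tag from our start page.
index_page, follow_links = robots_policy(
    '<meta name="robots" content="noindex,follow">'
)
print("index page:", index_page, "follow links:", follow_links)
```

Under that reading, the start page should be skipped by the indexer while the two links on it are still crawled.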

This is our config file:

        database_dir: /net/testicsc/data/searchfiles
        database_base: ${database_dir}/junk


        start_url: http://test.icsc.org/srch/indexme.html

        limit_urls_to: http://test.icsc.org/srch/

        exclude_urls: /cgi-bin/ .cgi .19 .99 .98 /cases_imp/

        max_head_length: 10000

        search_algorithm: exact:1 synonyms:0.5 endings:0.1

        #stuff put in by mike to override the default header &footer stuff

        search_results_header: /net/netsitedocs/testicsc/searchheader.html
        search_results_footer: /net/netsitedocs/testicsc/searchfooter.html
        star_image: /graphics/star.gif
        star_blank: /graphics/star_blank.gif
        nothing_found_file: /net/netsitedocs/testicsc/searchnomatch.html

And here is the file we start indexing, called indexme.html:
        <meta name="robots" content="noindex,follow">
        <p>If you need something indexed that isn't pointed to from in srch, even
         it is in srch, add it here</p>
        <a href="/srch/cgi/memberprint?datafile=aprrer/current/index.html">Asia
        <a href="/srch/cgi/memberprint?datafile=logo/logo.html">logo page</a>
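
As a sanity check that our own config is not filtering these two links out, here is a rough Python sketch that resolves each href against start_url and applies limit_urls_to and exclude_urls. It assumes both attributes are plain substring matches on the full URL, which is our reading of the htdig documentation and may not match the real matching rules exactly; url_allowed is a made-up helper, not part of htdig.

```python
from urllib.parse import urljoin

# Values copied from the config file above.
START_URL = "http://test.icsc.org/srch/indexme.html"
LIMIT_URLS_TO = ["http://test.icsc.org/srch/"]
EXCLUDE_URLS = ["/cgi-bin/", ".cgi", ".19", ".99", ".98", "/cases_imp/"]


def url_allowed(url):
    """Assumed rule: URL contains a limit pattern and no exclude pattern."""
    within_limit = any(pattern in url for pattern in LIMIT_URLS_TO)
    excluded = any(pattern in url for pattern in EXCLUDE_URLS)
    return within_limit and not excluded


# The two hrefs from indexme.html.
hrefs = [
    "/srch/cgi/memberprint?datafile=aprrer/current/index.html",
    "/srch/cgi/memberprint?datafile=logo/logo.html",
]

for href in hrefs:
    url = urljoin(START_URL, href)
    print(url, "allowed" if url_allowed(url) else "rejected")
```

If the substring assumption holds, both links pass the filters (the path uses /cgi/, which matches neither /cgi-bin/ nor .cgi), so the config alone would not explain a link being skipped.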

These are the commands we type to run htdig:

        # htdig -v -i -u xxxxx:xxxxx -c /etc/testicschtdigjunk.conf

        New server: test.icsc.org, 80
        0:0:0:http://test.icsc.org/srch/indexme.html: ++ size = 373
        size = 14

        :2:1:http://test.icsc.org/srch/cgi/memberprint?datafile=logo/logo.html: ------------------------ size = 9143

        # htmerge -c /etc/testicschtdigjunk.conf

When we search for "asia", the indexme.html page is the only thing that
shows up in the results. Why is this? What am I doing wrong?

Note: If we use the meta tag <meta name="htdig-noindex">, the page doesn't
get indexed, but there is no equivalent "name" value that forces htdig to
follow the links.

Any help would be great!

