[htdig] META tags not working


Brett Hansen (brett@annis.com)
Mon, 24 May 1999 16:14:05 -0400


We are currently having an issue with htdig where the meta tags just don't
seem to work.
This is what we are trying to use:

        <meta name="robots" content="noindex,follow">

We run htdig with the correct parameters, and:
        1st -- The page itself gets indexed (even though we said noindex).
        2nd -- The links on the page are not followed.
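
To make the expected behaviour concrete, here is a minimal Python sketch of how
we would expect a crawler to read the tag. It is not htdig's code; the class
and function names (RobotsMetaParser, robots_policy) are made up for
illustration, and the "both default to true" rule is an assumption based on
the usual robots META convention.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_policy(html):
    """Return (index, follow); both are assumed True unless the tag says otherwise."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    index = not ("noindex" in found or "none" in found)
    follow = not ("nofollow" in found or "none" in found)
    return index, follow


page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(robots_policy(page))
```

With "noindex,follow" the sketch reports that the page should be left out of
the index while its links are still followed, which is the behaviour we are
trying to get from htdig.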

This is our config file:

        database_dir: /net/testicsc/data/searchfiles
        database_base: ${database_dir}/junk

        #/var/lib/htdig

        start_url: http://test.icsc.org/srch/indexme.html

        limit_urls_to: http://test.icsc.org/srch/

        exclude_urls: /cgi-bin/ .cgi .19 .99 .98 /cases_imp/ /articles_imp/

        max_head_length: 10000

        search_algorithm: exact:1 synonyms:0.5 endings:0.1

        #stuff put in by mike to override the default header &footer stuff

        search_results_header: /net/netsitedocs/testicsc/searchheader.html
        search_results_footer: /net/netsitedocs/testicsc/searchfooter.html
        star_image: /graphics/star.gif
        star_blank: /graphics/star_blank.gif
        nothing_found_file: /net/netsitedocs/testicsc/searchnomatch.html

And here is the file we start indexing, called indexme.html:
        <html>
        <head>
              <meta name="robots" content="noindex,follow">
        </head>
        <body>
        <p>If you need something indexed that isn't pointed to from in srch, even though
         it is in srch, add it here</p>
        <a href="/srch/cgi/memberprint?datafile=aprrer/current/index.html">Asia Pacific
        Report</a>
        <a href="/srch/cgi/memberprint?datafile=logo/logo.html">logo page</a>
        </body>
        </html>
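
As a sanity check that the two links are not being rejected by the config
itself, the following Python sketch tests them against limit_urls_to and
exclude_urls. It assumes both attributes are matched as plain substrings of
the full URL, which is our reading of the htdig documentation rather than
something confirmed here; the helper name is_allowed is hypothetical.

```python
from urllib.parse import urljoin

# Values copied from the config file above.
LIMIT_URLS_TO = ["http://test.icsc.org/srch/"]
EXCLUDE_URLS = ["/cgi-bin/", ".cgi", ".19", ".99", ".98",
                "/cases_imp/", "/articles_imp/"]


def is_allowed(url):
    """Assumed rule: URL must contain a limit pattern and no exclude pattern."""
    within_limit = any(pattern in url for pattern in LIMIT_URLS_TO)
    excluded = any(pattern in url for pattern in EXCLUDE_URLS)
    return within_limit and not excluded


base = "http://test.icsc.org/srch/indexme.html"
links = [
    "/srch/cgi/memberprint?datafile=aprrer/current/index.html",
    "/srch/cgi/memberprint?datafile=logo/logo.html",
]

for href in links:
    url = urljoin(base, href)  # resolve the relative link against the start page
    print(url, is_allowed(url))
```

Under that assumption neither link matches an exclude pattern ("/srch/cgi/"
contains "/cgi" but not ".cgi" or "/cgi-bin/"), so the URL filters alone
should not stop htdig from following them.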

These are the commands we type to run htdig:

        # htdig -v -i -u xxxxx:xxxxx -c /etc/testicschtdigjunk.conf

        New server: test.icsc.org, 80
        0:0:0:http://test.icsc.org/srch/indexme.html: ++ size = 373
        1:1:1:http://test.icsc.org/srch/cgi/memberprint?datafile=aprrer/current/: size = 14
        :2:1:http://test.icsc.org/srch/cgi/memberprint?datafile=logo/logo.html: ------------------------ size = 9143

        # htmerge -c /etc/testicschtdigjunk.conf

When we do a search on "asia", the indexme.html page is the only thing that
shows up in the results. Why is this? What am I doing wrong?

Note: If we use the meta tag <meta name="htdig-noindex">, the page doesn't
get indexed, but there is no corresponding meta "name" to force htdig to
follow the links.

Any help would be great!

Brett Hansen
The Annis Group
Network Support Technician
email: brett@annis.com




This archive was generated by hypermail 2.0b3 on Mon May 24 1999 - 12:30:56 PDT