Re: [htdig] index always scores 100

Subject: Re: [htdig] index always scores 100
From: Gilles Detillieux
Date: Thu Sep 07 2000 - 10:58:51 PDT

According to Geoff Hutchison:
> At 9:15 AM -0500 9/5/00, Ted Stresen-Reuter wrote:
> >If you want, go to and enter the
> >word "kraft" as the search term and you'll see what I mean. I've tried
> >deleting the databases and indexing again, but I still got the same
> >results....
> So here's the answer. I pored through your verbose output and found
> a few links like this:
> href: (Published: March 1998 Kraft
> Foods, Inc. names Amina Dickerson ...)
> So this is where it's getting "Kraft"--from the link text. You can
> turn this off using description_factor since it doesn't seem to be
> working very well in your case. Usually the text of links is fairly
> accurate as a description of the page (or it's so general that it's
> not likely to show up in searches like "click here.")
> In any case, the combination of this and possibly backlink_factor are
> probably the reason you're getting these "phantom" matches.
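
For reference, a minimal sketch of what turning these off might look like
in the ht://Dig configuration file. It assumes that setting a factor to 0
removes that source of weight, and that the two attributes Geoff names are
the only ones involved; the values are illustrative and not taken from
Ted's actual setup.

```
# Assumed: a weight of 0 stops link description text from
# contributing to the score of the linked-to document.
description_factor: 0

# Assumed: a weight of 0 stops backlink counts from
# influencing the ranking.
backlink_factor: 0
```

Depending on the ht://Dig version, the databases may need to be rebuilt
before a change like this shows up in search results.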

It's strange that I didn't find any documents containing links like
the one above when I searched for "kraft" on his web site. Do these
documents contain any <meta name="robots" content="noindex,follow">
tags, or does his search form use a hidden "restrict" or "exclude" field
that I didn't notice? My understanding is that link description text
is supposed to appear in the index for both the hyperlinked document,
using description_factor, and the document containing the link, using

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930


This archive was generated by hypermail 2b28 : Thu Sep 07 2000 - 11:00:48 PDT