Re: [htdig] index always scores 100


Subject: Re: [htdig] index always scores 100
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Thu Sep 07 2000 - 10:58:51 PDT


According to Geoff Hutchison:
> At 9:15 AM -0500 9/5/00, Ted Stresen-Reuter wrote:
> >If you want, go to http://www.chicagophilanthropy.com/search/ and enter the
> >word "kraft" as the search term and you'll see what I mean. I've tried
> >deleting the databases and indexing again, but I still got the same
> >results....
>
> So here's the answer. I pored over your verbose output and found
> a few links like this:
>
> href: http://www.chicagophilanthropy.com/ (Published: March 1998 Kraft
> Foods, Inc. names Amina Dickerson ...)
>
> So this is where it's getting "Kraft"--from the link text. You can
> turn this off using description_factor since it doesn't seem to be
> working very well in your case. Usually the text of links is fairly
> accurate as a description of the page (or it's so general that it's
> not likely to show up in searches like "click here.")
>
> In any case, the combination of this and possibly backlink_factor are
> probably the reason you're getting these "phantom" matches.

It's strange that my search for "kraft" on his web site didn't turn up
any documents containing links like the one above. Do these documents
contain any <meta name="robots" content="noindex,follow"> tags, or does
his search form use a hidden "restrict" or "exclude" field that I didn't
notice? My understanding is that link description text is supposed to
be indexed for both the hyperlinked document, weighted by
description_factor, and the document containing the link, weighted by
text_factor, so the pages containing those links should match as well.
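
For reference, here is a minimal sketch of how the attributes mentioned
above might look in the htdig configuration file. The values are only
illustrative assumptions for testing, not a recommendation, and the
comments reflect my understanding of the attributes rather than a
statement of how any particular release behaves:

```
# Illustrative fragment only; values are assumptions chosen for testing.
# A factor of 0 should stop that source from contributing to the score.

# Weight given to link text credited to the document being linked to.
description_factor: 0

# Weight given to the number of links pointing at a document.
backlink_factor: 0

# Weight given to ordinary body text, which as I understand it includes
# link text in the document that contains the link.
text_factor: 1
```

Depending on the ht://Dig version, a change to these factors may only
take effect after the databases are rebuilt, so it is safest to reindex
before comparing scores.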

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



This archive was generated by hypermail 2b28 : Thu Sep 07 2000 - 11:00:48 PDT