Re: [htdig] how does the dig the ranking ?


Subject: Re: [htdig] how does the dig the ranking ?
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Mon May 15 2000 - 21:27:52 PDT


At 12:22 PM -0500 5/10/00, Gilles Detillieux wrote:
> > Once Again: Can anyone tell me how the ranking is calculated by the
> > dig-algorithm? is there a formula? does it matter at which position in the
> > doc the search-term is found?
>
>In the 3.1.x series, position does matter. Words at the top of the document
>are ranked higher than words closer to the end. My understanding is that
>this is no longer the case in the 3.2 betas. As for the actual formulae,
>I don't really know. Perhaps someone else can shed more light?

Sorry I didn't get to this sooner. The actual formula is a bit
complicated. In 3.1.x, the score a word contributes to a document is
something like this:

score = Sum(all occurrences)
        [1000-(word location)/1000] * _factor

where _factor is the appropriate factor for the given word (e.g.
text_factor, keywords_factor, ...)
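
As a rough illustration of the 3.1.x formula above, here is a minimal Python sketch. It is not taken from the ht://Dig source. It assumes that the word location is a value from 0 (top of the document) to 1000 (end), and it reads the bracketed term as (1000 - location) / 1000, which is one possible interpretation of the formula as written. The function names and the factor values are hypothetical.

```python
SCALE = 1000  # assumed range of a normalized word location (0 = top of document)


def occurrence_weight(location, factor, scale=SCALE):
    """Weight of one occurrence: earlier words count more, scaled by the factor."""
    return (scale - location) / scale * factor


def word_score(occurrences, factors):
    """Sum the weights of all occurrences of a word.

    occurrences: iterable of (location, kind) pairs
    factors: mapping from kind to its weighting factor
    """
    return sum(occurrence_weight(loc, factors[kind]) for loc, kind in occurrences)


# Hypothetical factor values, chosen only for illustration.
factors = {"text": 1.0, "keywords": 100.0}

# One word found at the top of the body, halfway down, and in the keywords.
occurrences = [(0, "text"), (500, "text"), (250, "keywords")]
score = word_score(occurrences, factors)
```

Under these assumptions, an occurrence at the very top of the document gets the full factor and one at the very end gets nothing, which matches the observation that words near the top rank higher in 3.1.x.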

It's a little more complicated when you consider backlink_factor and
date_factor, but that's the core of it.

The 3.2 betas are a bit more complicated and there are still some
scoring issues to be cleaned up, but the result should be much better.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



