Re: [htdig] How does the DIG the ranking?


Subject: Re: [htdig] How does the DIG the ranking?
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Wed Mar 08 2000 - 07:23:26 PST


At 3:18 PM +0100 3/8/00, Thomas Albl wrote:
>BTW Is there a description how the DIG generates the ranking value (in
>accordance with the algorithms used)?

Not really, though I'll probably be writing about that to the
developer list this weekend as part of my series of overviews.

Here are the basics. When indexing, each occurrence of a word is
weighted by its context, using the appropriate *_factor attribute. In
versions before 3.2, ht://Dig also weights by the location of the
word, based on its character position scaled to the range 1-1000. (I
might have the details a bit off, I haven't looked at the 3.1 code in
a while.) This has an inverse linear relationship--the closer to the
top, the higher the score.

Then the scores of all occurrences of a given word are added up for a
document, and the total is stored in the word database.
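
As a rough illustration of the pre-3.2 scheme described above, here is
a minimal Python sketch. It is not taken from the ht://Dig source: the
factor values, the function names, the exact 1-1000 scaling formula,
and the choice to multiply the context factor by the location weight
are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stand-ins for the *_factor attributes.
# The values are illustrative only, not ht://Dig defaults.
FACTORS = {"title": 10.0, "heading": 5.0, "text": 1.0}


def scaled_location(char_pos, doc_length):
    """Map a character offset to the range 1-1000 (assumed scaling)."""
    if doc_length <= 1:
        return 1
    return 1 + (char_pos * 999) // (doc_length - 1)


def occurrence_score(context, char_pos, doc_length):
    """Score one occurrence: context factor times location weight."""
    loc = scaled_location(char_pos, doc_length)
    # Inverse linear: 1.0 at the top of the document, 0.001 at the bottom.
    location_weight = (1001 - loc) / 1000.0
    return FACTORS[context] * location_weight


def score_document(occurrences, doc_length):
    """Sum per-occurrence scores for each word in one document.

    occurrences is an iterable of (word, context, char_pos) tuples.
    The returned dict is what would go into the word database.
    """
    totals = defaultdict(float)
    for word, context, char_pos in occurrences:
        totals[word] += occurrence_score(context, char_pos, doc_length)
    return dict(totals)
```

With these assumed weights, a word in the title at the very top of a
document contributes far more than the same word in body text near the
end, which matches the "closer to the top, higher score" behavior.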

The picture is fairly different in 3.2, but that's not really your question.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



