Re: [htdig] Improving quality of AND search results


Garret W. Gengler (garretg@otable.com)
Tue, 18 May 1999 09:22:02 -0500


You're right... they shouldn't. I think I was actually seeing the results of an "or" search.

I had the search method set to "And". Changing it to all lowercase ("and") fixed the problem. I think htdig didn't recognize the capitalized value and was somehow executing an "or" search instead (even though I think the docs say it defaults to "and").

Thanks for the help,
-Garret
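
For anyone generating the search form from a script, here is a minimal sketch of guarding against the capitalization problem described above by normalizing the method value before it is submitted. The helper name normalize_method and the set of accepted values ("and", "or", "boolean") are assumptions for illustration, not something confirmed in this thread, so check them against your htdig version.

```python
# Hypothetical helper: normalize a search-method value before it is sent
# to the search CGI. The accepted values below are an assumption.
VALID_METHODS = ("and", "or", "boolean")


def normalize_method(value, default="and"):
    """Return a lowercase, whitespace-trimmed method name.

    Falls back to `default` when the cleaned value is not one of the
    assumed valid methods, so a typo never silently changes the search type.
    """
    cleaned = value.strip().lower()
    return cleaned if cleaned in VALID_METHODS else default


if __name__ == "__main__":
    for raw in ("And", " OR ", "bogus"):
        print(repr(raw), "->", normalize_method(raw))
```

Falling back to a fixed default is a design choice here; raising an error instead would make a misconfigured form easier to spot.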

----- Original Message -----
From: <plucas@frost.com>
To: Garret W. Gengler <garretg@otable.com>
Cc: <htdig@htdig.org>
Sent: Monday, May 17, 1999 5:36 PM
Subject: Re: [htdig] Improving quality of AND search results

> I may be missing something really obvious here but surely documents B and C
> should not show up at all in the results of an "AND" search if they do not
> contain any occurrences of one of the search terms.
>
> Paul Lucas
> Frost & Sullivan
>
> "Garret W. Gengler" <garretg@otable.com> on 05/17/99 02:21:40 PM
>
> To: htdig@htdig.org
> cc: (bcc: Paul Lucas/Electronic Delivery -
> FSCA/Mountain_View_CA/US/Frost & Sullivan)
>
> Subject: [htdig] Improving quality of AND search results
>
> I'm looking for a way to improve the quality of htdig search results as
> follows...
>
> If I do an "AND" search, htdig appears to rate documents by the total
> number of occurrences of any of the keywords. I'd like htdig to give the
> highest score to a document that contains every keyword, even if it
> contains each one only once. After that, it can fall back to the keyword
> count method.
>
> Here's an example... an AND search with the keywords "composite material
> skeleton"...
>
> Document A:
> "composite" occurs 1 time
> "material" occurs 2 times
> "skeleton" occurs 1 time
>
> Document B:
> "composite" occurs 15 times
> "material" occurs 2 times
> "skeleton" never occurs
>
> Document C:
> "composite" occurs 4 times
> "material" occurs 5 times
> "skeleton" never occurs
>
> In this example, I'd like Document A to get the highest score, then B, then
> C. Currently, htdig rates B highest, then C, then A.
>
> Are there any configuration directives that might help me to adjust this
> behavior?
>
> -Garret Gengler
> RoundTable Media
> garretg@otable.com
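
The three orderings discussed in this thread can be reproduced with a small sketch. It uses the keyword counts from the example above and models scoring as a plain sum of occurrences, which is a simplification for illustration and not htdig's actual scoring code. All function names are hypothetical.

```python
# Keyword counts copied from the Document A/B/C example in the message.
DOCS = {
    "A": {"composite": 1, "material": 2, "skeleton": 1},
    "B": {"composite": 15, "material": 2, "skeleton": 0},
    "C": {"composite": 4, "material": 5, "skeleton": 0},
}
TERMS = ["composite", "material", "skeleton"]


def total_count(counts, terms):
    """Total occurrences of all search terms in one document."""
    return sum(counts.get(t, 0) for t in terms)


def terms_matched(counts, terms):
    """Number of distinct search terms that occur at least once."""
    return sum(1 for t in terms if counts.get(t, 0) > 0)


def rank_or(docs, terms):
    """OR search: keep documents with any term, order by total count."""
    hits = [n for n, c in docs.items() if terms_matched(c, terms) > 0]
    return sorted(hits, key=lambda n: total_count(docs[n], terms), reverse=True)


def rank_and(docs, terms):
    """AND search: keep only documents containing every term."""
    hits = [n for n, c in docs.items() if terms_matched(c, terms) == len(terms)]
    return sorted(hits, key=lambda n: total_count(docs[n], terms), reverse=True)


def rank_coverage_first(docs, terms):
    """Requested behavior: most distinct terms first, then total count."""
    hits = [n for n, c in docs.items() if terms_matched(c, terms) > 0]
    return sorted(
        hits,
        key=lambda n: (terms_matched(docs[n], terms), total_count(docs[n], terms)),
        reverse=True,
    )


if __name__ == "__main__":
    print("or:            ", rank_or(DOCS, TERMS))
    print("and:           ", rank_and(DOCS, TERMS))
    print("coverage first:", rank_coverage_first(DOCS, TERMS))
```

Under this model the OR ordering is B, C, A (the ordering reported as the current behavior), a strict AND search returns only A, and the coverage-first ordering is A, B, C.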



