Re: [htdig] Improving quality of AND search results

Garret W. Gengler
Tue, 18 May 1999 09:22:02 -0500

You're right... they shouldn't. I think I was actually seeing the results of an "or" search.

I had the search method set to "And"; changing it to all lowercase ("and") fixed the problem. I think htdig did not recognize the capitalized value and was somehow executing an "or" search instead (even though I think the docs say it defaults to "and").

Thanks for the help,

----- Original Message -----
From: <>
To: Garret W. Gengler <>
Cc: <>
Sent: Monday, May 17, 1999 5:36 PM
Subject: Re: [htdig] Improving quality of AND search results

> I may be missing something really obvious here but surely documents B and C
> should not show up at all in the results of an "AND" search if they do not
> contain any occurrences of one of the search terms.
> Paul Lucas
> Frost & Sullivan
> "Garret W. Gengler" <> on 05/17/99 02:21:40 PM
> To:
> cc: (bcc: Paul Lucas/Electronic Delivery -
> FSCA/Mountain_View_CA/US/Frost & Sullivan)
> Subject: [htdig] Improving quality of AND search results
> I'm looking for a way to improve the quality of htdig search results as
> follows...
> If I do an "AND" search, htdig appears to rate documents by the total
> count of all the keywords combined. I'd like htdig to give the
> highest score to a document that contains every keyword, even if it just
> contains one of each... then after that, it can start using the keyword
> count method.
> Here's an example... an AND search with the keywords "composite material
> skeleton"...
> Document A:
> "composite" occurs 1 time
> "material" occurs 2 times
> "skeleton" occurs 1 time
> Document B:
> "composite" occurs 15 times
> "material" occurs 2 times
> "skeleton" never occurs
> Document C:
> "composite" occurs 4 times
> "material" occurs 5 times
> "skeleton" never occurs
> In this example, I'd like Document A to get the highest score, then B, then
> C... Currently, htdig rates B highest, then C, then A.
> Are there any configuration directives that might help me to adjust this
> behavior?
> -Garret Gengler
> RoundTable Media
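
For reference, the ordering asked for in the quoted message can be sketched in a few lines of Python. This is not htdig code and not an htdig configuration option; the function name rank_documents and the dictionary layout are made up for illustration. It only shows the idea: sort first by how many distinct keywords a document contains, then fall back to the total keyword count.

```python
# Hypothetical sketch of the requested ranking; not part of htdig.
# Occurrence counts are taken from the example in the quoted message.
TERMS = ["composite", "material", "skeleton"]

DOCS = {
    "A": {"composite": 1, "material": 2, "skeleton": 1},
    "B": {"composite": 15, "material": 2, "skeleton": 0},
    "C": {"composite": 4, "material": 5, "skeleton": 0},
}


def rank_documents(docs, terms):
    """Return document names, best match first.

    Sort key: (number of distinct terms present, total occurrences).
    """
    def key(name):
        counts = docs[name]
        matched = sum(1 for t in terms if counts.get(t, 0) > 0)
        total = sum(counts.get(t, 0) for t in terms)
        return (matched, total)

    return sorted(docs, key=key, reverse=True)


def rank_by_total(docs, terms):
    """Rank by total occurrences only (the ordering described as current)."""
    return sorted(
        docs,
        key=lambda name: sum(docs[name].get(t, 0) for t in terms),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_documents(DOCS, TERMS))
    print(rank_by_total(DOCS, TERMS))
```

With the counts above, ranking by total alone puts B first, which matches what was reported, while the two-part key puts A first. In a strict "and" search B and C would be dropped entirely, so the two-part key only matters for an "or" search.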

