Re: htdig: Htsearch does not exclude bad_words
Wed, 13 May 1998 10:37:46 +0930 (CST)

On 12 May, John Lines wrote:
> Our documents which are indexed with htdig include some about the year 2000,
> and one of my users did a search for 'year 2000' and was surprised not to
> get anything back.
> I suspect that htdig excludes pure numbers from the words it collects, and
> so when he asked for 'year AND 2000' it didn't find anything - but that came
> as a bit of a surprise to the user. Htsearch can also be prevented from
> finding words which do really exist by including a word from the bad_words
> list in the search, for example 'free will' (assuming a database of
> philosophical texts).
> As a suggestion for a future enhancement it would be good if htsearch could
> identify that it was being asked to search for a noise word and either
> silently discard it, or better, tell the user 'ignored search for 2000'.
> John Lines

You can configure htdig to include numbers as words when it digs. You
need to add 'allow_numbers: true' to the htdig configuration file.
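
As a minimal sketch, assuming the stock htdig.conf is the file your
site actually passes to htdig, the entry would look like this:

    # htdig.conf (file name is an assumption; use your own config file)
    # Index purely numeric tokens such as 2000 as words
    allow_numbers: true

The setting applies when htdig digs, so documents that are already
indexed will most likely need to be dug again before a search for
'2000' returns anything.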


David Robley

WEBMASTER | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | AusEinet
Flinders University, ADELAIDE, SOUTH AUSTRALIA

