Re: htdig: Htsearch does not exclude bad_words


webmaster@www.nisu.flinders.edu.au
Wed, 13 May 1998 10:37:46 +0930 (CST)


On 12 May, John Lines wrote:
> Our documents which are indexed with htdig include some about the year 2000,
> and one of my users did a search for 'year 2000' and was surprised not to
> get anything back.
>
> I suspect that htdig excludes pure numbers from the words it collects, and
> so when he asked for 'year AND 2000' it didn't find anything - but that came
> as a bit of a surprise to the user. Htsearch can also be prevented from
> finding words which do really exist by including a word from the bad_words
> list in the search, for example 'free will' (assuming a database of
> philosophical texts).
>
> As a suggestion for a future enhancement it would be good if htsearch could
> identify that it was being asked to search for a noise word and either
> silently discard it, or better, tell the user 'ignored search for 2000'.
>
>
> John Lines
>

You can configure htdig to include numbers as words when it digs. You
need to add 'allow_numbers: true' to the htdig configuration file.
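
As a minimal sketch, assuming the configuration file is the usual
htdig.conf and that nothing else in it touches number handling, the
relevant entry would look something like this:

```
# htdig.conf (excerpt; the file name and location are assumptions)
# Treat purely numeric tokens such as "2000" as indexable words.
allow_numbers: true
```

Because the setting affects what htdig collects while digging, it should
only take effect once the documents have been dug again and the database
rebuilt.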

Cheers

-- 
David Robley

WEBMASTER | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet | http://auseinet.flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA
Visit the PHP mirror at http://au.php.net:81/
