Subject: Re: [htdig] Not indexing a word
From: Geoff Hutchison
Date: Wed Feb 23 2000 - 09:58:55 PST

On Wed, 23 Feb 2000 wrote:

> Most of the large search engines I've seen no longer ignore short
> words and stopwords -- they just index everything. I realize it
> requires a lot more disk space (though there may be some clever ways
> around that), but it simplifies things both internally and for the
> end-users. That way, they can search for "To Be Or Not To Be" and
> find something!
> My rule for search engines is "no surprises", and I think there are
> enough legitimate instances of people needing to search two and even
> one-letter words that ht://Dig should allow that as an option.

It's *always* been an option. You can set minimum_word_length to 1 and it
will index everything. You can empty out the bad_words file (or stop words
for the rest of us) and it won't ignore any words.
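
To make the effect of those two settings concrete, here is a rough sketch
in Python. It is only a toy model of the filtering, not ht://Dig's actual
code: the function name, the whitespace tokenizer, and the one-word stop
list are all assumptions made up for illustration.

```python
def indexable_words(text, minimum_word_length=3, bad_words=frozenset()):
    """Return the lowercased words a toy indexer would keep.

    Hypothetical model: a word is dropped if it is shorter than
    minimum_word_length or if it appears in the bad_words (stop word) set.
    """
    kept = []
    for word in text.lower().split():
        if len(word) < minimum_word_length:
            continue  # too short to index
        if word in bad_words:
            continue  # listed as a stop word
        kept.append(word)
    return kept


phrase = "To Be Or Not To Be"

# Restrictive setup (assumed values): short words and "not" are dropped.
restrictive = indexable_words(phrase, minimum_word_length=3, bad_words={"not"})

# Permissive setup: minimum length 1 and an empty stop list keep everything.
permissive = indexable_words(phrase, minimum_word_length=1, bad_words=set())

print(restrictive)
print(permissive)
```

With the restrictive values nothing in the phrase survives, so a search for
it has nothing to match; with a minimum length of 1 and an empty stop list
all six words are kept.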

I don't know if this will become default behavior--I've had too many
private questions from people asking how little disk space and RAM they
can get away with. Of course, the force of public opinion can always sway
defaults. :-)

-Geoff Hutchison
Williams Students Online


This archive was generated by hypermail 2b28 : Wed Feb 23 2000 - 10:02:28 PST