Re: [htdig] Speed of Indexing and size of Index


Subject: Re: [htdig] Speed of Indexing and size of Index
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Wed Jun 07 2000 - 06:48:25 PDT


At 11:23 AM -0700 6/6/00, amit tewari wrote:
>I am trying to evaluate htDig for the speed of
>indexing, size of indexes and speed of search,

OK, you also need to mention what version you are using, since this
makes a difference. :-) For example, the beta and developmental 3.2
code is slower at indexing right now, has smaller databases and
faster search.

>Size of the index directory was about 300 MB (which is
>about 1/3 the size of the documents). I find it too
>big, especially if I consider some other engines which
>give an index size of about 1/10 of the document size

Again, this depends on version. Also, you might want to turn on
excerpt compression with:
compression_level: 6
# or higher--this essentially turns on gzip compression
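
To get a rough feel for what a setting like this buys you, here is a minimal Python sketch that runs zlib (the library behind gzip) over a block of text at level 0 and level 6. The excerpt string is made up for illustration, and this only approximates what ht://Dig does with its stored excerpts; real savings depend on your documents.

```python
import zlib

# Hypothetical excerpt text; real excerpts come from the indexed documents.
excerpt = ("ht://Dig indexes the documents on a web site and stores a short "
           "excerpt of each page for display in search results. ") * 20
raw = excerpt.encode("utf-8")

# Level 0 stores the data uncompressed; level 6 is zlib's usual
# trade-off between speed and size.
stored = zlib.compress(raw, 0)
packed = zlib.compress(raw, 6)

print(len(raw), len(stored), len(packed))

# Compression is lossless: the excerpt is recovered exactly.
assert zlib.decompress(packed).decode("utf-8") == excerpt
```

Repetitive text like this sample compresses far better than typical pages, so treat the printed sizes as an upper bound on the benefit.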

Also, these "indexes" include much more than the indexes themselves,
though you've certainly cut back on the stored excerpts drastically
with your max_head_length setting. Some search engines simply store a
word index, which you'll note is quite small.
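
The difference between a bare word index and the stored excerpts can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: build_word_index and build_excerpts are hypothetical helpers, not ht://Dig functions, and the on-disk databases look nothing like these dictionaries. The max_head_length argument merely mimics the effect of the configuration attribute of the same name.

```python
import re
from collections import defaultdict

def build_word_index(docs):
    """Map each lowercase word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def build_excerpts(docs, max_head_length):
    """Keep only the first max_head_length characters of each document."""
    return {doc_id: text[:max_head_length] for doc_id, text in docs.items()}

# Two tiny made-up documents.
docs = {
    1: "Speed of indexing and size of index",
    2: "Size of the index directory",
}

word_index = build_word_index(docs)                   # what a "word index only" engine keeps
excerpts = build_excerpts(docs, max_head_length=10)   # the extra data stored per document

print(word_index["index"])
print(excerpts)
```

The word index grows with the vocabulary, while the excerpt store grows with the number of documents times the excerpt length, which is why trimming the excerpt length has such a direct effect on the total size.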

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



