Subject: [htdig] db sizes
From: justin (email@example.com)
Date: Sun Aug 06 2000 - 19:03:00 PDT
I have htdig running perfectly now. It is updating the index without
re-reading all files :) The only problem I am having is that the db
files are very large. These are the db files for ~600M of archived html
search_algorithm: exact:1 synonyms:0.5 endings:0.1
to just exact:1 make the db any smaller?
I also suspect the db files are large not because of htdig but because of
the email itself. I used postal, an SMTP benchmark, to send the 600M of
mail. Postal does not send English words but random ASCII garbage.
Could this be why the db files are so large?
Attached is a sample HTML email generated from postal.
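
As a rough way to picture the concern above, here is a minimal Python sketch that compares how many distinct tokens appear in random ASCII text and in text drawn from a small fixed vocabulary. It is only a toy model: it assumes that a word index grows with the number of distinct words it has to store, and it says nothing about how htdig actually tokenizes documents or lays out its db files. The function names, the sample vocabulary, and the token lengths are all made up for the illustration.

```python
import random
import string


def unique_tokens(text):
    """Return the set of distinct whitespace-separated tokens, lowercased."""
    return {tok.lower() for tok in text.split()}


def random_garbage(n_tokens, rng, min_len=3, max_len=10):
    """Build n_tokens random alphanumeric strings (hypothetical stand-in
    for the random ASCII bodies a benchmark tool might generate)."""
    alphabet = string.ascii_letters + string.digits
    return " ".join(
        "".join(rng.choice(alphabet) for _ in range(rng.randint(min_len, max_len)))
        for _ in range(n_tokens)
    )


def natural_like(n_tokens, rng, vocabulary):
    """Build n_tokens words drawn from a small, fixed vocabulary,
    mimicking how real prose reuses the same words."""
    return " ".join(rng.choice(vocabulary) for _ in range(n_tokens))


rng = random.Random(0)  # fixed seed so repeated runs behave the same way

# Hypothetical sample vocabulary; real English text has far more words,
# but still reuses them heavily.
vocabulary = [
    "the", "index", "file", "mail", "search", "word", "list", "size",
    "large", "small", "server", "message", "archive", "update", "read",
    "send", "test", "data", "page", "link",
]

n = 5000
garbage_vocab = len(unique_tokens(random_garbage(n, rng)))
natural_vocab = len(unique_tokens(natural_like(n, rng, vocabulary)))

print("distinct tokens in random text:     ", garbage_vocab)
print("distinct tokens in vocabulary text: ", natural_vocab)
```

In this toy model almost every random token is new, while the vocabulary-based text never exceeds the size of its word list, so the same volume of input yields a far larger set of distinct words to index.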
This archive was generated by hypermail 2b28 : Sun Aug 06 2000 - 21:02:51 PDT