Geoff Hutchison (ghutchis@wso.williams.edu)
Wed, 20 Jan 1999 08:55:06 -0400
* List: htdig3-dev@sob.htdig.org
Here's a summary of rebuilding my databases from scratch, before and after
the StringMatch changes.
Before:
-rw-r--r-- 1 htdig htdig 199081984 Jan 19 05:44 db.docdb
-rw-r--r-- 1 htdig htdig 199081984 Jan 19 05:40 db.docdb.work
-rw-r--r-- 1 htdig htdig 8492032 Jan 19 05:40 db.docs.index
-rw-r--r-- 1 htdig htdig 122348000 Jan 19 05:31 db.wordlist.work
-rw-r--r-- 1 htdig htdig 112433152 Jan 19 05:31 db.words.db
(No run output available; both htdig and htmerge reported around 57,000 documents.)
After:
-rw-r--r-- 1 htdig htdig 90511360 Jan 20 07:45 db.docdb
-rw-r--r-- 1 htdig htdig 90511360 Jan 20 07:44 db.docdb.work
-rw-r--r-- 1 htdig htdig 3305472 Jan 20 07:43 db.docs.index
-rw-r--r-- 1 htdig htdig 38475835 Jan 20 07:41 db.wordlist.work
-rw-r--r-- 1 htdig htdig 37135360 Jan 20 07:41 db.words.db
htdig: Run complete
htdig: 1 server seen:
htdig: wso.williams.edu:80 52906 documents
htdig: Errors to take note of:
htmerge: Total word count: 86809
htmerge: Total documents: 22320
htmerge: Total doc db size (in K): 114880
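To put the two listings side by side, here is a minimal sketch in Python that works out how much each file shrank. It assumes nothing beyond the byte counts in the listings above. The dictionary and function names are illustrative only and are not part of ht://Dig.

```python
# File sizes in bytes, copied from the "Before" and "After" listings.
before = {
    "db.docdb": 199081984,
    "db.docdb.work": 199081984,
    "db.docs.index": 8492032,
    "db.wordlist.work": 122348000,
    "db.words.db": 112433152,
}
after = {
    "db.docdb": 90511360,
    "db.docdb.work": 90511360,
    "db.docs.index": 3305472,
    "db.wordlist.work": 38475835,
    "db.words.db": 37135360,
}


def shrink_percent(old, new):
    """Return how much smaller `new` is than `old`, as a percentage of `old`."""
    return 100.0 * (old - new) / old


for name in before:
    pct = shrink_percent(before[name], after[name])
    print(f"{name:18s} {pct:5.1f}% smaller")
```

Every file comes out at less than half its earlier size, which is a bigger drop than I would expect from the StringMatch changes alone.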
While I doubt there are any duplicate documents in the dbs after htmerge,
there seem to be *missing* documents: htdig saw 52906 documents, but htmerge
reports only 22320. Is anyone else concerned about the huge difference
between htdig and htmerge?
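The size of the gap can be worked out from the two counts in the run output. A rough sketch in Python, using only those two numbers (the variable names are illustrative):

```python
# Document counts copied from the "After" run output.
htdig_documents = 52906    # "wso.williams.edu:80 52906 documents"
htmerge_documents = 22320  # "htmerge: Total documents: 22320"

# Documents htdig saw that htmerge does not report.
missing = htdig_documents - htmerge_documents
missing_percent = 100.0 * missing / htdig_documents

print(f"{missing} documents unaccounted for ({missing_percent:.1f}% of what htdig saw)")
```

This only measures the discrepancy. It says nothing about whether the documents were dropped by htmerge or never written out by htdig.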
-Geoff
This archive was generated by hypermail 2.0b3 on Thu Feb 04 1999 - 22:13:08 PST