[htdig3-dev] Re: [htdig3-dev] StringMatch and duplicate documents


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Wed, 20 Jan 1999 09:59:02 -0600 (CST)


* List: htdig3-dev@sob.htdig.org

According to Geoff Hutchison:
> Here's a summary of rebuilding my databases from scratch, before and after
> the StringMatch changes.
>
> Before:
..
> (No run output available, around 57,000 documents from both htdig and htmerge)
>
> After:
..
> htdig: Run complete
> htdig: 1 server seen:
> htdig: wso.williams.edu:80 52906 documents
> htdig: Errors to take note of:
>
> htmerge: Total word count: 86809
> htmerge: Total documents: 22320
> htmerge: Total doc db size (in K): 114880
>
>
> While I doubt there are any duplicate documents in the dbs after htmerge,
> there seem to be *missing* documents. Is anyone else concerned about the
> huge difference between htdig and htmerge?

Houston, we have a problem... :) Did you try the StringMatch patches in
isolation? I'm wondering if the first or second patch is the problem, or
both.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930


