Geoff Hutchison (ghutchis@wso.williams.edu)
Wed, 20 Jan 1999 11:13:53 -0500 (EST)
* List: htdig3-dev@sob.htdig.org
On Wed, 20 Jan 1999, Gilles Detillieux wrote:
> > While I doubt there are any duplicate documents in the dbs after htmerge,
> > there seem to be *missing* documents. Is anyone else concerned about the
> > huge difference between htdig and htmerge?
>
> Houston, we have a problem... :) Did you try the StringMatch patches in
> isolation? I'm wondering if the first or second patch is the problem, or
> both.
Alas, I tried them at the same time--I'm running the current CVS tree.
I'm going to start debugging by running just htdig, which returned a
number of documents in the right ballpark (based on link checking, I know
I have around 50,000 webpages).
Then I'm going to take a look at the db and put some debugging code into
htmerge.
Has anyone else noticed missing pages?
-Geoff