Re: [htdig] Htmerge: "Deleted, invalid"


Subject: Re: [htdig] Htmerge: "Deleted, invalid"
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Fri Jul 14 2000 - 08:38:34 PDT


According to D.J.Adams@soton.ac.uk:
> I've run htdig -vv followed by htmerge -vvv and I still cannot see
> any reason why htmerge decides, apparently arbitrarily, that a page is
> invalid. None of the reasons given above seem to fit.

If you run htdig -vv, without a -i option, and you have an existing
database, then htdig will run an update dig, not an initial dig, so it's
possible that it will reindex churchpage4.html and churchpage5.html,
but not the others. Are you certain that these two pages don't appear
elsewhere in the htdig or htmerge logs, or for that matter that you're
starting out without an existing database?

> So I try an experiment: I reduce limit_urls_to to include only the starting URL
> and http://www.tregalic.co.uk/sacred-heart/ and run htdig & htmerge.
...
> I do not accept that pages 4 & 5 just happened to be unavailable on the
> first occasion and available on the second. Nor can I see any
> differences in the htdig logs for these pages. The same sizes are
> reported in both cases.

If there's an existing database in the first case, but not the second,
that may be the cause of the discrepancy. To be certain, use the -i
option to htdig in all test cases, and let us know if it still finds
these two pages as "invalid".
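
For illustration, a clean test run along these lines could be scripted
roughly as follows. This is only a sketch: the helper names clean_dig and
find_page, the log file names, and the config path are assumptions made
for the example, not anything shipped with ht://Dig. Only the -i, -v and
-c options of htdig and htmerge are relied on.

```shell
#!/bin/sh
# Hypothetical helper: run an initial dig (-i discards any old database)
# followed by a merge, saving the verbose output of each to a log file.
# The argument is the path to your config file.
clean_dig() {
    conf=$1
    htdig -i -vv -c "$conf" > htdig.log 2>&1 &&
    htmerge -vvv -c "$conf" > htmerge.log 2>&1
}

# Hypothetical helper: print every line in the given log files that
# mentions the given page, prefixed with file name and line number.
# /dev/null is added so grep prints the file name even for one log.
find_page() {
    page=$1
    shift
    grep -n -F -- "$page" "$@" /dev/null
}

# Example use (not run here; the config path is a placeholder):
#   clean_dig /path/to/htdig.conf
#   find_page churchpage4.html htdig.log htmerge.log
#   find_page churchpage5.html htdig.log htmerge.log
```

Running the same two steps for every test case, always with -i, keeps an
old database from influencing the comparison, and the find_page output
shows every place the two pages turn up in either log.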

> I think there is a bug in htmerge 3.1.5 which causes it to declare
> some pages as "invalid" in some cases.

That may be, but I want to be sure we've ruled out every other possibility
first. I've never seen a bug report like this, so it would be very
unusual if it is indeed a bug showing up in your case, but not for
other users. If you can find a consistent test case that fails on
an initial dig, please provide details on your OS, version, config,
etc. so that we can look into this further.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
