Re: [htdig3-dev] Problem: same urls in doc database


Subject: Re: [htdig3-dev] Problem: same urls in doc database
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Mon Feb 21 2000 - 08:27:26 PST


On Mon, 21 Feb 2000, Valdas Andrulis wrote:

> Also, I have 2 servers: one has about 50 documents, all static, and
> the other has only one document, which is dynamic.
...
> document count increases by one, i.e. at first there is one
> http://anotherserver/ URL, then two URLs, three URLs, etc.

Hmm. Could you try changing one of the static URLs? The databases changed
quite a bit, forcing a rewrite of how documents are purged. I'm curious to
know if there's a bug in this new code.

> I think there is something wrong with the last record in the database. When
> pushing URLs already in the database, on every second run it doesn't push
> http://anotherserver/ from the database but instead from start_urls, i.e.
> there is no such URL in the database! This is weird!?

There's usually another "last record" where ht://Dig keeps some special
information (namely the number of records!) so I'm not sure it's that
exactly. I'm not quite sure how you know whether it's coming from
start_urls or the database. From what you said, it will load the
first server, grab 30 pages from it, then go to the other server. It will
do this whether you get it from the database or not.

The question is whether it's sending an If-Modified-Since header. It will
do this if it has the document in the DB. I don't know how much debugging
you need to turn on to see that; I think it's somewhere around -vvvvv.
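
For illustration, the conditional-request behavior described above can be
sketched in Python. This is not ht://Dig's code, only a rough model under
stated assumptions: the function names, the User-Agent string, and the
idea of passing in a "last indexed" timestamp are all hypothetical. The
header is added only when the document is already known, and a 304 reply
means the stored copy can be kept.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def conditional_headers(last_indexed: Optional[datetime]) -> Dict[str, str]:
    """Build request headers for a fetch (hypothetical helper).

    If-Modified-Since is added only when the document is already known,
    i.e. when a timestamp for it exists in the document database.
    """
    headers = {"User-Agent": "example-indexer/0.1"}  # assumed agent name
    if last_indexed is not None:
        # HTTP dates are always expressed in GMT.
        stamp = last_indexed.astimezone(timezone.utc)
        headers["If-Modified-Since"] = format_datetime(stamp, usegmt=True)
    return headers


def is_unchanged(status: int) -> bool:
    """A 304 Not Modified reply means the stored copy is still current."""
    return status == 304
```

If the header never shows up in the debugging output for a URL, that would
suggest the indexer did not find that URL in its database on that run.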

Also, have you tried setting a config file for just this server? Do you
see the same behavior if it's the only URL in the database?

-Geoff




This archive was generated by hypermail 2b28 : Mon Feb 21 2000 - 08:30:51 PST