Re: [htdig3-dev] Problem: same urls in doc database

From: Valdas Andrulis
Date: Mon Feb 21 2000 - 09:59:14 PST


It is easier to set up a web server with one document yourself than for
me to explain all of this. The document does not have to be dynamic; you
can change its modification date by hand.
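
For anyone reproducing this, changing the modification date "by hand" can
be as simple as touching the file between two digs. A minimal sketch in
Python, assuming the single document is a plain file under the web
server's document root; the helper name bump_mtime and any path passed to
it are illustrative only and not part of ht://Dig:

```python
import os


def bump_mtime(path, seconds=60):
    """Move the file's modification time forward by `seconds`.

    Hypothetical helper for the test setup: it only makes a static
    document look "changed" to the web server. Returns the new mtime.
    """
    st = os.stat(path)
    new_mtime = st.st_mtime + seconds
    # Keep the access time as it was, change only the modification time.
    os.utime(path, (st.st_atime, new_mtime))
    return os.stat(path).st_mtime
```

Calling bump_mtime on the document before each run should make the server
report a newer Last-Modified value, which is all the test needs.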

I have tried a one-server, one-document setup with the latest CVS code
and see the same weird behavior.



On Mon, 21 Feb 2000, Geoff Hutchison wrote:

GH> On Mon, 21 Feb 2000, Valdas Andrulis wrote:
GH> > Also, I have 2 servers: one has about 50 documents, all static, and
GH> > one server has only one document, which is dynamic.
GH> ...
GH> > document count increases by one, i.e. at first there is one
GH> > http://anotherserver/ URL, then two URLs, three URLs, etc.
GH> Hmm. Could you try changing one of the static URLs? The databases changed
GH> quite a bit, forcing a rewrite of how documents are purged. I'm curious to
GH> know if there's a bug in this new code.
GH> > I think there is something with the last record in the database. When
GH> > pushing URLs already in the database, every second run it doesn't push
GH> > the URL http://anotherserver/ from the database, but instead from
GH> > start_urls, i.e. there is no such URL in the database! This is weird!?
GH> There's usually another "last record" where ht://Dig keeps some special
GH> information (namely the number of records!) so I'm not sure it's that
GH> exactly. I'm not quite sure how you know whether it's coming from
GH> start_urls or the database. From what you said, it will load the
GH> first server, grab 30 pages from it, then go to the other server. It will
GH> do this whether you get it from the database or not.
GH> The question is whether it's sending an If-Modified-Since header. It will
GH> do this if it has the document in the DB. I don't know how much debugging
GH> you need to turn on to see that, I think it's somewhere around -vvvvv.
GH> Also, have you tried setting a config file for just this server? Do you
GH> see the same behavior if it's the only URL in the database?
GH> -Geoff
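
For reference, the If-Modified-Since check Geoff mentions works roughly
like this on the server side. The following is a sketch of generic HTTP
conditional-GET behavior in Python, not ht://Dig's or any particular
server's code; the function name should_send_304 is made up for
illustration:

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def should_send_304(if_modified_since, last_modified_ts):
    """Return True if a server may answer "304 Not Modified".

    if_modified_since: raw header value sent by the client, or None.
    last_modified_ts:  the document's modification time (Unix timestamp).
    """
    if not if_modified_since:
        # No header: the client has no stored copy, send the full document.
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable date: ignore the header and send the document.
        return False
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop fractional seconds.
    last_modified = datetime.fromtimestamp(int(last_modified_ts), tz=timezone.utc)
    return last_modified <= since


if __name__ == "__main__":
    ts = 951155954  # arbitrary example timestamp
    header = formatdate(ts, usegmt=True)
    print(should_send_304(header, ts))       # document unchanged
    print(should_send_304(header, ts + 60))  # document modified afterwards
```

If the digger has the document in its database, it would be expected to
send this header; a request without it suggests the URL was treated as
new, which is the symptom under discussion.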

