Re: [htdig] Last-Modified


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Fri, 15 Oct 1999 11:57:32 -0500 (CDT)


According to Jim Cole:
> I have been tracking down a problem involving pages that HAVE been
> modified being pulled by HtDig and then getting passed over with a
> "retrieved but not changed" message. After looking at the code and
> manually poking the server in question, it looks like this problem
> is due to the server not returning a Last-Modified header.
>
> Is it true that doc->ModTime() and ref->DocTime() will always be
> the same if the server is not returning a Last-Modified header? And
> thus deciding that the document has not changed?
>
> If so, should modification_time_is_now: true fix this problem?

Yes, if you're running htdig 3.1.3, or the 3.2 development code.
The handling of modification_time_is_now was broken in 3.1.2 and earlier.
However, with modification_time_is_now: true, htdig will always assume
the document has just been modified if it doesn't get a Last-Modified
header, so it will always reindex everything on that site. If you have
any control over that server, I'd recommend dumping the defective HTTP
server and replacing it with one that does put out Last-Modified headers.
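
To see the trade-off concretely, here is a minimal Python sketch of the
decision described above. It is a simplified model, not htdig's actual
code: the names effective_mod_time and fetch_headers are hypothetical,
and returning None for "no usable timestamp" is an assumption made for
illustration only.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def effective_mod_time(headers, modification_time_is_now, now=None):
    """Return the modification time a crawler might record for a document.

    Simplified model (an assumption, not htdig's real logic):
    - if the server sent Last-Modified, parse and use it;
    - otherwise, if modification_time_is_now is set, use the current time,
      which makes the document look freshly changed on every run;
    - otherwise return None, meaning no usable timestamp is available.
    """
    value = headers.get("Last-Modified")
    if value:
        return parsedate_to_datetime(value)
    if modification_time_is_now:
        return now or datetime.now(timezone.utc)
    return None


def fetch_headers(url):
    """Send a HEAD request and return the response headers (hypothetical helper)."""
    with urlopen(Request(url, method="HEAD")) as resp:
        return resp.headers
```

Calling effective_mod_time(fetch_headers(url), True) against the server
in question would show whether a real Last-Modified value comes back or
whether the fallback to the current time kicks in. In the fallback case
the time differs on every run, which is why everything on that site gets
reindexed.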

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



