Re: htdig: Update digs using Apache?


webmaster@www.nisu.flinders.edu.au
Mon, 23 Mar 1998 11:51:46 +0930 (CST)


On 21 Mar, Geoff Hutchison wrote:
> Hi,
>
> Recently I've been looking at the logs to see exactly what htdig is
> retrieving. I normally run "update" digs (i.e. without the -i flag). But
> htdig retrieves *everything*, no matter whether it has changed or not.
>
> I think I see the problem but don't know enough about the HTTP protocol to
> make heads or tails. I'm running Apache (1.3b5, though I'm confident it
> doesn't matter what version). When htdig does an update dig, it does a
> "GET" command (off the topic, perhaps update digs should start with a HEAD
> command?). If it has a date available, it sends it in the request.
>
> If the server returns status "304" then htdig skips the document.
>
> My questions are:
> 1) What is status code 304?

From the HTTP 1.0 docs (should be the same for 1.1):

304 Not modified: If the client has performed a conditional GET request
and access is allowed, but the document has not been modified since the
date and time specified in the If-Modified-Since field, the server shall
respond with this status code and not send an Entity-Body to the client.
Header fields contained in the response should only include information
which is relevant to cache managers and which may have changed
independently of the entity's Last-Modified date. Examples of relevant
headers include Date, Server and Expires.
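
To make the rule above concrete, here is a minimal sketch in Python of the decision a server makes for a conditional GET. It is not Apache's code; the function name status_for_conditional_get and its arguments are made up for illustration, and it assumes last_modified is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_conditional_get(if_modified_since, last_modified):
    """Return 304 if the document is unchanged since the given date, else 200.

    if_modified_since: raw header value (str) or None if the header is absent.
    last_modified: timezone-aware datetime of the document's last change
    (an assumption of this sketch).
    """
    if if_modified_since is None:
        return 200  # plain GET: always send the body
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: treat as an unconditional GET
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # not modified: no Entity-Body is sent
    return 200
```

A document last changed on 20 Mar 1998 would give 304 for a request dated 21 Mar and 200 for one dated 19 Mar.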

> 2) Does Apache support the "If-Modified-Since:" query?

Looks like it from the source (http_protocol.c)

> 3) Do people doing update digs see debug messages of "not changed" when
> running in verbose mode? (I don't.)

Can't remember :-) been a while since I ran one in verbose mode. You
could try -vv?

I don't think using HEAD would be a good idea, from my skimming of the
HTTP protocol. It appears that if an If-Modified-Since header is included
with a HEAD request, it is ignored. Seems like htdig is capable of
sending an If-Modified-Since and a date, so it would have to use GET for
that to work. I suspect that the date of each document, as it was
indexed last, is kept in the document database so that date can be used
in the If-Modified-Since request.
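
Roughly what I imagine the client side looks like, as a Python sketch rather than htdig's actual code. The names build_update_request and should_reindex are hypothetical, and last_indexed stands in for whatever date the document database keeps (assumed here to be a UTC datetime).

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def build_update_request(url, last_indexed=None):
    """Build a GET request, conditional if a stored date is available.

    last_indexed: timezone-aware UTC datetime from the (hypothetical)
    document database, or None for a document never indexed before.
    """
    request = Request(url, method="GET")
    if last_indexed is not None:
        # usegmt=True yields the HTTP date form, e.g. "... 12:00:00 GMT"
        request.add_header(
            "If-Modified-Since", format_datetime(last_indexed, usegmt=True)
        )
    return request


def should_reindex(status):
    """A 304 means the stored copy is still current, so skip the document."""
    return status != 304
```

If you actually send such a request with urllib.request.urlopen, note that a 304 reply comes back as an HTTPError with code 304 rather than as a normal response, so the status has to be read from the exception.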

Cheers

-- 
David Robley

WEBMASTER                          | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet                           | http://auseinet/flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:25:50 PST