Re: [htdig] is it possible to limit which section of the


Geoff Hutchison (ghutchis@wso.williams.edu)
Fri, 28 May 1999 19:47:45 -0400


At 2:59 PM -0400 5/26/99, Joseph Cheek wrote:
>is this a bug, or by design? i would expect update digging to reindex the
>pages, since they had been modified since the initial dig.

Indeed it should.

>as further consequence of this update-vs-initial dig problem, i have a web
>site that continually adds new pages to the site. update digs never see the

This is very odd. It sounds like there's some sort of miscommunication
between your server and ht://Dig: htdig seems to be assuming that these
documents are unchanged when they have, in fact, changed on disk.

So... My usual $0.02: try running htdig with some debugging turned on. In
this case, I'd go for 'htdig -v', which should show you whether documents
are reparsed and, if not, what their status is. Basically, there are three
possible responses for a document:

1) Document has been downloaded and parsed. Characters such as +, * and -
are all indications of parts of the parsing:
0:2:0:http://www.htdig.org/: ++ size = 373

2) Document was in database, htdig sent If-Modified-Since header, and
server sent an 'unchanged' response.
0:2:0:http://www.htdig.org/: Not changed

3) Document was in database, htdig sent If-Modified-Since header, and
server sent document. Htdig did not reindex it because the Last-Modified
header was the same as the date in the database.
0:2:0:http://www.htdig.org/: Retrieved but not changed
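
If the -v output is long, a small script can sort the lines into these
three cases. What follows is only a rough sketch in Python, not part of
ht://Dig: the function name and labels are made up for illustration, and
it assumes every status line has the same shape as the three samples above
(three colon-separated numbers, the URL, then the status text).

```python
import re

# Hypothetical labels for the three cases listed above.
PARSED = "parsed"
NOT_CHANGED = "not changed"
RETRIEVED_NOT_CHANGED = "retrieved but not changed"
UNKNOWN = "unknown"

# Assumed line shape, taken from the sample lines only:
#   <number>:<number>:<number>:<url>: <status text>
# The meaning of the three numbers is not interpreted here.
LINE_RE = re.compile(r"^\d+:\d+:\d+:(?P<url>\S+): (?P<status>.*)$")


def classify(line):
    """Return (url, label) for a status line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    status = match.group("status").strip()
    if status == "Not changed":
        label = NOT_CHANGED
    elif status == "Retrieved but not changed":
        label = RETRIEVED_NOT_CHANGED
    elif "size =" in status:
        # Case 1 in the samples ends with "size = <n>" after the parse marks.
        label = PARSED
    else:
        label = UNKNOWN
    return match.group("url"), label


if __name__ == "__main__":
    import sys
    # Usage (assumed): htdig -v 2>&1 | python classify_dig.py
    for raw in sys.stdin:
        result = classify(raw)
        if result is not None:
            print("%-28s %s" % (result[1], result[0]))
```

Any new or modified page that shows up under the second or third label is
one to look at more closely.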

So I'm wondering if you're seeing #2 and #3 when you should be seeing #1...
What are the responses to your new pages?
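
To see what the server itself says, independent of htdig, you could send
the If-Modified-Since header by hand. The Python sketch below is only an
approximation of the exchange described in cases 2 and 3, not htdig's
actual code; the helper names are made up, and the date you pass in stands
in for whatever date is stored in your database.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def http_date(epoch_seconds):
    """Format a Unix timestamp as an HTTP date string in GMT."""
    return formatdate(epoch_seconds, usegmt=True)


def describe(status, last_modified, since):
    """Map a server reply onto the three cases (hypothetical helper)."""
    if status == 304:
        return "Not changed"
    if status == 200 and last_modified == since:
        # Server ignored the conditional but reports the same date.
        return "Retrieved but not changed"
    if status == 200:
        return "Retrieved, would be reparsed"
    return "Unexpected status %d" % status


def check(url, epoch_seconds):
    """Send a conditional GET and describe the reply. Needs network access."""
    since = http_date(epoch_seconds)
    request = urllib.request.Request(url, headers={"If-Modified-Since": since})
    try:
        with urllib.request.urlopen(request) as response:
            return describe(response.status,
                            response.headers.get("Last-Modified"), since)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for non-2xx replies, including 304.
        return describe(err.code, err.headers.get("Last-Modified"), since)
```

Calling check() on one of the new pages with the time of your initial dig
should not come back as "Not changed" if the page really was modified
after that.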

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/




This archive was generated by hypermail 2.0b3 on Fri May 28 1999 - 16:11:56 PDT