Re: [htdig] databases (3.1.5)


Subject: Re: [htdig] databases (3.1.5)
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Fri Mar 17 2000 - 08:54:14 PST


According to Sphboc@aol.com:
> Is there some documentation on the format/content of the databases, as
> produced by htdig and htmerge?
>
> What I'd like to be able to do, if feasible, is to tell, from the databases
> themselves which url's have been indexed, and ideally the date on which this
> was done.

I don't think there's much documentation on the specific format of the
databases, other than the source code. I don't think the date on which
a document was last indexed is stored, but the last modified date is
stored in db.docdb. This date will be the date indexed for documents
where the server doesn't return a last modified date, e.g. for dynamic
content.
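
To make that fallback concrete, here is a minimal sketch of the rule as I
understand it. The function name and the "0 means no header" convention are
made up for illustration; this is not code from the ht://Dig source.

```cpp
#include <ctime>

// Hypothetical helper, NOT part of ht://Dig: returns the timestamp that
// would end up in the document record under the rule described above.
// Assumption: last_modified == 0 means the server sent no Last-Modified
// header (typical for dynamic content).
std::time_t stored_doc_time(std::time_t last_modified, std::time_t indexed_at)
{
    if (last_modified != 0)
        return last_modified;   // server supplied a date, keep it
    return indexed_at;          // otherwise fall back to the time of indexing
}
```

So for static pages the stored date says nothing about when the dig ran,
while for dynamic pages it is effectively the indexing date.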

It would probably be pretty easy to build a simple docdb dumping tool
out of htnotify, which does a simple traversal through the database.
You could get it to output any field you want from the "DocumentRef"
object. Apart from that, I don't think any such tool exists yet, though
it's on the to-do list for 3.2.
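
As a rough idea of the shape such a dumping tool could take, here is a
self-contained sketch. Note the assumptions: "DocRecord" is a hypothetical
stand-in for the couple of DocumentRef fields of interest, and a std::map
stands in for db.docdb, so none of this is the real ht://Dig API. A real tool
would replace the map with the database traversal that htnotify already does.

```cpp
#include <ctime>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for the fields of interest in a DocumentRef;
// the real class in the ht://Dig source has many more members.
struct DocRecord {
    std::string url;
    std::time_t modified;   // last-modified time, seconds since the epoch
};

// Format one record as "YYYY-MM-DD<TAB>URL", with the date in UTC.
std::string format_record(const DocRecord &rec)
{
    char date[16] = "";
    std::tm *utc = std::gmtime(&rec.modified);
    if (utc)
        std::strftime(date, sizeof date, "%Y-%m-%d", utc);
    return std::string(date) + "\t" + rec.url;
}

// Visit every record once and emit one line per document.
// The map is only a placeholder for the on-disk document database.
std::string dump_records(const std::map<std::string, DocRecord> &db)
{
    std::ostringstream out;
    for (std::map<std::string, DocRecord>::const_iterator it = db.begin();
         it != db.end(); ++it)
        out << format_record(it->second) << '\n';
    return out.str();
}
```

Printing the result of dump_records() to stdout would give one line per
indexed URL, which is easy to sort or grep by date.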

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



