Subject: Re: [htdig] databases (3.1.5)
From: Gilles Detillieux (firstname.lastname@example.org)
Date: Fri Mar 17 2000 - 08:54:14 PST
According to Sphboc@aol.com:
> Is there some documentation on the format/content of the databases, as
> produced by htdig and htmerge?
> What I'd like to be able to do, if feasible, is to tell, from the databases
> themselves which url's have been indexed, and ideally the date on which this
> was done.
I don't think there's much documentation on the specific format of the
databases, other than the source code. I don't think the date on which
a document was last indexed is stored, but the last modified date is
stored in db.docdb. This date will be the date indexed for documents
where the server doesn't return a last modified date, e.g. for dynamic
It would probably be pretty easy to build a simple docdb dumping tool
out of htnotify, which does a simple traversal through the database.
You could get it to output any field you want from the "DocumentRef"
object. Apart from that, I don't think any such tool exists yet, though
it's on the to-do list for 3.2.
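To make the traversal idea concrete, here is a minimal sketch in Python. It does not use the real ht://Dig classes or read a real db.docdb: the `DocumentRef` dataclass, its `url` and `last_modified` fields, and the `dump_docdb` function are all hypothetical stand-ins, and the in-memory dict only imitates a database keyed by URL. An actual tool would more likely be written in C++ against the ht://Dig source, along the lines of htnotify.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterator


@dataclass
class DocumentRef:
    """Hypothetical stand-in for one docdb record; not the real ht://Dig class."""
    url: str
    last_modified: int  # seconds since the Unix epoch


def dump_docdb(docdb: Dict[str, DocumentRef]) -> Iterator[str]:
    """Walk every record and yield 'date<TAB>url' lines, ordered by URL."""
    for url in sorted(docdb):
        ref = docdb[url]
        stamp = datetime.fromtimestamp(ref.last_modified, tz=timezone.utc)
        yield f"{stamp:%Y-%m-%d}\t{ref.url}"


if __name__ == "__main__":
    # Made-up sample records standing in for a real document database.
    sample = {
        "http://example.com/b": DocumentRef("http://example.com/b", 86400),
        "http://example.com/a": DocumentRef("http://example.com/a", 0),
    }
    for line in dump_docdb(sample):
        print(line)
```

Any other field could be added to the output line in the same way; the date printed here is the stored last-modified date, not the date of indexing.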
-- 
Gilles R. Detillieux              E-mail: <email@example.com>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930