Re: [htdig3-dev] Current Status as of snapshot 3.2.0b3-091000


Subject: Re: [htdig3-dev] Current Status as of snapshot 3.2.0b3-091000
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Sun Sep 10 2000 - 15:46:02 PDT


At 10:50 AM -0700 9/10/00, Joe R. Jah wrote:
>I see that "Duplicate document detection while indexing" is missing from
>the list:

Yup. Toivo merged in some md5 code. So now the check_unique_md5 and
check_unique_date attributes should provide this, though it clearly
will need a fair amount of testing!

>I missed last week's status report. Does the omission mean that the
>feature is already implemented in the code? I am trying hard to contain
>my excitement and jubilation;)

That's the idea. It's definitely the case if you see something disappear
from the STATUS report and also see a ChangeLog message about it. ;-)

Tue Aug 30 12:00:00 2000 Toivo Pedaste <toivo@ucs.uwa.edu.au>
          
        * htlibs/md5.cc, htlibs/md5.h: Generate md5 hash of
        a page and also optionally the modify date.
          
        * htlibs/mhash_md5.h, htlibs/mhash_md5.c, htlibs/libdefs.h:
        Md5 hash code from libmhash
          
        * htdig/Retriever.cc: Allow storing md5 hashes of pages
        in order to reject aliases.
          
        * htcommon/defaults.cc: Options "check_unique_md5" and
        "check_unique_date"
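
For anyone wondering what the ChangeLog entries above amount to, here is a
rough sketch of the general idea in Python. It is not the code in
htlibs/md5.cc or htdig/Retriever.cc; the names page_digest and
DuplicateFilter are made up for illustration, and mixing the date into the
hash by appending it to the body is only an assumption about how the
"optionally the modify date" part might work.

```python
import hashlib
from typing import Dict, Optional


def page_digest(body: bytes, modified: Optional[str] = None) -> str:
    """Return the MD5 hex digest of a page body.

    If a modification date is given, it is fed into the hash as well
    (an assumption for this sketch, not necessarily what ht://Dig does).
    """
    h = hashlib.md5(body)
    if modified is not None:
        h.update(modified.encode("utf-8"))
    return h.hexdigest()


class DuplicateFilter:
    """Remembers digests of pages seen so far so aliases can be rejected."""

    def __init__(self, use_date: bool = False) -> None:
        self.use_date = use_date
        self.seen: Dict[str, str] = {}  # digest -> first URL seen with it

    def check(self, url: str, body: bytes,
              modified: Optional[str] = None) -> Optional[str]:
        """Return the URL first seen with identical content, or None if new."""
        digest = page_digest(body, modified if self.use_date else None)
        first = self.seen.setdefault(digest, url)
        return None if first == url else first


if __name__ == "__main__":
    f = DuplicateFilter()
    print(f.check("http://example.com/", b"<html>hi</html>"))
    print(f.check("http://example.com/index.html", b"<html>hi</html>"))
```

The second call returns the first URL, which is the cue for an indexer to
skip the page as an alias. With use_date=True, two pages with the same body
but different modification dates are treated as distinct.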
 
-Geoff



