Re: [htdig] Duplicates

Subject: Re: [htdig] Duplicates
From: Geoff Hutchison
Date: Wed Dec 27 2000 - 19:21:10 PST

On Sat, 23 Dec 2000, Ing. Noel Vargas Baltodano wrote:

> I've successfully run Htdig, and it scanned every file I wanted it to.
> The only thing now is that I get several duplicates.
> Is there a way to tell Htdig to display 'unique' URLs only?

It *does* only display unique URLs. If you see two URLs that are exactly
(i.e. character for character) the same in htsearch, there's a bug.

On the other hand, it's very easy to have multiple URLs point to the same
document. This is the most common cause of "duplicates." If you are
willing to try a beta, grab the latest snapshot of the 3.2.0b3 code and
look at the RELEASE.html file in it. There is now code to compute an MD5
checksum to eliminate this problem.
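
To illustrate the general idea only, here is a rough sketch in Python. It
is not the ht://Dig code, and the names group_by_checksum, unique_urls and
the example.com URLs are made up for the example. It assumes you already
have each document's body as bytes, hashes the body with MD5, and keeps
one URL per distinct checksum:

```python
import hashlib
from collections import defaultdict


def group_by_checksum(pages):
    """Group URLs by the MD5 digest of their document body.

    `pages` is a hypothetical dict mapping URL -> body (bytes).
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.md5(body).hexdigest()
        groups[digest].append(url)
    return groups


def unique_urls(pages):
    """Keep only the first URL seen for each distinct document body."""
    return [urls[0] for urls in group_by_checksum(pages).values()]


# Two different URLs that serve the same bytes, plus one distinct page.
pages = {
    "http://www.example.com/": b"<html>Welcome</html>",
    "http://www.example.com/index.html": b"<html>Welcome</html>",
    "http://www.example.com/about.html": b"<html>About</html>",
}

print(unique_urls(pages))
# ['http://www.example.com/', 'http://www.example.com/about.html']
```

Note that a checksum like this only catches documents that are
byte-for-byte identical. Two pages that differ by so much as a timestamp
will still show up as separate results.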

-Geoff Hutchison
Williams Students Online


This archive was generated by hypermail 2b28 : Wed Dec 27 2000 - 19:32:43 PST