Re: [htdig] Recrusiv Digging

Geoff Hutchison
Mon, 21 Jun 1999 21:58:21 -0400 (EDT)

On Mon, 21 Jun 1999, Michael Reutlinger wrote:

> The problem is that htDig doesn't realize that
> it already saw a page.
> Wouldn't it be useful to exclude pages the engine
> already saw in one indexing run?

It does realize it saw a page. However, its criterion is based on the URL.
So if you have several URLs pointing to the same document, you're going to
get duplicates. More powerful duplicate elimination code is in the works.
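
To illustrate the difference, here is a minimal Python sketch. It is not
ht://Dig's actual code; the function names, the sample pages, and the choice
of a SHA-256 digest of the page body are all assumptions made for the example.
The first function skips only URLs it has already visited, so two URLs serving
the same document are both indexed. The second also skips bodies it has
already seen.

```python
import hashlib


def crawl_by_url(pages):
    """Index each (url, body) pair unless the URL was already visited."""
    seen_urls = set()
    indexed = []
    for url, _body in pages:
        if url in seen_urls:
            continue  # same URL again: skipped
        seen_urls.add(url)
        indexed.append(url)
    return indexed


def crawl_by_content(pages):
    """Like crawl_by_url, but also skip bodies whose digest was already seen."""
    seen_urls = set()
    seen_digests = set()
    indexed = []
    for url, body in pages:
        if url in seen_urls:
            continue
        seen_urls.add(url)
        # Hypothetical duplicate test: exact match on a digest of the body.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in seen_digests:
            continue  # different URL, identical document: skipped
        seen_digests.add(digest)
        indexed.append(url)
    return indexed


# Made-up sample data: two URLs for the home page, one of them linked twice.
PAGES = [
    ("http://example.com/", "<html>home</html>"),
    ("http://example.com/index.html", "<html>home</html>"),
    ("http://example.com/", "<html>home</html>"),
    ("http://example.com/about.html", "<html>about</html>"),
]

print(crawl_by_url(PAGES))
print(crawl_by_content(PAGES))
```

An exact digest only catches byte-identical bodies. Pages that differ by a
timestamp or a session ID would still slip through.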

I had a discussion about this at work the other day. Personally, I really
prefer redirects to one canonical URL on the webserver. This makes it much
easier for statistics, search engines, and even site organization. But I'm
also aware it's a personal preference.
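
As a rough sketch of the redirect idea, the Python below decides whether a
requested URL should be answered with a permanent redirect to its canonical
form. The host name, the list of index file names, and the function names are
assumptions for illustration only. A real site would normally do this in the
web server's own configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed values for the example, not taken from any real site.
CANONICAL_HOST = "www.example.com"
INDEX_NAMES = ("index.html", "index.htm")


def canonical_url(url):
    """Return the single preferred URL for the document at `url`."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat /dir/index.html and /dir/ as the same document.
    for name in INDEX_NAMES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    return urlunsplit((parts.scheme, CANONICAL_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if `url` is not canonical, else None."""
    target = canonical_url(url)
    if target == url:
        return None
    return (301, target)


print(redirect_for("http://example.com/index.html"))
print(redirect_for("http://www.example.com/"))
```

With redirects like these in place, a crawler that compares only URLs ends up
at one URL per document, so URL-based duplicate checking is enough.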

-Geoff Hutchison
Williams Students Online


This archive was generated by hypermail 2.0b3 on Mon Jun 21 1999 - 18:18:22 PDT