Joe R. Jah (email@example.com)
Fri, 10 Sep 1999 09:12:53 -0700 (PDT)
On Fri, 10 Sep 1999, Nick O'Brien wrote:
> Date: Fri, 10 Sep 1999 15:13:20 +0100 (GMT Daylight Time)
> From: Nick O'Brien <N.G.J.OBrien@reading.ac.uk>
> To: firstname.lastname@example.org
> Subject: [htdig] htdig and symbolic links
> We are implementing htdig (v3.1.2 + the patch kit on Solaris 2.6) on our
> main web server. One comment we have had is that there are a lot of
> duplicate search results pointing to the same web pages. This is usually
> caused by having several different Unix symbolic links pointing to the
> same directory/file in the web document tree.
> Is there any way we can prevent the indexing of these duplicates? I see
> from the mailing list archives that for previous versions of htdig there
> were patches to fix this issue but they are not available for the current
> I see from the bug database the latest advice is to eliminate symbolic
> links - however for many practical reasons it is not possible for us to
> do this.
> Is it for example possible to configure htdig to index our URLs via the
> filesystem instead of HTTP (i.e. using local_urls) and to ignore the
> symbolic links?
> How are people on the list working round this problem? Or is this an
> unresolved bug I will need to (re)log with the htdig developers?
Our site is in the same boat as yours. I still use the old patch written
for version 3.0.8b2, applying it by hand at every new release.
You can get it from:
Then with an ugly, extensive set of local_urls for each and every symbolic
link in the site:( I manage to suppress duplicates, quadruplicates, and
Boy, do I look forward to 3.2, which is promised to take care of the
menace of duplicates.
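
For anyone who has to maintain a per-symlink list like this by hand, a
small script can at least enumerate the links. The following is only a
rough sketch, not part of htdig or of the patch: the function name
find_symlink_aliases and the idea of a single document root directory are
assumptions made for illustration. It walks the tree without following
links and reports each symbolic link as a URL path, together with the real
path it resolves to.

```python
import os
import sys


def find_symlink_aliases(doc_root):
    """Return {url_path_of_symlink: real_target_path} for links under doc_root.

    Hypothetical helper: assumes the web document tree lives under one
    directory and that URL paths mirror the directory layout.
    """
    doc_root = os.path.realpath(doc_root)
    aliases = {}
    # followlinks=False: symlinked directories are listed but not descended
    # into, so each alias is reported once and link loops cannot recurse.
    for dirpath, dirnames, filenames in os.walk(doc_root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                rel = os.path.relpath(path, doc_root)
                url_path = "/" + rel.replace(os.sep, "/")
                aliases[url_path] = os.path.realpath(path)
    return aliases


if __name__ == "__main__":
    # Usage: python find_aliases.py /path/to/document/root
    for url_path, target in sorted(find_symlink_aliases(sys.argv[1]).items()):
        print(url_path, "->", target)
```

The output is just a list of candidates. Whether each one ends up in
local_urls, in an exclusion list, or is left alone still has to be decided
per site.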
--
Joe R. Jah (email@example.com)
This archive was generated by hypermail 2.0b3 on Fri Sep 10 1999 - 09:16:08 PDT