Re: [htdig] htdig and symbolic links


Joe R. Jah (jjah@cloud.ccsf.cc.ca.us)
Fri, 10 Sep 1999 09:12:53 -0700 (PDT)


On Fri, 10 Sep 1999, Nick O'Brien wrote:

> Date: Fri, 10 Sep 1999 15:13:20 +0100 (GMT Daylight Time)
> From: Nick O'Brien <N.G.J.OBrien@reading.ac.uk>
> To: htdig@htdig.org
> Subject: [htdig] htdig and symbolic links
>
>
> Hi,
>
> We are implementing htdig (v3.1.2 + the patch kit on Solaris 2.6) on our
> main web server. One comment we have had is that there are a lot of
> duplicate search results pointing to the same web pages. This is usually
> caused by having several different Unix symbolic links pointing to the
> same directory/file in the web document tree.
>
> Is there any way we can prevent the indexing of these duplicates? I see
> from the mailing list archives that for previous versions of htdig there
> were patches to fix this issue but they are not available for the current
> version.
>
> I see from the bug database the latest advice is to eliminate symbolic
> links - however for many practical reasons it is not possible for us to
> do this.
>
>
> Is it for example possible to configure htdig to index our URLs via the
> filesystem instead of HTTP (i.e. using local_urls) and to ignore the
> symbolic links?
>
> How are people on the list working round this problem? Or is this an
> unresolved bug I will need to (re)log with the htdig developers?

Our site is in the same boat as yours. I still use the old patch written
for version 3.0.8b2, and I apply it by hand at every new release.
You can get it from:

        ftp://sol.ccsf.cc.ca.us/htdig-patches/3.0.8b2/Retriever.cc.0

Then, with an ugly, extensive set of local_urls entries, one for each and
every symbolic link on the site :( I manage to suppress duplicates,
quadruplicates, and multiplicates ;)
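
For anyone who would rather not collect the symbolic links by hand, here
is a rough sketch in Python of how one might list them. It is not part of
htdig or of the patch above; the function name find_symlink_aliases and
the example document root are made up for illustration, and turning the
output into local_urls or exclusion entries is left to you.

```python
import os


def find_symlink_aliases(doc_root):
    """Return {relative path of symlink: real path it resolves to}.

    Hypothetical helper, not part of htdig. It only reports links found
    under doc_root; it does not touch any htdig configuration.
    """
    root_real = os.path.realpath(doc_root)
    aliases = {}
    # followlinks=False: links are reported but never descended into,
    # so a link pointing back up the tree cannot cause an endless walk.
    for dirpath, dirnames, filenames in os.walk(root_real, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                rel = os.path.relpath(path, root_real)
                aliases[rel] = os.path.realpath(path)
    return aliases


def as_url_paths(aliases):
    """Turn the relative filesystem paths into sorted URL-style paths."""
    return sorted("/" + rel.replace(os.sep, "/") for rel in aliases)
```

Calling as_url_paths(find_symlink_aliases("/usr/www/htdocs")) (the path is
only an example) gives one URL path per symbolic link, which is the raw
material for the per-link entries described above.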

Boy, do I look forward to 3.2, which is promised to take care of the
menace of duplicates.

Regards,

Joe

-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        jjah@cloud.ccsf.cc.ca.us



