Re: [htdig] server_aliases


Alexander Bergolth (leo@strike.wu-wien.ac.at)
Fri, 19 Mar 1999 21:30:46 +0100 (MEZ)


Hi!

On Fri, 19 Mar 1999, Patrick Dugal wrote:

> Without being too technically specific, it's possible that
> there exist two different URLs for the exact same document
> on the same server. For example,
> http://www.foo.com/index.html and
> http://specific.machine.foo.com/index.html can be the same
> file. Right?
>
> Does anybody have any idea which of the two URLs will be
> returned to the user when he/she does a search supposing I
> have configured the indexer using server_aliases to
> recognize the alias(es) properly?

I have several machines, each with some aliases, that all serve the same
document space.
Each document should appear only once, showing www.wu-wien.ac.at (an alias
for the main web server) as the server.

To make htdig normalize each URL to the canonical name of its server,
you have to add
allow_virtual_hosts: false

To translate that canonical name to the preferred name, you add
server_aliases: `${config_dir}/aliases`

and an aliases file with the following contents:

speth08.wu-wien.ac.at:80=www.wu-wien.ac.at:80
speth09.wu-wien.ac.at:80=www.wu-wien.ac.at:80
apollo.wu-wien.ac.at:80=www.wu-wien.ac.at:80
buddy.wu-wien.ac.at:80=www.wu-wien.ac.at:80
proxy.wu-wien.ac.at:80=www.wu-wien.ac.at:80
olymp.wu-wien.ac.at:80=www.wu-wien.ac.at:80
asterix.wu-wien.ac.at:80=www.wu-wien.ac.at:80
botanix.wu-wien.ac.at:80=www.wu-wien.ac.at:80
falbala.wu-wien.ac.at:80=www.wu-wien.ac.at:80
gutemine.wu-wien.ac.at:80=www.wu-wien.ac.at:80
osiris.wu-wien.ac.at:80=www.wu-wien.ac.at:80

(The canonical names are on the left side and the names that should appear
in the search results are on the right side.)
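
As a rough illustration of what the translation step does, here is a small
Python sketch. It is not htdig's code: parse_aliases and translate are
hypothetical helper names, the sample data is a shortened copy of the
aliases file above, and the sketch assumes plain http URLs that already
carry the canonical host name (the DNS lookup itself is not modeled).

```python
from urllib.parse import urlsplit, urlunsplit

# Shortened copy of the aliases file; entries are "canonical:port=preferred:port".
ALIASES_TEXT = """\
speth08.wu-wien.ac.at:80=www.wu-wien.ac.at:80
apollo.wu-wien.ac.at:80=www.wu-wien.ac.at:80
"""


def parse_aliases(text):
    """Build a dict mapping 'canonical:port' to 'preferred:port'."""
    aliases = {}
    for entry in text.split():
        source, _, target = entry.partition("=")
        if source and target:
            aliases[source.lower()] = target.lower()
    return aliases


def translate(url, aliases):
    """Rewrite the host part of url if it is listed on the left side."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    port = parts.port or 80  # assumption: http only, default port 80
    key = f"{host}:{port}"
    target = aliases.get(key, key)
    if target.endswith(":80"):
        target = target[:-3]  # leave the default port out of the result
    return urlunsplit(
        (parts.scheme, target, parts.path or "/", parts.query, parts.fragment)
    )


aliases = parse_aliases(ALIASES_TEXT)
print(translate("http://apollo.wu-wien.ac.at/index.html", aliases))
```

A host that is not listed on the left side is passed through unchanged, which
is why every canonical machine name needs its own entry.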

The start url should contain the preferred name:
start_url: http://www.wu-wien.ac.at

.. and you should limit the URLs to something like this:
limit_urls_to: .wu-wien.ac.at/
(applied before the normalization and translation, so it can match any
server name)

limit_normalized: http://www.wu-wien.ac.at/
(applied after the translation)
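
The order of the two limits can be sketched in Python as follows. This is
only a model under stated assumptions, not htdig's implementation: both
limits are treated as simple substring matches, the one-entry ALIASES table
stands in for the aliases file, and translate and accepted_url are
hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit

LIMIT_URLS_TO = [".wu-wien.ac.at/"]               # checked on the URL as found
LIMIT_NORMALIZED = ["http://www.wu-wien.ac.at/"]  # checked after translation
ALIASES = {"apollo.wu-wien.ac.at:80": "www.wu-wien.ac.at:80"}


def translate(url):
    """Replace a canonical host with its preferred name (http only)."""
    parts = urlsplit(url)
    key = f"{(parts.hostname or '').lower()}:{parts.port or 80}"
    target = ALIASES.get(key, key)
    if target.endswith(":80"):
        target = target[:-3]  # leave the default port out of the result
    return urlunsplit(
        (parts.scheme, target, parts.path or "/", parts.query, parts.fragment)
    )


def matches_any(url, patterns):
    """Assumption: a limit matches if any pattern occurs in the URL."""
    return any(pattern in url for pattern in patterns)


def accepted_url(url):
    """Return the URL to index, or None if either limit rejects it."""
    if not matches_any(url, LIMIT_URLS_TO):
        return None  # rejected before normalization and translation
    normalized = translate(url)
    if not matches_any(normalized, LIMIT_NORMALIZED):
        return None  # rejected after translation
    return normalized


print(accepted_url("http://apollo.wu-wien.ac.at/index.html"))
print(accepted_url("http://www.foo.com/index.html"))
```

In this model a machine inside .wu-wien.ac.at that has no alias entry passes
the first limit but is dropped by the second, so only documents under the
preferred name end up in the index.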

Hope that helps!

Cheers,
         Leo

-----------------------------------------------------------------------
Alexander (Leo) Bergolth leo@leo.wu-wien.ac.at
WU-Wien - Zentrum fuer Informatikdienste http://leo.wu-wien.ac.at
Info Center
Linux - because reboots are for hardware changes
