Re: [htdig] Multiple domain names pointing on the same site


Subject: Re: [htdig] Multiple domain names pointing on the same site
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Mon Jan 08 2001 - 09:58:57 PST


According to Malcolm Austen:
> On Mon, 8 Jan 2001 gregh@easynet.fr wrote:
> + If a site can be reached via different domain names,
> + is there a trick to make htsearch generate result
> + links pointing to the domain name the user reached
> + the site with ?
>
> Check out the server_aliases: options. It does just what you want.
>
> server_aliases: a.com:80=b.com:80
>
> will result in references to a.com being treated as if they were
> references to b.com

Well, this is a start, but it's only part of the solution. What this will
do is ensure that only the canonical server name, b.com in this example,
is used for entries in the database. However...

Greg also wrote:
+ A user reaches the site a.com, makes a search, the result
+ would be www.a.com/searchedpage.html
+ If another user reaches the site b.com (which is the same
+ document as a.com), the result would link to
+ www.b.com/searchedpage.html

This is trickier: you want a single, presumably static database shared
by all domains, with the domain names in the search results altered to
match the one used in the URL that called htsearch. I think this would
require a combination of server_aliases, as above, to canonicalise the
domain name, and url_part_aliases to encode the canonical domain in the
database. A search wrapper would then figure out the domain name used in
the CGI URL and pass it to the real htsearch, which would use it in its
own url_part_aliases to decode the encoded canonical domain into the
desired domain name.

For example, in the htdig.conf used by htdig and htmerge:
server_aliases: www.a.com:80=www.real.com:80 \
                www.b.com:80=www.real.com:80
url_part_aliases: www.real.com *site

Then in htsearch's config file:
url_part_aliases: ${searchdomain} *site
searchdomain: www.real.com
allow_in_form: searchdomain

Then, the search form would set the "config" input parameter to select
this particular search config file, and set its action to call a wrapper
script like this one, using the "GET" method (the wrapper edits
QUERY_STRING, which only carries the form parameters for GET requests):

-------------------------
#!/bin/sh

case "$QUERY_STRING" in
*searchdomain=*) ;; # searchdomain is already set, so leave it
*) # set searchdomain to HTTP host name used in request
        QUERY_STRING="${QUERY_STRING}&searchdomain=$HTTP_HOST"
        export QUERY_STRING
        ;;
esac

exec /some/path/to/real/htsearch
-------------------------
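
A variant of the wrapper's query-string handling, pulled into a function so
it can be tried from a shell without a web server. This is only a sketch,
not part of the script above: the function name add_searchdomain and the
"config=multi" value are made up for illustration, and the extra branch
assumes you would want to avoid a leading "&" when the query string is empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of htdig): prints the query string with
# searchdomain appended, unless the request already supplies one.
add_searchdomain() {
    query=$1
    host=$2
    case "$query" in
    searchdomain=*|*\&searchdomain=*)
        printf '%s\n' "$query" ;;               # already set, so leave it
    '')
        printf 'searchdomain=%s\n' "$host" ;;   # no other parameters
    *)
        printf '%s&searchdomain=%s\n' "$query" "$host" ;;
    esac
}

# "config=multi" is an assumed config name, used only for this example.
add_searchdomain 'config=multi&words=foo' 'www.a.com'
```

In a real wrapper the two arguments would be "$QUERY_STRING" and
"$HTTP_HOST", and the result would be exported as QUERY_STRING before the
exec of the real htsearch.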

I'm pretty sure this should work, because htsearch seems to parse
allow_in_form's value, and make its input parameters override the
corresponding config attributes, before the url_part_aliases value is
parsed by the HtURLCodec class.
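
To picture the intended end-to-end effect, here is a rough simulation in
shell. It only mimics, with sed, what server_aliases and url_part_aliases
are expected to do to a URL in the example configs above. It is not how
htdig implements them, and the function names are invented for this
illustration.

```shell
#!/bin/sh
# Illustration only: imitates the effect of the config attributes with sed.

# Index time (server_aliases): map alias host names to the canonical one.
canonicalise() {
    printf '%s\n' "$1" | sed -e 's#//www\.a\.com/#//www.real.com/#' \
                             -e 's#//www\.b\.com/#//www.real.com/#'
}

# Index time (url_part_aliases): store the canonical host as the token *site.
encode() {
    printf '%s\n' "$1" | sed -e 's#www\.real\.com#*site#'
}

# Search time (url_part_aliases): expand *site to the domain passed in
# as searchdomain (second argument).
decode() {
    printf '%s\n' "$1" | sed -e "s#\*site#$2#"
}

stored=$(encode "$(canonicalise 'http://www.b.com/searchedpage.html')")
echo "$stored"                  # what the database would hold
decode "$stored" 'www.a.com'    # what a visitor of www.a.com would see
```

Whichever alias the page was indexed under, the stored form is the same,
so the domain shown in the results depends only on the searchdomain value
supplied at search time.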

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



