Re: [htdig] More Load balancing


Subject: Re: [htdig] More Load balancing
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue May 30 2000 - 14:15:08 PDT


According to Mark Sullivan:
> I am using just the syntax though on 3.1.4, is this an incorrect way of
> doing this?

OK, that depends on what "this" is. I still don't know if you want the
URLs to be rewritten before or after the files are fetched.

> allow_virtual_hosts: false
> limit_urls_to: .domain.com/
> server_aliases: www1.domain.com:80=www.domain.com:80 \
> www2.domain.com:80=www.domain:80
> limit_normalized: http://www.domain.com/
> start_url: http://www.domain.com/
>
> I tried server_alias: www1.mydomain.com=www.mydomain.com on its own and
> the results still showed www1.domain.com as the host...

Yes, the old syntax still works fine. The only change made to Leo's
code with regard to the syntax is that, as of 3.1.3 (not 3.1.4 as I said
earlier), you can now omit the ":80", as it's assumed by default.

  server_aliases: www1.mydomain.com=www.mydomain.com

and

  server_aliases: www1.mydomain.com:80=www.mydomain.com:80

are now equivalent. Note that it must be server_aliases (plural),
and not server_alias.
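
To illustrate why the two forms come out the same, here is a rough Python
model of the idea. It is only a sketch, not htdig's actual code: the
function names (with_port, parse_server_aliases, canonicalize) are made up
for this example, and it assumes plain http URLs.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORT = 80  # port assumed when an alias omits ":port"

def with_port(host):
    """Return 'host:port', adding the default port when none is given."""
    return host if ":" in host else f"{host}:{DEFAULT_PORT}"

def parse_server_aliases(value):
    """Parse 'alias=canonical alias2=canonical ...' into a dict keyed by host:port."""
    table = {}
    for pair in value.split():
        alias, canonical = pair.split("=", 1)
        table[with_port(alias)] = with_port(canonical)
    return table

def canonicalize(url, table):
    """Rewrite the host of url to its canonical name, if an alias matches."""
    parts = urlsplit(url)
    key = f"{parts.hostname}:{parts.port or DEFAULT_PORT}"
    host, _, port = table.get(key, key).partition(":")
    # Drop the port again when it is the default, so URLs stay short.
    netloc = host if int(port) == DEFAULT_PORT else f"{host}:{port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

short_form = parse_server_aliases("www1.mydomain.com=www.mydomain.com")
long_form = parse_server_aliases("www1.mydomain.com:80=www.mydomain.com:80")
```

In this model both spellings build the same table, so a URL on www1 is
rewritten to www either way, and URLs on unlisted hosts pass through
unchanged.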

> The last reference you made was to:
> url_part_aliases:
> http://search.example.com/~htdig *site \
> http://www.htdig.org/this/ *1 \
> .html *2
> How would I write this mess out?

First of all, I want to make it very clear that this does something
very different from what server_aliases does. The server_aliases attribute
is a way of telling htdig that two or more names refer to the SAME host, so
they should all be rewritten to the canonical form you want BEFORE indexing.

Your Subject heading of More Load balancing suggests that the different
names actually refer to different hosts. So my question: do you want
htdig to fetch all documents from the server named www.mydomain.com?
If so, then you should use server_aliases, and not mess with
url_part_aliases.

If you want htdig to fetch the document from whichever server
is identified in the particular URL it is parsing, and later
rewrite that URL to use www.mydomain.com, then you need to
set up two separate config files as described in the FAQ and
http://www.htdig.org/attrs.html#url_part_aliases, and use
something like:

  url_part_aliases: http://www1.mydomain.com/ *1

for htdig and htmerge, and use something like:

  url_part_aliases: http://www.mydomain.com/ *1

for htsearch. The www1 domain name will be coded as *1 in the database,
which htsearch will expand to the www canonical domain name.
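
As a toy model of that round trip, the following Python sketch encodes a
URL with one alias table and decodes it with another. It is only meant to
show the principle; htdig's real encoding may differ in its details, and
the function names (parse_part_aliases, encode, decode) and the sample
document path are invented for this example.

```python
def parse_part_aliases(value):
    """Parse 'part code part2 code2 ...' into a list of (part, code) pairs."""
    tokens = value.split()
    return list(zip(tokens[0::2], tokens[1::2]))

def encode(url, aliases):
    """Replace each URL part with its code, as done when writing the database."""
    for part, code in aliases:
        url = url.replace(part, code)
    return url

def decode(stored, aliases):
    """Replace each code with its URL part, as done when reading the database."""
    for part, code in aliases:
        stored = stored.replace(code, part)
    return stored

# Table used while digging and merging: www1 is coded as *1.
dig_aliases = parse_part_aliases("http://www1.mydomain.com/ *1")
# Table used while searching: *1 expands to the canonical www name.
search_aliases = parse_part_aliases("http://www.mydomain.com/ *1")

stored = encode("http://www1.mydomain.com/docs/index.html", dig_aliases)
shown = decode(stored, search_aliases)
```

Here stored holds the coded form, and shown comes back with the www host
name even though the document was fetched from www1. Because the code is
assigned at indexing time, it only ends up in the database for URLs that
are indexed after the alias is in place.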

In either case, you'll need to reindex from scratch to get the right
mapping.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



