According to Mark Sullivan:
> I am using just the syntax though on 3.1.4, is this an incorrect way of
> doing this?

OK, that depends on what "this" is. I still don't know if you want the
URLs to be rewritten before or after the files are fetched.

> allow_virtual_hosts: false
> limit_urls_to:
> server_aliases: \
> limit_normalized:
> start_url:
> I tried server_alias: on its own and
> the results still showed as the host...

Yes, the old syntax still works fine. The only change made to Leo's
code with regard to the syntax is that, as of 3.1.3 (not 3.1.4 as I
said earlier), you can omit the ":80", as it's now assumed by default.




are now equivalent. Note that it must be server_aliases (plural),
and not server_alias.
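
To illustrate the default-port rule, here is a rough Python sketch (not
htdig's actual code). It assumes the alias=canonical pair form given in
the attribute documentation; the example.com host names and both helper
functions are made up for the illustration.

```python
def with_default_port(host: str, default_port: int = 80) -> str:
    """Return host:port, filling in the default port when none is given."""
    name, sep, port = host.strip().partition(":")
    return f"{name.lower()}:{port if sep and port else default_port}"


def parse_alias(pair: str) -> tuple[str, str]:
    """Split one alias=canonical pair and normalize both sides."""
    alias, _, canonical = pair.partition("=")
    return with_default_port(alias), with_default_port(canonical)


# Hypothetical host names, with and without the explicit ":80".
old_style = parse_alias("www1.example.com:80=www.example.com:80")
new_style = parse_alias("www1.example.com=www.example.com")

print(old_style == new_style)  # True: both forms normalize to the same pair
```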

> The last reference you made was to:
> url_part_aliases:
> *site \
> *1 \
> .html *2
> How would I write this mess out?

First of all, I want to make it very clear that this does something
very different from what server_aliases does. The server_aliases
attribute is a way of telling htdig that two or more names refer to the
SAME host, so they should all be rewritten to the canonical form you
want BEFORE indexing.

Your Subject heading of More Load balancing suggests that the different
names actually refer to different hosts. So my question: do you want
htdig to fetch all documents from the server named
If so, then you should use server_aliases, and not mess with

If you want htdig to fetch the document from whichever server
is identified in the particular URL it is parsing, and later
rewrite that URL to use, then you need to
set up two separate config files as described in the FAQ, and use
something like:

  url_part_aliases: *1

for htdig and htmerge, and use something like:

  url_part_aliases: *1

for htsearch. The www1 domain name will be coded as *1 in the database,
and htsearch will expand *1 to the canonical www domain name.
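
The encode-then-expand idea can be sketched in Python as follows. This
is only a simulation of the behaviour described above, not how htdig
implements it; the example.com URLs, the two dictionaries and the
encode/decode helpers are all assumptions made for the illustration.

```python
def encode(url: str, aliases: dict[str, str]) -> str:
    """Indexing side: replace each known URL part with its short code."""
    for part, code in aliases.items():
        url = url.replace(part, code)
    return url


def decode(stored: str, aliases: dict[str, str]) -> str:
    """Search side: expand each short code back into a URL part."""
    for part, code in aliases.items():
        stored = stored.replace(code, part)
    return stored


# Stand-ins for the two config files: same code, different URL part.
DIG_ALIASES = {"http://www1.example.com": "*1"}     # htdig and htmerge
SEARCH_ALIASES = {"http://www.example.com": "*1"}   # htsearch

stored = encode("http://www1.example.com/docs/a.html", DIG_ALIASES)
shown = decode(stored, SEARCH_ALIASES)

print(stored)  # *1/docs/a.html
print(shown)   # http://www.example.com/docs/a.html
```

The document is fetched from the www1 name, but the search results show
the www name, because the two sides map the same code to different URL
parts.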

In either case, you'll need to reindex from scratch to get the right

