[htdig] server_aliases

Patrick Dugal (patrick.dugal@nrc.ca)
Fri, 19 Mar 1999 14:26:06 -0500

Without being too technically specific, it's possible for
two different URLs to point to the exact same document on
the same server. For example,
http://www.foo.com/index.html and
http://specific.machine.foo.com/index.html can be the same
file. Right?

Does anybody know which of the two URLs will be returned to
a user who runs a search, supposing I have configured the
indexer with server_aliases so that it recognizes the
alias(es) properly? In some cases there are more than 5
aliases for the same machine. Will all of them show up in
the search results, or is there a way to index only the one
I want?

The reason I'm asking is that there are many aliases for the
75 machines within the same domain that I want to have
indexed. The start_url attribute is www.foo.com and the
limit_urls_to attribute is .foo.com. Content creators
sometimes make the mistake of linking to the canonical
(real) name of a machine. Indexing has worked in the past,
but we don't want people to see the canonical names in the
results.
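
To make the question concrete, here is a rough Python sketch
of the behaviour I am hoping for: every alias is rewritten to
one preferred host name, so only that name shows up in the
results. It is only an illustration and not how ht://Dig
works internally. The ALIASES table, the second host name and
the canonical_url/dedupe helpers are all made up for this
example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical alias table: each name on the left maps to the one
# host name on the right that should appear in search results.
ALIASES = {
    "specific.machine.foo.com": "www.foo.com",
    "other.machine.foo.com": "www.foo.com",
}


def canonical_url(url):
    """Rewrite the host part of url to its preferred name, if it has one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    preferred = ALIASES.get(host, host)
    # Keep an explicit port if the URL carried one.
    netloc = preferred if parts.port is None else f"{preferred}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


def dedupe(urls):
    """Collapse URLs that differ only by server alias, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        canon = canonical_url(url)
        if canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result
```

With this table, http://www.foo.com/index.html and
http://specific.machine.foo.com/index.html would collapse
into a single entry under the www.foo.com name, which is the
result I would like the search to show.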

Does anyone know if this is possible with ht://Dig? If so,
can you suggest any reading material, or what the best
approach would be? If I haven't explained the problem
clearly enough, let me know.


Pat :)

