Walter Hafner (email@example.com)
Fri, 12 Feb 1999 16:50:18 +0100 (MET)
First of all: I just installed 3.1.0 (final version) on my box (FreeBSD
2.2.8 STABLE, P-II/300, 128 MB) and it seems to run flawlessly. It's
still in the indexing stage, but this works ok.
However, I'm still struggling with my ht://Dig setup ...
- more than 300 machines with WWW servers
- more than 200.000 physical pages
- The majority of the servers have one or more aliases.
Several of the machines listen to as many as 4 names and the WWW Server
responds to each of them! Even worse, absolute URLs in these servers
point from one alias to a different one _and_ each server contains
documents with relative links. So I end up indexing servers with
several hundred documents up to four times.
This kind of environment brings ht://Dig to its knees. Disabling virtual
hosts would be a solution, but unfortunately I have lots of _real_ virtual
hosts to index. So I just _have_ to "allow_virtual_hosts" which in turn
results in lots of unnecessary queries.
The "server_aliases" option is _not_ what I want. It is impossible to
maintain such a big namespace manually.
Last time I counted, ht://Dig reported 450 servers, despite a _long_
list of mappings in the "server_aliases" section of my configuration.
Some time ago I wrote about Netscape Compass Server and the way it deals
with server aliases: If two pages with different URLs share the same IP
number, it checks the contents of the pages. If they are the same,
Compass assumes it to be a server alias.
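To make the Compass heuristic described above concrete, here is a minimal Python sketch. It is not Compass's or ht://Dig's actual code. The function name, the input dictionaries, and the extra requirement that the two pages share the same path are all assumptions made for illustration; the pages are assumed to be fetched and the hostnames resolved beforehand.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def find_alias_groups(pages, host_to_ip):
    """Group hostnames that look like aliases of one server.

    pages:      dict of URL -> page body (bytes), assumed already fetched.
    host_to_ip: dict of hostname -> IP address, assumed already resolved.
    Hosts are grouped when they share an IP address and return identical
    content for the same path (the same-path rule is an assumption).
    """
    buckets = defaultdict(set)
    for url, body in pages.items():
        parts = urlsplit(url)
        host = parts.hostname
        ip = host_to_ip.get(host)
        if ip is None:
            continue  # unresolved host, nothing to compare against
        digest = hashlib.sha1(body).hexdigest()
        buckets[(ip, parts.path or "/", digest)].add(host)

    # Merge the hosts of every bucket that holds at least two names.
    groups = []
    for hosts in buckets.values():
        if len(hosts) < 2:
            continue
        merged = set(hosts)
        untouched = []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return [sorted(group) for group in groups]
```

A real virtual host that shares the IP address but serves different content ends up in its own bucket and is therefore not reported as an alias.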
Is such a behaviour in the queue for ht://Dig? If yes, when will it be
available (I know, I know ...)? Unfortunately I don't know C++ at all...
Today the following idea occurred to me: What if I installed a Squid
proxy and configured ht://Dig for proxy usage? Since Squid is written in
C (hehe), I could patch it to return "301 Moved Permanently" for certain
URLs. This way I could do pretty much anything I want. Well, I have to
code it of course, but I see no problem there.
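The decision such a patched proxy would have to make can be sketched in a few lines. This is Python for illustration only, not Squid code; the alias table, the hostnames in it, and the function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical alias table: alias hostname -> canonical hostname.
ALIAS_TO_CANONICAL = {
    "alias.a.example": "www.a.example",
}


def redirect_for(url, alias_map=ALIAS_TO_CANONICAL):
    """Return (301, canonical_url) for an aliased URL, or None otherwise."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    canonical = alias_map.get(host)
    if canonical is None:
        return None  # not an alias: pass the request through unchanged
    # Keep an explicit port if the original URL carried one.
    netloc = canonical if parts.port is None else f"{canonical}:{parts.port}"
    location = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return 301, location
```

The proxy would answer with "301 Moved Permanently" and a Location header carrying the returned URL. Whether ht://Dig then follows that URL is exactly the open question.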
Question: How does ht://Dig react to a 301? Does it discard the URL? Does
it follow the new URL?
--
Walter Hafner    firstname.lastname@example.org
http://www.tum.de/~hafner/
"Multiple exclamation marks," he went on, shaking his head,
"are a sure sign of a diseased mind." (Terry Pratchett, "Eric")
This archive was generated by hypermail 2.0b3 on Wed Feb 17 1999 - 10:10:03 PST