Re: htdig: Digging both internal and external sites

Gilles Detillieux
Fri, 4 Dec 1998 15:49:18 -0600 (CST)

According to Denis Bazinet:
> Frank and Gilles proposed to "push" the firewall to inside the computer. That is, to put a
> proxy server close to the htdig machine, so that it thinks all sites are external to this internal proxy
> server.

Well, actually that was Frank's suggestion. My suggestions were:
1) Physically or NFS mount the web content behind the firewall onto the
system running htdig, and use "local_urls" to get htdig to get those from
the local filesystem rather than an HTTP server or proxy; or
2) Patch htdig to conditionally exclude certain pages from the proxy,
and fetch them directly from their HTTP server, if they're in the
http_proxy_exclude list, a parameter added in my patch.
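
As a rough sketch of what each option might look like in htdig.conf (the
host names, port and path below are made up for illustration, and the
http_proxy_exclude syntax assumes it takes a list of URL prefixes):

    # Option 1: read intranet pages from the mounted filesystem
    local_urls:         http://intranet.example.com/=/mnt/intranet/htdocs/

    # Option 2: keep the firewall proxy for outside sites, but fetch
    # the intranet server directly (needs the patch)
    http_proxy:         http://firewall.example.com:8080/
    http_proxy_exclude: http://intranet.example.com/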

Denis, I'd still like you to try out my patch if you can, just to make
sure it works.

> Frank took this one step further and mentioned that this new proxy server could be installed
> on the same machine as htdig. Therefore, before any communication out to the web server(s) would
> happen, it would go through its proxy server process.

That's a good idea. If you're going to use a new proxy server as an
intermediary, it would be best to host it on the same machine, and access
it as localhost rather than as the machine's own host name, so that you're
not causing unnecessary network traffic on your LAN. If you run squid
(the intermediary proxy server Frank suggested) on a separate server,
then whenever htdig gets a page outside of the firewall, it goes over
your LAN twice, first from the firewall proxy server to the squid server,
then from the squid server to the htdig client.
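
On the htdig side that might amount to a single line, assuming squid is
left on its default port of 3128 (squid itself would still have to be set
up to pass requests on to the firewall proxy):

    http_proxy:         http://localhost:3128/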

My patch has the advantage that you don't need a second proxy server,
and don't need to figure out how to install and configure it. You just
need to add one parameter to htdig.conf, after rebuilding htdig with
the patch of course, to specify which servers are behind the firewall.

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:29:48 PST