Re: htdig: Digging both internal and external sites

Gilles Detillieux
Wed, 2 Dec 1998 08:50:38 -0600 (CST)

According to Denis Bazinet:
> Call me dense if it's an easy answer, but I can't seem to find a way to
> have one htdig.conf file that will allow htdig to dig sites on our LAN
> and sites through a proxy server. Is there a setting that specifies which sites
> should not use the proxy server?

As of 3.1.0b2, it seems to be an all or nothing deal. Either it goes
through the proxy server for everything, or it doesn't. I haven't heard
any talk of changing it in b3.

You'd need to patch htdig/ to do what you want. In the
Document::RetrieveHTTP() function (or method? - sorry, I'm not up on
my C++ terminology), you'd need to check url->host() against a list of
hosts, or url->get() against a list of URLs (which would be a new config
file parameter, like local_urls I guess) to decide whether to go through
the proxy server or not. You could use Retriever::IsLocal() as a model
for the check you'd need to implement, e.g. as a Document::IsNotProxy() method.
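
To give a rough idea of the host check, here is a standalone sketch. It is not ht://Dig code: is_not_proxy() and the direct_hosts list are made-up names standing in for the proposed Document::IsNotProxy() and the new config parameter, and it uses the standard library rather than ht://Dig's own string and list classes. The leading-dot convention for matching whole domains is also just my assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: lower-case a host name so comparisons ignore case.
static std::string lowercase(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical stand-in for the proposed Document::IsNotProxy().
// Returns true when `host` should be fetched directly, i.e. when it
// equals an entry in `direct_hosts`, or ends with an entry that starts
// with a dot (assumed convention for "every host in this domain").
bool is_not_proxy(const std::string& host,
                  const std::vector<std::string>& direct_hosts) {
    const std::string h = lowercase(host);
    for (const std::string& entry : direct_hosts) {
        const std::string e = lowercase(entry);
        if (e.empty()) {
            continue;  // ignore blank entries
        }
        if (e[0] == '.') {
            // Domain entry: host must be longer than the suffix and end with it.
            if (h.size() > e.size() &&
                h.compare(h.size() - e.size(), e.size(), e) == 0) {
                return true;
            }
        } else if (h == e) {
            // Plain entry: exact host match.
            return true;
        }
    }
    return false;
}
```

In Document::RetrieveHTTP() you would then call something like this with url->host() and only set up the proxy connection when it returns false.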

The other option would be to get all the local files from the local file
system rather than going through HTTP, using the local_urls parameter.
That would mean all your LAN sites' content would need to be mounted
(directly or via NFS) on the host running htdig.
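
For that second route, the config entry would look something like the fragment below. The host name and mount path are made up for illustration, so substitute your own, and check the attribute documentation for your version to confirm the exact syntax.

```
# htdig.conf fragment (hypothetical host name and path)
# Read anything under this URL prefix from the local filesystem
# instead of fetching it over HTTP.
local_urls: http://intranet.example/=/mnt/intranet/htdocs/
```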

