Re: [htdig] problems with cfm files on 3.1.5

From: Gilles Detillieux
Date: Thu Oct 26 2000 - 10:27:42 PDT

According to Glen Davies:
> >Does anyone know why 'htdig' won't follow links using the
> >full HREF URL?
> >ie:
> >htdig only tries to follow >''
> This problem is referred to in a post 2 Dec 1998 and was
> apparently a known bug and fixed then. I am running 3.1.5 on
> Debian Linux on a DEC alpha and am having the same problem:
> htdig seems to be ignoring anything after the ?. I am going
> through http not the local file system. Any ideas? There are
> various other old posts about problems with bits after the ?
> in cgi urls but they all seem to indicate the problem was
> fixed a few releases back.

This was a bug in htdig version 3.1.0b2, and was fixed some time ago.
However, some users find that behaviour to be desirable, and have patched
more recent releases to essentially reintroduce the bug. Are you sure
your copy of htdig doesn't have one of these patches applied to it?

The other thing to look for is whether the "?" appears in your setting
of exclude_urls, as this is a commonly added setting, e.g. to suppress
duplicate indexes generated by Apache's fancy indexing. If it's there,
any form of the URL with the "?" and parameter will be suppressed
outright, but if htdig encounters the URL elsewhere without the "?" and
parameter, it will index it then.

If you want to allow CGI parameters, but suppress Apache's fancy indexing,
don't put a "?" in exclude_urls, but instead add these explicit strings,
which hopefully won't be used by any other CGI or cfm file:

        ?D=A ?D=D ?M=A ?M=D ?N=A ?N=D ?S=A ?S=D
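
To illustrate the difference, here is a small Python sketch. It assumes
exclude_urls is treated as a list of plain substrings, and that a URL is
rejected if it contains any of them. The function name is_excluded and
the example.com URLs are made up for illustration; this is only a model
of the behaviour described above, not htdig's own code.

```python
# Simplified model of substring-based URL exclusion; not htdig's actual code.

# The explicit strings for Apache's fancy-indexing sort links.
APACHE_SORT_LINKS = ["?D=A", "?D=D", "?M=A", "?M=D", "?N=A", "?N=D", "?S=A", "?S=D"]


def is_excluded(url, exclude_urls):
    """Return True if any exclusion string occurs anywhere in the URL."""
    return any(pattern in url for pattern in exclude_urls)


# Hypothetical URLs, for illustration only.
cfm_page = "http://www.example.com/page.cfm?id=42"
sort_link = "http://www.example.com/docs/?N=D"

# A bare "?" rejects every URL that carries parameters.
print(is_excluded(cfm_page, ["?"]))
print(is_excluded(sort_link, ["?"]))

# The explicit strings reject only the fancy-indexing sort links.
print(is_excluded(cfm_page, APACHE_SORT_LINKS))
print(is_excluded(sort_link, APACHE_SORT_LINKS))
```

Under that assumption, the cfm URL gets through with the explicit
strings but not with a bare "?". A CGI or cfm URL that happened to use
a parameter such as ?D=A would still be rejected.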

If you have neither a patch applied, nor the "?" in exclude_urls, then
I'm stumped. Try htdig -ivvv to see why the URLs you want are being
rejected.

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

