Subject: Re: [htdig] CGI Question
From: Tod Thomas (firstname.lastname@example.org)
Date: Mon Apr 03 2000 - 06:18:26 PDT
Thanks Gilles and Jeff for your suggestions.
I commented out the exclude_urls line. The only other limiting configuration
option I have in place is limit_urls_to, which I have set to our domain
maintain session, only to provide user customization. Should no customization
cookies exist, the initial page is a default, so I would think that, if
anything, htdig would have indexed the links presented on that default page.
This particular site is driven by a central CGI process that dynamically
generates all content. Using cookies, a user can customize the presentation,
but if they choose not to, a default one is provided. My thought was that
htdig would have a problem indexing the site in its entirety, since it would
need to know what everybody's customizations were. I did think that htdig
could index the default presentation, though, since cookies aren't
Thanks again - Tod
> You might want to look at the contents of your htdig.conf file, especially
> in the areas of limit_urls_to, exclude_urls, and bad_extensions. The
> mime-type of the content handed back from the server can affect the
> indexing process as well -- this is a little less likely in your
> Also, if your site uses/requires cookies for session management, you might
> have some issues. I don't believe that htdig supports cookie usage at this
> time (I've never had the occasion to check).
> If your site uses authentication (a la basic auth, etc.), you'll want to
> look into the options that can enable this in your config file. Check the
> htdig docs.
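
For reference, the attributes named above are plain "name: value" lines in
htdig.conf. A minimal sketch only: www.example.com stands in for the real
host, the extension list is shortened for illustration, and the credentials
are placeholders (none of these values come from the original messages). The
authorization attribute name is given from memory of the htdig docs, so
verify it there before relying on it.

```
# htdig.conf fragment (illustrative values only, not the poster's config)
start_url:      http://www.example.com/
# Restrict the crawl to URLs containing the start URL
limit_urls_to:  ${start_url}
# URLs ending in these extensions are skipped (shortened example list)
bad_extensions: .gif .jpg .zip .exe
# Only needed if the site uses basic auth; placeholder credentials
authorization:  username:password
```
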
Gilles Detillieux wrote:
> You probably need to change exclude_urls in your htdig.conf. By default,
> it excludes all URLs containing /cgi-bin/ or .cgi, either of which is
> likely to be in the URLs for your CGIs.
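
A sketch of the kind of override described above. This assumes that
commenting the line out leaves htdig's built-in default in effect, so the
attribute has to be set explicitly to a value that no longer matches the CGI
URLs. The replacement pattern is hypothetical.

```
# Default quoted in the message above:
#   exclude_urls: /cgi-bin/ .cgi
# Explicit override; /cgi-bin/private/ is a made-up path for illustration
exclude_urls: /cgi-bin/private/
```

With a setting like this, URLs containing /cgi-bin/ would be eligible for
indexing again, apart from the one excluded path.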