htdig: Indexing output of CGIs


William Rhee (willrhee@umich.edu)
Thu, 12 Nov 1998 21:18:24 -0500 (EST)


Hi there,

I'm trying to index a single page that links to a number of CGIs, each
with URL-encoded parameters in its query string, e.g.:

        http://someplace.org/cgi-bin/something?ID=1234&blah=foo

I removed the "exclude_urls" directive from the default htdig.conf, which
tells htdig not to index URLs matching the patterns /cgi-bin/ and .cgi,
but still none of the pages get indexed.
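
For illustration, here is a minimal Python sketch of how an exclusion list
like the default one would reject the example URL. It assumes each pattern
is matched as a plain substring of the URL; the function name is_excluded
and the pattern list are hypothetical and are not htdig code.

```python
# Hypothetical sketch, not htdig source: assumes each exclude pattern
# is matched as a plain substring of the URL.
DEFAULT_EXCLUDES = ["/cgi-bin/", ".cgi"]

def is_excluded(url, patterns):
    """Return True if any pattern occurs anywhere in the URL."""
    return any(p in url for p in patterns)

url = "http://someplace.org/cgi-bin/something?ID=1234&blah=foo"
print(is_excluded(url, DEFAULT_EXCLUDES))  # checked against both default patterns
print(is_excluded(url, []))                # checked against an empty pattern list
```

Under that assumption, the URL is rejected only while one of the patterns
is still in effect.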

Looking at the "url_list" of all the URLs that htdig extracts while it is
running, I see that the query strings are being truncated. That is, there
are many lines in the "url_list" where:

        http://someplace.org/cgi-bin/something

appears, but the ? and query string:

        ?ID=1234&blah=foo

are missing. Is this by design, or has someone else out there seen the
same symptom (and maybe already patched it?! :-) )?
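
The symptom can also be checked mechanically. The sketch below is only an
illustration: it assumes the URL list holds one URL per line, and the two
sample entries are taken from the examples in this message rather than from
real htdig output. It uses Python's urllib.parse to split each URL and
report the /cgi-bin/ entries that carry no query string; missing_query is
a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs

def missing_query(urls):
    """Return the /cgi-bin/ URLs whose query string is empty."""
    return [u for u in urls
            if "/cgi-bin/" in urlsplit(u).path and not urlsplit(u).query]

# The link as written on the page, and as it shows up in the URL list.
expected = "http://someplace.org/cgi-bin/something?ID=1234&blah=foo"
seen = "http://someplace.org/cgi-bin/something"

parts = urlsplit(expected)
print(parts.path)             # /cgi-bin/something
print(parse_qs(parts.query))  # {'ID': ['1234'], 'blah': ['foo']}
print(missing_query([expected, seen]))  # only the truncated entry is reported
```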

cheers,
--Will



