[htdig] htdig fail 2


Frank Guangxin Liu (frank@ctcqnx4.ctc.cummins.com)
Tue, 28 Sep 1999 10:00:17 -0500 (EST)


It seems htdig does not handle the -u option correctly.

Today I found that htdig failed to index a site that requires no
authentication. To narrow down the problem, I set
start_url: http://bpnet.cummins.com/
limit_urls_to: ${start_url}

and rundig -vvvv gives:
        1:0:http://bpnet.cummins.com/
New server: bpnet.cummins.com, 80
Retrieval command for http://bpnet.cummins.com/robots.txt: GET /robots.txt
HTTP/1.0
User-Agent: htdig/3.1.3 (webmaster@ctcqnx4.ctc.cummins.com)
Host: bpnet.cummins.com

Header line: HTTP/1.1 404 Object Not Found
Header line: Server: Microsoft-IIS/4.0
Header line: Date: Tue, 28 Sep 1999 14:44:47 GMT
Header line: Content-Length: 461
Header line: Content-Type: text/html
Header line:
returnStatus = 1
 pushed
pick: bpnet.cummins.com, # servers = 1
0:0:0:http://bpnet.cummins.com/: Retrieval command for
http://bpnet.cummins.com/: GET / HTTP/1.0
User-Agent: htdig/3.1.3 (webmaster@ctcqnx4.ctc.cummins.com)
Authorization: Basic aHRkaWc6aHRkaWcxMA==
Host: bpnet.cummins.com

Header line: HTTP/1.1 401 Access Denied
Header line: WWW-Authenticate: NTLM
Header line: WWW-Authenticate: Basic realm="bpnet.cummins.com"
Header line: Content-Length: 537
Header line: Content-Type: text/html
Header line:
returnStatus = 5
 not authorized
pick: bpnet.cummins.com, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: bpnet.cummins.com:80 1 document

Here is the source of index.htm
<!-- body_default.htm -->

bpcity

I can view this URL without a problem from the Netscape browser.
Judging by the rundig output, htdig attempts authentication and
fails. With the Netscape browser, there is no authentication
prompt at all and the page loads fine.
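
For reference, the "Authorization: Basic ..." line in the second request
above follows the HTTP Basic scheme: the value is the base64 encoding of
user:password, and here it is sent before the server has asked for it.
A minimal Python sketch of how such a header line is formed; the function
name and the credentials are made up for illustration and are not htdig
code:

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header line from user and password."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


# Hypothetical credentials, for illustration only.
print(basic_auth_header("user", "secret"))
```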

When I took a closer look at the rundig script, I found that I
pass a -u xxxxx:xxxx option to htdig, because this is my
generic rundig script that indexes the whole intranet and
some sites require this xxxxx:xxxx password...
After I took the -u xxxxx:xxxx option out of the rundig script,
htdig could index this site without a problem. So it seems to
me that htdig may need to retry without the password when the
server rejects it.
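
The retry idea could look roughly like the Python sketch below. This is
only an illustration of the suggested behavior, not htdig's actual code:
the fetch callable, its signature, and the fake server are assumptions
made up so the example runs without a network.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: fetch(url, credentials) -> (status_code, body).
Fetch = Callable[[str, Optional[str]], Tuple[int, str]]


def retrieve_with_fallback(
    fetch: Fetch, url: str, credentials: Optional[str]
) -> Tuple[int, str]:
    """Try with credentials first; on 401, retry once without them."""
    status, body = fetch(url, credentials)
    if status == 401 and credentials is not None:
        # The server rejected the credentials we volunteered;
        # the page may still be public, so try again without them.
        status, body = fetch(url, None)
    return status, body


def fake_fetch(url: str, credentials: Optional[str]) -> Tuple[int, str]:
    # Stand-in server: rejects any request carrying credentials,
    # serves the page when none are sent.
    if credentials is not None:
        return 401, "Access Denied"
    return 200, "bpcity"


print(retrieve_with_fallback(fake_fetch, "http://bpnet.cummins.com/", "user:secret"))
```

With this fallback, a generic -u setting would still work for the sites
that need the password, while a public site that rejects unsolicited
credentials would get a second request without them.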

Frank




This archive was generated by hypermail 2.0b3 on Tue Sep 28 1999 - 08:05:11 PDT