Re: [htdig] Htdig unable to read link


Subject: Re: [htdig] Htdig unable to read link
From: Chad Phillips (gphillip@aafp.org)
Date: Fri May 05 2000 - 20:49:50 PDT


I tried htdig 3.1.5 and 3.2b. The start URL is just the URL of the servlet, and the rest of the config file is pretty much the default. I found a copy of my old Java code, and htdig works fine with it. I added a flush statement and changed the server so that it adds a header to each page.

Htdig may be fine; the problem may be on my end. It is just weird that the URL works fine with IE and Netscape but not with htdig.
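
One way to see what a non-browser client receives is to send a bare request and look at the raw status line and headers, since browsers tend to be forgiving about missing or malformed headers. The following is only a rough sketch in Python, not anything htdig itself does; the function names (build_request, parse_head, fetch_head) are made up for illustration, and it assumes plain HTTP on port 80 unless the URL says otherwise.

```python
import socket
from urllib.parse import urlsplit


def build_request(url: str) -> bytes:
    """Build a bare HTTP/1.0 GET request with no browser-style headers."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [f"GET {path} HTTP/1.0", f"Host: {parts.hostname}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_head(raw: bytes):
    """Split a raw response into (first_line, headers).

    If the server sent no header block at all, first_line is simply
    the first line of the body and headers is empty.
    """
    text = raw.replace(b"\r\n", b"\n")
    head = text.split(b"\n\n", 1)[0].decode("iso-8859-1")
    first_line, *header_lines = head.split("\n")
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return first_line, headers


def fetch_head(url: str, timeout: float = 10.0, limit: int = 65536):
    """Send the bare request and return the parsed start of the response."""
    parts = urlsplit(url)
    address = (parts.hostname, parts.port or 80)
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(build_request(url))
        raw = b""
        # HTTP/1.0 without keep-alive: the server closes when it is done.
        while len(raw) < limit:
            chunk = sock.recv(4096)
            if not chunk:
                break
            raw += chunk
    return parse_head(raw)
```

Calling fetch_head on the servlet URL and printing the result shows roughly what an indexer has to work with. A first line that does not start with "HTTP/", or a missing Content-Type header, would point at the servlet output rather than at htdig.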

>>> Geoff Hutchison <ghutchis@wso.williams.edu> 05/05/00 22:04 PM >>>
At 9:30 PM -0500 5/5/00, Chad Phillips wrote:
>My company's web site has some dynamic content that is generated by Java
>servlets. I can't seem to get htdig to index the links. It reads
>the URL fine, but says that it is a broken link.

Ok, so from your message, I can tell that you're using some flavor of
3.2.0 beta. It would be useful to know what version you're using,
what your config file looks like and so on.

Also, I assume you read the upgrade guide mentioned in the release notes?

>Htdig used to index them fine, but we made some changes to the
>server and Java code and now htdig can't read them. Does anyone
>have an idea of what is wrong with htdig, or what the servlets may
>be doing wrong?

It's really hard to tell from your description what's going on. The
URL you posted looks OK, but the referer looks a little weird. Still,
why do you think it's a problem with ht://Dig? You say you made
changes to the servlet code--are your old URLs still valid? If you're
running an update of your databases, then you must remember that it
will go through and check the old URLs.

>http://www.aafp.org/servlet/mntPress?prhtml=press_list_pub&actioncode=list&category=press
>Ref: http://:0

One other thing is worth mentioning--in 3.2.0 (and to a limited
degree in 3.1.5), it will stop indexing servers that it cannot reach.
So if your server causes a timeout and the indexer cannot get through
on a retry, it will mark the server as dead. This can be tuned in
3.2.0b2 and later through attributes such as tcp_max_retries and
tcp_wait_time, as mentioned in the release notes and documentation.
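
As a rough illustration, those two attributes would go in the config file along these lines. The values are arbitrary examples rather than recommendations, and the assumption that the wait time is given in seconds should be checked against the attribute documentation for your version:

```
# Fragment of an ht://Dig config file (3.2.0b2 or later).
# Values are illustrative only.
tcp_max_retries: 3
tcp_wait_time: 10
```

Raising them should make the indexer more patient with a slow servlet before it gives up on the server.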

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

------------------------------------ To unsubscribe from the htdig mailing list, send a message to htdig-unsubscribe@htdig.org You will receive a message to confirm this.



This archive was generated by hypermail 2b28 : Fri May 05 2000 - 18:35:52 PDT