Re: [htdig] htdig program hangs on one particular URL


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Wed, 10 Mar 1999 12:00:33 -0600 (CST)


According to Dan Dexter:
> I'm running htDig 3.1.0b4 on a Digital UNIX 4.0D system.
>
> The htdig program hangs when it tries to index the document
> http://inspection.jsc.nasa.gov/I98Exhibit/421.html
>
> I think it might be caused by the META tags in this document. My solution
> to htdig hanging on this document is simply to exclude it in the htdig
> configuration file.
>
> I will be upgrading to htDig 3.1.1 soon, but I would like to know if anyone
> with htDig 3.1.1 can successfully index this particular document.
>
> If v3.1.1 can not index this document, then htDig might need to be updated to
> make it more robust to the broken HTML in this document.

I've tried both htdig 3.1.1 and htdig 3.1.0b4 (Red Hat Linux) on the
URL above, and neither of them hangs! The META tags are strange in
that document. Because the content= strings for the meta description
and meta keywords tags aren't quoted, htdig doesn't grab the whole
thing (only up to the first space), but that doesn't cause it to hang.

Can you run htdig to index only that document on your system, and if so,
does it still hang? (Use a temporary config file that sets start_url &
limit_urls_to to just the one URL, and give htdig the config file name
with the -c option. Add in many -v's for good measure.)
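
For illustration, a minimal sketch of such a temporary config file.
The file name (one-url.conf) and the database_dir path are my own
assumptions, not anything from your setup; database_dir is there only
so the test run doesn't touch your real databases, and the directory
has to exist before you run htdig.

```
# one-url.conf (hypothetical name): dig a single document only
# Assumed scratch directory, kept separate from the production databases
database_dir:   /tmp/htdig-test
start_url:      http://inspection.jsc.nasa.gov/I98Exhibit/421.html
limit_urls_to:  ${start_url}
```

You would then run something like "htdig -vvv -c one-url.conf",
adjusting the path to wherever you saved the file.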

If it still hangs, use -vvvvvvv to see how far it gets before hanging.
If you can get a stack backtrace (by running htdig from the debugger,
or triggering a core dump when it hangs), that may be useful too.
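
As a rough sketch of the debugger route, assuming gdb is installed
(on Digital UNIX the native debugger may be a different one, with
different commands) and assuming a hypothetical config file path:

```
$ gdb /path/to/htdig
(gdb) run -vvvvvvv -c /tmp/one-url.conf
    ... wait for the hang, then press Ctrl-C ...
(gdb) bt
```

For the core dump route, sending the hung process an abort signal
(kill -ABRT with its process ID) should produce a core file, provided
your shell's core file size limit allows it.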

I'm assuming it's a configuration or OS-related problem, as I can't
reproduce it here.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930