Torsten Neuer (firstname.lastname@example.org)
Wed, 16 Jun 1999 10:01:42 +0200
According to Lim Swee Tat (IS NCS):
>Basically, I'm supposed to implement a search engine in the company's
>intranet. But due to the way the intranet is working now, I'm having a bit
>of major trouble getting any spider/robot or agent to set up the basic
>database at all.
>Basically, when the user makes a request to the site, on a prespecified port
>number say, 81, a servlet thread class is called to set up the connection
>and authenticate the user. Thereafter, each page he goes to is basically
>generated by the different servlets.
>The basic problem here is that because the servlet writes its output
>directly to the server, the search robot/spider or agent would best gather
>information by placing its requests directly to the server, but there is no
>way to authenticate the robot/spider or agent. And since the servlets are
>the ones generating the output, how can the robot/spider or agent search
>that particular output rather than the .java file which it is not searching
Servlets as such are not a problem: as long as the robot fetches the pages
via HTTP and you have not listed the corresponding URLs in the bad_urls
list, their output is indexed like any static page.
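For instance, pointing the crawler at the servlet port is just a matter of the start_url attribute. A hedged sketch of an htdig.conf fragment -- the hostname, paths, and excluded URLs below are placeholders, adjust them to your setup:

```
# Hypothetical htdig.conf fragment (all values are placeholders).
database_dir:   /opt/htdig/db
# Crawl the servlet-backed server on port 81:
start_url:      http://intranet.example.com:81/
# Keep the robot away from URLs you do not want indexed:
exclude_urls:   /cgi-bin/ /logout
```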
Your real problem seems to be authentication. AFAIK, ht://Dig supports the
standard (basic) HTTP authentication scheme.
What authentication scheme are you using?
What does htdig (verbose mode) tell you?
What do the server logs say?
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                Tel: +49-4101-403605
D-25474 Ellerbek                Fax: +49-4101-403606
E-Mail: email@example.com       Internet: http://www.inwise.de
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
firstname.lastname@example.org containing the single word "unsubscribe" in
the SUBJECT of the message.
This archive was generated by hypermail 2.0b3 on Wed Jun 16 1999 - 00:22:31 PDT