Re: [htdig] Local URL's


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Wed, 23 Jun 1999 09:48:31 -0500 (CDT)


According to info@edoc.co.za:
> my server was moved behind a firewall that does not allow http
> requests out.

Ouch! Is there a proxy server you can go through? If so, htdig can be
configured to use it by setting the http_proxy attribute.

> My htdig.conf file is as follows:
>
> The definitions are in one long line.
>
> start_url: http://www.edoc.co.za/ http://smithfield.co.za/
> http://iwd.co.za/
>
> limit_urls_to: ${start_url}
>
> # Particulars of local pages
> local_default_doc: index.html
> local_urls:
> http://www.edoc.co.za/=/usr/home/wwwusers/edoc/edoc/
> http://edoc.co.za/=/usr/home/wwwusers/edoc/edoc/
> http://smithfield.co.za/=/usr/home/wwwusers/iwd/smithfield/
>
> I believe that this should force htdig to read the files without trying
> to make an HTTP call.
>
> I run htdig 3.1.2

Unfortunately, htdig currently needs to establish an initial HTTP
connection with each server being indexed, even when you use local_urls.
It uses this connection to fetch the robots.txt file, or just to check
whether there is one. It also falls back on HTTP for any file that
doesn't have a .html, .htm or .txt suffix. There are plans to eventually
allow totally local indexing, without using HTTP at all, but that isn't
implemented yet.
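
To make the local-versus-HTTP decision concrete, here is a rough Python
sketch of the behaviour described above. It is not htdig's actual code:
the function name resolve_local and the way local_default_doc is applied
to URLs ending in "/" are assumptions made for illustration, and the
initial robots.txt connection is not modelled at all. The prefix
mappings are copied from your quoted local_urls setting.

```python
# Prefix-to-directory pairs copied from the quoted local_urls setting.
LOCAL_URLS = {
    "http://www.edoc.co.za/": "/usr/home/wwwusers/edoc/edoc/",
    "http://edoc.co.za/": "/usr/home/wwwusers/edoc/edoc/",
    "http://smithfield.co.za/": "/usr/home/wwwusers/iwd/smithfield/",
}
LOCAL_DEFAULT_DOC = "index.html"          # from local_default_doc
LOCAL_SUFFIXES = (".html", ".htm", ".txt")  # suffixes read from disk


def resolve_local(url):
    """Return a local file path for url, or None if HTTP would be used.

    Hypothetical helper: it only mirrors the rules stated in this message.
    """
    for prefix, directory in LOCAL_URLS.items():
        if url.startswith(prefix):
            rest = url[len(prefix):]
            # Assumption: a URL naming a directory gets the default document.
            if rest == "" or rest.endswith("/"):
                rest += LOCAL_DEFAULT_DOC
            # Any other suffix falls back on HTTP.
            if rest.endswith(LOCAL_SUFFIXES):
                return directory + rest
            return None
    # No matching prefix: the URL can only be fetched over HTTP.
    return None


if __name__ == "__main__":
    for u in ("http://www.edoc.co.za/",
              "http://smithfield.co.za/logo.gif",
              "http://iwd.co.za/"):
        print(u, "->", resolve_local(u))
```

Note that under these rules http://iwd.co.za/ resolves to nothing local,
because the quoted local_urls has no entry for it even though it appears
in start_url, so that site would go over HTTP in any case.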

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930


