Subject: Re: [htdig] Local Director
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue Aug 29 2000 - 09:02:00 PDT


According to Eric Maquiling:
> Hmm, I think this is a FAQ but I searched the web pages and didn't get a
> full explanation.

Well, we should add something to the FAQ about local_urls and
local_urls_only, but it would be hard to address your question in the
FAQ because it's really about 4 different issues/questions in one.

> I have 4 Apache Servers behind a Cisco Local Director. For security
> reasons, those 4 servers are in Exodus. The 4 servers cannot, from the
> shell, connect to www.whatever.com.
>
> I have in my config:
>
> local_urls_only true
> start_url 192.168.10.XX
> local_url http://192.168.10.XX=/home/apache/htdocs
>
> Okay, indexing works but hitting the site from the "outside", I get this
> in the output:
>
> search_words
> http://192.168.10.XX/directory/search_words
>
>
> If I do:
> local_urls_only true
> start_url http://www.ourcompany.com
> local_url http://www.ourcompany.com = /home/apache/htdocs
>
> I cannot index because it cannot do an http connection from within the
> Exodus facility.

OK, first of all, if you correctly set local_urls_only to true, htdig
should not attempt an HTTP connection to your server, provided you're
running a release that supports it, i.e. 3.1.5. Earlier versions don't
support it. You didn't mention which version you're running, so I have
to treat this as a potential stumbling block.

Secondly, if you want to index files only via local_urls, you need to
choose your start_url, your URL mappings in local_urls, and your file
types carefully to make sure your files fit within the rather rigid
restrictions imposed by the local_urls handling. That means restricting
yourself to .html and .txt, and a few other file extensions itemized
in http://www.htdig.org/attrs.html#local_urls.

Thirdly, staying within the restrictions of local_urls also means you
cannot count on server redirects for directory URLs, so all directory
URLs you use, whether in your htdig.conf or in links in the documents
you're indexing, must have the trailing slash to clearly identify them as
directories as opposed to regular files. You are missing these trailing
slashes in your start_url and local_urls settings.
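
To make the last two points concrete, here is a rough Python sketch of
the kind of prefix mapping local_urls implies. It is not htdig's actual
code. The function name, the use of index.html as the default document,
and the two-entry extension set are all assumptions for illustration;
the real list of extensions is in the local_urls documentation.

```python
import os

# Partial, assumed set; the documentation lists the real extensions.
LOCAL_EXTENSIONS = {".html", ".txt"}
# Assumed default document for directory URLs.
DEFAULT_DOC = "index.html"


def map_local_url(url, url_prefix, path_prefix):
    """Return the local file path for url, or None if it cannot be mapped."""
    if not url.startswith(url_prefix):
        return None
    # Swap the URL prefix for the filesystem prefix.
    path = path_prefix + url[len(url_prefix):]
    if path.endswith("/"):
        # Only the trailing slash identifies this as a directory.
        path += DEFAULT_DOC
    _, ext = os.path.splitext(path)
    return path if ext in LOCAL_EXTENSIONS else None
```

With this sketch, http://www.ourcompany.com/docs/ maps to a file, while
http://www.ourcompany.com/docs (no trailing slash) maps to nothing,
because no server is consulted to redirect it.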

And finally, spelling and punctuation not only matter, they are critical.
You have misspelled local_urls in both examples above, you're missing
the colon after every attribute name, and you've added spaces on either
side of the '=' sign in the second example. Given these typos, I have
to assume either that the first example never worked correctly, or,
perhaps more likely, that neither example accurately reflects the
actual contents of your configuration file. That leaves us guessing
at what the problem might actually be.

What you should be using, with htdig version 3.1.5, is:

local_urls_only: true
start_url: http://www.ourcompany.com/
local_urls: http://www.ourcompany.com/=/home/apache/htdocs/
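
If you want to catch this kind of slip before running htdig, a small
checker along these lines might help. This is only a sketch in Python,
not anything shipped with htdig. The check_line function is
hypothetical, it knows only the three attributes discussed in this
thread, and it assumes a single URL=path mapping per local_urls line.

```python
import re

# Attribute names taken from this thread; a real config has many more.
KNOWN_ATTRIBUTES = {"local_urls_only", "start_url", "local_urls"}


def check_line(line):
    """Return a list of problems found in one htdig.conf-style line."""
    match = re.match(r"\s*([A-Za-z_]+)(:?)\s*(.*)", line)
    if not match:
        return ["no attribute name found"]
    name, colon, value = match.groups()
    problems = []
    if not colon:
        problems.append("missing ':' after attribute name")
    if name not in KNOWN_ATTRIBUTES:
        problems.append("unknown attribute '%s'" % name)
    if name in ("local_urls", "local_url"):
        if re.search(r"\s=|=\s", value):
            problems.append("spaces around '='")
        # Both sides of the mapping should name a directory.
        for part in value.split("="):
            if not part.strip().endswith("/"):
                problems.append("'%s' lacks a trailing slash" % part.strip())
    # A bare host with no path is a directory URL missing its slash.
    if name == "start_url" and re.fullmatch(r"https?://[^/]+", value.strip()):
        problems.append("start_url lacks a trailing slash")
    return problems
```

Run against the three corrected lines above, it returns an empty list
for each; run against the lines from your message, it reports the
missing colons, the misspelled local_url, the spaces around '=', and
the missing trailing slashes.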

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

