Re: [htdig] Different domains?


Subject: Re: [htdig] Different domains?
From: David Adams (D.J.Adams@soton.ac.uk)
Date: Fri Jul 28 2000 - 04:27:43 PDT


Quoting Ken Convery <ken@aviansportal.com>:

> ht://dig looks like a great tool for maintainers of intranets and/or several
> internet web servers. I have a question about its application to something
> we are trying to do here. We are developing relationships with a few other
> online companies and want to make content from their sites available by link
> on our site. We are thinking we can use ht://dig to index those other sites
> so we can search out and display the pertinent information on our site in
> summary form and provide the link to a specific page on their site for more
> information.
>
> In a nutshell: can ht://dig index other web servers specified outside my
> domain or network?

Yes. I maintain a "local community" index which now covers almost a thousand
servers (real and virtual), most of them commercial.

I would recommend that you access them through a proxy, specified by the
http_proxy: statement in the configuration file.
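
For illustration only, such an entry might look like the following. The
proxy host and port are made-up placeholders, not a real server:

# Hypothetical proxy address; substitute your own proxy host and port.
http_proxy: http://proxy.example.com:3128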

>
> If so would we need other than http to these other servers or any special
> access such as file system privileges?
>

No. HTTPS servers are a special case, though, and I can't answer for them.

> Secondly, are there any problems with sites that generate content
> dynamically? Or will ht://dig simply look at static HTML pages or other
> static documents?
>

There are usually no difficulties with dynamic pages, but problems can occur.
The exclude_urls: statement is intended to trap the problem URLs: any URL
containing one of the listed substrings is skipped. In my case I only have

exclude_urls: &referer=

I suggest caution: add sites one by one to your search list, and keep
max_hop_count and server_max_docs low at first.
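
As a rough sketch of such a cautious first run (the site URL and the
numeric limits below are illustrative assumptions, not recommended values):

# Hypothetical partner site; add further sites one at a time.
start_url: http://www.example.com/
# Assumed starting limits: follow links only two hops from the start page
# and take at most 100 documents from any one server.
max_hop_count: 2
server_max_docs: 100

Once a site indexes cleanly, the limits can be raised and the next site
added.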

>
> Thank you very much
> Ken Convery
> Avian Pilot Systems Inc.
>

David Adams
<D.J.Adams@soton.ac.uk>
Computing Services
Southampton University
