htdig: Re: Remote traversal with WebGLIMPSE


Aaron Newsome (aaron.d.newsome@wdc.com)
Mon, 04 May 1998 12:35:34 -0700


I did run confarc, and I did answer yes to "Traverse Remote Links". I also
set the explicit option to no. I actually experimented with *all*
combinations of options for those two parameters. None would index the remote
files.

I have a very simple setup so I'll include the files and the output in this
email.

archive.cfg:
=========
title Western Digital Global Search
urlpath http://pixie.wdc.com/index
traverse_type 1
explicit_only 0
numhops 5
nhhops 3
local_limit 99999
remote_limit 10000
addboxes 0
vhost default
usemaxmem 0
urllist http://pixie.wdc.com/index/index.html
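
As a quick sanity check, the settings above can be read back programmatically. This is only a minimal sketch: it assumes every line of archive.cfg is a key followed by whitespace and a value, and parse_archive_cfg is a hypothetical helper, not part of WebGlimpse.

```python
# Hypothetical helper (not part of WebGlimpse): reads "key value" lines.
# Assumption: one setting per line, key and value separated by whitespace.
def parse_archive_cfg(text):
    settings = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


# The archive.cfg contents quoted in this message.
SAMPLE = """\
title Western Digital Global Search
urlpath http://pixie.wdc.com/index
traverse_type 1
explicit_only 0
numhops 5
nhhops 3
local_limit 99999
remote_limit 10000
addboxes 0
vhost default
usemaxmem 0
urllist http://pixie.wdc.com/index/index.html
"""

cfg = parse_archive_cfg(SAMPLE)
# Print the settings that look relevant to remote traversal.
for key in ("traverse_type", "explicit_only", "numhops", "remote_limit"):
    print(key, "=", cfg[key])
```

This only confirms what the file contains; it says nothing about how WebGlimpse interprets each value.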

http://pixie.wdc.com/index/index.html:
===========================
http://gatekeeper.wdc.com
http://year2000.wdc.com
http://delta2.wdc.com/oraimp

Url http://dragon.wdc.com:80/ is remote...
Getting remote url: http://dragon.wdc.com:80/
Url http://year2000.wdc.com:80/ is remote...
Getting remote url: http://year2000.wdc.com:80/
Url http://delta2.wdc.com:80/oraimp is remote...
Getting remote url: http://delta2.wdc.com:80/oraimp
Url http://gatekeeper.wdc.com:80/ is remote...
Getting remote url: http://gatekeeper.wdc.com:80/
No more links to traverse.

------------------------------------------------------
Collected 1 local pages and 0 remote pages.
------------------------------------------------------

Creating neighborhood for /usr/local/apache/html/index/index.html.
No search boxes used

This is glimpseindex version 4.1, 1997.

Indexing "/usr/local/apache/html/index/index.html
http://pixie.wdc.com/index/index.html" ...

Size of files being indexed = 508 B, Total #of files = 1

Index-directory: "/usr/local/apache/html/index"
Glimpse-files created here:
-rw-r--r-- 1 root root 91 May 4 12:29 .glimpse_filehash
-rw-r--r-- 1 root root 262144 May 4 12:29 .glimpse_filehash_index
-rw-r--r-- 1 root root 89 May 4 12:29 .glimpse_filenames
-rw-r--r-- 1 root root 4 May 4 12:29 .glimpse_filenames_index
-rw-r--r-- 1 root root 4 May 4 12:29 .glimpse_filetimes
-rw-r--r-- 1 root root 4 May 4 12:29 .glimpse_filetimes.index
-rw-r--r-- 1 root root 175 May 3 17:17 .glimpse_filters
-rw------- 1 root root 306 May 4 12:29 .glimpse_index
-rw-r--r-- 1 root root 116 May 4 12:29 .glimpse_messages
-rw------- 1 root root 58 May 4 12:29 .glimpse_partitions
-rw-r--r-- 1 root root 1353 May 4 12:29 .glimpse_statistics
-rw-r--r-- 1 root root 262144 May 4 12:29 .glimpse_turbo
Zero sized output for: /usr/local/apache/html/index/.nh.index.html
hash_misses=0 num_input_filenames=1
pixie:/usr/local/apache/html/index#
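
The mismatch in the output above (several "Getting remote url" lines, yet zero remote pages collected) can be pulled out of the log mechanically. This is a rough sketch that assumes the log wording is exactly as shown in this message; summarize is a hypothetical helper name.

```python
import re

# Traversal log lines copied from the output in this message.
LOG = """\
Getting remote url: http://dragon.wdc.com:80/
Url http://year2000.wdc.com:80/ is remote...
Getting remote url: http://year2000.wdc.com:80/
Url http://delta2.wdc.com:80/oraimp is remote...
Getting remote url: http://delta2.wdc.com:80/oraimp
Url http://gatekeeper.wdc.com:80/ is remote...
Getting remote url: http://gatekeeper.wdc.com:80/
No more links to traverse.
Collected 1 local pages and 0 remote pages.
"""


def summarize(log):
    """Return (attempted remote URLs, local count, remote count)."""
    attempted = re.findall(r"^Getting remote url: (\S+)$", log, flags=re.MULTILINE)
    match = re.search(r"Collected (\d+) local pages and (\d+) remote pages\.", log)
    if match is None:
        return attempted, None, None
    return attempted, int(match.group(1)), int(match.group(2))


attempted, local, remote = summarize(LOG)
print(len(attempted), "remote fetches attempted;", remote, "remote pages collected")
for url in attempted:
    print("  attempted:", url)
```

So the traversal does try each remote URL; whatever goes wrong happens between the fetch and the page being counted as collected.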

Does anybody have a clue what is going on here? I think I have fought with
this for too long.

No disrespect to the authors, but I have read and re-read all of the docs and
still cannot figure this out. I did, however, download and compile ht://Dig.
It worked perfectly the first time (on all the remote files). And it has two
advantages over webglimpse.

* It's free <- This is a big one
* It works <- equally important

For now ht://Dig has solved my needs, but I would still like to understand
how to make webglimpse work. I may want to run it at home or something.

Thanks for all your help.

Golda Bernstein wrote:

> At 05:54 PM 5/3/98 -0700, Aaron Newsome wrote:
> >I have tried every combination of archive.cfg directives I can think of.
> >When I try to archive remote sites I get a message that says:
> >
> >Skipping non-local url:
> >
> >Is there any way to make webglimpse index non-local URLs?
> >
> >Thanks,
> >Aaron Newsome
> >aaron.d.newsome@wdc.com
>
> Yes, see the sample archive.cfg file at
>
> http://tucson.com/webglimpse/sample.archive.cfg
>
> for more complete docs on what each setting does. When you run confarc it
> should also prompt you for whether to index remote pages, and you should
> answer Y to get the right archive.cfg setting put in automatically.
>
> If you're having trouble running confarc on your machine, you may want to
> try the latest beta (1.6b2). You can download it from
> http://tucson.com/webglimpse.
>
> --Golda
>
> ------------------------------------------------------------------
> Golda Bernstein mailto:gberns@tucson.com Ph. (520) 620-6878
> Internet WorkShop http://tucson.com FAX (520) 620-6841



