htdig: Question from new user...


Jim Serio (jim@rollercoaster.com)
Sun, 21 Sep 1997 18:57:33 -0700


Hi All,
   Just discovered this great program. I was using GlimpseHTTP
for a while and wasn't impressed. Anyway, I installed
3.0.8b2 without problems, except that rundig gives the
following errors:

[root@rollercoaster bin]# rundig
././rundig: @BIN_DIR@/htdig: No such file or directory
././rundig: @BIN_DIR@/htmerge: No such file or directory
././rundig: @BIN_DIR@/htnotify: No such file or directory
././rundig: @BIN_DIR@/htfuzzy: No such file or directory
././rundig: @BIN_DIR@/htfuzzy: No such file or directory
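
(The literal @BIN_DIR@ in each message suggests the rundig script still contains an install-time placeholder that was never filled in. A minimal Python sketch for listing such leftovers in a script's text follows; the function name find_placeholders and the sample text are hypothetical and not part of ht://Dig.)

```python
import re

# Matches autoconf-style tokens such as @BIN_DIR@:
# an uppercase letter followed by uppercase letters, digits, or underscores.
PLACEHOLDER = re.compile(r"@[A-Z][A-Z0-9_]*@")

def find_placeholders(text):
    """Return the distinct @NAME@ tokens found in text, sorted."""
    return sorted(set(PLACEHOLDER.findall(text)))

# Hypothetical sample standing in for the contents of the rundig script.
sample = "@BIN_DIR@/htdig -i\n@BIN_DIR@/htmerge\n"
print(find_placeholders(sample))
```

Any token it reports would need to be replaced by hand with the real install path before the script can run.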

That's not that important, though, as I can still run htdig
manually. I am pretty sure I have set up the config file
properly. Relevant portions are:

start_url: http://www.rollercoaster.com/
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
exclude_urls: /images/ .gif
max_head_length: 75000

Now, when I run "htdig -i -v -s" I get the following:

[root@rollercoaster bin]# htdig -i -v -s

New server: www.rollercoaster.com, 80
0:0:0:http://www.rollercoaster.com/: ---+-* size = 2714
1:1:1:http://www.rollercoaster.com/hosted_pages.html: -----+------------ size = 1912
2:3:2:http://www.rollercoaster.com/thetrack: redirect
htdig: Run complete
htdig: 1 server seen:
htdig: www.rollercoaster.com:80 3 documents

It seems that it is not traversing my entire directory
structure, as I have multiple sub-directories with other
html files.

A test search for a word only in a file in one of the sub-dirs
confirms that it did not index them.

The only thing I can think of that I am doing wrong on my
end is that I do not use the full URL to any file in my
.html files. I also make extensive use of SSI. So, in my
index.html file, all files are referenced relatively, like
<a href="/census"> instead of <a href="http://blah.com/census">
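
For reference, a relative href is normally resolved against the URL of the page that contains it, so it should end up naming the same kind of absolute URL as the full form. A small sketch using Python's urllib.parse.urljoin, which applies the standard resolution rules; it illustrates the general rule only, not how htdig works internally, and the index.html base URL is an assumption:

```python
from urllib.parse import urljoin

# Assumed URL of the page that contains the links.
base = "http://www.rollercoaster.com/index.html"

# A root-relative link replaces the whole path of the base URL.
census = urljoin(base, "/census")

# A plain relative link replaces only the last path segment.
hosted = urljoin(base, "hosted_pages.html")

print(census)
print(hosted)
```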

Could that be the problem? From what I understand of this
program, it acts like a spider, traversing each link it
finds.
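
To make the "spider" idea concrete, here is a minimal sketch of a single crawl step: pull every <a href> out of a page and turn it into an absolute URL. It uses only Python's standard library; LinkCollector and extract_links are hypothetical names, and this is not a description of htdig's actual parser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative and absolute hrefs both come out absolute.
                self.links.append(urljoin(self.base_url, value))

def extract_links(base_url, html):
    """Return the absolute URLs linked from html, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links

page = '<a href="/census">Census</a> <a href="http://blah.com/census">Full</a>'
print(extract_links("http://www.rollercoaster.com/", page))
```

A crawler would repeat this step for each newly found URL that is still within the allowed site, skipping ones it has already visited.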

Any help on this would be appreciated.

BTW, I also run a few majordomo lists and *really* love the
web archive (Webarc) for this list. Besides hypermail, is
Webarc currently available or does anyone have pointers to
other majordomo -> www converters?

Jim

--
Jim Serio jim@rollercoaster.com (PGP Key ID: 0xE5E9F23E)
World of Coasters - http://www.rollercoaster.com
The Web's Premier Coaster Site!


