[htdig] A few beginner questions

Mitchell Marks (mitchell@walrus.uchicago.edu)
Sun, 13 Jun 1999 16:41:26 -0500


I've just started using ht://Dig, and have been getting pretty good results
close to what I'm looking for. However, there are a few things I'm
evidently not getting right yet, or perhaps these are defaults or
assumptions of the system I haven't caught onto yet. Anyway, I hope
someone can help me with these questions.

1. Will the htdig program look on other ports? I have material on :9673
(which some of you will recognize as Zope), and in htdig.conf I include a
URL to it:
          start_url: http://cuip.uchicago.edu:9673/
I've also tried using the (unusual) default document name:
        start_url: http://cuip.uchicago.edu:9673/
Either way, nothing from that port shows up in the indexes, and when I save
a URL list the URL with alternate port is mentioned only twice, which I
think is from its mention in links on the standard-port root page.
        If it makes any difference, there aren't pre-made separate files there,
but they're generated to look like HTML files.
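
To make the port question concrete, here is a small Python sketch of the
comparison I imagine a crawler might make. This is only my guess at the
kind of check involved, not htdig's actual code, and the function name
same_site is made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical helper: not part of htdig, just an illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}

def same_site(url, start_url):
    """Return True if url shares scheme, host, and port with start_url."""
    a, b = urlsplit(url), urlsplit(start_url)
    # Fall back to the scheme's default port when none is given explicitly.
    port_a = a.port or DEFAULT_PORTS.get(a.scheme)
    port_b = b.port or DEFAULT_PORTS.get(b.scheme)
    return (a.scheme, a.hostname, port_a) == (b.scheme, b.hostname, port_b)

# The Zope port counts as a different site from the standard-port server.
print(same_site("http://cuip.uchicago.edu:9673/", "http://cuip.uchicago.edu/"))
# An explicit :80 is the same site as the bare host name.
print(same_site("http://cuip.uchicago.edu:80/x", "http://cuip.uchicago.edu/"))
```

If htdig does something along these lines, the :9673 URLs would need to be
allowed explicitly, but I don't know whether that is the case.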

2. Part of our site is intentionally devoid of directory-default documents,
and material under them is not being caught. Does htdig strictly only
follow links found *in documents* by starting at the specified
start_url? Or is there an option for it to accept the server's
file-listing of a directory as documents it should also grab and continue?

3. I'm unclear on the significance of this, from the configuration docs:


> description:
> This is the list of URLs that will be used to start a dig when there was no
> existing database. Note that multiple URLs can be given here.

Does this mean that if I change the start_url entry in htdig.conf, the
change will not affect subsequent runs unless I erase the existing database?

4. Another htdig.conf point I don't quite get: does use of a local_urls
entry only define the filesystem equivalents, or does it also tell htdig to
dig there? That is, if the LHS is something that *would* go in start_url,
is having it in local_urls a reason for omitting it from start_url?

5. (More on local_urls) I've got this in htdig.conf:

        local_urls: http://cuip.uchicago.edu/=/pub/

First off, does that look syntactically correct? Htdig does not seem to be
using filesystem access to get to those pages, though. Or so it looks to
me by running netstat while htdig is running, and also from what the web
server's access log seems to say.
        Any suggestions on how to help htdig find these through the filesystem and
not have to switch to http so readily?
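
In case it helps show what I expect that entry to do, here is a rough
Python sketch of the prefix substitution I have in mind. It reflects my
reading of the "URL-prefix=path-prefix" form only; the function names
parse_local_urls and map_local are my own inventions, and this is not
taken from the htdig source.

```python
def parse_local_urls(value):
    """Split a local_urls value into (url_prefix, path_prefix) pairs.

    Assumes whitespace-separated entries of the form URL-prefix=path-prefix.
    """
    pairs = []
    for item in value.split():
        url_prefix, sep, path_prefix = item.partition("=")
        if sep:  # skip anything without an '=' separator
            pairs.append((url_prefix, path_prefix))
    return pairs

def map_local(url, pairs):
    """Return the filesystem path for url, or None if no prefix matches."""
    for url_prefix, path_prefix in pairs:
        if url.startswith(url_prefix):
            return path_prefix + url[len(url_prefix):]
    return None

pairs = parse_local_urls("http://cuip.uchicago.edu/=/pub/")
# The page path here is a made-up example.
print(map_local("http://cuip.uchicago.edu/docs/a.html", pairs))
# A URL on the other port does not match the prefix at all.
print(map_local("http://cuip.uchicago.edu:9673/docs/a.html", pairs))
```

Under that reading, anything below the server root should be read from
under /pub/, and only URLs that match no prefix would need http.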

Thanks very much,

Mitch Marks

Mitchell Marks                          (773) 702-6041 AAC-010A
CUIP: Chicago Public Schools / Univ. of Chicago Internet Project
    "The University of Chicago -- where they split the etym"

