Re: [htdig] questions about htdig


Subject: Re: [htdig] questions about htdig
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Thu Jul 27 2000 - 08:53:07 PDT


According to ti980247:
> Hi. I'm a newcomer to this searching stuff. I've installed htdig on
> Mandrake 7.0 with PHP 4.0 and Apache 1.3.12. Everything went fine until I
> tried to index my server.
>
> I changed htdig.conf and set the URL to my website.
> I ran ./htdig -h 5 -s but it returned
> htdig: my.web.server:80 1 document
>
> Then I checked the wordlist file. It's very short, so I think something went
> wrong when it indexed my site, because my site contains 3975 HTML files.

Try adding "-i -vvv" to the above htdig command, and look for clues in the
verbose output. For some reason, it's not going beyond the start_url.
My guess is that your limit_urls_to is too restrictive. It defaults
to the same value as start_url, so if you set the latter to the URL
of a single page, rather than the main URL for a site or subdirectory,
that's all you get unless you set limit_urls_to more liberally.
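
To picture why a single-page start_url stops the dig after one document,
here is a rough Python sketch. It is only a model, not htdig's own code:
it assumes limit_urls_to behaves as a list of substring patterns that a
URL must match before htdig will follow it. The function name "allowed"
and the example URLs are made up for illustration.

```python
def allowed(url, limit_urls_to):
    """Return True if the URL matches at least one limit pattern.

    Assumption: a pattern matches when it occurs as a substring of the URL.
    """
    return any(pattern in url for pattern in limit_urls_to)


# start_url points at a single page; limit_urls_to defaults to the same value.
limit = ["http://my.web.server/index.html"]
print(allowed("http://my.web.server/index.html", limit))   # True: the start page
print(allowed("http://my.web.server/docs/h.html", limit))  # False: never followed

# limit_urls_to widened to the whole site.
limit = ["http://my.web.server/"]
print(allowed("http://my.web.server/docs/h.html", limit))  # True
```

Under that assumption, every link on the start page is rejected in the
first case, which would give exactly the "1 document" result.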

> My HTML files always use dynamic links, not static ones (e.g. a href="../h.html"
> instead of a href="http://my.web.com/h.html")

These are both static links. The first is relative, the second is
absolute. htdig can properly handle both. Dynamic links are those
constructed by the browser software, e.g. from JavaScript code, which
htdig will not handle.
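
To show that the two forms end up at the same place, here is a small
Python sketch using the standard library's urljoin. The base URL is an
assumption: it pretends the page holding the links lives in a "docs"
subdirectory of the site.

```python
from urllib.parse import urljoin

# Hypothetical page that contains both links.
base = "http://my.web.com/docs/index.html"

# A relative link is resolved against the URL of the page it appears in.
relative = urljoin(base, "../h.html")

# An absolute link is already complete, so the base is ignored.
absolute = urljoin(base, "http://my.web.com/h.html")

print(relative)              # http://my.web.com/h.html
print(absolute)              # http://my.web.com/h.html
print(relative == absolute)  # True
```

Any indexer that resolves relative links this way sees one and the same
document for both.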

> Any ideas? Does htdig's indexing depend on the links in files, or on the files
> in my web directory?

Links in files, exclusively. htdig will NOT look at directories, unless
the web server feeds them to htdig as HTML documents containing links
to files (which web servers like Apache commonly do when there's no
index.html).
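
As an illustration of "links in files", the Python sketch below pulls the
href values out of a page, which is all a link-following indexer has to go
on. It is not htdig's parser; the LinkCollector class and the sample
directory listing are invented for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical server-generated listing for a directory with no index.html.
listing = """<html><body><h1>Index of /docs</h1>
<a href="a.html">a.html</a>
<a href="b.html">b.html</a>
</body></html>"""

collector = LinkCollector()
collector.feed(listing)
print(collector.links)  # ['a.html', 'b.html']
```

A file that sits in the directory but is not linked from any page or
listing never shows up in that list, so it would never be fetched.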

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2b28 : Wed Jul 26 2000 - 22:51:42 PDT