Re: htdig: ht://dig doesn't work.


webmaster@www.nisu.flinders.edu.au
Tue, 29 Sep 1998 11:15:01 +0930 (CST)


On 28 Sep, Phillip Morgan wrote:
> Hello webmaster,
>
>> > I'm having terrible problems getting ht://dig to work on my Slackware Linux system
>> > (kernel 2.0.30).
>
>> You might try setting your start point to
>>
>> http://www.ehcs.com.au/index.htm
>>
>> and leave your limit_urls_to at http://www.ehcs.com.au. Use rundig to
>> rebuild from scratch.
>>
>> I found I needed to point my htdig at my main navigation page, rather
>> than the home page address. If this works, I think you are then going to
>> have trouble with your frames...
>
> Sigh. It never ends.... :-(
>
> Initially, not knowing anything about search engines, I thought it would simply index all
> pages in the document root specified as start_url. Then it became obvious htdig actually
> follows [some] links through the web pages.
>
> What's irking me is why it is only some, not others. I have two systems.
> http://www.ehcs.com.au is the primary machine, ftp://ftp.ehcs.com.au is another. The web
> pages are currently stored at http://www.ehcs.com.au, which is really
> /usr/local/etc/httpd/htdocs.
>
> Here's an extract from the chain of web files. The first is index.htm, which points to
> ftpmenu.htm, which has four buttons.
>
> [index.htm]
> <a href="ftpmenu.htm" TARGET="main" onMouseOver="window.status='Visit our HUGE FTP site and
> download everything for free!' ;return true" onMouseOut="window.status='';return true">
> <IMG SRC="./pics/idxbut10.gif" BORDER=0 ALT="Huge FTP Archive."></A>
>
> --
>
> [ftpmenu.htm]
> <a href="htdigsrch.html"><img src="./pics/idxbut23.gif" border=0 alt="Use our search
> engine to locate the files you need"></a>
>
> <a href="filelogo.htm"><img src="./pics/idxbut22.gif" border=0 alt="Browse through the
> files by category"></a>
>
> <a href="ftp://ftp.ehcs.com.au"><img src="./pics/idxbut24.gif" border=0 alt="Browse
> through the files via text directory listings"></a>
>
> <a href="ftpindex.htm"><img src="./pics/idxbut21.gif" border=0 alt="View the HUGE list
> of files by filename."></a>
>
> The third button (idxbut24.gif), brings up the text based directory listing, which is
> probably obvious from the code. The last (idxbut21.gif), is a list of every file on the
> ftp machine, as shown by the next few lines...
>
> --
>
> [ftpindex.htm]
> <A HREF="ftp://ftp.ehcs.com.au/lists/allfiles.zip"><B>allfiles.zip</B></A>
> <A HREF="ftp://ftp.ehcs.com.au/lists/xxxfiles.zip"><B>xxxfiles.zip</
>
> htdig won't catalog these pages. In fact, it won't even catalog ftpindex.htm, but it does
> catalog ftpmenu.htm.
>
> Has me baffled :-(
>
> --
> cheers,
>
> Phillip Morgan,
>
> email: admin@ehcs.com.au fax (03) 9876 5294
> vox 0419 874 804
> (03) 9876 5295

I'm no great expert on htdig; however, I did a little grepping through
the code and found this comment in the htdig source:

// Currently we only deal with HTTP URLs. Gopher and ftp will come later.

That's presumably why it's not following the link into your ftp box
from the Directory button: it is not an http-prefixed link, and you
may also have a limit_urls_to setting that restricts the dig to
www.ehcs.com.au/
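
To make the two rules concrete, here is a rough sketch in Python. It is
not htdig's actual code; the function name would_follow and the plain
prefix match are my own assumptions about how such a filter behaves. The
hrefs are the four buttons from your ftpmenu.htm.

```python
from urllib.parse import urljoin, urlsplit


def would_follow(base_url, href, limit_prefixes):
    """Return True if a crawler with these two rules would queue the link.

    Hypothetical helper, not part of htdig: it only models
    "HTTP URLs only" plus a limit-by-prefix restriction.
    """
    # Resolve relative links such as "ftpindex.htm" against the page URL.
    absolute = urljoin(base_url, href)
    # Rule 1: only http links are considered at all.
    if urlsplit(absolute).scheme != "http":
        return False
    # Rule 2: the URL must start with one of the allowed prefixes.
    return any(absolute.startswith(prefix) for prefix in limit_prefixes)


base = "http://www.ehcs.com.au/ftpmenu.htm"
limits = ["http://www.ehcs.com.au"]
for href in ["htdigsrch.html", "filelogo.htm",
             "ftp://ftp.ehcs.com.au", "ftpindex.htm"]:
    print(href, would_follow(base, href, limits))
```

Under these assumptions only the ftp:// button is rejected, so the other
three pages, ftpindex.htm included, should pass this particular test.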

However, I would expect htdig to find its way down to the Area List and
Master List sections, as they are HTTP URLs. If you have set a low
number for max_hop_count, that could be the cause.

Cheers

-- 
David Robley

WEBMASTER                          | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet                           | http://auseinet.flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA
Visit the PHP mirror at http://au.php.net:81/



