Subject: Re: [htdig] htdig and cgi
From: Geoff Hutchison (firstname.lastname@example.org)
Date: Wed Mar 29 2000 - 06:29:09 PST
At 11:26 AM +0200 3/29/00, Matthias Kleine - Patzschke + Rasp
Software AG wrote:
>Folders are converted into folder links, much like in FTP directories.
>Documents are converted into links: you can click the link (= filename)
>and the document is opened.
>Now for my problem:
>Up to now, only the folder links are found by the htdig search engine,
>and not the documents. What I don't understand is how the database is
>created. I suppose that this mechanism conflicts with our CGI
>structure.
Conceptually it's fairly easy. htdig builds the database by following
all the links it finds, starting from the URL(s) listed in start_url.
It ignores links that are forbidden by robots.txt files or META robots
tags, links that match a pattern in exclude_urls, and links that don't
match a pattern in limit_urls_to.
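As a rough illustration of the filtering described above, here is a minimal Python sketch. It is not htdig's actual code: the function name should_follow, the example URLs, and the choice of plain substring matching are all assumptions made for illustration, and the robots.txt and META robots checks are left out entirely.

```python
def should_follow(url, limit_urls_to, exclude_urls):
    """Return (True, "") if a link would be followed, else (False, reason).

    Hypothetical helper: patterns are treated as plain substrings,
    which is an assumption, not a statement about htdig's matching.
    """
    # A link must match at least one limit_urls_to pattern.
    if not any(pattern in url for pattern in limit_urls_to):
        return False, "not matched by limit_urls_to"
    # A link must not match any exclude_urls pattern.
    for pattern in exclude_urls:
        if pattern in url:
            return False, "matched exclude_urls pattern " + repr(pattern)
    return True, ""


if __name__ == "__main__":
    # Example values only; substitute the settings from your own config.
    limit = ["http://www.example.com/"]
    exclude = ["/cgi-bin/"]
    links = [
        "http://www.example.com/docs/",
        "http://www.example.com/cgi-bin/view?doc=a.txt",
        "http://other.example.org/",
    ]
    for link in links:
        print(link, should_follow(link, limit, exclude))
```

With these example values, a document link served through a path containing /cgi-bin/ is rejected even though it sits on the right host, which is one way a CGI-generated link structure could end up with its documents missing from the index. Whether that applies here depends on the exclude_urls and limit_urls_to values actually in use.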
Of course, if you turn on some of the debugging flags (say, run htdig
-vvv), you'll see the reasons htdig is rejecting links.
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/