Re: [htdig] htdig indexing using local_url question


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Mon, 25 Oct 1999 15:31:08 -0500 (CDT)


According to Stephen Yeoh:
> After searching the htdig site for information on indexing an SSL site, I have
> set up indexing by using the local_url setting.
>
> htdig does go to the file system and grabs index.html, but none of the other
> files in the directory. I have tried with and without the local_default_doc
> setting.
>
> I touch index.html and then I run with the -vvv option, I get the message while
> parsing index.html for every single link that is found:
>
> start_url: https://foo.com/
> limit_urls_to: ${start_url}
> local_urls: https://foo.com/=/www/foo/
>
> Rejected: not an http or relative link
>
> on every single link, even if they are on the same https://foo.com/ site. My
> site uses all relative links except for external references.

Currently, htdig does not support URLs that begin with https://, even when
using local_urls to bypass the server. A trick that might work would be
to index using http:// instead, but use local_urls to point to the directory
that contains the contents of the secure server. You'd need to use separate
configuration files for digging and searching, with a different
url_part_aliases setting in each, so that the http:// URLs stored by the dig
are rewritten into https:// in the search results.
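
Here is a rough, untested sketch of what the two configuration files might
look like. The file names and the *site placeholder are only examples made
up for illustration; the host and path are the ones from your message. Both
files would need to refer to the same database.

```
# dig.conf (hypothetical name): used when digging
start_url:        http://foo.com/
limit_urls_to:    ${start_url}
local_urls:       http://foo.com/=/www/foo/
url_part_aliases: http://foo.com/ *site

# search.conf (hypothetical name): used by htsearch
url_part_aliases: https://foo.com/ *site
```

The idea is that the dig stores *site in place of http://foo.com/ in the
database, and the search configuration expands *site to https://foo.com/
when it reads the URLs back.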

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2.0b3 on Mon Oct 25 1999 - 15:42:35 PDT