Re: [htdig] indexing local pages problem


Subject: Re: [htdig] indexing local pages problem
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue Mar 14 2000 - 11:27:07 PST


According to Wilfried Geis:
> All the pages are in the directory /http/userpages/johndoe/
> These pages can be called with: http://www.myserver.de/homepages/johndoe
>
> In order to index the pages I have created a small html-file with all
> the links to the pages in it and have set
> local_urls:
> http://www.myserver.de/homepages/=/http/userpages/
>
> This works fine, however, htdig does not find the appropriate
> index-pages locally and thus falls back to http-retrieval.
>
> Then I wanted to force htdig to fetch the pages locally and have set:
> local_default_doc: welcome.htm index.html
> local_urls_only: true
>
> But with this setting htdig does not index anything.
> The only way it works is when my start-document (the script-generated
> index-page) contains the entire path including welcome.htm or
> index.html.
>
> That looks to me like the setting 'local_default_doc' is not working. (I
> have upgraded to version 3.1.5)
>
> Has anyone else problems with this setting or is there something I might
> have missed in the docs?

For URLs that point to directories, you must include the trailing "/" for
local indexing to work. When indexing through HTTP, a directory URL that
lacks the trailing slash makes the server issue a redirect to the same URL
with the slash appended, so that the directory index can be fetched
normally. When you index files locally, this redirect does not occur, so
htdig has no fallback when local_urls_only is true.
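
To make the difference concrete, here is a rough Python model of the lookup
described above. It is only a sketch under stated assumptions, not htdig's
actual code: the function name map_local and its arguments are hypothetical,
and it assumes that the default documents are tried only for URLs ending in
"/" and that anything else must map directly to a regular file.

```python
import os

def map_local(url, local_urls, default_docs):
    """Return the local file for url, or None if no regular file is found.

    Hypothetical model: local_urls maps URL prefixes to directories,
    default_docs lists the names from local_default_doc in order.
    """
    for prefix, directory in local_urls.items():
        if not url.startswith(prefix):
            continue
        path = directory + url[len(prefix):]
        if url.endswith("/"):
            # Directory URL: try each default document in order.
            for doc in default_docs:
                candidate = path + doc
                if os.path.isfile(candidate):
                    return candidate
            return None
        # No trailing slash: the path is used as-is and must be a regular file.
        return path if os.path.isfile(path) else None
    return None
```

With a mapping such as http://www.myserver.de/homepages/=/http/userpages/,
this model finds a default document for .../homepages/johndoe/ but returns
None for .../homepages/johndoe, because that path is a directory and not a
regular file.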

It would probably be a simple matter to change Document::RetrieveLocal()
to issue the redirect when the local file turns out to be a directory,
but so far no one has implemented this.
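
The idea can be sketched as follows. This is a Python illustration of the
logic only, not a patch against htdig's C++ source; the function
retrieve_local and its return values are hypothetical and do not reflect
the real signature of Document::RetrieveLocal().

```python
import os

def retrieve_local(url, path):
    """Classify a local lookup as 'redirect', 'file', or 'missing'.

    Hypothetical sketch: url is the URL being indexed and path is the
    local file name it was mapped to.
    """
    if os.path.isdir(path) and not url.endswith("/"):
        # Mimic what a web server does: send the indexer to the same
        # URL with the slash appended.
        return ("redirect", url + "/")
    if os.path.isfile(path):
        return ("file", path)
    return ("missing", None)
```

The caller would then treat a "redirect" result the same way it treats a
redirect received over HTTP, so the slash-terminated URL gets queued and
the default document lookup can proceed.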

I use local_urls indexing almost exclusively myself, but this has never
been a problem for me because I never use incomplete URLs for directories,
to avoid forcing an unnecessary redirect.
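
If your start document is generated by a script, one way to follow the same
habit is to append the slash whenever a link maps to a local directory. The
helper below is a minimal sketch; the name with_trailing_slash and its
parameters are hypothetical, and it assumes the URL prefix ends in "/" and
corresponds to a single local directory, as in a local_urls entry.

```python
import os

def with_trailing_slash(url, url_prefix, local_dir):
    """Append "/" to url if it maps to a local directory and lacks one.

    Hypothetical helper: url_prefix and local_dir play the roles of the
    two halves of a local_urls entry.
    """
    if url.startswith(url_prefix) and not url.endswith("/"):
        local_path = os.path.join(local_dir, url[len(url_prefix):])
        if os.path.isdir(local_path):
            return url + "/"
    return url
```

Running each link through such a helper before writing it into the start
document keeps file URLs untouched and completes only the directory URLs.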

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
