Re: [htdig] local_user_urls: query


From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Wed Feb 09 2000 - 11:29:25 PST


According to Geoff Hutchison:
> At 9:09 AM +0000 2/9/00, D.J.Adams@soton.ac.uk wrote:
> >Does htdig recognize that some files may have server-side includes and
> >always fetch them via http despite these attributes in the config file?
> >
> >An Apache server will process SSIs in .shtm and .shtml files, plus .htm
> >and .html files with execution permission set.
>
> Currently, the local_* attributes only read .htm and .html files. It
> makes no attempt to emulate server-parsing. So if you have set
> XBitHack for your Apache server, there isn't any way htdig will know
> that and it will fly right through, ignoring your SSI code.
>
> However, .shtml, .phtml, .php3 files and the like will not be indexed
> through the local filesystem, instead going to HTTP.

Actually, htdig 3.1.4 also accepts .txt, .asc, .pdf, .ps and .eps files
locally. For some reason, that change never made it into 3.2.0b1;
I imagine it got lost in the merge. In any event, 3.2 has mime.types
support, and that is how RetrieveLocal() should determine the content-type
for local files. I expect it will take only a few lines of code to add.
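
To illustrate the mime.types idea, here is a rough sketch of looking up a
local file's content-type by extension. It is not htdig's code (htdig is
C++; Python is used here only for brevity), and the sample table and the
names parse_mime_types and local_content_type are made up for the example.
It assumes the usual Apache mime.types layout of a type followed by its
extensions.

```python
import os

# Hypothetical sample in the Apache mime.types layout:
# a MIME type followed by the extensions that map to it.
SAMPLE_MIME_TYPES = """\
# MIME type             extensions
text/html               html htm
text/plain              txt asc
application/pdf         pdf
application/postscript  ps eps
"""


def parse_mime_types(text):
    """Build an extension -> content-type table from mime.types text."""
    ext_to_type = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the type; any remaining fields are extensions.
        for ext in fields[1:]:
            ext_to_type[ext.lower()] = fields[0]
    return ext_to_type


def local_content_type(path, ext_to_type):
    """Return the content-type for a local file, or None if unknown."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    return ext_to_type.get(ext)
```

A file whose extension is not in the table (.shtml here) yields None, which
a caller could treat as "don't read this locally, go through HTTP".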

In any case, htdig has no equivalent to Apache's XBitHack, so for SSI
documents I'd recommend using the .shtml extension if you want them indexed
as parsed by the server; htdig will then fetch them via HTTP. For my own
system, I use SSI only to add a few bits and pieces, so I don't mind that
the included content doesn't get indexed. I now index everything
through local_urls.
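
For anyone trying to picture the local-versus-HTTP decision, here is a
minimal sketch of it in Python. It is not htdig's implementation: the
function name, the example.com URL and the /var/www/htdocs/ directory are
all hypothetical, and the extension list is simply the 3.1.4 set mentioned
above.

```python
import posixpath

# Extensions read from the local filesystem, per the 3.1.4 behaviour
# described in this thread (an assumption for this sketch only).
LOCAL_EXTENSIONS = {".htm", ".html", ".txt", ".asc", ".pdf", ".ps", ".eps"}


def local_path_for(url, url_prefix, dir_prefix):
    """Map a URL to a local file path, or return None to signal that
    the document should be fetched via HTTP instead."""
    if not url.startswith(url_prefix):
        return None  # URL is outside the locally mapped tree
    path = dir_prefix + url[len(url_prefix):]
    ext = posixpath.splitext(path)[1].lower()
    if ext not in LOCAL_EXTENSIONS:
        return None  # e.g. .shtml or .php3: let the server parse it
    return path
```

With a prefix pair of http://www.example.com/ and /var/www/htdocs/, a .html
URL maps to a file under /var/www/htdocs/, while a .shtml URL returns None
and so would go through the web server. Note that the sketch never looks at
file permissions, which is why an XBitHack page with a .html name would be
read locally with its SSI directives unprocessed.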

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



