Gilles Detillieux (firstname.lastname@example.org)
Mon, 16 Aug 1999 14:29:00 -0500 (CDT)
According to peter karlsson:
> > Well, Peter, I don't know what to say. Neither Geoff nor I can
> > reproduce the error from here, so the problem must lie on your system.
> I just noticed something, because a web browser I tried (w3m) didn't show
> the pages correctly either. Somehow the Squid proxy seems to remove
> the Content-Type header from some of the pages on the server:
> This is strange, though, since the previous indexing was *not* done through
> a proxy. It might be a problem with the web server, though
> (phttpd/0.99.72.1). But when I try to connect directly, I do get a
> Content-Type header:
There are two strange things about this. First of all, as you point out,
the problem started before you started indexing through Squid. Secondly,
if htdig doesn't receive a Content-Type header, it shouldn't even attempt
to index the document at all.
> What headers are htdig sending to the server? It might be one of those that
> interfere with what headers phttpd sends back.
htdig sends these headers, in this order:
	GET url-path HTTP/1.0
	User-Agent: htdig/3.1.2 (maintainer)
	Referer: url				<- if referring document is known
	If-Modified-Since: date			<- if document previously indexed
	Authorization: Basic username/password	<- if given with -u option
	Host: url-host				<- unless allow_virtual_hosts disabled
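
To see which headers the server returns for such a request, it can be reproduced by hand. What follows is only a minimal sketch in Python, not htdig's own code: the names build_request and fetch_headers are made up for illustration, www.example.com is a placeholder host, and the Authorization value is assumed to use the standard Basic scheme (base64 of "username:password").

```python
import base64
import socket


def build_request(url_path, host=None, maintainer="maintainer",
                  referer=None, if_modified_since=None, credentials=None):
    """Assemble an HTTP/1.0 GET request with the headers in the order listed above."""
    lines = [
        f"GET {url_path} HTTP/1.0",
        f"User-Agent: htdig/3.1.2 ({maintainer})",
    ]
    if referer:
        lines.append(f"Referer: {referer}")
    if if_modified_since:
        lines.append(f"If-Modified-Since: {if_modified_since}")
    if credentials:
        # Assumption: standard Basic scheme, base64 of "username:password".
        token = base64.b64encode(credentials.encode("ascii")).decode("ascii")
        lines.append(f"Authorization: Basic {token}")
    if host:
        lines.append(f"Host: {host}")
    # Each line ends with CR/LF; an empty line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_headers(host, request, port=80):
    """Send the request and return the raw response headers as text."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n\r\n", 1)[0].decode("latin-1")


if __name__ == "__main__":
    req = build_request("/index.html", host="www.example.com")
    print(fetch_headers("www.example.com", req))
```

Running this once against the web server directly and once through the proxy, then comparing the two header dumps, should show whether the Content-Type header is lost along the way.
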
Each line ends with a CR/LF. When you run htdig -vvv, it shows the entire
retrieval command used, with all headers. They're all sent in a single
-- 
Gilles R. Detillieux              E-mail: <email@example.com>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930