Re: [htdig] httpd Internal Server Error


Subject: Re: [htdig] httpd Internal Server Error
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Mon Jul 17 2000 - 11:03:03 PDT


According to Greg Lepore:
> I've isolated the error to the sort by title parameter that I pass along
> with the search terms. When I search with sort by score the results are
> returned to the browser in the same time it takes to search by command
> line. When I sort by title - crash-o-rama. Searching by reverse score
> works, but not by time, reverse time, or reverse title.
> To sum up, the server will not return an error with sorting by score or
> reverse score; any other sorting causes the internal server error,
> presumably due to a timeout.

OK, that makes sense. See http://www.htdig.org/FAQ.html#q5.10

> In researching the Premature End of Script Headers problem at the Apache
> website, it was pointed out that
> "The second most common cause of this is a result of an interaction with
> Perl's output buffering.... To make Perl flush its buffers after each
> output statement...This is generally only necessary when you are calling
> external programs from your script that send output to stdout, or if there
> will be a long delay between the time the headers are sent and the actual
> content starts being emitted... If your script isn't written in Perl, do
> the equivalent thing for whatever language you are using (e.g., for C, call
> fflush() after writing the headers). "
> Might this be relevant?

No, it's not likely to be a buffering problem. htsearch doesn't start
outputting anything until it's processed and sorted all matches, and
after that the time to output is actually quite small. The problem is
that when there are a lot of hits, the time to process them can be quite
long, especially when htsearch must fetch the db.docdb record for each
match (rather than for just the few it actually displays).
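
To illustrate why the sort key matters so much, here is a toy model, not htsearch's actual code. It assumes scores travel with the match list while titles and times live in a per-document record; the names search, matches and docdb are hypothetical and only stand in for that arrangement. It counts how many records have to be fetched before the first page of results can be shown.

```python
def search(matches, docdb, sort_key, page_size=10):
    """Toy model of result sorting (an illustration, not htsearch itself).

    matches  -- list of (doc_id, score) pairs from the word index
    docdb    -- dict mapping doc_id to a record with "title" and "time"
    Returns (first_page_of_records, number_of_record_fetches).
    """
    fetches = 0

    def fetch(doc_id):
        nonlocal fetches
        fetches += 1
        return docdb[doc_id]

    if sort_key == "score":
        # The score is already in the match list, so sorting needs no
        # record lookups; only the displayed matches are fetched.
        ordered = sorted(matches, key=lambda m: m[1], reverse=True)
        page = [fetch(doc_id) for doc_id, _ in ordered[:page_size]]
    else:
        # The sort field lives in the record, so every match must be
        # fetched before anything can be sorted or displayed.
        records = [fetch(doc_id) for doc_id, _ in matches]
        records.sort(key=lambda r: r[sort_key])
        page = records[:page_size]
    return page, fetches
```

With 112,000 matches and ten results per page, this model does ten fetches when sorting by score and 112,000 when sorting by title or time, which is the kind of gap that can push a CGI past the server's timeout.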

> At 10:34 AM 7/13/00 , Gilles Detillieux wrote:
> >According to Greg Lepore:
> >> I need to work on my powers of estimation, the actual command line time
> >> for a search that returns all pages (112,000) is around 20-25 seconds. At
> >> the time of the tests, there was only one install of HTDIG and therefore
> >> only one database and conf file. No unusual input parameters, basically
> >> "search everything" with the defaults.
> >
> >OK, but just to be sure we rule out any input parameter differences, how
> >about setting the method from POST to GET in the search form, so you can
> >see the query string, and then calling htsearch from the command line
> >with the QUERY_STRING environment variable set to the same query string
> >you saw in the URL in your browser, and the REQUEST_METHOD environment
> >variable set to GET. Perhaps try the query that actually works from the
> >browser and returns the largest number of hits, and compare its timing to
> >the time it takes from the command line.
> >
> >Unless you're running on a busy server and the CGI scripts run at a much
> >higher nice level, I'm at a loss to explain why htsearch takes so much
> >longer when run as a CGI.
> >
> >> I am trying to install 3.2b2 but the
> >> dig time is outrageous, still running after 16 hours versus 5 hours for a
> >> complete dig with 3.1.3.
> >> However, searches that return 20,000 hits and over are still giving the
> >> crash. I was hoping that upgrading would give me some speed benefits. Of
> >> course, I will also install another 128MB of RAM and cross my fingers...
> >
> >The 3.2 series is supposed to give some speed benefits for searches, at the
> >cost of longer digging time.
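
The command-line comparison suggested in the quoted message above can be sketched as follows. This is a minimal sketch, assuming the program reads its input the way a CGI does for a GET request; run_as_cgi is a hypothetical helper, and the htsearch path and query string in the trailing comment are placeholders to be replaced with the local install path and the query string copied from the browser's URL.

```python
import os
import subprocess
import time


def run_as_cgi(command, query_string):
    """Run a CGI program with the environment of a GET request.

    command      -- argument list, e.g. the full path to the CGI binary
    query_string -- everything after the "?" in the browser's URL
    Returns (elapsed_seconds, exit_code, stdout_bytes).
    """
    env = dict(os.environ)
    env["REQUEST_METHOD"] = "GET"
    env["QUERY_STRING"] = query_string

    start = time.perf_counter()
    result = subprocess.run(command, env=env, stdout=subprocess.PIPE)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode, result.stdout


# Example call (placeholder path and query string, adjust to your setup):
#   elapsed, code, out = run_as_cgi(
#       ["/path/to/cgi-bin/htsearch"], "words=example&sort=title")
#   print("exit=%d bytes=%d time=%.1fs" % (code, len(out), elapsed))
```

Running it once with a sort by score and once with a sort by title, using the same search words, gives timings that can be compared directly with what the browser experiences.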

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930




This archive was generated by hypermail 2b28 : Mon Jul 17 2000 - 23:11:06 PDT