Hi friends,
I really need you to test the patch I wrote, please. It's very
important.
I have been thinking about the Retriever code. Until the new
Configuration is available, which will let us set attributes for each
server, persistent connections can only be configured for the retrieval
system as a whole.
The starting point is that every server has a flag for persistent
connections. When can we use persistent connections? When we choose to
use them, of course ... and when the remote server can honour our request
(obviously). The first condition is settled by the initial
configuration (both the current one and the future one), so no problem
there.
The second can be discovered only after the first HTTP request. And when
does that happen? Right inside the Server constructor, when it retrieves
the robots.txt file. After this call, we can tell whether a server
supports persistent connections or not, and set the
_persistent_connections flag of the Server class accordingly.
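As a rough sketch of the check that becomes possible after that first
request: under HTTP/1.1 a connection is persistent unless the server
answers "Connection: close", while an HTTP/1.0 server has to answer
"Connection: keep-alive" explicitly. The function name and its
parameters below are made up for illustration; they are not the ones
used in the patch, and the header is assumed to carry a single token.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of the patch: lower-case a header value
// so that the comparisons below are case-insensitive.
static std::string to_lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Decide whether the connection may be kept open, given
//   - the choice made in our own configuration,
//   - the minor HTTP version from the server's status line (0 or 1),
//   - the value of its "Connection" response header ("" if absent).
// Simplification: the header is assumed to hold one token only.
bool server_allows_persistent(bool enabled_in_config,
                              int http_minor_version,
                              const std::string &connection_header)
{
    if (!enabled_in_config)
        return false;                 // we chose not to use them

    const std::string value = to_lower(connection_header);
    if (http_minor_version >= 1)
        return value != "close";      // HTTP/1.1: persistent by default
    return value == "keep-alive";     // HTTP/1.0: only when announced
}
```

Both conditions from above show up here: our configuration comes
first, and the server's answer only matters if we asked for persistent
connections at all.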
So we now know whether we can ask a server for a document only once (the
server doesn't support persistent connections) or several times in a
row. In my patch I let it go on without limit but, as Geoff suggests, we
could introduce a new attribute "server_repeat_connections" (or maybe
another name, like "max_consecutive_requests") to set how many requests
can be made consecutively. The default could be -1 (no limit), but we
could set a maximum. Let me know whether you vote for it or not.
To do that, I modified the Retriever.cc code (obviously), but I also
modified the Server.cc code (so that it really decides whether a server
can handle persistent connections or not). For that, I had to modify
the Document.h code and add a public method that returns the pointer to
the private HTTPConnect attribute (HtHTTP *GetHTTPHandler()).
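To make the plumbing a bit more concrete, here is a minimal sketch of
how the pieces could fit together. The classes are tiny stand-ins, not
the real ones: only GetHTTPHandler() and _persistent_connections come
from the patch, while PersistentAllowed(), SetPersistent(),
UsesPersistentConnections() and the constructor signature are
hypothetical names chosen for this example.

```cpp
// Stand-in classes holding only what this sketch needs; the real
// HtHTTP, Document and Server classes are much larger.
class HtHTTP
{
public:
    // Hypothetical query: did the last response allow a persistent
    // connection? The real class may expose this differently.
    bool PersistentAllowed() const { return _persistent; }
    void SetPersistent(bool p) { _persistent = p; }
private:
    bool _persistent = false;
};

class Document
{
public:
    // The accessor described above: hands out the private HTTP handler.
    HtHTTP *GetHTTPHandler() { return &_http; }
private:
    HtHTTP _http;   // held by value here only to keep the sketch short
};

class Server
{
public:
    // Meant to run after robots.txt has been fetched through 'doc':
    // combine our configuration with what the remote server supports.
    Server(Document &doc, bool enabled_in_config)
        : _persistent_connections(enabled_in_config &&
                                  doc.GetHTTPHandler()->PersistentAllowed())
    {
    }
    bool UsesPersistentConnections() const { return _persistent_connections; }
private:
    bool _persistent_connections;
};
```

The point of the accessor is only that Server can ask the handler what
happened during the robots.txt request without Document having to know
anything about persistent connections itself.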
So we have three loops; from the outermost in, they are:
1) while (more && noSignal)
2) while ( (server = (Server *)servers.Get_NextElement()) && noSignal)
3)
+ while ( ( (max_repeat_requests == -1) ||
+ (count < max_repeat_requests) ) &&
+ (ref = server->pop()) && noSignal)
+ {
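For anyone who wants to see the effect of the innermost loop without
applying the patch, here is a small self-contained sketch of loops 2
and 3. ServerQueue and one_pass() are hypothetical stand-ins, not the
Retriever code, and the sketch assumes that a server without persistent
connections is asked for one document per pass.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a server and its queue of URLs to fetch.
struct ServerQueue
{
    std::deque<std::string> urls;
    bool persistent;              // may we reuse the connection?
};

// One pass of the middle loop: visit every server and take up to
// max_repeat_requests documents from it in a row (-1 means no limit).
// A server without persistent connections gets one request per pass.
// Returns the URLs in the order they would be requested.
std::vector<std::string> one_pass(std::vector<ServerQueue> &servers,
                                  int max_repeat_requests)
{
    std::vector<std::string> order;
    for (ServerQueue &server : servers)
    {
        const int limit = server.persistent ? max_repeat_requests : 1;
        int count = 0;
        while ((limit == -1 || count < limit) && !server.urls.empty())
        {
            order.push_back(server.urls.front());
            server.urls.pop_front();
            ++count;
        }
    }
    return order;
}
```

With a limit of 2, a persistent server holding three URLs gives up two
of them before the loop moves on to the next server; with -1 it is
drained completely, which is what the patch does today.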
I hope I have been clear. Please try the patch and let me know what you
think about it, so I can commit it as soon as possible and include
persistent connections in the new release.
It seems to work in my environment, but I am going to leave it running
overnight (I'm leaving work right now), so tomorrow I can be more
precise.
But please, please, please ... TRY IT !!! It won't crash your machine, I
swear.
And let me know about the attribute, and maybe the right name for it.
Ciao ciao :-)
-Gabriele