Nice, thanks for sharing! Added to
http://code.google.com/p/gevent/wiki/ProjectsUsingGevent
If I may suggest a small interface change:
rename geventhttpclient.httplibcompat to geventhttpclient.httplib
Inside geventhttpclient/ you'd have to replace
import httplib
with this:
httplib = __import__('httplib')
but it's a small price to pay for the less noisy module name.
Another question is about exceptions. From reading the code it seems that
geventhttpclient.httplibcompat may raise HTTPParseException, which is
a subclass of Exception.
This means that httplibcompat is not compatible with httplib exception-wise.
However, rather than catching HTTPParseException and re-raising exceptions
of the right type (which is ugly and loses stack traces), I suggest
deriving HTTPParseException from httplib.HTTPException.
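To illustrate the suggestion, here is a minimal sketch, not the actual geventhttpclient code. The class body, the parse_status_line helper and the fetch function are hypothetical stand-ins; only httplib.HTTPException (http.client.HTTPException on Python 3) is the real stdlib name. It shows that callers who already catch httplib.HTTPException need no changes once the parser exception derives from it:

```python
try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


class HTTPParseException(httplib.HTTPException):
    """Hypothetical stand-in for the parser error, now an HTTPException."""


def parse_status_line(line):
    # Hypothetical parser step: reject anything that is not an HTTP status line.
    if not line.startswith("HTTP/"):
        raise HTTPParseException("malformed status line: %r" % line)
    return line.split(" ", 2)


def fetch(line):
    # Caller written against plain httplib: it only knows HTTPException.
    try:
        return parse_status_line(line)
    except httplib.HTTPException as exc:
        return "caught %s" % type(exc).__name__


print(fetch("garbage"))
```

Because the exception is raised once and never re-raised under another type, the original traceback is preserved.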
Thanks!
> If I may suggest a small interface change:
>
> rename geventhttpclient.httplibcompat to geventhttpclient.httplib
>
> Inside geventhttpclient/ you'd have to replace
>
> import httplib
>
> with this:
>
> httplib = __import__('httplib')
>
> but it's a small price to pay for the less noisy module name.
>
>
> Another question is about exceptions. From reading the code it seems that
> geventhttpclient.httplibcompat may raise HTTPParseException, which is
> a subclass of Exception.
>
> This means that httplibcompat is not compatible with httplib exception-wise.
> However, rather than catching HTTPParseException and re-raising exceptions
> of the right type (which is ugly and loses stack traces), I suggest
> deriving HTTPParseException from httplib.HTTPException.
Makes sense, done.
Antonin
The http parser is the same indeed.
geventhttpclient is simpler and focused on gevent.
I added restkit to my stupid-simple benchmark; for what it's worth,
geventhttpclient is faster.
Antonin
None of the above, the HTTPClient class has a built-in connection
pool. It knows when the connection can be reused.
The number of concurrent connections can be set with the concurrency
parameter of the HTTPClient __init__ method (1 by default).
Then you can share an HTTPClient instance among greenlets. When one of
them makes a request, it gets a connection from the pool, and if the
connection is reusable, other greenlets will reuse it (note that you
need to consume the entire response body before the connection can be
reused).
If you have 100 greenlets and 10 connections, the 11th greenlet will
wait for any of the first ten requests to complete and will then reuse
the already opened connection to the server.
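The waiting behaviour described above can be sketched with the standard library alone. This is an illustration of the pooling model, not geventhttpclient's implementation: it uses threads in place of greenlets so it runs without gevent, and FakeConnection, ConnectionPool and run_workers are hypothetical names.

```python
import queue
import threading
import time


class FakeConnection:
    """Hypothetical stand-in for an open keep-alive connection."""

    def __init__(self, ident):
        self.ident = ident


class ConnectionPool:
    """Fixed-size pool: get() blocks while every connection is busy."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(FakeConnection(i))
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0

    def get(self):
        conn = self._free.get()  # the "11th worker" waits here
        with self._lock:
            self.in_use += 1
            self.peak_in_use = max(self.peak_in_use, self.in_use)
        return conn

    def release(self, conn):
        with self._lock:
            self.in_use -= 1
        self._free.put(conn)  # hand the open connection to the next waiter


def run_workers(num_workers, pool_size):
    pool = ConnectionPool(pool_size)
    served = []
    served_lock = threading.Lock()

    def worker():
        conn = pool.get()
        try:
            time.sleep(0.01)  # pretend to send a request and read the full body
            with served_lock:
                served.append(conn.ident)
        finally:
            pool.release(conn)  # only now can another worker reuse it

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak_in_use, served


peak, served = run_workers(num_workers=100, pool_size=10)
print("peak connections in use:", peak)
print("requests served:", len(served))
```

All 100 requests complete, but never more than 10 connections are in use at once, and every request is served by one of the same 10 connections.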
This model is especially efficient when pulling from some APIs
(facebook, twitter...) or when you connect to a daemon (neo4j),
because it lets you limit the concurrency while optimizing the reuse
of connections, a bit like a db connection pool.
geventhttpclient doesn't support pipelining, as very few servers
support it correctly. Furthermore, the above connection handling model
is much more efficient than pipelining when you run on an evented
core: pipelining is FIFO, so if the first request takes a long time to
process, all the other greenlets have to wait for it to complete.
Pipelining can still be useful, though, when you don't want to make
too many connections to the server.
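A toy calculation makes the FIFO problem concrete. This is only a model, under stated assumptions: the durations are made up, the server is assumed to work on all pipelined requests at once (the best case for pipelining), and responses on one pipelined connection must come back in request order. The function names are hypothetical.

```python
from itertools import accumulate


def pipelined_completion(durations):
    """Responses return in order, so each one waits for the slowest before it."""
    return list(accumulate(durations, max))


def pooled_completion(durations):
    """One connection per in-flight request: each finishes on its own schedule."""
    return list(durations)


durations = [5, 1, 1]  # seconds; the first request is the slow one
print("pipelined:", pipelined_completion(durations))
print("pooled:   ", pooled_completion(durations))
```

With one slow request at the head of the pipeline, the two fast requests finish at 5 seconds instead of 1, which is the head-of-line blocking described above.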
For those interested in the subject, I recommend this great article:
http://www.igvita.com/2011/10/04/optimizing-http-keep-alive-and-pipelining/
I made some fixes as I'm putting it in production so please update.
Regards,
Antonin