async http client


Colin LeMahieu

Jan 17, 2016, 5:49:14 PM
to The C++ Network Library
I see the async client requests are of the form "response_ = client_.post(request_, body, callback)". This seems to indicate that the HTTP call is performed synchronously even though the body is processed asynchronously.

It seems this fails to hide the latency of performing the actual HTTP request and getting back the status code. Am I missing something?

Dean Michael Berris

Jan 21, 2016, 10:18:39 PM
to The C++ Network Library
So the latency is incurred at the point where you first pull data from the response. The line:

  response_ = client.post(...);

will return immediately. When you then get the body of response_, or any of its fields, that is when it will block, if the request has not yet finished receiving the data. This means you can start multiple requests and wait for the responses in any order, and they run concurrently with each other.

It may not be immediately obvious, though, and I agree: this needs a bit more explanation and documentation.
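The lazy-blocking behavior described above is essentially a future: the call returns immediately, and reading the result blocks only if it is not ready yet. Here is a rough analogy in standard C++ (this is not cpp-netlib code, just the pattern; start_request and fetch_all are made-up names):

```cpp
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Stand-in for client.post(...): kicks off work on another thread and
// returns immediately with a handle to the eventual result.
std::future<std::string> start_request(std::string url) {
    return std::async(std::launch::async, [url = std::move(url)] {
        // Simulated network latency.
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        return "response for " + url;
    });
}

std::vector<std::string> fetch_all(const std::vector<std::string>& urls) {
    // After this loop, all requests are in flight concurrently.
    std::vector<std::future<std::string>> pending;
    for (const auto& url : urls) pending.push_back(start_request(url));

    // get() blocks only if that particular response is not ready yet.
    std::vector<std::string> bodies;
    for (auto& f : pending) bodies.push_back(f.get());
    return bodies;
}
```

As with the cpp-netlib client, the total wait is roughly the latency of the slowest request, not the sum of all of them.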

Cheers

On Mon, Jan 18, 2016 at 9:49 AM Colin LeMahieu <clem...@gmail.com> wrote:
I see the async client requests are of the form "response_ = client_.post(request_, body, callback)". This seems to indicate that the HTTP call is performed synchronously even though the body is processed asynchronously.

It seems this fails to hide the latency of performing the actual HTTP request and getting back the status code. Am I missing something?


Colin LeMahieu

Jan 21, 2016, 10:39:38 PM
to The C++ Network Library
I see.  The issue I have with this is that in order to keep any thread from blocking, I either need a notification when something like the status code is available, which hides the TCP and initial HTTP handshake latency, or I need some thread at some point blocking while waiting for it to become available, which seemingly defeats the purpose of using Boost.Asio.

This ties in to another difficulty I've had with the library: it requires a thread pool, which is out of line with how Boost.Asio typically operates.

I see you're going to rework the API. I'd suggest something like void client::post(request, callback), where the callback is something of the form void (boost::system::error_code &, status_code, response), and a method on response can fetch blocks of the body in similar callback fashion. This would mean no thread ever blocks waiting for an HTTP response, and it falls more in line with ASIO's callback style.

It would also remove the need for a specialized thread pool, which I highly recommend. I feel people familiar with ASIO understand that you don't do slow things inside a handler, and they have their own I/O dispatching mechanism, which may be a thread pool; as is, the library just forces a new dependency on its users.

I'd also suggest removing the shared_ptr wrapping boost::asio::io_service and instead holding a reference, leaving the library consumer to choose how the lifetime of the io_service is managed. When I switched to using this library I had to replumb about 10 classes just to add the shared_ptr to the signatures, which was purely vestigial in my case.

Dean Michael Berris

Jan 21, 2016, 10:55:36 PM
to cpp-n...@googlegroups.com
On Fri, Jan 22, 2016 at 2:39 PM Colin LeMahieu <clem...@gmail.com> wrote:
I see.  The issue I have with this is that in order to keep any thread from blocking, I either need a notification when something like the status code is available, which hides the TCP and initial HTTP handshake latency, or I need some thread at some point blocking while waiting for it to become available, which seemingly defeats the purpose of using Boost.Asio.


So on the client side, this is true -- but there is a way out. I was working on a helper called "when_ready(...)" which allows you to do something like this:

when_ready(client.get(...), [](response& r, system::error_code ec) {
  // do something with r
});

This "should just happen", but I got blocked by some other issues which I think have been fixed since. Will something like that be more acceptable to you?
 
This ties in to another difficulty I've had with the library: it requires a thread pool, which is out of line with how Boost.Asio typically operates.


You mean, for the HTTP server part, yes?
 
I see you're going to rework the API. I'd suggest something like void client::post(request, callback), where the callback is something of the form void (boost::system::error_code &, status_code, response), and a method on response can fetch blocks of the body in similar callback fashion. This would mean no thread ever blocks waiting for an HTTP response, and it falls more in line with ASIO's callback style.


It's already there. ;)


response_ = client_.get(request_, callback)
Perform an HTTP GET request, and have the body chunks be handled by the callback parameter. The signature of callback should be the following: void(iterator_range<char const *> const &, boost::system::error_code const &).
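For illustration, here is a self-contained sketch of a handler with that shape. char_range, body_collector, and deliver are stand-ins invented here for boost::iterator_range and the client's chunk delivery; they are not cpp-netlib APIs:

```cpp
#include <string>
#include <system_error>
#include <vector>

// Minimal stand-in for boost::iterator_range<char const *>.
struct char_range {
    const char* first;
    const char* last;
    const char* begin() const { return first; }
    const char* end() const { return last; }
};

// A body-chunk handler matching the shape of the documented signature:
// void(iterator_range<char const *> const &, boost::system::error_code const &).
struct body_collector {
    std::string body;
    std::error_code error;

    void operator()(const char_range& chunk, const std::error_code& ec) {
        if (ec) { error = ec; return; }  // stop accumulating on error
        body.append(chunk.begin(), chunk.end());
    }
};

// Hypothetical driver standing in for the client delivering chunks.
void deliver(body_collector& cb, const std::vector<std::string>& chunks) {
    for (const auto& c : chunks)
        cb(char_range{c.data(), c.data() + c.size()}, std::error_code{});
}
```

The callback is invoked repeatedly, once per chunk, so it must accumulate state itself if it needs the whole body.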
 
It would also remove the need for a specialized thread pool, which I highly recommend. I feel people familiar with ASIO understand that you don't do slow things inside a handler, and they have their own I/O dispatching mechanism, which may be a thread pool; as is, the library just forces a new dependency on its users.


So the threadpool is meant to handle the non-networking part of the application logic in the server. This means you're isolating the network events from the non-network events, and you can start writing blocking code there (and tune the number of threads running concurrently). You can even do admission control on the handlers to start rejecting requests that come in while the pool is busy (admittedly that feature isn't built-in, but it's easy to implement).
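The idea of handing blocking handler work off to a dedicated pool can be sketched with a minimal worker queue. This is a generic illustration, not cpp-netlib's actual thread pool; handler_pool is a made-up name:

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A minimal worker pool: network threads enqueue handler work here, so the
// (slow, possibly blocking) application logic never runs on the I/O threads.
class handler_pool {
public:
    explicit handler_pool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~handler_pool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();  // drains remaining jobs first
    }
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lk(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // blocking work is fine here; I/O threads stay responsive
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};
```

Admission control would be a small extension: have post() check the queue depth under the lock and reject the job when the pool is saturated.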
 
I'd also suggest removing the shared_ptr wrapping boost::asio::io_service and instead holding a reference, leaving the library consumer to choose how the lifetime of the io_service is managed. When I switched to using this library I had to replumb about 10 classes just to add the shared_ptr to the signatures, which was purely vestigial in my case.


Interesting.

I think the reason we took a shared_ptr instead is so that we can tie the lifetime of the io_service to the operations still pending on it. So imagine you create a client object and do a post, then the client goes out of scope before the post has finished -- the shared pointer keeps the io_service alive until all the operations that need it are done. This is much easier to get wrong if we took a reference to an optional io_service -- note that not all users will provide an io_service they control, because not everyone will already be using Boost.Asio.
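The ownership pattern being described can be shown with a small stand-in. io_service_stub and pending_op are invented here for illustration; this is not the cpp-netlib implementation:

```cpp
#include <memory>

// Toy stand-in for io_service: flips a flag when destroyed, so its
// lifetime is observable.
struct io_service_stub {
    explicit io_service_stub(bool* flag) : destroyed(flag) {}
    ~io_service_stub() { *destroyed = true; }
    bool* destroyed;
};

// A pending operation holds its own shared_ptr copy, so the service
// stays alive even after the object that created it has gone away.
struct pending_op {
    std::shared_ptr<io_service_stub> service;
    void complete() { service.reset(); }  // release when the work is done
};
```

With a plain reference instead of shared ownership, the operation would dangle as soon as the original owner went out of scope.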

Thanks for the feedback Colin!

Is there anything else I can help with?

Colin LeMahieu

Jan 21, 2016, 11:16:12 PM
to The C++ Network Library


On Thursday, January 21, 2016 at 9:55:36 PM UTC-6, Dean Michael Berris wrote:
On Fri, Jan 22, 2016 at 2:39 PM Colin LeMahieu <clem...@gmail.com> wrote:
I see.  The issue I have with this is that in order to keep any thread from blocking, I either need a notification when something like the status code is available, which hides the TCP and initial HTTP handshake latency, or I need some thread at some point blocking while waiting for it to become available, which seemingly defeats the purpose of using Boost.Asio.


So on the client side, this is true -- but there is a way out. I was working on a helper called "when_ready(...)" which allows you to do something like this:

when_ready(client.get(...), [](response& r, system::error_code ec) {
  // do something with r
});

This "should just happen", but I got blocked by some other issues which I think have been fixed since. Will something like that be more acceptable to you? 
 
This ties in to another difficulty I've had with the library: it requires a thread pool, which is out of line with how Boost.Asio typically operates.


You mean, for the HTTP server part, yes?
 
I see you're going to rework the API. I'd suggest something like void client::post(request, callback), where the callback is something of the form void (boost::system::error_code &, status_code, response), and a method on response can fetch blocks of the body in similar callback fashion. This would mean no thread ever blocks waiting for an HTTP response, and it falls more in line with ASIO's callback style.


It's already there. ;)


response_ = client_.get(request_, callback)
Perform an HTTP GET request, and have the body chunks be handled by the callback parameter. The signature of callback should be the following: void(iterator_range<char const *> const &, boost::system::error_code const &).
This is the one I use; it processes the HTTP body chunks. However, I was expecting something similar before the body is processed, where you would be given the header and status code when they're available. Rather than get(...) taking a callback for body chunks, it would take a callback with the header and status code; inside this callback you could then request the body with another callback to process the body chunks. I haven't tested this: is the callback invoked if the HTTP response has no body, as for a 500?

 
It would also remove the need for a specialized thread pool, which I highly recommend. I feel people familiar with ASIO understand that you don't do slow things inside a handler, and they have their own I/O dispatching mechanism, which may be a thread pool; as is, the library just forces a new dependency on its users.


So the threadpool is meant to handle the non-networking part of the application logic in the server. This means you're isolating the network events from the non-network events, and you can start writing blocking code there (and tune the number of threads running concurrently). You can even do admission control on the handlers to start rejecting requests that come in while the pool is busy (admittedly that feature isn't built-in, but it's easy to implement).

Agreed, blocking inside the ASIO threads is never a good idea. But by requiring a pool, cpp-netlib forces the library user to provide this guarantee in one specific way, which may not be the way they want to do it. For instance, what if their HTTP server just proxies the request off to another ASIO call? Maybe it doesn't do I/O processing, or it already does asynchronous disk I/O inside the HTTP handler. Any of these cases makes the thread pool work against how the library user wants to operate. In the end they could always use a thread pool, but it should be their own choice. Sometimes the operations performed inside a callback require special thread-local setup, such as Windows COM initialization. Say the user already had COM initialized in their own thread pool servicing the io_service proactor; with this new thread pool they're required to use, they now need to make sure those threads are COM-initialized, or thunk calls off to the threads they already had running the io_service.
 
 
I'd also suggest removing the shared_ptr wrapping boost::asio::io_service and instead holding a reference, leaving the library consumer to choose how the lifetime of the io_service is managed. When I switched to using this library I had to replumb about 10 classes just to add the shared_ptr to the signatures, which was purely vestigial in my case.


Interesting.

I think the reason we took a shared_ptr instead is so that we can tie the lifetime of the io_service to the operations still pending on it. So imagine you create a client object and do a post, then the client goes out of scope before the post has finished -- the shared pointer keeps the io_service alive until all the operations that need it are done. This is much easier to get wrong if we took a reference to an optional io_service -- note that not all users will provide an io_service they control, because not everyone will already be using Boost.Asio.

Thanks for the feedback Colin!

Is there anything else I can help with?

No that's good, thanks for taking a look!

Dean Michael Berris

Jan 22, 2016, 12:12:47 AM
to cpp-n...@googlegroups.com
On Fri, Jan 22, 2016 at 3:16 PM Colin LeMahieu <clem...@gmail.com> wrote:


On Thursday, January 21, 2016 at 9:55:36 PM UTC-6, Dean Michael Berris wrote:
On Fri, Jan 22, 2016 at 2:39 PM Colin LeMahieu <clem...@gmail.com> wrote:
I see.  The issue I have with this is that in order to keep any thread from blocking, I either need a notification when something like the status code is available, which hides the TCP and initial HTTP handshake latency, or I need some thread at some point blocking while waiting for it to become available, which seemingly defeats the purpose of using Boost.Asio.


So on the client side, this is true -- but there is a way out. I was working on a helper called "when_ready(...)" which allows you to do something like this:

when_ready(client.get(...), [](response& r, system::error_code ec) {
  // do something with r
});

This "should just happen", but I got blocked by some other issues which I think have been fixed since. Will something like that be more acceptable to you? 
 
This ties in to another difficulty I've had with the library: it requires a thread pool, which is out of line with how Boost.Asio typically operates.


You mean, for the HTTP server part, yes?
 
I see you're going to rework the API. I'd suggest something like void client::post(request, callback), where the callback is something of the form void (boost::system::error_code &, status_code, response), and a method on response can fetch blocks of the body in similar callback fashion. This would mean no thread ever blocks waiting for an HTTP response, and it falls more in line with ASIO's callback style.


It's already there. ;)


response_ = client_.get(request_, callback)
Perform an HTTP GET request, and have the body chunks be handled by the callback parameter. The signature of callback should be the following: void(iterator_range<char const *> const &, boost::system::error_code const &).
This is the one I use; it processes the HTTP body chunks. However, I was expecting something similar before the body is processed, where you would be given the header and status code when they're available. Rather than get(...) taking a callback for body chunks, it would take a callback with the header and status code; inside this callback you could then request the body with another callback to process the body chunks. I haven't tested this: is the callback invoked if the HTTP response has no body, as for a 500?


Right. Yes, on errors the body callback should be invoked with an empty range.

The "when_ready" API is the one that should do this, where it gets called as soon as the headers and status parsing is done. And then pulling the body could happen asynchronously later, with some other function.

I suspect the simpler idea here might be that we can provide a byte source (similar to Boost.IOStreams) where you can pull the body on demand, in a blocking fashion or in an asynchronous completion-style manner. This way as soon as status and header parsing is done we can invoke the callback to when_ready, and all body pulling happens via a stream interface.

That might look like this:

when_ready(client.get(...), [](response& r, auto ec) {
  if (ec) {
    // log error and return;
    return;
  }
  for (const auto& chunk : r.body_chunks()) {
    // As the chunks come, you're processing them.
  }
});

Or if you'd like it fully asynchronous:

when_ready(client.get(...), [](response& r, auto ec) {
  if (ec) {
    // log error and return;
    return;
  }
  on_chunks(r.body_async(), [](const auto& range, auto ec) {
    if (ec) { return false; }  // stop on errors
    // deal with range here
    return true;
  });
});

This shouldn't be too hard to do, and I'd be happy to either do it or review it. This would use the thread(s) already running the client.
 
 
It would also remove the need for a specialized thread pool, which I highly recommend. I feel people familiar with ASIO understand that you don't do slow things inside a handler, and they have their own I/O dispatching mechanism, which may be a thread pool; as is, the library just forces a new dependency on its users.


So the threadpool is meant to handle the non-networking part of the application logic in the server. This means you're isolating the network events from the non-network events, and you can start writing blocking code there (and tune the number of threads running concurrently). You can even do admission control on the handlers to start rejecting requests that come in while the pool is busy (admittedly that feature isn't built-in, but it's easy to implement).

Agreed, blocking inside the ASIO threads is never a good idea. But by requiring a pool, cpp-netlib forces the library user to provide this guarantee in one specific way, which may not be the way they want to do it. For instance, what if their HTTP server just proxies the request off to another ASIO call? Maybe it doesn't do I/O processing, or it already does asynchronous disk I/O inside the HTTP handler. Any of these cases makes the thread pool work against how the library user wants to operate. In the end they could always use a thread pool, but it should be their own choice. Sometimes the operations performed inside a callback require special thread-local setup, such as Windows COM initialization. Say the user already had COM initialized in their own thread pool servicing the io_service proactor; with this new thread pool they're required to use, they now need to make sure those threads are COM-initialized, or thunk calls off to the threads they already had running the io_service.
 

Interesting. I hadn't thought about this.

You can already do this with the sync_server, which I've just deleted (oops) for 0.12 because it was deprecated in 0.11.x -- especially because it's too easy for users to forget that these handlers are running in the network threads too.

There's a balance here to be struck, and I think there's a way to get in a simpler HTTP server implementation that avoids many of the shortcuts in the sync server implementation while keeping the flexibility of the async implementation. Something to think about, especially in light of the HTTP/2 ambitions for the project.
 
 
I'd also suggest removing the shared_ptr wrapping boost::asio::io_service and instead holding a reference, leaving the library consumer to choose how the lifetime of the io_service is managed. When I switched to using this library I had to replumb about 10 classes just to add the shared_ptr to the signatures, which was purely vestigial in my case.


Interesting.

I think the reason we took a shared_ptr instead is so that we can tie the lifetime of the io_service to the operations still pending on it. So imagine you create a client object and do a post, then the client goes out of scope before the post has finished -- the shared pointer keeps the io_service alive until all the operations that need it are done. This is much easier to get wrong if we took a reference to an optional io_service -- note that not all users will provide an io_service they control, because not everyone will already be using Boost.Asio.

Thanks for the feedback Colin!

Is there anything else I can help with?

No that's good, thanks for taking a look!
 

Always happy to help!

On another note, are you willing to have a look at an implementation of the above asynchronous functions?

Colin LeMahieu

Jan 22, 2016, 11:46:56 AM
to The C++ Network Library
For sure, I can take a look, just let me know. 

Christophe B.

Jan 29, 2016, 8:36:08 AM
to The C++ Network Library
Hi Dean,

I need to implement a fully asynchronous client that reacts when the response is ready. Due to constraints, multiple HTTP client instances must share the same io_service and the same thread, and it is not possible to add more threads or io_services. I could negotiate the addition of a thread for handling all client responses, but even that thread should not be blocked while it checks a response's status. Actually, I'd like to handle the response once it is ready (a la Boost.Asio).

Is it possible to accomplish this with an asynchronous HTTP client?
If yes, how? Could you give some guidelines or code snippets showing how to handle multiple clients' responses asynchronously without blocking each other?

I downloaded the 0.12.0 rc0 from GitHub, but I didn't find the when_ready(...) function. I guess it is still under development. Do you already have some implementation to provide?

Thank you.

Dean Michael Berris

Jan 29, 2016, 10:08:58 PM
to cpp-n...@googlegroups.com
Hi Christophe!

> On 30 Jan 2016, at 00:36, Christophe B. <bou...@gmail.com> wrote:
>
> Hi Dean,
>
> I need to implement a fully asynchronous client that reacts when the response is ready. Due to constraints, multiple HTTP client instances must share the same io_service and the same thread, and it is not possible to add more threads or io_services. I could negotiate the addition of a thread for handling all client responses, but even that thread should not be blocked while it checks a response's status. Actually, I'd like to handle the response once it is ready (a la Boost.Asio).
>
> Is it possible to accomplish this with an asynchronous HTTP client?
> If yes, how? Could you give some guidelines or code snippets showing how to handle multiple clients' responses asynchronously without blocking each other?
>

You can do something like this, but it's not fully asynchronous (i.e. you're going to need to poll in some sense):

std::vector<std::pair<client::request, client::response>> responses;
for (const auto& url : urls) {
  auto req = client::request(url);
  responses.push_back(std::make_pair(req, client.get(req)));
}
while (!responses.empty()) {
  for (auto i = responses.begin(); i != responses.end(); ++i) {
    if (ready(i->second)) {
      // here, the response is ready, do something with it
      responses.erase(i);
      break;
    }
  }
  // maybe sleep for a little while
}

> I downloaded the 0.12.0 rc0 from github, but I didn't find the when_ready(...) function. I guess that it is still under development. Do you already have some implementation to provide ?
>

Sorry, yeah, I haven't implemented this completely yet. I had a pull request that went defunct, and I need to revisit it at a later time.

But ideally it should be possible to do this fully asynchronously. Just haven't gotten around to doing so.

Cheers

Christophe B.

Jan 30, 2016, 4:43:12 PM
to The C++ Network Library
Hi Dean,
Thank you for your swift reaction. I did not find the entry about "ready" in the documentation.
Do you have any plans to provide a fully asynchronous client in the future?
Best.

Dean Michael Berris

Jan 31, 2016, 7:09:45 PM
to cpp-n...@googlegroups.com
On Sun, Jan 31, 2016 at 8:43 AM Christophe B. <bou...@gmail.com> wrote:
Hi Dean,
Thank you for your swift reaction.

My pleasure!
 
I did not find the entry about "ready" in the documentation.

Oh, that's a bug. Can you file an issue on GitHub, so we can track the documentation failures?
 
Do you have any plans to provide a fully asynchronous client in the future?

Yes. I think we're getting braver now that we have enough experience with actual use-cases for a more flexible API. The simplistic API didn't quite work for the advanced users, so there's room to provide a richer API for these use-cases.

Of course if you have time to work on something like this, I'd gladly review pull requests and work on getting something that works for users implemented by the users. :)
 
Best.

Thanks Christophe!

Cheers

Christophe B.

Feb 3, 2016, 4:09:27 AM
to The C++ Network Library
Hi Dean,
I am sorry for the delay. I just reported the issue on GitHub: https://github.com/cpp-netlib/cpp-netlib/issues/592
As for contributing, it is difficult to make any promises. Should I make an attempt in this direction, I will certainly propose the result.
I have already investigated the code a bit. From what I understood, the main part resides (at least in version 0.11) inside boost/network/protocol/http/client/connection/async_normal.hpp, which provides the glue between asio's asynchronous calls and the response's member futures. I guess that async_normal should be reimplemented in terms of a new, fully asynchronous connection. What is not clear to me is how to make this new fully asynchronous connection ripple up to the client interface.
If you could give some directions for getting the job done, it would give me a better chance of making progress on it.