Processing long-running http requests with seastar::http::server

Kfir Gollan

<kfir.gollan@gmail.com>
Aug 2, 2023, 6:25:53 AM
to seastar-dev
Hey,
While working with the http_server I noticed that it only handles a single request per listener per shard at a time. This implies that the implementation is limited in the number of parallel requests it can handle. If the server performs some heavy operation in the context of a single HTTP request (e.g., calling another service through its API), the shard will not be able to process any additional requests until the previous request completes.

This made me wonder: why not add a seastar::thread pool to the http server? The size of the pool could be configurable according to the application's needs, overcoming the limitation of a single connection per shard.

I didn't find any discussion of this topic in the community here, so I figured I would ask before implementing such a solution.

Thanks in advance,
Kfir

Marcin Maliszkiewicz

<marcinmal@scylladb.com>
Aug 2, 2023, 7:19:34 AM
to Kfir Gollan, seastar-dev
Hello!
Looking at http_server::do_accept_one, it seems that it will create as many connections per shard as you request. How are you planning to support I/O to a single connection concurrently?

Marcin Maliszkiewicz

<marcinmal@scylladb.com>
Aug 20, 2023, 2:43:56 AM
to Kfir Gollan, seastar-dev
Thank you for the example! As I understand it, the server shouldn't block like that. I didn't have time to fully debug it, but I made a few observations:

- In your demo, when I call the /slow endpoint and then, before it finishes, call the /fast endpoint, the latter waits for a response forever. It seems the request was somehow lost or stuck.

- It does seem to work correctly in the alternator code. I added a 5s sleep in local_nodelist_handler::do_handle and then ran a simple script:

```
#!/bin/bash
for i in `seq 100`
do
curl 'http://127.0.0.1:8000/localnodes' &
echo $i
done

wait
```

It works as expected: all requests finish concurrently in just 5s (the sleep time).

- You can try the above script with your code. For me it seems to deadlock after a few initial requests.

Regards,
Marcin

On Wed, Aug 2, 2023 at 4:26 PM Kfir Gollan <kfir....@gmail.com> wrote:
I figured that the simplest way of showing the problem is with a reproducible case.
The following program is an http_server (no sharding, for simplicity) that handles two requests, fast and slow.
Accessing the fast API is instantaneous, while the slow API takes 10 seconds. Note that the shard is idle while waiting for the slow API to complete.


```
#include <seastar/core/seastar.hh>
#include <seastar/core/future-util.hh>
#include <seastar/core/app-template.hh>
#include <seastar/http/routes.hh>
#include <seastar/http/request.hh>
#include <seastar/http/function_handlers.hh>
#include <seastar/http/httpd.hh>
#include <seastar/core/sleep.hh>
#include <seastar/coroutine/all.hh>
#include <seastar/util/log.hh>
#include <iostream>
#include <chrono>

using namespace std::chrono_literals;

using namespace seastar;
using namespace seastar::httpd;

logger applog{"http-requests"};

void set_routes(routes& r) {
    function_handler* h1 = new function_handler([](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
        applog.info("{} slow", req->get_url());
        co_await seastar::sleep(10s);
        co_return json::json_return_type("json-future");
    });
    function_handler* h2 = new function_handler([](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
        applog.info("{} fast", req->get_url());
        co_return json::json_return_type("json-future");
    });
    r.add(operation_type::GET, url("/slow"), h1);
    r.add(operation_type::GET, url("/fast"), h2);
}

int main(int argc, char** argv) {
    app_template app;
    return app.run(argc, argv, [] () -> future<int> {
        auto server = new http_server("seastar");
        set_routes(server->_routes);
        co_await server->listen(seastar::make_ipv4_address({1234}));

        co_await seastar::sleep(10000s);
        co_await server->stop();
        delete server;
        co_return 0;
    });
}
```

To see it in action, invoke two calls: trigger the slow endpoint and then the fast endpoint.

The server will hang.

My suggested fix is to invoke the actual handlers (in this case the function handler that sleeps) on a seastar::thread. That way Seastar will be able to handle concurrent requests.

Hope this makes more sense now,
Kfir.

Nadav Har'El

<nyh@scylladb.com>
Aug 20, 2023, 3:28:19 AM
to Kfir Gollan, seastar-dev
On Wed, Aug 2, 2023 at 1:25 PM Kfir Gollan <kfir....@gmail.com> wrote:
Hey,
While working with the http_server I noticed that it only handles a single request per listener per shard at a time.

As Marcin noted experimentally (from Scylla's DynamoDB-compatible API "Alternator", which uses Seastar's HTTP server), this isn't actually true: as soon as Seastar HTTPD accepts a connection on a listening socket, it processes this new connection in the background, and immediately continues to accept more connections on the same listening socket. A snippet from the httpd code:

```
future<> http_server::do_accept_one(int which) {
    return _listeners[which].accept().then([this] (accept_result ar) mutable {
        auto conn = std::make_unique<connection>(*this, std::move(ar.connection), std::move(ar.remote_address));
        (void)try_with_gate(_task_gate, [conn = std::move(conn)]() mutable {
            return conn->process().handle_exception([conn = std::move(conn)] (std::exception_ptr ex) {
                hlogger.error("request error: {}", ex);
```

Note how we do not wait for the handling of the connection (conn->process()) before returning from do_accept_one, and going on to accept another connection on the same listening socket.
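To make the effect of not waiting concrete, here is a toy model in plain C++. It is not Seastar code: `Reactor`, `serve` and the virtual clock are hypothetical names invented for this sketch, and the "handler" is just a timer. It compares an accept loop that hands each connection off and immediately accepts the next one with a loop that waits for each handler before accepting again.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy single-threaded reactor with a virtual clock (an illustration only,
// NOT Seastar's API). Callbacks run in deadline order; no real time passes.
struct Reactor {
    struct Timer {
        long when;                 // virtual time at which fn should run
        long seq;                  // tie-breaker: FIFO among equal deadlines
        std::function<void()> fn;
    };
    struct Later {
        bool operator()(const Timer& a, const Timer& b) const {
            return a.when != b.when ? a.when > b.when : a.seq > b.seq;
        }
    };
    long now = 0;
    long next_seq = 0;
    std::priority_queue<Timer, std::vector<Timer>, Later> timers;

    // Schedule fn to run `delay` virtual seconds from now.
    void after(long delay, std::function<void()> fn) {
        assert(delay >= 0);
        timers.push(Timer{now + delay, next_seq++, std::move(fn)});
    }
    // Run callbacks until none are left, advancing the virtual clock.
    void run() {
        while (!timers.empty()) {
            Timer t = timers.top();
            timers.pop();
            now = t.when;
            t.fn();
        }
    }
};

// Accept n connections whose handlers each "sleep" for handler_delay.
// wait_for_handler == false: hand the connection off and accept the next
//                            one right away (what the snippet above does).
// wait_for_handler == true:  accept the next connection only after the
//                            previous handler has finished.
// Returns the virtual time at which the last handler completes.
long serve(int n, long handler_delay, bool wait_for_handler) {
    Reactor r;
    long last_done = 0;
    std::function<void(int)> accept_one = [&](int remaining) {
        if (remaining == 0) return;
        r.after(handler_delay, [&, remaining] {
            last_done = r.now;
            if (wait_for_handler) accept_one(remaining - 1);
        });
        if (!wait_for_handler) accept_one(remaining - 1);
    };
    accept_one(n);
    r.run();
    return last_done;
}
```

In this model, `serve(100, 5, false)` returns 5 and `serve(100, 5, true)` returns 500, which matches Marcin's measurement of 100 concurrent requests finishing in about the 5s sleep time.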

That being said, your observation is correct in another sense: when processing a flow of requests on a single connection, we do read and process the requests one after another, without parallelism. This is almost, but not quite, a built-in limitation of HTTP before HTTP/2, which doesn't multiplex multiple request and response streams. I say "almost but not quite" because there is one thing we handle non-optimally: HTTP 1.1 pipelining. With pipelining, a client can send multiple requests without waiting for the responses, and the server could start all of them in parallel, as long as it still returns the responses in the order the requests arrived (an HTTP 1.1 requirement). As far as I remember, we don't do this today. The code in src/http/httpd.cc actually has separate "reading" and "writing" threads, but I think (I haven't looked at the code in a while) we don't read the next request on a connection before we have finished writing the previous response.

I was surprised to see we don't have an open issue about HTTP 1.1 pipelining. In any case, HTTP 1.1 pipelining is a fairly fragile and hated feature, and we can gain more by supporting HTTP/2, which does request multiplexing much better.
 
This implies that the implementation is limited to the number of parallel requests that can be handled. Assuming the server performs some heavy operation in the context of a single http request (e.g using some other service by its API) the shard will not be able to process any additional requests until the completion of the previous request.

As I noted above, this is not true if new requests come from other HTTP connections. It is true if the client is trying to push additional requests on existing connections that are already busy. In any case, with HTTP 1.1, even if the client does manage to pipeline a second request on a busy connection, it can't get the response before the previous response has been returned.
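The three cases can be compared with a small timing model in plain C++. This is a sketch under simplifying assumptions, not Seastar code: all requests arrive at time 0, reading and writing take no time, and the function names are made up for the illustration. Each function takes the handler duration of every request and returns the time at which each response is complete.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One connection, one request at a time: the next request is only read
// after the previous response has been written.
std::vector<long> serial(const std::vector<long>& durations) {
    std::vector<long> done;
    long t = 0;
    for (long d : durations) {
        assert(d >= 0);
        t += d;                    // each handler starts when the previous ends
        done.push_back(t);
    }
    return done;
}

// One connection with HTTP 1.1 pipelining and parallel handlers: every
// handler starts at time 0, but responses must go out in request order,
// so a response cannot be sent before the one ahead of it.
std::vector<long> pipelined_parallel(const std::vector<long>& durations) {
    std::vector<long> done;
    long prev = 0;
    for (long d : durations) {
        assert(d >= 0);
        prev = std::max(prev, d);  // wait for our handler and for our turn
        done.push_back(prev);
    }
    return done;
}

// One request per connection: responses are independent of each other.
std::vector<long> separate_connections(const std::vector<long>& durations) {
    return durations;
}
```

For a 10s request followed by an instantaneous one, `serial` and `pipelined_parallel` both give {10, 10}, while `separate_connections` gives {10, 0}. Parallel handlers only help on a single connection when later requests are themselves slow: for durations {10, 5, 1}, `serial` gives {10, 15, 16} and `pipelined_parallel` gives {10, 10, 10}.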



This made me wonder, why not add some seastar::thread pool to the http server. The size of the pool can be configurable according to the application needs. Overcoming the limitation of a single connection per shard.

We don't need seastar::thread to do parallel work in Seastar: futures, continuations and coroutines already run concurrently on a shard. All seastar::thread adds is a stack, which you are better off without if you're aiming for massive parallelism.
 

I didn't find any discussion on this topic in the community here. Figured that I will ask before implementing such a solution.

Thanks in advance,
Kfir
