I guess firstly… can you confirm that when you say “same packet” you are referring to TCP packets and not WebSocket messages?
If so, I am curious why it is important that your messages be sent in separate TCP packets. WebSocket++ aggregates multiple small WebSocket messages going to the same destination at the same time into single TCP packets because doing so drastically reduces framing overhead on the wire and the CPU time used by the endpoint.
——
If your goal is to space out responses in time (as in your original example), using timers is a good approach. If your goal is to improve response latency (i.e. if you have 10 requests to process and you want to send the first response before processing the second request), then what you need is to control the total run time of your message handler. Responses won’t be written out to the wire until after the handler that sends the message returns. Some strategies for doing this:
1. If an individual request might take a long time to process and you will have multiple connections then your only sane choice is to use a background thread for processing (see broadcast_server for an example).
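For illustration, here is a rough sketch of that pattern, loosely based on the broadcast_server example (assuming the asio config and C++11; the queue, the worker_thread name, actually_process_request, and the port number are placeholders for your own application code, not part of the library):

#include <websocketpp/config/asio_no_tls.hpp>
#include <websocketpp/server.hpp>

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

typedef websocketpp::server<websocketpp::config::asio> server;

server endpoint;
std::queue<std::pair<websocketpp::connection_hdl, std::string> > request_queue;
std::mutex queue_mutex;
std::condition_variable queue_cond;

// placeholder for your actual application logic
std::string actually_process_request(const std::string & request);

void on_msg(websocketpp::connection_hdl hdl, server::message_ptr msg) {
    // just queue the request; the handler returns immediately so the
    // network thread stays responsive
    {
        std::lock_guard<std::mutex> lock(queue_mutex);
        request_queue.push(std::make_pair(hdl, msg->get_payload()));
    }
    queue_cond.notify_one();
}

void worker_thread() {
    while (true) {
        std::unique_lock<std::mutex> lock(queue_mutex);
        queue_cond.wait(lock, [] { return !request_queue.empty(); });

        std::pair<websocketpp::connection_hdl, std::string> job = request_queue.front();
        request_queue.pop();
        lock.unlock();

        // the response is produced and sent from this thread while the
        // network thread keeps servicing other connections
        std::string response = actually_process_request(job.second);
        endpoint.send(job.first, response, websocketpp::frame::opcode::text);
    }
}

int main() {
    endpoint.init_asio();
    endpoint.set_message_handler(&on_msg);

    std::thread worker(&worker_thread);

    endpoint.listen(9002);
    endpoint.start_accept();
    endpoint.run();

    worker.join();
}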
2. If each request is fairly short, but 10 requests might be too long to wait then you want your handlers to process requests in batches rather than a loop that does all of them. Example:
on_msg(hdl, msg)
{
    request_queue.push(msg);
    process_batch(hdl);
}

on_interrupt(hdl)
{
    process_batch(hdl);
}

process_batch(hdl)
{
    // Send some finite number of responses based on the expected run time of
    // each response and the desired latency. Adjusting the batch size will
    // affect minimum and average latency as well as server resource usage.
    // This example uses a batch size of one for simplicity; test with your own
    // data and workloads to find the best batch size.
    response = actually_process_request(request_queue.front());
    endpoint.send(hdl, response);
    request_queue.pop();

    // Instead of processing the remaining requests inline, which would extend
    // the run time of this handler, we use an interrupt to yield control back
    // to the main event loop and ask that this connection's interrupt handler
    // be called as soon as the library has had a chance to process some
    // network data.
    if (!request_queue.empty())
        endpoint.interrupt(hdl);
}
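For reference, the interrupt hooks in the pseudocode above correspond to set_interrupt_handler() and endpoint.interrupt(hdl) in WebSocket++. A minimal sketch, assuming the asio config and a single client for simplicity (with multiple connections you would keep a queue per connection; actually_process_request is a placeholder for your own logic):

#include <websocketpp/config/asio_no_tls.hpp>
#include <websocketpp/server.hpp>

#include <queue>
#include <string>

typedef websocketpp::server<websocketpp::config::asio> server;

server endpoint;
std::queue<server::message_ptr> request_queue;

// placeholder for your actual application logic
std::string actually_process_request(const std::string & request);

void process_batch(websocketpp::connection_hdl hdl) {
    // batch size of one, as in the pseudocode above
    std::string response = actually_process_request(request_queue.front()->get_payload());
    endpoint.send(hdl, response, websocketpp::frame::opcode::text);
    request_queue.pop();

    // ask the library to call this connection's interrupt handler as soon as
    // it has had a chance to service the event loop again
    if (!request_queue.empty()) {
        endpoint.interrupt(hdl);
    }
}

int main() {
    endpoint.init_asio();
    endpoint.set_message_handler([](websocketpp::connection_hdl hdl, server::message_ptr msg) {
        request_queue.push(msg);
        process_batch(hdl);
    });
    endpoint.set_interrupt_handler(&process_batch);
    endpoint.listen(9002);
    endpoint.start_accept();
    endpoint.run();
}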
3. A third option, which might be better for a busy server (especially one with many clients but relatively few messages per client), is to approximate a worker thread using a perpetual timer. The benefit of this method is that you can use one timer and one queue for all connections, which reduces overhead. The drawback is that requests will only be processed at the specified interval. Example:
main() {
    // sometime before accepting requests
    endpoint.set_timer(1000, process_batch);
}

on_msg(hdl, msg)
{
    global_request_queue.push({hdl, msg});
}

process_batch() {
    // process a batch of requests if there are any
    if (!global_request_queue.empty()) {
        std::tie(hdl, msg) = global_request_queue.front();
        response = actually_process_request(msg);
        endpoint.send(hdl, response);
        global_request_queue.pop();
    }

    // reset the timer for the next interval
    endpoint.set_timer(1000, process_batch);
}
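Using the actual library API, the perpetual-timer version looks roughly like this (a sketch assuming the asio config; the timer callback takes an error_code, set_timer durations are in milliseconds, and actually_process_request is a placeholder). Since everything runs on the thread calling run(), the shared queue needs no locking:

#include <websocketpp/config/asio_no_tls.hpp>
#include <websocketpp/server.hpp>

#include <queue>
#include <string>
#include <utility>

typedef websocketpp::server<websocketpp::config::asio> server;

server endpoint;
std::queue<std::pair<websocketpp::connection_hdl, server::message_ptr> > global_request_queue;

// placeholder for your actual application logic
std::string actually_process_request(const std::string & request);

void process_batch(const websocketpp::lib::error_code & ec) {
    if (ec) { return; } // timer was cancelled or failed

    // process a batch (of one, here) if there is anything queued
    if (!global_request_queue.empty()) {
        websocketpp::connection_hdl hdl = global_request_queue.front().first;
        server::message_ptr msg = global_request_queue.front().second;
        global_request_queue.pop();

        std::string response = actually_process_request(msg->get_payload());

        // use the error_code overload in case the client disconnected while
        // its request was still queued
        websocketpp::lib::error_code send_ec;
        endpoint.send(hdl, response, websocketpp::frame::opcode::text, send_ec);
    }

    // reset the timer for the next interval
    endpoint.set_timer(1000, &process_batch);
}

int main() {
    endpoint.init_asio();
    endpoint.set_message_handler([](websocketpp::connection_hdl hdl, server::message_ptr msg) {
        global_request_queue.push(std::make_pair(hdl, msg));
    });

    // sometime before accepting requests, start the perpetual timer
    endpoint.set_timer(1000, &process_batch);

    endpoint.listen(9002);
    endpoint.start_accept();
    endpoint.run();
}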
Finally, if you have a really compelling reason to push out batches of WebSocket messages on the same connection in distinct TCP packets with low latency between messages, the only way to do that is going to be for me to provide an option to disable message coalescing.