Crash in TCP

Iñaki Baz Castillo

Oct 22, 2019, 9:15:16 PM
to li...@googlegroups.com
Hi,

Although I've never been able to reproduce it (on either Linux or
OSX), there seems to be a TCP-related bug in my libuv-based C++ app.
I'm getting a little desperate because I've spent many days trying to
figure out where the issue is, with no luck.

A user of my software (who runs it on Linux) is experiencing
TCP-related crashes. It's not even always the same coredump, but it
always seems to be related to TCP.

Could anyone tell me if there is something wrong here?

-------------------------------------------------
void TcpConnection::Close()
{
  MS_TRACE();

  if (this->closed)
    return;

  int err;

  this->closed = true;

  // Tell the UV handle that the TcpConnection has been closed.
  this->uvHandle->data = nullptr;

  // Don't read more.
  err = uv_read_stop(reinterpret_cast<uv_stream_t*>(this->uvHandle));

  if (err != 0)
    MS_ABORT("uv_read_stop() failed: %s", uv_strerror(err));

  // If there is no error and the peer didn't close its connection side
  // then close gracefully.
  if (!this->hasError && !this->isClosedByPeer)
  {
    // Use uv_shutdown() so pending data to be written will be sent to the peer
    // before closing.
    auto req = new uv_shutdown_t;
    req->data = static_cast<void*>(this);
    err = uv_shutdown(
      req, reinterpret_cast<uv_stream_t*>(this->uvHandle),
      static_cast<uv_shutdown_cb>(onShutdown));

    if (err != 0)
      MS_ABORT("uv_shutdown() failed: %s", uv_strerror(err));
  }
  // Otherwise directly close the socket.
  else
  {
    uv_close(reinterpret_cast<uv_handle_t*>(this->uvHandle),
      static_cast<uv_close_cb>(onClose));
  }
}
-------------------------------------------------

That is, I call uv_shutdown() if the connection is still open, and
within the uv_shutdown_cb I call uv_close(). Otherwise, if the remote
peer closed its TCP side (or a read error occurred), I call uv_close()
directly:

---------------------------------------------------
inline static void onClose(uv_handle_t* handle)
{
  delete handle;
}

inline static void onShutdown(uv_shutdown_t* req, int /*status*/)
{
  auto* handle = req->handle;

  delete req;

  // Now do close the handle.
  uv_close(reinterpret_cast<uv_handle_t*>(handle),
    static_cast<uv_close_cb>(onClose));
}
---------------------------------------------------
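
Since the crash does not reproduce locally, one way to narrow it down
could be to record each handle's lifecycle and check it at every call
site. The following is only a minimal sketch, not code from mediasoup
or libuv: HandleTracker and its methods are hypothetical names, and it
assumes everything runs on the single loop thread. It leans on two
libuv rules: uv_close() must not be called again on a handle that is
already closing, and the handle's memory may only be released in the
uv_close_cb or later.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical debugging aid (not part of libuv or mediasoup): tracks
// whether each handle is open or already closing, keyed by its address.
enum class HandleState { Open, Closing };

class HandleTracker
{
public:
  // Call right after the handle has been initialized successfully.
  void OnInit(const void* handle)
  {
    states[handle] = HandleState::Open;
  }

  // Call just before uv_close(). Returns false on a double close or
  // when the handle is unknown (never initialized or already freed).
  bool OnCloseRequested(const void* handle)
  {
    auto it = states.find(handle);

    if (it == states.end() || it->second != HandleState::Open)
      return false;

    it->second = HandleState::Closing;

    return true;
  }

  // Call at the top of the uv_close_cb, before the delete. Returns
  // false if no close was requested for this handle. The entry is
  // erased because the allocator may reuse the address later.
  bool OnCloseCallback(const void* handle)
  {
    auto it = states.find(handle);

    if (it == states.end() || it->second != HandleState::Closing)
      return false;

    states.erase(it);

    return true;
  }

  // True while the handle may still be passed to uv_shutdown() etc.
  bool IsUsable(const void* handle) const
  {
    auto it = states.find(handle);

    return it != states.end() && it->second == HandleState::Open;
  }

private:
  std::unordered_map<const void*, HandleState> states;
};
```

In Close() and onShutdown() one would check OnCloseRequested() right
before uv_close(), and in onClose() check OnCloseCallback() right
before the delete; a false return marks the spot to log or abort.
libuv's own uv_is_closing() answers a similar question, but only for
a handle whose memory is still valid.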

this->hasError and this->isClosedByPeer are booleans set in the
uv_read_cb: isClosedByPeer becomes true on UV_EOF or UV_ECONNRESET,
and hasError becomes true on any other read error:

------------------------------------------------
inline void TcpConnection::OnUvRead(ssize_t nread, const uv_buf_t* /*buf*/)
{
  MS_TRACE();

  if (nread == 0)
    return;

  // Data received.
  if (nread > 0)
  {
    // Update received bytes.
    this->recvBytes += nread;

    // Update the buffer data length.
    this->bufferDataLen += static_cast<size_t>(nread);

    // Notify the subclass.
    UserOnTcpConnectionRead();
  }
  // Client disconnected.
  else if (nread == UV_EOF || nread == UV_ECONNRESET)
  {
    MS_DEBUG_DEV("connection closed by peer, closing server side");

    this->isClosedByPeer = true;

    // Close server side of the connection.
    Close();

    // Notify the listener.
    this->listener->OnTcpConnectionClosed(this);
  }
  // Some error.
  else
  {
    MS_WARN_DEV("read error, closing the connection: %s", uv_strerror(nread));

    this->hasError = true;

    // Close server side of the connection.
    Close();

    // Notify the listener.
    this->listener->OnTcpConnectionClosed(this);
  }
}
------------------------------------------------


The whole code is here:

- https://github.com/versatica/mediasoup/blob/tcp-crash-wip/worker/src/handles/TcpConnection.cpp?ts=2
- https://github.com/versatica/mediasoup/blob/tcp-crash-wip/worker/include/handles/TcpConnection.hpp?ts=2

And an example coredump is here:

https://mediasoup.discourse.group/t/error-worker-worker-process-died-unexpectedly/268/84?u=ibc


I'm pretty sure everything is OK in the code above. Could anyone
identify an issue in it?

Thanks a lot, really.


--
Iñaki Baz Castillo
<i...@aliax.net>

Santiago Gimeno

Oct 23, 2019, 2:50:25 AM
to li...@googlegroups.com
Hi,

I haven't looked at the code, but as an idea, have you run the code with valgrind? It might provide some hints.

Cheers,

Santi

--
You received this message because you are subscribed to the Google Groups "libuv" group.
To unsubscribe from this group and stop receiving emails from it, send an email to libuv+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/libuv/CALiegf%3DhsERWve0k0O4Yut4SU9QPBXzbWuRS9s4une5xbyKwJw%40mail.gmail.com.

Iñaki Baz Castillo

Oct 24, 2019, 8:38:56 AM
to li...@googlegroups.com
Yes, we are running it with Valgrind, but we need to do it more. We'll
keep it running under Valgrind for a long time and see what turns up.