net/sctp: sock memory leak


Dmitry Vyukov

Dec 30, 2015, 3:42:48 PM
to Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hello,

The following program leads to a leak of two sock objects:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <pthread.h>

int fd;

void *thr(void *arg)
{
	/* destination sockaddr_in6 (AF_INET6) for the sendto() below */
	memcpy((void*)0x2000bbbe,
"\x0a\x00\x33\xdc\x14\x4d\x5b\xd1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xdd\x01\xf8\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
	       128);
	syscall(SYS_sendto, fd, 0x2000b000ul, 0x70ul, 0x8000ul,
		0x2000bbbeul, 0x80ul);
	return 0;
}

int main()
{
	long i;
	pthread_t th[6];

	syscall(SYS_mmap, 0x20000000ul, 0x20000ul, 0x3ul, 0x32ul,
		0xfffffffffffffffful, 0x0ul);
	/* socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP) */
	fd = syscall(SYS_socket, 0xaul, 0x1ul, 0x84ul, 0, 0, 0);
	/* local sockaddr to bind to */
	memcpy((void*)0x20003000,
"\x02\x00\x33\xdf\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
	       128);
	syscall(SYS_bind, fd, 0x20003000ul, 0x80ul, 0, 0, 0);
	pthread_create(&th[0], 0, thr, (void*)0);
	usleep(100000);
	syscall(SYS_listen, fd, 0x3ul, 0, 0, 0, 0);
	syscall(SYS_accept, fd, 0x20005f80ul, 0x20003000ul, 0, 0, 0);
	return 0;
}


unreferenced object 0xffff8800342540c0 (size 1864):
comm "a.out", pid 24109, jiffies 4299060398 (age 27.984s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
backtrace:
[<ffffffff85c73a22>] kmemleak_alloc+0x72/0xc0 mm/kmemleak.c:915
[< inline >] kmemleak_alloc_recursive include/linux/kmemleak.h:47
[< inline >] slab_post_alloc_hook mm/slub.c:1335
[< inline >] slab_alloc_node mm/slub.c:2594
[< inline >] slab_alloc mm/slub.c:2602
[<ffffffff816cc14d>] kmem_cache_alloc+0x12d/0x2c0 mm/slub.c:2607
[<ffffffff84b642c9>] sk_prot_alloc+0x69/0x340 net/core/sock.c:1344
[<ffffffff84b6d36a>] sk_alloc+0x3a/0x6b0 net/core/sock.c:1419
[<ffffffff850c6d57>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:173
[<ffffffff84b5f47c>] __sock_create+0x37c/0x640 net/socket.c:1162
[< inline >] sock_create net/socket.c:1202
[< inline >] SYSC_socket net/socket.c:1232
[<ffffffff84b5f96f>] SyS_socket+0xef/0x1b0 net/socket.c:1212
[<ffffffff85c8eaf6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
[<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff880034253780 (size 1864):
comm "a.out", pid 24109, jiffies 4299060500 (age 27.882s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 33 dc 00 00 ............3...
0a 00 07 40 00 00 00 00 d8 40 25 34 00 88 ff ff ...@.....@%4....
backtrace:
[<ffffffff85c73a22>] kmemleak_alloc+0x72/0xc0 mm/kmemleak.c:915
[< inline >] kmemleak_alloc_recursive include/linux/kmemleak.h:47
[< inline >] slab_post_alloc_hook mm/slub.c:1335
[< inline >] slab_alloc_node mm/slub.c:2594
[< inline >] slab_alloc mm/slub.c:2602
[<ffffffff816cc14d>] kmem_cache_alloc+0x12d/0x2c0 mm/slub.c:2607
[<ffffffff84b642c9>] sk_prot_alloc+0x69/0x340 net/core/sock.c:1344
[<ffffffff84b6d36a>] sk_alloc+0x3a/0x6b0 net/core/sock.c:1419
[<ffffffff85750e00>] sctp_v6_create_accept_sk+0xf0/0x790 net/sctp/ipv6.c:646
[<ffffffff857242a9>] sctp_accept+0x409/0x6d0 net/sctp/socket.c:3925
[<ffffffff84fa33b3>] inet_accept+0xe3/0x660 net/ipv4/af_inet.c:671
[<ffffffff84b5a68c>] SYSC_accept4+0x32c/0x630 net/socket.c:1474
[< inline >] SyS_accept4 net/socket.c:1424
[< inline >] SYSC_accept net/socket.c:1508
[<ffffffff84b601e6>] SyS_accept+0x26/0x30 net/socket.c:1505
[<ffffffff85c8eaf6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
[<ffffffffffffffff>] 0xffffffffffffffff

On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28).

Marcelo Ricardo Leitner

Dec 30, 2015, 3:47:38 PM
to Dmitry Vyukov, Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote:
> Hello,
>
> The following program leads to a leak of two sock objects:

Damn, Dmitry ;-)
If no one takes care of it by then, I'll look into it next week, thanks.

Marcelo

Marcelo Ricardo Leitner

Jan 15, 2016, 1:46:26 PM
to Dmitry Vyukov, Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote:
> Hello,
>
> The following program leads to a leak of two sock objects:
...
>
> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28).

I'm afraid I cannot reproduce this one.
I enabled a dynamic printk at sctp_destroy_sock() and it does print twice
when I run this test app.
I also added debug prints to check the association lifetime, and it was
destroyed. Same for the endpoint.

Checking with trace-cmd, both calls to sctp_close() resulted in
sctp_destroy_sock() being called.

As for sock_hold/put, they are matched too.

Ideas? The log is below for double-checking:

[ 1112.558217] sctp_endpoint_init(ffff88000015d400) sock_hold sk:ffff8800336da000
[ 1112.558539] sctp_endpoint_hold(ffff88000015d400)
[ 1112.558544] sctp_association_init(ffff8800b2db9000) sock_hold sk:ffff8800336da000
[ 1112.558878] sctp_association_hold(ffff8800b2db9000)
[ 1112.558957] sctp_association_hold(ffff8800b2db9000)
[ 1112.559062] sctp_association_hold(ffff8800b2db9000)
[ 1112.559079] sctp_association_hold(ffff8800b2db9000)
[ 1112.658745] sctp_endpoint_init(ffff88000015e200) sock_hold sk:ffff8800336dc800
[ 1112.658815] sctp_endpoint_put(ffff88000015d400)
[ 1112.658819] sctp_assoc_migrate(ffff8800b2db9000) sock_put sk:ffff8800336da000 oldsk:ffff8800336da000
[ 1112.658822] sctp_endpoint_hold(ffff88000015e200)
[ 1112.658824] sctp_assoc_migrate(ffff8800b2db9000) sock_hold sk:ffff8800336dc800
[ 1112.659627] sctp_association_put(ffff8800b2db9000)
[ 1112.659673] sctp_association_free(ffff8800b2db9000)
[ 1112.659691] sctp_association_put(ffff8800b2db9000)
[ 1112.659735] sctp_transport_put(ffff8800b3426000)
[ 1112.659737] sctp_transport_destroy(ffff8800b3426000)
[ 1112.659741] sctp_association_put(ffff8800b2db9000)
[ 1112.659745] sctp_association_put(ffff8800b2db9000)
[ 1112.659757] sctp_association_put(ffff8800b2db9000)
[ 1112.659759] sctp_association_destroy(ffff8800b2db9000)
[ 1112.659761] sctp_endpoint_put(ffff88000015e200)
[ 1112.659773] sctp_association_destroy(ffff8800b2db9000) sock_put sk:ffff8800336dc800
[ 1112.659814] sctp_close sock_hold sk:ffff8800336dc800
[ 1112.659818] sctp: sctp_destroy_sock: sk:ffff8800336dc800
[ 1112.659823] sctp_endpoint_put(ffff88000015e200)
[ 1112.659825] sctp_endpoint_destroy(ffff88000015e200)
[ 1112.659841] sctp_endpoint_destroy(ffff88000015e200) sock_put sk:ffff8800336dc800
[ 1112.659852] sctp_close sock_put sk:ffff8800336dc800
[ 1112.662437] sctp_close sock_hold sk:ffff8800336da000
[ 1112.662443] sctp: sctp_destroy_sock: sk:ffff8800336da000
[ 1112.662448] sctp_endpoint_put(ffff88000015d400)
[ 1112.662450] sctp_endpoint_destroy(ffff88000015d400)
[ 1112.662466] sctp_endpoint_destroy(ffff88000015d400) sock_put sk:ffff8800336da000
[ 1112.662476] sctp_close sock_put sk:ffff8800336da000
[ 1112.677226] sctp_transport_destroy_rcu(ffff8800b3426000)

Dmitry Vyukov

Jan 15, 2016, 2:11:23 PM
to Marcelo Ricardo Leitner, Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Fri, Jan 15, 2016 at 7:46 PM, Marcelo Ricardo Leitner
<marcelo...@gmail.com> wrote:
> On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> The following program leads to a leak of two sock objects:
> ...
>>
>> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28).
>
> I'm afraid I cannot reproduce this one?
> I enabled dynprintk at sctp_destroy_sock and it does print twice when I
> run this test app.
> Also added debugs to check association lifetime, and then it was
> destroyed. Same for endpoint.
>
> Checking with trace-cmd, both calls to sctp_close() resulted in
> sctp_destroy_sock() being called.
>
> As for sock_hold/put, they are matched too.
>
> Ideas? Log is below for double checking


Hummm... I can reproduce it pretty reliably.

[ 197.459024] kmemleak: 11 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 307.494874] kmemleak: 409 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 549.784022] kmemleak: 125 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

I double checked via /proc/slabinfo:

SCTPv6 4373 4420 2368 13 8 : tunables 0 0 0 : slabdata 340 340 0

The SCTPv6 count starts at almost 0 but grows without bound while I run the
program in a loop.

Here are my SCTP-related configs:

CONFIG_IP_SCTP=y
CONFIG_NET_SCTPPROBE=y
CONFIG_SCTP_DBG_OBJCNT=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y
# CONFIG_SCTP_COOKIE_HMAC_MD5 is not set
# CONFIG_SCTP_COOKIE_HMAC_SHA1 is not set

I am on commit 67990608c8b95d2b8ccc29932376ae73d5818727 and I don't
seem to have any sctp-related changes on top.
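
For reference, this is roughly how the looping can be driven -- a minimal
sketch, not the exact harness used here; it assumes the reproducer above is
compiled to ./a.out and that /proc/slabinfo is readable (usually needs root):

/* leak-watch.c: run the reproducer repeatedly and print the SCTPv6
 * slab line after each run.
 */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void print_sctpv6_line(void)
{
	char line[256];
	FILE *f = fopen("/proc/slabinfo", "r");

	if (!f)
		return;
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "SCTPv6", 6))
			fputs(line, stdout);
	fclose(f);
}

int main(void)
{
	for (int i = 0; i < 100; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			execl("./a.out", "./a.out", (char *)NULL);
			_exit(1);
		}
		waitpid(pid, NULL, 0);
		print_sctpv6_line();
	}
	return 0;
}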

Marcelo Ricardo Leitner

Jan 15, 2016, 4:41:03 PM
to net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, vyas...@gmail.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Ok, now I can. With slub debugging enabled, I no longer see calls to
sctp_destroy_sock(). I see calls to sctp_close(), but not to sctp_destroy_sock().

And SCTPv6 grew by 2 sockets after the execution.

Further checking, it's a race within SCTP asoc migration:

thread 0                                thread 1
- app creates a sock
- sends a packet to itself
- sctp will create an asoc and do
  implicit handshake
- send the packet
                                        - listen()
                                        - accept() is called and
                                          that asoc is migrated
- packet is delivered
- skb->destructor is called, BUT:

(note that if accept() is called after packet is delivered and skb is freed, it
doesn't happen)

static void sctp_wfree(struct sk_buff *skb)
{
	struct sctp_chunk *chunk = skb_shinfo(skb)->destructor_arg;
	struct sctp_association *asoc = chunk->asoc;
	struct sock *sk = asoc->base.sk;
	...
	atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);

and it's pointing to the new socket already. So one socket gets a leak
on sk_wmem_alloc and another gets a negative value:

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1537,12 +1537,14 @@ static void sctp_close(struct sock *sk, long timeout)
/* Hold the sock, since sk_common_release() will put sock_put()
* and we have just a little more cleanup.
*/
+ printk("%s sock_hold %p\n", __func__, sk);
sock_hold(sk);
sk_common_release(sk);

bh_unlock_sock(sk);
spin_unlock_bh(&net->sctp.addr_wq_lock);

+ printk("%s sock_put %p %d %d\n", __func__, sk, atomic_read(&sk->sk_refcnt), atomic_read(&sk->sk_wmem_alloc));
sock_put(sk);

SCTP_DBG_OBJCNT_DEC(sock);


gave me:

[ 99.456944] sctp_close sock_hold ffff880137df8940
...
[ 99.457337] sctp_close sock_put ffff880137df8940 1 -247
[ 99.458313] sctp_close sock_hold ffff880137dfef00
...
[ 99.458383] sctp_close sock_put ffff880137dfef00 1 249

That's why the socket is not freed.
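
For reference, the reason a leaked sk_wmem_alloc count pins the socket: the
final sock_put() ends up in sk_free(), which only destroys the sock once
sk_wmem_alloc drops to zero. Roughly (quoted from memory of net/core/sock.c
around this kernel version, so treat it as a sketch rather than the exact
code of this tree):

void sk_free(struct sock *sk)
{
	/*
	 * We subtract one from sk_wmem_alloc and can know if
	 * some packets are still in some tx queue.
	 * If not null, sock_wfree() will call __sk_free(sk) later
	 */
	if (atomic_dec_and_test(&sk->sk_wmem_alloc))
		__sk_free(sk);
}

With the destructor charging the wrong socket, one socket's sk_wmem_alloc
never reaches zero (so it is never freed) and the other goes negative.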


---8<---

As reported by Dmitry, we cannot migrate asocs that still have skbs in the
tx queue, because those skbs have a destructor callback that reaches the
socket through the asoc, and it will point to a different socket if we
migrate the asoc between the packet being sent and being released.

This patch implements proper error handling for sctp_sock_migrate() and
adds this first sanity check.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
net/sctp/socket.c | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 9bb80ec4c08f..5a22a6cfb699 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -99,8 +99,8 @@ static int sctp_send_asconf(struct sctp_association *asoc,
struct sctp_chunk *chunk);
static int sctp_do_bind(struct sock *, union sctp_addr *, int);
static int sctp_autobind(struct sock *sk);
-static void sctp_sock_migrate(struct sock *, struct sock *,
- struct sctp_association *, sctp_socket_type_t);
+static int sctp_sock_migrate(struct sock *, struct sock *,
+ struct sctp_association *, sctp_socket_type_t);

static int sctp_memory_pressure;
static atomic_long_t sctp_memory_allocated;
@@ -3929,7 +3929,11 @@ static struct sock *sctp_accept(struct sock *sk, int flags, int *err)
/* Populate the fields of the newsk from the oldsk and migrate the
* asoc to the newsk.
*/
- sctp_sock_migrate(sk, newsk, asoc, SCTP_SOCKET_TCP);
+ error = sctp_sock_migrate(sk, newsk, asoc, SCTP_SOCKET_TCP);
+ if (error) {
+ sk_common_release(newsk);
+ newsk = NULL;
+ }

out:
release_sock(sk);
@@ -4436,10 +4440,16 @@ int sctp_do_peeloff(struct sock *sk, sctp_assoc_t id, struct socket **sockp)
/* Populate the fields of the newsk from the oldsk and migrate the
* asoc to the newsk.
*/
- sctp_sock_migrate(sk, sock->sk, asoc, SCTP_SOCKET_UDP_HIGH_BANDWIDTH);
+ err = sctp_sock_migrate(sk, sock->sk, asoc,
+ SCTP_SOCKET_UDP_HIGH_BANDWIDTH);
+ if (err) {
+ sk_common_release(sock->sk);
+ goto out;
+ }

*sockp = sock;

+out:
return err;
}
EXPORT_SYMBOL(sctp_do_peeloff);
@@ -7217,9 +7227,9 @@ static inline void sctp_copy_descendant(struct sock *sk_to,
/* Populate the fields of the newsk from the oldsk and migrate the assoc
* and its messages to the newsk.
*/
-static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
- struct sctp_association *assoc,
- sctp_socket_type_t type)
+static int sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
+ struct sctp_association *assoc,
+ sctp_socket_type_t type)
{
struct sctp_sock *oldsp = sctp_sk(oldsk);
struct sctp_sock *newsp = sctp_sk(newsk);
@@ -7229,6 +7239,12 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
struct sctp_ulpevent *event;
struct sctp_bind_hashbucket *head;

+ /* We cannot migrate asocs that have skbs tied to it otherwise
+ * its destructor will update the wrong socket
+ */
+ if (assoc->sndbuf_used)
+ return -EBUSY;
+
/* Migrate socket buffer sizes and all the socket level options to the
* new socket.
*/
@@ -7343,6 +7359,8 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,

newsk->sk_state = SCTP_SS_ESTABLISHED;
release_sock(newsk);
+
+ return 0;
}


Vlad Yasevich

Jan 19, 2016, 9:19:44 AM
to Marcelo Ricardo Leitner, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Interesting... sctp_sock_migrate() accounts for this race in the
receive buffer, but not the send buffer.
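
For reference, the receive-buffer handling being referred to is, roughly,
this loop in sctp_sock_migrate() (quoted from memory, so treat it as a
sketch rather than the exact code of this tree):

	/* Walk the old socket's receive queue and move skbs that belong to
	 * the migrating asoc over to the new socket, fixing up rx ownership
	 * and accounting on the way.
	 */
	sctp_skb_for_each(skb, &oldsk->sk_receive_queue, tmp) {
		event = sctp_skb2event(skb);
		if (event->asoc == assoc) {
			__skb_unlink(skb, &oldsk->sk_receive_queue);
			__skb_queue_tail(&newsk->sk_receive_queue, skb);
			sctp_skb_set_owner_r_frag(skb, newsk);
		}
	}

Something equivalent would be needed for the chunks sitting in the tx queues.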

On the one hand I am not crazy about the connect-to-self scenario.
On the other, I think that to support it correctly, we should support
skb migrations for the send case just like we do for the receive case.

-vlad

Marcelo Ricardo Leitner

Jan 19, 2016, 10:59:08 AM
to Vlad Yasevich, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Yes, not thrilled here either about connect-to-self.

But there is a big difference in how the two work. For rx we can just look
for the wanted skbs in the rx queue, as they aren't going anywhere, but for
tx I don't think we can easily block the sctp_wfree() call, because that may
be happening on another CPU (or am I mistaken here? sctp still doesn't have
RFS, but even irqbalance could affect this AFAICT) and more than one skb
may be in transit at a time.

The locking for this on sctp_chunk would be pretty nasty, I think, and
normal usage, let's say, wouldn't benefit from it. Considering the
possible migration, as we can't trust chunk->asoc right away in
sctp_wfree(), the lock would reside in sctp_chunk and we would have to go
on taking locks one by one on the tx queue for the migration. Ugh ;)

Marcelo

Vlad Yasevich

Jan 19, 2016, 1:37:56 PM
to Marcelo Ricardo Leitner, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
The way it's done now, we wouldn't have to block sctp_wfree. Chunks are released under
lock when they are acked, so we are OK here. The tx completions will just put 1 byte back
to the socket associated with the tx'ed skb, and that should still be ok as
sctp_packet_release_owner will call sk_free().

> The lockings for this on sctp_chunk would be pretty nasty, I think, and normal usage lets
> say wouldn't be benefit from it. Considering the possible migration, as we can't trust
> chunk->asoc right away in sctp_wfree, the lock would reside in sctp_chunk and we would
> have to go on taking locks one by one on tx queue for the migration. Ugh ;)
>

No, the chunk manipulation is done under the socket lock, so I don't think we have to
worry about a per-chunk lock. We should be able to trust the chunk->asoc pointer always,
because each chunk holds a ref on the association. The only somewhat ugly thing
about moving tx chunks is that you have to potentially walk a lot of lists to move
things around. There are all the lists in the sctp_outqueue struct, plus the
per-transport retransmit list...

Even though the above seems to be a PITA, my main reason for recommending this is
that this can happen in normal situations too. Consider a very busy association that is
transferring a lot of data on a 1-to-many socket. The app decides to do a
peel-off, and we could now be stuck unable to peel off for quite a while
if there is a hiccup in the network and we have to rtx multiple times.
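
For anyone not familiar with peel-off: on a 1-to-many socket the application
can split one association out into its own 1-to-1 socket. A minimal userspace
sketch using the lksctp-tools helper (hypothetical sd/assoc_id, error handling
omitted; in the kernel this path goes through sctp_do_peeloff() and
sctp_sock_migrate(), the same migration discussed above):

#include <netinet/sctp.h>	/* lksctp-tools */

int peel_assoc(int sd, sctp_assoc_t assoc_id)
{
	/* sd is a one-to-many (SOCK_SEQPACKET) SCTP socket; assoc_id
	 * typically comes from an SCTP_ASSOC_CHANGE notification.
	 */
	int newsd = sctp_peeloff(sd, assoc_id);

	if (newsd < 0)
		return -1;
	/* newsd is now a one-to-one socket carrying just that association */
	return newsd;
}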

-vlad

> Marcelo
>

Marcelo Ricardo Leitner

Jan 19, 2016, 2:32:05 PM
to Vlad Yasevich, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Please let me rephrase it. I'm actually worried about the asoc->base.sk
part of the story and how it's fetched in sctp_wfree(). I think we can
update that sk pointer after sock_wfree() has fetched it but not yet used
it, possibly leading to accounting it twice, once during migration and
once in sock_wfree(). In sock_wfree() it will update some sk stats, like
sk->sk_wmem_alloc, among others.

That is, I don't see anything that would avoid that.

>> The lockings for this on sctp_chunk would be pretty nasty, I think, and normal usage lets
>> say wouldn't be benefit from it. Considering the possible migration, as we can't trust
>> chunk->asoc right away in sctp_wfree, the lock would reside in sctp_chunk and we would
>> have to go on taking locks one by one on tx queue for the migration. Ugh ;)
>>
>
> No, the chunks manipulation is done under the socket locket so I don't think we have to
> worry about a per chunk lock. We should be able to trust chunk->asoc pointer always
> because each chunk holds a ref on the association. The only somewhat ugly thing
> about moving tx chunks is that you have to potentially walk a lot of lists to move
> things around. There are all the lists in the sctp_outqueue struct, plus the
> per-transport retransmit list...

Agreed, no per-chunk lock needed; maybe just one to protect
sctp_ep_common.sk?

> Even though the above seems to be a PITA, my main reason for recommending this is
> that can happen in normal situations too. Consider a very busy association that is
> transferring a lot of a data on a 1-to-many socket. The app decides to move do a
> peel-off, and we could now be stuck not being able to peel-off for a quite a while
> if there is a hick-up in the network and we have to rtx multiple times.

Fair point.

Marcelo

Vlad Yasevich

Jan 19, 2016, 2:56:02 PM
to Marcelo Ricardo Leitner, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
sctp_wfree() is only used on skbs that were created as sctp chunks to be transmitted.
Right now, these skbs aren't actually submitted to the IP layer or to the NIC to be transmitted.
They are queued at the association level (either in transports or in the outqueue).
They are only freed during ACK processing.

The ACK processing happens under a socket lock and thus asoc->base.sk cannot move.

The migration process also happens under a socket lock. As a result, during migration
we are guaranteed the chunk queues remain consistent and that asoc->base.sk linkage
remains consistent. In fact, if you look at sctp_sock_migrate(), we lock both
sockets when we reassign assoc->base.sk, so we know both sockets are properly locked.

So, I am not sure that what you are worried about can happen. Please feel free to
double-check the above of course.

Thanks
-vlad

Marcelo Ricardo Leitner

Jan 19, 2016, 3:08:54 PM
to Vlad Yasevich, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Ohh, right. That makes sense. I'll rework the patch. Thanks Vlad.

Marcelo

Dmitry Vyukov

Feb 3, 2016, 11:13:45 AM
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, linux...@vger.kernel.org, Eric Dumazet, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi Marcelo,

Any updates on this? I still see the leak.

Marcelo Ricardo Leitner

Feb 4, 2016, 4:47:20 AM
to Dmitry Vyukov, Vlad Yasevich, netdev, linux...@vger.kernel.org, Eric Dumazet, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi Dmitry,

No, not yet, and I'll be out for 3 weeks starting Monday. So if I don't
get to it by Sunday, it will be a while, sorry.

Marcelo

Dmitry Vyukov

Mar 2, 2016, 3:57:11 AM
to Marcelo Ricardo Leitner, Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Still happens on 4.5-rc6.

Marcelo, try to apply my config (if yours differs), run the program in
a parallel loop and check /proc/slabinfo (or kmemleak).

Marcelo Ricardo Leitner

Mar 2, 2016, 2:42:21 PM
to Dmitry Vyukov, Vlad Yasevich, Neil Horman, David S. Miller, linux...@vger.kernel.org, netdev, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hi Dmitry, I'm just back from PTOs. Will get back to this asap.

Thanks,
Marcelo
