Hello,
syzbot found the following crash on:
HEAD commit: e195ca6cb6f2 Merge branch 'for-linus' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10d3e6a3400000
From the console output:
20:36:14 executing program 4: semget$private(0x12000000, 0x39d0, 0x0)
I don't understand the 0x12000000.
What does that mean? What is the actual syscall?
Is 0x39d0 the number of semaphores in the array, i.e. create
~14,800 semaphores?
unlink_queue() means: transfer all waiting threads to the wake_q.
There are 2*(1+<semaphores in the array>) linked lists in
an array.
And this fails, because one linked list contains 0x100000 instead
of a real pointer.
I could not find any semop() in the log --> all lists must be empty.
Actually, the lists were initialized in newary(), and then never
touched.
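
As a rough sanity check of the numbers above, here is a minimal Python sketch. It assumes that 0x39d0 really is the semaphore count (which is exactly the open question) and takes the 2*(1+n) formula at face value. The function name is made up for illustration.

```python
def pending_list_count(nsems: int) -> int:
    """Number of linked lists per the formula 2*(1+<semaphores in the array>)."""
    return 2 * (1 + nsems)

nsems = 0x39d0  # second argument of semget$private() in the log line
print(nsems)                      # 14800
print(pending_list_count(nsems))  # 29602
```

So, if the reading of the arguments is right, a single array would carry close to 30,000 list heads, all of which should still be empty.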
Free a semaphore array:
freeary+0xbd1/0x1a40 ipc/sem.c:1160
free_ipcs+0x9f/0x1c0 ipc/namespace.c:112
sem_exit_ns+0x20/0x40 ipc/sem.c:237
free_ipc_ns ipc/namespace.c:120 [inline]
put_ipc_ns+0x66/0x180 ipc/namespace.c:152
free_nsproxy+0xcf/0x220 kernel/nsproxy.c:180
switch_task_namespaces+0xb3/0xd0 kernel/nsproxy.c:229
exit_task_namespaces+0x17/0x20 kernel/nsproxy.c:234
do_exit+0x1ad1/0x26d0 kernel/exit.c:866
do_group_exit+0x177/0x440 kernel/exit.c:970
get_signal+0x8b0/0x1980 kernel/signal.c:2517
do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
exit_to_usermode_loop+0x2e5/0x380 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x410fa0
This is timestamp 604.599748 in the console output:
[ 604.599748] RIP: 0033:0x410fa0
Questions:
1) What is this?
[ 600.924691] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Isn't this a kernel stack overrun?
[ 600.929872] RIP: 0033:0x7f3e597d0120
[ 600.933576] Code: Bad RIP value.
[ 600.936920] RSP: 002b:00007ffc2d83e008 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[ 600.944608] RAX: ffffffffffffffda RBX: 000055ca2995b436 RCX: 00007f3e597d0120
[ 600.951856] RDX: 00007ffc2d83e244 RSI: 0000000000080000 RDI: 00007ffc2d83e220
[ 600.959107] RBP: 000055ca2995b1e0 R08: 0000000000000000 R09: 000055ca2995b099
[ 600.966355] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[ 600.973628] R13: 000055ca2995b090 R14: 000055ca2995b190 R15: 00007ffc2d83e220
RSP: 0x..83e008. Assuming 8 kB kernel stack, and 8 kB alignment, we have used up everything.
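
The arithmetic behind that estimate, as a minimal sketch: under the assumed 8 kB size and 8 kB alignment, the low 13 bits of RSP give the distance from the bottom of the stack block. The names are illustrative, and this only reproduces the calculation; it does not settle whether the 8 kB assumption holds for this kernel or which stack the value belongs to.

```python
STACK_SIZE = 8 * 1024  # assumption from the text: 8 kB stack, 8 kB aligned

def offset_in_stack(rsp: int, stack_size: int = STACK_SIZE) -> int:
    """Offset of rsp above the base of its stack_size-aligned block."""
    return rsp & (stack_size - 1)

rsp = 0x00007ffc2d83e008  # RSP value from the register dump
print(offset_in_stack(rsp))  # 8
```

With a stack that grows downwards, an offset of 8 bytes above the base is what "used up everything" refers to.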
2) How/where are namespaces used by the bot?
I.e. what triggered the namespace exit?
3) There are ~370 calls to semget(), most with a large number (>10,000) of semaphores in the array.
Starting from [442.544635], the OOM killer starts to kill processes.
Is this as intended?
[ 433.304586] FAULT_INJECTION: forcing a failure.
[ 433.304586] name fail_page_alloc, interval 1, probability 0, space 0, times 0
[ 433.316471] CPU: 1 PID: 19653 Comm: syz-executor4 Not tainted 4.20.0-rc3+ #348
[ 433.323841] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
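
To re-check counts like the ~370 semget() calls against the console log, something along these lines could be used. It is only a sketch: it assumes every call is logged in the same form as the semget$private() line quoted at the top, and that the second argument is the semaphore count. The regular expression and function name are my own, not part of any syzkaller tooling.

```python
import re

# Matches calls such as: semget$private(0x12000000, 0x39d0, 0x0)
SEMGET_RE = re.compile(
    r"semget(?:\$\w+)?\((0x[0-9a-fA-F]+|\d+),\s*(0x[0-9a-fA-F]+|\d+),"
)

def semget_sizes(log_text: str) -> list:
    """Return the second argument (presumed nsems) of every semget call found."""
    return [int(m.group(2), 0) for m in SEMGET_RE.finditer(log_text)]

sample = "20:36:14 executing program 4: semget$private(0x12000000, 0x39d0, 0x0)\n"
sizes = semget_sizes(sample)
large = [n for n in sizes if n > 10_000]
print(len(sizes), len(large))  # 1 1
```

Feeding the full log.txt into semget_sizes() instead of the one-line sample would give the total number of calls and how many of them ask for more than 10,000 semaphores.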
I need some more background, then I can review the code.
Right now, I would put it into my "unknown syzkaller finding" folder.
--
Manfred