[PATCH] kunit: Fix race condition in try-catch completion


David Gow

Apr 11, 2024, 10:59:11 PM
to Rae Moar, Kees Cook, Mickaël Salaün, Naresh Kamboju, Shuah Khan, David Gow, Will Deacon, Dan Carpenter, Guenter Roeck, linux-k...@vger.kernel.org, kuni...@googlegroups.com, linux-...@vger.kernel.org, Brendan Higgins, Linux Kernel Functional Testing
KUnit's try-catch infrastructure now uses vfork_done, which is always
set to a valid completion when a kthread is created, but which is set to
NULL once the thread terminates. This creates a race condition, where
the kthread exits before we can wait on it.

Keep a copy of vfork_done, which is taken before we wake_up_process()
and so valid, and wait on that instead.

Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
Reported-by: Linux Kernel Functional Testing <lk...@linaro.org>
Closes: https://lore.kernel.org/lkml/20240410102710.359...@linaro.org/
Tested-by: Linux Kernel Functional Testing <lk...@linaro.org>
Acked-by: Mickaël Salaün <m...@digikod.net>
Signed-off-by: David Gow <davi...@google.com>
---
lib/kunit/try-catch.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index fa687278ccc9..6bbe0025b079 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
 {
 	struct kunit *test = try_catch->test;
 	struct task_struct *task_struct;
+	struct completion *task_done;
 	int exit_code, time_remaining;
 
 	try_catch->context = context;
@@ -75,13 +76,16 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
 		return;
 	}
 	get_task_struct(task_struct);
-	wake_up_process(task_struct);
 	/*
 	 * As for a vfork(2), task_struct->vfork_done (pointing to the
 	 * underlying kthread->exited) can be used to wait for the end of a
-	 * kernel thread.
+	 * kernel thread. It is set to NULL when the thread exits, so we
+	 * keep a copy here.
 	 */
-	time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
+	task_done = task_struct->vfork_done;
+	wake_up_process(task_struct);
+
+	time_remaining = wait_for_completion_timeout(task_done,
 						     kunit_test_timeout());
 	if (time_remaining == 0) {
 		try_catch->try_result = -ETIMEDOUT;
--
2.44.0.683.g7961c838ac-goog
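
For anyone who wants to see the ordering in isolation, here is a rough userspace analogue of the fix using pthreads. It is only a sketch and not kernel code: struct completion, struct task, worker() and run_and_wait() below are hypothetical stand-ins written for illustration, pthread_create() plays the role of wake_up_process(), and the timeout handling is left out. The point it tries to show is that the pointer is copied before the worker can possibly clear it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical stand-in for task_struct, reduced to the relevant fields. */
struct task {
	struct completion exited;      /* plays the role of kthread->exited */
	struct completion *vfork_done; /* cleared by the worker when it exits */
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Worker: clears the shared pointer on exit, then signals the completion. */
static void *worker(void *arg)
{
	struct task *t = arg;
	struct completion *done = t->vfork_done;

	t->vfork_done = NULL;
	complete(done);
	return NULL;
}

/* Returns 0 once the worker has finished, -1 if it could not be started. */
int run_and_wait(struct task *t)
{
	pthread_t thread;
	struct completion *task_done;

	init_completion(&t->exited);
	t->vfork_done = &t->exited;

	/* Take the copy while the worker cannot have run yet. */
	task_done = t->vfork_done;
	assert(task_done != NULL);

	/* Stand-in for wake_up_process(): the worker may finish at any time now. */
	if (pthread_create(&thread, NULL, worker, t) != 0)
		return -1;

	/* Waiting on t->vfork_done here instead could observe NULL. */
	wait_for_completion(task_done);
	pthread_join(thread, NULL);

	pthread_cond_destroy(&t->exited.cond);
	pthread_mutex_destroy(&t->exited.lock);
	return 0;
}
```

Calling run_and_wait() on a struct task should return 0 and leave vfork_done set to NULL, which is exactly why the caller must not re-read that field after starting the worker.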

Rae Moar

Apr 12, 2024, 5:32:35 PM
to David Gow, Kees Cook, Mickaël Salaün, Naresh Kamboju, Shuah Khan, Will Deacon, Dan Carpenter, Guenter Roeck, linux-k...@vger.kernel.org, kuni...@googlegroups.com, linux-...@vger.kernel.org, Brendan Higgins, Linux Kernel Functional Testing
On Thu, Apr 11, 2024 at 10:59 PM David Gow <davi...@google.com> wrote:
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is created, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <lk...@linaro.org>
> Closes: https://lore.kernel.org/lkml/20240410102710.359...@linaro.org/
> Tested-by: Linux Kernel Functional Testing <lk...@linaro.org>
> Acked-by: Mickaël Salaün <m...@digikod.net>
> Signed-off-by: David Gow <davi...@google.com>

Hello,

This fix looks good to me. I have tested it and, apart from the fortify
test error discussed in the previous patch series, I am happy with it.

Thanks!
-Rae

Reviewed-by: Rae Moar <rm...@google.com>

Miguel Ojeda

Apr 13, 2024, 5:05:25 PM
to davi...@google.com, brendan...@linux.dev, dan.ca...@linaro.org, kees...@chromium.org, kuni...@googlegroups.com, linux-...@vger.kernel.org, linux-k...@vger.kernel.org, li...@roeck-us.net, lk...@linaro.org, m...@digikod.net, naresh....@linaro.org, rm...@google.com, sk...@linuxfoundation.org, wi...@kernel.org, Miguel Ojeda
On Thu, Apr 11, 2024 at 10:59 PM David Gow <davi...@google.com> wrote:
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is created, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <lk...@linaro.org>
> Closes: https://lore.kernel.org/lkml/20240410102710.359...@linaro.org/
> Tested-by: Linux Kernel Functional Testing <lk...@linaro.org>
> Acked-by: Mickaël Salaün <m...@digikod.net>
> Signed-off-by: David Gow <davi...@google.com>

I noticed it with the Rust tests too, and indeed this fixed it:

Tested-by: Miguel Ojeda <oj...@kernel.org>

Thanks!

Cheers,
Miguel