On Fri, Apr 26, 2019 at 6:20 AM Lukas Bulwahn <
lukas....@gmail.com> wrote:
>
> Dear all,
>
> I have been applying the clang thread static analyser on the kernel,
> and after around hundreds of annotations came around this code in
> v5.1-rc5:
>
> drivers/md/md.c:3393:3: warning: releasing mutex
> 'mddev->reconfig_mutex' that was not held [-Wthread-safety-analysis]
> mddev_unlock(mddev);
> ^
Nice find!
> This looks as follows:
>
> rv = mddev ? mddev_lock(mddev): -EBUSY;
> if (!rv) {
> if (rdev->mddev == NULL)
> rv = -EBUSY;
> else
> rv = entry->store(rdev, page, length);
> mddev_unlock(mddev);
> }
> return rv;
> }
>
>
> If my brain is not completely messed up after looking at so much
> locking code, there should be an issue here:
>
> Assume rdev->mddev==NULL (or by aliasing the local mddev==NULL).
> Then rv==-EBUSY
> Then, we take the then branch in if(!rv), check again (rdev->mddev ==
> NULL) and set rv (that is redundant at this point), but then we unlock
> where we never locked.
>
> Is this a bug or where my thinking mistake?
Generally speaking, try to look at the git history 'git log -p $FILE'
to see how this came about, and you'll find commit 27c529bb8e90
("md: lock access to rdev attributes properly") that explains the
intention.
Adding Neil Brown to Cc as the author of that commit.
The idea here is that rdev->mddev can change underneath us
from another thread, and we take a snapshot of it at the start of the
function. If it's non-NULL, we take the lock that prevents it from
becoming NULL, and then check if it has changed in the meantime.
Finally, once we know that rdev is valid and remains valid, we
call the store() function, and then unlock it again. It is probably
a valid assumption that rdev->mmdev can change from
a valid pointer to a NULL pointer, but not to another pointer,
which would break this spectacularly.
What I think the thread analyser has found is the aliasing
between mddev and rdev->mddev that allows the compiler to
eliminate the 'mddev' variable and dereference rdev->mddev
twice, leading to a race of
A B
if (rdev->mddev)
rdev->mddev = NULL
mutex_lock(&rdev->mddev->reconfig_mutex)
We could add a spinlock around the access to the rdev->mddev
pointer here, or possibly just use a READ_ONCE/WRITE_ONCE
pair like
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 05ffffb8b769..495b3e4377d5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2259,7 +2259,7 @@ static void unbind_rdev_from_array(struct md_rdev *rdev)
bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
list_del_rcu(&rdev->same_set);
pr_debug("md: unbind<%s>\n", bdevname(rdev->bdev,b));
- rdev->mddev = NULL;
+ WRITE_ONCE(rdev->mddev, NULL);
sysfs_remove_link(&rdev->kobj, "block");
sysfs_put(rdev->sysfs_state);
rdev->sysfs_state = NULL;
@@ -3378,7 +3378,7 @@ rdev_attr_store(struct kobject *kobj, struct
attribute *attr,
struct rdev_sysfs_entry *entry = container_of(attr, struct
rdev_sysfs_entry, attr);
struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
ssize_t rv;
- struct mddev *mddev = rdev->mddev;
+ struct mddev *mddev = READ_ONCE(rdev->mddev);
if (!entry->store)
return -EIO;
I have not looked at the code to see if that is sufficient to fix the race,
but I assume we need at least that.
Arnd