Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH] seqlock: fix raw_read_seqcount_latch()

0 views
Skip to first unread message

Alexey Dobriyan

unread,
May 21, 2016, 4:15:15 PM5/21/16
to ak...@linux-foundation.org, pet...@infradead.org, linux-...@vger.kernel.org
lockless_dereference() is supposed to take pointer not integer.

Signed-off-by: Alexey Dobriyan <adob...@gmail.com>
---

include/linux/seqlock.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -277,7 +277,7 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)

static inline int raw_read_seqcount_latch(seqcount_t *s)
{
- return lockless_dereference(s->sequence);
+ return lockless_dereference(s)->sequence;
}

/**
@@ -331,7 +331,7 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
* unsigned seq, idx;
*
* do {
- * seq = lockless_dereference(latch->seq);
+ * seq = lockless_dereference(latch)->seq;
*
* idx = seq & 0x01;
* entry = data_query(latch->data[idx], ...);

Peter Zijlstra

unread,
May 22, 2016, 6:48:37 AM5/22/16
to Alexey Dobriyan, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney
On Sat, May 21, 2016 at 11:14:49PM +0300, Alexey Dobriyan wrote:
> lockless_dereference() is supposed to take pointer not integer.

Urgh :/

Is there any way we can make lockless_dereference() issue a warning if
we don't feed it a pointer?

Would something like so work? All pointer types should silently cast to
void * while integer (and others) should refuse to.

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index b5ff9881bef8..8886de704d33 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -544,6 +544,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
*/
#define lockless_dereference(p) \
({ \
+ __maybe_unused void * _________p2 = p; \
typeof(p) _________p1 = READ_ONCE(p); \
smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_________p1); \

Alexey Dobriyan

unread,
May 22, 2016, 2:51:02 PM5/22/16
to Peter Zijlstra, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney
On Sun, May 22, 2016 at 12:48:27PM +0200, Peter Zijlstra wrote:
> On Sat, May 21, 2016 at 11:14:49PM +0300, Alexey Dobriyan wrote:
> > lockless_dereference() is supposed to take pointer not integer.
>
> Urgh :/
>
> Is there any way we can make lockless_dereference() issue a warning if
> we don't feed it a pointer?
>
> Would something like so work? All pointer types should silently cast to
> void * while integer (and others) should refuse to.

This works (and spammy enough in case of seqlock, which is good)
but not for "unsigned long":

include/linux/percpu-refcount.h:146:36: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
percpu_ptr = lockless_dereference(ref->percpu_count_ptr);

Peter Zijlstra

unread,
May 23, 2016, 5:36:34 AM5/23/16
to Alexey Dobriyan, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney, Tejun Heo
On Sun, May 22, 2016 at 09:50:40PM +0300, Alexey Dobriyan wrote:
> On Sun, May 22, 2016 at 12:48:27PM +0200, Peter Zijlstra wrote:
> > On Sat, May 21, 2016 at 11:14:49PM +0300, Alexey Dobriyan wrote:
> > > lockless_dereference() is supposed to take pointer not integer.
> >
> > Urgh :/
> >
> > Is there any way we can make lockless_dereference() issue a warning if
> > we don't feed it a pointer?
> >
> > Would something like so work? All pointer types should silently cast to
> > void * while integer (and others) should refuse to.
>
> This works (and spammy enough in case of seqlock, which is good)
> but not for "unsigned long":
>
> include/linux/percpu-refcount.h:146:36: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
> percpu_ptr = lockless_dereference(ref->percpu_count_ptr);

TJ; would you prefer casting or not using lockless_dereference() here?

Tejun Heo

unread,
May 25, 2016, 3:57:55 PM5/25/16
to Peter Zijlstra, Alexey Dobriyan, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney
Hello,

On Mon, May 23, 2016 at 11:36:18AM +0200, Peter Zijlstra wrote:
> > include/linux/percpu-refcount.h:146:36: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
> > percpu_ptr = lockless_dereference(ref->percpu_count_ptr);
>
> TJ; would you prefer casting or not using lockless_dereference() here?

Casting is nasty - *(unsigned long __percpu **)& - because the macro
expects an lvalue. I think it'd be better to revert to opencoding
READ_ONCE() and barrier there. It's a pretty special case anyway.

Thanks.

--
tejun

Tejun Heo

unread,
May 25, 2016, 4:12:12 PM5/25/16
to Peter Zijlstra, Alexey Dobriyan, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney, Pranith Kumar, kerne...@fb.com
lockless_dereference() is planned to grow a sanity check to ensure
that the input parameter is a pointer. __ref_is_percpu() passes in an
unsinged long value which is a combination of a pointer and a flag.
While it can be casted to a pointer lvalue, the casting looks messy
and it's a special case anyway. Let's revert back to open-coding
READ_ONCE() and explicit barrier.

This doesn't cause any functional changes.

Signed-off-by: Tejun Heo <t...@kernel.org>
Link: http://lkml.kernel.org/g/20160522185...@p183.telecom.by
Cc: Pranith Kumar <bobby...@gmail.com>
Cc: Alexey Dobriyan <adob...@gmail.com>
Cc: Peter Zijlstra <pet...@infradead.org>
---
So, something like this. Please feel free to include in the series.

Thanks.

include/linux/percpu-refcount.h | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 84f542d..1c7eec0 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -136,14 +136,12 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
* used as a pointer. If the compiler generates a separate fetch
* when using it as a pointer, __PERCPU_REF_ATOMIC may be set in
* between contaminating the pointer value, meaning that
- * ACCESS_ONCE() is required when fetching it.
- *
- * Also, we need a data dependency barrier to be paired with
- * smp_store_release() in __percpu_ref_switch_to_percpu().
- *
- * Use lockless deref which contains both.
+ * READ_ONCE() is required when fetching it.
*/
- percpu_ptr = lockless_dereference(ref->percpu_count_ptr);
+ percpu_ptr = READ_ONCE(ref->percpu_count_ptr);
+
+ /* paired with smp_store_release() in __percpu_ref_switch_to_percpu() */
+ smp_read_barrier_depends();

/*
* Theoretically, the following could test just ATOMIC; however,

Peter Zijlstra

unread,
May 27, 2016, 7:11:28 AM5/27/16
to Alexey Dobriyan, ak...@linux-foundation.org, linux-...@vger.kernel.org, Paul McKenney
On Sat, May 21, 2016 at 11:14:49PM +0300, Alexey Dobriyan wrote:
So while the code was dubious; I it is now wrong, but my head hurts.

I'll queue the below, TJs per-cpu change and the lockless_dereference()
void * cast trick.


---
Subject: seqcount: Re-fix raw_read_seqcount_latch()

Commit 50755bc1c305 ("seqlock: fix raw_read_seqcount_latch()") broke
raw_read_seqcount_latch().

If you look at the comment that was modified; the thing that changes is
the seq count, not the latch pointer.

* void latch_modify(struct latch_struct *latch, ...)
* {
* smp_wmb(); <- Ensure that the last data[1] update is visible
* latch->seq++;
* smp_wmb(); <- Ensure that the seqcount update is visible
*
* modify(latch->data[0], ...);
*
* smp_wmb(); <- Ensure that the data[0] update is visible
* latch->seq++;
* smp_wmb(); <- Ensure that the seqcount update is visible
*
* modify(latch->data[1], ...);
* }
*
* The query will have a form like:
*
* struct entry *latch_query(struct latch_struct *latch, ...)
* {
* struct entry *entry;
* unsigned seq, idx;
*
* do {
* seq = lockless_dereference(latch->seq);

So here we have:

seq = READ_ONCE(latch->seq);
smp_read_barrier_depends();

Which is exactly what we want; the new code:

seq = ({ p = READ_ONCE(latch);
smp_read_barrier_depends(); p })->seq;

is just wrong; because it looses the volatile read on seq, which can now
be torn or worse 'optimized'. And the read_depend barrier is also placed
wrong, we want it after the load of seq, to match the above data[]
up-to-date wmb()s.

Such that when we dereference latch->data[] below, we're guaranteed to
observe the right data.

*
* idx = seq & 0x01;
* entry = data_query(latch->data[idx], ...);
*
* smp_rmb();
* } while (seq != latch->seq);
*
* return entry;
* }

So yes, not passing a pointer is not pretty, but the code was correct,
and isn't anymore now.

Change to explicit READ_ONCE()+smp_read_barrier_depends() to avoid
confusion and allow strict lockless_dereference() checking.

Fixes: 50755bc1c305 ("seqlock: fix raw_read_seqcount_latch()")
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
include/linux/seqlock.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 7973a821ac58..f3db247cebc8 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -277,7 +277,9 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)

static inline int raw_read_seqcount_latch(seqcount_t *s)
{
- return lockless_dereference(s)->sequence;
+ int seq = READ_ONCE(s->sequence);
+ smp_read_barrier_depends();
+ return seq;
}

/**
@@ -331,7 +333,7 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
* unsigned seq, idx;
*
* do {
- * seq = lockless_dereference(latch)->seq;
+ * seq = raw_read_seqcount_latch(&latch->seq);

tip-bot for Peter Zijlstra

unread,
Jun 3, 2016, 6:47:13 AM6/3/16
to linux-ti...@vger.kernel.org, pet...@infradead.org, adob...@gmail.com, h...@zytor.com, pau...@linux.vnet.ibm.com, linux-...@vger.kernel.org, torv...@linux-foundation.org, tg...@linutronix.de, ak...@linux-foundation.org, mi...@kernel.org
Commit-ID: 55eed755c6e30a89be3a791a6b0ad208aadd9bdc
Gitweb: http://git.kernel.org/tip/55eed755c6e30a89be3a791a6b0ad208aadd9bdc
Author: Peter Zijlstra <pet...@infradead.org>
AuthorDate: Fri, 27 May 2016 13:11:17 +0200
Committer: Ingo Molnar <mi...@kernel.org>
CommitDate: Fri, 3 Jun 2016 08:37:25 +0200

locking/seqcount: Re-fix raw_read_seqcount_latch()
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexey Dobriyan <adob...@gmail.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Paul E. McKenney <pau...@linux.vnet.ibm.com>
Cc: Paul McKenney <pau...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <tg...@linutronix.de>
Fixes: 50755bc1c305 ("seqlock: fix raw_read_seqcount_latch()")
Link: http://lkml.kernel.org/r/2016052711...@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
include/linux/seqlock.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 7973a82..ead9765 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -277,7 +277,10 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)

static inline int raw_read_seqcount_latch(seqcount_t *s)
{
- return lockless_dereference(s)->sequence;
+ int seq = READ_ONCE(s->sequence);
+ /* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
+ smp_read_barrier_depends();
+ return seq;
}

/**
@@ -331,7 +334,7 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)

tip-bot for Peter Zijlstra

unread,
Jun 3, 2016, 6:58:52 AM6/3/16
to linux-ti...@vger.kernel.org, pau...@linux.vnet.ibm.com, ak...@linux-foundation.org, torv...@linux-foundation.org, adob...@gmail.com, tg...@linutronix.de, linux-...@vger.kernel.org, mi...@kernel.org, h...@zytor.com, pet...@infradead.org
Commit-ID: 25841ee0e9d2a1d952828138416701f20ea831eb
Gitweb: http://git.kernel.org/tip/25841ee0e9d2a1d952828138416701f20ea831eb
Author: Peter Zijlstra <pet...@infradead.org>
AuthorDate: Sun, 22 May 2016 12:48:27 +0200
Committer: Ingo Molnar <mi...@kernel.org>
CommitDate: Fri, 3 Jun 2016 12:06:11 +0200

locking/barriers: Validate lockless_dereference() is used on a pointer type

Add a cast to void * to validate the argument @p is indeed a pointer.

Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexey Dobriyan <adob...@gmail.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Paul E. McKenney <pau...@linux.vnet.ibm.com>
Cc: Paul McKenney <pau...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <tg...@linutronix.de>
Link: http://lkml.kernel.org/r/2016052210...@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
include/linux/compiler.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 793c082..e9c6417 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -545,9 +545,13 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
* Similar to rcu_dereference(), but for situations where the pointed-to
* object's lifetime is managed by something other than RCU. That
* "something other" might be reference counting or simple immortality.
+ *
+ * The seemingly unused void * variable is to validate @p is indeed a pointer
+ * type. All pointer types silently cast to void *.
*/
#define lockless_dereference(p) \
({ \
+ __maybe_unused const void * const _________p2 = p; \

Mateusz Guzik

unread,
Jun 6, 2016, 5:31:58 PM6/6/16
to Peter Zijlstra, linux-ti...@vger.kernel.org, pau...@linux.vnet.ibm.com, ak...@linux-foundation.org, torv...@linux-foundation.org, adob...@gmail.com, tg...@linutronix.de, linux-...@vger.kernel.org, mi...@kernel.org, h...@zytor.com, pet...@infradead.org
This causes issues, e.g.:
BUG: KASAN: user-memory-access on address 000060ff95001931^M
Read of size 1 by task NetworkManager/897^M
CPU: 1 PID: 897 Comm: NetworkManager Not tainted 4.7.0-rc1dupa+ #355^M
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011^M
ffff88003a1f8040 0000000017d08b23 ffff880033aff4b0 ffffffff8187b2f8^M
ffff88005b44f2c8 ffff880033aff548 ffff880033aff538 ffffffff813506a6^M
ffffffff81167e5esystemd[1]: sshd.service: Forked /usr/sbin/sshd as 904^M
^M
ffffed000675fe9d 0000000000000286 ffff880033aff588^M
Call Trace:^M
[<ffffffff8187b2f8>] dump_stack+0x85/0xcd^M
[<ffffffff813506a6>] kasan_report_error+0x456/0x560^M
[<ffffffff81167e5e>] ? vprintk_default+0x3e/0x60^M
[<ffffffff812a19e3>] ? printk+0xa8/0xd8^M
[<ffffffff812a193b>] ? power_down+0xa9/0xa9^M
[<ffffffff81350d48>] kasan_report+0x58/0x60^M
[<ffffffff81e84fe5>] ? leaf_walk_rcu+0x235/0x2d0^M
[<ffffffff8134f447>] __asan_load1+0x47/0x50^M
[<ffffffff81e84fe5>] leaf_walk_rcu+0x235/0x2d0^M
[snip]

The reason is that leaf_walk_rcu does get_child_rcu(pn, cindex++), which
ends up in lockless_dereference as pn[cindex++], which is now evaluated
twice.

The simplest fix I see does the assignment later, that is:
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e9c6417..06f27fd 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -551,8 +551,8 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
*/
#define lockless_dereference(p) \
({ \
- __maybe_unused const void * const _________p2 = p; \
typeof(p) _________p1 = READ_ONCE(p); \
+ __maybe_unused const void * const _________p2 = _________p1; \
smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_________p1); \
})

--
Mateusz Guzik

Peter Zijlstra

unread,
Jun 7, 2016, 3:10:41 AM6/7/16
to Mateusz Guzik, linux-ti...@vger.kernel.org, pau...@linux.vnet.ibm.com, ak...@linux-foundation.org, torv...@linux-foundation.org, adob...@gmail.com, tg...@linutronix.de, linux-...@vger.kernel.org, mi...@kernel.org, h...@zytor.com
On Mon, Jun 06, 2016 at 11:31:39PM +0200, Mateusz Guzik wrote:
> The reason is that leaf_walk_rcu does get_child_rcu(pn, cindex++), which
> ends up in lockless_dereference as pn[cindex++], which is now evaluated
> twice.

Indeed, bad that; yes I think the below will work, will you send a
proper patch so I can apply?

tip-bot for Peter Zijlstra

unread,
Jun 8, 2016, 10:20:31 AM6/8/16
to linux-ti...@vger.kernel.org, tg...@linutronix.de, pet...@infradead.org, ak...@linux-foundation.org, h...@zytor.com, pau...@linux.vnet.ibm.com, adob...@gmail.com, mi...@kernel.org, torv...@linux-foundation.org, linux-...@vger.kernel.org
Commit-ID: 331b6d8c7afc2e5b900b9dcd850c265e1ba8d8e7
Gitweb: http://git.kernel.org/tip/331b6d8c7afc2e5b900b9dcd850c265e1ba8d8e7
Author: Peter Zijlstra <pet...@infradead.org>
AuthorDate: Sun, 22 May 2016 12:48:27 +0200
Committer: Ingo Molnar <mi...@kernel.org>
CommitDate: Wed, 8 Jun 2016 14:22:47 +0200

locking/barriers: Validate lockless_dereference() is used on a pointer type

Use the type to validate the argument @p is indeed a pointer type.

Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexey Dobriyan <adob...@gmail.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Paul E. McKenney <pau...@linux.vnet.ibm.com>
Cc: Paul McKenney <pau...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <tg...@linutronix.de>
Link: http://lkml.kernel.org/r/2016052210...@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
include/linux/compiler.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 793c082..06f27fd 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -545,10 +545,14 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
* Similar to rcu_dereference(), but for situations where the pointed-to
* object's lifetime is managed by something other than RCU. That
* "something other" might be reference counting or simple immortality.
+ *
+ * The seemingly unused void * variable is to validate @p is indeed a pointer
+ * type. All pointer types silently cast to void *.
*/
#define lockless_dereference(p) \
({ \
typeof(p) _________p1 = READ_ONCE(p); \
+ __maybe_unused const void * const _________p2 = _________p1; \
smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_________p1); \
})
0 new messages