Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH RFC 0/7] rcu: v2 Inlinable preemptible rcu_read_lock() and rcu_read_unlock()

40 views
Skip to first unread message

Paul E. McKenney

unread,
Apr 14, 2012, 12:20:25 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org
Hello!

This series is version two of the inlinable versions of preemptible
RCU's __rcu_read_lock() and __rcu_read_unlock(). The first version may
be found at https://lkml.org/lkml/2012/3/25/94. The individual commits
in this new series are as follows:

1. Move preemptible RCU's hook in the scheduler from the common
RCU scheduler-entry hook to just before the scheduler's call
to switch_to. This reduces overhead in the case where the
scheduler is called but does not switch and also sets the
stage for saving and restoring the per-CPU variables needed
for inlining.

2. Create the per-CPU variables and rename rcu_read_unlock_special()
to avoid name conflict.

3. Make exit_rcu() use a more precise method of checking the need
for exit-time RCU-related cleanup, and consolidate the two
identical versions of exit_rcu() into one place.

4. Make __rcu_read_lock() and __rcu_read_unlock() use the per-CPU
variables, but leave them out of line for the moment. This
requires adding a second preemptible-RCU hook in the scheduler
to restore the values of the per-CPU variables.

5. Silence bogus copy_to_user() build errors that seem to be triggered
by differences in gcc's inlining decisions when __rcu_read_lock()
becomes inlinable. Apparently, copy_to_user() needs to be inlined
in order to function correctly? Hmmm, sort of like kfree_rcu().

6. Inline __rcu_read_lock().

7. Inline __rcu_read_unlock().

With these changes, the 32-bit x86 gcc compiler compiles this:

void rcu_read_lock_code(void)
{
rcu_read_lock();
}

to this:

000000d0 <rcu_read_lock_code>:
d0: 64 ff 05 00 00 00 00 incl %fs:0x0
d7: c3 ret
d8: 90 nop
d9: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi

It also compiles this:

void rcu_read_unlock_code(void)
{
rcu_read_unlock();
}

to this:

000000e0 <rcu_read_unlock_code>:
e0: 64 a1 00 00 00 00 mov %fs:0x0,%eax
e6: 83 f8 01 cmp $0x1,%eax
e9: 74 0d je f8 <rcu_read_unlock_code+0x18>
eb: 64 ff 0d 00 00 00 00 decl %fs:0x0
f2: c3 ret
f3: 90 nop
f4: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
f8: 64 c7 05 00 00 00 00 movl $0x80000000,%fs:0x0
ff: 00 00 00 80
103: 64 a1 00 00 00 00 mov %fs:0x0,%eax
109: 85 c0 test %eax,%eax
10b: 75 0c jne 119 <rcu_read_unlock_code+0x39>
10d: 64 c7 05 00 00 00 00 movl $0x0,%fs:0x0
114: 00 00 00 00
118: c3 ret
119: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
120: e8 fc ff ff ff call 121 <rcu_read_unlock_code+0x41>
125: eb e6 jmp 10d <rcu_read_unlock_code+0x2d>

It is therefore not at all clear to me that the final patch in this
series is worthwhile. Unless someone comes up with a good reason to
keep it, I will drop it. The only possible justification I can see is
that gcc could (in theory, anyway) drop dead code in the case of nested
RCU read-side critical sections (everything from address f3 onwards),
but this just doesn't cut it for me at the moment. I could also imagine
having the inlined portion contain only the nesting check and decrement,
along with a call to an out-of-line function that does the rest, but
this looks to me to bloat the code for no good reason.

Thoughts?

Thanx, Paul

arch/um/drivers/mconsole_kern.c | 3
b/arch/um/drivers/mconsole_kern.c | 1
b/fs/binfmt_misc.c | 4 -
b/include/linux/init_task.h | 4 -
b/include/linux/rcupdate.h | 1
b/include/linux/rcutiny.h | 6 -
b/include/linux/rcutree.h | 12 ---
b/include/linux/sched.h | 10 +++
b/kernel/rcu.h | 4 +
b/kernel/rcupdate.c | 5 +
b/kernel/rcutiny_plugin.h | 10 +--
b/kernel/rcutree.c | 1
b/kernel/rcutree.h | 1
b/kernel/rcutree_plugin.h | 14 ----
b/kernel/sched/core.c | 1
include/linux/rcupdate.h | 72 ++++++++++++++++++++-
include/linux/rcutiny.h | 5 -
include/linux/sched.h | 92 +++++++++++++++++++++++++--
kernel/rcu.h | 4 -
kernel/rcupdate.c | 126 ++++++++++++++++++++++----------------
kernel/rcutiny_plugin.h | 123 +++++++------------------------------
kernel/rcutree_plugin.h | 114 ++++++++--------------------------
kernel/sched/core.c | 3
23 files changed, 321 insertions(+), 295 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Paul E. McKenney

unread,
Apr 14, 2012, 12:21:07 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@polymtl.ca, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org, Paul E. McKenney, Paul E. McKenney
From: "Paul E. McKenney" <paul.m...@linaro.org>

Move __rcu_read_unlock() from kernel/rcupdate.c to
include/linux/rcupdate.h, allowing the compiler to inline it.

Suggested-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paul.m...@linaro.org>
Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
---
include/linux/rcupdate.h | 35 ++++++++++++++++++++++++++++++++++-
kernel/rcu.h | 4 ----
kernel/rcupdate.c | 33 ---------------------------------
kernel/rcutiny_plugin.h | 1 +
kernel/rcutree_plugin.h | 1 +
5 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9967b2b..8113505 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -162,7 +162,40 @@ static inline void __rcu_read_lock(void)
barrier(); /* Keep code within RCU read-side critical section. */
}

-extern void __rcu_read_unlock(void);
+extern void rcu_read_unlock_do_special(void);
+
+/*
+ * Tree-preemptible RCU implementation for rcu_read_unlock().
+ * Decrement rcu_read_lock_nesting. If the result is zero (outermost
+ * rcu_read_unlock()) and rcu_read_unlock_special is non-zero, then
+ * invoke rcu_read_unlock_do_special() to clean up after a context switch
+ * in an RCU read-side critical section and other special cases.
+ * Set rcu_read_lock_nesting to a large negative value during cleanup
+ * in order to ensure that if rcu_read_unlock_special is non-zero, then
+ * rcu_read_lock_nesting is also non-zero.
+ */
+static inline void __rcu_read_unlock(void)
+{
+ if (__this_cpu_read(rcu_read_lock_nesting) != 1)
+ __this_cpu_dec(rcu_read_lock_nesting);
+ else {
+ barrier(); /* critical section before exit code. */
+ __this_cpu_write(rcu_read_lock_nesting, INT_MIN);
+ barrier(); /* assign before ->rcu_read_unlock_special load */
+ if (unlikely(__this_cpu_read(rcu_read_unlock_special)))
+ rcu_read_unlock_do_special();
+ barrier(); /* ->rcu_read_unlock_special load before assign */
+ __this_cpu_write(rcu_read_lock_nesting, 0);
+ }
+#ifdef CONFIG_PROVE_LOCKING
+ {
+ int rln = __this_cpu_read(rcu_read_lock_nesting);
+
+ WARN_ON_ONCE(rln < 0 && rln > INT_MIN / 2);
+ }
+#endif /* #ifdef CONFIG_PROVE_LOCKING */
+}
+
void synchronize_rcu(void);

/*
diff --git a/kernel/rcu.h b/kernel/rcu.h
index 6243d8d..8ba99cd 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -109,8 +109,4 @@ static inline bool __rcu_reclaim(char *rn, struct rcu_head *head)
}
}

-#ifdef CONFIG_PREEMPT_RCU
-extern void rcu_read_unlock_do_special(void);
-#endif /* #ifdef CONFIG_PREEMPT_RCU */
-
#endif /* __LINUX_RCU_H */
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index d52c68e..f607cb5 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -59,39 +59,6 @@ DEFINE_PER_CPU(struct task_struct *, rcu_current_task);
#endif /* #ifdef CONFIG_PROVE_RCU */

/*
- * Tree-preemptible RCU implementation for rcu_read_unlock().
- * Decrement rcu_read_lock_nesting. If the result is zero (outermost
- * rcu_read_unlock()) and rcu_read_unlock_special is non-zero, then
- * invoke rcu_read_unlock_do_special() to clean up after a context switch
- * in an RCU read-side critical section and other special cases.
- * Set rcu_read_lock_nesting to a large negative value during cleanup
- * in order to ensure that if rcu_read_unlock_special is non-zero, then
- * rcu_read_lock_nesting is also non-zero.
- */
-void __rcu_read_unlock(void)
-{
- if (__this_cpu_read(rcu_read_lock_nesting) != 1)
- __this_cpu_dec(rcu_read_lock_nesting);
- else {
- barrier(); /* critical section before exit code. */
- __this_cpu_write(rcu_read_lock_nesting, INT_MIN);
- barrier(); /* assign before ->rcu_read_unlock_special load */
- if (unlikely(__this_cpu_read(rcu_read_unlock_special)))
- rcu_read_unlock_do_special();
- barrier(); /* ->rcu_read_unlock_special load before assign */
- __this_cpu_write(rcu_read_lock_nesting, 0);
- }
-#ifdef CONFIG_PROVE_LOCKING
- {
- int rln = __this_cpu_read(rcu_read_lock_nesting);
-
- WARN_ON_ONCE(rln < 0 && rln > INT_MIN / 2);
- }
-#endif /* #ifdef CONFIG_PROVE_LOCKING */
-}
-EXPORT_SYMBOL_GPL(__rcu_read_unlock);
-
-/*
* Check for a task exiting while in a preemptible-RCU read-side
* critical section, clean up if so. No need to issue warnings,
* as debug_check_no_locks_held() already does this if lockdep
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 6b416af..49cb5b0 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -598,6 +598,7 @@ void rcu_read_unlock_do_special(void)
}
local_irq_restore(flags);
}
+EXPORT_SYMBOL_GPL(rcu_read_unlock_do_special);

/*
* Check for a quiescent state from the current CPU. When a task blocks,
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 20be289..7afde96 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -409,6 +409,7 @@ void rcu_read_unlock_do_special(void)
local_irq_restore(flags);
}
}
+EXPORT_SYMBOL_GPL(rcu_read_unlock_do_special);

#ifdef CONFIG_RCU_CPU_STALL_VERBOSE

--
1.7.8

Paul E. McKenney

unread,
Apr 14, 2012, 12:21:08 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@polymtl.ca, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org, Paul E. McKenney, Paul E. McKenney
From: "Paul E. McKenney" <paul.m...@linaro.org>

Move __rcu_read_lock() from kernel/rcupdate.c to include/linux/rcupdate.h,
allowing the compiler to inline it.

Suggested-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paul.m...@linaro.org>
Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
---
include/linux/rcupdate.h | 13 ++++++++++++-
kernel/rcupdate.c | 12 ------------
2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 89f7e97..9967b2b 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -43,6 +43,7 @@
#include <linux/completion.h>
#include <linux/debugobjects.h>
#include <linux/compiler.h>
+#include <linux/percpu.h>

#ifdef CONFIG_RCU_TORTURE_TEST
extern int rcutorture_runnable; /* for sysctl */
@@ -150,7 +151,17 @@ DECLARE_PER_CPU(int, rcu_read_unlock_special);
DECLARE_PER_CPU(struct task_struct *, rcu_current_task);
#endif /* #ifdef CONFIG_PROVE_RCU */

-extern void __rcu_read_lock(void);
+/*
+ * Preemptible-RCU implementation for rcu_read_lock(). Just increment
+ * the per-CPU rcu_read_lock_nesting: Shared state and per-task state will
+ * be updated if we block.
+ */
+static inline void __rcu_read_lock(void)
+{
+ __this_cpu_inc(rcu_read_lock_nesting);
+ barrier(); /* Keep code within RCU read-side critical section. */
+}
+
extern void __rcu_read_unlock(void);
void synchronize_rcu(void);

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index f77a5fc..d52c68e 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -59,18 +59,6 @@ DEFINE_PER_CPU(struct task_struct *, rcu_current_task);
#endif /* #ifdef CONFIG_PROVE_RCU */

/*
- * Preemptible-RCU implementation for rcu_read_lock(). Just increment
- * the per-CPU rcu_read_lock_nesting: Shared state and per-task state will
- * be updated if we block.
- */
-void __rcu_read_lock(void)
-{
- __this_cpu_inc(rcu_read_lock_nesting);
- barrier(); /* Keep code within RCU read-side critical section. */
-}
-EXPORT_SYMBOL_GPL(__rcu_read_lock);
-
-/*
* Tree-preemptible RCU implementation for rcu_read_unlock().
* Decrement rcu_read_lock_nesting. If the result is zero (outermost
* rcu_read_unlock()) and rcu_read_unlock_special is non-zero, then
--
1.7.8

Paul E. McKenney

unread,
Apr 14, 2012, 12:21:33 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@polymtl.ca, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org, Paul E. McKenney, Paul E. McKenney
From: "Paul E. McKenney" <paul.m...@linaro.org>

When running preemptible RCU, if a task exits in an RCU read-side
critical section having blocked within that same RCU read-side critical
section, the task must be removed from the list of tasks blocking a
grace period (perhaps the current grace period, perhaps the next grace
period, depending on timing). The exit() path invokes exit_rcu() to
do this cleanup.

However, the current implementation of exit_rcu() needlessly does the
cleanup even if the task did not block within the current RCU read-side
critical section, which wastes time and needlessly increases the size
of the state space. Fix this by only doing the cleanup if the current
task is actually on the list of tasks blocking some grace period.

While we are at it, consolidate the two identical exit_rcu() functions
into a single function.

Signed-off-by: Paul E. McKenney <paul.m...@linaro.org>
Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
---
include/linux/rcupdate.h | 1 +
include/linux/rcutiny.h | 5 -----
include/linux/rcutree.h | 12 ------------
kernel/rcupdate.c | 27 ++++++++++++++++++++++++++-
kernel/rcutiny_plugin.h | 16 ----------------
kernel/rcutree_plugin.h | 16 ----------------
6 files changed, 27 insertions(+), 50 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 1cf19ef..5a94dcf 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -193,6 +193,7 @@ extern void rcu_idle_enter(void);
extern void rcu_idle_exit(void);
extern void rcu_irq_enter(void);
extern void rcu_irq_exit(void);
+extern void exit_rcu(void);

/**
* RCU_NONIDLE - Indicate idle-loop code that needs RCU readers
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 080b5bd..adb5e5a 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -87,10 +87,6 @@ static inline void kfree_call_rcu(struct rcu_head *head,

#ifdef CONFIG_TINY_RCU

-static inline void exit_rcu(void)
-{
-}
-
static inline int rcu_needs_cpu(int cpu)
{
return 0;
@@ -98,7 +94,6 @@ static inline int rcu_needs_cpu(int cpu)

#else /* #ifdef CONFIG_TINY_RCU */

-extern void exit_rcu(void);
int rcu_preempt_needs_cpu(void);

static inline int rcu_needs_cpu(int cpu)
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index e8ee5dd..782a8ab 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -45,18 +45,6 @@ static inline void rcu_virt_note_context_switch(int cpu)
rcu_note_context_switch(cpu);
}

-#ifdef CONFIG_TREE_PREEMPT_RCU
-
-extern void exit_rcu(void);
-
-#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
-
-static inline void exit_rcu(void)
-{
-}
-
-#endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
-
extern void synchronize_rcu_bh(void);
extern void synchronize_sched_expedited(void);
extern void synchronize_rcu_expedited(void);
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index eb5d160..1acf0da 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -54,7 +54,32 @@
#ifdef CONFIG_PREEMPT_RCU
DEFINE_PER_CPU(int, rcu_read_lock_nesting);
DEFINE_PER_CPU(int, rcu_read_unlock_special);
-#endif /* #ifdef CONFIG_PREEMPT_RCU */
+
+/*
+ * Check for a task exiting while in a preemptible-RCU read-side
+ * critical section, clean up if so. No need to issue warnings,
+ * as debug_check_no_locks_held() already does this if lockdep
+ * is enabled.
+ */
+void exit_rcu(void)
+{
+ struct task_struct *t = current;
+
+ if (likely(list_empty(&current->rcu_node_entry)))
+ return;
+ t->rcu_read_lock_nesting = 1;
+ barrier();
+ t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
+ __rcu_read_unlock();
+}
+
+#else /* #ifdef CONFIG_PREEMPT_RCU */
+
+void exit_rcu(void)
+{
+}
+
+#endif /* #else #ifdef CONFIG_PREEMPT_RCU */

#ifdef CONFIG_DEBUG_LOCK_ALLOC
static struct lock_class_key rcu_lock_key;
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index ff7ec65..66a542a 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -851,22 +851,6 @@ int rcu_preempt_needs_cpu(void)
return rcu_preempt_ctrlblk.rcb.rcucblist != NULL;
}

-/*
- * Check for a task exiting while in a preemptible -RCU read-side
- * critical section, clean up if so. No need to issue warnings,
- * as debug_check_no_locks_held() already does this if lockdep
- * is enabled.
- */
-void exit_rcu(void)
-{
- struct task_struct *t = current;
-
- if (t->rcu_read_lock_nesting == 0)
- return;
- t->rcu_read_lock_nesting = 1;
- __rcu_read_unlock();
-}
-
#else /* #ifdef CONFIG_TINY_PREEMPT_RCU */

#ifdef CONFIG_RCU_TRACE
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index f60b315..76a1ba9 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -969,22 +969,6 @@ static void __init __rcu_init_preempt(void)
rcu_init_one(&rcu_preempt_state, &rcu_preempt_data);
}

-/*
- * Check for a task exiting while in a preemptible-RCU read-side
- * critical section, clean up if so. No need to issue warnings,
- * as debug_check_no_locks_held() already does this if lockdep
- * is enabled.
- */
-void exit_rcu(void)
-{
- struct task_struct *t = current;
-
- if (t->rcu_read_lock_nesting == 0)
- return;
- t->rcu_read_lock_nesting = 1;
- __rcu_read_unlock();
-}
-
#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */

static struct rcu_state *rcu_state = &rcu_sched_state;
--
1.7.8
Message has been deleted

Paul E. McKenney

unread,
Apr 14, 2012, 12:21:54 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@polymtl.ca, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org, Paul E. McKenney, Paul E. McKenney
From: "Paul E. McKenney" <paul.m...@linaro.org>

The copy_to_user() function now does some compile-time buffer-size
checks, but these checks can be fooled by pointers, which always
appear to be 4 or 8 bytes long for 32-bit and 64-bit builds, respectively.
Take care of the warnings in fs/binfmt_misc.c by invoking the underlying
_copy_to_user() function.

Signed-off-by: Paul E. McKenney <paul.m...@linaro.org>
Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
---
fs/binfmt_misc.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index a9198df..e5ea8c3 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -310,7 +310,7 @@ static Node *create_entry(const char __user *buffer, size_t count)
p = buf = (char *)e + sizeof(Node);

memset(e, 0, sizeof(Node));
- if (copy_from_user(buf, buffer, count))
+ if (_copy_from_user(buf, buffer, count))
goto Efault;

del = *p++; /* delimeter */
@@ -418,7 +418,7 @@ static int parse_command(const char __user *buffer, size_t count)
return 0;
if (count > 3)
return -EINVAL;
- if (copy_from_user(s, buffer, count))
+ if (_copy_from_user(s, buffer, count))
return -EFAULT;
if (s[count-1] == '\n')
count--;
--
1.7.8

Paul E. McKenney

unread,
Apr 14, 2012, 12:22:07 PM4/14/12
to linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@polymtl.ca, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org, torv...@linux-foundation.org, Paul E. McKenney, Paul E. McKenney
From: "Paul E. McKenney" <paul.m...@linaro.org>

Currently, PREEMPT_RCU readers are enqueued upon entry to the scheduler.
This is inefficient because enqueuing is required only if there is a
context switch, and entry to the scheduler does not guarantee a context
switch.

The commit therefore moves the enqueuing to immediately precede the
call to switch_to() from the scheduler.

Signed-off-by: Paul E. McKenney <paul.m...@linaro.org>
Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
---
arch/um/drivers/mconsole_kern.c | 1 +
include/linux/rcupdate.h | 1 +
include/linux/rcutiny.h | 6 ------
include/linux/sched.h | 10 ++++++++++
kernel/rcutree.c | 1 -
kernel/rcutree.h | 1 -
kernel/rcutree_plugin.h | 14 +++-----------
kernel/sched/core.c | 1 +
8 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
index c70e047..f69ff2d 100644
--- a/arch/um/drivers/mconsole_kern.c
+++ b/arch/um/drivers/mconsole_kern.c
@@ -704,6 +704,7 @@ static void stack_proc(void *arg)
struct task_struct *from = current, *to = arg;

to->thread.saved_task = from;
+ rcu_switch_from(from);
switch_to(from, to, from);
}

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9372174..aca4ef0 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -183,6 +183,7 @@ static inline int rcu_preempt_depth(void)
/* Internal to kernel */
extern void rcu_sched_qs(int cpu);
extern void rcu_bh_qs(int cpu);
+extern void rcu_preempt_note_context_switch(void);
extern void rcu_check_callbacks(int cpu, int user);
struct notifier_block;
extern void rcu_idle_enter(void);
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index e93df77..080b5bd 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -87,10 +87,6 @@ static inline void kfree_call_rcu(struct rcu_head *head,

#ifdef CONFIG_TINY_RCU

-static inline void rcu_preempt_note_context_switch(void)
-{
-}
-
static inline void exit_rcu(void)
{
}
@@ -102,7 +98,6 @@ static inline int rcu_needs_cpu(int cpu)

#else /* #ifdef CONFIG_TINY_RCU */

-void rcu_preempt_note_context_switch(void);
extern void exit_rcu(void);
int rcu_preempt_needs_cpu(void);

@@ -116,7 +111,6 @@ static inline int rcu_needs_cpu(int cpu)
static inline void rcu_note_context_switch(int cpu)
{
rcu_sched_qs(cpu);
- rcu_preempt_note_context_switch();
}

/*
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e692aba..a749c9d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1879,12 +1879,22 @@ static inline void rcu_copy_process(struct task_struct *p)
INIT_LIST_HEAD(&p->rcu_node_entry);
}

+static inline void rcu_switch_from(struct task_struct *prev)
+{
+ if (prev->rcu_read_lock_nesting != 0)
+ rcu_preempt_note_context_switch();
+}
+
#else

static inline void rcu_copy_process(struct task_struct *p)
{
}

+static inline void rcu_switch_from(struct task_struct *prev)
+{
+}
+
#endif

#ifdef CONFIG_SMP
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 1050d6d..6135150 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -192,7 +192,6 @@ void rcu_note_context_switch(int cpu)
{
trace_rcu_utilization("Start context switch");
rcu_sched_qs(cpu);
- rcu_preempt_note_context_switch(cpu);
trace_rcu_utilization("End context switch");
}
EXPORT_SYMBOL_GPL(rcu_note_context_switch);
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index cdd1be0..d6b70b08 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -423,7 +423,6 @@ DECLARE_PER_CPU(char, rcu_cpu_has_work);
/* Forward declarations for rcutree_plugin.h */
static void rcu_bootup_announce(void);
long rcu_batches_completed(void);
-static void rcu_preempt_note_context_switch(int cpu);
static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp);
#ifdef CONFIG_HOTPLUG_CPU
static void rcu_report_unblock_qs_rnp(struct rcu_node *rnp,
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index c023464..b1ac22e 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -153,7 +153,7 @@ static void rcu_preempt_qs(int cpu)
*
* Caller must disable preemption.
*/
-static void rcu_preempt_note_context_switch(int cpu)
+void rcu_preempt_note_context_switch(void)
{
struct task_struct *t = current;
unsigned long flags;
@@ -164,7 +164,7 @@ static void rcu_preempt_note_context_switch(int cpu)
(t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {

/* Possibly blocking in an RCU read-side critical section. */
- rdp = per_cpu_ptr(rcu_preempt_state.rda, cpu);
+ rdp = __this_cpu_ptr(rcu_preempt_state.rda);
rnp = rdp->mynode;
raw_spin_lock_irqsave(&rnp->lock, flags);
t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BLOCKED;
@@ -228,7 +228,7 @@ static void rcu_preempt_note_context_switch(int cpu)
* means that we continue to block the current grace period.
*/
local_irq_save(flags);
- rcu_preempt_qs(cpu);
+ rcu_preempt_qs(smp_processor_id());
local_irq_restore(flags);
}

@@ -1018,14 +1018,6 @@ void rcu_force_quiescent_state(void)
EXPORT_SYMBOL_GPL(rcu_force_quiescent_state);

/*
- * Because preemptible RCU does not exist, we never have to check for
- * CPUs being in quiescent states.
- */
-static void rcu_preempt_note_context_switch(int cpu)
-{
-}
-
-/*
* Because preemptible RCU does not exist, there are never any preempted
* RCU readers.
*/
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5255c9d..10689da 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2051,6 +2051,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
#endif

/* Here we just switch the register state and the stack. */
+ rcu_switch_from(prev);
switch_to(prev, next, prev);

barrier();
--
1.7.8
Message has been deleted

Linus Torvalds

unread,
Apr 14, 2012, 12:28:56 PM4/14/12
to pau...@linux.vnet.ibm.com, linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org
Yeah, the read-unlock does not seem to be worth inlining as-is. As
mentioned earlier, it tends to be the rcu_read_lock() that we really
want to inline anyway (not just because it's small, but also because
the call clobbers registers, and that has other bad interaction).

Linus

Paul E. McKenney

unread,
Apr 14, 2012, 12:53:41 PM4/14/12
to Linus Torvalds, linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org
On Sat, Apr 14, 2012 at 09:38:50AM -0700, Linus Torvalds wrote:
> On Sat, Apr 14, 2012 at 9:19 AM, Paul E. McKenney
> <pau...@linux.vnet.ibm.com> wrote:
> >
> > This series is version two of the inlinable versions of preemptible
> > RCU's __rcu_read_lock() and __rcu_read_unlock().
>
> Btw, what is this supposed to apply on top of? It doesn't seem to
> apply cleanly to either v3.3, v3.4-rc[12], or current top-of-tree.
>
> I can make it apply by reducing the context to just one line for
> patch#6, but your base seems to be odd.

My next step is to forward-port to 4.4-rc2, retest, and expose to -next.
The base is on top of the series that was pulled into the 4.4 merge window,
but on their base on top of 3.4-rc4.

I probably should have forward-ported before submitting, but I was so
happy about it finally actually passing a moderate version of the full
set of rcutorture tests that I just couldn't restrain myself.

Thanx, Paul

Linus Torvalds

unread,
Apr 14, 2012, 1:09:22 PM4/14/12
to pau...@linux.vnet.ibm.com, linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org
On Sat, Apr 14, 2012 at 9:58 AM, Linus Torvalds
<torv...@linux-foundation.org> wrote:
>
> I'll see if it boots and what it does to my profiles and
> microbenchmark, though.

Ok, I can't see any performance difference in the numbers - my
benchmark noise is *much* bigger than anything this would show.

The profile looks fine, and obviously __rcu_read_lock() is entirely
gone. The top user (avc_has_perm_flags()) looks fine. I note that you
might want to look at the placement of the percpu data - I think it
probably makes sense to put the RCU data close to 'current' etc to get
as much cacheline sharing as possible, and it doesn't seem to be right
now, but it looks reasonable.

But on the whole, I can't claim that it looks noticeable ;*(

Linus

Paul E. McKenney

unread,
Apr 14, 2012, 1:26:50 PM4/14/12
to Linus Torvalds, linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org
On Sat, Apr 14, 2012 at 10:08:54AM -0700, Linus Torvalds wrote:
> On Sat, Apr 14, 2012 at 9:58 AM, Linus Torvalds
> <torv...@linux-foundation.org> wrote:
> >
> > I'll see if it boots and what it does to my profiles and
> > microbenchmark, though.
>
> Ok, I can't see any performance difference in the numbers - my
> benchmark noise is *much* bigger than anything this would show.

Might still be worthwhile on embedded CPUs that don't optimize
function calls as thoroughly as does x86, maybe?

> The profile looks fine, and obviously __rcu_read_lock() is entirely
> gone. The top user (avc_has_perm_flags()) looks fine. I note that you
> might want to look at the placement of the percpu data - I think it
> probably makes sense to put the RCU data close to 'current' etc to get
> as much cacheline sharing as possible, and it doesn't seem to be right
> now, but it looks reasonable.

Is there somewhere in non-architecture-specific code that would be a
good place to put this? Or is the DEFINE_PER_CPU() for current_task
moving from arch/x86 to core code?

> But on the whole, I can't claim that it looks noticeable ;*(

Well, then, I guess I don't feel quite so bad about having prioritized
this so low for so long. ;-)

Thanx, Paul

Paul E. McKenney

unread,
Apr 15, 2012, 12:27:33 PM4/15/12
to Linus Torvalds, linux-...@vger.kernel.org, mi...@elte.hu, la...@cn.fujitsu.com, dipa...@in.ibm.com, ak...@linux-foundation.org, mathieu....@efficios.com, jo...@joshtriplett.org, n...@us.ibm.com, tg...@linutronix.de, pet...@infradead.org, ros...@goodmis.org, Valdis.K...@vt.edu, dhow...@redhat.com, eric.d...@gmail.com, dar...@dvhart.com, fwei...@gmail.com, pat...@linaro.org
On Sat, Apr 14, 2012 at 10:25:04AM -0700, Paul E. McKenney wrote:
> On Sat, Apr 14, 2012 at 10:08:54AM -0700, Linus Torvalds wrote:
> > On Sat, Apr 14, 2012 at 9:58 AM, Linus Torvalds
> > <torv...@linux-foundation.org> wrote:
> > >
> > > I'll see if it boots and what it does to my profiles and
> > > microbenchmark, though.
> >
> > Ok, I can't see any performance difference in the numbers - my
> > benchmark noise is *much* bigger than anything this would show.
>
> Might still be worthwhile on embedded CPUs that don't optimize
> function calls as thoroughly as does x86, maybe?
>
> > The profile looks fine, and obviously __rcu_read_lock() is entirely
> > gone. The top user (avc_has_perm_flags()) looks fine. I note that you
> > might want to look at the placement of the percpu data - I think it
> > probably makes sense to put the RCU data close to 'current' etc to get
> > as much cacheline sharing as possible, and it doesn't seem to be right
> > now, but it looks reasonable.
>
> Is there somewhere in non-architecture-specific code that would be a
> good place to put this? Or is the DEFINE_PER_CPU() for current_task
> moving from arch/x86 to core code?
>
> > But on the whole, I can't claim that it looks noticeable ;*(
>
> Well, then, I guess I don't feel quite so bad about having prioritized
> this so low for so long. ;-)

One other thing -- may I add your Tested-by to the series?
0 new messages