
[PATCH] perf_events: add sampling period randomization support


era...@google.com

Mar 2, 2010, 1:10:02 AM
This patch adds support for randomizing the sampling period.
Randomization is very useful to mitigate the bias that exists
with sampling. The random number generator does not need to
be sophisticated. This patch uses the builtin random32()
generator.

The user activates randomization by setting the perf_event_attr.random
field to 1 and by passing a bitmask that controls the range of variation
above the base period. The period varies between period and period + mask.
Note that randomization is not available when a target interrupt rate
(freq) is enabled.

The last used period can be collected using the PERF_SAMPLE_PERIOD flag
in sample_type.

The patch has been tested on X86. There is also code for PowerPC but
I could not test it.
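
For illustration only, a minimal user-space sketch of how the proposed
interface would be used (not part of the patch; it assumes a kernel and
linux/perf_event.h with this patch applied, the rest is the usual
perf_event_open() boilerplate):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* there is no glibc wrapper for perf_event_open(); call the syscall directly */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;		/* base sampling period */
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_PERIOD;
	attr.random = 1;			/* enable period randomization (this patch) */
	attr.random_mask = 0xfff;		/* period varies by up to +4095 (this patch) */
	attr.disabled = 1;

	fd = perf_event_open(&attr, 0, -1, -1, 0);	/* measure this thread, any CPU */
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	/* ... mmap the ring buffer and consume samples as usual ... */
	close(fd);
	return 0;
}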

Signed-off-by: Stephane Eranian <era...@google.com>

--
arch/powerpc/kernel/perf_event.c | 3 +++
arch/x86/kernel/cpu/perf_event.c | 2 ++
arch/x86/kernel/cpu/perf_event_intel.c | 4 ++++
include/linux/perf_event.h | 7 +++++--
kernel/perf_event.c | 24 ++++++++++++++++++++++++
5 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/perf_event.c b/arch/powerpc/kernel/perf_event.c
index b6cf8f1..994df17 100644
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -1150,6 +1150,9 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
val = 0;
left = atomic64_read(&event->hw.period_left) - delta;
if (period) {
+ if (event->attr.random)
+ perf_randomize_event_period(event);
+
if (left <= 0) {
left += period;
if (left <= 0)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 641ccb9..159d951 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1110,6 +1110,8 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
if (val & (1ULL << (x86_pmu.event_bits - 1)))
continue;

+ if (event->attr.random)
+ perf_randomize_event_period(event);
/*
* event overflow
*/
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index cf6590c..5c8d6ed 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -690,6 +690,10 @@ static int intel_pmu_save_and_restart(struct perf_event *event)
int ret;

x86_perf_event_update(event, hwc, idx);
+
+ if (event->attr.random)
+ perf_randomize_event_period(event);
+
ret = x86_perf_event_set_period(event, hwc, idx);

return ret;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 04f06b4..e91a759 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -203,8 +203,8 @@ struct perf_event_attr {
enable_on_exec : 1, /* next exec enables */
task : 1, /* trace fork/exit */
watermark : 1, /* wakeup_watermark */
-
- __reserved_1 : 49;
+ random : 1, /* period randomization */
+ __reserved_1 : 48;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -214,6 +214,8 @@ struct perf_event_attr {
__u32 bp_type;
__u64 bp_addr;
__u64 bp_len;
+
+ __u64 random_mask;
};

/*
@@ -877,6 +879,7 @@ extern int perf_swevent_get_recursion_context(void);
extern void perf_swevent_put_recursion_context(int rctx);
extern void perf_event_enable(struct perf_event *event);
extern void perf_event_disable(struct perf_event *event);
+extern void perf_randomize_event_period(struct perf_event *event);
#else
static inline void
perf_event_task_sched_in(struct task_struct *task) { }
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index a661e79..61ee8c6 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -30,6 +30,7 @@
#include <linux/perf_event.h>
#include <linux/ftrace_event.h>
#include <linux/hw_breakpoint.h>
+#include <linux/random.h>

#include <asm/irq_regs.h>

@@ -3876,6 +3877,21 @@ int perf_event_overflow(struct perf_event *event, int nmi,
return __perf_event_overflow(event, nmi, 1, data, regs);
}

+void perf_randomize_event_period(struct perf_event *event)
+{
+ u64 new_seed;
+ u64 mask = event->attr.random_mask;
+
+ event->hw.last_period = event->hw.sample_period;
+
+ new_seed = random32();
+
+ if (unlikely(mask >> 32))
+ new_seed |= (u64)random32() << 32;
+
+ event->hw.sample_period = event->attr.sample_period + (new_seed & mask);
+}
+
/*
* Generic software event infrastructure
*/
@@ -4190,6 +4206,9 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
ret = HRTIMER_NORESTART;
}

+ if (event->attr.random)
+ perf_randomize_event_period(event);
+
period = max_t(u64, 10000, event->hw.sample_period);
hrtimer_forward_now(hrtimer, ns_to_ktime(period));

@@ -4736,6 +4755,11 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
if (attr->read_format & ~(PERF_FORMAT_MAX-1))
return -EINVAL;

+ if (attr->random && attr->freq)
+ return -EINVAL;
+
+ if (attr->random && !attr->random_mask)
+ return -EINVAL;
out:
return ret;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

David Miller

Mar 2, 2010, 1:30:01 AM
From: era...@google.com
Date: Mon, 1 Mar 2010 22:07:09 -0800

> This patch adds support for randomizing the sampling period.
> Randomization is very useful to mitigate the bias that exists
> with sampling. The random number generator does not need to
> be sophisticated. This patch uses the builtin random32()
> generator.
>
> The user activates randomization by setting the perf_event_attr.random
> field to 1 and by passing a bitmask to control the range of variation
> above the base period. Period will vary from period to period & mask.
> Note that randomization is not available when a target interrupt rate
> (freq) is enabled.
>
> The last used period can be collected using the PERF_SAMPLE_PERIOD flag
> in sample_type.
>
> The patch has been tested on X86. There is also code for PowerPC but
> I could not test it.
>
> Signed-off-by: Stephane Eranian <era...@google.com>

Please add this, which adds the feature to sparc too.

diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 9f2b2ba..3e28225 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1214,6 +1214,9 @@ static int __kprobes perf_event_nmi_handler(struct notifier_block *self,
if (val & (1ULL << 31))
continue;

+ if (event->attr.random)
+ perf_randomize_event_period(event);
+
data.period = event->hw.last_period;
if (!sparc_perf_event_set_period(event, hwc, idx))
continue;

Peter Zijlstra

Mar 2, 2010, 4:20:01 AM
On Mon, 2010-03-01 at 22:07 -0800, era...@google.com wrote:
> This patch adds support for randomizing the sampling period.
> Randomization is very useful to mitigate the bias that exists
> with sampling. The random number generator does not need to
> be sophisticated. This patch uses the builtin random32()
> generator.
>
> The user activates randomization by setting the perf_event_attr.random
> field to 1 and by passing a bitmask to control the range of variation
> above the base period. Period will vary from period to period & mask.
> Note that randomization is not available when a target interrupt rate
> (freq) is enabled.
>
> The last used period can be collected using the PERF_SAMPLE_PERIOD flag
> in sample_type.
>
> The patch has been tested on X86. There is also code for PowerPC but
> I could not test it.

I don't think we need to touch the arch code; we didn't need to for
frequency-driven sampling, so I don't see a reason to do so for
randomization either.
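
A rough sketch of that direction (editor's illustration, not from the
thread: the call would move into the generic overflow path every arch
already goes through; the exact placement inside __perf_event_overflow()
is an assumption):

	/* sketch only: randomize from the generic overflow path
	 * instead of each arch's interrupt handler */
	if (event->attr.random)
		perf_randomize_event_period(event);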

Robert Richter

Mar 2, 2010, 6:00:01 AM
On 01.03.10 22:07:09, era...@google.com wrote:
> This patch adds support for randomizing the sampling period.
> Randomization is very useful to mitigate the bias that exists
> with sampling. The random number generator does not need to
> be sophisticated. This patch uses the builtin random32()
> generator.
>
> The user activates randomization by setting the perf_event_attr.random
> field to 1 and by passing a bitmask to control the range of variation
> above the base period. Period will vary from period to period & mask.
> Note that randomization is not available when a target interrupt rate
> (freq) is enabled.

Instead of providing a mask I would prefer to either use a bit-width
parameter from which the mask can be calculated, or to specify a range
within which the period may vary.
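
A rough sketch of the bit-width variant (editor's illustration;
random_bits is a hypothetical attr field, not something proposed in the
thread):

	/* sketch: derive the mask from a hypothetical bit-width field */
	if (attr->random_bits >= 64)
		return -EINVAL;
	mask = ((u64)1 << attr->random_bits) - 1;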

>
> The last used period can be collected using the PERF_SAMPLE_PERIOD flag
> in sample_type.
>
> The patch has been tested on X86. There is also code for PowerPC but
> I could not test it.
>
> Signed-off-by: Stephane Eranian <era...@google.com>
>
> --
> arch/powerpc/kernel/perf_event.c | 3 +++
> arch/x86/kernel/cpu/perf_event.c | 2 ++
> arch/x86/kernel/cpu/perf_event_intel.c | 4 ++++

I agree with Peter, I also don't see the need to touch arch specific
code.

> include/linux/perf_event.h | 7 +++++--
> kernel/perf_event.c | 24 ++++++++++++++++++++++++
> 5 files changed, 38 insertions(+), 2 deletions(-)
>

[...]

> +void perf_randomize_event_period(struct perf_event *event)
> +{
> + u64 new_seed;
> + u64 mask = event->attr.random_mask;
> +
> + event->hw.last_period = event->hw.sample_period;
> +
> + new_seed = random32();
> +
> + if (unlikely(mask >> 32))
> + new_seed |= (u64)random32() << 32;
> +
> + event->hw.sample_period = event->attr.sample_period + (new_seed & mask);

Only adding the random value will lead to longer sample periods on
average. To compensate for this you could calculate something like:

event->hw.sample_period = event->attr.sample_period + (new_seed & mask) - (mask >> 1);

Or, assume the offset is already included in sample_period.

Also a range check for sample_period is necessary to avoid over- or
underflows.
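
A rough sketch of that centered variant with a crude clamp (editor's
illustration built on the helper from the patch; the minimum period of 1
is an arbitrary choice, and it assumes sample_period fits in s64):

	void perf_randomize_event_period(struct perf_event *event)
	{
		u64 mask = event->attr.random_mask;
		u64 rnd;
		s64 period;

		event->hw.last_period = event->hw.sample_period;

		rnd = random32();
		if (unlikely(mask >> 32))
			rnd |= (u64)random32() << 32;

		/* offset roughly in [-mask/2, +mask/2] */
		period = (s64)event->attr.sample_period
			 + (s64)(rnd & mask) - (s64)(mask >> 1);

		/* clamp to avoid a zero or negative period */
		if (period < 1)
			period = 1;

		event->hw.sample_period = period;
	}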

-Robert

> +}

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert....@amd.com

Peter Zijlstra

Mar 2, 2010, 6:50:02 AM
On Tue, 2010-03-02 at 11:53 +0100, Robert Richter wrote:
>
> Only adding the random value will lead to longer sample periods on
> average. To compensate for this you could calculate something like:
>
> event->hw.sample_period = event->attr.sample_period + (new_seed & mask) - (mask >> 1);

Or cheat and do something like:

sample_period ^= (new_seed & mask);

Frederic Weisbecker

Mar 2, 2010, 8:20:01 AM


I'd rather name this field random_period. Even though the comment
tells us enough, it's better that the code speak for itself.

Robert Richter

Mar 2, 2010, 9:20:02 AM
On 02.03.10 12:41:18, Peter Zijlstra wrote:
> On Tue, 2010-03-02 at 11:53 +0100, Robert Richter wrote:
> >
> > Only adding the random value will lead to longer sample periods on
> > average. To compensate for this you could calculate something like:
> >
> > event->hw.sample_period = event->attr.sample_period + (new_seed & mask) - (mask >> 1);
>
> Or cheat and do something like:
>
> sample_period ^= (new_seed & mask);

This won't work; it will be asymmetric, e.g. for

(event->attr.sample_period & mask) == 0

the offset would always be positive. Only for

(event->attr.sample_period & mask) == (mask & ~(mask >> 1))

it is correct.
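
A quick numeric check of this (editor's illustration, user-space demo
with mask = 0xff):

#include <stdio.h>

/*
 * XOR only gives a centered offset when the masked bits of the base
 * period equal mask & ~(mask >> 1), which is 0x80 for mask = 0xff.
 */
int main(void)
{
	unsigned long long mask = 0xff;
	unsigned long long bases[2] = { 0x10000, 0x10080 };
	unsigned long long r;
	long long off, min, max;
	int i;

	for (i = 0; i < 2; i++) {
		min = 0;
		max = 0;
		for (r = 0; r <= mask; r++) {
			off = (long long)(bases[i] ^ r) - (long long)bases[i];
			if (off < min)
				min = off;
			if (off > max)
				max = off;
		}
		printf("base & mask = 0x%02llx: offsets in [%lld, %lld]\n",
		       bases[i] & mask, min, max);
	}
	return 0;	/* prints [0, 255] for 0x00 and [-128, 127] for 0x80 */
}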

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert....@amd.com
