Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[GIT PULL] perf/hw-breakpoints: Various updates

3 views
Skip to first unread message

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
Ingo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
perf/core

I have hesitations about the 2 last patches as they should probably
go into another branch. Their scope is wider than perf as they also
fix x86-64 stacktraces in general.

Tell me if you'd prefer them in a separate tree based on something else.

Thanks,
Frederic
---

Frederic Weisbecker (9):
hw-breakpoints: Add two reserved fields for future extensions
perf: Remove pointless union that wraps the hw breakpoint fields
perf: Remove unused struct perf_event::event_callback
hw-breakpoints: Drop callback and task parameters from modify helper
hw-breakpoints: Use overflow handler instead of the event callback
perf: Remove the "event" callback from perf events
x86/perf: Exclude the debug stack from the callchains
x86: Fixup wrong debug exception frame link in stacktraces
x86: Fixup wrong irq frame link in stacktraces


arch/x86/kernel/cpu/perf_event.c | 9 ++++---
arch/x86/kernel/dumpstack_64.c | 33 +++++++++++++++++++++++++++++-
arch/x86/kernel/entry_64.S | 6 ++--
arch/x86/kernel/hw_breakpoint.c | 5 +--
arch/x86/kernel/ptrace.c | 13 +++++++----
include/linux/hw_breakpoint.h | 34 +++++++++++-------------------
include/linux/perf_event.h | 28 +++++++++++--------------
kernel/hw_breakpoint.c | 22 ++++++-------------
kernel/perf_event.c | 24 ++++++++-------------
kernel/trace/trace_ksym.c | 5 ++-
samples/hw_breakpoint/data_breakpoint.c | 7 ++++-
11 files changed, 99 insertions(+), 87 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
Dumping the callchains from breakpoint events with perf gives strange
results:

3.75% perf [kernel] [k] _raw_read_unlock
|
--- _raw_read_unlock
perf_callchain
perf_prepare_sample
__perf_event_overflow
perf_swevent_overflow
perf_swevent_add
perf_bp_event
hw_breakpoint_exceptions_notify
notifier_call_chain
__atomic_notifier_call_chain
atomic_notifier_call_chain
notify_die
do_debug
debug
munmap

We are infected with all the debug stack. Like the nmi stack, the debug
stack is undesired as it is part of the profiling path, not helpful for
the user.

Ignore it.

Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---
arch/x86/kernel/cpu/perf_event.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c1bbed1..d35f260 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2287,7 +2287,7 @@ void callchain_store(struct perf_callchain_entry *entry, u64 ip)

static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_irq_entry);
static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_nmi_entry);
-static DEFINE_PER_CPU(int, in_nmi_frame);
+static DEFINE_PER_CPU(int, in_ignored_frame);


static void
@@ -2303,8 +2303,9 @@ static void backtrace_warning(void *data, char *msg)

static int backtrace_stack(void *data, char *name)
{
- per_cpu(in_nmi_frame, smp_processor_id()) =
- x86_is_stack_id(NMI_STACK, name);
+ per_cpu(in_ignored_frame, smp_processor_id()) =
+ x86_is_stack_id(NMI_STACK, name) ||
+ x86_is_stack_id(DEBUG_STACK, name);

return 0;
}
@@ -2313,7 +2314,7 @@ static void backtrace_address(void *data, unsigned long addr, int reliable)
{
struct perf_callchain_entry *entry = data;

- if (per_cpu(in_nmi_frame, smp_processor_id()))
+ if (per_cpu(in_ignored_frame, smp_processor_id()))
return;

if (reliable)
--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
As it is not used anymore and has been superseded by overflow_handler.

Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---

include/linux/perf_event.h | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d2f2667..89098e3 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -567,7 +567,6 @@ struct perf_pending_entry {

struct perf_sample_data;

-typedef void (*perf_callback_t)(struct perf_event *, void *);
typedef void (*perf_overflow_handler_t)(struct perf_event *, int,
struct perf_sample_data *,
struct pt_regs *regs);
@@ -669,8 +668,6 @@ struct perf_event {
struct event_filter *filter;
#endif

- perf_callback_t callback;
-
#endif /* CONFIG_PERF_EVENTS */
};

--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
It stands to anonymize a structure, but structures can already
anonymize by themselves.

Reported-by: Peter Zijlstra <pet...@infradead.org>
Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>


Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---

include/linux/perf_event.h | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a61e4de..53230e9 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -215,14 +215,12 @@ struct perf_event_attr {
__u32 wakeup_watermark; /* bytes before wakeup */
};

- union {
- struct { /* Hardware breakpoint info */
- __u64 bp_addr;
- __u32 bp_type;
- __u32 bp_len;
- __u64 __bp_reserved_1;
- __u64 __bp_reserved_2;
- };
+ struct { /* Hardware breakpoint info */
+ __u64 bp_addr;
+ __u32 bp_type;
+ __u32 bp_len;
+ __u64 __bp_reserved_1;
+ __u64 __bp_reserved_2;
};

__u32 __reserved_2;
--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
Drop the callback and task parameters from modify_user_hw_breakpoint().
For now we have no user that need to modify a breakpoint to the point
of changing its handler or its task context.

Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---
arch/x86/kernel/ptrace.c | 4 ++--
include/linux/hw_breakpoint.h | 9 ++-------
kernel/hw_breakpoint.c | 7 +++----
3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 04d182a..dbb3955 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -618,7 +618,7 @@ ptrace_modify_breakpoint(struct perf_event *bp, int len, int type,
attr.bp_type = gen_type;
attr.disabled = disabled;

- return modify_user_hw_breakpoint(bp, &attr, bp->callback, tsk);
+ return modify_user_hw_breakpoint(bp, &attr);
}

/*
@@ -740,7 +740,7 @@ static int ptrace_set_breakpoint_addr(struct task_struct *tsk, int nr,

attr = bp->attr;
attr.bp_addr = addr;
- bp = modify_user_hw_breakpoint(bp, &attr, bp->callback, tsk);
+ bp = modify_user_hw_breakpoint(bp, &attr);
}
/*
* CHECKME: the previous code returned -EIO if the addr wasn't a
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index a03daed..d33096e 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -57,10 +57,7 @@ register_user_hw_breakpoint(struct perf_event_attr *attr,

/* FIXME: only change from the attr, and don't unregister */
extern struct perf_event *
-modify_user_hw_breakpoint(struct perf_event *bp,
- struct perf_event_attr *attr,
- perf_callback_t triggered,
- struct task_struct *tsk);
+modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr);

/*
* Kernel breakpoints are not associated with any particular thread.
@@ -97,9 +94,7 @@ register_user_hw_breakpoint(struct perf_event_attr *attr,
struct task_struct *tsk) { return NULL; }
static inline struct perf_event *
modify_user_hw_breakpoint(struct perf_event *bp,
- struct perf_event_attr *attr,
- perf_callback_t triggered,
- struct task_struct *tsk) { return NULL; }
+ struct perf_event_attr *attr) { return NULL; }
static inline struct perf_event *
register_wide_hw_breakpoint_cpu(struct perf_event_attr *attr,
perf_callback_t triggered,
diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index cf5ee16..2d10b01 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -312,9 +312,7 @@ EXPORT_SYMBOL_GPL(register_user_hw_breakpoint);
* @tsk: pointer to 'task_struct' of the process to which the address belongs
*/
struct perf_event *
-modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr,
- perf_callback_t triggered,
- struct task_struct *tsk)
+modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr)
{
/*
* FIXME: do it without unregistering
@@ -323,7 +321,8 @@ modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr,
*/
unregister_hw_breakpoint(bp);

- return perf_event_create_kernel_counter(attr, -1, tsk->pid, triggered);
+ return perf_event_create_kernel_counter(attr, -1, bp->ctx->task->pid,
+ bp->callback);
}
EXPORT_SYMBOL_GPL(modify_user_hw_breakpoint);

--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
Add two reserved fields for future extensions in the hardware
breakpoints interface. Further needs may arise.

Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: Peter Zijlstra <pet...@infradead.org>


Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>

Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>


Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---

include/linux/perf_event.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 43adbd7..a61e4de 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -220,6 +220,8 @@ struct perf_event_attr {
__u64 bp_addr;
__u32 bp_type;


__u32 bp_len;
+ __u64 __bp_reserved_1;
+ __u64 __bp_reserved_2;
};

};

--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
When we enter in irq, two things can happen to preserve the link
to the previous frame pointer:

- If we were in an irq already, we don't switch to the irq stack
as we are inside. We just need to save the previous frame
pointer and to link the new one to the previous.

- Otherwise we need another level of indirection. We enter the irq with
the previous stack. We save the previous bp inside and make bp
pointing to its saved address. Then we switch to the irq stack and
push bp another time but to the new stack. This makes two levels to
dereference instead of one.

In the second case, the current stacktrace code omits the second level
and loses the frame pointer accuracy. The stack that follows will then
be considered as unreliable.

Handling that makes the perf callchain happier.
Before:

43.94% [k] _raw_read_lock
|
--- _read_lock
|
|--60.53%-- send_sigio
| __kill_fasync
| kill_fasync
| evdev_pass_event
| evdev_event
| input_pass_event
| input_handle_event
| input_event
| synaptics_process_byte
| psmouse_handle_byte
| psmouse_interrupt
| serio_interrupt
| i8042_interrupt
| handle_IRQ_event
| handle_edge_irq
| handle_irq
| __irqentry_text_start
| ret_from_intr
| |
| |--30.43%-- __select
| |
| |--17.39%-- 0x454f15
| |
| |--13.04%-- __read
| |
| |--13.04%-- vread_hpet
| |
| |--13.04%-- _xcb_lock_io
| |
| --13.04%-- 0x7f630878ce8

After:

50.00% [k] _raw_read_lock
|
--- _read_lock
|
|--98.97%-- send_sigio
| __kill_fasync
| kill_fasync
| evdev_pass_event
| evdev_event
| input_pass_event
| input_handle_event
| input_event
| |
| |--96.88%-- synaptics_process_byte
| | psmouse_handle_byte
| | psmouse_interrupt
| | serio_interrupt
| | i8042_interrupt
| | handle_IRQ_event
| | handle_edge_irq
| | handle_irq
| | __irqentry_text_start
| | ret_from_intr
| | |
| | |--39.78%-- __const_udelay
| | | |
| | | |--91.89%-- ath5k_hw_register_timeout
| | | | ath5k_hw_noise_floor_calibration
| | | | ath5k_hw_reset
| | | | ath5k_reset
| | | | ath5k_config
| | | | ieee80211_hw_config
| | | | |
| | | | |--88.24%-- ieee80211_scan_work
| | | | | worker_thread
| | | | | kthread
| | | | | child_rip
| | | | |
| | | | --11.76%-- ieee80211_scan_completed
| | | | ieee80211_scan_work
| | | | worker_thread
| | | | kthread
| | | | child_rip
| | | |
| | | --8.11%-- ath5k_hw_noise_floor_calibration
| | | ath5k_hw_reset
| | | ath5k_reset
| | | ath5k_config

Note: This does not only affect perf events but also x86-64
stacktraces. They were considered as unreliable once we quit
the irq stack frame.

Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>

Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
---
arch/x86/kernel/dumpstack_64.c | 33 ++++++++++++++++++++++++++++++++-
1 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index a071e6b..004b8aa 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -101,6 +101,35 @@ static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack,
return NULL;
}

+static inline int
+in_irq_stack(unsigned long *stack, unsigned long *irq_stack,
+ unsigned long *irq_stack_end)
+{
+ return (stack >= irq_stack && stack < irq_stack_end);
+}
+
+/*
+ * We are returning from the irq stack and go to the previous one.
+ * If the previous stack is also in the irq stack, then bp in the first
+ * frame of the irq stack points to the previous, interrupted one.
+ * Otherwise we have another level of indirection: We first save
+ * the bp of the previous stack, then we switch the stack to the irq one
+ * and save a new bp that links to the previous one.
+ * (See save_args())
+ */
+static inline unsigned long
+fixup_bp_irq_link(unsigned long bp, unsigned long *stack,
+ unsigned long *irq_stack, unsigned long *irq_stack_end)
+{
+#ifdef CONFIG_FRAME_POINTER
+ struct stack_frame *frame = (struct stack_frame *)bp;
+
+ if (!in_irq_stack(stack, irq_stack, irq_stack_end))
+ return (unsigned long)frame->next_frame;
+#endif
+ return bp;
+}
+
/*
* x86-64 can have up to three kernel stacks:
* process stack
@@ -173,7 +202,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
irq_stack = irq_stack_end -
(IRQ_STACK_SIZE - 64) / sizeof(*irq_stack);

- if (stack >= irq_stack && stack < irq_stack_end) {
+ if (in_irq_stack(stack, irq_stack, irq_stack_end)) {
if (ops->stack(data, "IRQ") < 0)
break;
bp = print_context_stack(tinfo, stack, bp,
@@ -184,6 +213,8 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
* pointer (index -1 to end) in the IRQ stack:
*/
stack = (unsigned long *) (irq_stack_end[-1]);
+ bp = fixup_bp_irq_link(bp, stack, irq_stack,
+ irq_stack_end);
irq_stack_end = NULL;
ops->stack(data, "EOI");
continue;
--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
struct perf_event::event callback was called when a breakpoint
triggers. But this is a rather opaque callback, pretty
tied-only to the breakpoint API and not really integrated into perf
as it triggers even when we don't overflow.

We prefer to use overflow_handler() as it fits into the perf events
rules, being called only when we overflow.

Reported-by: Peter Zijlstra <pet...@infradead.org>
Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>


Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>

---
arch/x86/kernel/hw_breakpoint.c | 5 ++---
arch/x86/kernel/ptrace.c | 9 ++++++---
include/linux/hw_breakpoint.h | 25 +++++++++++--------------
include/linux/perf_event.h | 13 +++++++------
kernel/hw_breakpoint.c | 17 +++++------------
kernel/perf_event.c | 24 +++++++++---------------
kernel/trace/trace_ksym.c | 5 +++--
samples/hw_breakpoint/data_breakpoint.c | 7 +++++--
8 files changed, 48 insertions(+), 57 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index d42f65a..05d5fec 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -362,8 +362,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp,
return ret;
}

- if (bp->callback)
- ret = arch_store_info(bp);
+ ret = arch_store_info(bp);

if (ret < 0)
return ret;
@@ -519,7 +518,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
break;
}

- (bp->callback)(bp, args->regs);
+ perf_bp_event(bp, args->regs);

rcu_read_unlock();
}
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index dbb3955..b361d28 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -555,7 +555,9 @@ static int genregs_set(struct task_struct *target,
return ret;
}

-static void ptrace_triggered(struct perf_event *bp, void *data)
+static void ptrace_triggered(struct perf_event *bp, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
{
int i;
struct thread_struct *thread = &(current->thread);
@@ -599,7 +601,7 @@ ptrace_modify_breakpoint(struct perf_event *bp, int len, int type,
{
int err;
int gen_len, gen_type;
- DEFINE_BREAKPOINT_ATTR(attr);
+ struct perf_event_attr attr;

/*
* We shoud have at least an inactive breakpoint at this
@@ -721,9 +723,10 @@ static int ptrace_set_breakpoint_addr(struct task_struct *tsk, int nr,
{
struct perf_event *bp;
struct thread_struct *t = &tsk->thread;
- DEFINE_BREAKPOINT_ATTR(attr);
+ struct perf_event_attr attr;

if (!t->ptrace_bps[nr]) {
+ hw_breakpoint_init(&attr);
/*
* Put stub len and type to register (reserve) an inactive but
* correct bp
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index d33096e..4d14a38 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -20,19 +20,16 @@ enum {

#ifdef CONFIG_HAVE_HW_BREAKPOINT

-/* As it's for in-kernel or ptrace use, we want it to be pinned */
-#define DEFINE_BREAKPOINT_ATTR(name) \
-struct perf_event_attr name = { \
- .type = PERF_TYPE_BREAKPOINT, \
- .size = sizeof(name), \
- .pinned = 1, \
-};
-
static inline void hw_breakpoint_init(struct perf_event_attr *attr)
{
attr->type = PERF_TYPE_BREAKPOINT;
attr->size = sizeof(*attr);
+ /*
+ * As it's for in-kernel or ptrace use, we want it to be pinned
+ * and to call its callback every hits.
+ */
attr->pinned = 1;
+ attr->sample_period = 1;
}

static inline unsigned long hw_breakpoint_addr(struct perf_event *bp)
@@ -52,7 +49,7 @@ static inline int hw_breakpoint_len(struct perf_event *bp)

extern struct perf_event *
register_user_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered,
+ perf_overflow_handler_t triggered,
struct task_struct *tsk);



/* FIXME: only change from the attr, and don't unregister */

@@ -64,12 +61,12 @@ modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr);


*/
extern struct perf_event *

register_wide_hw_breakpoint_cpu(struct perf_event_attr *attr,
- perf_callback_t triggered,
+ perf_overflow_handler_t triggered,
int cpu);

extern struct perf_event **
register_wide_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered);
+ perf_overflow_handler_t triggered);

extern int register_perf_hw_breakpoint(struct perf_event *bp);
extern int __register_perf_hw_breakpoint(struct perf_event *bp);
@@ -90,18 +87,18 @@ static inline struct arch_hw_breakpoint *counter_arch_bp(struct perf_event *bp)



static inline struct perf_event *

register_user_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered,
+ perf_overflow_handler_t triggered,


struct task_struct *tsk) { return NULL; }
static inline struct perf_event *
modify_user_hw_breakpoint(struct perf_event *bp,

struct perf_event_attr *attr) { return NULL; }
static inline struct perf_event *
register_wide_hw_breakpoint_cpu(struct perf_event_attr *attr,

- perf_callback_t triggered,
+ perf_overflow_handler_t triggered,
int cpu) { return NULL; }
static inline struct perf_event **
register_wide_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered) { return NULL; }
+ perf_overflow_handler_t triggered) { return NULL; }
static inline int
register_perf_hw_breakpoint(struct perf_event *bp) { return -ENOSYS; }
static inline int
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 84bd28a..d2f2667 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -565,10 +565,13 @@ struct perf_pending_entry {
void (*func)(struct perf_pending_entry *);


};

-typedef void (*perf_callback_t)(struct perf_event *, void *);

-
struct perf_sample_data;

+typedef void (*perf_callback_t)(struct perf_event *, void *);
+typedef void (*perf_overflow_handler_t)(struct perf_event *, int,
+ struct perf_sample_data *,
+ struct pt_regs *regs);
+
/**
* struct perf_event - performance event kernel representation:
*/
@@ -660,9 +663,7 @@ struct perf_event {
struct pid_namespace *ns;
u64 id;

- void (*overflow_handler)(struct perf_event *event,
- int nmi, struct perf_sample_data *data,
- struct pt_regs *regs);
+ perf_overflow_handler_t overflow_handler;

#ifdef CONFIG_EVENT_PROFILE
struct event_filter *filter;
@@ -779,7 +780,7 @@ extern struct perf_event *
perf_event_create_kernel_counter(struct perf_event_attr *attr,
int cpu,
pid_t pid,
- perf_callback_t callback);
+ perf_overflow_handler_t callback);
extern u64 perf_event_read_value(struct perf_event *event,
u64 *enabled, u64 *running);

diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index 2d10b01..b600fc2 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -259,7 +259,7 @@ void release_bp_slot(struct perf_event *bp)
}


-int __register_perf_hw_breakpoint(struct perf_event *bp)
+int register_perf_hw_breakpoint(struct perf_event *bp)
{
int ret;

@@ -276,19 +276,12 @@ int __register_perf_hw_breakpoint(struct perf_event *bp)
* This is a quick hack that will be removed soon, once we remove
* the tmp breakpoints from ptrace
*/
- if (!bp->attr.disabled || bp->callback == perf_bp_event)
+ if (!bp->attr.disabled || !bp->overflow_handler)
ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task);

return ret;
}

-int register_perf_hw_breakpoint(struct perf_event *bp)
-{
- bp->callback = perf_bp_event;
-
- return __register_perf_hw_breakpoint(bp);
-}
-
/**
* register_user_hw_breakpoint - register a hardware breakpoint for user space
* @attr: breakpoint attributes
@@ -297,7 +290,7 @@ int register_perf_hw_breakpoint(struct perf_event *bp)
*/
struct perf_event *
register_user_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered,
+ perf_overflow_handler_t triggered,
struct task_struct *tsk)
{


return perf_event_create_kernel_counter(attr, -1, tsk->pid, triggered);

@@ -322,7 +315,7 @@ modify_user_hw_breakpoint(struct perf_event *bp, struct perf_event_attr *attr)
unregister_hw_breakpoint(bp);



return perf_event_create_kernel_counter(attr, -1, bp->ctx->task->pid,

- bp->callback);
+ bp->overflow_handler);
}
EXPORT_SYMBOL_GPL(modify_user_hw_breakpoint);

@@ -347,7 +340,7 @@ EXPORT_SYMBOL_GPL(unregister_hw_breakpoint);
*/
struct perf_event **
register_wide_hw_breakpoint(struct perf_event_attr *attr,
- perf_callback_t triggered)
+ perf_overflow_handler_t triggered)
{
struct perf_event **cpu_events, **pevent, *bp;
long err;
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 6b7ddba..fd43ff4 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4286,15 +4286,8 @@ static void bp_perf_event_destroy(struct perf_event *event)
static const struct pmu *bp_perf_event_init(struct perf_event *bp)
{
int err;
- /*
- * The breakpoint is already filled if we haven't created the counter
- * through perf syscall
- * FIXME: manage to get trigerred to NULL if it comes from syscalls
- */
- if (!bp->callback)
- err = register_perf_hw_breakpoint(bp);
- else
- err = __register_perf_hw_breakpoint(bp);
+
+ err = register_perf_hw_breakpoint(bp);
if (err)
return ERR_PTR(err);

@@ -4390,7 +4383,7 @@ perf_event_alloc(struct perf_event_attr *attr,
struct perf_event_context *ctx,
struct perf_event *group_leader,
struct perf_event *parent_event,
- perf_callback_t callback,
+ perf_overflow_handler_t overflow_handler,
gfp_t gfpflags)
{
const struct pmu *pmu;
@@ -4433,10 +4426,10 @@ perf_event_alloc(struct perf_event_attr *attr,

event->state = PERF_EVENT_STATE_INACTIVE;

- if (!callback && parent_event)
- callback = parent_event->callback;
+ if (!overflow_handler && parent_event)
+ overflow_handler = parent_event->overflow_handler;

- event->callback = callback;
+ event->overflow_handler = overflow_handler;

if (attr->disabled)
event->state = PERF_EVENT_STATE_OFF;
@@ -4776,7 +4769,8 @@ err_put_context:
*/
struct perf_event *
perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
- pid_t pid, perf_callback_t callback)
+ pid_t pid,
+ perf_overflow_handler_t overflow_handler)
{
struct perf_event *event;
struct perf_event_context *ctx;
@@ -4793,7 +4787,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
}

event = perf_event_alloc(attr, cpu, ctx, NULL,
- NULL, callback, GFP_KERNEL);
+ NULL, overflow_handler, GFP_KERNEL);
if (IS_ERR(event)) {
err = PTR_ERR(event);
goto err_put_context;
diff --git a/kernel/trace/trace_ksym.c b/kernel/trace/trace_ksym.c
index ddfa0fd..acb87d4 100644
--- a/kernel/trace/trace_ksym.c
+++ b/kernel/trace/trace_ksym.c
@@ -79,11 +79,12 @@ void ksym_collect_stats(unsigned long hbp_hit_addr)
}
#endif /* CONFIG_PROFILE_KSYM_TRACER */

-void ksym_hbp_handler(struct perf_event *hbp, void *data)
+void ksym_hbp_handler(struct perf_event *hbp, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
{
struct ring_buffer_event *event;
struct ksym_trace_entry *entry;
- struct pt_regs *regs = data;
struct ring_buffer *buffer;
int pc;

diff --git a/samples/hw_breakpoint/data_breakpoint.c b/samples/hw_breakpoint/data_breakpoint.c
index 2952550..c69cbe9 100644
--- a/samples/hw_breakpoint/data_breakpoint.c
+++ b/samples/hw_breakpoint/data_breakpoint.c
@@ -41,7 +41,9 @@ module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
" write operations on the kernel symbol");

-static void sample_hbp_handler(struct perf_event *temp, void *data)
+static void sample_hbp_handler(struct perf_event *bp, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
{
printk(KERN_INFO "%s value is changed\n", ksym_name);
dump_stack();
@@ -51,8 +53,9 @@ static void sample_hbp_handler(struct perf_event *temp, void *data)
static int __init hw_break_module_init(void)
{
int ret;
- DEFINE_BREAKPOINT_ATTR(attr);
+ struct perf_event_attr attr;

+ hw_breakpoint_init(&attr);
attr.bp_addr = kallsyms_lookup_name(ksym_name);
attr.bp_len = HW_BREAKPOINT_LEN_4;
attr.bp_type = HW_BREAKPOINT_W | HW_BREAKPOINT_R;
--
1.6.2.3

Frederic Weisbecker

unread,
Dec 6, 2009, 2:40:02 AM12/6/09
to
This field might result from an older manual rebasing mistake.
We don't use it.

Reported-by: Peter Zijlstra <pet...@infradead.org>
Signed-off-by: Frederic Weisbecker <fwei...@gmail.com>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: "K. Prasad" <pra...@linux.vnet.ibm.com>
---

include/linux/perf_event.h | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53230e9..84bd28a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -670,8 +670,6 @@ struct perf_event {

perf_callback_t callback;

- perf_callback_t event_callback;


-
#endif /* CONFIG_PERF_EVENTS */
};

--
1.6.2.3

--

Ingo Molnar

unread,
Dec 6, 2009, 3:00:02 AM12/6/09
to

* Frederic Weisbecker <fwei...@gmail.com> wrote:

> Ingo,
>
> Please pull the perf/core branch that can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> perf/core
>
> I have hesitations about the 2 last patches as they should probably
> go into another branch. Their scope is wider than perf as they also
> fix x86-64 stacktraces in general.

Pulled, thanks a lot Frederic! These are some very nice fixes - and the
call-graph frame linking fixes will improve regular stacktraces as well.

> Tell me if you'd prefer them in a separate tree based on something
> else.

perf/core is fine. Since Linus pulled them upstream earlier today these
bits will end up in perf/urgent, please base future tress on that topic
tree.

Thanks,

Ingo

Peter Zijlstra

unread,
Dec 6, 2009, 5:50:02 AM12/6/09
to

So I'm a bit puzzled by the need for
- that structure to begin with
- specialized __bp reserves

Furthermore, you still got the packing wrong, leading to different
structure layout on 32 and 64 bit platforms,..

How about?

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 89098e3..5595154 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -215,17 +215,12 @@ struct perf_event_attr {


__u32 wakeup_watermark; /* bytes before wakeup */
};

- struct { /* Hardware breakpoint info */
- __u64 bp_addr;
- __u32 bp_type;
- __u32 bp_len;
- __u64 __bp_reserved_1;
- __u64 __bp_reserved_2;
- };

-
__u32 __reserved_2;

- __u64 __reserved_3;
+ /* Hardware breakpoint info */


+ __u64 bp_addr;
+ __u32 bp_type;
+ __u32 bp_len;

};

/*

Peter Zijlstra

unread,
Dec 6, 2009, 5:50:02 AM12/6/09
to
On Sun, 2009-12-06 at 08:34 +0100, Frederic Weisbecker wrote:
> As it is not used anymore and has been superseded by overflow_handler.

Thanks!

> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index d2f2667..89098e3 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -567,7 +567,6 @@ struct perf_pending_entry {
>
> struct perf_sample_data;
>
> -typedef void (*perf_callback_t)(struct perf_event *, void *);
> typedef void (*perf_overflow_handler_t)(struct perf_event *, int,
> struct perf_sample_data *,
> struct pt_regs *regs);
> @@ -669,8 +668,6 @@ struct perf_event {
> struct event_filter *filter;
> #endif
>
> - perf_callback_t callback;
> -
> #endif /* CONFIG_PERF_EVENTS */
> };
>


--

Frederic Weisbecker

unread,
Dec 6, 2009, 8:50:01 PM12/6/09
to

It has no practical use. It's just a logical separation
that makes it easier to review.

I won't mind much if you prefer to remove it.


> - specialized __bp reserves


Because we'll probably have further new needs in the future
in the breakpoint fields. But well, these can map to the
current reserved fields already.

> Furthermore, you still got the packing wrong, leading to different
> structure layout on 32 and 64 bit platforms,..
>
> How about?
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 89098e3..5595154 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -215,17 +215,12 @@ struct perf_event_attr {
> __u32 wakeup_watermark; /* bytes before wakeup */
> };
>
> - struct { /* Hardware breakpoint info */
> - __u64 bp_addr;
> - __u32 bp_type;
> - __u32 bp_len;
> - __u64 __bp_reserved_1;
> - __u64 __bp_reserved_2;
> - };
> -
> __u32 __reserved_2;
>
> - __u64 __reserved_3;
> + /* Hardware breakpoint info */
> + __u64 bp_addr;
> + __u32 bp_type;
> + __u32 bp_len;
> };

Right this fixes the packing layout, but what if
we need other fields for the breakpoints in the future?

Thanks.

0 new messages