Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH v3 0/2] perf/sdt: Directly record SDT events with 'perf record'

148 views
Skip to first unread message

Ravi Bangoria

unread,
Feb 24, 2017, 2:50:05 AM2/24/17
to
All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But, the flow is little different for SDT events.
Probe point for SDT event needs to be created using 'perf probe' before
recording it using 'perf record'.

As suggested by Ingo[1], it's better to make this process simple by
creating probe point automatically with 'perf record' for SDT events.
Same effort was done by Hemant some time ago[2]. This patch series
is based on work he has done.

Changes in v3:
- Rebased to current acme/perf/core

- v2 was having build failure if LIB_ELF support is missing. Solved
it in this series.

- Event starting with 'sdt_' will be considered as SDT event. v2
was forcing to prepend SDT event name with '%' for 'perf record'
but not for 'perf probe'. This change was suggested by Brendan/
Masami in one of the review comment in v1.[3]

- There was one more problem with v2. If internal 'perf probe' fails
to lookup SDT event in cache because of too big list (-E2BIG),
'perf record' starts recording cycles event instead of showing
warning/error. This can totally confuse user. I've resolved it. ex:

With v2:
$ perf record -a -e sdt_qemu:*
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.246 MB perf.data (1523 samples) ]

$ perf evlist
cpu-clock

With this patch:
$ perf record -a -e sdt_qemu:*
event syntax error: 'sdt_qemu:*'
\___ Cache lookup failed

- This patch series still allows user to place a probe manually with
'perf probe' but also shows a hint:

$ perf probe sdt_libpthread:mutex_entry
...
Hint: SDT event can be directly recorded with 'perf record'. No need to create probe manually.

- v2 was always trying to look for sdt event in probe cache by ignoring
entries of uprobe_events. Hence, it creates new probe points for event
even if it already exists. This was confusing from user's point of
view.

With v2:
$ perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

$ perf record -a -e %sdt_libpthread:mutex_entry
Warning : Recording on 2 occurrences of sdt_libpthread:mutex_entry

$ perf evlist
sdt_libpthread:mutex_entry_3
sdt_libpthread:mutex_entry_2

There are two issues:
1st, It confuses user about new names.
2nd, (and more important) Perf won't allow you to record
'sdt_libpthread:mutex_entry_1' with 'perf record' even if it
exists in uprobe_events, because it won't find event with that
name in probe cache.

I've solved these issues. This patch gives first priority to existing
entries of uprobe_events file. After that, it looks into probe cache.
For ex,

With this patch:
$ perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ perf record -a -e sdt_libpthread:mutex_entry_1
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

It also lists matching entries as 'name addr@filename' followed by the
hint about how to delete them with 'perf probe -d'.

If there is no matching entry found in uprobe_events with that *name*,
perf will find all events with that name from probe cache. Once again
it checks whether these events exists in uprobe_events, but this time
it uses *address* and *filename* instead of eventname. If entry found,
it reuse that entry instead of creating new one. In continuation of
above example,

With this patch:
$ perf probe -d sdt_libpthread:mutex_entry
Removed event: sdt_libpthread:mutex_entry

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

$ perf record -a -e sdt_libpthread:*
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

Warning: Recording on 35 occurrences of sdt_libpthread:*

$ perf evlist
sdt_libpthread:mutex_entry
sdt_libpthread:pthread_create
sdt_libpthread:mutex_entry_1
...

Here, Perf has reused entry for event with 0xbcbb address, but also it
has created new entry for event with 0x9ddb address. It also maintains
list of entries created for particular record session, and uses that
list to remove entries at the end of session.

Finally, If somehow tool fails to clean events from uprobe_events at
the end of session, user has to clean events manually with
'perf probe -d'. But perf will give Warning in such case. For ex,

$ perf record -a -e sdt_libpthread:mutex_entry
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_entry
/** Fails with segfault **/

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

When next time user tries to record, it will show a warning:

$ perf record -a -e sdt_libpthread:mutex_entry
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry 0x9ddb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
Since probe point already exists with this name, recording only 1 event.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry*' to allow record on all events.

But no such warning for 'sdt_libpthread:mutex_entry_1'.

$ perf record -a -e sdt_libpthread:mutex_entry_1
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).


[1] https://lkml.org/lkml/2017/2/7/59
[2] https://lkml.org/lkml/2016/5/3/810
[3] https://lkml.org/lkml/2016/5/2/689


Hemant Kumar (1):
perf/sdt: Directly record SDT events with 'perf record'

Ravi Bangoria (1):
perf/sdt: Introduce util func is_sdt_event()

tools/lib/api/fs/tracing_path.c | 17 +-
tools/perf/builtin-probe.c | 21 ++-
tools/perf/builtin-record.c | 23 +++
tools/perf/perf.h | 2 +
tools/perf/util/parse-events.c | 56 +++++-
tools/perf/util/parse-events.h | 2 +
tools/perf/util/probe-event.c | 53 +++++-
tools/perf/util/probe-event.h | 4 +
tools/perf/util/probe-file.c | 376 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/probe-file.h | 27 +++
tools/perf/util/util.c | 12 ++
tools/perf/util/util.h | 2 +
12 files changed, 567 insertions(+), 28 deletions(-)

--
2.9.3

Ravi Bangoria

unread,
Feb 24, 2017, 2:50:07 AM2/24/17
to
From: Hemant Kumar <hem...@linux.vnet.ibm.com>

Add support for directly recording SDT events which are present in
the probe cache. Without this patch, we could probe into SDT events
using 'perf probe' and 'perf record'. With this patch, we can probe
the SDT events directly using 'perf record'.

For example :

$ perf list sdt
sdt_libpthread:mutex_entry [SDT event]
sdt_libc:setjmp [SDT event]
...

$ perf record -a -e sdt_libc:setjmp
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.286 MB perf.data (1065 samples) ]

$ perf script
bash 803 [002] 6492.190311: sdt_libc:setjmp: (7f1d503b56a1)
login 488 [001] 6496.791583: sdt_libc:setjmp: (7ff3013d56a1)
fprintd 11038 [003] 6496.808032: sdt_libc:setjmp: (7fdedf5936a1)

Recording on SDT events with same provider and marker names is also
supported:

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, Base: 0x00000000000139cc, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, Base: 0x00000000000139cc, ...

$ perf record -a -e sdt_libpthread:mutex_entry
Info: Recording on 2 occurrences of sdt_libpthread:mutex_entry
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.197 MB perf.data (31 samples) ]

$ perf script
in:imjournal 442 [000] 6625.701833: sdt_libpthread:mutex_entry: (7fb1a1940ddb)
rs:main Q:Reg 443 [001] 6625.701889: sdt_libpthread:mutex_entry_1: (7fb1a1942cbb)


After invoking 'perf record', behind the scenes, it checks whether
the event specified is an SDT event by using the flag '%' or string
'sdt_'. After that, it first checks whether event already exists
with that *name* in uprobe_events file. If found, it records that
particular event. Otherwise, it does a lookup of the probe cache to
find out the SDT event. If its not present, it throws an error.
If found, it again tries to find existing events from uprobe_events
file, but this time it uses *address* and *filename* for comparison.
Finally it writes new events into the uprobe_events file and starts
recording. It also maintains a list of the event names that were
written to uprobe_events file for this session. After finishing the
record session, it removes the events from the uprobe_events file
using the maintained name list.

Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/lib/api/fs/tracing_path.c | 17 +-
tools/perf/builtin-probe.c | 21 ++-
tools/perf/builtin-record.c | 23 +++
tools/perf/perf.h | 2 +
tools/perf/util/parse-events.c | 56 +++++-
tools/perf/util/parse-events.h | 2 +
tools/perf/util/probe-event.c | 44 ++++-
tools/perf/util/probe-event.h | 4 +
tools/perf/util/probe-file.c | 376 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/probe-file.h | 27 +++
10 files changed, 552 insertions(+), 20 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 3e606b9..fa52e67 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -103,19 +103,10 @@ int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
* - jirka
*/
if (debugfs__configured() || tracefs__configured()) {
- /* sdt markers */
- if (!strncmp(filename, "sdt_", 4)) {
- snprintf(buf, size,
- "Error:\tFile %s/%s not found.\n"
- "Hint:\tSDT event cannot be directly recorded on.\n"
- "\tPlease first use 'perf probe %s:%s' before recording it.\n",
- tracing_events_path, filename, sys, name);
- } else {
- snprintf(buf, size,
- "Error:\tFile %s/%s not found.\n"
- "Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
- tracing_events_path, filename);
- }
+ snprintf(buf, size,
+ "Error:\tFile %s/%s not found.\n"
+ "Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
+ tracing_events_path, filename);
break;
}
snprintf(buf, size, "%s",
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 1fcebc3..d255a6d 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -45,6 +45,7 @@
#define DEFAULT_VAR_FILTER "!__k???tab_* & !__crc_*"
#define DEFAULT_FUNC_FILTER "!_*"
#define DEFAULT_LIST_FILTER "*"
+#define MAX_EVENT_LENGTH 512

/* Session management structure */
static struct {
@@ -312,6 +313,19 @@ static void pr_err_with_code(const char *msg, int err)
pr_err("\n");
}

+/* Show how to use the event. */
+static void record_syntax_hint(const char *group, const char *event)
+{
+ char ge[MAX_EVENT_LENGTH];
+
+ pr_info("\nYou can now use it in all perf tools, such as:\n\n");
+ pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
+
+ snprintf(ge, strlen(group) + strlen(event) + 2, "%s:%s", group, event);
+ if (is_sdt_event(ge))
+ pr_info("Hint: SDT event can be directly recorded with 'perf record'. No need to create probe manually.\n");
+}
+
static int perf_add_probe_events(struct perf_probe_event *pevs, int npevs)
{
int ret;
@@ -356,11 +370,8 @@ static int perf_add_probe_events(struct perf_probe_event *pevs, int npevs)
}

/* Note that it is possible to skip all events because of blacklist */
- if (event) {
- /* Show how to use the event. */
- pr_info("\nYou can now use it in all perf tools, such as:\n\n");
- pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
- }
+ if (event)
+ record_syntax_hint(group, event);

out_cleanup:
cleanup_perf_probe_events(pevs, npevs);
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index bc84a37..fa9b345 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -39,6 +39,7 @@
#include "util/trigger.h"
#include "util/perf-hooks.h"
#include "asm/bug.h"
+#include "util/probe-file.h"

#include <unistd.h>
#include <sched.h>
@@ -73,6 +74,7 @@ struct record {
bool timestamp_filename;
struct switch_output switch_output;
unsigned long long samples;
+ struct list_head sdt_event_list;
};

static volatile int auxtrace_record__snapshot_started;
@@ -1503,6 +1505,25 @@ static struct record record = {
},
};

+void sdt_event_list__add(struct list_head *sdt_event_list)
+{
+ if (list_empty(sdt_event_list))
+ return;
+ list_splice(sdt_event_list, &record.sdt_event_list);
+}
+
+bool is_cmd_record(void)
+{
+ return (record.evlist != NULL);
+}
+
+void sdt_event_list__remove(void)
+{
+#ifdef HAVE_LIBELF_SUPPORT
+ remove_sdt_event_list(&record.sdt_event_list);
+#endif
+}
+
const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
"\n\t\t\t\tDefault: fp";

@@ -1671,6 +1692,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

+ INIT_LIST_HEAD(&rec->sdt_event_list);
err = perf_config(perf_record_config, rec);
if (err)
return err;
@@ -1841,6 +1863,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);
+ sdt_event_list__remove();
return err;
}

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1c27d94..4ad12bb 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -76,4 +76,6 @@ struct record_opts {
struct option;
extern const char * const *record_usage;
extern struct option *record_options;
+
+bool is_cmd_record(void);
#endif
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 67a8aeb..7d1f042 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1726,12 +1726,66 @@ static void parse_events_print_error(struct parse_events_error *err,

#undef MAX_WIDTH

+/* SDT event needs LIBELF support for creating a probe point */
+#ifdef HAVE_LIBELF_SUPPORT
+static int parse_sdt_event(struct perf_evlist *evlist, const char *str,
+ struct parse_events_error *err)
+{
+ char *ptr = NULL;
+ int ret;
+ struct list_head *sdt_evlist;
+ struct sdt_event_list *event;
+
+ if (str[0] == '%')
+ str++;
+
+ ptr = strdup(str);
+ if (ptr == NULL)
+ return -ENOMEM;
+
+ sdt_evlist = zalloc(sizeof(*sdt_evlist));
+ if (!sdt_evlist) {
+ free(ptr);
+ pr_err("Error in sdt_evlist memory allocation\n");
+ return -ENOMEM;
+ }
+ INIT_LIST_HEAD(sdt_evlist);
+
+ /*
+ * If there is an error in this call, no need to free
+ * up sdt_evlist, its already free'ed up in the previous
+ * call. Free up 'ptr' though.
+ */
+ ret = add_sdt_event(ptr, sdt_evlist, err);
+ if (!ret) {
+ list_for_each_entry(event, sdt_evlist, list) {
+ ret = parse_events(evlist, event->event_info, err);
+ if (ret < 0)
+ goto ret;
+ }
+ /* Add it to the record struct */
+ sdt_event_list__add(sdt_evlist);
+ }
+
+ret:
+ free(ptr);
+ return ret;
+}
+#endif /* HAVE_LIBELF_SUPPORT */
+
int parse_events_option(const struct option *opt, const char *str,
int unset __maybe_unused)
{
struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
struct parse_events_error err = { .idx = 0, };
- int ret = parse_events(evlist, str, &err);
+ int ret = 0;
+
+#ifdef HAVE_LIBELF_SUPPORT
+ if (is_sdt_event((char *)str) && is_cmd_record())
+ ret = parse_sdt_event(evlist, str, &err);
+ else
+#endif
+ ret = parse_events(evlist, str, &err);

if (ret)
parse_events_print_error(&err, str);
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1af6a26..205f682 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -196,4 +196,6 @@ int is_valid_tracepoint(const char *event_string);
int valid_event_mount(const char *eventfs);
char *parse_events_formats_error_string(char *additional_terms);

+void sdt_event_list__add(struct list_head *sdt_event_list);
+void sdt_event_list__remove(void);
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 2b1409f..a19b00d 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1293,7 +1293,7 @@ int parse_line_range_desc(const char *arg, struct line_range *lr)
return err;
}

-static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
{
char *ptr;

@@ -3166,6 +3166,12 @@ static int find_cached_events_all(struct perf_probe_event *pev,
return ret;
}

+int find_sdt_events_from_cache(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs)
+{
+ return find_cached_events_all(pev, tevs);
+}
+
static int find_probe_trace_events_from_cache(struct perf_probe_event *pev,
struct probe_trace_event **tevs)
{
@@ -3476,3 +3482,39 @@ int copy_to_probe_trace_arg(struct probe_trace_arg *tvar,
tvar->name = NULL;
return 0;
}
+
+/*
+ * Record session for SDT events has ended. Delete the SDT events
+ * from uprobe_events file that were created initially.
+ */
+void remove_sdt_event_list(struct list_head *sdt_events)
+{
+ struct sdt_event_list *event;
+ struct strfilter *filter = NULL;
+ const char *err = NULL;
+ int ret = 0;
+
+ if (list_empty(sdt_events))
+ return;
+
+ list_for_each_entry(event, sdt_events, list) {
+ if (event->existing_event)
+ continue;
+ if (!filter) {
+ filter = strfilter__new(event->event_info, &err);
+ if (!filter)
+ goto free_list;
+ } else {
+ ret = strfilter__or(filter, event->event_info, &err);
+ }
+ }
+
+ if (filter)
+ ret = del_perf_probe_events(filter);
+
+ if (ret)
+ pr_err("Error in deleting the SDT list\n");
+
+free_list:
+ free_sdt_list(sdt_events);
+}
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 5d4e940..5ec648e 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -182,4 +182,8 @@ struct map *get_target_map(const char *target, bool user);
void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
int ntevs);

+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev);
+
+int find_sdt_events_from_cache(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs);
#endif /*_PROBE_EVENT_H */
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 436b647..c941f38 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -27,8 +27,10 @@
#include "probe-event.h"
#include "probe-file.h"
#include "session.h"
+#include "probe-finder.h"

#define MAX_CMDLEN 256
+#define MAX_EVENT_LENGTH 512

static void print_open_warning(int err, bool uprobe)
{
@@ -933,3 +935,377 @@ bool probe_type_is_available(enum probe_type type)

return ret;
}
+
+void free_sdt_list(struct list_head *sdt_events)
+{
+ struct sdt_event_list *tmp, *ptr;
+
+ if (list_empty(sdt_events))
+ return;
+ list_for_each_entry_safe(tmp, ptr, sdt_events, list) {
+ list_del(&tmp->list);
+ free(tmp->event_info);
+ free(tmp);
+ }
+}
+
+static int alloc_exst_sdt_event(struct exst_sdt_event_list **esl)
+{
+ struct exst_sdt_event_list *tmp;
+
+ tmp = zalloc(sizeof(*tmp));
+ if (!tmp)
+ return -ENOMEM;
+
+ tmp->tev = zalloc(sizeof(struct probe_trace_event));
+ if (!tmp->tev) {
+ free(tmp);
+ return -ENOMEM;
+ }
+
+ tmp->match = false;
+ *esl = tmp;
+ return 0;
+}
+
+static void free_exst_sdt_event(struct exst_sdt_event_list *esl)
+{
+ if (!esl)
+ return;
+
+ if (esl->tev) {
+ free(esl->tev->args);
+ free(esl->tev);
+ }
+
+ free(esl);
+}
+
+static void probe_file__free_exst_sdt_list(struct exst_sdt_event_list *esl)
+{
+ struct list_head *pos, *q;
+ struct exst_sdt_event_list *tmp;
+
+ list_for_each_safe(pos, q, &(esl->list)) {
+ tmp = list_entry(pos, struct exst_sdt_event_list, list);
+ free_exst_sdt_event(tmp);
+ list_del(pos);
+ }
+}
+
+/*
+ * Look into uprobe_events file and prepare list of sdt events
+ * whose probepoint is already registered.
+ */
+static int probe_file__get_exst_sdt_list(struct exst_sdt_event_list *esl)
+{
+ int fd, ret = 0;
+ struct strlist *rawlist;
+ struct str_node *ent;
+ struct exst_sdt_event_list *tmp = NULL;
+
+ fd = probe_file__open(PF_FL_RW | PF_FL_UPROBE);
+ if (fd < 0)
+ return fd;
+
+ rawlist = probe_file__get_rawlist(fd);
+
+ strlist__for_each_entry(ent, rawlist) {
+ ret = alloc_exst_sdt_event(&tmp);
+ if (ret < 0)
+ goto error;
+
+ ret = parse_probe_trace_command(ent->s, tmp->tev);
+ if (ret < 0) {
+ free_exst_sdt_event(tmp);
+ goto error;
+ }
+
+ if (!strncmp(tmp->tev->group, "sdt_", 4))
+ list_add_tail(&(tmp->list), &(esl->list));
+ else
+ free_exst_sdt_event(tmp);
+ }
+ return 0;
+
+error:
+ probe_file__free_exst_sdt_list(esl);
+ return ret;
+}
+
+/*
+ * Remove ith tev from pev->tevs list and shift remaining
+ * tevs(i+1 to pev->ntevs) one step.
+ */
+static void shift_pev(struct perf_probe_event *pev, int i)
+{
+ int j;
+
+ free(pev->tevs[i].event);
+ free(pev->tevs[i].group);
+ free(pev->tevs[i].args);
+ free(pev->tevs[i].point.realname);
+ free(pev->tevs[i].point.symbol);
+ free(pev->tevs[i].point.module);
+
+ /*
+ * If ith element is last element, no need to shift,
+ * just decrement pev->ntevs.
+ */
+ if (i == pev->ntevs - 1)
+ goto ret;
+
+ for (j = i; j < pev->ntevs - 1; j++) {
+ pev->tevs[j].event = pev->tevs[j + 1].event;
+ pev->tevs[j].group = pev->tevs[j + 1].group;
+ pev->tevs[j].nargs = pev->tevs[j + 1].nargs;
+ pev->tevs[j].uprobes = pev->tevs[j + 1].uprobes;
+ pev->tevs[j].args = pev->tevs[j + 1].args;
+ pev->tevs[j].point.realname = pev->tevs[j + 1].point.realname;
+ pev->tevs[j].point.symbol = pev->tevs[j + 1].point.symbol;
+ pev->tevs[j].point.module = pev->tevs[j + 1].point.module;
+ pev->tevs[j].point.offset = pev->tevs[j + 1].point.offset;
+ pev->tevs[j].point.address = pev->tevs[j + 1].point.address;
+ pev->tevs[j].point.retprobe = pev->tevs[j + 1].point.retprobe;
+ }
+
+ret:
+ pev->ntevs--;
+}
+
+/* Compare address and filename */
+static bool is_sdt_match(struct probe_trace_event *tev1,
+ struct probe_trace_event *tev2)
+{
+ return (tev1->point.address == tev2->point.address &&
+ !(strcmp(tev1->point.module, tev2->point.module)));
+}
+
+/*
+ * Filter out all those pev->tevs which already exists in uprobe_events.
+ * Return 'true' if any matching entry found otherwise return 'false'.
+ */
+static bool filter_exst_sdt_events_from_pev(struct perf_probe_event *pev,
+ struct exst_sdt_event_list *esl)
+{
+ int i;
+ bool ret = false;
+ struct exst_sdt_event_list *tmp;
+
+ list_for_each_entry(tmp, &(esl->list), list) {
+ for (i = 0; i < pev->ntevs; i++) {
+ if (is_sdt_match(&(pev->tevs[i]), tmp->tev)) {
+ shift_pev(pev, i);
+ tmp->match = true;
+ ret = true;
+ }
+ }
+ }
+
+ return ret;
+}
+
+static int list_add_sdt_event(struct list_head *sdt_events,
+ bool existing_event,
+ struct probe_trace_event *tev)
+{
+ struct sdt_event_list *tmp;
+
+ tmp = zalloc(sizeof(*tmp));
+ if (!tmp)
+ return -ENOMEM;
+
+ tmp->existing_event = existing_event;
+
+ INIT_LIST_HEAD(&tmp->list);
+ tmp->event_info = zalloc(MAX_EVENT_LENGTH * sizeof(char));
+ if (!tmp->event_info) {
+ free(tmp);
+ return -ENOMEM;
+ }
+
+ snprintf(tmp->event_info, strlen(tev->group) + strlen(tev->event) + 2,
+ "%s:%s", tev->group, tev->event);
+
+ list_add(&tmp->list, sdt_events);
+
+ return 0;
+}
+
+static void print_exst_sdt_events(struct exst_sdt_event_list *tmp)
+{
+ static bool msg_head;
+
+ if (!msg_head) {
+ pr_info("Matching event(s) from uprobe_events:\n");
+ msg_head = true;
+ }
+
+ pr_info(" %s:%s 0x%" PRIx64 "@%s\n", tmp->tev->group,
+ tmp->tev->event, tmp->tev->point.address,
+ tmp->tev->point.module);
+}
+
+static void print_exst_sdt_event_footer(void)
+{
+ pr_info("Use 'perf probe -d <event>' to delete event(s).\n\n");
+}
+
+/*
+ * If there is entry with the same name in uprobe_events, record it.
+ * Return value 0: no error, not found
+ * <0: error
+ * >0: found
+ */
+static int probe_file__add_exst_sdt_event(struct exst_sdt_event_list *esl,
+ struct list_head *sdt_events,
+ struct perf_probe_event *pev)
+{
+ struct exst_sdt_event_list *tmp;
+ int ret = 0;
+
+ list_for_each_entry(tmp, &(esl->list), list) {
+ if (strcmp(tmp->tev->group, pev->group) ||
+ strcmp(tmp->tev->event, pev->event))
+ continue;
+
+ tmp->match = true;
+
+ ret = list_add_sdt_event(sdt_events, true, tmp->tev);
+ if (ret < 0)
+ return ret;
+
+ print_exst_sdt_events(tmp);
+ print_exst_sdt_event_footer();
+ if (pev->ntevs > 1)
+ pr_warning("Warning: Found %d events from probe-cache with name '%s:%s'.\n"
+ "\t Since probe point already exists with this name, recording only 1 event.\n"
+ "Hint: Please use 'perf probe -d %s:%s*' to allow record on all events.\n\n",
+ pev->ntevs, pev->group, pev->event, pev->group, pev->event);
+ return 1;
+ }
+ return 0;
+}
+
+int add_sdt_event(char *event, struct list_head *sdt_events,
+ struct parse_events_error *err)
+{
+ struct perf_probe_event *pev;
+ int ret = 0, i, ctr = 0, found = 0, exst_ctr = 0;
+ char *str;
+ struct exst_sdt_event_list *tmp;
+ struct exst_sdt_event_list esl;
+ bool filter;
+
+ pev = zalloc(sizeof(*pev));
+ if (!pev)
+ return -ENOMEM;
+
+ pev->sdt = true;
+ pev->uprobes = true;
+
+ str = strdup(event);
+ if (!str)
+ return -ENOMEM;
+
+ /*
+ * Parse event to find the group name and event name of
+ * the sdt event.
+ */
+ ret = parse_perf_probe_event_name(&event, pev);
+ if (ret) {
+ pr_err("Error in parsing sdt event %s\n", str);
+ free(pev);
+ goto free_str;
+ }
+
+ probe_conf.max_probes = MAX_PROBES;
+ probe_conf.force_add = 1;
+
+ /*
+ * Find the sdt event from the cache. We deliberately check failure
+ * of this function after checking entries in uprobe_events. Because,
+ * we may find matching entry from uprobe_events and in that case we
+ * should continue recording that event.
+ */
+ pev->ntevs = find_sdt_events_from_cache(pev, &pev->tevs);
+
+ /* Prepare list of existing sdt events from uprobe_events */
+ INIT_LIST_HEAD(&esl.list);
+ ret = probe_file__get_exst_sdt_list(&esl);
+ if (ret < 0)
+ goto free_str;
+
+ /* If there is entry with the same name in uprobe_events, record it. */
+ found = probe_file__add_exst_sdt_event(&esl, sdt_events, pev);
+ if (found) {
+ ret = (found > 0) ? 0 : found;
+ goto free_str;
+ }
+
+ /* Reaching here means no matching entry found in uprobe_events. */
+ filter = filter_exst_sdt_events_from_pev(pev, &esl);
+ if (!filter && pev->ntevs == 0) {
+ pr_err("%s:%s not found in the cache\n", pev->group,
+ pev->event);
+ ret = -EINVAL;
+ goto free_str;
+ } else if (pev->ntevs < 0) {
+ err->str = strdup("Cache lookup failed.\n");
+ ret = pev->ntevs;
+ goto free_str;
+ }
+
+ /* Create probe points for new events. */
+ ret = apply_perf_probe_events(pev, 1);
+ if (ret) {
+ pr_err("Error in adding SDT event : %s\n", event);
+ goto free_str;
+ }
+
+ /* Add existing event names to "sdt_events" list */
+ list_for_each_entry(tmp, &(esl.list), list) {
+ if (!tmp->match)
+ continue;
+
+ ret = list_add_sdt_event(sdt_events, true, tmp->tev);
+ if (ret < 0)
+ goto free_str;
+ print_exst_sdt_events(tmp);
+ ctr++;
+ exst_ctr++;
+ }
+ if (exst_ctr)
+ print_exst_sdt_event_footer();
+
+ /* Add new event names to "sdt_events" list */
+ for (i = 0; i < pev->ntevs; i++) {
+ ret = list_add_sdt_event(sdt_events, false, &(pev->tevs[i]));
+ if (ret < 0)
+ goto free_str;
+
+ ctr++;
+ }
+
+ /* Print warning for multiple events */
+ if (ctr > 1)
+ pr_warning("Warning: Recording on %d occurrences of %s:%s\n",
+ ctr, pev->group, pev->event);
+
+free_str:
+ /*
+ * User may ask for multiple events in the same record command like,
+ * perf record -a -e sdt_lib1:* -e sdt_a:b
+ *
+ * If sdt_lib1:* events are already added and there is some failure
+ * for sdt_a:b, we need to clean sdt_lib1:* events from
+ * record.sdt_event_list
+ */
+ if (ret < 0)
+ sdt_event_list__remove();
+
+ free(str);
+ probe_file__free_exst_sdt_list(&esl);
+ cleanup_perf_probe_events(pev, 1);
+ return ret;
+}
diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
index eba44c3..e785eb6 100644
--- a/tools/perf/util/probe-file.h
+++ b/tools/perf/util/probe-file.h
@@ -4,6 +4,7 @@
#include "strlist.h"
#include "strfilter.h"
#include "probe-event.h"
+#include "parse-events.h"

/* Cache of probe definitions */
struct probe_cache_entry {
@@ -19,6 +20,28 @@ struct probe_cache {
struct list_head entries;
};

+struct sdt_event_list {
+ char *event_info;
+ struct list_head list;
+
+ /*
+ * Flag to check whether this event already exists in urobe_events
+ * file. This helps at a time of freeing sdt_event_list.
+ */
+ bool existing_event;
+};
+
+/*
+ * This maintains list of sdt events which are already exists in
+ * uprobe_events file at the time of creating probe point for any
+ * sdt event with 'perf record'.
+ */
+struct exst_sdt_event_list {
+ struct probe_trace_event *tev;
+ struct list_head list;
+ bool match; /* Does this event match with any new event */
+};
+
enum probe_type {
PROBE_TYPE_U = 0,
PROBE_TYPE_S,
@@ -64,6 +87,10 @@ struct probe_cache_entry *probe_cache__find_by_name(struct probe_cache *pcache,
const char *group, const char *event);
int probe_cache__show_all_caches(struct strfilter *filter);
bool probe_type_is_available(enum probe_type type);
+int add_sdt_event(char *event, struct list_head *sdt_events,
+ struct parse_events_error *err);
+void remove_sdt_event_list(struct list_head *sdt_event_list);
+void free_sdt_list(struct list_head *sdt_events);
#else /* ! HAVE_LIBELF_SUPPORT */
static inline struct probe_cache *probe_cache__new(const char *tgt __maybe_unused)
{
--
2.9.3

Ravi Bangoria

unread,
Feb 24, 2017, 2:50:07 AM2/24/17
to
No Functionality changes.

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/probe-event.c | 9 +--------
tools/perf/util/util.c | 12 ++++++++++++
tools/perf/util/util.h | 2 ++
3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 28fb62c..2b1409f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1339,14 +1339,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (!arg)
return -EINVAL;

- /*
- * If the probe point starts with '%',
- * or starts with "sdt_" and has a ':' but no '=',
- * then it should be a SDT/cached probe point.
- */
- if (arg[0] == '%' ||
- (!strncmp(arg, "sdt_", 4) &&
- !!strchr(arg, ':') && !strchr(arg, '='))) {
+ if (is_sdt_event(arg)) {
pev->sdt = true;
if (arg[0] == '%')
arg++;
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index d8b45ce..b827428 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -802,3 +802,15 @@ int unit_number__scnprintf(char *buf, size_t size, u64 n)

return scnprintf(buf, size, "%" PRIu64 "%c", n, unit[i]);
}
+
+/*
+ * If the probe point starts with '%',
+ * or starts with "sdt_" and has a ':' but no '=',
+ * then it should be a SDT/cached probe point.
+ */
+bool is_sdt_event(char *str)
+{
+ return (str[0] == '%' ||
+ (!strncmp(str, "sdt_", 4) &&
+ !!strchr(str, ':') && !strchr(str, '=')));
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index c74708d..ee599dc 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -364,4 +364,6 @@ int is_printable_array(char *p, unsigned int len);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);

int unit_number__scnprintf(char *buf, size_t size, u64 n);
+
+bool is_sdt_event(char *str);
#endif /* GIT_COMPAT_UTIL_H */
--
2.9.3

Ravi Bangoria

unread,
Feb 27, 2017, 12:00:05 AM2/27/17
to
Looks like I've older address of Masami(masami.hi...@hitachi.com) in my database.

Adding To: mhir...@kernel.org

-Ravi

Masami Hiramatsu

unread,
Feb 27, 2017, 8:40:05 PM2/27/17
to
On Fri, 24 Feb 2017 13:13:24 +0530
Ravi Bangoria <ravi.b...@linux.vnet.ibm.com> wrote:

> No Functionality changes.

Please describe even it seems not have much info.
----
Factor out the SDT event name checking routine as is_sdt_event().
----

BTW, would we really need to move it in util.c? I think parse-event.{c,h}
is a better place.

Thank you,
--
Masami Hiramatsu <mhir...@kernel.org>

Masami Hiramatsu

unread,
Feb 28, 2017, 1:00:05 AM2/28/17
to
OK, the behavior looks good. However, the implementation seems
too complex, could you make it simpler?
I have some comments on the code, see below.

[...]
Here, we don't need this message because anyway perf_add_probe_events already
created this probe...
Could you make this message pr_debug() and/or use the 'err' ?
Would we really need this wrapper? (or just rename find_sdt_events_all and expose it)
I think we don't have to care about errors while removing SDT events,
since we have no exclusive lock on that, someone can use it in parallel.
We can just ignore it.
This code smells no good...
Why don't you make a list of newly added probes ?
Curious, this should be done before searching cached SDT, so that
we can just skip listing up existing probes.

> +
> + /* Reaching here means no matching entry found in uprobe_events. */
> + filter = filter_exst_sdt_events_from_pev(pev, &esl);
> + if (!filter && pev->ntevs == 0) {
> + pr_err("%s:%s not found in the cache\n", pev->group,
> + pev->event);
> + ret = -EINVAL;
> + goto free_str;
> + } else if (pev->ntevs < 0) {
> + err->str = strdup("Cache lookup failed.\n");
> + ret = pev->ntevs;
> + goto free_str;
> + }

Again, why is the esl needed here? We can remove events by using it's name.

What I expected was,

0) add "generated_sdts" in record structure.
1) search existing events by using its name from tracing/events/.
(Don't need to touch uprobe_events)
2) if it does not exist, add new probe points and store its name in "generated_sdts".
3) record events
4) remove the events by using "generated_sdts" which only has the event name.

Thank you,
--
Masami Hiramatsu <mhir...@kernel.org>

Ravi Bangoria

unread,
Feb 28, 2017, 6:00:10 AM2/28/17
to
Thanks Masami for the review.
Please find my replies to your comments.

On Tuesday 28 February 2017 11:15 AM, Masami Hiramatsu wrote:
> On Fri, 24 Feb 2017 13:13:25 +0530
> Ravi Bangoria <ravi.b...@linux.vnet.ibm.com> wrote:
>
>> After invoking 'perf record', behind the scenes, it checks whether
>> the event specified is an SDT event by using the flag '%' or string
>> 'sdt_'. After that, it first checks whether event already exists
>> with that *name* in uprobe_events file. If found, it records that
>> particular event. Otherwise, it does a lookup of the probe cache to
>> find out the SDT event. If its not present, it throws an error.
>> If found, it again tries to find existing events from uprobe_events
>> file, but this time it uses *address* and *filename* for comparison.
>> Finally it writes new events into the uprobe_events file and starts
>> recording. It also maintains a list of the event names that were
>> written to uprobe_events file for this session. After finishing the
>> record session, it removes the events from the uprobe_events file
>> using the maintained name list.
> OK, the behavior looks good. However, the implementation seems
> too complex, could you make it simpler?
> I have some comments on the code, see below.

Yes, it's complex code. It's all because of multiple SDT events with same name
exists and we wants to allow both -- 'perf probe' as well as 'perf record' -- which
creates complex situations that has to be taken care. See my replies below for
such examples.
Yes. This is little ugly. Consider a following scenario.

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, Base: 0x00000000000139cc, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, Base: 0x00000000000139cc, ...

$ perf probe sdt_libpthread:mutex_entry

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

$ perf probe -d sdt_libpthread:mutex_entry

$ perf record -a -e sdt_libpthread:mutex_entry
/* This should record:
sdt_libpthread/mutex_entry <== New entry (delete at the end of session)
sdt_libpthread/mutex_entry_1 <== Reuse (do not delete at the end of session)
*/

First of all, perf looks into uprobe_events for entry with sdt_libpthread:mutex_entry
*name*. As it's not present, perf does a lookup in probe-cache and gets two entries.
Perf stores them in 'pev->tevs' array. Now it again compares uprobe_events entries
with entries found from probe-cache with *filename* and *address* as comparison
parameter (because this combo makes unique identifier for any SDT event in a
particular system - please correct me if that's wrong). And perf finds one entry
sdt_libpthread/mutex_entry_1.

Keeping this entry in 'pevs->tevs' and calling apply_perf_probe_events() will add
new entry for that event even if it exists. So this code removes that particular
element of an array and shift all remaining elements one step.

Please suggest if you have a better approach.

>> +
>> +/* Compare address and filename */
>> +static bool is_sdt_match(struct probe_trace_event *tev1,
>> + struct probe_trace_event *tev2)
>> +{
>> + return (tev1->point.address == tev2->point.address &&
>> + !(strcmp(tev1->point.module, tev2->point.module)));
>> +}
>> +
...
Yes that's intentional. We warn user if event with exact name found from
uprobe_events and there are multiple instances of that event found from
probe-cache. For example,

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, Base: 0x00000000000139cc, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, Base: 0x00000000000139cc, ...

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb

$ perf record -a -e sdt_libpthread/mutex_entry
Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
Since probe point already exists with this name, recording only 1 event.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry*' to allow record on all events.

>> +
>> + /* Reaching here means no matching entry found in uprobe_events. */
>> + filter = filter_exst_sdt_events_from_pev(pev, &esl);
>> + if (!filter && pev->ntevs == 0) {
>> + pr_err("%s:%s not found in the cache\n", pev->group,
>> + pev->event);
>> + ret = -EINVAL;
>> + goto free_str;
>> + } else if (pev->ntevs < 0) {
>> + err->str = strdup("Cache lookup failed.\n");
>> + ret = pev->ntevs;
>> + goto free_str;
>> + }
> Again, why is the esl needed here? We can remove events by using it's name.
>
> What I expected was,
>
> 0) add "generated_sdts" in record structure.
> 1) search existing events by using its name from tracing/events/.
> (Don't need to touch uprobe_events)

I didn't get your point to look at tracing/events/. It's easy to look into
uprobe_events, because perf has a builtin functions that easily transform
entries of uprobe_events to 'tevs'. And uprobe_events has all data like
eventname, filename, address which are crucial for this implementation.

> 2) if it does not exist, add new probe points and store its name in "generated_sdts".

This step is little difficult. As I mentioned earlier, major issue is, event
exists in uprobe_events but with different name. In such case I've to
reuse existing entries and also have to add new entries for events
which are absent.

> 3) record events
> 4) remove the events by using "generated_sdts" which only has the event name.
>

Let me know if I've misunderstood any of your point.

Thanks,
-Ravi

Brendan Gregg

unread,
Feb 28, 2017, 5:40:05 PM2/28/17
to
G'Day Ravi,

On Thu, Feb 23, 2017 at 11:43 PM, Ravi Bangoria
<ravi.b...@linux.vnet.ibm.com> wrote:
>
> From: Hemant Kumar <hem...@linux.vnet.ibm.com>
>
> Add support for directly recording SDT events which are present in
> the probe cache. Without this patch, we could probe into SDT events
> using 'perf probe' and 'perf record'. With this patch, we can probe
> the SDT events directly using 'perf record'.
>
> For example :
>
> $ perf list sdt
> sdt_libpthread:mutex_entry [SDT event]
> sdt_libc:setjmp [SDT event]
> ...
>
> $ perf record -a -e sdt_libc:setjmp
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.286 MB perf.data (1065 samples) ]
>
> $ perf script
> bash 803 [002] 6492.190311: sdt_libc:setjmp: (7f1d503b56a1)
> login 488 [001] 6496.791583: sdt_libc:setjmp: (7ff3013d56a1)
> fprintd 11038 [003] 6496.808032: sdt_libc:setjmp: (7fdedf5936a1)
> [...]


Thanks, I like the usage. I ran into trouble testing on Node.js:

# ./perf buildid-cache --add `which node`
# ./perf list | grep sdt_node
sdt_node:gc__done [SDT event]
sdt_node:gc__start [SDT event]
sdt_node:http__client__request [SDT event]
sdt_node:http__client__response [SDT event]
sdt_node:http__server__request [SDT event]
sdt_node:http__server__response [SDT event]
sdt_node:net__server__connection [SDT event]
sdt_node:net__stream__end [SDT event]
# ./perf record -e sdt_node:http__server__request -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.308 MB perf.data ]
# ./perf script
#

No events. I can see it had set it up:

# cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_node/http__server__request /usr/local/bin/node:0x00000000009c2e69

Ok. Am I sure my workload is working? Trying from bcc/eBPF:

# /mnt/src/bcc/tools/trace.py 'u:node:http__server__request'
failed to enable probe 'http__server__request'; a possible cause can
be that the probe requires a pid to enable
# /mnt/src/bcc/tools/trace.py -p `pgrep node` 'u:node:http__server__request'
In file included from /virtual/main.c:41:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/blkdev.h:9:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/genhd.h:64:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/device.h:24:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/pinctrl/devinfo.h:21:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/pinctrl/consumer.h:17:
In file included from
/lib/modules/4.10.0-rc8-virtual/build/include/linux/seq_file.h:10:
/lib/modules/4.10.0-rc8-virtual/build/include/linux/fs.h:2648:9:
warning: comparison of unsigned enum expression < 0 is always false
[-Wtautological-compare]
if (id < 0 || id >= READING_MAX_ID)
~~ ^ ~
1 warning generated.
PID TID COMM FUNC
7646 7646 node http__server__request
7646 7646 node http__server__request
7646 7646 node http__server__request
^C

(ignore the warning; I just asked lkml about it). So that works. It
instrumented:

# cat /sys/kernel/debug/tracing/uprobe_events
p:uprobes/p__usr_local_bin_node_0x9c2e69_bcc_25410
/usr/local/bin/node:0x00000000009c2e69

Now retrying perf:

# ./perf record -e sdt_node:http__server__request -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.446 MB perf.data (3 samples) ]
# ./perf script
node 7646 [002] 361.012364:
sdt_node:http__server__request: (dc2e69)
node 7646 [002] 361.204718:
sdt_node:http__server__request: (dc2e69)
node 7646 [002] 361.363043:
sdt_node:http__server__request: (dc2e69)

Now perf works.

If I restart the node process, it goes back to the broken state.

This doesn't fix it either:

# ./perf probe sdt_node:http__server__request
Added new event:
sdt_node:http__server__request (on %http__server__request in
/usr/local/bin/node)

You can now use it in all perf tools, such as:

perf record -e sdt_node:http__server__request -aR sleep 1

Hint: SDT event can be directly recorded with 'perf record'. No need
to create probe manually.

Brendan

Masami Hiramatsu

unread,
Mar 1, 2017, 1:50:05 AM3/1/17
to
On Tue, 28 Feb 2017 16:26:19 +0530
OK, but in that case, could you split those support for review?
E.g. the first one just combine perf-probe and perf-record and
second one support multiple SDT events.
Hmm, OK. but this also not becomes the reason why you need 2 different
lists. After calling apply_perf_probe_events(), pev->tevs have been
updated, which have applied names. So you can pull it from the trace_event.

>
> First of all, perf looks into uprobe_events for entry with sdt_libpthread:mutex_entry
> *name*. As it's not present, perf does a lookup in probe-cache and gets two entries.
> Perf stores them in 'pev->tevs' array. Now it again compares uprobe_events entries
> with entries found from probe-cache with *filename* and *address* as comparison
> parameter (because this combo makes unique identifier for any SDT event in a
> particular system - please correct me if that's wrong). And perf finds one entry
> sdt_libpthread/mutex_entry_1.
>
> Keeping this entry in 'pevs->tevs' and calling apply_perf_probe_events() will add
> new entry for that event even if it exists. So this code removes that particular
> element of an array and shift all remaining elements one step.
>
> Please suggest if you have a better approach.

As I described above, you can get actual name of newly added event after calling
apply_perf_probe_events(). (if not, please feel free to update it to do so.)
OK, but could you split this out to another patch?
OK, I see. But could you break this patch into a series? At first, we just
need a name-based event setting, and checking event definition should be done
in following patches in the series. One patch should solve one thing, that
help us to review it smoother.

Thank you,

>
> > 3) record events
> > 4) remove the events by using "generated_sdts" which only has the event name.
> >
>
> Let me know if I've misunderstood any of your point.
>
> Thanks,
> -Ravi
>


--
Masami Hiramatsu <mhir...@kernel.org>

Brendan Gregg

unread,
Mar 1, 2017, 2:20:05 AM3/1/17
to
On Tue, Feb 28, 2017 at 2:31 PM, Brendan Gregg
<brendan...@gmail.com> wrote:
> G'Day Ravi,
>
[...]
> Now retrying perf:
>
> # ./perf record -e sdt_node:http__server__request -a
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.446 MB perf.data (3 samples) ]
> # ./perf script
> node 7646 [002] 361.012364:
> sdt_node:http__server__request: (dc2e69)
> node 7646 [002] 361.204718:
> sdt_node:http__server__request: (dc2e69)
> node 7646 [002] 361.363043:
> sdt_node:http__server__request: (dc2e69)
>
> Now perf works.
>
> If I restart the node process, it goes back to the broken state.
>

Oh sorry, I forgot about that these Node.js probes are behind an
is-enabled semaphore.

$ readelf -n `which node`
[...]
stapsdt 0x00000089 NT_STAPSDT (SystemTap probe descriptors)
Provider: node
Name: http__server__request
Location: 0x0000000000dc2e69, Base: 0x000000000112e064, Semaphore:
0x0000000001470954
Arguments: 8@%r14 8@%rax 8@-4344(%rbp) -4@-4348(%rbp)
8@-4304(%rbp) 8@-4312(%rbp) -4@-4352(%rbp)
# dd if=/proc/31695/mem bs=1 count=1 skip=$(( 0x0000000001470954 ))
2>/dev/null | xxd
00000000: 00 .
# printf "\x1" | dd of=/proc/31695/mem bs=1 count=1 seek=$((
0x0000000001470954 )) 2>/dev/null
# dd if=/proc/31695/mem bs=1 count=1 skip=$(( 0x0000000001470954 ))
2>/dev/null | xxd
00000000: 01 .
# ./perf record -e sdt_node:http__server__request -a
Matching event(s) from uprobe_events:
sdt_node:http__server__request 0x9c2e69@/usr/local/bin/node
Use 'perf probe -d <event>' to delete event(s).
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.280 MB perf.data (3 samples) ]
# ./perf script
node 31695 [003] 24947.168761:
sdt_node:http__server__request: (dc2e69)
node 31695 [003] 24947.476143:
sdt_node:http__server__request: (dc2e69)
node 31695 [003] 24947.679090:
sdt_node:http__server__request: (dc2e69)

So setting that to 1 made the probe work from perf. I guess this is
not a problem with this patch set, but rather a feature request for
the next one: is-enabled SDT support.

Were probe arguments supposed to work? I don't notice them in the perf
script output.

PS, if it's helpful, here's the commands to build node with these SDT probes:

$ sudo apt-get install systemtap-sdt-dev # adds "dtrace", used
by node build
$ wget https://nodejs.org/dist/v4.4.1/node-v4.4.1.tar.gz
$ tar xvf node-v4.4.1.tar.gz
$ cd node-v4.4.1
$ ./configure --with-dtrace
$ make -j 8

Brendan

Ravi Bangoria

unread,
Mar 1, 2017, 6:10:05 AM3/1/17
to
Thank you Brendan for reviewing,

On Wednesday 01 March 2017 10:34 AM, Brendan Gregg wrote:
> On Tue, Feb 28, 2017 at 2:31 PM, Brendan Gregg
> <brendan...@gmail.com> wrote:
>> G'Day Ravi,
>>
> [...]
>> Now retrying perf:
>>
>> # ./perf record -e sdt_node:http__server__request -a
>> ^C[ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.446 MB perf.data (3 samples) ]
>> # ./perf script
>> node 7646 [002] 361.012364:
>> sdt_node:http__server__request: (dc2e69)
>> node 7646 [002] 361.204718:
>> sdt_node:http__server__request: (dc2e69)
>> node 7646 [002] 361.363043:
>> sdt_node:http__server__request: (dc2e69)
>>
>> Now perf works.
>>
>> If I restart the node process, it goes back to the broken state.
>>
> Oh sorry, I forgot about that these Node.js probes are behind an
> is-enabled semaphore.

Yes. Perf does not support "is-enabled" markers yet.
Yes. Actually I'm thinking about how this can be accomplished. Perf is a userspace
tool and, unlike systemtap, it cannot change semaphore value easily. This is what
I was thinking:

'perf record', at the start of session will increments semaphore in /proc/<pid>/mem.
and at the end of session, perf will decrement it (same as bcc). This does not require
any support from kernel infrastructure. But there are challenges with this approach:

1. What if user starts workload after starting 'perf record'. How perf will be able
to increment semaphore value.

2. Systemwide record. We have to loop over all pids and check if any process is
using SDT with semaphore that is being recorded.

3. Dynamic library loading. How to handle SDT probes in library that is not loaded
at the time of 'perf record'?

Please let me know your thoughts.

> Were probe arguments supposed to work? I don't notice them in the perf
> script output.

Not yet. Alexis[1] (and followed by me[2]) had sent patches for that. Please
have a look at them.

[1] https://lkml.org/lkml/2016/12/13/784
[2] https://lkml.org/lkml/2017/2/2/145


So, why perf is able to record data after recording them with bcc?

Ideally, bcc should increment semaphore value at the start of session and
it should decrement at the end of the session. So after bcc process exits,
semaphore value should be zero. But actually it's not happening.

I've seen this when I was experimenting bcc with is-enabled markers.
See this example,

$ readelf -n /usr/bin/node | grep -A2 Provider
Provider: node
Name: http__server__request
Location: 0x0000000000e5f484, Base: 0x00000000011a0bc4, Semaphore: 0x0000000001558cf2

$ sudo cat /proc/1426/maps
00400000-01306000 r-xp 00000000 08:02 1083365 /usr/bin/node
01506000-01551000 r--p 00f06000 08:02 1083365 /usr/bin/node
01551000-01559000 rw-p 00f51000 08:02 1083365 /usr/bin/node
...

[TERMINAL-1]$ gdb 1426
(gdb) x/1 0x1558cf2
0x1558cf2: 0

[TERMINAL-2]$ sudo ./trace.py -p 1426 'u:node:http__server__request'
PID TID COMM FUNC
/* Do not exit yet. */

[TERMINAL-1]
(gdb) x/1 0x1558cf2
0x1558cf2: 2

[TERMINAL-2]
^C /* Exit bcc trace.py */

[TERMINAL-1]
(gdb) x/1 0x1558cf2
0x1558cf2: 2

Here it's maintaining value 2 as it is. it should be 0. I suspect this is a bug in
bcc. Please let me know if I'm understanding it wrong.

>
> PS, if it's helpful, here's the commands to build node with these SDT probes:
>
> $ sudo apt-get install systemtap-sdt-dev # adds "dtrace", used
> by node build
> $ wget https://nodejs.org/dist/v4.4.1/node-v4.4.1.tar.gz
> $ tar xvf node-v4.4.1.tar.gz
> $ cd node-v4.4.1
> $ ./configure --with-dtrace
> $ make -j 8

Thanks for this. :)
-Ravi

Ravi Bangoria

unread,
Mar 6, 2017, 9:50:06 AM3/6/17
to
From: Hemant Kumar <hem...@linux.vnet.ibm.com>

Add basic support for directly recording SDT events which are present
in the probe cache. Without this patch, we could probe into SDT events
using 'perf probe' and 'perf record'. With this patch, we can probe the
SDT events directly using 'perf record'.

For example :

$ sudo ./perf list sdt
sdt_libpthread:mutex_entry [SDT event]
sdt_libc:setjmp [SDT event]

$ sudo ./perf record -a -e sdt_libc:setjmp

$ sudo ./perf script
bash 793 [002] 260.382957: sdt_libc:setjmp: (7ff85b6596a1)
reset 1296 [000] 260.511983: sdt_libc:setjmp: (7f26862e06a1)

Recording on SDT events with same provider and marker names is also
supported:

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, Base: 0x00000000000139cc, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, Base: 0x00000000000139cc, ...

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning : Recording on 2 occurences of sdt_libpthread:mutex_entry

$ sudo ./perf evlist
sdt_libpthread:mutex_entry_1
sdt_libpthread:mutex_entry

After invoking 'perf record', behind the scenes, it checks whether the
event specified is an SDT event using the string 'sdt_' or flag '%'.
After that, it does a lookup of the probe cache to find out the SDT
event. If its not present, it throws an error. Otherwise, it goes on
and writes the event into the uprobe_events file and starts recording.
It also maintains a list of the event names that were written to
uprobe_events file. At the end of the record session, it removes the
events from the uprobe_events file using the maintained name list.

As mentioned, it always tries to look for sdt event in probe cache and
ignores entries of uprobe_events. Hence, it creates new probe points
for event even if it already exists.

$ sudo ./perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning : Recording on 2 occurences of sdt_libpthread:mutex_entry

$ sudo ./perf evlist
sdt_libpthread:mutex_entry_3
sdt_libpthread:mutex_entry_2

As it does not look at uprobe_events file, it can't record those events
whose probe points are created with different name. For ex,

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry_1
Error: sdt_libpthread:mutex_entry_1 not found in the cache
invalid or unsupported event: 'sdt_libpthread:mutex_entry_1'

Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/builtin-record.c | 24 ++++++++
tools/perf/perf.h | 1 +
tools/perf/util/parse-events.c | 56 +++++++++++++++++-
tools/perf/util/parse-events.h | 2 +
tools/perf/util/probe-event.c | 35 ++++++++++-
tools/perf/util/probe-event.h | 4 ++
tools/perf/util/probe-file.c | 131 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/probe-file.h | 8 +++
8 files changed, 257 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index bc84a37..e87b19b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -39,6 +39,7 @@
#include "util/trigger.h"
#include "util/perf-hooks.h"
#include "asm/bug.h"
+#include "util/probe-file.h"

#include <unistd.h>
#include <sched.h>
@@ -73,6 +74,7 @@ struct record {
bool timestamp_filename;
struct switch_output switch_output;
unsigned long long samples;
+ struct list_head sdt_event_list;
};

static volatile int auxtrace_record__snapshot_started;
@@ -1503,6 +1505,26 @@ static struct record record = {
},
};

+void sdt_event_list__add(struct list_head *sdt_event_list)
+{
+ if (list_empty(sdt_event_list))
+ return;
+ list_splice(sdt_event_list, &record.sdt_event_list);
+}
+
+bool is_cmd_record(void)
+{
+ return (record.evlist != NULL);
+}
+
+static void
+sdt_event_list__remove(struct list_head *sdt_event_list __maybe_unused)
+{
+#ifdef HAVE_LIBELF_SUPPORT
+ return remove_sdt_event_list(sdt_event_list);
+#endif
+}
+
const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
"\n\t\t\t\tDefault: fp";

@@ -1671,6 +1693,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

+ INIT_LIST_HEAD(&rec->sdt_event_list);
err = perf_config(perf_record_config, rec);
if (err)
return err;
@@ -1841,6 +1864,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);
+ sdt_event_list__remove(&rec->sdt_event_list);
return err;
}

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1c27d94..9d8e5fe 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -76,4 +76,5 @@ struct record_opts {
struct option;
extern const char * const *record_usage;
extern struct option *record_options;
+bool is_cmd_record(void);
#endif
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 54355d3..1fcc9d13 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1727,12 +1727,66 @@ static void parse_events_print_error(struct parse_events_error *err,

#undef MAX_WIDTH

+/* SDT event needs LIBELF support for creating a probe point */
+#ifdef HAVE_LIBELF_SUPPORT
+static int parse_sdt_event(struct perf_evlist *evlist, const char *str,
+ struct parse_events_error *err)
+{
+ char *ptr = NULL;
+ int ret;
+ struct list_head *sdt_evlist;
+ struct sdt_event_list *sdt_event;
+
+ if (str[0] == '%')
+ str++;
+
+ ptr = strdup(str);
+ if (ptr == NULL)
+ return -ENOMEM;
+
+ sdt_evlist = zalloc(sizeof(*sdt_evlist));
+ if (!sdt_evlist) {
+ free(ptr);
+ pr_debug("Error in sdt_evlist memory allocation\n");
+ return -ENOMEM;
+ }
+ INIT_LIST_HEAD(sdt_evlist);
+
+ /*
+ * If there is an error in this call, no need to free
+ * up sdt_evlist, its already free'ed up in the previous
+ * call. Free up 'ptr' though.
+ */
+ ret = add_sdt_event(ptr, sdt_evlist);
+ if (!ret) {
+ list_for_each_entry(sdt_event, sdt_evlist, list) {
+ ret = parse_events(evlist, sdt_event->name, err);
+ if (ret < 0)
index c6172cd..0887269 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -208,4 +208,6 @@ static inline bool is_sdt_event(char *str)
(!strncmp(str, "sdt_", 4) &&
!!strchr(str, ':') && !strchr(str, '=')));
}
+
+void sdt_event_list__add(struct list_head *sdt_event_list);
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 2b1409f..b879076 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1293,7 +1293,7 @@ int parse_line_range_desc(const char *arg, struct line_range *lr)
return err;
}

-static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
{
char *ptr;

@@ -3125,8 +3125,8 @@ static int find_cached_events(struct perf_probe_event *pev,
}

/* Try to find probe_trace_event from all probe caches */
-static int find_cached_events_all(struct perf_probe_event *pev,
- struct probe_trace_event **tevs)
+int find_cached_events_all(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs)
{
struct probe_trace_event *tmp_tevs = NULL;
struct strlist *bidlist;
@@ -3476,3 +3476,32 @@ int copy_to_probe_trace_arg(struct probe_trace_arg *tvar,
tvar->name = NULL;
return 0;
}
+
+/*
+ * Record session for SDT events has ended. Delete the SDT events
+ * from uprobe_events file that were created initially.
+ */
+void remove_sdt_event_list(struct list_head *sdt_events)
+{
+ struct sdt_event_list *sdt_event;
+ struct strfilter *filter = NULL;
+ const char *err = NULL;
+
+ if (list_empty(sdt_events))
+ return;
+
+ list_for_each_entry(sdt_event, sdt_events, list) {
+ if (!filter) {
+ filter = strfilter__new(sdt_event->name, &err);
+ if (!filter)
+ goto free_list;
+ } else {
+ strfilter__or(filter, sdt_event->name, &err);
+ }
+ }
+
+ del_perf_probe_events(filter);
+
+free_list:
+ free_sdt_list(sdt_events);
+}
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 5d4e940..91e277e 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -182,4 +182,8 @@ struct map *get_target_map(const char *target, bool user);
void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
int ntevs);

+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev);
+
+int find_cached_events_all(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs);
#endif /*_PROBE_EVENT_H */
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 1a62dac..c1cf67f 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -27,8 +27,10 @@
#include "probe-event.h"
#include "probe-file.h"
#include "session.h"
+#include "probe-finder.h"

#define MAX_CMDLEN 256
+#define MAX_EVENT_LENGTH 512

static void print_open_warning(int err, bool uprobe)
{
@@ -935,3 +937,132 @@ bool probe_type_is_available(enum probe_type type)

return ret;
}
+
+void free_sdt_list(struct list_head *sdt_evlist)
+{
+ struct sdt_event_list *tmp, *ptr;
+
+ if (list_empty(sdt_evlist))
+ return;
+ list_for_each_entry_safe(tmp, ptr, sdt_evlist, list) {
+ list_del(&tmp->list);
+ free(tmp->name);
+ free(tmp);
+ }
+}
+
+static int get_sdt_events_from_cache(struct perf_probe_event *pev)
+{
+ int ret = 0;
+
+ pev->ntevs = find_cached_events_all(pev, &pev->tevs);
+
+ if (pev->ntevs < 0) {
+ pr_err("Error: Cache lookup failed (code: %d)\n", pev->ntevs);
+ ret = pev->ntevs;
+ } else if (!pev->ntevs) {
+ pr_err("Error: %s:%s not found in the cache\n",
+ pev->group, pev->event);
+ ret = -EINVAL;
+ } else if (pev->ntevs > 1) {
+ pr_warning("Warning : Recording on %d occurences of %s:%s\n",
+ pev->ntevs, pev->group, pev->event);
+ }
+
+ return ret;
+}
+
+static int add_event_to_sdt_evlist(struct probe_trace_event *tev,
+ struct list_head *sdt_evlist)
+{
+ struct sdt_event_list *tmp;
+
+ tmp = zalloc(sizeof(*tmp));
+ if (!tmp)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&tmp->list);
+ tmp->name = zalloc(MAX_EVENT_LENGTH * sizeof(char));
+ if (!tmp->name)
+ return -ENOMEM;
+
+ snprintf(tmp->name, strlen(tev->group) + strlen(tev->event) + 2,
+ "%s:%s", tev->group, tev->event);
+ list_add(&tmp->list, sdt_evlist);
+
+ return 0;
+}
+
+static int add_events_to_sdt_evlist(struct perf_probe_event *pev,
+ struct list_head *sdt_evlist)
+{
+ int i, ret;
+
+ for (i = 0; i < pev->ntevs; i++) {
+ ret = add_event_to_sdt_evlist(&pev->tevs[i], sdt_evlist);
+
+ if (ret < 0)
+ return ret;
+ }
+ return 0;
+}
+
+/*
+ * Find the SDT event from the cache and if found add it/them
+ * to the uprobe_events file
+ */
+int add_sdt_event(char *event, struct list_head *sdt_evlist)
+{
+ struct perf_probe_event *pev;
+ int ret;
+
+ pev = zalloc(sizeof(*pev));
+ if (!pev)
+ return -ENOMEM;
+
+ pev->sdt = true;
+ pev->uprobes = true;
+
+ /*
+ * Parse event to find the group name and event name of
+ * the sdt event.
+ */
+ ret = parse_perf_probe_event_name(&event, pev);
+ if (ret) {
+ pr_err("Error in parsing sdt event %s\n", event);
+ free(pev);
+ return ret;
+ }
+
+ probe_conf.max_probes = MAX_PROBES;
+ probe_conf.force_add = 1;
+
+ /* Fetch all matching events from cache. */
+ ret = get_sdt_events_from_cache(pev);
+ if (ret < 0)
+ goto free_pev;
+
+ /*
+ * Create probe point for all events by adding them in
+ * uprobe_events file
+ */
+ ret = apply_perf_probe_events(pev, 1);
+ if (ret) {
+ pr_err("Error in adding SDT event : %s\n", event);
+ goto free_pev;
+ }
+
+ /* Add events to sdt_evlist */
+ ret = add_events_to_sdt_evlist(pev, sdt_evlist);
+ if (ret < 0)
+ goto free_pev;
+
+ ret = 0;
+
+free_pev:
+ if (ret < 0)
+ free_sdt_list(sdt_evlist);
+ cleanup_perf_probe_events(pev, 1);
+ free(pev);
+ return ret;
+}
diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
index a17a82e..6d2d3e5 100644
--- a/tools/perf/util/probe-file.h
+++ b/tools/perf/util/probe-file.h
@@ -19,6 +19,11 @@ struct probe_cache {
struct list_head entries;
};

+struct sdt_event_list {
+ char *name; /* group:event */
+ struct list_head list;
+};
+
enum probe_type {
PROBE_TYPE_U = 0,
PROBE_TYPE_S,
@@ -65,6 +70,9 @@ struct probe_cache_entry *probe_cache__find_by_name(struct probe_cache *pcache,
const char *group, const char *event);
int probe_cache__show_all_caches(struct strfilter *filter);
bool probe_type_is_available(enum probe_type type);
+int add_sdt_event(char *event, struct list_head *sdt_event_list);

Ravi Bangoria

unread,
Mar 6, 2017, 10:30:06 AM3/6/17
to
All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But, the flow is little different for SDT events.
Probe point for SDT event needs to be created using 'perf probe' before
recording it using 'perf record'.

As suggested by Ingo[1], it's better to make this process simple by
creating probe point automatically with 'perf record' for SDT events.

Features:
- Allow both 'perf probe' and 'perf record' on sdt events without
changing current functionality.

- Event starting with 'sdt_' or '%' will be considered as SDT event.

- Always prioritize events from uprobe_events by first checking if
event exists with exact name. If not found and user has used
pattern, again try to find pattern matching entries from
uprobe_events. If found use them. If not, lookup into probe-cache.
If events found from probe-cache, again check if any event exists
in uprobe_events by matching filepath+address, as it might exists
in uprobe_events but with different name. Reuse those events which
exists in uprobe_events and create new entries for missing one.
Also maintain list for new entries being created and at the end
of the session, delete them.

- Show various warnings/hints to help user understand _which_ events
are being recorded and _why_. For ex,

When multiple events of same name found and all are being recorded:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_entry

Events being reused from uprobe_events is listed as 'name addr@file'
followed by hint on how to delete them:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry 0x9ddb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

If number of events found from cache is not equal to number of events
being recorded:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
Since 1 probe point already exists, recording only it.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry' to allow record on all events.

Changes in v4:
- Moved util function is_sdt_event() from tools/perf/util/util.c to
tools/perf/util/parse-events.h. [PATCH 1/7]

- Split hunk into multiple patches.

- Changed pr_err() to pr_debug() (only where Masami has suggested).

- Removed unnecessary wrapper func find_sdt_events_from_cache() by
exposing find_cached_events_all().

- Removed 'struct exst_sdt_event_list', instead using array of
existing data structure 'struct probe_trace_event'.

- No hint will be shown at 'perf probe' for SDT events.

- Functionality change. If all events found from probe-cache are not
present in uprobe_events, v3 was recording all events found from
cache. That has been changed in v4. If user has used pattern to
specify event, v4 will record only those events which are present
in uprobe_events. This is to make perf semantics consistent across
normal and SDT events. And If user has not used pattern, it will
record all events found from probe-cache by reusing name for
existing one and adding entries for missing one. For ex (with v4),

$ sudo ./perf probe sdt_libpthread:mutex_release
Added new events:
sdt_libpthread:mutex_release (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_1 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_2 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_3 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
$ sudo ./perf probe -d sdt_libpthread:mutex_release
$ sudo ./perf probe -d sdt_libpthread:mutex_release_2

$ sudo ./perf record -a -e sdt_libpthread:mutex_release*
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_release*

$ sudo ./perf record -a -e sdt_libpthread:mutex_release
Warning: Recording on 4 occurrences of sdt_libpthread:mutex_release


This patchset is prepared on top of acme/perf/core.

v3 link: https://lkml.org/lkml/2017/2/24/27
Hemant Kumar (1):
perf/sdt: Directly record SDT events with 'perf record'

Ravi Bangoria (6):
perf/sdt: Introduce util func is_sdt_event()
perf/sdt: Allow recording of existing events
perf/sdt: Clean uprobe_events when event(out of multiple events)
parsing fails
perf/sdt: Warn when number of events recorded are not equal to cached
events
perf/sdt: List events fetched from uprobe_events
perf/sdt: Remove stale warning

tools/lib/api/fs/tracing_path.c | 17 +--
tools/perf/builtin-record.c | 23 +++
tools/perf/perf.h | 2 +
tools/perf/util/parse-events.c | 56 +++++++-
tools/perf/util/parse-events.h | 14 ++
tools/perf/util/probe-event.c | 100 +++++++++++--
tools/perf/util/probe-event.h | 9 ++
tools/perf/util/probe-file.c | 310 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/probe-file.h | 9 ++
9 files changed, 513 insertions(+), 27 deletions(-)

--
2.9.3

Ravi Bangoria

unread,
Mar 6, 2017, 11:40:07 AM3/6/17
to
User may ask for multiple events in the same record command like,

perf record -a -e sdt_1:* -e sdt_2:*

If sdt_1:* events are already added to uprobe_events and sdt_2:*
event parsing fails, clean sdt_1:* events from uprobe_events.

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/builtin-record.c | 7 +++----
tools/perf/perf.h | 1 +
tools/perf/util/probe-file.c | 4 +++-
3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e87b19b..46d447e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1517,11 +1517,10 @@ bool is_cmd_record(void)
return (record.evlist != NULL);
}

-static void
-sdt_event_list__remove(struct list_head *sdt_event_list __maybe_unused)
+void sdt_event_list__remove(void)
{
#ifdef HAVE_LIBELF_SUPPORT
- return remove_sdt_event_list(sdt_event_list);
+ return remove_sdt_event_list(&record.sdt_event_list);
#endif
}

@@ -1864,7 +1863,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);
- sdt_event_list__remove(&rec->sdt_event_list);
+ sdt_event_list__remove();
return err;
}

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 9d8e5fe..8a411f1 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -77,4 +77,5 @@ struct option;
extern const char * const *record_usage;
extern struct option *record_options;
bool is_cmd_record(void);
+void sdt_event_list__remove(void);
#endif
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 47b624a..358ca98 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -1192,8 +1192,10 @@ int add_sdt_event(char *event, struct list_head *sdt_evlist)
ret = 0;

free_pev:
- if (ret < 0)
+ if (ret < 0) {
free_sdt_list(sdt_evlist);
+ sdt_event_list__remove();
+ }
cleanup_perf_probe_events(pev, 1);
clear_probe_trace_events(exst_tevs, exst_ntevs);
free(pev);
--
2.9.3

Ravi Bangoria

unread,
Mar 6, 2017, 12:50:06 PM3/6/17
to
Factor out the SDT event name checking routine as is_sdt_event().

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/parse-events.h | 12 ++++++++++++
tools/perf/util/probe-event.c | 9 +--------
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1af6a26..c6172cd 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -8,6 +8,7 @@
#include <stdbool.h>
#include <linux/types.h>
#include <linux/perf_event.h>
+#include <string.h>

struct list_head;
struct perf_evsel;
@@ -196,4 +197,15 @@ int is_valid_tracepoint(const char *event_string);
int valid_event_mount(const char *eventfs);
char *parse_events_formats_error_string(char *additional_terms);

+/*
+ * If the probe point starts with '%',
+ * or starts with "sdt_" and has a ':' but no '=',
+ * then it should be a SDT/cached probe point.
+ */
+static inline bool is_sdt_event(char *str)
+{
+ return (str[0] == '%' ||
+ (!strncmp(str, "sdt_", 4) &&
+ !!strchr(str, ':') && !strchr(str, '=')));
+}
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 28fb62c..2b1409f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1339,14 +1339,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (!arg)
return -EINVAL;

- /*
- * If the probe point starts with '%',
- * or starts with "sdt_" and has a ':' but no '=',
- * then it should be a SDT/cached probe point.
- */
- if (arg[0] == '%' ||
- (!strncmp(arg, "sdt_", 4) &&
- !!strchr(arg, ':') && !strchr(arg, '='))) {
+ if (is_sdt_event(arg)) {
pev->sdt = true;
if (arg[0] == '%')
arg++;
--
2.9.3

Ravi Bangoria

unread,
Mar 6, 2017, 1:40:06 PM3/6/17
to
If number of events found from probe-cache is not equal to number of
existing events(fetched from uprobe_events), and somehow we decides
to record only existing events, we warn user about the same. For ex,

$ sudo ./perf probe sdt_libpthread:mutex_release
Added new events:
sdt_libpthread:mutex_release (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_1 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_2 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_3 (on %mutex_release in /usr/lib64/libpthread-2.24.so)

$ sudo ./perf record -a -e sdt_libpthread:*
Warning: Recording on 4 occurrences of sdt_libpthread:*
Warning: Found 35 events from probe-cache with name 'sdt_libpthread:*'.
Since 4 probe points already exists, recording only them.
Hint: Please use 'perf probe -d sdt_libpthread:*' to allow record on all events.

$ sudo ./perf evlist
sdt_libpthread:mutex_release_3
sdt_libpthread:mutex_release_2
sdt_libpthread:mutex_release_1
sdt_libpthread:mutex_release

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/probe-file.c | 59 ++++++++++++++++++++++++++++----------------
1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 358ca98..90444e5 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -951,24 +951,6 @@ void free_sdt_list(struct list_head *sdt_evlist)
}
}

-static int get_sdt_events_from_cache(struct perf_probe_event *pev)
-{
- int ret = 0;
-
- pev->ntevs = find_cached_events_all(pev, &pev->tevs);
-
- if (pev->ntevs < 0) {
- pr_err("Error: Cache lookup failed (code: %d)\n", pev->ntevs);
- ret = pev->ntevs;
- } else if (!pev->ntevs) {
- pr_err("Error: %s:%s not found in the cache\n",
- pev->group, pev->event);
- ret = -EINVAL;
- }
-
- return ret;
-}
-
static int add_event_to_sdt_evlist(struct probe_trace_event *tev,
struct list_head *sdt_evlist,
bool exst)
@@ -1076,6 +1058,17 @@ static void shift_sdt_events(struct perf_probe_event *pev, int i)
pev->ntevs--;
}

+static void sdt_warn_abt_exist_events(struct perf_probe_event *pev, int ctr)
+{
+ pr_warning("Warning: Found %d events from probe-cache with name '%s:%s'.\n"
+ "\t Since %d probe point%c already exists, recording only %s.\n"
+ "Hint: Please use 'perf probe -d %s:%s' to allow record on all events.\n\n",
+ pev->ntevs, pev->group, pev->event, ctr,
+ ctr > 1 ? 's' : '\0',
+ ctr > 1 ? "them" : "it",
+ pev->group, pev->event);
+}
+
static int sdt_merge_events(struct perf_probe_event *pev,
struct probe_trace_event *exst_tevs,
int exst_ntevs,
@@ -1155,6 +1148,15 @@ int add_sdt_event(char *event, struct list_head *sdt_evlist)
probe_conf.max_probes = MAX_PROBES;
probe_conf.force_add = 1;

+ /*
+ * This call is intentionally placed before fetching events
+ * from uprobe_events file. If number of events found from probe-
+ * cache is not equal to number of existing events, and somehow
+ * we decides to record only existing events, we warn user about
+ * the same (sdt_warn_abt_exist_events()).
+ */
+ pev->ntevs = find_cached_events_all(pev, &pev->tevs);
+
/* Fetch all sdt events from uprobe_events */
exst_ntevs = get_exist_sdt_events(&exst_tevs);
if (exst_ntevs < 0) {
@@ -1166,14 +1168,29 @@ int add_sdt_event(char *event, struct list_head *sdt_evlist)
ret = sdt_event_probepoint_exists(pev, exst_tevs,
exst_ntevs, sdt_evlist);
if (ret) {
+ if (ret > 0 && pev->ntevs > 0 && ret != pev->ntevs)
+ sdt_warn_abt_exist_events(pev, ret);
ret = ret > 0 ? 0 : ret;
goto free_pev;
}

- /* Fetch all matching events from cache. */
- ret = get_sdt_events_from_cache(pev);
- if (ret < 0)
+ /*
+ * Check if find_cached_events_all() failed.
+ * We deliberately check failure of this function after checking
+ * entries in uprobe_events. Because, even if this function fails,
+ * we may find matching entry from uprobe_events and in that case
+ * we should continue recording that event.
+ */
+ if (pev->ntevs < 0) {
+ pr_err("Error: Cache lookup failed (code: %d)\n", pev->ntevs);
+ ret = pev->ntevs;
goto free_pev;
+ } else if (!pev->ntevs) {
+ pr_err("Error: %s:%s not found in the cache\n",
+ pev->group, pev->event);
+ ret = -EINVAL;
+ goto free_pev;
+ }

/*
* Merge events found from uprobe_events with events found
--
2.9.3

Masami Hiramatsu

unread,
Mar 8, 2017, 7:40:06 AM3/8/17
to
Hi Ravi,
Hmm, exporting this from builtin-record.c makes inverted reference from
builtin-record.o to libperf. I think it is not good way to do.

[...]
To avoid these ifdefs, could you add a dummy (error return) function
of add_sdt_event() for !HAVE_LIBELF_SUPPORT case in probe-event.h?
Hmm, why would you put this function in probe-file.c? I would like to
ask you move this kind of functions which operates perf_probe_event
in probe-event.c.

Thanks,
--
Masami Hiramatsu <mhir...@kernel.org>

Ravi Bangoria

unread,
Mar 14, 2017, 11:10:06 AM3/14/17
to
From: Hemant Kumar <hem...@linux.vnet.ibm.com>

Add basic support for directly recording SDT events which are present
in the probe cache. Without this patch, we could probe into SDT events
using 'perf probe' and 'perf record'. With this patch, we can probe the
SDT events directly using 'perf record'.

After invoking 'perf record', behind the scenes, it checks whether the
event specified is an SDT event using the string 'sdt_' or flag '%'.
After that, it does a lookup of the probe cache to find out the SDT
event. If its not present, it throws an error. Otherwise, it goes on
and writes the event into the uprobe_events file and starts recording.
It also maintains a list of the event names that were written to
uprobe_events file. At the end of the record session, it removes the
events from the uprobe_events file using the maintained name list.

For example:

$ sudo ./perf list sdt
sdt_libpthread:mutex_entry [SDT event]
sdt_libc:setjmp [SDT event]

$ sudo ./perf record -a -e sdt_libc:setjmp

$ sudo ./perf script
bash 793 [002] 260.382957: sdt_libc:setjmp: (7ff85b6596a1)
reset 1296 [000] 260.511983: sdt_libc:setjmp: (7f26862e06a1)

Recording on SDT events with same provider and marker names is also
supported:

$ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
Provider: libpthread
Name: mutex_entry
Location: 0x0000000000009ddb, Base: 0x00000000000139cc, ...
--
Provider: libpthread
Name: mutex_entry
Location: 0x000000000000bcbb, Base: 0x00000000000139cc, ...

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning : Recording on 2 occurences of sdt_libpthread:mutex_entry

$ sudo ./perf evlist
sdt_libpthread:mutex_entry_1
sdt_libpthread:mutex_entry

Note that, it always fetch sdt events in probe cache and ignores
entries of uprobe_events file. Hence, it creates new probe points
for event even if it already exists in uprobe_events.

$ sudo ./perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning : Recording on 2 occurences of sdt_libpthread:mutex_entry

$ sudo ./perf evlist
sdt_libpthread:mutex_entry_3
sdt_libpthread:mutex_entry_2

As it does not look at uprobe_events file, it can't record those events
whose probe points are created with different name. For ex,

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry_1
Error: sdt_libpthread:mutex_entry_1 not found in the cache
invalid or unsupported event: 'sdt_libpthread:mutex_entry_1'

Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/builtin-record.c | 27 ++++++-
tools/perf/util/parse-events.c | 60 +++++++++++++++
tools/perf/util/parse-events.h | 3 +
tools/perf/util/probe-event.c | 165 ++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/probe-event.h | 12 +++
5 files changed, 261 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index bc84a37..e8e1f73 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -73,6 +73,7 @@ struct record {
bool timestamp_filename;
struct switch_output switch_output;
unsigned long long samples;
+ struct list_head sdt_event_list;
};

static volatile int auxtrace_record__snapshot_started;
@@ -1503,6 +1504,23 @@ static struct record record = {
},
};

+static int record__parse_events_option(const struct option *opt,
+ const char *str,
+ int unset)
+{
+ if (is_sdt_event((char *) str))
+ return parse_sdt_events_option(opt, str, unset);
+ else
+ return parse_events_option(opt, str, unset);
+}
+
+static void sdt_event_list__remove(void)
+{
+#ifdef HAVE_LIBELF_SUPPORT
+ remove_sdt_event_list(&record.sdt_event_list);
+#endif
+}
+
const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
"\n\t\t\t\tDefault: fp";

@@ -1516,9 +1534,10 @@ static bool dry_run;
* using pipes, etc.
*/
static struct option __record_options[] = {
- OPT_CALLBACK('e', "event", &record.evlist, "event",
- "event selector. use 'perf list' to list available events",
- parse_events_option),
+ OPT_CALLBACK_ARG('e', "event", &record.evlist,
+ &record.sdt_event_list, "event",
+ "event selector. use 'perf list' to list available events",
+ record__parse_events_option),
OPT_CALLBACK(0, "filter", &record.evlist, "filter",
"event filter", parse_filter),
OPT_CALLBACK_NOOPT(0, "exclude-perf", &record.evlist,
@@ -1671,6 +1690,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

+ INIT_LIST_HEAD(&rec->sdt_event_list);
err = perf_config(perf_record_config, rec);
if (err)
return err;
@@ -1841,6 +1861,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);
+ sdt_event_list__remove();
return err;
}

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 54355d3..252dac1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1727,6 +1727,66 @@ static void parse_events_print_error(struct parse_events_error *err,

#undef MAX_WIDTH

+/* SDT event needs LIBELF support for creating a probe point */
+#ifdef HAVE_LIBELF_SUPPORT
+int parse_sdt_events_option(const struct option *opt, const char *str,
+ int unset __maybe_unused)
+{
+ struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
+ struct parse_events_error err = { .idx = 0, };
+ char *ptr = NULL;
+ int ret = 0;
+ ret = parse_events(evlist, sdt_event->name, &err);
+ if (ret < 0)
+ goto ret;
+ }
+ /* Add it to the record struct */
+ list_splice(sdt_evlist, opt->data);
+ }
+
+ret:
+ if (ret) {
+ remove_sdt_event_list(opt->data);
+ parse_events_print_error(&err, str);
+ }
+ free(ptr);
+ return ret;
+}
+#else
+int parse_sdt_events_option(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ return -1;
+}
+#endif /* HAVE_LIBELF_SUPPORT */
+
int parse_events_option(const struct option *opt, const char *str,
int unset __maybe_unused)
{
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 8c72b0f..8e29cb6 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -197,6 +197,9 @@ int is_valid_tracepoint(const char *event_string);
int valid_event_mount(const char *eventfs);
char *parse_events_formats_error_string(char *additional_terms);

+int parse_sdt_events_option(const struct option *opt, const char *str,
+ int unset);
+
#ifdef HAVE_LIBELF_SUPPORT
/*
* If the probe point starts with '%',
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 2b1409f..f725953 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -49,6 +49,7 @@

#define MAX_CMDLEN 256
#define PERFPROBE_GROUP "probe"
+#define MAX_EVENT_LENGTH 512

bool probe_event_dry_run; /* Dry run flag */
struct probe_conf probe_conf;
@@ -1293,7 +1294,7 @@ int parse_line_range_desc(const char *arg, struct line_range *lr)
return err;
}

-static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
{
char *ptr;

@@ -3125,8 +3126,8 @@ static int find_cached_events(struct perf_probe_event *pev,
}

/* Try to find probe_trace_event from all probe caches */
-static int find_cached_events_all(struct perf_probe_event *pev,
- struct probe_trace_event **tevs)
+int find_cached_events_all(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs)
{
struct probe_trace_event *tmp_tevs = NULL;
struct strlist *bidlist;
@@ -3476,3 +3477,161 @@ int copy_to_probe_trace_arg(struct probe_trace_arg *tvar,
tvar->name = NULL;
return 0;
}
+
+static void free_sdt_list(struct list_head *sdt_evlist)
+{
+ struct sdt_event_list *tmp, *ptr;
+
+ if (list_empty(sdt_evlist))
+ return;
+ list_for_each_entry_safe(tmp, ptr, sdt_evlist, list) {
+ list_del(&tmp->list);
+ free(tmp->name);
+ free(tmp);
+ }
+}
+
+/*
+ * Delete the SDT events from uprobe_events file that
+ * were created initially.
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 5d4e940..6812230 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -113,6 +113,12 @@ struct variable_list {
struct strlist *vars; /* Available variables */
};

+/* List of sdt events */
+struct sdt_event_list {
+ struct list_head list;
+ char *name; /* group:event */
+};
+
struct map;
int init_probe_symbol_maps(bool user_only);
void exit_probe_symbol_maps(void);
@@ -182,4 +188,10 @@ struct map *get_target_map(const char *target, bool user);
void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
int ntevs);

+int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev);
+
+int find_cached_events_all(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs);
+int add_sdt_event(char *event, struct list_head *sdt_event_list);
+void remove_sdt_event_list(struct list_head *sdt_event_list);
#endif /*_PROBE_EVENT_H */
--
2.9.3

Ravi Bangoria

unread,
Mar 14, 2017, 11:10:07 AM3/14/17
to
If number of events found from probe-cache is not equal to number of
existing events(fetched from uprobe_events), and somehow we decides
to record only existing events, we warn user about the same. For ex,

$ sudo ./perf probe sdt_libpthread:mutex_release
Added new events:
sdt_libpthread:mutex_release (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_1 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_2 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_3 (on %mutex_release in /usr/lib64/libpthread-2.24.so)

$ sudo ./perf record -a -e sdt_libpthread:*
Warning: Recording on 4 occurrences of sdt_libpthread:*
Warning: Found 35 events from probe-cache with name 'sdt_libpthread:*'.
Since 4 probe points already exists, recording only them.
Hint: Please use 'perf probe -d sdt_libpthread:*' to allow record on all events.

$ sudo ./perf evlist
sdt_libpthread:mutex_release_3
sdt_libpthread:mutex_release_2
sdt_libpthread:mutex_release_1
sdt_libpthread:mutex_release

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/probe-event.c | 53 ++++++++++++++++++++++++++++---------------
1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 94b9105..7bf8783 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3523,22 +3523,15 @@ void remove_sdt_event_list(struct list_head *sdt_events)
free_sdt_list(sdt_events);
}

-static int get_sdt_events_from_cache(struct perf_probe_event *pev)
+static void sdt_warn_abt_exist_events(struct perf_probe_event *pev, int ctr)
{
- int ret = 0;
-
- pev->ntevs = find_cached_events_all(pev, &pev->tevs);
-
- if (pev->ntevs < 0) {
- pr_err("Error: Cache lookup failed (code: %d)\n", pev->ntevs);
- ret = pev->ntevs;
- } else if (!pev->ntevs) {
- pr_err("Error: %s:%s not found in the cache\n",
- pev->group, pev->event);
- ret = -EINVAL;
- }
-
- return ret;
+ pr_warning("Warning: Found %d events from probe-cache with name '%s:%s'.\n"
+ "\t Since %d probe point%c already exists, recording only %s.\n"
+ "Hint: Please use 'perf probe -d %s:%s' to allow record on all events.\n\n",
+ pev->ntevs, pev->group, pev->event, ctr,
+ ctr > 1 ? 's' : '\0',
+ ctr > 1 ? "them" : "it",
+ pev->group, pev->event);
}

static int add_event_to_sdt_evlist(struct probe_trace_event *tev,
@@ -3727,6 +3720,15 @@ int add_sdt_event(char *event, struct list_head *sdt_evlist)
probe_conf.max_probes = MAX_PROBES;
probe_conf.force_add = 1;

+ /*
+ * This call is intentionally placed before fetching events
+ * from uprobe_events file. If number of events found from probe-
+ * cache is not equal to number of existing events, and somehow
+ * we decides to record only existing events, we warn user about
+ * the same (sdt_warn_abt_exist_events()).
+ */
+ pev->ntevs = find_cached_events_all(pev, &pev->tevs);
+
/* Fetch all sdt events from uprobe_events */
exst_ntevs = probe_file__get_sdt_events(&exst_tevs);
if (exst_ntevs < 0) {
@@ -3738,14 +3740,29 @@ int add_sdt_event(char *event, struct list_head *sdt_evlist)
ret = sdt_event_probepoint_exists(pev, exst_tevs,
exst_ntevs, sdt_evlist);
if (ret) {
+ if (ret > 0 && pev->ntevs > 0 && ret != pev->ntevs)
+ sdt_warn_abt_exist_events(pev, ret);
ret = ret > 0 ? 0 : ret;
goto free_pev;
}

- /* Fetch all matching events from cache. */
- ret = get_sdt_events_from_cache(pev);
- if (ret < 0)
+ /*
+ * Check if find_cached_events_all() failed.
+ * We deliberately check failure of this function after checking
+ * entries in uprobe_events. Because, even if this function fails,
+ * we may find matching entry from uprobe_events and in that case
+ * we should continue recording that event.
+ */
+ if (pev->ntevs < 0) {
+ pr_err("Error: Cache lookup failed (code: %d)\n", pev->ntevs);
+ ret = pev->ntevs;
goto free_pev;
+ } else if (!pev->ntevs) {
+ pr_err("Error: %s:%s not found in the cache\n",
+ pev->group, pev->event);
+ ret = -EINVAL;

Ravi Bangoria

unread,
Mar 14, 2017, 11:10:07 AM3/14/17
to
Factor out the SDT event name checking routine as is_sdt_event().

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/parse-events.h | 20 ++++++++++++++++++++
tools/perf/util/probe-event.c | 9 +--------
2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1af6a26..8c72b0f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -8,6 +8,7 @@
#include <stdbool.h>
#include <linux/types.h>
#include <linux/perf_event.h>
+#include <string.h>

struct list_head;
struct perf_evsel;
@@ -196,4 +197,23 @@ int is_valid_tracepoint(const char *event_string);
int valid_event_mount(const char *eventfs);
char *parse_events_formats_error_string(char *additional_terms);

+#ifdef HAVE_LIBELF_SUPPORT
+/*
+ * If the probe point starts with '%',
+ * or starts with "sdt_" and has a ':' but no '=',
+ * then it should be a SDT/cached probe point.
+ */
+static inline bool is_sdt_event(char *str)
+{
+ return (str[0] == '%' ||
+ (!strncmp(str, "sdt_", 4) &&
+ !!strchr(str, ':') && !strchr(str, '=')));
+}
+#else
+static inline bool is_sdt_event(char *str __maybe_unused)
+{
+ return false;
+}
+#endif /* HAVE_LIBELF_SUPPORT */
+
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 28fb62c..2b1409f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c

Ravi Bangoria

unread,
Mar 14, 2017, 11:10:07 AM3/14/17
to
All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But, the flow is little different for SDT events.
Probe point for SDT event needs to be created using 'perf probe' before
recording it using 'perf record'.

As suggested by Ingo[1], it's better to make this process simple by
creating probe point automatically with 'perf record' for SDT events.

Features:
- Allow both 'perf probe' and 'perf record' on sdt events without
changing current functionality.

- Event starting with 'sdt_' or '%' will be considered as SDT event.

- Always prioritize events from uprobe_events by first checking if
event exists with exact name. If not found and user has used
pattern, again try to find pattern matching entries from
uprobe_events. If found use them. If not, lookup into probe-cache.
If events found from probe-cache, again check if any event exists
in uprobe_events by matching filepath+address, as it might exists
in uprobe_events but with different name. Reuse those events which
exists in uprobe_events and create new entries for missing one.
Also maintain list for new entries being created and at the end
of the session, delete them.

- Show various warnings/hints to help user understand _which_ events
are being recorded and _why_. For ex,

When multiple events of same name found and all are being recorded:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_entry

Events being reused from uprobe_events is listed as 'name addr@file'
followed by hint on how to delete them:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry 0x9ddb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

If number of events found from cache is not equal to number of events
being recorded:

$ sudo ./perf record -a -e sdt_libpthread:mutex_entry
Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
Since 1 probe point already exists, recording only it.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry' to allow record on all events.

- If all events found from probe-cache are not present in uprobe_events,
and user has used pattern to specify event, perf will record only
those events which are present in uprobe_events. This is to make perf
semantics consistent across normal and SDT events. And If user has
not used pattern, perf will record all events found from probe-cache
by reusing name for existing one and adding entries for missing one.
For ex,

$ sudo ./perf probe sdt_libpthread:mutex_release
Added new events:
sdt_libpthread:mutex_release (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_1 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_2 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_release_3 (on %mutex_release in /usr/lib64/libpthread-2.24.so)
$ sudo ./perf probe -d sdt_libpthread:mutex_release
$ sudo ./perf probe -d sdt_libpthread:mutex_release_2

$ sudo ./perf record -a -e sdt_libpthread:mutex_release*
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_release*

$ sudo ./perf record -a -e sdt_libpthread:mutex_release
Warning: Recording on 4 occurrences of sdt_libpthread:mutex_release

Changes in v5:
- Patch 2/7 is new. New option introduced in this patch helps to pass
custome data from builtin-*.c to libperf.

- All direct callbacks from libelf to builtin-record.c is removed.

- Merged 2nd and 4th patch of v4 into patch 2 of v5.

- Moved all functions from util/probe-file.c to util/probe-event.c
which operates on perf_probe_event.

- Made free_sdt_list() static as it's only used inside util/probe-event.c.

- Couple of other changes as Masami has suggested in v4 review.

Note: Patchset is prepared on top of acme/perf/core.

v4 link: https://lkml.org/lkml/2017/3/6/565

[1] https://lkml.org/lkml/2017/2/7/59
[2] https://lkml.org/lkml/2016/5/3/810


Hemant Kumar (1):
perf/sdt: Directly record SDT events with 'perf record'

Ravi Bangoria (6):
perf/sdt: Introduce util func is_sdt_event()
perf tool: Add option macro OPT_CALLBACK_ARG
perf/sdt: Allow recording of existing events
perf/sdt: Warn when number of events recorded are not equal to cached
events
perf/sdt: List events fetched from uprobe_events
perf/sdt: List events fetched from uprobe_events

tools/lib/api/fs/tracing_path.c | 17 +-
tools/lib/subcmd/parse-options.h | 4 +
tools/perf/builtin-record.c | 27 ++-
tools/perf/util/parse-events.c | 60 +++++++
tools/perf/util/parse-events.h | 23 +++
tools/perf/util/probe-event.c | 362 +++++++++++++++++++++++++++++++++++++--
tools/perf/util/probe-event.h | 16 ++
tools/perf/util/probe-file.c | 48 ++++++
tools/perf/util/probe-file.h | 1 +
9 files changed, 527 insertions(+), 31 deletions(-)

--
2.9.3

Ravi Bangoria

unread,
Mar 14, 2017, 11:10:11 AM3/14/17
to
List those events which are fetched from uprobe_events as 'event addr@file'
followed by hint on how these events can be deleted with 'perf probe -d'
command.

For example:
$ sudo cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_release /usr/lib64/libpthread-2.24.so:0x000000000000b126

$ sudo ./perf record -a -e sdt_libpthread:mutex_release
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_release 0xb126@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
---
tools/perf/util/probe-event.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 7bf8783..e150de0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3593,6 +3593,24 @@ static void sdt_warn_multi_events(int ctr, struct perf_probe_event *pev)
ctr, pev->group, pev->event);
}

+static void print_exst_sdt_events(struct probe_trace_event *tev)
+{
+ static bool msg_head;
+
+ if (!msg_head) {
+ pr_info("Matching event(s) from uprobe_events:\n");
+ msg_head = true;
+ }
+
+ pr_info(" %s:%s 0x%" PRIx64 "@%s\n", tev->group,
+ tev->event, tev->point.address, tev->point.module);
+}
+
+static void print_exst_sdt_event_footer(void)
+{
+ pr_info("Use 'perf probe -d <event>' to delete event(s).\n\n");
+}
+
static int sdt_event_probepoint_exists(struct perf_probe_event *pev,
struct probe_trace_event *tevs,
int ntevs,
@@ -3607,10 +3625,14 @@ static int sdt_event_probepoint_exists(struct perf_probe_event *pev,
if (ret < 0)
return ret;

+ print_exst_sdt_events(&tevs[i]);
ctr++;
}
}

+ if (ctr > 0)
+ print_exst_sdt_event_footer();
+
if (ctr > 1)
sdt_warn_multi_events(ctr, pev);

@@ -3660,11 +3682,16 @@ static int sdt_merge_events(struct perf_probe_event *pev,

if (!ptrn_used)
shift_sdt_events(pev, i);
+
+ print_exst_sdt_events(&exst_tevs[j]);
ctr++;
}
}
}

+ if (ctr > 0)
+ print_exst_sdt_event_footer();
+
if (!ptrn_used || ctr == 0) {
/*
* Create probe point for all probe-cached events by
--
2.9.3

Jiri Olsa

unread,
Mar 15, 2017, 8:10:05 AM3/15/17
to
On Tue, Mar 14, 2017 at 08:36:54PM +0530, Ravi Bangoria wrote:

SNIP
so what happens if there're more than one event in 'str' like:
-e cycles,std_...

would it be better to plug this directly into parse-events.y
parser.. and handle it like any other event type?

thanks,
jirka

Arnaldo Carvalho de Melo

unread,
Mar 15, 2017, 9:20:05 AM3/15/17
to
Em Wed, Mar 15, 2017 at 01:03:31PM +0100, Jiri Olsa escreveu:
> On Tue, Mar 14, 2017 at 08:36:54PM +0530, Ravi Bangoria wrote:
> > +++ b/tools/perf/builtin-record.c
> > +static int record__parse_events_option(const struct option *opt,
> > + const char *str,
> > + int unset)
> > +{
> > + if (is_sdt_event((char *) str))
> > + return parse_sdt_events_option(opt, str, unset);
> > + else
> > + return parse_events_option(opt, str, unset);
>
> so what happens if there're more than one event in 'str' like:
> -e cycles,std_...
>
> would it be better to plug this directly into parse-events.y
> parser.. and handle it like any other event type?

Yeah, I went to bed thinking about this :-)

Ravi,

Please test something like:

perf record -e cycles,sdt_glib:mem__alloc,sched:*switch -a sleep 5s

- Arnaldo

- Arnaldo

Ravi Bangoria

unread,
Mar 15, 2017, 10:00:05 AM3/15/17
to
Thanks Jiri, Arnaldo,

Yeah, this is failing :)

$ sudo ./perf record -a -e sdt_libpthread:mutex_release,sced:*switch
Error: sdt_libpthread:mutex_release,sched:*switch not found in the cache
invalid or unsupported event: 'sdt_libpthread:mutex_release,sched:*switch'

Actually I tested with 'perf record -e ev1 -e ev2', but in this case
record__parse_events_option will get called separately for each
individual event.

Anyway, I'll fix this in next version.

Thanks for the review,
Ravi

Masami Hiramatsu

unread,
Mar 16, 2017, 6:00:07 AM3/16/17
to
Hi Ravi,
Could you also describe which patches are updated? It seems 1/7 is not
modified, correct?

Thank you,

--
Masami Hiramatsu <mhir...@kernel.org>

Ravi Bangoria

unread,
Mar 16, 2017, 7:30:06 AM3/16/17
to
Hi Masami,

On Thursday 16 March 2017 03:21 PM, Masami Hiramatsu wrote:
> On Tue, 14 Mar 2017 20:36:51 +0530
> Ravi Bangoria <ravi.b...@linux.vnet.ibm.com> wrote:
>
>> Changes in v5:
>> - Patch 2/7 is new. New option introduced in this patch helps to pass
>> custome data from builtin-*.c to libperf.
>>
>> - All direct callbacks from libelf to builtin-record.c is removed.
>>

Minor correction.. s/libelf/libperf/

>> - Merged 2nd and 4th patch of v4 into patch 2 of v5.

s/patch 2 of v5/patch 3 of v5/

>>
>> - Moved all functions from util/probe-file.c to util/probe-event.c
>> which operates on perf_probe_event.
>>
>> - Made free_sdt_list() static as it's only used inside util/probe-event.c.
>>
>> - Couple of other changes as Masami has suggested in v4 review.
> Hi Ravi,
> Could you also describe which patches are updated? It seems 1/7 is not
> modified, correct?

Let me list a patch-wise brief changelog.

patch 1/7:- Introduced dummy version of is_sdt_event() which always return false
if !HAVE_LIBELF_SUPPORT.

patch 2/7: - is new. A new option introduced in this patch helps to passcustom
data from builtin-*.c to libperf.

patch 3/7: - Removed direct calls from libperf to builtin-record.c which was used
to prepare record.sdt_event_list. Instead pass list to libperf and let
libperf manage it.

- Introduce new wrapper func record__parse_events_option() that can
differentiate between sdt and other events while parsing them in
perf record.

- Moved all functions from util/probe-file.c to util/probe-event.c
which operates on perf_probe_event.

- Merged 2nd and 4th patch of v4 into this patch.

- Made free_sdt_list() static as it's only used inside util/probe-event.c.

patch 4/7:- Removed direct calls from libperf to builtin-record.c which was used
to prepare record.sdt_event_list. Instead pass list to libperf and let
libperf manage it.

- Moved all functions from util/probe-file.c to util/probe-event.c
which operates on perf_probe_event.

patch 5/7: No changes

patch 6/7: No changes

patch 7/7: No changes

Let me know if you need more details.

Thanks,
Ravi

tip-bot for Ravi Bangoria

unread,
Mar 16, 2017, 12:50:06 PM3/16/17
to
Commit-ID: af9100ad149cf31a1ab1160f71bb4025443dbdb6
Gitweb: http://git.kernel.org/tip/af9100ad149cf31a1ab1160f71bb4025443dbdb6
Author: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
AuthorDate: Tue, 14 Mar 2017 20:36:52 +0530
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 15 Mar 2017 17:48:37 -0300

perf probe: Introduce util func is_sdt_event()

Factor out the SDT event name checking routine as is_sdt_event().

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Hemant Kumar <hem...@linux.vnet.ibm.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Mathieu Poirier <mathieu...@linaro.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sukadev Bhattiprolu <suk...@linux.vnet.ibm.com>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20170314150658.70...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
index c9bdc9d..b19d178 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1341,14 +1341,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)

Masami Hiramatsu

unread,
Mar 17, 2017, 1:00:05 AM3/17/17
to
Hi Ravi,

On Thu, 16 Mar 2017 16:57:52 +0530
Thanks! that's very helpful for me to review it.


--
Masami Hiramatsu <mhir...@kernel.org>

Masami Hiramatsu

unread,
Mar 17, 2017, 5:20:05 AM3/17/17
to
Hi Ravi,

(I avoided to review parser part since it may go to yacc in next version)

On Tue, 14 Mar 2017 20:36:54 +0530
Ravi Bangoria <ravi.b...@linux.vnet.ibm.com> wrote:

[SNIP]
> @@ -1516,9 +1534,10 @@ static bool dry_run;
> * using pipes, etc.
> */
> static struct option __record_options[] = {
> - OPT_CALLBACK('e', "event", &record.evlist, "event",
> - "event selector. use 'perf list' to list available events",
> - parse_events_option),
> + OPT_CALLBACK_ARG('e', "event", &record.evlist,
> + &record.sdt_event_list, "event",
> + "event selector. use 'perf list' to list available events",
> + record__parse_events_option),

Does --event option NOT requires argument without this patch?
If it should be changed to use OPT_CALLBACK_ARG(), would it be
better merge this part to previous patch?

[SNIP]
> +/*
> + * Delete the SDT events from uprobe_events file that
> + * were created initially.
> + */
> +void remove_sdt_event_list(struct list_head *sdt_events)
> +{
> + struct sdt_event_list *sdt_event;
> + struct strfilter *filter = NULL;
> + const char *err = NULL;
> +
> + if (list_empty(sdt_events))
> + return;
> +
> + list_for_each_entry(sdt_event, sdt_events, list) {
> + if (!filter) {
> + filter = strfilter__new(sdt_event->name, &err);
> + if (!filter)
> + goto free_list;

Don't we need to return error code for this case?

> + } else {
> + strfilter__or(filter, sdt_event->name, &err);

strfilter__or() can fail here.

> + }
> + }
> +
> + del_perf_probe_events(filter);

Here too, if it is ignored silently by design, please comment it here.
Well, strbuf can make this simpler as below ;-)

struct strbuf buf = STRBUF_INIT;

> +
> + tmp = zalloc(sizeof(*tmp));
> + if (!tmp)
> + return -ENOMEM;
> +
> + INIT_LIST_HEAD(&tmp->list);

if (strbuf_addf(&buf, "%s:%s", tev->group, tev->event))
goto error;

tmp->name = strbuf_detach(&buf);

> + list_add(&tmp->list, sdt_evlist);
> +
> + return 0;

error:
free(tmp);

return -ENOMEM;
> +}
> +
> +static int add_events_to_sdt_evlist(struct perf_probe_event *pev,
> + struct list_head *sdt_evlist)
> +{
> + int i, ret;
> +
> + for (i = 0; i < pev->ntevs; i++) {
> + ret = add_event_to_sdt_evlist(&pev->tevs[i], sdt_evlist);
> +
> + if (ret < 0)
> + return ret;
> + }
> + return 0;
> +}


Thanks,

--
Masami Hiramatsu <mhir...@kernel.org>

Ravi Bangoria

unread,
Mar 20, 2017, 12:00:05 AM3/20/17
to
Thanks Masami for detailed review.

Please see my comments below.

On Friday 17 March 2017 02:35 PM, Masami Hiramatsu wrote:
> Hi Ravi,
>
> (I avoided to review parser part since it may go to yacc in next version)
>
> On Tue, 14 Mar 2017 20:36:54 +0530
> Ravi Bangoria <ravi.b...@linux.vnet.ibm.com> wrote:
>
> [SNIP]
>> @@ -1516,9 +1534,10 @@ static bool dry_run;
>> * using pipes, etc.
>> */
>> static struct option __record_options[] = {
>> - OPT_CALLBACK('e', "event", &record.evlist, "event",
>> - "event selector. use 'perf list' to list available events",
>> - parse_events_option),
>> + OPT_CALLBACK_ARG('e', "event", &record.evlist,
>> + &record.sdt_event_list, "event",
>> + "event selector. use 'perf list' to list available events",
>> + record__parse_events_option),
> Does --event option NOT requires argument without this patch?
> If it should be changed to use OPT_CALLBACK_ARG(), would it be
> better merge this part to previous patch?

Ok. Yes, it does. I think macro name is confusing.

This new macro allows passing of extra data from builtin-*.c to libelf.
One such macro already exists (OPT_CALLBACK_OPTARG), but the
argument is optional for it and thus it ignores the argument. I need
a macro in which argument is necessary and it also allows to pass
extra data. Hence, I introduced this macro.

Will change macro to OPT_CALLBACK_ARGDATA. Please suggest if
you have better name.
Sure. Will think about handling errors in this function.
Sure, will do it.

Thanks :)
Ravi
0 new messages