[GIT PULL 00/15] perf/core improvements and fixes

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:40:12 PM

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end.

The following changes since commit 91a79e5fa696fa626bfbd47f827eaf3eb7d76dc5:

Merge tag 'perf-core-for-mingo-20161028' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-28 19:37:34 +0200)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161114

for you to fetch changes up to fef51ecd1056b5e090c9fb73e0833bd751389572:

perf report: Show branch info in callchain entry for browser mode (2016-11-14 13:34:08 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Allow querying and setting .perfconfig variables (Taeung Song)

- Show branch information in callchains (predicted, TSX aborts, loop
iterations, etc.) (Jin Yao)

Infrastructure:

- Support kbuild's CFLAGS_REMOVE_ in tools/build (Jiri Olsa)

- Plug building jvmti to the main perf Makefile (Jiri Olsa)

Documentation:

- Update Intel PT documentation about context switch events (Arnaldo Carvalho de Melo)

- Fix the 'perf record --call-graph dwarf' help/config in builds not linked
with an unwind library, so that it mentions dwarf as a possible record option (Rabin Vincent)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (1):
perf intel-pt: Update documentation about context switch events

Jin Yao (5):
perf report: Add branch flag to callchain cursor node
perf report: Create a symbol_conf flag for showing branch flag counting
perf report: Calculate and return the branch flag counting
perf report: Show branch info in callchain entry for stdio mode
perf report: Show branch info in callchain entry for browser mode

Jiri Olsa (4):
tools build: Add CFLAGS_REMOVE_* support
tools build: Add jvmti feature detection support
perf jvmti: Plug compilation into perf build
perf kvmti: Remove unused Makefile file

Rabin Vincent (1):
perf callchain: Fixup help/config for no-unwinding

Taeung Song (4):
perf config: Add support for getting config key-value pairs
perf config: Validate config variable arguments before trying use them
perf config: Add support setting variables in a config file
perf config: Mark where are config items from (user or system)

tools/build/Build.include | 4 +-
tools/build/Documentation/Build.txt | 6 +-
tools/build/feature/Makefile | 6 +-
tools/build/feature/test-jvmti.c | 13 ++
tools/perf/Documentation/intel-pt.txt | 19 ++-
tools/perf/Documentation/perf-config.txt | 35 ++++++
tools/perf/Makefile.config | 26 ++++
tools/perf/Makefile.perf | 24 +++-
tools/perf/builtin-config.c | 137 ++++++++++++++++++++-
tools/perf/builtin-report.c | 3 +
tools/perf/jvmti/Build | 8 ++
tools/perf/jvmti/Makefile | 89 --------------
tools/perf/tests/make | 2 +-
tools/perf/ui/browsers/hists.c | 20 ++-
tools/perf/ui/stdio/hist.c | 35 +++++-
tools/perf/util/callchain.c | 205 ++++++++++++++++++++++++++++++-
tools/perf/util/callchain.h | 26 +++-
tools/perf/util/config.c | 20 +++
tools/perf/util/config.h | 4 +
tools/perf/util/machine.c | 82 ++++++++++---
tools/perf/util/symbol.h | 1 +
21 files changed, 634 insertions(+), 131 deletions(-)
create mode 100644 tools/build/feature/test-jvmti.c
create mode 100644 tools/perf/jvmti/Build
delete mode 100644 tools/perf/jvmti/Makefile

[root@jouet ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
[root@jouet ~]#

[root@zoo ~]# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:rawhide: Ok
17 mageia:5: Ok
18 opensuse:13.2: Ok
19 opensuse:42.1: Ok
20 opensuse:tumbleweed: Ok
21 ubuntu:12.04.5: Ok
22 ubuntu:14.04: Ok
23 ubuntu:14.04.4: Ok
24 ubuntu:15.10: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok

real 61m29.498s
user 0m3.969s
sys 0m3.525s
[root@zoo ~]#

[acme@jouet linux]$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_install_O: make install
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libperl_O: make NO_LIBPERL=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_clean_all_O: make clean all
make_debug_O: make DEBUG=1
make_no_newt_O: make NO_NEWT=1
make_perf_o_O: make perf.o
make_no_demangle_O: make NO_DEMANGLE=1
make_doc_O: make doc
make_install_bin_O: make install-bin
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_slang_O: make NO_SLANG=1
make_no_libelf_O: make NO_LIBELF=1
make_static_O: make LDFLAGS=-static
make_util_map_o_O: make util/map.o
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_pure_O: make
make_help_O: make help
make_no_gtk2_O: make NO_GTK2=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_tags_O: make tags
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:04 PM

From: Jiri Olsa <jo...@kernel.org>

Add support for detecting jvmti. It is not plugged into the FEATURE_TESTS
machinery because it is quite rare; it will be used separately from perf
via a feature_check call.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Stephane Eranian <era...@google.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: William Cohen <wco...@redhat.com>
Link: http://lkml.kernel.org/r/1478093749-5602-3-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/feature/Makefile | 6 +++++-
tools/build/feature/test-jvmti.c | 13 +++++++++++++
2 files changed, 18 insertions(+), 1 deletion(-)
create mode 100644 tools/build/feature/test-jvmti.c

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index ac9c477a2a48..8f668bce8996 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -47,7 +47,8 @@ FILES= \
test-bpf.bin \
test-get_cpuid.bin \
test-sdt.bin \
- test-cxx.bin
+ test-cxx.bin \
+ test-jvmti.bin

FILES := $(addprefix $(OUTPUT),$(FILES))

@@ -225,6 +226,9 @@ $(OUTPUT)test-sdt.bin:
$(OUTPUT)test-cxx.bin:
$(BUILDXX) -std=gnu++11

+$(OUTPUT)test-jvmti.bin:
+ $(BUILD)
+
-include $(OUTPUT)*.d

###############################
diff --git a/tools/build/feature/test-jvmti.c b/tools/build/feature/test-jvmti.c
new file mode 100644
index 000000000000..1c665f09b9d6
--- /dev/null
+++ b/tools/build/feature/test-jvmti.c
@@ -0,0 +1,13 @@
+#include <jvmti.h>
+#include <jvmticmlr.h>
+
+int main(void)
+{
+ JavaVM jvm __attribute__((unused));
+ jvmtiEventCallbacks cb __attribute__((unused));
+ jvmtiCapabilities caps __attribute__((unused));
+ jvmtiJlocationFormat format __attribute__((unused));
+ jvmtiEnv jvmti __attribute__((unused));
+
+ return 0;
+}
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:04 PM

From: Taeung Song <treeze...@gmail.com>

Add a setting feature that can add config variables with their values to a
config file (i.e. the user or system config file) or modify existing
key-value pairs in it. The syntax is:

perf config [<file-option>] [section.name[=value] ...]

For example, you can set ui.show-headers to false with

# perf config ui.show-headers=false

To add or modify several config items at once, use

# perf config annotate.show_nr_jumps=false kmem.default=slab

Committer notes:

Testing it:

$ perf config -l
top.children=true
report.children=false
$
$ perf config top.children=false
$ perf config -l
top.children=false
report.children=false
$
$ perf config kmem.default=slab
$ perf config -l
top.children=false
report.children=false
kmem.default=slab
$
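
The `section.name[=value]` syntax above can be illustrated with a small standalone sketch. This is not perf's `parse_config_arg()`; the function name `split_config_arg` and its exact validation rules are assumptions made for illustration only. It splits an argument in place and reports a query (no `=`) by leaving the value NULL.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of perf: split "section.name[=value]"
 * in place. On success returns 0, *var points at "section.name" and
 * *value points at the value, or is NULL when the argument is a query.
 * Returns -1 when the argument is malformed.
 */
static int split_config_arg(char *arg, char **var, char **value)
{
	char *dot = strchr(arg, '.');
	char *eq = strchr(arg, '=');

	/* A non-empty section and a dot must come before any '='. */
	if (dot == NULL || dot == arg || (eq != NULL && dot > eq))
		return -1;
	/* The name after the dot must not be empty. */
	if (dot[1] == '\0' || dot[1] == '=')
		return -1;

	*var = arg;
	if (eq == NULL) {
		*value = NULL;	/* plain "section.name": a query */
		return 0;
	}
	if (eq[1] == '\0')
		return -1;	/* "section.name=" carries no value */

	*eq = '\0';		/* terminate the variable name */
	*value = eq + 1;
	return 0;
}
```

With this sketch, "ui.show-headers=false" yields the variable "ui.show-headers" and the value "false", "top.children" is treated as a query, and "kmem.default=" is rejected.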

Signed-off-by: Taeung Song <treeze...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Nambong Ha <over...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: Wookje Kwon <awe...@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-5-git-...@gmail.com
[ Combined patch with docs update with this one ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-config.txt | 19 ++++++++-
tools/perf/builtin-config.c | 68 +++++++++++++++++++++++++++++---
tools/perf/util/config.c | 6 +++
tools/perf/util/config.h | 2 +
4 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 1714b0c8c8e1..9365b75fd04f 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -8,7 +8,7 @@ perf-config - Get and set variables in a configuration file.
SYNOPSIS
--------
[verse]
-'perf config' [<file-option>] [section.name ...]
+'perf config' [<file-option>] [section.name[=value] ...]
or
'perf config' [<file-option>] -l | --list

@@ -120,6 +120,23 @@ Given a $HOME/.perfconfig like this:
children = true
group = true

+You can hide source code of annotate feature setting the config to false with
+
+ % perf config annotate.hide_src_code=true
+
+If you want to add or modify several config items, you can do like
+
+ % perf config ui.show-headers=false kmem.default=slab
+
+To modify the sort order of report functionality in user config file(i.e. `~/.perfconfig`), do
+
+ % perf config --user report sort-order=srcline
+
+To change colors of selected line to other foreground and background colors
+in system config file (i.e. `$(sysconf)/perfconfig`), do
+
+ % perf config --system colors.selected=yellow,green
+
To query the record mode of call graph, do

% perf config call-graph.record-mode
diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 88a43fe4963c..7c861b54f3a6 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -17,7 +17,7 @@
static bool use_system_config, use_user_config;

static const char * const config_usage[] = {
- "perf config [<file-option>] [options] [section.name ...]",
+ "perf config [<file-option>] [options] [section.name[=value] ...]",
NULL
};

@@ -33,6 +33,39 @@ static struct option config_options[] = {
OPT_END()
};

+static int set_config(struct perf_config_set *set, const char *file_name,
+ const char *var, const char *value)
+{
+ struct perf_config_section *section = NULL;
+ struct perf_config_item *item = NULL;
+ const char *first_line = "# this file is auto-generated.";
+ FILE *fp;
+
+ if (set == NULL)
+ return -1;
+
+ fp = fopen(file_name, "w");
+ if (!fp)
+ return -1;
+
+ perf_config_set__collect(set, var, value);
+ fprintf(fp, "%s\n", first_line);
+
+ /* overwrite configvariables */
+ perf_config_items__for_each_entry(&set->sections, section) {
+ fprintf(fp, "[%s]\n", section->name);
+
+ perf_config_items__for_each_entry(&section->items, item) {
+ if (item->value)
+ fprintf(fp, "\t%s = %s\n",
+ item->name, item->value);
+ }
+ }
+ fclose(fp);
+
+ return 0;
+}
+
static int show_spec_config(struct perf_config_set *set, const char *var)
{
struct perf_config_section *section;
@@ -82,7 +115,7 @@ static int show_config(struct perf_config_set *set)
return 0;
}

-static int parse_config_arg(char *arg, char **var)
+static int parse_config_arg(char *arg, char **var, char **value)
{
const char *last_dot = strchr(arg, '.');

@@ -99,7 +132,21 @@ static int parse_config_arg(char *arg, char **var)
return -1;
}

- *var = arg;
+ *value = strchr(arg, '=');
+ if (*value == NULL)
+ *var = arg;
+ else if (!strcmp(*value, "=")) {
+ pr_err("The config variable does not contain a value: %s\n", arg);
+ return -1;
+ } else {
+ *value = *value + 1; /* excluding a first character '=' */
+ *var = strsep(&arg, "=");
+ if (*var[0] == '\0') {
+ pr_err("invalid config variable: %s\n", arg);
+ return -1;
+ }
+ }
+
return 0;
}

@@ -153,7 +200,8 @@ int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
default:
if (argc) {
for (i = 0; argv[i]; i++) {
- char *var, *arg = strdup(argv[i]);
+ char *var, *value;
+ char *arg = strdup(argv[i]);

if (!arg) {
pr_err("%s: strdup failed\n", __func__);
@@ -161,13 +209,21 @@ int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
break;
}

- if (parse_config_arg(arg, &var) < 0) {
+ if (parse_config_arg(arg, &var, &value) < 0) {
free(arg);
ret = -1;
break;
}

- ret = show_spec_config(set, var);
+ if (value == NULL)
+ ret = show_spec_config(set, var);
+ else {
+ const char *config_filename = config_exclusive_filename;
+
+ if (!config_exclusive_filename)
+ config_filename = user_config;
+ ret = set_config(set, config_filename, var, value);
+ }
free(arg);
}
} else
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 18dae745034f..c8fb65d923cb 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -602,6 +602,12 @@ static int collect_config(const char *var, const char *value,
return -1;
}

+int perf_config_set__collect(struct perf_config_set *set,
+ const char *var, const char *value)
+{
+ return collect_config(var, value, set);
+}
+
static int perf_config_set__init(struct perf_config_set *set)
{
int ret = -1;
diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h
index 6f813d46045e..0fcdb8c594b0 100644
--- a/tools/perf/util/config.h
+++ b/tools/perf/util/config.h
@@ -33,6 +33,8 @@ const char *perf_etc_perfconfig(void);

struct perf_config_set *perf_config_set__new(void);
void perf_config_set__delete(struct perf_config_set *set);
+int perf_config_set__collect(struct perf_config_set *set,
+ const char *var, const char *value);
void perf_config__init(void);
void perf_config__exit(void);
void perf_config__refresh(void);
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM

From: Taeung Song <treeze...@gmail.com>

To write config items to a particular config file, we need to know where
each config section and item comes from.

The current setting functionality of perf-config regenerates the config
file, overwriting it with the collected config items.

For example, when collecting config items from the user and system config
files (i.e. ~/.perfconfig and $(sysconf)/perfconfig), perf_config_set
can contain both user and system config items. So we need to know where
each value comes from, to avoid merging user and system config items in
the user config file.
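
The idea of tagging each item with its origin can be sketched in isolation. The struct and function below are hypothetical, flattened stand-ins rather than perf's `perf_config_item` and `set_config()`; they only mirror the rule that items read from the system file are skipped when the user file is rewritten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, flattened stand-in for a collected config item. */
struct cfg_item {
	const char *section;
	const char *name;
	const char *value;
	bool from_system_config;	/* true if read from the system file */
};

/*
 * Format the items that belong in the target file into buf as
 * "section.name=value" lines. When the target is the user file
 * (to_system == false), items that came from the system file are
 * skipped. Returns the number of items written.
 */
static int format_items(char *buf, size_t size,
			const struct cfg_item *items, int nr, bool to_system)
{
	size_t off = 0;
	int i, written = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < nr; i++) {
		int n;

		if (!to_system && items[i].from_system_config)
			continue;

		n = snprintf(buf + off, size - off, "%s.%s=%s\n",
			     items[i].section, items[i].name, items[i].value);
		if (n < 0 || (size_t)n >= size - off)
			break;	/* out of space: drop the partial line */
		off += (size_t)n;
		written++;
	}
	buf[off] = '\0';
	return written;
}
```

Given one user item and one system item, formatting for the user file emits only the user item, while formatting for the system file emits both, which matches the check on use_system_config in the patch.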

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Nambong Ha <over...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: Wookje Kwon <awe...@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-7-git-...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-config.c | 6 +++++-
tools/perf/util/config.c | 16 +++++++++++++++-
tools/perf/util/config.h | 4 +++-
3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 7c861b54f3a6..8c0d93b7c2f0 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -48,14 +48,18 @@ static int set_config(struct perf_config_set *set, const char *file_name,
if (!fp)
return -1;

- perf_config_set__collect(set, var, value);
+ perf_config_set__collect(set, file_name, var, value);
fprintf(fp, "%s\n", first_line);

/* overwrite configvariables */
perf_config_items__for_each_entry(&set->sections, section) {
+ if (!use_system_config && section->from_system_config)
+ continue;
fprintf(fp, "[%s]\n", section->name);

perf_config_items__for_each_entry(&section->items, item) {
+ if (!use_system_config && section->from_system_config)
+ continue;
if (item->value)
fprintf(fp, "\t%s = %s\n",
item->name, item->value);
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index c8fb65d923cb..3d906dbbef74 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -594,6 +594,19 @@ static int collect_config(const char *var, const char *value,
goto out_free;
}

+ /* perf_config_set can contain both user and system config items.
+ * So we should know where each value is from.
+ * The classification would be needed when a particular config file
+ * is overwrited by setting feature i.e. set_config().
+ */
+ if (strcmp(config_file_name, perf_etc_perfconfig()) == 0) {
+ section->from_system_config = true;
+ item->from_system_config = true;
+ } else {
+ section->from_system_config = false;
+ item->from_system_config = false;
+ }
+
ret = set_value(item, value);
return ret;

@@ -602,9 +615,10 @@ static int collect_config(const char *var, const char *value,
return -1;
}

-int perf_config_set__collect(struct perf_config_set *set,
+int perf_config_set__collect(struct perf_config_set *set, const char *file_name,
const char *var, const char *value)
{
+ config_file_name = file_name;
return collect_config(var, value, set);
}

diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h
index 0fcdb8c594b0..1a59a6b43f8b 100644
--- a/tools/perf/util/config.h
+++ b/tools/perf/util/config.h
@@ -7,12 +7,14 @@
struct perf_config_item {
char *name;
char *value;
+ bool from_system_config;
struct list_head node;
};

struct perf_config_section {
char *name;
struct list_head items;
+ bool from_system_config;
struct list_head node;
};

@@ -33,7 +35,7 @@ const char *perf_etc_perfconfig(void);

struct perf_config_set *perf_config_set__new(void);
void perf_config_set__delete(struct perf_config_set *set);
-int perf_config_set__collect(struct perf_config_set *set,
+int perf_config_set__collect(struct perf_config_set *set, const char *file_name,
const char *var, const char *value);
void perf_config__init(void);
void perf_config__exit(void);
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM

From: Jin Yao <yao...@linux.intel.com>

Since the branch ip has been added to the call stack for easier browsing,
this patch adds more branch information: a flag indicating whether the ip
is a branch, and the branch flags themselves.

With these we can tell whether a cursor node represents a branch and which
branch flags it has.

The branch history code has a loop detection pass that removes loops. It
is useful to know how many loops were removed, so that in later steps we
can compute the average number of iterations.

For example:

Before remove_loops(),
entry0: from = 0x100, to = 0x200
entry1: from = 0x300, to = 0x250
entry2: from = 0x300, to = 0x250
entry3: from = 0x300, to = 0x250
entry4: from = 0x700, to = 0x800

After remove_loops()
entry0: from = 0x100, to = 0x200
entry1: from = 0x300, to = 0x250
entry2: from = 0x700, to = 0x800

The original entry2 and entry3 are removed, so the number of iterations
of the loop (from = 0x300, to = 0x250) is the number of removed entries
plus one (2 + 1 = 3).

iterations = removed number + 1;
average iterations = Sum(iterations) / number of samples

This formula ignores other cases, for example iterations that cross
multiple buffers or a buffer that contains two or more loops, but in
practice it is good enough.
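
The arithmetic above can be worked through with a short sketch. Note that `collapse_repeats()` below is a deliberately simplified stand-in that only drops consecutive identical entries; it is not perf's `remove_loops()`. The struct and all function names are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a branch entry; only from/to are modelled. */
struct branch_pair {
	unsigned long from;
	unsigned long to;
};

/*
 * Simplified loop removal: collapse runs of identical consecutive
 * entries into one. Returns the new number of entries.
 */
static int collapse_repeats(struct branch_pair *e, int nr)
{
	int i, out = 0;

	for (i = 0; i < nr; i++) {
		if (out > 0 && e[out - 1].from == e[i].from &&
		    e[out - 1].to == e[i].to)
			continue;	/* repeated loop branch: drop it */
		e[out++] = e[i];
	}
	return out;
}

/* iterations = removed number + 1, or 0 when nothing was removed. */
static int loop_iterations(int nr_before, int nr_after)
{
	if (nr_before > nr_after)
		return nr_before - nr_after + 1;
	return 0;
}

/* average iterations = Sum(iterations) / number of samples */
static double average_iterations(const int *iters, size_t nr_samples)
{
	long sum = 0;
	size_t i;

	if (nr_samples == 0)
		return 0.0;
	for (i = 0; i < nr_samples; i++)
		sum += iters[i];
	return (double)sum / (double)nr_samples;
}
```

Applied to the five entries in the example, the collapse leaves three entries, so the estimate is 5 - 3 + 1 = 3 iterations, matching the "2 + 1" in the commit message.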

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Linux-...@vger.kernel.org
Cc: Yao Jin <yao...@linux.intel.com>
Link: http://lkml.kernel.org/n/1477876794-30749-2-g...@linux.intel.com
[ Renamed 'iter' to 'nr_loop_iter' for clarity ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/callchain.c | 14 ++++++--
tools/perf/util/callchain.h | 8 ++++-
tools/perf/util/machine.c | 82 ++++++++++++++++++++++++++++++++++++---------
3 files changed, 86 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index ae58b493af45..138a415fad0d 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -728,7 +728,8 @@ merge_chain_branch(struct callchain_cursor *cursor,

list_for_each_entry_safe(list, next_list, &src->val, list) {
callchain_cursor_append(cursor, list->ip,
- list->ms.map, list->ms.sym);
+ list->ms.map, list->ms.sym,
+ false, NULL, 0, 0);
list_del(&list->list);
free(list);
}
@@ -765,7 +766,9 @@ int callchain_merge(struct callchain_cursor *cursor,
}

int callchain_cursor_append(struct callchain_cursor *cursor,
- u64 ip, struct map *map, struct symbol *sym)
+ u64 ip, struct map *map, struct symbol *sym,
+ bool branch, struct branch_flags *flags,
+ int nr_loop_iter, int samples)
{
struct callchain_cursor_node *node = *cursor->last;

@@ -780,6 +783,13 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
node->ip = ip;
node->map = map;
node->sym = sym;
+ node->branch = branch;
+ node->nr_loop_iter = nr_loop_iter;
+ node->samples = samples;
+
+ if (flags)
+ memcpy(&node->branch_flags, flags,
+ sizeof(struct branch_flags));

cursor->nr++;

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 47cfd1080975..df6329d1c350 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -125,6 +125,10 @@ struct callchain_cursor_node {
u64 ip;
struct map *map;
struct symbol *sym;
+ bool branch;
+ struct branch_flags branch_flags;
+ int nr_loop_iter;
+ int samples;
struct callchain_cursor_node *next;
};

@@ -179,7 +183,9 @@ static inline void callchain_cursor_reset(struct callchain_cursor *cursor)
}

int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
- struct map *map, struct symbol *sym);
+ struct map *map, struct symbol *sym,
+ bool branch, struct branch_flags *flags,
+ int nr_loop_iter, int samples);

/* Close a cursor writing session. Initialize for the reader */
static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index df85b9efd80f..9b33bef54581 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1616,7 +1616,11 @@ static int add_callchain_ip(struct thread *thread,
struct symbol **parent,
struct addr_location *root_al,
u8 *cpumode,
- u64 ip)
+ u64 ip,
+ bool branch,
+ struct branch_flags *flags,
+ int nr_loop_iter,
+ int samples)
{
struct addr_location al;

@@ -1668,7 +1672,8 @@ static int add_callchain_ip(struct thread *thread,

if (symbol_conf.hide_unresolved && al.sym == NULL)
return 0;
- return callchain_cursor_append(cursor, al.addr, al.map, al.sym);
+ return callchain_cursor_append(cursor, al.addr, al.map, al.sym,
+ branch, flags, nr_loop_iter, samples);
}

struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
@@ -1757,7 +1762,9 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
/* LBR only affects the user callchain */
if (i != chain_nr) {
struct branch_stack *lbr_stack = sample->branch_stack;
- int lbr_nr = lbr_stack->nr, j;
+ int lbr_nr = lbr_stack->nr, j, k;
+ bool branch;
+ struct branch_flags *flags;
/*
* LBR callstack can only get user call chain.
* The mix_chain_nr is kernel call chain
@@ -1772,23 +1779,41 @@ static int resolve_lbr_callchain_sample(struct thread *thread,

for (j = 0; j < mix_chain_nr; j++) {
int err;
+ branch = false;
+ flags = NULL;
+
if (callchain_param.order == ORDER_CALLEE) {
if (j < i + 1)
ip = chain->ips[j];
- else if (j > i + 1)
- ip = lbr_stack->entries[j - i - 2].from;
- else
+ else if (j > i + 1) {
+ k = j - i - 2;
+ ip = lbr_stack->entries[k].from;
+ branch = true;
+ flags = &lbr_stack->entries[k].flags;
+ } else {
ip = lbr_stack->entries[0].to;
+ branch = true;
+ flags = &lbr_stack->entries[0].flags;
+ }
} else {
- if (j < lbr_nr)
- ip = lbr_stack->entries[lbr_nr - j - 1].from;
+ if (j < lbr_nr) {
+ k = lbr_nr - j - 1;
+ ip = lbr_stack->entries[k].from;
+ branch = true;
+ flags = &lbr_stack->entries[k].flags;
+ }
else if (j > lbr_nr)
ip = chain->ips[i + 1 - (j - lbr_nr)];
- else
+ else {
ip = lbr_stack->entries[0].to;
+ branch = true;
+ flags = &lbr_stack->entries[0].flags;
+ }
}

- err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip);
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ branch, flags, 0, 0);
if (err)
return (err < 0) ? err : 0;
}
@@ -1813,6 +1838,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
int i, j, err, nr_entries;
int skip_idx = -1;
int first_call = 0;
+ int nr_loop_iter;

if (perf_evsel__has_branch_callstack(evsel)) {
err = resolve_lbr_callchain_sample(thread, cursor, sample, parent,
@@ -1868,14 +1894,37 @@ static int thread__resolve_callchain_sample(struct thread *thread,
be[i] = branch->entries[branch->nr - i - 1];
}

+ nr_loop_iter = nr;
nr = remove_loops(be, nr);

+ /*
+ * Get the number of iterations.
+ * It's only approximation, but good enough in practice.
+ */
+ if (nr_loop_iter > nr)
+ nr_loop_iter = nr_loop_iter - nr + 1;
+ else
+ nr_loop_iter = 0;
+
for (i = 0; i < nr; i++) {
- err = add_callchain_ip(thread, cursor, parent, root_al,
- NULL, be[i].to);
+ if (i == nr - 1)
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al,
+ NULL, be[i].to,
+ true, &be[i].flags,
+ nr_loop_iter, 1);
+ else
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al,
+ NULL, be[i].to,
+ true, &be[i].flags,
+ 0, 0);
+
if (!err)
err = add_callchain_ip(thread, cursor, parent, root_al,
- NULL, be[i].from);
+ NULL, be[i].from,
+ true, &be[i].flags,
+ 0, 0);
if (err == -EINVAL)
break;
if (err)
@@ -1903,7 +1952,9 @@ static int thread__resolve_callchain_sample(struct thread *thread,
if (ip < PERF_CONTEXT_MAX)
++nr_entries;

- err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip);
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ false, NULL, 0, 0);

if (err)
return (err < 0) ? err : 0;
@@ -1919,7 +1970,8 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
if (symbol_conf.hide_unresolved && entry->sym == NULL)
return 0;
return callchain_cursor_append(cursor, entry->ip,
- entry->map, entry->sym);
+ entry->map, entry->sym,
+ false, NULL, 0, 0);
}

static int thread__resolve_callchain_unwind(struct thread *thread,
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM

From: Jin Yao <yao...@linux.intel.com>

If the branch is 100% predicted, "predicted" is hidden. Similarly, if
there is no branch TSX abort, "abort" is hidden. In that case only
cycles are shown (cycles are supported on the Skylake platform; on older
platforms the value is 0).

If there are no iterations, "iterations" is hidden.
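
The hiding rules can be sketched as a small formatter. Everything here is hypothetical: the `branch_counts` struct, the function names and the exact output format are assumptions for illustration and are not taken from perf's `callchain_list_counts__printf_value()`.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-entry counters; not perf's actual structures. */
struct branch_counts {
	unsigned long branch_count;
	unsigned long predicted_count;
	unsigned long abort_count;
	unsigned long cycles_count;
	unsigned long iter_count;
	unsigned long samples_count;
};

/* Append formatted text at offset off; returns the new offset. */
static size_t append(char *buf, size_t size, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (off >= size)
		return off;
	va_start(ap, fmt);
	n = vsnprintf(buf + off, size - off, fmt, ap);
	va_end(ap);
	if (n < 0)
		return off;
	/* Clamp to the terminating NUL if the output was truncated. */
	return ((size_t)n >= size - off) ? size - 1 : off + (size_t)n;
}

/* Build e.g. " (predicted:50.0% cycles:1 iterations:3)". */
static size_t format_branch_info(const struct branch_counts *c,
				 char *buf, size_t size)
{
	size_t off = 0;
	int fields = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';
	if (c->branch_count == 0)
		return 0;

	/* Hidden when every branch was predicted. */
	if (c->predicted_count < c->branch_count)
		off = append(buf, size, off, "%spredicted:%.1f%%",
			     fields++ ? " " : " (",
			     100.0 * c->predicted_count / c->branch_count);
	/* Hidden when there were no TSX aborts. */
	if (c->abort_count > 0)
		off = append(buf, size, off, "%sabort:%.1f%%",
			     fields++ ? " " : " (",
			     100.0 * c->abort_count / c->branch_count);
	/* Always shown; 0 on platforms that do not report cycles. */
	off = append(buf, size, off, "%scycles:%lu",
		     fields++ ? " " : " (",
		     c->cycles_count / c->branch_count);
	/* Hidden when no loop iterations were counted. */
	if (c->iter_count > 0 && c->samples_count > 0)
		off = append(buf, size, off, " iterations:%lu",
			     c->iter_count / c->samples_count);

	return append(buf, size, off, ")");
}
```

A fully predicted entry with no aborts and no iterations produces only the cycles field, which is the case the commit message describes.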

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Linux-...@vger.kernel.org
Cc: Yao Jin <yao...@linux.intel.com>
Link: http://lkml.kernel.org/r/1477876794-30749-6-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/hists.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 84f5dd2fb59c..66676cb8effe 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -738,6 +738,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
struct callchain_print_arg *arg)
{
char bf[1024], *alloc_str;
+ char buf[64], *alloc_str2;
const char *str;

if (arg->row_offset != 0) {
@@ -746,12 +747,26 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
}

alloc_str = NULL;
+ alloc_str2 = NULL;
+
str = callchain_list__sym_name(chain, bf, sizeof(bf),
browser->show_dso);

- if (need_percent) {
- char buf[64];
+ if (symbol_conf.show_branchflag_count) {
+ if (need_percent)
+ callchain_list_counts__printf_value(node, chain, NULL,
+ buf, sizeof(buf));
+ else
+ callchain_list_counts__printf_value(NULL, chain, NULL,
+ buf, sizeof(buf));
+
+ if (asprintf(&alloc_str2, "%s%s", str, buf) < 0)
+ str = "Not enough memory!";
+ else
+ str = alloc_str2;
+ }

+ if (need_percent) {
callchain_node__scnprintf_value(node, buf, sizeof(buf),
total);

@@ -764,6 +779,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
print(browser, chain, str, offset, row, arg);

free(alloc_str);
+ free(alloc_str2);
return 1;
}

--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM

From: Jin Yao <yao...@linux.intel.com>

Create a new flag, show_branchflag_count, in symbol_conf. The flag
controls whether the branch flag counting information is shown. It is
set only if perf.data has branch data and the user chooses the
"branch-history" option on the perf report command line.

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Linux-...@vger.kernel.org
Cc: Yao Jin <yao...@linux.intel.com>
Link: http://lkml.kernel.org/r/1477876794-30749-3-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-report.c | 3 +++
tools/perf/util/symbol.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8064de8ceedc..3dfbfffe2ecd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -911,6 +911,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
if (itrace_synth_opts.last_branch)
has_br_stack = true;

+ if (has_br_stack && branch_call_mode)
+ symbol_conf.show_branchflag_count = true;
+
/*
* Branch mode is a tristate:
* -1 means default, so decide based on the file having branch data.
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index d964844eb314..2d0a905c879a 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -100,6 +100,7 @@ struct symbol_conf {
show_total_period,
use_callchain,
cumulate_callchain,
+ show_branchflag_count,
exclude_other,
show_cpu_utilization,
initialized,
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM
From: Jin Yao <yao...@linux.intel.com>

If the branch is 100% predicted, "predicted" is hidden.
Similarly, if there is no branch TSX abort, "abort" is hidden.
In that case only the cycles are shown (cycles are supported on Skylake;
on older platforms the value would be 0).

If there are no iterations, "iterations" is hidden.

For example:

|--29.93%--main div.c:39 (predicted:50.6%, cycles:1, iterations:18)
| main div.c:44 (predicted:50.6%, cycles:1)
| |
| --22.69%--main div.c:42 (cycles:2, iterations:17)
| compute_flag div.c:28 (cycles:2)
| |
| --10.52%--compute_flag div.c:27 (cycles:1)
| rand rand.c:28 (cycles:1)
| rand rand.c:28 (cycles:1)
| __random random.c:298 (cycles:1)
| __random random.c:297 (cycles:1)
| __random random.c:295 (cycles:1)
| __random random.c:295 (cycles:1)
| __random random.c:295 (cycles:1)
| __random random.c:295 (cycles:6)
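
A minimal sketch of the hiding rules described above, assuming the counters have already been accumulated. The name format_branch_info() and its parameter list are hypothetical and are not taken from perf. Each rule is applied independently here, which may differ from the real code in corner cases, and the " (calltrace)" fallback for entries without branch data is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a perf function: build the " (...)" suffix that
 * follows a callchain entry. Returns the number of characters stored in bf.
 */
static size_t format_branch_info(char *bf, size_t bfsize,
				 uint64_t branch_count, uint64_t predicted_count,
				 uint64_t abort_count, uint64_t cycles_count,
				 uint64_t iterations)
{
	char predicted_str[48] = "";
	char abort_str[48] = "";
	char iter_str[48] = "";
	double predicted_percent;

	assert(bf != NULL && bfsize > 0);

	/* No branch data at all for this entry. */
	if (branch_count == 0) {
		snprintf(bf, bfsize, " (calltrace)");
		return strlen(bf);
	}

	predicted_percent = predicted_count * 100.0 / branch_count;

	/* "predicted" is hidden when every branch was predicted. */
	if (predicted_percent < 100.0)
		snprintf(predicted_str, sizeof(predicted_str),
			 "predicted:%.1f%%, ", predicted_percent);

	/* "abort" is hidden when there was no TSX abort. */
	if (abort_count != 0)
		snprintf(abort_str, sizeof(abort_str),
			 "abort:%" PRIu64 ", ", abort_count);

	/* "iterations" is hidden when there are none. */
	if (iterations != 0)
		snprintf(iter_str, sizeof(iter_str),
			 ", iterations:%" PRIu64, iterations);

	/* Cycles are always shown, as an average per branch. */
	snprintf(bf, bfsize, " (%s%scycles:%" PRIu64 "%s)",
		 predicted_str, abort_str, cycles_count / branch_count, iter_str);

	return strlen(bf);
}
```

With 1000 branches, 506 of them predicted, 1000 cycles in total and 18 iterations, the sketch yields a suffix in the same shape as the first line of the example output.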

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Linux-...@vger.kernel.org
Cc: Yao Jin <yao...@linux.intel.com>
Link: http://lkml.kernel.org/r/1477876794-30749-5-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/stdio/hist.c | 35 +++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 89d8441f9890..668f4aecf2e6 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -41,7 +41,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
{
int i;
size_t ret = 0;
- char bf[1024];
+ char bf[1024], *alloc_str = NULL;
+ char buf[64];
+ const char *str;

ret += callchain__fprintf_left_margin(fp, left_margin);
for (i = 0; i < depth; i++) {
@@ -56,8 +58,26 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
} else
ret += fprintf(fp, "%s", " ");
}
- fputs(callchain_list__sym_name(chain, bf, sizeof(bf), false), fp);
+
+ str = callchain_list__sym_name(chain, bf, sizeof(bf), false);
+
+ if (symbol_conf.show_branchflag_count) {
+ if (!period)
+ callchain_list_counts__printf_value(node, chain, NULL,
+ buf, sizeof(buf));
+ else
+ callchain_list_counts__printf_value(NULL, chain, NULL,
+ buf, sizeof(buf));
+
+ if (asprintf(&alloc_str, "%s%s", str, buf) < 0)
+ str = "Not enough memory!";
+ else
+ str = alloc_str;
+ }
+
+ fputs(str, fp);
fputc('\n', fp);
+ free(alloc_str);
return ret;
}

@@ -219,8 +239,15 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
} else
ret += callchain__fprintf_left_margin(fp, left_margin);

- ret += fprintf(fp, "%s\n", callchain_list__sym_name(chain, bf, sizeof(bf),
- false));
+ ret += fprintf(fp, "%s",
+ callchain_list__sym_name(chain, bf,
+ sizeof(bf),
+ false));
+
+ if (symbol_conf.show_branchflag_count)
+ ret += callchain_list_counts__printf_value(
+ NULL, chain, fp, NULL, 0);
+ ret += fprintf(fp, "\n");

if (++entries_printed == callchain_param.print_limit)
break;
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM
From: Jiri Olsa <jo...@kernel.org>

Compile the jvmti agent as part of the perf build. The agent library is
called libperf-jvmti.so and is installed in the default place together with
the other files:

$ make libperf-jvmti.so
BUILD: Doing 'make -j4' parallel build
...
CC jvmti/libjvmti.o
CC jvmti/jvmti_agent.o
LD jvmti/jvmti-in.o
LINK libperf-jvmti.so

$ make DESTDIR=/tmp/krava/ install-bin
...
$ find /tmp/krava/ | grep libperf
/tmp/krava/lib64/libperf-jvmti.so
/tmp/krava/lib64/libperf-gtk.so

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Tested-by: Stephane Eranian <era...@google.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: William Cohen <wco...@redhat.com>
Link: http://lkml.kernel.org/r/1478093749-5602-4-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.config | 26 ++++++++++++++++++++++++++
tools/perf/Makefile.perf | 24 +++++++++++++++++++++++-
tools/perf/jvmti/Build | 8 ++++++++
tools/perf/tests/make | 2 +-
4 files changed, 58 insertions(+), 2 deletions(-)
create mode 100644 tools/perf/jvmti/Build

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index cffdd9cf3ebf..8a493d46fab9 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -758,6 +758,31 @@ ifndef NO_AUXTRACE
endif
endif

+ifndef NO_JVMTI
+ ifneq (,$(wildcard /usr/sbin/update-java-alternatives))
+ JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | awk '{print $$3}')
+ else
+ ifneq (,$(wildcard /usr/sbin/alternatives))
+ JDIR=$(shell alternatives --display java | tail -1 | cut -d' ' -f 5 | sed 's%/jre/bin/java.%%g')
+ endif
+ endif
+ ifndef JDIR
+ $(warning No alternatives command found, you need to set JDIR= to point to the root of your Java directory)
+ NO_JVMTI := 1
+ endif
+endif
+
+ifndef NO_JVMTI
+ FEATURE_CHECK_CFLAGS-jvmti := -I$(JDIR)/include -I$(JDIR)/include/linux
+ $(call feature_check,jvmti)
+ ifeq ($(feature-jvmti), 1)
+ $(call detected_var,JDIR)
+ else
+ $(warning No openjdk development package found, please install JDK package)
+ NO_JVMTI := 1
+ endif
+endif
+
# Among the variables below, these:
# perfexecdir
# template_dir
@@ -850,6 +875,7 @@ ifeq ($(VF),1)
$(call print_var,sysconfdir)
$(call print_var,LIBUNWIND_DIR)
$(call print_var,LIBDW_DIR)
+ $(call print_var,JDIR)

ifeq ($(dwarf-post-unwind),1)
$(call feature_print_text,"DWARF post unwind library", $(dwarf-post-unwind-text))
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 7de14f470f3c..3cb1df43ad3e 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -86,6 +86,8 @@ include ../scripts/utilities.mak
#
# Define FEATURES_DUMP to provide features detection dump file
# and bypass the feature detection
+#
+# Define NO_JVMTI if you do not want jvmti agent built

# As per kernel Makefile, avoid funny character set dependencies
unexport LC_ALL
@@ -283,6 +285,12 @@ ifndef NO_PERF_READ_VDSOX32
PROGRAMS += $(OUTPUT)perf-read-vdsox32
endif

+LIBJVMTI = libperf-jvmti.so
+
+ifndef NO_JVMTI
+PROGRAMS += $(OUTPUT)$(LIBJVMTI)
+endif
+
# what 'all' will build and 'install' will install, in perfexecdir
ALL_PROGRAMS = $(PROGRAMS) $(SCRIPTS)

@@ -551,6 +559,16 @@ $(OUTPUT)perf-read-vdsox32: perf-read-vdso.c util/find-vdso-map.c
$(QUIET_CC)$(CC) -mx32 $(filter -static,$(LDFLAGS)) -Wall -Werror -o $@ perf-read-vdso.c
endif

+ifndef NO_JVMTI
+LIBJVMTI_IN := $(OUTPUT)jvmti/jvmti-in.o
+
+$(LIBJVMTI_IN): FORCE
+ $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=jvmti obj=jvmti
+
+$(OUTPUT)$(LIBJVMTI): $(LIBJVMTI_IN)
+ $(QUIET_LINK)$(CC) -shared -Wl,-soname -Wl,$(LIBJVMTI) -o $@ $< -lelf -lrt
+endif
+
$(patsubst perf-%,%.o,$(PROGRAMS)): $(wildcard */*.h)

LIBPERF_IN := $(OUTPUT)libperf-in.o
@@ -688,6 +706,10 @@ ifndef NO_PERF_READ_VDSOX32
$(call QUIET_INSTALL, perf-read-vdsox32) \
$(INSTALL) $(OUTPUT)perf-read-vdsox32 '$(DESTDIR_SQ)$(bindir_SQ)';
endif
+ifndef NO_JVMTI
+ $(call QUIET_INSTALL, $(LIBJVMTI)) \
+ $(INSTALL) $(OUTPUT)$(LIBJVMTI) '$(DESTDIR_SQ)$(libdir_SQ)';
+endif
$(call QUIET_INSTALL, libexec) \
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
$(call QUIET_INSTALL, perf-archive) \
@@ -754,7 +776,7 @@ clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clea
$(call QUIET_CLEAN, core-objs) $(RM) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
$(Q)$(RM) $(OUTPUT).config-detected
- $(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents
+ $(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents $(OUTPUT)$(LIBJVMTI)
$(call QUIET_CLEAN, core-gen) $(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* \
$(OUTPUT)util/intel-pt-decoder/inat-tables.c $(OUTPUT)fixdep \
$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c \
diff --git a/tools/perf/jvmti/Build b/tools/perf/jvmti/Build
new file mode 100644
index 000000000000..eaeb8cb5379b
--- /dev/null
+++ b/tools/perf/jvmti/Build
@@ -0,0 +1,8 @@
+jvmti-y += libjvmti.o
+jvmti-y += jvmti_agent.o
+
+CFLAGS_jvmti = -fPIC -DPIC -I$(JDIR)/include -I$(JDIR)/include/linux
+CFLAGS_REMOVE_jvmti = -Wmissing-declarations
+CFLAGS_REMOVE_jvmti += -Wstrict-prototypes
+CFLAGS_REMOVE_jvmti += -Wextra
+CFLAGS_REMOVE_jvmti += -Wwrite-strings
diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index 143f4d549769..08ed7f12cc37 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -106,7 +106,7 @@ make_minimal := NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1
make_minimal += NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1
make_minimal += NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1
make_minimal += NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1
-make_minimal += NO_LIBCRYPTO=1 NO_SDT=1
+make_minimal += NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1

# $(run) contains all available tests
run := make_pure
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:05 PM
From: Taeung Song <treeze...@gmail.com>

You can show the values of several config items as below:

# perf config report.queue-size call-graph.record-mode

but the arguments need to be checked more precisely before they are passed
to show_spec_config(). This validation function will also be used
when parsing config key-value pair arguments in the near future.

Committer notes:

Testing it:

$ perf config bla.
The config variable does not contain a variable name: bla.
$ perf config .bla
The config variable does not contain a section name: .bla
$ perf config bla.bla
$
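
The checks exercised in the test run above can be sketched in isolation as follows. This is an illustration only: the enum and the name check_config_arg() are hypothetical, and the sketch reports a status code instead of printing an error message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum config_arg_status {
	CONFIG_ARG_OK = 0,
	CONFIG_ARG_NO_SECTION,	/* e.g. ".bla", or no dot at all */
	CONFIG_ARG_NO_NAME,	/* e.g. "bla." */
};

/* Check that arg has the form "section.name" with both parts non-empty. */
static enum config_arg_status check_config_arg(const char *arg)
{
	const char *dot;

	assert(arg != NULL);

	/* The first dot separates the section from the variable name. */
	dot = strchr(arg, '.');

	if (dot == NULL || dot == arg)
		return CONFIG_ARG_NO_SECTION;

	if (dot[1] == '\0')
		return CONFIG_ARG_NO_NAME;

	return CONFIG_ARG_OK;
}
```

A well-formed argument such as "bla.bla" passes the check even if no such variable exists, which matches the silent last command in the test run.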

Signed-off-by: Taeung Song <treeze...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Nambong Ha <over...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: Wookje Kwon <awe...@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-4-git-...@gmail.com
[ Fix some spelling errors ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-config.c | 45 +++++++++++++++++++++++++++++++++++++++++----
1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index df3fa1c18e55..88a43fe4963c 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -82,6 +82,27 @@ static int show_config(struct perf_config_set *set)
return 0;
}

+static int parse_config_arg(char *arg, char **var)
+{
+ const char *last_dot = strchr(arg, '.');
+
+ /*
+ * Since "var" actually contains the section name and the real
+ * config variable name separated by a dot, we have to know where the dot is.
+ */
+ if (last_dot == NULL || last_dot == arg) {
+ pr_err("The config variable does not contain a section name: %s\n", arg);
+ return -1;
+ }
+ if (!last_dot[1]) {
+ pr_err("The config variable does not contain a variable name: %s\n", arg);
+ return -1;
+ }
+
+ *var = arg;
+ return 0;
+}
+
int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
{
int i, ret = 0;
@@ -130,10 +151,26 @@ int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
}
break;
default:
- if (argc)
- for (i = 0; argv[i]; i++)
- ret = show_spec_config(set, argv[i]);
- else
+ if (argc) {
+ for (i = 0; argv[i]; i++) {
+ char *var, *arg = strdup(argv[i]);
+
+ if (!arg) {
+ pr_err("%s: strdup failed\n", __func__);
+ ret = -1;
+ break;
+ }
+
+ if (parse_config_arg(arg, &var) < 0) {
+ free(arg);
+ ret = -1;
+ break;
+ }
+
+ ret = show_spec_config(set, var);
+ free(arg);
+ }
+ } else
usage_with_options(config_usage, config_options);
}

--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:06 PM
From: Taeung Song <treeze...@gmail.com>

Add functionality to get specific config key-value pairs.
The syntax is:

perf config [<file-option>] [section.name ...]

For example, to query the config items 'report.queue-size' and 'report.children', do:

# perf config report.queue-size report.children
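
As a rough illustration of what such a query has to do, the sketch below looks up a "section.name" key in a flat table. The struct config_item type and config_lookup() are hypothetical and much simpler than perf's own config set; in particular, this sketch requires an exact match on the section name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat representation of one config entry. */
struct config_item {
	const char *section;
	const char *name;
	const char *value;
};

/*
 * Return the value stored for "section.name", or NULL if var is malformed
 * or no matching entry exists.
 */
static const char *config_lookup(const struct config_item *items, size_t nr,
				 const char *var)
{
	const char *dot;
	size_t section_len, i;

	assert(var != NULL);
	assert(items != NULL || nr == 0);

	dot = strchr(var, '.');
	if (dot == NULL)
		return NULL;

	section_len = (size_t)(dot - var);

	for (i = 0; i < nr; i++) {
		/* The section must match exactly, not just as a prefix. */
		if (strlen(items[i].section) != section_len ||
		    strncmp(var, items[i].section, section_len) != 0)
			continue;

		/* Everything after the dot is the variable name. */
		if (strcmp(dot + 1, items[i].name) == 0)
			return items[i].value;
	}

	return NULL;
}
```

A caller would print "var=value" for each argument where the lookup succeeds and skip the others.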

Signed-off-by: Taeung Song <treeze...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Nambong Ha <over...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: Wookje Kwon <awe...@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-2-git-...@gmail.com
[ Combined patch with docs update with this one ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-config.txt | 18 ++++++++++++++
tools/perf/builtin-config.c | 40 +++++++++++++++++++++++++++++---
2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index cb081ac59fd1..1714b0c8c8e1 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -8,6 +8,8 @@ perf-config - Get and set variables in a configuration file.
SYNOPSIS
--------
[verse]
+'perf config' [<file-option>] [section.name ...]
+or
'perf config' [<file-option>] -l | --list

DESCRIPTION
@@ -118,6 +120,22 @@ Given a $HOME/.perfconfig like this:
children = true
group = true

+To query the record mode of call graph, do
+
+ % perf config call-graph.record-mode
+
+If you want to know multiple config key/value pairs, you can do like
+
+ % perf config report.queue-size call-graph.order report.children
+
+To query the config value of sort order of call graph in user config file (i.e. `~/.perfconfig`), do
+
+ % perf config --user call-graph.sort-order
+
+To query the config value of buildid directory in system config file (i.e. `$(sysconf)/perfconfig`), do
+
+ % perf config --system buildid.dir
+
Variables
~~~~~~~~~

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index e4207a23b52c..df3fa1c18e55 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -17,7 +17,7 @@
static bool use_system_config, use_user_config;

static const char * const config_usage[] = {
- "perf config [<file-option>] [options]",
+ "perf config [<file-option>] [options] [section.name ...]",
NULL
};

@@ -33,6 +33,36 @@ static struct option config_options[] = {
OPT_END()
};

+static int show_spec_config(struct perf_config_set *set, const char *var)
+{
+ struct perf_config_section *section;
+ struct perf_config_item *item;
+
+ if (set == NULL)
+ return -1;
+
+ perf_config_items__for_each_entry(&set->sections, section) {
+ if (prefixcmp(var, section->name) != 0)
+ continue;
+
+ perf_config_items__for_each_entry(&section->items, item) {
+ const char *name = var + strlen(section->name) + 1;
+
+ if (strcmp(name, item->name) == 0) {
+ char *value = item->value;
+
+ if (value) {
+ printf("%s=%s\n", var, value);
+ return 0;
+ }
+ }
+
+ }
+ }
+
+ return 0;
+}
+
static int show_config(struct perf_config_set *set)
{
struct perf_config_section *section;
@@ -54,7 +84,7 @@ static int show_config(struct perf_config_set *set)

int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
{
- int ret = 0;
+ int i, ret = 0;
struct perf_config_set *set;
char *user_config = mkpath("%s/.perfconfig", getenv("HOME"));

@@ -100,7 +130,11 @@ int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
}
break;
default:
- usage_with_options(config_usage, config_options);
+ if (argc)
+ for (i = 0; argv[i]; i++)
+ ret = show_spec_config(set, argv[i]);
+ else
+ usage_with_options(config_usage, config_options);
}

perf_config_set__delete(set);
--
2.7.4

Arnaldo Carvalho de Melo

Nov 14, 2016, 8:50:29 PM
From: Jin Yao <yao...@linux.intel.com>

Create branch counters in each callchain list entry. Each counter
corresponds to one branch flag. For example, predicted_count counts all the
*predicted* branches. The counters are updated while processing the
callchain cursor nodes.

Also provide functions to retrieve or print the values of the counters
in a callchain list.

Besides counting the branch flags, it also counts and returns the
average number of iterations.

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/1477876794-30749-4-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/callchain.c | 189 +++++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/callchain.h | 14 ++++
2 files changed, 202 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 138a415fad0d..823befd8209a 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -438,6 +438,21 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
call->ip = cursor_node->ip;
call->ms.sym = cursor_node->sym;
call->ms.map = cursor_node->map;
+
+ if (cursor_node->branch) {
+ call->branch_count = 1;
+
+ if (cursor_node->branch_flags.predicted)
+ call->predicted_count = 1;
+
+ if (cursor_node->branch_flags.abort)
+ call->abort_count = 1;
+
+ call->cycles_count = cursor_node->branch_flags.cycles;
+ call->iter_count = cursor_node->nr_loop_iter;
+ call->samples_count = cursor_node->samples;
+ }
+
list_add_tail(&call->list, &node->val);

callchain_cursor_advance(cursor);
@@ -497,8 +512,23 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
right = node->ip;
}

- if (left == right)
+ if (left == right) {
+ if (node->branch) {
+ cnode->branch_count++;
+
+ if (node->branch_flags.predicted)
+ cnode->predicted_count++;
+
+ if (node->branch_flags.abort)
+ cnode->abort_count++;
+
+ cnode->cycles_count += node->branch_flags.cycles;
+ cnode->iter_count += node->nr_loop_iter;
+ cnode->samples_count += node->samples;
+ }
+
return MATCH_EQ;
+ }

return left > right ? MATCH_GT : MATCH_LT;
}
@@ -947,6 +977,163 @@ int callchain_node__fprintf_value(struct callchain_node *node,
return 0;
}

+static void callchain_counts_value(struct callchain_node *node,
+ u64 *branch_count, u64 *predicted_count,
+ u64 *abort_count, u64 *cycles_count)
+{
+ struct callchain_list *clist;
+
+ list_for_each_entry(clist, &node->val, list) {
+ if (branch_count)
+ *branch_count += clist->branch_count;
+
+ if (predicted_count)
+ *predicted_count += clist->predicted_count;
+
+ if (abort_count)
+ *abort_count += clist->abort_count;
+
+ if (cycles_count)
+ *cycles_count += clist->cycles_count;
+ }
+}
+
+static int callchain_node_branch_counts_cumul(struct callchain_node *node,
+ u64 *branch_count,
+ u64 *predicted_count,
+ u64 *abort_count,
+ u64 *cycles_count)
+{
+ struct callchain_node *child;
+ struct rb_node *n;
+
+ n = rb_first(&node->rb_root_in);
+ while (n) {
+ child = rb_entry(n, struct callchain_node, rb_node_in);
+ n = rb_next(n);
+
+ callchain_node_branch_counts_cumul(child, branch_count,
+ predicted_count,
+ abort_count,
+ cycles_count);
+
+ callchain_counts_value(child, branch_count,
+ predicted_count, abort_count,
+ cycles_count);
+ }
+
+ return 0;
+}
+
+int callchain_branch_counts(struct callchain_root *root,
+ u64 *branch_count, u64 *predicted_count,
+ u64 *abort_count, u64 *cycles_count)
+{
+ if (branch_count)
+ *branch_count = 0;
+
+ if (predicted_count)
+ *predicted_count = 0;
+
+ if (abort_count)
+ *abort_count = 0;
+
+ if (cycles_count)
+ *cycles_count = 0;
+
+ return callchain_node_branch_counts_cumul(&root->node,
+ branch_count,
+ predicted_count,
+ abort_count,
+ cycles_count);
+}
+
+static int callchain_counts_printf(FILE *fp, char *bf, int bfsize,
+ u64 branch_count, u64 predicted_count,
+ u64 abort_count, u64 cycles_count,
+ u64 iter_count, u64 samples_count)
+{
+ double predicted_percent = 0.0;
+ const char *null_str = "";
+ char iter_str[32];
+ char *str;
+ u64 cycles = 0;
+
+ if (branch_count == 0) {
+ if (fp)
+ return fprintf(fp, " (calltrace)");
+
+ return scnprintf(bf, bfsize, " (calltrace)");
+ }
+
+ if (iter_count && samples_count) {
+ scnprintf(iter_str, sizeof(iter_str),
+ ", iterations:%" PRId64 "",
+ iter_count / samples_count);
+ str = iter_str;
+ } else
+ str = (char *)null_str;
+
+ predicted_percent = predicted_count * 100.0 / branch_count;
+ cycles = cycles_count / branch_count;
+
+ if ((predicted_percent >= 100.0) && (abort_count == 0)) {
+ if (fp)
+ return fprintf(fp, " (cycles:%" PRId64 "%s)",
+ cycles, str);
+
+ return scnprintf(bf, bfsize, " (cycles:%" PRId64 "%s)",
+ cycles, str);
+ }
+
+ if ((predicted_percent < 100.0) && (abort_count == 0)) {
+ if (fp)
+ return fprintf(fp,
+ " (predicted:%.1f%%, cycles:%" PRId64 "%s)",
+ predicted_percent, cycles, str);
+
+ return scnprintf(bf, bfsize,
+ " (predicted:%.1f%%, cycles:%" PRId64 "%s)",
+ predicted_percent, cycles, str);
+ }
+
+ if (fp)
+ return fprintf(fp,
+ " (predicted:%.1f%%, abort:%" PRId64 ", cycles:%" PRId64 "%s)",
+ predicted_percent, abort_count, cycles, str);
+
+ return scnprintf(bf, bfsize,
+ " (predicted:%.1f%%, abort:%" PRId64 ", cycles:%" PRId64 "%s)",
+ predicted_percent, abort_count, cycles, str);
+}
+
+int callchain_list_counts__printf_value(struct callchain_node *node,
+ struct callchain_list *clist,
+ FILE *fp, char *bf, int bfsize)
+{
+ u64 branch_count, predicted_count;
+ u64 abort_count, cycles_count;
+ u64 iter_count = 0, samples_count = 0;
+
+ branch_count = clist->branch_count;
+ predicted_count = clist->predicted_count;
+ abort_count = clist->abort_count;
+ cycles_count = clist->cycles_count;
+
+ if (node) {
+ struct callchain_list *call;
+
+ list_for_each_entry(call, &node->val, list) {
+ iter_count += call->iter_count;
+ samples_count += call->samples_count;
+ }
+ }
+
+ return callchain_counts_printf(fp, bf, bfsize, branch_count,
+ predicted_count, abort_count,
+ cycles_count, iter_count, samples_count);
+}
+
static void free_callchain_node(struct callchain_node *node)
{
struct callchain_list *list, *tmp;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index df6329d1c350..d9c70dccf06a 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -111,6 +111,12 @@ struct callchain_list {
bool unfolded;
bool has_children;
};
+ u64 branch_count;
+ u64 predicted_count;
+ u64 abort_count;
+ u64 cycles_count;
+ u64 iter_count;
+ u64 samples_count;
char *srcline;
struct list_head list;
};
@@ -263,8 +269,16 @@ char *callchain_node__scnprintf_value(struct callchain_node *node,
int callchain_node__fprintf_value(struct callchain_node *node,
FILE *fp, u64 total);

+int callchain_list_counts__printf_value(struct callchain_node *node,
+ struct callchain_list *clist,
+ FILE *fp, char *bf, int bfsize);
+
void free_callchain(struct callchain_root *root);
void decay_callchain(struct callchain_root *root);
int callchain_node__make_parent_list(struct callchain_node *node);

+int callchain_branch_counts(struct callchain_root *root,
+ u64 *branch_count, u64 *predicted_count,
+ u64 *abort_count, u64 *cycles_count);
+
#endif /* __PERF_CALLCHAIN_H */
--
2.7.4

Taeung Song

Nov 14, 2016, 9:30:04 PM
Hi, Arnaldo :)
We need to add a space between the two words, as you said.
But it is such a minor part, is it OK?

Thanks,
Taeung

Ingo Molnar

Nov 15, 2016, 3:50:05 AM
Pulled, thanks a lot Arnaldo!

Ingo

tip-bot for Jin Yao

Nov 15, 2016, 5:50:06 AM
Commit-ID: 410024dbbcb1df5b8140a812b4f1a4dbd62ef924
Gitweb: http://git.kernel.org/tip/410024dbbcb1df5b8140a812b4f1a4dbd62ef924
Author: Jin Yao <yao...@linux.intel.com>
AuthorDate: Mon, 31 Oct 2016 09:19:49 +0800
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 14 Nov 2016 13:15:56 -0300

perf report: Add branch flag to callchain cursor node

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/1477876794-30749-2-g...@linux.intel.com
[ Renamed 'iter' to 'nr_loop_iter' for clarity ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/callchain.c | 14 ++++++--
tools/perf/util/callchain.h | 8 ++++-
tools/perf/util/machine.c | 82 ++++++++++++++++++++++++++++++++++++---------
3 files changed, 86 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index ae58b49..138a415 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 47cfd10..df6329d 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -125,6 +125,10 @@ struct callchain_cursor_node {
u64 ip;
struct map *map;
struct symbol *sym;
+ bool branch;
+ struct branch_flags branch_flags;
+ int nr_loop_iter;
+ int samples;
struct callchain_cursor_node *next;
};

@@ -179,7 +183,9 @@ static inline void callchain_cursor_reset(struct callchain_cursor *cursor)
}

int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
- struct map *map, struct symbol *sym);
+ struct map *map, struct symbol *sym,
+ bool branch, struct branch_flags *flags,
+ int nr_loop_iter, int samples);

/* Close a cursor writing session. Initialize for the reader */
static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index df85b9e..9b33bef 100644
+
@@ -1903,7 +1952,9 @@ check_calls:

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:06 AM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end.

The following changes since commit 6a6b12e2125591e24891e6860410795ea53aed11:

Merge tag 'perf-core-for-mingo-20161114' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-11-15 09:45:04 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161123

for you to fetch changes up to a407b0678bc1c39d70af5fdbe6421c164b69a8c0:

perf sched timehist: Add -V/--cpu-visual option (2016-11-23 10:44:09 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New tool:

- 'perf sched timehist' provides an analysis of scheduling events.

Example usage:
perf sched record -- sleep 1
perf sched timehist

By default it shows the individual schedule events, including the wait
time (time between sched-out and next sched-in events for the task), the
task scheduling delay (time between wakeup and actually running) and run
time for the task:

    time    cpu  task name          wait time  sch delay  run time
                 [tid/pid]             (msec)     (msec)    (msec)
-------- ------  ----------------   ---------  ---------  --------
1.874569 [0011]  gcc[31949]             0.014      0.000     1.148
1.874591 [0010]  gcc[31951]             0.000      0.000     0.024
1.874603 [0010]  migration/10[59]       3.350      0.004     0.011
1.874604 [0011]  <idle>                 1.148      0.000     0.035
1.874723 [0005]  <idle>                 0.016      0.000     1.383
1.874746 [0005]  gcc[31949]             0.153      0.078     0.022
...

Times are in msec.usec. (David Ahern, Namhyung Kim)
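
To make the three columns concrete, here is a small sketch that derives them from four timestamps of one task. The struct layouts, the function names and the convention that 0 means "event not seen" are assumptions of this example and are not taken from builtin-sched.c; the formatter truncates to microseconds, which may differ from what the tool does.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical input: timestamps in nanoseconds, 0 means "event not seen". */
struct sched_times {
	uint64_t prev_sched_out;	/* previous time the task left the CPU */
	uint64_t wakeup;		/* when the task was woken up */
	uint64_t sched_in;		/* when it actually started running */
	uint64_t sched_out;		/* when it left the CPU again */
};

struct timehist_row {
	uint64_t wait_ns;	/* sched-out to next sched-in */
	uint64_t delay_ns;	/* wakeup to actually running */
	uint64_t run_ns;	/* sched-in to sched-out */
};

static struct timehist_row timehist_compute(const struct sched_times *t)
{
	struct timehist_row row = { 0, 0, 0 };

	assert(t != NULL && t->sched_out >= t->sched_in);

	if (t->prev_sched_out != 0 && t->sched_in > t->prev_sched_out)
		row.wait_ns = t->sched_in - t->prev_sched_out;

	if (t->wakeup != 0 && t->sched_in > t->wakeup)
		row.delay_ns = t->sched_in - t->wakeup;

	row.run_ns = t->sched_out - t->sched_in;

	return row;
}

/* Format nanoseconds as "msec.usec", truncating below one microsecond. */
static int format_msec(char *bf, size_t bfsize, uint64_t ns)
{
	assert(bf != NULL && bfsize > 0);

	return snprintf(bf, bfsize, "%" PRIu64 ".%03" PRIu64,
			ns / 1000000, (ns % 1000000) / 1000);
}
```

For a task that was woken 78 usec before it got the CPU and then ran for 1148 usec, the sketch gives a delay of 0.078 and a run time of 1.148 in the units of the table.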

Improvements:

- Make 'perf c2c report' support -f/--force, to allow skipping the
ownership check for root users, for instance, just like the other
tools (Jiri Olsa)

- Allow sorting cachelines by total number of HITMs, in addition to
local and remote numbers (Jiri Olsa)

Fixes:

- Make sure errors aren't suppressed by the TUI reset at the end of
a 'perf c2c report' session (Jiri Olsa)

Infrastructure:

- Initial work on having the annotate code better support multiple
architectures, including the ability to cross-annotate, i.e. to
annotate perf.data files collected on an ARM system on an x86_64
workstation (Arnaldo Carvalho de Melo, Ravi Bangoria, Kim Phillips)

- Use USECS_PER_SEC instead of hard coded number in libtraceevent (Steven Rostedt)

- Add retrieval of preempt count and latency flags in libtraceevent (Steven Rostedt)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (3):
perf annotate: Start supporting cross arch annotation
perf annotate: Allow arches to specify functions to skip
perf annotate: Add per arch instructions annotate handlers

David Ahern (5):
perf sched timehist: Introduce timehist command
perf sched timehist: Add summary options
perf sched timehist: Add -w/--wakeups option
perf sched timehist: Add call graph options
perf sched timehist: Add -V/--cpu-visual option

Jiri Olsa (6):
perf tools: Show event fd in debug output
perf c2c report: Setup browser after opening perf.data
perf c2c report: Add -f/--force option
perf c2c report: Add struct c2c_stats::tot_hitm field
perf c2c report: Display total HITMs on default
perf c2c: Support cascading options

Namhyung Kim (2):
perf symbols: Print symbol offsets conditionally
perf evsel: Support printing callchains with arrows

Steven Rostedt (2):
tools lib traceevent: Use USECS_PER_SEC instead of hardcoded number
tools lib traceevent: Add retrieval of preempt count and latency flags

tools/lib/traceevent/event-parse.c | 41 +-
tools/lib/traceevent/event-parse.h | 5 +-
tools/perf/Documentation/perf-c2c.txt | 8 +
tools/perf/Documentation/perf-sched.txt | 66 +-
tools/perf/arch/arm/annotate/instructions.c | 90 +++
tools/perf/arch/x86/annotate/instructions.c | 78 +++
tools/perf/builtin-c2c.c | 80 ++-
tools/perf/builtin-sched.c | 914 +++++++++++++++++++++++++++-
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 251 ++++----
tools/perf/util/annotate.h | 6 +-
tools/perf/util/evsel.c | 6 +-
tools/perf/util/evsel.h | 1 +
tools/perf/util/evsel_fprintf.c | 12 +-
tools/perf/util/mem-events.c | 12 +-
tools/perf/util/mem-events.h | 1 +
tools/perf/util/symbol.h | 3 +-
tools/perf/util/symbol_fprintf.c | 11 +-
20 files changed, 1406 insertions(+), 185 deletions(-)
create mode 100644 tools/perf/arch/arm/annotate/instructions.c
create mode 100644 tools/perf/arch/x86/annotate/instructions.c

# uname -a
Linux jouet 4.8.6-201.fc24.x86_64 #1 SMP Thu Nov 3 14:38:57 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
#

# dm
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libperl_O: make NO_LIBPERL=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_install_bin_O: make install-bin
make_install_prefix_O: make install prefix=/tmp/krava
make_util_map_o_O: make util/map.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libbpf_O: make NO_LIBBPF=1
make_doc_O: make doc
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_debug_O: make DEBUG=1
make_perf_o_O: make perf.o
make_no_slang_O: make NO_SLANG=1
make_no_newt_O: make NO_NEWT=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_install_O: make install
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_clean_all_O: make clean all
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_help_O: make help
make_no_libelf_O: make NO_LIBELF=1
make_tags_O: make tags
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_gtk2_O: make NO_GTK2=1
make_pure_O: make
make_static_O: make LDFLAGS=-static

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:06 AM11/23/16
to
From: Jiri Olsa <jo...@kernel.org>

Add the -f/--force option to skip ownership validation:

$ sudo perf c2c report
File perf.data not owned by current user or root (use -f to override)
$
$ sudo perf c2c report -f
< c2c report output >
$
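
The check that -f overrides can be sketched roughly as below. This is only an illustration inferred from the error message above, not the actual perf code: the helper name data_file_owner_ok() and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a perf function): decide whether a data
 * file may be opened. The file passes when it is owned by root or by
 * the calling user, or when -f/--force was given.
 */
static bool data_file_owner_ok(uid_t file_uid, uid_t caller_uid, bool force)
{
	if (force)
		return true;		/* -f/--force: skip the validation */
	if (file_uid == 0)
		return true;		/* root-owned files are accepted */
	return file_uid == caller_uid;	/* otherwise the owner must match */
}
```

With these assumptions, a file owned by uid 1000 and read through sudo (caller uid 0) is rejected unless force is set, which is the situation shown in the example session.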

Signed-off-by: Jiri Olsa <jo...@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-4-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-c2c.txt | 4 ++++
tools/perf/builtin-c2c.c | 4 +++-
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 21810d711f5f..5eda9336267e 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -100,6 +100,10 @@ REPORT OPTIONS
--show-all::
Show all captured HITM lines, with no regard to HITM % 0.0005 limit.

+-f::
+--force::
+ Don't do ownership validation.
+
C2C RECORD
----------
The perf c2c record command setup options related to HITM cacheline analysis
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 15addb06d611..d873977b8fb6 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2523,6 +2523,7 @@ static int perf_c2c__report(int argc, const char **argv)
OPT_STRING('d', "display", &display, NULL, "lcl,rmt"),
OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
"coalesce fields: pid,tid,iaddr,dso"),
+ OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_END()
};
int err = 0;
@@ -2538,7 +2539,8 @@ static int perf_c2c__report(int argc, const char **argv)
if (!input_name || !strlen(input_name))
input_name = "perf.data";

- file.path = input_name;
+ file.path = input_name;
+ file.force = symbol_conf.force;

err = setup_display(display);
if (err)
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:06 AM11/23/16
to
From: Jiri Olsa <jo...@kernel.org>

Currently we display the cacheline list sorted on remote HITMs by
default.

The problem is that remote HITMs might not always be counted, in which
case 'perf c2c report' displays an empty output. It is thus more
convenient to display and sort the cacheline list based on the total of
HITMs, which gives the best chance of seeing data in the default report
run.
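
As a rough, standalone illustration of the new default, the mapping from the --display argument to a sort mode could look like the sketch below. The names parse_display() and display_name are made up for this example; the real change is in setup_display() in the diff that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum display_mode { DISPLAY_LCL, DISPLAY_RMT, DISPLAY_TOT, DISPLAY_MAX };

/* Human readable names, indexed by mode (used for titles and headers). */
static const char *const display_name[DISPLAY_MAX] = {
	[DISPLAY_LCL] = "Local",
	[DISPLAY_RMT] = "Remote",
	[DISPLAY_TOT] = "Total",
};

/*
 * Hypothetical parser: a NULL string means "no --display given" and
 * selects total HITMs. Returns 0 on success, -1 for an unknown value.
 */
static int parse_display(const char *str, enum display_mode *mode)
{
	const char *s = str ? str : "tot";

	assert(mode != NULL);

	if (!strcmp(s, "tot"))
		*mode = DISPLAY_TOT;
	else if (!strcmp(s, "rmt"))
		*mode = DISPLAY_RMT;
	else if (!strcmp(s, "lcl"))
		*mode = DISPLAY_LCL;
	else
		return -1;

	return 0;
}
```

Indexing a name table by the mode avoids the two-way "Local"/"Remote" ternaries, which stop working once a third mode exists.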

Signed-off-by: Jiri Olsa <jo...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-6-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-c2c.txt | 4 ++++
tools/perf/builtin-c2c.c | 39 ++++++++++++++++++++++++++++-------
2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 5eda9336267e..3f06730c7f47 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -104,6 +104,10 @@ REPORT OPTIONS
--force::
Don't do ownership validation.

+-d::
+--display::
+ Siwtch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default.
+
C2C RECORD
----------
The perf c2c record command setup options related to HITM cacheline analysis
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index d873977b8fb6..54924717ae8e 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -91,6 +91,14 @@ struct perf_c2c {
enum {
DISPLAY_LCL,
DISPLAY_RMT,
+ DISPLAY_TOT,
+ DISPLAY_MAX,
+};
+
+static const char *display_str[DISPLAY_MAX] = {
+ [DISPLAY_LCL] = "Local",
+ [DISPLAY_RMT] = "Remote",
+ [DISPLAY_TOT] = "Total",
};

static struct perf_c2c c2c;
@@ -745,6 +753,10 @@ static double percent_hitm(struct c2c_hist_entry *c2c_he)
case DISPLAY_LCL:
st = stats->lcl_hitm;
tot = total->lcl_hitm;
+ break;
+ case DISPLAY_TOT:
+ st = stats->tot_hitm;
+ tot = total->tot_hitm;
default:
break;
}
@@ -1044,6 +1056,9 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
break;
case DISPLAY_LCL:
DISPLAY_HITM(lcl_hitm);
+ break;
+ case DISPLAY_TOT:
+ DISPLAY_HITM(tot_hitm);
default:
break;
}
@@ -1351,6 +1366,7 @@ static struct c2c_dimension dim_tot_loads = {
static struct c2c_header percent_hitm_header[] = {
[DISPLAY_LCL] = HEADER_BOTH("Lcl", "Hitm"),
[DISPLAY_RMT] = HEADER_BOTH("Rmt", "Hitm"),
+ [DISPLAY_TOT] = HEADER_BOTH("Tot", "Hitm"),
};

static struct c2c_dimension dim_percent_hitm = {
@@ -1794,6 +1810,9 @@ static bool he__display(struct hist_entry *he, struct c2c_stats *stats)
break;
case DISPLAY_RMT:
FILTER_HITM(rmt_hitm);
+ break;
+ case DISPLAY_TOT:
+ FILTER_HITM(tot_hitm);
default:
break;
};
@@ -1809,8 +1828,9 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
bool has_hitm;

c2c_he = container_of(he, struct c2c_hist_entry, he);
- has_hitm = c2c.display == DISPLAY_LCL ?
- c2c_he->stats.lcl_hitm : c2c_he->stats.rmt_hitm;
+ has_hitm = c2c.display == DISPLAY_TOT ? c2c_he->stats.tot_hitm :
+ c2c.display == DISPLAY_LCL ? c2c_he->stats.lcl_hitm :
+ c2c_he->stats.rmt_hitm;
return has_hitm || c2c_he->stats.store;
}

@@ -2095,7 +2115,7 @@ static void print_c2c_info(FILE *out, struct perf_session *session)
first = false;
}
fprintf(out, " Cachelines sort on : %s HITMs\n",
- c2c.display == DISPLAY_LCL ? "Local" : "Remote");
+ display_str[c2c.display]);
fprintf(out, " Cacheline data grouping : %s\n", c2c.cl_sort);
}

@@ -2250,7 +2270,7 @@ static int perf_c2c_browser__title(struct hist_browser *browser,
"Shared Data Cache Line Table "
"(%lu entries, sorted on %s HITMs)",
browser->nr_non_filtered_entries,
- c2c.display == DISPLAY_LCL ? "local" : "remote");
+ display_str[c2c.display]);
return 0;
}

@@ -2387,9 +2407,11 @@ static int setup_callchain(struct perf_evlist *evlist)

static int setup_display(const char *str)
{
- const char *display = str ?: "rmt";
+ const char *display = str ?: "tot";

- if (!strcmp(display, "rmt"))
+ if (!strcmp(display, "tot"))
+ c2c.display = DISPLAY_TOT;
+ else if (!strcmp(display, "rmt"))
c2c.display = DISPLAY_RMT;
else if (!strcmp(display, "lcl"))
c2c.display = DISPLAY_LCL;
@@ -2474,6 +2496,8 @@ static int setup_coalesce(const char *coalesce, bool no_source)
return -1;

if (asprintf(&c2c.cl_resort, "offset,%s",
+ c2c.display == DISPLAY_TOT ?
+ "tot_hitm" :
c2c.display == DISPLAY_RMT ?
"rmt_hitm,lcl_hitm" :
"lcl_hitm,rmt_hitm") < 0)
@@ -2520,7 +2544,7 @@ static int perf_c2c__report(int argc, const char **argv)
"print_type,threshold[,print_limit],order,sort_key[,branch],value",
callchain_help, &parse_callchain_opt,
callchain_default_opt),
- OPT_STRING('d', "display", &display, NULL, "lcl,rmt"),
+ OPT_STRING('d', "display", &display, "Switch HITM output type", "lcl,rmt"),
OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
"coalesce fields: pid,tid,iaddr,dso"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
@@ -2608,6 +2632,7 @@ static int perf_c2c__report(int argc, const char **argv)
"tot_loads,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
"ld_lclhit,ld_rmthit",
+ c2c.display == DISPLAY_TOT ? "tot_hitm" :
c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
);

--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:06 AM11/23/16
to
From: David Ahern <dsa...@gmail.com>

The -s/--summary option shows only the process runtime statistics, while
the -S/--with-summary option shows the statistics after the normal output.

$ perf sched timehist -s

Runtime summary
comm parent sched-in run-time min-run avg-run max-run stddev
(count) (msec) (msec) (msec) (msec) %
---------------------------------------------------------------------------------------------------------
ksoftirqd/0[3] 2 2 0.011 0.004 0.005 0.006 14.87
rcu_preempt[7] 2 11 0.071 0.002 0.006 0.017 20.23
watchdog/0[11] 2 1 0.002 0.002 0.002 0.002 0.00
watchdog/1[12] 2 1 0.004 0.004 0.004 0.004 0.00
...

Terminated tasks:
sleep[7220] 7219 3 0.770 0.087 0.256 0.576 62.28

Idle stats:
CPU 0 idle for 2352.006 msec
CPU 1 idle for 2764.497 msec
CPU 2 idle for 2998.229 msec
CPU 3 idle for 2967.800 msec

Total number of unique tasks: 52
Total number of context switches: 2532
Total run time (msec): 218.036
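
To make the columns concrete, here is a simplified sketch of how the per-task count, total, min, avg and max could be accumulated from individual run times. The struct and function names are hypothetical, the stddev column is left out, and the patch itself relies on perf's existing stats helpers rather than on code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task accumulator; all times are in nanoseconds. */
struct run_summary {
	uint64_t n;	/* number of sched-in events */
	uint64_t total;	/* accumulated run time */
	uint64_t min;	/* shortest single run */
	uint64_t max;	/* longest single run */
};

/* Account one run of dt_run nanoseconds. */
static void run_summary_add(struct run_summary *s, uint64_t dt_run)
{
	assert(s != NULL);

	if (s->n == 0 || dt_run < s->min)
		s->min = dt_run;
	if (dt_run > s->max)
		s->max = dt_run;
	s->total += dt_run;
	s->n++;
}

/* Average run time, truncated; 0 when nothing was recorded. */
static uint64_t run_summary_avg(const struct run_summary *s)
{
	assert(s != NULL);
	return s->n ? s->total / s->n : 0;
}
```

For instance, three made-up runs of 87000, 576000 and 107000 nsec give a count of 3, a total of 0.770 msec, a minimum of 0.087, an average of 0.256 and a maximum of 0.576, the same shape as the sleep[7220] row above.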

Signed-off-by: David Ahern <dsa...@gmail.com>
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Ingo Molnar <mi...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Link: http://lkml.kernel.org/r/20161116060634....@kernel.org
[ Add documentation from last commit, so that the docs come with the cset that introduces the feature ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-sched.c | 166 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 160 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index c0ac0c9557e8..1e7d81ad5ec6 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -194,6 +194,11 @@ struct perf_sched {
bool force;
bool skip_merge;
struct perf_sched_map map;
+
+ /* options for timehist command */
+ bool summary;
+ bool summary_only;
+ u64 skipped_samples;
};

/* per thread run time data */
@@ -2010,12 +2015,15 @@ static struct thread *timehist_get_thread(struct perf_sample *sample,
return thread;
}

-static bool timehist_skip_sample(struct thread *thread)
+static bool timehist_skip_sample(struct perf_sched *sched,
+ struct thread *thread)
{
bool rc = false;

- if (thread__is_filtered(thread))
+ if (thread__is_filtered(thread)) {
rc = true;
+ sched->skipped_samples++;
+ }

return rc;
}
@@ -2045,7 +2053,7 @@ static int timehist_sched_wakeup_event(struct perf_tool *tool __maybe_unused,
return 0;
}

-static int timehist_sched_change_event(struct perf_tool *tool __maybe_unused,
+static int timehist_sched_change_event(struct perf_tool *tool,
union perf_event *event,
struct perf_evsel *evsel,
struct perf_sample *sample,
@@ -2056,6 +2064,7 @@ static int timehist_sched_change_event(struct perf_tool *tool __maybe_unused,
struct thread_runtime *tr = NULL;
u64 tprev;
int rc = 0;
+ struct perf_sched *sched = container_of(tool, struct perf_sched, tool);

if (machine__resolve(machine, &al, sample) < 0) {
pr_err("problem processing %d event. skipping it\n",
@@ -2070,7 +2079,7 @@ static int timehist_sched_change_event(struct perf_tool *tool __maybe_unused,
goto out;
}

- if (timehist_skip_sample(thread))
+ if (timehist_skip_sample(sched, thread))
goto out;

tr = thread__get_runtime(thread);
@@ -2082,7 +2091,8 @@ static int timehist_sched_change_event(struct perf_tool *tool __maybe_unused,
tprev = perf_evsel__get_time(evsel, sample->cpu);

timehist_update_runtime_stats(tr, sample->time, tprev);
- timehist_print_sample(sample, thread);
+ if (!sched->summary_only)
+ timehist_print_sample(sample, thread);

out:
if (tr) {
@@ -2122,6 +2132,131 @@ static int process_lost(struct perf_tool *tool __maybe_unused,
}


+static void print_thread_runtime(struct thread *t,
+ struct thread_runtime *r)
+{
+ double mean = avg_stats(&r->run_stats);
+ float stddev;
+
+ printf("%*s %5d %9" PRIu64 " ",
+ comm_width, timehist_get_commstr(t), t->ppid,
+ (u64) r->run_stats.n);
+
+ print_sched_time(r->total_run_time, 8);
+ stddev = rel_stddev_stats(stddev_stats(&r->run_stats), mean);
+ print_sched_time(r->run_stats.min, 6);
+ printf(" ");
+ print_sched_time((u64) mean, 6);
+ printf(" ");
+ print_sched_time(r->run_stats.max, 6);
+ printf(" ");
+ printf("%5.2f", stddev);
+ printf("\n");
+}
+
+struct total_run_stats {
+ u64 sched_count;
+ u64 task_count;
+ u64 total_run_time;
+};
+
+static int __show_thread_runtime(struct thread *t, void *priv)
+{
+ struct total_run_stats *stats = priv;
+ struct thread_runtime *r;
+
+ if (thread__is_filtered(t))
+ return 0;
+
+ r = thread__priv(t);
+ if (r && r->run_stats.n) {
+ stats->task_count++;
+ stats->sched_count += r->run_stats.n;
+ stats->total_run_time += r->total_run_time;
+ print_thread_runtime(t, r);
+ }
+
+ return 0;
+}
+
+static int show_thread_runtime(struct thread *t, void *priv)
+{
+ if (t->dead)
+ return 0;
+
+ return __show_thread_runtime(t, priv);
+}
+
+static int show_deadthread_runtime(struct thread *t, void *priv)
+{
+ if (!t->dead)
+ return 0;
+
+ return __show_thread_runtime(t, priv);
+}
+
+static void timehist_print_summary(struct perf_sched *sched,
+ struct perf_session *session)
+{
+ struct machine *m = &session->machines.host;
+ struct total_run_stats totals;
+ u64 task_count;
+ struct thread *t;
+ struct thread_runtime *r;
+ int i;
+
+ memset(&totals, 0, sizeof(totals));
+
+ if (comm_width < 30)
+ comm_width = 30;
+
+ printf("\nRuntime summary\n");
+ printf("%*s parent sched-in ", comm_width, "comm");
+ printf(" run-time min-run avg-run max-run stddev\n");
+ printf("%*s (count) ", comm_width, "");
+ printf(" (msec) (msec) (msec) (msec) %%\n");
+ printf("%.105s\n", graph_dotted_line);
+
+ machine__for_each_thread(m, show_thread_runtime, &totals);
+ task_count = totals.task_count;
+ if (!task_count)
+ printf("<no still running tasks>\n");
+
+ printf("\nTerminated tasks:\n");
+ machine__for_each_thread(m, show_deadthread_runtime, &totals);
+ if (task_count == totals.task_count)
+ printf("<no terminated tasks>\n");
+
+ /* CPU idle stats not tracked when samples were skipped */
+ if (sched->skipped_samples)
+ return;
+
+ printf("\nIdle stats:\n");
+ for (i = 0; i <= idle_max_cpu; ++i) {
+ t = idle_threads[i];
+ if (!t)
+ continue;
+
+ r = thread__priv(t);
+ if (r && r->run_stats.n) {
+ totals.sched_count += r->run_stats.n;
+ printf(" CPU %2d idle for ", i);
+ print_sched_time(r->total_run_time, 6);
+ printf(" msec\n");
+ } else
+ printf(" CPU %2d idle entire time window\n", i);
+ }
+
+ printf("\n"
+ " Total number of unique tasks: %" PRIu64 "\n"
+ "Total number of context switches: %" PRIu64 "\n"
+ " Total run time (msec): ",
+ totals.task_count, totals.sched_count);
+
+ print_sched_time(totals.total_run_time, 2);
+ printf("\n");
+}
+
typedef int (*sched_handler)(struct perf_tool *tool,
union perf_event *event,
struct perf_evsel *evsel,
@@ -2163,6 +2298,7 @@ static int perf_sched__timehist(struct perf_sched *sched)
};

struct perf_session *session;
+ struct perf_evlist *evlist;
int err = -1;

/*
@@ -2185,6 +2321,8 @@ static int perf_sched__timehist(struct perf_sched *sched)
if (session == NULL)
return -ENOMEM;

+ evlist = session->evlist;
+
symbol__init(&session->header.env);

setup_pager();
@@ -2203,7 +2341,12 @@ static int perf_sched__timehist(struct perf_sched *sched)
if (init_idle_threads(sched->max_cpu))
goto out;

- timehist_header();
+ /* summary_only implies summary option, but don't overwrite summary if set */
+ if (sched->summary_only)
+ sched->summary = sched->summary_only;
+
+ if (!sched->summary_only)
+ timehist_header();

err = perf_session__process_events(session);
if (err) {
@@ -2211,6 +2354,13 @@ static int perf_sched__timehist(struct perf_sched *sched)
goto out;
}

+ sched->nr_events = evlist->stats.nr_events[0];
+ sched->nr_lost_events = evlist->stats.total_lost;
+ sched->nr_lost_chunks = evlist->stats.nr_events[PERF_RECORD_LOST];
+
+ if (sched->summary)
+ timehist_print_summary(sched, session);
+
out:
free_idle_threads();
perf_session__delete(session);
@@ -2569,6 +2719,10 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
"file", "kallsyms pathname"),
OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
"Look for files with symbols relative to this directory"),
+ OPT_BOOLEAN('s', "summary", &sched.summary_only,
+ "Show only syscall summary with statistics"),
+ OPT_BOOLEAN('S', "with-summary", &sched.summary,
+ "Show all syscalls and summary with statistics"),
OPT_PARENT(sched_options)
};

--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM11/23/16
to
From: Jiri Olsa <jo...@kernel.org>

When debugging, it is useful to see the file descriptor returned for each event.

Before:

$ perf stat -vvv -e cycles,cache-misses ls
...
sys_perf_event_open: pid 12146 cpu -1 group_fd -1 flags 0x8
...
sys_perf_event_open: pid 12146 cpu -1 group_fd 3 flags 0x8
sys_perf_event_open failed, error -13

Now:

$ perf stat -vvv -e cycles,cache-misses ls
...
sys_perf_event_open: pid 12858 cpu -1 group_fd -1 flags 0x8 = 3
...
sys_perf_event_open: pid 12858 cpu -1 group_fd 3 flags 0x8
sys_perf_event_open failed, error -13
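
The output above follows a "print the call, then complete the line with the result" pattern. A self-contained sketch of that pattern, writing into a buffer instead of using pr_debug2(), could look like this; format_open_debug() and its fd_or_err parameter are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: fd_or_err >= 0 is the returned file
 * descriptor, a negative value is -errno. Returns the number of
 * characters written, or -1 if the buffer is too small for the
 * first part of the line.
 */
static int format_open_debug(char *buf, size_t len, int pid, int cpu,
			     int group_fd, unsigned long flags, int fd_or_err)
{
	int n;

	assert(buf != NULL && len > 0);

	/* First part: the call being made, with no trailing newline yet. */
	n = snprintf(buf, len,
		     "sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
		     pid, cpu, group_fd, flags);
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (fd_or_err < 0)
		/* Failure: end the open line, then report the error. */
		n += snprintf(buf + n, len - n,
			      "\nsys_perf_event_open failed, error %d\n",
			      fd_or_err);
	else
		/* Success: append the descriptor to the same line. */
		n += snprintf(buf + n, len - n, " = %d\n", fd_or_err);

	return n;
}
```

Keeping the newline out of the first part is what lets the success case stay on one line, while the failure case starts its message with a newline.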

Signed-off-by: Jiri Olsa <jo...@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-2-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/evsel.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e58a2fbf3b16..b2365a63db45 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1481,7 +1481,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,

group_fd = get_group_fd(evsel, cpu, thread);
retry_open:
- pr_debug2("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
+ pr_debug2("sys_perf_event_open: pid %d cpu %d group_fd %d flags %#lx",
pid, cpus->map[cpu], group_fd, flags);

FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
@@ -1490,11 +1490,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
group_fd, flags);
if (FD(evsel, cpu, thread) < 0) {
err = -errno;
- pr_debug2("sys_perf_event_open failed, error %d\n",
+ pr_debug2("\nsys_perf_event_open failed, error %d\n",
err);
goto try_fallback;
}

+ pr_debug2(" = %d\n", FD(evsel, cpu, thread));
+
if (evsel->bpf_fd >= 0) {
int evt_fd = FD(evsel, cpu, thread);
int bpf_fd = evsel->bpf_fd;
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM11/23/16
to
From: Namhyung Kim <namh...@kernel.org>

__symbol__fprintf_symname_offs() always shows symbol offsets, so there is
no difference between 'perf script -F ip,sym' and 'perf script
-F ip,sym,symoff'. I don't think that is the desired behavior.
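
The intended behavior can be illustrated with a small standalone formatter, given here only as a sketch: format_symname() is a made-up name, and the "+0x..." suffix format is an assumption rather than something quoted from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: write the symbol name and, only when
 * print_offsets is set, the offset of addr from the symbol start.
 */
static int format_symname(char *buf, size_t len, const char *name,
			  uint64_t addr, uint64_t start, bool print_offsets)
{
	assert(buf != NULL && name != NULL);

	if (print_offsets)
		return snprintf(buf, len, "%s+0x%llx", name,
				(unsigned long long)(addr - start));

	return snprintf(buf, len, "%s", name);
}
```

The point is that the caller decides through an explicit flag whether the offset is printed, instead of the offset being emitted whenever an address is available.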

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Ingo Molnar <mi...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/evsel_fprintf.c | 6 ++++--
tools/perf/util/symbol.h | 3 ++-
tools/perf/util/symbol_fprintf.c | 11 ++++++-----
3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 662a0a6182e7..ccb602397b60 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -137,7 +137,8 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,

if (print_symoffset) {
printed += __symbol__fprintf_symname_offs(node->sym, &node_al,
- print_unknown_as_addr, fp);
+ print_unknown_as_addr,
+ true, fp);
} else {
printed += __symbol__fprintf_symname(node->sym, &node_al,
print_unknown_as_addr, fp);
@@ -188,7 +189,8 @@ int sample__fprintf_sym(struct perf_sample *sample, struct addr_location *al,
printed += fprintf(fp, " ");
if (print_symoffset) {
printed += __symbol__fprintf_symname_offs(al->sym, al,
- print_unknown_as_addr, fp);
+ print_unknown_as_addr,
+ true, fp);
} else {
printed += __symbol__fprintf_symname(al->sym, al,
print_unknown_as_addr, fp);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 2d0a905c879a..dec7e2d44885 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -282,7 +282,8 @@ int symbol__annotation_init(void);
struct symbol *symbol__new(u64 start, u64 len, u8 binding, const char *name);
size_t __symbol__fprintf_symname_offs(const struct symbol *sym,
const struct addr_location *al,
- bool unknown_as_addr, FILE *fp);
+ bool unknown_as_addr,
+ bool print_offsets, FILE *fp);
size_t symbol__fprintf_symname_offs(const struct symbol *sym,
const struct addr_location *al, FILE *fp);
size_t __symbol__fprintf_symname(const struct symbol *sym,
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index a680bdaa65dc..7c6b33e8e2d2 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -15,14 +15,15 @@ size_t symbol__fprintf(struct symbol *sym, FILE *fp)

size_t __symbol__fprintf_symname_offs(const struct symbol *sym,
const struct addr_location *al,
- bool unknown_as_addr, FILE *fp)
+ bool unknown_as_addr,
+ bool print_offsets, FILE *fp)
{
unsigned long offset;
size_t length;

if (sym && sym->name) {
length = fprintf(fp, "%s", sym->name);
- if (al) {
+ if (al && print_offsets) {
if (al->addr < sym->end)
offset = al->addr - sym->start;
else
@@ -40,19 +41,19 @@ size_t symbol__fprintf_symname_offs(const struct symbol *sym,
const struct addr_location *al,
FILE *fp)
{
- return __symbol__fprintf_symname_offs(sym, al, false, fp);
+ return __symbol__fprintf_symname_offs(sym, al, false, true, fp);
}

size_t __symbol__fprintf_symname(const struct symbol *sym,
const struct addr_location *al,
bool unknown_as_addr, FILE *fp)
{
- return __symbol__fprintf_symname_offs(sym, al, unknown_as_addr, fp);
+ return __symbol__fprintf_symname_offs(sym, al, unknown_as_addr, false, fp);
}

size_t symbol__fprintf_symname(const struct symbol *sym, FILE *fp)
{
- return __symbol__fprintf_symname_offs(sym, NULL, false, fp);
+ return __symbol__fprintf_symname_offs(sym, NULL, false, false, fp);
}

size_t dso__fprintf_symbols_by_name(struct dso *dso,
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM11/23/16
to
From: David Ahern <dsa...@gmail.com>

'perf sched timehist' provides an analysis of scheduling events.

Example usage:
perf sched record -- sleep 1
perf sched timehist

By default it shows the individual schedule events, including the wait
time (time between sched-out and next sched-in events for the task), the
task scheduling delay (time between wakeup and actually running) and run
time for the task:

time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
-------------- ------ -------------------- --------- --------- ---------
79371.874569 [0011] gcc[31949] 0.014 0.000 1.148
79371.874591 [0010] gcc[31951] 0.000 0.000 0.024
79371.874603 [0010] migration/10[59] 3.350 0.004 0.011
79371.874604 [0011] <idle> 1.148 0.000 0.035
79371.874723 [0005] <idle> 0.016 0.000 1.383
79371.874746 [0005] gcc[31949] 0.153 0.078 0.022
...
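
The three columns can be derived from four timestamps. The sketch below mirrors the logic of timehist_update_runtime_stats() in this patch, but as a standalone function with hypothetical names (struct sched_deltas, compute_deltas()) and without the debug messages.

```c
#include <stdint.h>

/* Hypothetical result type; all values are in nanoseconds. */
struct sched_deltas {
	uint64_t wait;	/* time off the cpu before this run */
	uint64_t delay;	/* time between wakeup and sched-in */
	uint64_t run;	/* time spent running */
};

/*
 * t        - time of the current sched-out event
 * tprev    - time the task was scheduled in (0 if unknown)
 * last_out - time the task was previously scheduled out (0 if unknown)
 * wakeup   - time the task was woken up (0 if unknown)
 */
static struct sched_deltas compute_deltas(uint64_t t, uint64_t tprev,
					  uint64_t last_out, uint64_t wakeup)
{
	struct sched_deltas d = { 0, 0, 0 };

	if (!tprev)
		return d;	/* no sched-in seen: nothing can be derived */

	d.run = t - tprev;

	/* Ignore timestamps that are newer than the sched-in event. */
	if (wakeup && wakeup <= tprev)
		d.delay = tprev - wakeup;
	if (last_out && last_out <= tprev)
		d.wait = tprev - last_out;

	return d;
}
```

As a made-up example, t = 1000000, tprev = 978000, last_out = 825000 and wakeup = 900000 give wait 0.153, delay 0.078 and run 0.022 msec, matching the shape of the last gcc row.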

Times are in msec.usec.
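
For reference, converting a nanosecond value into this msec.usec form is a matter of two integer divisions. The sketch below follows print_sched_time() from this patch but formats into a buffer; format_sched_time() is a hypothetical name and the macros are defined locally so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_USEC	1000ULL
#define NSEC_PER_MSEC	1000000ULL

/*
 * Hypothetical formatter: milliseconds right-aligned in 'width'
 * columns, then the remaining microseconds as three digits.
 */
static int format_sched_time(char *buf, size_t len, uint64_t nsecs, int width)
{
	unsigned long long msecs = nsecs / NSEC_PER_MSEC;
	unsigned long long usecs = (nsecs % NSEC_PER_MSEC) / NSEC_PER_USEC;

	assert(buf != NULL);

	return snprintf(buf, len, "%*llu.%03llu", width, msecs, usecs);
}
```

Anything below one microsecond is truncated, so 1148999 nsec is shown as 1.148.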

Committer note:

Add above explanation as the 'perf sched timehist' entry for 'man
perf-sched'.

Signed-off-by: David Ahern <dsa...@gmail.com>
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Ingo Molnar <mi...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-sched.txt | 50 ++-
tools/perf/builtin-sched.c | 594 +++++++++++++++++++++++++++++++-
2 files changed, 637 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 1cc08cc47ac5..72730da307b9 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -8,11 +8,11 @@ perf-sched - Tool to trace/measure scheduler properties (latencies)
SYNOPSIS
--------
[verse]
-'perf sched' {record|latency|map|replay|script}
+'perf sched' {record|latency|map|replay|script|timehist}

DESCRIPTION
-----------
-There are five variants of perf sched:
+There are several variants of 'perf sched':

'perf sched record <command>' to record the scheduling events
of an arbitrary workload.
@@ -36,6 +36,30 @@ There are five variants of perf sched:
are running on a CPU. A '*' denotes the CPU that had the event, and
a dot signals an idle CPU.

+ 'perf sched timehist' provides an analysis of scheduling events.
+
+ Example usage:
+ perf sched record -- sleep 1
+ perf sched timehist
+
+ By default it shows the individual schedule events, including the wait
+ time (time between sched-out and next sched-in events for the task), the
+ task scheduling delay (time between wakeup and actually running) and run
+ time for the task:
+
+ time cpu task name wait time sch delay run time
+ [tid/pid] (msec) (msec) (msec)
+ -------------- ------ -------------------- --------- --------- ---------
+ 79371.874569 [0011] gcc[31949] 0.014 0.000 1.148
+ 79371.874591 [0010] gcc[31951] 0.000 0.000 0.024
+ 79371.874603 [0010] migration/10[59] 3.350 0.004 0.011
+ 79371.874604 [0011] <idle> 1.148 0.000 0.035
+ 79371.874723 [0005] <idle> 0.016 0.000 1.383
+ 79371.874746 [0005] gcc[31949] 0.153 0.078 0.022
+ ...
+
+ Times are in msec.usec.
+
OPTIONS
-------
-i::
@@ -66,6 +90,28 @@ OPTIONS for 'perf sched map'
--color-pids::
Highlight the given pids.

+OPTIONS for 'perf sched timehist'
+---------------------------------
+-k::
+--vmlinux=<file>::
+ vmlinux pathname
+
+--kallsyms=<file>::
+ kallsyms pathname
+
+-s::
+--summary::
+ Show only a summary of scheduling by thread with min, max, and average
+ run times (in sec) and relative stddev.
+
+-S::
+--with-summary::
+ Show all scheduling events followed by a summary by thread with min,
+ max, and average run times (in sec) and relative stddev.
+
+--symfs=<directory>::
+ Look for files with symbols relative to this directory.
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index fb3441211e4b..c0ac0c9557e8 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,12 +13,14 @@
#include "util/cloexec.h"
#include "util/thread_map.h"
#include "util/color.h"
+#include "util/stat.h"

#include <subcmd/parse-options.h>
#include "util/trace-event.h"

#include "util/debug.h"

+#include <linux/log2.h>
#include <sys/prctl.h>
#include <sys/resource.h>

@@ -194,6 +196,29 @@ struct perf_sched {
struct perf_sched_map map;
};

+/* per thread run time data */
+struct thread_runtime {
+ u64 last_time; /* time of previous sched in/out event */
+ u64 dt_run; /* run time */
+ u64 dt_wait; /* time between CPU access (off cpu) */
+ u64 dt_delay; /* time between wakeup and sched-in */
+ u64 ready_to_run; /* time of wakeup */
+
+ struct stats run_stats;
+ u64 total_run_time;
+};
+
+/* per event run time data */
+struct evsel_runtime {
+ u64 *last_time; /* time this event was last seen per cpu */
+ u32 ncpu; /* highest cpu slot allocated */
+};
+
+/* track idle times per cpu */
+static struct thread **idle_threads;
+static int idle_max_cpu;
+static char idle_comm[] = "<idle>";
+
static u64 get_nsecs(void)
{
struct timespec ts;
@@ -1654,6 +1679,546 @@ static int perf_sched__read_events(struct perf_sched *sched)
return rc;
}

+/*
+ * scheduling times are printed as msec.usec
+ */
+static inline void print_sched_time(unsigned long long nsecs, int width)
+{
+ unsigned long msecs;
+ unsigned long usecs;
+
+ msecs = nsecs / NSEC_PER_MSEC;
+ nsecs -= msecs * NSEC_PER_MSEC;
+ usecs = nsecs / NSEC_PER_USEC;
+ printf("%*lu.%03lu ", width, msecs, usecs);
+}
+
+/*
+ * returns runtime data for event, allocating memory for it the
+ * first time it is used.
+ */
+static struct evsel_runtime *perf_evsel__get_runtime(struct perf_evsel *evsel)
+{
+ struct evsel_runtime *r = evsel->priv;
+
+ if (r == NULL) {
+ r = zalloc(sizeof(struct evsel_runtime));
+ evsel->priv = r;
+ }
+
+ return r;
+}
+
+/*
+ * save last time event was seen per cpu
+ */
+static void perf_evsel__save_time(struct perf_evsel *evsel,
+ u64 timestamp, u32 cpu)
+{
+ struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+
+ if (r == NULL)
+ return;
+
+ if ((cpu >= r->ncpu) || (r->last_time == NULL)) {
+ int i, n = __roundup_pow_of_two(cpu+1);
+ void *p = r->last_time;
+
+ p = realloc(r->last_time, n * sizeof(u64));
+ if (!p)
+ return;
+
+ r->last_time = p;
+ for (i = r->ncpu; i < n; ++i)
+ r->last_time[i] = (u64) 0;
+
+ r->ncpu = n;
+ }
+
+ r->last_time[cpu] = timestamp;
+}
+
+/* returns last time this event was seen on the given cpu */
+static u64 perf_evsel__get_time(struct perf_evsel *evsel, u32 cpu)
+{
+ struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+
+ if ((r == NULL) || (r->last_time == NULL) || (cpu >= r->ncpu))
+ return 0;
+
+ return r->last_time[cpu];
+}
+
+static int comm_width = 20;
+
+static char *timehist_get_commstr(struct thread *thread)
+{
+ static char str[32];
+ const char *comm = thread__comm_str(thread);
+ pid_t tid = thread->tid;
+ pid_t pid = thread->pid_;
+ int n;
+
+ if (pid == 0)
+ n = scnprintf(str, sizeof(str), "%s", comm);
+
+ else if (tid != pid)
+ n = scnprintf(str, sizeof(str), "%s[%d/%d]", comm, tid, pid);
+
+ else
+ n = scnprintf(str, sizeof(str), "%s[%d]", comm, tid);
+
+ if (n > comm_width)
+ comm_width = n;
+
+ return str;
+}
+
+static void timehist_header(void)
+{
+ printf("%15s %6s ", "time", "cpu");
+
+ printf(" %-20s %9s %9s %9s",
+ "task name", "wait time", "sch delay", "run time");
+
+ printf("\n");
+
+ /*
+ * units row
+ */
+ printf("%15s %-6s ", "", "");
+
+ printf(" %-20s %9s %9s %9s\n", "[tid/pid]", "(msec)", "(msec)", "(msec)");
+
+ /*
+ * separator
+ */
+ printf("%.15s %.6s ", graph_dotted_line, graph_dotted_line);
+
+ printf(" %.20s %.9s %.9s %.9s",
+ graph_dotted_line, graph_dotted_line, graph_dotted_line,
+ graph_dotted_line);
+
+ printf("\n");
+}
+
+static void timehist_print_sample(struct perf_sample *sample,
+ struct thread *thread)
+{
+ struct thread_runtime *tr = thread__priv(thread);
+ char tstr[64];
+
+ timestamp__scnprintf_usec(sample->time, tstr, sizeof(tstr));
+ printf("%15s [%04d] ", tstr, sample->cpu);
+
+ printf(" %-*s ", comm_width, timehist_get_commstr(thread));
+
+ print_sched_time(tr->dt_wait, 6);
+ print_sched_time(tr->dt_delay, 6);
+ print_sched_time(tr->dt_run, 6);
+ printf("\n");
+}
+
+/*
+ * Explanation of delta-time stats:
+ *
+ * t = time of current schedule out event
+ * tprev = time of previous sched out event
+ * also time of schedule-in event for current task
+ * last_time = time of last sched change event for current task
+ * (i.e, time process was last scheduled out)
+ * ready_to_run = time of wakeup for current task
+ *
+ * -----|------------|------------|------------|------
+ * last ready tprev t
+ * time to run
+ *
+ * |-------- dt_wait --------|
+ * |- dt_delay -|-- dt_run --|
+ *
+ * dt_run = run time of current task
+ * dt_wait = time between last schedule out event for task and tprev
+ * represents time spent off the cpu
+ * dt_delay = time between wakeup and schedule-in of task
+ */
+
+static void timehist_update_runtime_stats(struct thread_runtime *r,
+ u64 t, u64 tprev)
+{
+ r->dt_delay = 0;
+ r->dt_wait = 0;
+ r->dt_run = 0;
+ if (tprev) {
+ r->dt_run = t - tprev;
+ if (r->ready_to_run) {
+ if (r->ready_to_run > tprev)
+ pr_debug("time travel: wakeup time for task > previous sched_switch event\n");
+ else
+ r->dt_delay = tprev - r->ready_to_run;
+ }
+
+ if (r->last_time > tprev)
+ pr_debug("time travel: last sched out time for task > previous sched_switch event\n");
+ else if (r->last_time)
+ r->dt_wait = tprev - r->last_time;
+ }
+
+ update_stats(&r->run_stats, r->dt_run);
+ r->total_run_time += r->dt_run;
+}
+
+static bool is_idle_sample(struct perf_sample *sample,
+ struct perf_evsel *evsel)
+{
+ /* pid 0 == swapper == idle task */
+ if (sample->pid == 0)
+ return true;
+
+ if (strcmp(perf_evsel__name(evsel), "sched:sched_switch") == 0) {
+ if (perf_evsel__intval(evsel, sample, "prev_pid") == 0)
+ return true;
+ }
+ return false;
+}
+
+/*
+ * Track idle stats per cpu by maintaining a local thread
+ * struct for the idle task on each cpu.
+ */
+static int init_idle_threads(int ncpu)
+{
+ int i;
+
+ idle_threads = zalloc(ncpu * sizeof(struct thread *));
+ if (!idle_threads)
+ return -ENOMEM;
+
+ idle_max_cpu = ncpu - 1;
+
+ /* allocate the actual thread struct if needed */
+ for (i = 0; i < ncpu; ++i) {
+ idle_threads[i] = thread__new(0, 0);
+ if (idle_threads[i] == NULL)
+ return -ENOMEM;
+
+ thread__set_comm(idle_threads[i], idle_comm, 0);
+ }
+
+ return 0;
+}
+
+static void free_idle_threads(void)
+{
+ int i;
+
+ if (idle_threads == NULL)
+ return;
+
+ for (i = 0; i <= idle_max_cpu; ++i) {
+ if ((idle_threads[i]))
+ thread__delete(idle_threads[i]);
+ }
+
+ free(idle_threads);
+}
+
+static struct thread *get_idle_thread(int cpu)
+{
+ /*
+ * expand/allocate array of pointers to local thread
+ * structs if needed
+ */
+ if ((cpu >= idle_max_cpu) || (idle_threads == NULL)) {
+ int i, j = __roundup_pow_of_two(cpu+1);
+ void *p;
+
+ p = realloc(idle_threads, j * sizeof(struct thread *));
+ if (!p)
+ return NULL;
+
+ idle_threads = (struct thread **) p;
+ i = idle_max_cpu ? idle_max_cpu + 1 : 0;
+ for (; i < j; ++i)
+ idle_threads[i] = NULL;
+
+ idle_max_cpu = j;
+ }
+
+ /* allocate a new thread struct if needed */
+ if (idle_threads[cpu] == NULL) {
+ idle_threads[cpu] = thread__new(0, 0);
+
+ if (idle_threads[cpu]) {
+ idle_threads[cpu]->tid = 0;
+ thread__set_comm(idle_threads[cpu], idle_comm, 0);
+ }
+ }
+
+ return idle_threads[cpu];
+}
+
+/*
+ * handle runtime stats saved per thread
+ */
+static struct thread_runtime *thread__init_runtime(struct thread *thread)
+{
+ struct thread_runtime *r;
+
+ r = zalloc(sizeof(struct thread_runtime));
+ if (!r)
+ return NULL;
+
+ init_stats(&r->run_stats);
+ thread__set_priv(thread, r);
+
+ return r;
+}
+
+static struct thread_runtime *thread__get_runtime(struct thread *thread)
+{
+ struct thread_runtime *tr;
+
+ tr = thread__priv(thread);
+ if (tr == NULL) {
+ tr = thread__init_runtime(thread);
+ if (tr == NULL)
+ pr_debug("Failed to malloc memory for runtime data.\n");
+ }
+
+ return tr;
+}
+
+static struct thread *timehist_get_thread(struct perf_sample *sample,
+ struct machine *machine,
+ struct perf_evsel *evsel)
+{
+ struct thread *thread;
+
+ if (is_idle_sample(sample, evsel)) {
+ thread = get_idle_thread(sample->cpu);
+ if (thread == NULL)
+ pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
+
+ } else {
+ thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+ if (thread == NULL) {
+ pr_debug("Failed to get thread for tid %d. skipping sample.\n",
+ sample->tid);
+ }
+ }
+
+ return thread;
+}
+
+static bool timehist_skip_sample(struct thread *thread)
+{
+ bool rc = false;
+
+ if (thread__is_filtered(thread))
+ rc = true;
+
+ return rc;
+}
+
+static int timehist_sched_wakeup_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event __maybe_unused,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ struct thread *thread;
+ struct thread_runtime *tr = NULL;
+ /* want pid of awakened task not pid in sample */
+ const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+
+ thread = machine__findnew_thread(machine, 0, pid);
+ if (thread == NULL)
+ return -1;
+
+ tr = thread__get_runtime(thread);
+ if (tr == NULL)
+ return -1;
+
+ if (tr->ready_to_run == 0)
+ tr->ready_to_run = sample->time;
+
+ return 0;
+}
+
+static int timehist_sched_change_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ struct addr_location al;
+ struct thread *thread;
+ struct thread_runtime *tr = NULL;
+ u64 tprev;
+ int rc = 0;
+
+ if (machine__resolve(machine, &al, sample) < 0) {
+ pr_err("problem processing %d event. skipping it\n",
+ event->header.type);
+ rc = -1;
+ goto out;
+ }
+
+ thread = timehist_get_thread(sample, machine, evsel);
+ if (thread == NULL) {
+ rc = -1;
+ goto out;
+ }
+
+ if (timehist_skip_sample(thread))
+ goto out;
+
+ tr = thread__get_runtime(thread);
+ if (tr == NULL) {
+ rc = -1;
+ goto out;
+ }
+
+ tprev = perf_evsel__get_time(evsel, sample->cpu);
+
+ timehist_update_runtime_stats(tr, sample->time, tprev);
+ timehist_print_sample(sample, thread);
+
+out:
+ if (tr) {
+ /* time of this sched_switch event becomes last time task seen */
+ tr->last_time = sample->time;
+
+ /* sched out event for task so reset ready to run time */
+ tr->ready_to_run = 0;
+ }
+
+ perf_evsel__save_time(evsel, sample->time, sample->cpu);
+
+ return rc;
+}
+
+static int timehist_sched_switch_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine __maybe_unused)
+{
+ return timehist_sched_change_event(tool, event, evsel, sample, machine);
+}
+
+static int process_lost(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine __maybe_unused)
+{
+ char tstr[64];
+
+ timestamp__scnprintf_usec(sample->time, tstr, sizeof(tstr));
+ printf("%15s ", tstr);
+ printf("lost %" PRIu64 " events on cpu %d\n", event->lost.lost, sample->cpu);
+
+ return 0;
+}
+
+
+typedef int (*sched_handler)(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine);
+
+static int perf_timehist__process_sample(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
+{
+ struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
+ int err = 0;
+ int this_cpu = sample->cpu;
+
+ if (this_cpu > sched->max_cpu)
+ sched->max_cpu = this_cpu;
+
+ if (evsel->handler != NULL) {
+ sched_handler f = evsel->handler;
+
+ err = f(tool, event, evsel, sample, machine);
+ }
+
+ return err;
+}
+
+static int perf_sched__timehist(struct perf_sched *sched)
+{
+ const struct perf_evsel_str_handler handlers[] = {
+ { "sched:sched_switch", timehist_sched_switch_event, },
+ { "sched:sched_wakeup", timehist_sched_wakeup_event, },
+ { "sched:sched_wakeup_new", timehist_sched_wakeup_event, },
+ };
+ struct perf_data_file file = {
+ .path = input_name,
+ .mode = PERF_DATA_MODE_READ,
+ };
+
+ struct perf_session *session;
+ int err = -1;
+
+ /*
+ * event handlers for timehist option
+ */
+ sched->tool.sample = perf_timehist__process_sample;
+ sched->tool.mmap = perf_event__process_mmap;
+ sched->tool.comm = perf_event__process_comm;
+ sched->tool.exit = perf_event__process_exit;
+ sched->tool.fork = perf_event__process_fork;
+ sched->tool.lost = process_lost;
+ sched->tool.attr = perf_event__process_attr;
+ sched->tool.tracing_data = perf_event__process_tracing_data;
+ sched->tool.build_id = perf_event__process_build_id;
+
+ sched->tool.ordered_events = true;
+ sched->tool.ordering_requires_timestamps = true;
+
+ session = perf_session__new(&file, false, &sched->tool);
+ if (session == NULL)
+ return -ENOMEM;
+
+ symbol__init(&session->header.env);
+
+ setup_pager();
+
+ /* setup per-evsel handlers */
+ if (perf_session__set_tracepoints_handlers(session, handlers))
+ goto out;
+
+ if (!perf_session__has_traces(session, "record -R"))
+ goto out;
+
+ /* pre-allocate struct for per-CPU idle stats */
+ sched->max_cpu = session->header.env.nr_cpus_online;
+ if (sched->max_cpu == 0)
+ sched->max_cpu = 4;
+ if (init_idle_threads(sched->max_cpu))
+ goto out;
+
+ timehist_header();
+
+ err = perf_session__process_events(session);
+ if (err) {
+ pr_err("Failed to process events, error %d", err);
+ goto out;
+ }
+
+out:
+ free_idle_threads();
+ perf_session__delete(session);
+
+ return err;
+}
+
+
static void print_bad_events(struct perf_sched *sched)
{
if (sched->nr_unordered_timestamps && sched->nr_timestamps) {
@@ -1970,8 +2535,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
const struct option latency_options[] = {
OPT_STRING('s', "sort", &sched.sort_order, "key[,key2...]",
"sort by key(s): runtime, switch, avg, max"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show symbol address, etc)"),
OPT_INTEGER('C', "CPU", &sched.profile_cpu,
"CPU to profile on"),
OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
@@ -1983,8 +2546,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
const struct option replay_options[] = {
OPT_UINTEGER('r', "repeat", &sched.replay_repeat,
"repeat the workload replay N times (-1: infinite)"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show symbol address, etc)"),
OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
"dump raw trace in ASCII"),
OPT_BOOLEAN('f', "force", &sched.force, "don't complain, do it"),
@@ -2001,6 +2562,16 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
"display given CPUs in map"),
OPT_PARENT(sched_options)
};
+ const struct option timehist_options[] = {
+ OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
+ "file", "vmlinux pathname"),
+ OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
+ "file", "kallsyms pathname"),
+ OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
+ "Look for files with symbols relative to this directory"),
+ OPT_PARENT(sched_options)
+ };
+
const char * const latency_usage[] = {
"perf sched latency [<options>]",
NULL
@@ -2013,8 +2584,13 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
"perf sched map [<options>]",
NULL
};
+ const char * const timehist_usage[] = {
+ "perf sched timehist [<options>]",
+ NULL
+ };
const char *const sched_subcommands[] = { "record", "latency", "map",
- "replay", "script", NULL };
+ "replay", "script",
+ "timehist", NULL };
const char *sched_usage[] = {
NULL,
NULL
@@ -2077,6 +2653,14 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
usage_with_options(replay_usage, replay_options);
}
return perf_sched__replay(&sched);
+ } else if (!strcmp(argv[0], "timehist")) {
+ if (argc) {
+ argc = parse_options(argc, argv, timehist_options,
+ timehist_usage, 0);
+ if (argc)
+ usage_with_options(timehist_usage, timehist_options);
+ }
+ return perf_sched__timehist(&sched);
} else {
usage_with_options(sched_usage, sched_options);
}
--
2.7.4
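
For reference, the delta-time arithmetic described in the comment above
timehist_update_runtime_stats() can be worked through in isolation. What
follows is a minimal sketch, not code from the patch: struct dt_stats and
dt_update() are hypothetical names, and the statistics accumulation
(update_stats(), total_run_time) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timing fields of struct thread_runtime. */
struct dt_stats {
	uint64_t last_time;    /* time the task was last scheduled out */
	uint64_t ready_to_run; /* time of wakeup, 0 if no wakeup was seen */
	uint64_t dt_run;       /* time spent running */
	uint64_t dt_wait;      /* time spent off the cpu */
	uint64_t dt_delay;     /* time between wakeup and schedule-in */
};

/*
 * t     = time of the current sched-out event
 * tprev = time of the previous sched-out event on this cpu, which is
 *         also the schedule-in time of the current task
 */
static void dt_update(struct dt_stats *r, uint64_t t, uint64_t tprev)
{
	assert(r != NULL);

	r->dt_run = r->dt_wait = r->dt_delay = 0;
	if (!tprev)
		return; /* first event on this cpu, nothing to measure */

	r->dt_run = t - tprev;

	/* ignore "time travel": timestamps later than tprev */
	if (r->ready_to_run && r->ready_to_run <= tprev)
		r->dt_delay = tprev - r->ready_to_run;

	if (r->last_time && r->last_time <= tprev)
		r->dt_wait = tprev - r->last_time;
}
```

With last_time = 100, ready_to_run = 130, tprev = 150 and t = 200 this
gives dt_run = 50, dt_wait = 50 and dt_delay = 20.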

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM
From: David Ahern <dsa...@gmail.com>

If callchains were recorded they are appended to the line with a default stack depth of 5:

1.874569 [0011] gcc[31949] 0.014 0.000 1.148 wait_for_completion_killable <- do_fork <- sys_vfork <- stub_vfork <- __vfork
1.874591 [0010] gcc[31951] 0.000 0.000 0.024 __cond_resched <- _cond_resched <- wait_for_completion <- stop_one_cpu <- sched_exec
1.874603 [0010] migration/10[59] 3.350 0.004 0.011 smpboot_thread_fn <- kthread <- ret_from_fork
1.874604 [0011] <idle> 1.148 0.000 0.035 cpu_startup_entry <- start_secondary
1.874723 [0005] <idle> 0.016 0.000 1.383 cpu_startup_entry <- start_secondary
1.874746 [0005] gcc[31949] 0.153 0.078 0.022 do_wait <- sys_wait4 <- system_call_fastpath <- __GI___waitpid

--no-call-graph suppresses the callchains. --max-stack controls the number
of frames shown (default of 5). The -x/--excl options can be used to
collapse redundant callchains so that more relevant data fits on screen.
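
The one-line callchain format shown above, and the effect of --max-stack,
can be illustrated with a small standalone sketch. This is not how perf
prints the chain (the patch goes through sample__fprintf_sym() and the
callchain cursor); callchain_oneline() and its parameters are hypothetical
and only mimic the " <- " layout and the frame limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Join at most max_stack frame names with " <- " into buf.
 * Returns the number of frames that were written completely.
 */
static size_t callchain_oneline(char *buf, size_t size,
				const char *const *frames, size_t nr_frames,
				size_t max_stack)
{
	size_t i, used = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (i = 0; i < nr_frames && i < max_stack; i++) {
		int n = snprintf(buf + used, size - used, "%s%s",
				 i ? " <- " : "", frames[i]);

		if (n < 0 || (size_t)n >= size - used) {
			buf[used] = '\0'; /* drop the partially written frame */
			break;
		}
		used += (size_t)n;
	}

	return i;
}
```

Given the four frames do_wait, sys_wait4, system_call_fastpath and
__GI___waitpid, a max_stack of 5 prints the whole chain, while a max_stack
of 2 stops after "do_wait <- sys_wait4".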

Signed-off-by: David Ahern <dsa...@gmail.com>
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Ingo Molnar <mi...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Link: http://lkml.kernel.org/r/20161116060634....@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/Documentation/perf-sched.txt | 7 +++
tools/perf/builtin-sched.c | 88 ++++++++++++++++++++++++++++++---
2 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 9a77bc73e8a3..83452088727d 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -99,6 +99,13 @@ OPTIONS for 'perf sched timehist'
--kallsyms=<file>::
kallsyms pathname

+-g::
+--no-call-graph::
+ Do not display call chains if present.
+
+--max-stack::
+ Maximum number of functions to display in backtrace, default 5.
+
-s::
--summary::
Show only a summary of scheduling by thread with min, max, and average
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 8fb7bcc2cb76..1f8731640809 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -14,6 +14,7 @@
#include "util/thread_map.h"
#include "util/color.h"
#include "util/stat.h"
+#include "util/callchain.h"

#include <subcmd/parse-options.h>
#include "util/trace-event.h"
@@ -198,6 +199,8 @@ struct perf_sched {
/* options for timehist command */
bool summary;
bool summary_only;
+ bool show_callchain;
+ unsigned int max_stack;
bool show_wakeups;
u64 skipped_samples;
};
@@ -1810,6 +1813,7 @@ static void timehist_header(void)

static void timehist_print_sample(struct perf_sched *sched,
struct perf_sample *sample,
+ struct addr_location *al,
struct thread *thread)
{
struct thread_runtime *tr = thread__priv(thread);
@@ -1827,6 +1831,18 @@ static void timehist_print_sample(struct perf_sched *sched,
if (sched->show_wakeups)
printf(" %-*s", comm_width, "");

+ if (thread->tid == 0)
+ goto out;
+
+ if (sched->show_callchain)
+ printf(" ");
+
+ sample__fprintf_sym(sample, al, 0,
+ EVSEL__PRINT_SYM | EVSEL__PRINT_ONELINE |
+ EVSEL__PRINT_CALLCHAIN_ARROW,
+ &callchain_cursor, stdout);
+
+out:
printf("\n");
}

@@ -1878,9 +1894,14 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
r->total_run_time += r->dt_run;
}

-static bool is_idle_sample(struct perf_sample *sample,
- struct perf_evsel *evsel)
+static bool is_idle_sample(struct perf_sched *sched,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
{
+ struct thread *thread;
+ struct callchain_cursor *cursor = &callchain_cursor;
+
/* pid 0 == swapper == idle task */
if (sample->pid == 0)
return true;
@@ -1889,6 +1910,25 @@ static bool is_idle_sample(struct perf_sample *sample,
if (perf_evsel__intval(evsel, sample, "prev_pid") == 0)
return true;
}
+
+ /* want main thread for process - has maps */
+ thread = machine__findnew_thread(machine, sample->pid, sample->pid);
+ if (thread == NULL) {
+ pr_debug("Failed to get thread for pid %d.\n", sample->pid);
+ return false;
+ }
+
+ if (!symbol_conf.use_callchain || sample->callchain == NULL)
+ return false;
+
+ if (thread__resolve_callchain(thread, cursor, evsel, sample,
+ NULL, NULL, sched->max_stack) != 0) {
+ if (verbose)
+ error("Failed to resolve callchain. Skipping\n");
+
+ return false;
+ }
+ callchain_cursor_commit(cursor);
return false;
}

@@ -1999,13 +2039,14 @@ static struct thread_runtime *thread__get_runtime(struct thread *thread)
return tr;
}

-static struct thread *timehist_get_thread(struct perf_sample *sample,
+static struct thread *timehist_get_thread(struct perf_sched *sched,
+ struct perf_sample *sample,
struct machine *machine,
struct perf_evsel *evsel)
{
struct thread *thread;

- if (is_idle_sample(sample, evsel)) {
+ if (is_idle_sample(sched, sample, evsel, machine)) {
thread = get_idle_thread(sample->cpu);
if (thread == NULL)
pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
@@ -2115,7 +2156,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
goto out;
}

- thread = timehist_get_thread(sample, machine, evsel);
+ thread = timehist_get_thread(sched, sample, machine, evsel);
if (thread == NULL) {
rc = -1;
goto out;
@@ -2134,7 +2175,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,

timehist_update_runtime_stats(tr, sample->time, tprev);
if (!sched->summary_only)
- timehist_print_sample(sched, sample, thread);
+ timehist_print_sample(sched, sample, &al, thread);

out:
if (tr) {
@@ -2327,6 +2368,30 @@ static int perf_timehist__process_sample(struct perf_tool *tool,
return err;
}

+static int timehist_check_attr(struct perf_sched *sched,
+ struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ struct evsel_runtime *er;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ er = perf_evsel__get_runtime(evsel);
+ if (er == NULL) {
+ pr_err("Failed to allocate memory for evsel runtime data\n");
+ return -1;
+ }
+
+ if (sched->show_callchain &&
+ !(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ pr_info("Samples do not have callchains.\n");
+ sched->show_callchain = 0;
+ symbol_conf.use_callchain = 0;
+ }
+ }
+
+ return 0;
+}
+
static int perf_sched__timehist(struct perf_sched *sched)
{
const struct perf_evsel_str_handler handlers[] = {
@@ -2359,6 +2424,8 @@ static int perf_sched__timehist(struct perf_sched *sched)
sched->tool.ordered_events = true;
sched->tool.ordering_requires_timestamps = true;

+ symbol_conf.use_callchain = sched->show_callchain;
+
session = perf_session__new(&file, false, &sched->tool);
if (session == NULL)
return -ENOMEM;
@@ -2367,6 +2434,9 @@ static int perf_sched__timehist(struct perf_sched *sched)

symbol__init(&session->header.env);

+ if (timehist_check_attr(sched, evlist) != 0)
+ goto out;
+
setup_pager();

/* setup per-evsel handlers */
@@ -2714,6 +2784,8 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.next_shortname1 = 'A',
.next_shortname2 = '0',
.skip_merge = 0,
+ .show_callchain = 1,
+ .max_stack = 5,
};
const struct option sched_options[] = {
OPT_STRING('i', "input", &input_name, "file",
@@ -2759,6 +2831,10 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
"file", "vmlinux pathname"),
OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
"file", "kallsyms pathname"),
+ OPT_BOOLEAN('g', "call-graph", &sched.show_callchain,
+ "Display call chains if present (default on)"),
+ OPT_UINTEGER(0, "max-stack", &sched.max_stack,
+ "Maximum number of functions to display backtrace."),
OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
"Look for files with symbols relative to this directory"),
OPT_BOOLEAN('s', "summary", &sched.summary_only,
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Another step in supporting cross annotation.

The arch specific tables are put in:

tools/perf/arch/$ARCH/annotate/instructions.c

which, so far, just plug instructions into a bunch of parsers/formatters,
but may have more as the need arises.

This is an alternative implementation to a previous attempt made by Ravi
Bangoria.
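
To make the idea concrete, here is a minimal standalone sketch of a
per-arch instruction table lookup. It assumes, based only on the new
sorted_instructions field, that the table is sorted lazily on first use
and then searched with bsearch(); the body of ins__find() is not part of
the hunks quoted here, so treat this as an illustration rather than the
actual implementation. struct ins_ops is reduced to a hypothetical
one-field stand-in and the demo tables are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct ins_ops { const char *kind; };	/* hypothetical stand-in */

struct ins {
	const char *name;
	struct ins_ops *ops;
};

struct arch {
	const char *name;
	struct ins *instructions;
	size_t nr_instructions;
	bool sorted_instructions;
};

/* qsort() comparator: order table entries by mnemonic */
static int ins__cmp(const void *a, const void *b)
{
	const struct ins *ia = a, *ib = b;

	return strcmp(ia->name, ib->name);
}

/* bsearch() comparator: compare a mnemonic against a table entry */
static int ins__key_cmp(const void *key, const void *entry)
{
	const struct ins *ins = entry;

	return strcmp(key, ins->name);
}

static struct ins *ins__find(struct arch *arch, const char *name)
{
	assert(arch != NULL && name != NULL);

	if (!arch->sorted_instructions) {
		/* tables need not be written in sorted order */
		qsort(arch->instructions, arch->nr_instructions,
		      sizeof(struct ins), ins__cmp);
		arch->sorted_instructions = true;
	}

	return bsearch(name, arch->instructions, arch->nr_instructions,
		       sizeof(struct ins), ins__key_cmp);
}

static struct ins_ops call_ops = { "call" };
static struct ins_ops jump_ops = { "jump" };
static struct ins_ops ret_ops  = { "ret" };

/* deliberately unsorted demo tables */
static struct ins demo_arm[] = {
	{ .name = "bne", .ops = &jump_ops, },
	{ .name = "bl",  .ops = &call_ops, },
	{ .name = "b",   .ops = &jump_ops, },
};

static struct ins demo_x86[] = {
	{ .name = "retq",  .ops = &ret_ops, },
	{ .name = "callq", .ops = &call_ops, },
	{ .name = "jmp",   .ops = &jump_ops, },
};

static struct arch demo_archs[] = {
	{ .name = "arm", .instructions = demo_arm,
	  .nr_instructions = sizeof(demo_arm) / sizeof(demo_arm[0]), },
	{ .name = "x86", .instructions = demo_x86,
	  .nr_instructions = sizeof(demo_x86) / sizeof(demo_x86[0]), },
};
```

Because each arch carries its own table, looking up "jmp" in the arm demo
table returns NULL while the same lookup in the x86 demo table finds the
jump entry, regardless of the architecture the tool was built on.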

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Russell King <rmk+k...@arm.linux.org.uk>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-g3wt282lfa...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/arm/annotate/instructions.c | 90 ++++++++++++++++++
tools/perf/arch/x86/annotate/instructions.c | 78 ++++++++++++++++
tools/perf/util/annotate.c | 136 ++++++----------------------
3 files changed, 198 insertions(+), 106 deletions(-)
create mode 100644 tools/perf/arch/arm/annotate/instructions.c
create mode 100644 tools/perf/arch/x86/annotate/instructions.c

diff --git a/tools/perf/arch/arm/annotate/instructions.c b/tools/perf/arch/arm/annotate/instructions.c
new file mode 100644
index 000000000000..d67b8aa26274
--- /dev/null
+++ b/tools/perf/arch/arm/annotate/instructions.c
@@ -0,0 +1,90 @@
+static struct ins arm__instructions[] = {
+ { .name = "add", .ops = &mov_ops, },
+ { .name = "addl", .ops = &mov_ops, },
+ { .name = "addq", .ops = &mov_ops, },
+ { .name = "addw", .ops = &mov_ops, },
+ { .name = "and", .ops = &mov_ops, },
+ { .name = "b", .ops = &jump_ops, }, // might also be a call
+ { .name = "bcc", .ops = &jump_ops, },
+ { .name = "bcs", .ops = &jump_ops, },
+ { .name = "beq", .ops = &jump_ops, },
+ { .name = "bge", .ops = &jump_ops, },
+ { .name = "bgt", .ops = &jump_ops, },
+ { .name = "bhi", .ops = &jump_ops, },
+ { .name = "bl", .ops = &call_ops, },
+ { .name = "bls", .ops = &jump_ops, },
+ { .name = "blt", .ops = &jump_ops, },
+ { .name = "blx", .ops = &call_ops, },
+ { .name = "bne", .ops = &jump_ops, },
+ { .name = "bts", .ops = &mov_ops, },
+ { .name = "call", .ops = &call_ops, },
+ { .name = "callq", .ops = &call_ops, },
+ { .name = "cmp", .ops = &mov_ops, },
+ { .name = "cmpb", .ops = &mov_ops, },
+ { .name = "cmpl", .ops = &mov_ops, },
+ { .name = "cmpq", .ops = &mov_ops, },
+ { .name = "cmpw", .ops = &mov_ops, },
+ { .name = "cmpxch", .ops = &mov_ops, },
+ { .name = "dec", .ops = &dec_ops, },
+ { .name = "decl", .ops = &dec_ops, },
+ { .name = "imul", .ops = &mov_ops, },
+ { .name = "inc", .ops = &dec_ops, },
+ { .name = "incl", .ops = &dec_ops, },
+ { .name = "ja", .ops = &jump_ops, },
+ { .name = "jae", .ops = &jump_ops, },
+ { .name = "jb", .ops = &jump_ops, },
+ { .name = "jbe", .ops = &jump_ops, },
+ { .name = "jc", .ops = &jump_ops, },
+ { .name = "jcxz", .ops = &jump_ops, },
+ { .name = "je", .ops = &jump_ops, },
+ { .name = "jecxz", .ops = &jump_ops, },
+ { .name = "jg", .ops = &jump_ops, },
+ { .name = "jge", .ops = &jump_ops, },
+ { .name = "jl", .ops = &jump_ops, },
+ { .name = "jle", .ops = &jump_ops, },
+ { .name = "jmp", .ops = &jump_ops, },
+ { .name = "jmpq", .ops = &jump_ops, },
+ { .name = "jna", .ops = &jump_ops, },
+ { .name = "jnae", .ops = &jump_ops, },
+ { .name = "jnb", .ops = &jump_ops, },
+ { .name = "jnbe", .ops = &jump_ops, },
+ { .name = "jnc", .ops = &jump_ops, },
+ { .name = "jne", .ops = &jump_ops, },
+ { .name = "jng", .ops = &jump_ops, },
+ { .name = "jnge", .ops = &jump_ops, },
+ { .name = "jnl", .ops = &jump_ops, },
+ { .name = "jnle", .ops = &jump_ops, },
+ { .name = "jno", .ops = &jump_ops, },
+ { .name = "jnp", .ops = &jump_ops, },
+ { .name = "jns", .ops = &jump_ops, },
+ { .name = "jnz", .ops = &jump_ops, },
+ { .name = "jo", .ops = &jump_ops, },
+ { .name = "jp", .ops = &jump_ops, },
+ { .name = "jpe", .ops = &jump_ops, },
+ { .name = "jpo", .ops = &jump_ops, },
+ { .name = "jrcxz", .ops = &jump_ops, },
+ { .name = "js", .ops = &jump_ops, },
+ { .name = "jz", .ops = &jump_ops, },
+ { .name = "lea", .ops = &mov_ops, },
+ { .name = "lock", .ops = &lock_ops, },
+ { .name = "mov", .ops = &mov_ops, },
+ { .name = "movb", .ops = &mov_ops, },
+ { .name = "movdqa", .ops = &mov_ops, },
+ { .name = "movl", .ops = &mov_ops, },
+ { .name = "movq", .ops = &mov_ops, },
+ { .name = "movslq", .ops = &mov_ops, },
+ { .name = "movzbl", .ops = &mov_ops, },
+ { .name = "movzwl", .ops = &mov_ops, },
+ { .name = "nop", .ops = &nop_ops, },
+ { .name = "nopl", .ops = &nop_ops, },
+ { .name = "nopw", .ops = &nop_ops, },
+ { .name = "or", .ops = &mov_ops, },
+ { .name = "orl", .ops = &mov_ops, },
+ { .name = "test", .ops = &mov_ops, },
+ { .name = "testb", .ops = &mov_ops, },
+ { .name = "testl", .ops = &mov_ops, },
+ { .name = "xadd", .ops = &mov_ops, },
+ { .name = "xbeginl", .ops = &jump_ops, },
+ { .name = "xbeginq", .ops = &jump_ops, },
+ { .name = "retq", .ops = &ret_ops, },
+};
diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c
new file mode 100644
index 000000000000..c1625f256df3
--- /dev/null
+++ b/tools/perf/arch/x86/annotate/instructions.c
@@ -0,0 +1,78 @@
+static struct ins x86__instructions[] = {
+ { .name = "add", .ops = &mov_ops, },
+ { .name = "addl", .ops = &mov_ops, },
+ { .name = "addq", .ops = &mov_ops, },
+ { .name = "addw", .ops = &mov_ops, },
+ { .name = "and", .ops = &mov_ops, },
+ { .name = "bts", .ops = &mov_ops, },
+ { .name = "call", .ops = &call_ops, },
+ { .name = "callq", .ops = &call_ops, },
+ { .name = "cmp", .ops = &mov_ops, },
+ { .name = "cmpb", .ops = &mov_ops, },
+ { .name = "cmpl", .ops = &mov_ops, },
+ { .name = "cmpq", .ops = &mov_ops, },
+ { .name = "cmpw", .ops = &mov_ops, },
+ { .name = "cmpxch", .ops = &mov_ops, },
+ { .name = "dec", .ops = &dec_ops, },
+ { .name = "decl", .ops = &dec_ops, },
+ { .name = "imul", .ops = &mov_ops, },
+ { .name = "inc", .ops = &dec_ops, },
+ { .name = "incl", .ops = &dec_ops, },
+ { .name = "ja", .ops = &jump_ops, },
+ { .name = "jae", .ops = &jump_ops, },
+ { .name = "jb", .ops = &jump_ops, },
+ { .name = "jbe", .ops = &jump_ops, },
+ { .name = "jc", .ops = &jump_ops, },
+ { .name = "jcxz", .ops = &jump_ops, },
+ { .name = "je", .ops = &jump_ops, },
+ { .name = "jecxz", .ops = &jump_ops, },
+ { .name = "jg", .ops = &jump_ops, },
+ { .name = "jge", .ops = &jump_ops, },
+ { .name = "jl", .ops = &jump_ops, },
+ { .name = "jle", .ops = &jump_ops, },
+ { .name = "jmp", .ops = &jump_ops, },
+ { .name = "jmpq", .ops = &jump_ops, },
+ { .name = "jna", .ops = &jump_ops, },
+ { .name = "jnae", .ops = &jump_ops, },
+ { .name = "jnb", .ops = &jump_ops, },
+ { .name = "jnbe", .ops = &jump_ops, },
+ { .name = "jnc", .ops = &jump_ops, },
+ { .name = "jne", .ops = &jump_ops, },
+ { .name = "jng", .ops = &jump_ops, },
+ { .name = "jnge", .ops = &jump_ops, },
+ { .name = "jnl", .ops = &jump_ops, },
+ { .name = "jnle", .ops = &jump_ops, },
+ { .name = "jno", .ops = &jump_ops, },
+ { .name = "jnp", .ops = &jump_ops, },
+ { .name = "jns", .ops = &jump_ops, },
+ { .name = "jnz", .ops = &jump_ops, },
+ { .name = "jo", .ops = &jump_ops, },
+ { .name = "jp", .ops = &jump_ops, },
+ { .name = "jpe", .ops = &jump_ops, },
+ { .name = "jpo", .ops = &jump_ops, },
+ { .name = "jrcxz", .ops = &jump_ops, },
+ { .name = "js", .ops = &jump_ops, },
+ { .name = "jz", .ops = &jump_ops, },
+ { .name = "lea", .ops = &mov_ops, },
+ { .name = "lock", .ops = &lock_ops, },
+ { .name = "mov", .ops = &mov_ops, },
+ { .name = "movb", .ops = &mov_ops, },
+ { .name = "movdqa", .ops = &mov_ops, },
+ { .name = "movl", .ops = &mov_ops, },
+ { .name = "movq", .ops = &mov_ops, },
+ { .name = "movslq", .ops = &mov_ops, },
+ { .name = "movzbl", .ops = &mov_ops, },
+ { .name = "movzwl", .ops = &mov_ops, },
+ { .name = "nop", .ops = &nop_ops, },
+ { .name = "nopl", .ops = &nop_ops, },
+ { .name = "nopw", .ops = &nop_ops, },
+ { .name = "or", .ops = &mov_ops, },
+ { .name = "orl", .ops = &mov_ops, },
+ { .name = "test", .ops = &mov_ops, },
+ { .name = "testb", .ops = &mov_ops, },
+ { .name = "testl", .ops = &mov_ops, },
+ { .name = "xadd", .ops = &mov_ops, },
+ { .name = "xbeginl", .ops = &jump_ops, },
+ { .name = "xbeginq", .ops = &jump_ops, },
+ { .name = "retq", .ops = &ret_ops, },
+};
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 72769762ece9..095d90a9077f 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -28,20 +28,36 @@ const char *disassembler_style;
const char *objdump_path;
static regex_t file_lineno;

-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(struct arch *arch, const char *name);
static int disasm_line__parse(char *line, char **namep, char **rawp);

struct arch {
const char *name;
+ struct ins *instructions;
+ size_t nr_instructions;
+ bool sorted_instructions;
struct {
char comment_char;
char skip_functions_char;
} objdump;
};

+static struct ins_ops call_ops;
+static struct ins_ops dec_ops;
+static struct ins_ops jump_ops;
+static struct ins_ops mov_ops;
+static struct ins_ops nop_ops;
+static struct ins_ops lock_ops;
+static struct ins_ops ret_ops;
+
+#include "arch/arm/annotate/instructions.c"
+#include "arch/x86/annotate/instructions.c"
+
static struct arch architectures[] = {
{
.name = "arm",
+ .instructions = arm__instructions,
+ .nr_instructions = ARRAY_SIZE(arm__instructions),
.objdump = {
.comment_char = ';',
.skip_functions_char = '+',
@@ -49,6 +65,8 @@ static struct arch architectures[] = {
},
{
.name = "x86",
+ .instructions = x86__instructions,
+ .nr_instructions = ARRAY_SIZE(x86__instructions),
.objdump = {
.comment_char = '#',
},
@@ -209,7 +227,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map *
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;

- ops->locked.ins = ins__find(name);
+ ops->locked.ins = ins__find(arch, name);
free(name);

if (ops->locked.ins == NULL)
@@ -385,99 +403,6 @@ bool ins__is_ret(const struct ins *ins)
return ins->ops == &ret_ops;
}

-static struct ins instructions[] = {
- { .name = "add", .ops = &mov_ops, },
- { .name = "addl", .ops = &mov_ops, },
- { .name = "addq", .ops = &mov_ops, },
- { .name = "addw", .ops = &mov_ops, },
- { .name = "and", .ops = &mov_ops, },
-#ifdef __arm__
- { .name = "b", .ops = &jump_ops, }, // might also be a call
- { .name = "bcc", .ops = &jump_ops, },
- { .name = "bcs", .ops = &jump_ops, },
- { .name = "beq", .ops = &jump_ops, },
- { .name = "bge", .ops = &jump_ops, },
- { .name = "bgt", .ops = &jump_ops, },
- { .name = "bhi", .ops = &jump_ops, },
- { .name = "bl", .ops = &call_ops, },
- { .name = "bls", .ops = &jump_ops, },
- { .name = "blt", .ops = &jump_ops, },
- { .name = "blx", .ops = &call_ops, },
- { .name = "bne", .ops = &jump_ops, },
-#endif
- { .name = "bts", .ops = &mov_ops, },
- { .name = "call", .ops = &call_ops, },
- { .name = "callq", .ops = &call_ops, },
- { .name = "cmp", .ops = &mov_ops, },
- { .name = "cmpb", .ops = &mov_ops, },
- { .name = "cmpl", .ops = &mov_ops, },
- { .name = "cmpq", .ops = &mov_ops, },
- { .name = "cmpw", .ops = &mov_ops, },
- { .name = "cmpxch", .ops = &mov_ops, },
- { .name = "dec", .ops = &dec_ops, },
- { .name = "decl", .ops = &dec_ops, },
- { .name = "imul", .ops = &mov_ops, },
- { .name = "inc", .ops = &dec_ops, },
- { .name = "incl", .ops = &dec_ops, },
- { .name = "ja", .ops = &jump_ops, },
- { .name = "jae", .ops = &jump_ops, },
- { .name = "jb", .ops = &jump_ops, },
- { .name = "jbe", .ops = &jump_ops, },
- { .name = "jc", .ops = &jump_ops, },
- { .name = "jcxz", .ops = &jump_ops, },
- { .name = "je", .ops = &jump_ops, },
- { .name = "jecxz", .ops = &jump_ops, },
- { .name = "jg", .ops = &jump_ops, },
- { .name = "jge", .ops = &jump_ops, },
- { .name = "jl", .ops = &jump_ops, },
- { .name = "jle", .ops = &jump_ops, },
- { .name = "jmp", .ops = &jump_ops, },
- { .name = "jmpq", .ops = &jump_ops, },
- { .name = "jna", .ops = &jump_ops, },
- { .name = "jnae", .ops = &jump_ops, },
- { .name = "jnb", .ops = &jump_ops, },
- { .name = "jnbe", .ops = &jump_ops, },
- { .name = "jnc", .ops = &jump_ops, },
- { .name = "jne", .ops = &jump_ops, },
- { .name = "jng", .ops = &jump_ops, },
- { .name = "jnge", .ops = &jump_ops, },
- { .name = "jnl", .ops = &jump_ops, },
- { .name = "jnle", .ops = &jump_ops, },
- { .name = "jno", .ops = &jump_ops, },
- { .name = "jnp", .ops = &jump_ops, },
- { .name = "jns", .ops = &jump_ops, },
- { .name = "jnz", .ops = &jump_ops, },
- { .name = "jo", .ops = &jump_ops, },
- { .name = "jp", .ops = &jump_ops, },
- { .name = "jpe", .ops = &jump_ops, },
- { .name = "jpo", .ops = &jump_ops, },
- { .name = "jrcxz", .ops = &jump_ops, },
- { .name = "js", .ops = &jump_ops, },
- { .name = "jz", .ops = &jump_ops, },
- { .name = "lea", .ops = &mov_ops, },
- { .name = "lock", .ops = &lock_ops, },
- { .name = "mov", .ops = &mov_ops, },
- { .name = "movb", .ops = &mov_ops, },
- { .name = "movdqa",.ops = &mov_ops, },
- { .name = "movl", .ops = &mov_ops, },
- { .name = "movq", .ops = &mov_ops, },
- { .name = "movslq", .ops = &mov_ops, },
- { .name = "movzbl", .ops = &mov_ops, },
- { .name = "movzwl", .ops = &mov_ops, },
- { .name = "nop", .ops = &nop_ops, },
- { .name = "nopl", .ops = &nop_ops, },
- { .name = "nopw", .ops = &nop_ops, },
- { .name = "or", .ops = &mov_ops, },
- { .name = "orl", .ops = &mov_ops, },
- { .name = "test", .ops = &mov_ops, },
- { .name = "testb", .ops = &mov_ops, },
- { .name = "testl", .ops = &mov_ops, },
- { .name = "xadd", .ops = &mov_ops, },
- { .name = "xbeginl", .ops = &jump_ops, },
- { .name = "xbeginq", .ops = &jump_ops, },
- { .name = "retq", .ops = &ret_ops, },
-};
-
static int ins__key_cmp(const void *name, const void *insp)
{
const struct ins *ins = insp;
@@ -493,24 +418,23 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
}

-static void ins__sort(void)
+static void ins__sort(struct arch *arch)
{
- const int nmemb = ARRAY_SIZE(instructions);
+ const int nmemb = arch->nr_instructions;

- qsort(instructions, nmemb, sizeof(struct ins), ins__cmp);
+ qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
}

-static struct ins *ins__find(const char *name)
+static struct ins *ins__find(struct arch *arch, const char *name)
{
- const int nmemb = ARRAY_SIZE(instructions);
- static bool sorted;
+ const int nmemb = arch->nr_instructions;

- if (!sorted) {
- ins__sort();
- sorted = true;
+ if (!arch->sorted_instructions) {
+ ins__sort(arch);
+ arch->sorted_instructions = true;
}

- return bsearch(name, instructions, nmemb, sizeof(struct ins), ins__key_cmp);
+ return bsearch(name, arch->instructions, nmemb, sizeof(struct ins), ins__key_cmp);
}

static int arch__key_cmp(const void *name, const void *archp)
@@ -767,7 +691,7 @@ int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 ip)

static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map *map)
{
- dl->ins = ins__find(dl->name);
+ dl->ins = ins__find(arch, dl->name);

if (dl->ins == NULL)
return;
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:07 AM11/23/16
to
From: Jiri Olsa <jo...@kernel.org>

Adding support for cascading options added by Namhyung in:

commit 369a2478973a ("tools lib subcmd: Support cascading options")

This way the report and record commands share options with the c2c
command and avoid duplicating some options. For now it's only the 'v'
option.
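
To illustrate the idea outside of libsubcmd, here is a minimal sketch of an
option table that falls back to a parent table. Everything prefixed with
demo_/DEMO_ is hypothetical and only models the behaviour; it is not the real
'struct option', OPT_PARENT() or parse_options() implementation.

```c
#include <stddef.h>

/* Hypothetical, simplified option record; not the libsubcmd 'struct option'. */
enum demo_opt_type { DEMO_OPT_END, DEMO_OPT_INCR, DEMO_OPT_PARENT };

struct demo_option {
	enum demo_opt_type type;
	char short_name;
	int *value;                       /* counter bumped by DEMO_OPT_INCR */
	const struct demo_option *parent; /* table to fall back to */
};

/* Look up a short option, falling back to the parent table if there is one. */
static const struct demo_option *
demo_find_option(const struct demo_option *opts, char short_name)
{
	const struct demo_option *parent = NULL;

	for (; opts->type != DEMO_OPT_END; opts++) {
		if (opts->type == DEMO_OPT_PARENT) {
			parent = opts->parent;
			continue;
		}
		if (opts->short_name == short_name)
			return opts;
	}
	return parent ? demo_find_option(parent, short_name) : NULL;
}

/* Apply one short option; returns 0 on success, -1 if it is unknown. */
static int demo_apply_option(const struct demo_option *opts, char short_name)
{
	const struct demo_option *opt = demo_find_option(opts, short_name);

	if (opt == NULL)
		return -1;
	if (opt->type == DEMO_OPT_INCR)
		(*opt->value)++;
	return 0;
}

static int demo_verbose;
static int demo_node_info;

/* Shared options, declared once at file scope. */
static const struct demo_option demo_c2c_options[] = {
	{ DEMO_OPT_INCR, 'v', &demo_verbose, NULL },
	{ DEMO_OPT_END, 0, NULL, NULL },
};

/* Subcommand table: its own option plus a link to the shared ones. */
static const struct demo_option demo_report_options[] = {
	{ DEMO_OPT_INCR, 'N', &demo_node_info, NULL },
	{ DEMO_OPT_PARENT, 0, NULL, demo_c2c_options },
	{ DEMO_OPT_END, 0, NULL, NULL },
};
```

With this layout, demo_apply_option(demo_report_options, 'v') bumps
demo_verbose even though 'v' is only declared in the shared table, while the
shared table on its own knows nothing about 'N'.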

Signed-off-by: Jiri Olsa <jo...@redhat.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-7-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-c2c.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 54924717ae8e..4b419631753d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -101,6 +101,11 @@ static const char *display_str[DISPLAY_MAX] = {
[DISPLAY_TOT] = "Total",
};

+static const struct option c2c_options[] = {
+ OPT_INCR('v', "verbose", &verbose, "be more verbose (show counter open errors, etc)"),
+ OPT_END()
+};
+
static struct perf_c2c c2c;

static void *c2c_he_zalloc(size_t size)
@@ -2520,11 +2525,9 @@ static int perf_c2c__report(int argc, const char **argv)
const char *display = NULL;
const char *coalesce = NULL;
bool no_source = false;
- const struct option c2c_options[] = {
+ const struct option options[] = {
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show counter open errors, etc)"),
OPT_STRING('i', "input", &input_name, "file",
"the input file to process"),
OPT_INCR('N', "node-info", &c2c.node_info,
@@ -2548,14 +2551,15 @@ static int perf_c2c__report(int argc, const char **argv)
OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
"coalesce fields: pid,tid,iaddr,dso"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
+ OPT_PARENT(c2c_options),
OPT_END()
};
int err = 0;

- argc = parse_options(argc, argv, c2c_options, report_c2c_usage,
+ argc = parse_options(argc, argv, options, report_c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
if (argc)
- usage_with_options(report_c2c_usage, c2c_options);
+ usage_with_options(report_c2c_usage, options);

if (c2c.stats_only)
c2c.use_stdio = true;
@@ -2683,11 +2687,10 @@ static int perf_c2c__record(int argc, const char **argv)
OPT_CALLBACK('e', "event", &event_set, "event",
"event selector. Use 'perf mem record -e list' to list available events",
parse_record_events),
- OPT_INCR('v', "verbose", &verbose,
- "be more verbose (show counter open errors, etc)"),
OPT_BOOLEAN('u', "all-user", &all_user, "collect only user level data"),
OPT_BOOLEAN('k', "all-kernel", &all_kernel, "collect only kernel level data"),
OPT_UINTEGER('l', "ldlat", &perf_mem_events__loads_ldlat, "setup mem-loads latency"),
+ OPT_PARENT(c2c_options),
OPT_END()
};

@@ -2759,11 +2762,6 @@ static int perf_c2c__record(int argc, const char **argv)

int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
{
- const struct option c2c_options[] = {
- OPT_INCR('v', "verbose", &verbose, "be more verbose"),
- OPT_END()
- };
-
argc = parse_options(argc, argv, c2c_options, c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:10 AM11/23/16
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

This is to cope with an ARM specific kludge introduced in the original
patch supporting ARM annotation, cfef25b8daf7 ("perf annotate: ARM
support"), that caused functions with a '+' in their names to be skipped
when processing call instructions.

With this patchkit it should be possible to collect a perf.data file on
an ARM machine and then annotate it on an x86 workstation and have those
ARM kludges used.
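
As a rough sketch of why the check moves from build time to run time: with the
character stored per arch, the same binary can apply the ARM rule to ARM data
and skip it for x86 data. The demo_ names below are hypothetical stand-ins,
not the identifiers used in annotate.c.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical cut-down version of the per-arch objdump settings. */
struct demo_arch {
	const char *name;
	struct {
		char skip_functions_char; /* 0 means "skip nothing" */
	} objdump;
};

/*
 * True if a call target should be ignored for this architecture.
 * The zero check matters: strchr(name, 0) would match the terminator
 * and every name would be skipped.
 */
static bool demo_skip_call_target(const struct demo_arch *arch, const char *name)
{
	return arch->objdump.skip_functions_char &&
	       strchr(name, arch->objdump.skip_functions_char) != NULL;
}

static const struct demo_arch demo_arm = {
	.name = "arm",
	.objdump = { .skip_functions_char = '+' },
};

/* x86 leaves the field zero-initialized, so nothing is skipped. */
static const struct demo_arch demo_x86 = {
	.name = "x86",
};
```

The decision now depends only on which table entry was selected for the
perf.data file being annotated, not on the machine perf was built for.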

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Russell King <rmk+k...@arm.linux.org.uk>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-2fi3sy7q3s...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1ba41a27214d..72769762ece9 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -35,6 +35,7 @@ struct arch {
const char *name;
struct {
char comment_char;
+ char skip_functions_char;
} objdump;
};

@@ -43,6 +44,7 @@ static struct arch architectures[] = {
.name = "arm",
.objdump = {
.comment_char = ';',
+ .skip_functions_char = '+',
},
},
{
@@ -78,7 +80,7 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
}

-static int call__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map)
+static int call__parse(struct arch *arch, struct ins_operands *ops, struct map *map)
{
char *endptr, *tok, *name;

@@ -90,10 +92,9 @@ static int call__parse(struct arch *arch __maybe_unused, struct ins_operands *op

name++;

-#ifdef __arm__
- if (strchr(name, '+'))
+ if (arch->objdump.skip_functions_char &&
+ strchr(name, arch->objdump.skip_functions_char))
return -1;
-#endif

tok = strchr(name, '>');
if (tok == NULL)
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:10 AM11/23/16
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Introduce a 'struct arch', where arch specific stuff will live, starting
with objdump's choice of comment delimiter character, which is '#' on
x86 and ';' on arm.

This has some bits and pieces from a patch submitted by Ravi.
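
A minimal standalone sketch of the lookup and of what the comment character is
used for follows. The sketch_ names are hypothetical; the real code lives in
tools/perf/util/annotate.c and differs in detail.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduced form of the per-arch description. */
struct sketch_arch {
	const char *name;
	struct {
		char comment_char;
	} objdump;
};

/* Deliberately unsorted: the table is sorted on first lookup. */
static struct sketch_arch sketch_architectures[] = {
	{ .name = "x86", .objdump = { .comment_char = '#' } },
	{ .name = "arm", .objdump = { .comment_char = ';' } },
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static int sketch_arch__cmp(const void *a, const void *b)
{
	const struct sketch_arch *aa = a, *ab = b;

	return strcmp(aa->name, ab->name);
}

static int sketch_arch__key_cmp(const void *name, const void *archp)
{
	const struct sketch_arch *arch = archp;

	return strcmp(name, arch->name);
}

/* Sort once, then binary search by normalized arch name. */
static struct sketch_arch *sketch_arch__find(const char *name)
{
	const size_t nmemb = SKETCH_ARRAY_SIZE(sketch_architectures);
	static bool sorted;

	if (!sorted) {
		qsort(sketch_architectures, nmemb,
		      sizeof(struct sketch_arch), sketch_arch__cmp);
		sorted = true;
	}

	return bsearch(name, sketch_architectures, nmemb,
		       sizeof(struct sketch_arch), sketch_arch__key_cmp);
}

/* Start of the trailing objdump comment in an operand string, or NULL. */
static const char *sketch_find_comment(const struct sketch_arch *arch,
				       const char *operands)
{
	return strchr(operands, arch->objdump.comment_char);
}
```

Note that '#' introduces immediates in ARM operands, so looking for the x86
comment character in ARM disassembly would cut operands in the wrong place.
That is one reason the delimiter has to follow the data's arch rather than the
host's.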

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-f337tzjjcl...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 114 ++++++++++++++++++++++++++++++++------
tools/perf/util/annotate.h | 6 +-
5 files changed, 103 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fe3af9535e85..3df4178ba378 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -130,7 +130,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}

- err = symbol__disassemble(sym, map, 0);
+ err = symbol__disassemble(sym, map, NULL, 0);
if (err == 0) {
out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 4c18271c71c9..e6e9f7d80dbd 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
(nr_pcnt - 1);
}

- err = symbol__disassemble(sym, map, sizeof_bdl);
+ err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), sizeof_bdl);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 42d319927762..8c9308ac30b7 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -167,7 +167,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;

- err = symbol__disassemble(sym, map, 0);
+ err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), 0);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index aeb5a441bd74..1ba41a27214d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -18,9 +18,11 @@
#include "annotate.h"
#include "evsel.h"
#include "block-range.h"
+#include "arch/common.h"
#include <regex.h>
#include <pthread.h>
#include <linux/bitops.h>
+#include <sys/utsname.h>

const char *disassembler_style;
const char *objdump_path;
@@ -29,6 +31,28 @@ static regex_t file_lineno;
static struct ins *ins__find(const char *name);
static int disasm_line__parse(char *line, char **namep, char **rawp);

+struct arch {
+ const char *name;
+ struct {
+ char comment_char;
+ } objdump;
+};
+
+static struct arch architectures[] = {
+ {
+ .name = "arm",
+ .objdump = {
+ .comment_char = ';',
+ },
+ },
+ {
+ .name = "x86",
+ .objdump = {
+ .comment_char = '#',
+ },
+ },
+};
+
static void ins__delete(struct ins_operands *ops)
{
if (ops == NULL)
@@ -54,7 +78,7 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
}

-static int call__parse(struct ins_operands *ops, struct map *map)
+static int call__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map)
{
char *endptr, *tok, *name;

@@ -118,7 +142,7 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == &call_ops;
}

-static int jump__parse(struct ins_operands *ops, struct map *map __maybe_unused)
+static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map __maybe_unused)
{
const char *s = strchr(ops->raw, '+');

@@ -173,7 +197,7 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
return 0;
}

-static int lock__parse(struct ins_operands *ops, struct map *map)
+static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map *map)
{
char *name;

@@ -194,7 +218,7 @@ static int lock__parse(struct ins_operands *ops, struct map *map)
return 0;

if (ops->locked.ins->ops->parse &&
- ops->locked.ins->ops->parse(ops->locked.ops, map) < 0)
+ ops->locked.ins->ops->parse(arch, ops->locked.ops, map) < 0)
goto out_free_ops;

return 0;
@@ -237,7 +261,7 @@ static struct ins_ops lock_ops = {
.scnprintf = lock__scnprintf,
};

-static int mov__parse(struct ins_operands *ops, struct map *map __maybe_unused)
+static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map *map __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *comment, prev;

@@ -252,11 +276,7 @@ static int mov__parse(struct ins_operands *ops, struct map *map __maybe_unused)
return -1;

target = ++s;
-#ifdef __arm__
- comment = strchr(s, ';');
-#else
- comment = strchr(s, '#');
-#endif
+ comment = strchr(s, arch->objdump.comment_char);

if (comment != NULL)
s = comment - 1;
@@ -304,7 +324,7 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};

-static int dec__parse(struct ins_operands *ops, struct map *map __maybe_unused)
+static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map __maybe_unused)
{
char *target, *comment, *s, prev;

@@ -492,6 +512,41 @@ static struct ins *ins__find(const char *name)
return bsearch(name, instructions, nmemb, sizeof(struct ins), ins__key_cmp);
}

+static int arch__key_cmp(const void *name, const void *archp)
+{
+ const struct arch *arch = archp;
+
+ return strcmp(name, arch->name);
+}
+
+static int arch__cmp(const void *a, const void *b)
+{
+ const struct arch *aa = a;
+ const struct arch *ab = b;
+
+ return strcmp(aa->name, ab->name);
+}
+
+static void arch__sort(void)
+{
+ const int nmemb = ARRAY_SIZE(architectures);
+
+ qsort(architectures, nmemb, sizeof(struct arch), arch__cmp);
+}
+
+static struct arch *arch__find(const char *name)
+{
+ const int nmemb = ARRAY_SIZE(architectures);
+ static bool sorted;
+
+ if (!sorted) {
+ arch__sort();
+ sorted = true;
+ }
+
+ return bsearch(name, architectures, nmemb, sizeof(struct arch), arch__key_cmp);
+}
+
int symbol__alloc_hist(struct symbol *sym)
{
struct annotation *notes = symbol__annotation(sym);
@@ -709,7 +764,7 @@ int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 ip)
return symbol__inc_addr_samples(he->ms.sym, he->ms.map, evidx, ip);
}

-static void disasm_line__init_ins(struct disasm_line *dl, struct map *map)
+static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map *map)
{
dl->ins = ins__find(dl->name);

@@ -719,7 +774,7 @@ static void disasm_line__init_ins(struct disasm_line *dl, struct map *map)
if (!dl->ins->ops)
return;

- if (dl->ins->ops->parse && dl->ins->ops->parse(&dl->ops, map) < 0)
+ if (dl->ins->ops->parse && dl->ins->ops->parse(arch, &dl->ops, map) < 0)
dl->ins = NULL;
}

@@ -762,6 +817,7 @@ static int disasm_line__parse(char *line, char **namep, char **rawp)

static struct disasm_line *disasm_line__new(s64 offset, char *line,
size_t privsize, int line_nr,
+ struct arch *arch,
struct map *map)
{
struct disasm_line *dl = zalloc(sizeof(*dl) + privsize);
@@ -777,7 +833,7 @@ static struct disasm_line *disasm_line__new(s64 offset, char *line,
if (disasm_line__parse(dl->line, &dl->name, &dl->ops.raw) < 0)
goto out_free_line;

- disasm_line__init_ins(dl, map);
+ disasm_line__init_ins(dl, arch, map);
}
}

@@ -1087,6 +1143,7 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
* The ops.raw part will be parsed further according to type of the instruction.
*/
static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
+ struct arch *arch,
FILE *file, size_t privsize,
int *line_nr)
{
@@ -1149,7 +1206,7 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
parsed_line = tmp2 + 1;
}

- dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, map);
+ dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, arch, map);
free(line);
(*line_nr)++;

@@ -1280,10 +1337,23 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
return 0;
}

-int symbol__disassemble(struct symbol *sym, struct map *map, size_t privsize)
+static const char *annotate__norm_arch(const char *arch_name)
+{
+ struct utsname uts;
+
+ if (!arch_name) { /* Assume we are annotating locally. */
+ if (uname(&uts) < 0)
+ return NULL;
+ arch_name = uts.machine;
+ }
+ return normalize_arch((char *)arch_name);
+}
+
+int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_name, size_t privsize)
{
struct dso *dso = map->dso;
char command[PATH_MAX * 2];
+ struct arch *arch = NULL;
FILE *file;
char symfs_filename[PATH_MAX];
struct kcore_extract kce;
@@ -1297,6 +1367,14 @@ int symbol__disassemble(struct symbol *sym, struct map *map, size_t privsize)
if (err)
return err;

+ arch_name = annotate__norm_arch(arch_name);
+ if (!arch_name)
+ return -1;
+
+ arch = arch__find(arch_name);
+ if (arch == NULL)
+ return -ENOTSUP;
+
pr_debug("%s: filename=%s, sym=%s, start=%#" PRIx64 ", end=%#" PRIx64 "\n", __func__,
symfs_filename, sym->name, map->unmap_ip(map, sym->start),
map->unmap_ip(map, sym->end));
@@ -1395,7 +1473,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, size_t privsize)

nline = 0;
while (!feof(file)) {
- if (symbol__parse_objdump_line(sym, map, file, privsize,
+ if (symbol__parse_objdump_line(sym, map, arch, file, privsize,
&lineno) < 0)
break;
nline++;
@@ -1793,7 +1871,7 @@ int symbol__tty_annotate(struct symbol *sym, struct map *map,
struct rb_root source_line = RB_ROOT;
u64 len;

- if (symbol__disassemble(sym, map, 0) < 0)
+ if (symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), 0) < 0)
return -1;

len = symbol__size(sym);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 5bbcec173b82..8e490b5c91bc 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -34,9 +34,11 @@ struct ins_operands {
};
};

+struct arch;
+
struct ins_ops {
void (*free)(struct ins_operands *ops);
- int (*parse)(struct ins_operands *ops, struct map *map);
+ int (*parse)(struct arch *arch, struct ins_operands *ops, struct map *map);
int (*scnprintf)(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops);
};
@@ -156,7 +158,7 @@ int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 addr);
int symbol__alloc_hist(struct symbol *sym);
void symbol__annotate_zero_histograms(struct symbol *sym);

-int symbol__disassemble(struct symbol *sym, struct map *map, size_t privsize);
+int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_name, size_t privsize);

enum symbol_disassemble_errno {
SYMBOL_ANNOTATE_ERRNO__SUCCESS = 0,
--
2.7.4

Arnaldo Carvalho de Melo

Nov 23, 2016, 11:50:10 AM11/23/16
to
From: Namhyung Kim <namh...@kernel.org>

The EVSEL__PRINT_CALLCHAIN_ARROW option can be used to print callchains
with arrows for readability. It will be used by the 'sched timehist'
command, as shown below:

__schedule <- schedule <- schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
__schedule <- schedule <- schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
__schedule <- schedule <- worker_thread <- kthread <- ret_from_fork
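
The "separator before every entry but the first" logic can be sketched on its
own as below. This is only an illustration that formats into a buffer;
demo_format_callchain() and DEMO_PRINT_CALLCHAIN_ARROW are hypothetical names,
and the real code prints through sample__fprintf_callchain().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag, reusing the bit position of the new option. */
#define DEMO_PRINT_CALLCHAIN_ARROW (1 << 7)

/*
 * Format 'nr' frame names on one line into 'buf'.
 * Returns the number of characters stored, not counting the terminator.
 */
static int demo_format_callchain(char *buf, size_t size,
				 const char *const *frames, size_t nr,
				 unsigned int print_opts)
{
	bool print_arrow = (print_opts & DEMO_PRINT_CALLCHAIN_ARROW) != 0;
	bool first = true;
	size_t used = 0;
	size_t i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < nr && used < size - 1; i++) {
		/* No separator before the first entry. */
		const char *sep = first ? "" : (print_arrow ? " <- " : " ");
		int n = snprintf(buf + used, size - used, "%s%s", sep, frames[i]);

		if (n < 0)
			break;
		/* On truncation snprintf stored size - used - 1 characters. */
		used += (size_t)n < size - used ? (size_t)n : size - used - 1;
		first = false;
	}

	return (int)used;
}
```

Tracking 'first' instead of testing the loop index mirrors the patch, where
the cursor is advanced inside the loop and no index is available.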

Suggested-and-Acked-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/evsel.h | 1 +
tools/perf/util/evsel_fprintf.c | 6 ++++++
2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8cd7cd227483..27fa3a343577 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -391,6 +391,7 @@ int perf_evsel__fprintf(struct perf_evsel *evsel,
#define EVSEL__PRINT_ONELINE (1<<4)
#define EVSEL__PRINT_SRCLINE (1<<5)
#define EVSEL__PRINT_UNKNOWN_AS_ADDR (1<<6)
+#define EVSEL__PRINT_CALLCHAIN_ARROW (1<<7)

struct callchain_cursor;

diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index ccb602397b60..53bb614feafb 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -108,7 +108,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
int print_oneline = print_opts & EVSEL__PRINT_ONELINE;
int print_srcline = print_opts & EVSEL__PRINT_SRCLINE;
int print_unknown_as_addr = print_opts & EVSEL__PRINT_UNKNOWN_AS_ADDR;
+ int print_arrow = print_opts & EVSEL__PRINT_CALLCHAIN_ARROW;
char s = print_oneline ? ' ' : '\t';
+ bool first = true;

if (sample->callchain) {
struct addr_location node_al;
@@ -124,6 +126,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,

printed += fprintf(fp, "%-*.*s", left_alignment, left_alignment, " ");

+ if (print_arrow && !first)
+ printed += fprintf(fp, " <-");
+
if (print_ip)
printed += fprintf(fp, "%c%16" PRIx64, s, node->ip);

@@ -158,6 +163,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
printed += fprintf(fp, "\n");

callchain_cursor_advance(cursor);
+ first = false;
}
}

--
2.7.4

Ingo Molnar

Nov 23, 2016, 11:20:05 PM11/23/16
to

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Arnaldo Carvalho de Melo

Nov 23, 2016, 11:20:06 PM11/23/16
to
Commit-ID: 763d8960a17126e73e7d9cd6b66e390196f48894
Gitweb: http://git.kernel.org/tip/763d8960a17126e73e7d9cd6b66e390196f48894
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Thu, 17 Nov 2016 12:31:51 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 17 Nov 2016 17:31:59 -0300

perf annotate: Add per arch instructions annotate handlers

Another step in supporting cross annotation.

The arch specific tables are put in:

tools/perf/arch/$ARCH/annotate/instructions.c

which, so far, just plug instructions to a bunch of parsers/formatters,
but may have more as the need arises.
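
As a standalone sketch of the per-arch table idea: each arch carries its own
instruction array, which is sorted lazily on first lookup and then searched
with bsearch(). The tbl_ names and the 'kind' enum are hypothetical
simplifications; the real tables point at 'struct ins_ops' instances.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser/formatter an instruction maps to. */
enum tbl_kind { TBL_CALL, TBL_JUMP, TBL_MOV };

struct tbl_ins {
	const char *name;
	enum tbl_kind kind;
};

struct tbl_arch {
	const char *name;
	struct tbl_ins *instructions;
	size_t nr_instructions;
	bool sorted_instructions; /* per-arch, replaces a single static flag */
};

#define TBL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Tables need not be kept sorted by hand. */
static struct tbl_ins tbl_arm_instructions[] = {
	{ "bne", TBL_JUMP }, { "bl", TBL_CALL }, { "b", TBL_JUMP }, { "mov", TBL_MOV },
};

static struct tbl_ins tbl_x86_instructions[] = {
	{ "mov", TBL_MOV }, { "jmp", TBL_JUMP }, { "call", TBL_CALL },
};

static struct tbl_arch tbl_arm = {
	.name = "arm",
	.instructions = tbl_arm_instructions,
	.nr_instructions = TBL_ARRAY_SIZE(tbl_arm_instructions),
};

static struct tbl_arch tbl_x86 = {
	.name = "x86",
	.instructions = tbl_x86_instructions,
	.nr_instructions = TBL_ARRAY_SIZE(tbl_x86_instructions),
};

static int tbl_ins__cmp(const void *a, const void *b)
{
	const struct tbl_ins *ia = a, *ib = b;

	return strcmp(ia->name, ib->name);
}

static int tbl_ins__key_cmp(const void *name, const void *insp)
{
	const struct tbl_ins *ins = insp;

	return strcmp(name, ins->name);
}

/* Sort the arch's table on first use, then binary search it. */
static struct tbl_ins *tbl_ins__find(struct tbl_arch *arch, const char *name)
{
	if (!arch->sorted_instructions) {
		qsort(arch->instructions, arch->nr_instructions,
		      sizeof(struct tbl_ins), tbl_ins__cmp);
		arch->sorted_instructions = true;
	}

	return bsearch(name, arch->instructions, arch->nr_instructions,
		       sizeof(struct tbl_ins), tbl_ins__key_cmp);
}
```

Keeping the 'sorted' flag in the arch structure is what lets several tables
coexist in one binary: a function-local static flag would be set by the first
arch looked up and leave the other tables unsorted.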

This is an alternative implementation to a previous attempt made by Ravi
Bangoria.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Russell King <rmk+k...@arm.linux.org.uk>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-g3wt282lfa...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/arm/annotate/instructions.c | 90 ++++++++++++++++++
tools/perf/arch/x86/annotate/instructions.c | 78 ++++++++++++++++
tools/perf/util/annotate.c | 136 ++++++----------------------
3 files changed, 198 insertions(+), 106 deletions(-)

diff --git a/tools/perf/arch/arm/annotate/instructions.c b/tools/perf/arch/arm/annotate/instructions.c
new file mode 100644
index 0000000..d67b8aa
index 0000000..c1625f2
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7276976..095d90a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -28,20 +28,36 @@ const char *disassembler_style;
const char *objdump_path;
static regex_t file_lineno;

-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(struct arch *arch, const char *name);
static int disasm_line__parse(char *line, char **namep, char **rawp);

struct arch {
const char *name;
+ struct ins *instructions;
+ size_t nr_instructions;
+ bool sorted_instructions;
struct {
char comment_char;
char skip_functions_char;
} objdump;
};

+static struct ins_ops call_ops;
+static struct ins_ops dec_ops;
+static struct ins_ops jump_ops;
+static struct ins_ops mov_ops;
+static struct ins_ops nop_ops;
+static struct ins_ops lock_ops;
+static struct ins_ops ret_ops;
+
+#include "arch/arm/annotate/instructions.c"
+#include "arch/x86/annotate/instructions.c"
+
static struct arch architectures[] = {
{
.name = "arm",
+ .instructions = arm__instructions,
+ .nr_instructions = ARRAY_SIZE(arm__instructions),
.objdump = {
.comment_char = ';',
.skip_functions_char = '+',
@@ -49,6 +65,8 @@ static struct arch architectures[] = {
},
{
.name = "x86",
- return bsearch(name, instructions, nmemb, sizeof(struct ins), ins__key_cmp);
+ return bsearch(name, arch->instructions, nmemb, sizeof(struct ins), ins__key_cmp);
}

static int arch__key_cmp(const void *name, const void *archp)
@@ -767,7 +691,7 @@ int hist_entry__inc_addr_samples(struct hist_entry *he, int evidx, u64 ip)

static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map *map)
{

tip-bot for Arnaldo Carvalho de Melo

Nov 23, 2016, 11:20:07 PM11/23/16
to
Commit-ID: 786c1b51844d858041166057c0c79e93c2015013
Gitweb: http://git.kernel.org/tip/786c1b51844d858041166057c0c79e93c2015013
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 16 Nov 2016 15:39:50 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 17 Nov 2016 17:12:50 -0300

perf annotate: Start supporting cross arch annotation

Introduce a 'struct arch', where arch specific stuff will live, starting
with objdump's choice of comment delimiter character, which is '#' on
x86 and ';' on arm.

This has some bits and pieces from a patch submitted by Ravi.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-f337tzjjcl...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 114 ++++++++++++++++++++++++++++++++------
tools/perf/util/annotate.h | 6 +-
5 files changed, 103 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fe3af95..3df4178 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -130,7 +130,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}

- err = symbol__disassemble(sym, map, 0);
+ err = symbol__disassemble(sym, map, NULL, 0);
if (err == 0) {
out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 4c18271..e6e9f7d 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
(nr_pcnt - 1);
}

- err = symbol__disassemble(sym, map, sizeof_bdl);
+ err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), sizeof_bdl);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 42d3199..8c9308a 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -167,7 +167,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;

- err = symbol__disassemble(sym, map, 0);
+ err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel), 0);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index aeb5a44..1ba41a2 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -18,9 +18,11 @@
#include "annotate.h"
#include "evsel.h"
#include "block-range.h"
+#include "arch/common.h"
#include <regex.h>
#include <pthread.h>
#include <linux/bitops.h>
+#include <sys/utsname.h>

const char *disassembler_style;
const char *objdump_path;
@@ -29,6 +31,28 @@ static regex_t file_lineno;
static struct ins *ins__find(const char *name);
static int disasm_line__parse(char *line, char **namep, char **rawp);

@@ -492,6 +512,41 @@ static struct ins *ins__find(const char *name)
return bsearch(name, instructions, nmemb, sizeof(struct ins), ins__key_cmp);
}

-static void disasm_line__init_ins(struct disasm_line *dl, struct map *map)
+static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map *map)
{
dl->ins = ins__find(dl->name);

@@ -719,7 +774,7 @@ static void disasm_line__init_ins(struct disasm_line *dl, struct map *map)
if (!dl->ins->ops)
return;

- if (dl->ins->ops->parse && dl->ins->ops->parse(&dl->ops, map) < 0)
+ if (dl->ins->ops->parse && dl->ins->ops->parse(arch, &dl->ops, map) < 0)
dl->ins = NULL;
}

@@ -762,6 +817,7 @@ out_free_name:

static struct disasm_line *disasm_line__new(s64 offset, char *line,
size_t privsize, int line_nr,
+ struct arch *arch,
struct map *map)
{
struct disasm_line *dl = zalloc(sizeof(*dl) + privsize);
@@ -777,7 +833,7 @@ static struct disasm_line *disasm_line__new(s64 offset, char *line,
if (disasm_line__parse(dl->line, &dl->name, &dl->ops.raw) < 0)
goto out_free_line;

- disasm_line__init_ins(dl, map);
+ disasm_line__init_ins(dl, arch, map);
}
}

@@ -1087,6 +1143,7 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
* The ops.raw part will be parsed further according to type of the instruction.
*/
static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
+ struct arch *arch,
FILE *file, size_t privsize,
int *line_nr)
{
@@ -1149,7 +1206,7 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
parsed_line = tmp2 + 1;
}

- dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, map);
+ dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, arch, map);
free(line);
(*line_nr)++;

@@ -1280,10 +1337,23 @@ fallback:
index 5bbcec1..8e490b5 100644

tip-bot for Arnaldo Carvalho de Melo

Nov 23, 2016, 11:20:07 PM
Commit-ID: 9c2fb451bda0aa60127e63e44993401818326e91
Gitweb: http://git.kernel.org/tip/9c2fb451bda0aa60127e63e44993401818326e91
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 16 Nov 2016 15:50:38 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 17 Nov 2016 17:12:56 -0300

perf annotate: Allow arches to specify functions to skip

This is to cope with an ARM specific kludge introduced in the original
patch supporting ARM annotation, cfef25b8daf7 ("perf annotate: ARM
support"), which made functions with a '+' in their names be skipped when
processing call instructions.

With this patchkit it should be possible to collect a perf.data file on
an ARM machine and then annotate it on an x86 workstation and have those
ARM kludges used.
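
To illustrate the idea, here is a minimal sketch of a per-arch table carrying the character used to skip functions. It is not the code from this patch: the sketch_ prefixed names, the trimmed-down struct and the lookup helper are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for perf's struct arch. */
struct sketch_arch {
	const char *name;
	struct {
		char comment_char;
		char skip_functions_char; /* 0 means "skip nothing" */
	} objdump;
};

static const struct sketch_arch sketch_architectures[] = {
	{ .name = "arm", .objdump = { .comment_char = ';', .skip_functions_char = '+' } },
	{ .name = "x86", .objdump = { .comment_char = '#' } },
};

/* Look the arch up by name: the arch the data was recorded on, not the host. */
static const struct sketch_arch *sketch_arch__find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_architectures) / sizeof(sketch_architectures[0]); i++) {
		if (strcmp(sketch_architectures[i].name, name) == 0)
			return &sketch_architectures[i];
	}
	return NULL;
}

/* True when the call target name contains the arch's skip character. */
static bool sketch_arch__skip_function(const struct sketch_arch *arch, const char *func)
{
	/* The non-zero check matters: strchr(func, 0) would match the terminator. */
	return arch != NULL && arch->objdump.skip_functions_char != 0 &&
	       strchr(func, arch->objdump.skip_functions_char) != NULL;
}
```

Because the decision is driven by the table entry rather than by an #ifdef on the host architecture, the same binary can apply the ARM rule while running on x86.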

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Russell King <rmk+k...@arm.linux.org.uk>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-2fi3sy7q3s...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1ba41a2..7276976 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -35,6 +35,7 @@ struct arch {
const char *name;
struct {
char comment_char;
+ char skip_functions_char;
} objdump;
};

@@ -43,6 +44,7 @@ static struct arch architectures[] = {
.name = "arm",
.objdump = {
.comment_char = ';',
+ .skip_functions_char = '+',
},
},
{
@@ -78,7 +80,7 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
}

Arnaldo Carvalho de Melo

Nov 25, 2016, 10:20:08 AM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message.

The following changes since commit 47414424c53a70eceb0fc6e0a35a31a2b763d5b2:

Merge tag 'perf-core-for-mingo-20161123' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-11-24 05:09:31 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161125

for you to fetch changes up to 4708bbda5cb2f6cdc331744597527143f46394d5:

tools lib bpf: Fix maps resolution (2016-11-25 11:27:33 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Improve ARM support in the annotation code, affecting 'perf annotate', 'perf
report' and live annotation in 'perf top' (Kim Phillips)

- Initial support for PowerPC in the annotation code (Ravi Bangoria)

- Skip repetitive scheduler function on the top of the stack in
'perf sched timehist' (Namhyung Kim)

Fixes:

- Fix maps resolution in libbpf (Eric Leblond)

- Get the kernel signature via /proc/version_signature, available on
Ubuntu systems, to make sure bpf proggies work, as the one provided
via 'uname -r' doesn't (Wang Nan)

- Fix segfault in 'perf record' when running with suid and kptr_restrict
is 1 (Wang Nan)

Infrastructure:

- Support per-arch instruction tables, kept via a static or dynamic table
(Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (4):
perf annotate: Remove duplicate 'name' field from disasm_line
perf annotate: Introduce alternative method of keeping instructions table
perf annotate: Allow arches to have a init routine and a priv area
perf annotate: Improve support for ARM

Eric Leblond (1):
tools lib bpf: Fix maps resolution

Namhyung Kim (3):
perf callchain: Add option to skip ignore symbol when printing callchains
perf sched timehist: Mark schedule function in callchains
perf sched timehist: Enlarge max stack depth by 2

Ravi Bangoria (1):
perf annotate: Initial PowerPC support

Wang Nan (3):
perf tools: Fix kernel version error in ubuntu
perf record: Fix segfault when running with suid and kptr_restrict is 1
perf tools: Add missing struct definition in probe_event.h

tools/lib/bpf/libbpf.c | 142 ++++++++++++++-------
tools/perf/arch/arm/annotate/instructions.c | 147 +++++++++-------------
tools/perf/arch/powerpc/annotate/instructions.c | 58 +++++++++
tools/perf/builtin-sched.c | 26 +++-
tools/perf/ui/browsers/annotate.c | 18 +--
tools/perf/util/annotate.c | 157 +++++++++++++++++-------
tools/perf/util/annotate.h | 17 ++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/evsel_fprintf.c | 7 +-
tools/perf/util/probe-event.h | 2 +
tools/perf/util/symbol.c | 2 +-
tools/perf/util/symbol.h | 1 +
tools/perf/util/util.c | 55 ++++++++-
13 files changed, 431 insertions(+), 202 deletions(-)
create mode 100644 tools/perf/arch/powerpc/annotate/instructions.c

Rebuilding containers, so limited coverage at this time:
# dm
1 debian:experimental: Ok
2 fedora:24: Ok
3 fedora:24-x-ARC-uClibc: Ok
4 fedora:rawhide: Ok
5 opensuse:tumbleweed: Ok
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_slang_O: make NO_SLANG=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_newt_O: make NO_NEWT=1
make_util_map_o_O: make util/map.o
make_pure_O: make
make_no_libbpf_O: make NO_LIBBPF=1
make_doc_O: make doc
make_no_libnuma_O: make NO_LIBNUMA=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_tags_O: make tags
make_install_prefix_O: make install prefix=/tmp/krava
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_static_O: make LDFLAGS=-static
make_debug_O: make DEBUG=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_perf_o_O: make perf.o
make_no_libbionic_O: make NO_LIBBIONIC=1
make_clean_all_O: make clean all
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_O: make install
make_no_libunwind_O: make NO_LIBUNWIND=1
make_install_bin_O: make install-bin
make_no_libelf_O: make NO_LIBELF=1
make_help_O: make help
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

Arnaldo Carvalho de Melo

Nov 25, 2016, 10:20:09 AM
From: Eric Leblond <er...@regit.org>

It is not correct to treat the ELF data of the maps section as an
array of map definitions. In fact the sizes differ. The offset provided
in the symbol section has to be used instead.

This patch fixes a bug causing an ELF file with two maps not to load
correctly.
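
As a rough sketch of the approach (not the libbpf code itself), the snippet below records each map by its symbol offset, sorts by offset and rejects entries that leave too little room for a definition. The sketch_ names and the fixed SKETCH_MAP_DEF_SIZE are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed stand-in for sizeof(struct bpf_map_def). */
#define SKETCH_MAP_DEF_SIZE 16

struct sketch_map {
	const char *name;
	size_t offset; /* symbol value: byte offset into the "maps" section */
};

static int sketch_map_cmp(const void *_a, const void *_b)
{
	const struct sketch_map *a = _a;
	const struct sketch_map *b = _b;

	/* Compare rather than subtract, so a size_t difference is never truncated to int. */
	return (a->offset > b->offset) - (a->offset < b->offset);
}

/* Returns 0 when every map has room for a full definition, -1 otherwise. */
static int sketch_maps__sort_and_validate(struct sketch_map *maps, size_t nr,
					  size_t section_size)
{
	size_t i;

	/* Each definition must fit inside the section. */
	for (i = 0; i < nr; i++) {
		if (maps[i].offset > section_size ||
		    section_size - maps[i].offset < SKETCH_MAP_DEF_SIZE)
			return -1;
	}

	qsort(maps, nr, sizeof(maps[0]), sketch_map_cmp);

	/* Neighbouring maps must be at least one definition apart. */
	for (i = 1; i < nr; i++) {
		if (maps[i].offset - maps[i - 1].offset < SKETCH_MAP_DEF_SIZE)
			return -1;
	}
	return 0;
}
```

Nothing here assumes that the section size is a multiple of the definition size, which is the assumption that broke objects with two maps.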

Wang Nan added:

This patch requires a name for each BPF map, so an array of BPF maps is
not allowed. This restriction is reasonable, because the kernel verifier
forbids indexing a BPF map from such an array unless the index is a fixed
value, but if the index is fixed why not merge it into the name?

For example:

Program like this:
...
unsigned long cpu = get_smp_processor_id();
int *pval = map_lookup_elem(&map_array[cpu], &key);
...

Generates bytecode like this:

0: (b7) r1 = 0
1: (63) *(u32 *)(r10 -4) = r1
2: (b7) r1 = 680997
3: (63) *(u32 *)(r10 -8) = r1
4: (85) call 8
5: (67) r0 <<= 4
6: (18) r1 = 0x112dd000
8: (0f) r0 += r1
9: (bf) r2 = r10
10: (07) r2 += -4
11: (bf) r1 = r0
12: (85) call 1

Here instruction 8 is the index computation; instructions 8 and 11
together leave r1 holding an invalid value for map_lookup_elem, which
causes the verifier to report an error.

Signed-off-by: Eric Leblond <er...@regit.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Wang Nan <wang...@huawei.com>
[ Merge bpf_object__init_maps_name into bpf_object__init_maps.
Fix segfault for buggy BPF script. Validate obj->maps. ]
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161115040617....@huawei.com
Signed-off-by: Wang Nan <wang...@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/lib/bpf/libbpf.c | 142 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 98 insertions(+), 44 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b699aea9a025..96a2b2ff1212 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -185,6 +185,7 @@ struct bpf_program {
struct bpf_map {
int fd;
char *name;
+ size_t offset;
struct bpf_map_def def;
void *priv;
bpf_map_clear_priv_t clear_priv;
@@ -513,57 +514,106 @@ bpf_object__init_kversion(struct bpf_object *obj,
}

static int
-bpf_object__init_maps(struct bpf_object *obj, void *data,
- size_t size)
+bpf_object__validate_maps(struct bpf_object *obj)
{
- size_t nr_maps;
int i;

- nr_maps = size / sizeof(struct bpf_map_def);
- if (!data || !nr_maps) {
- pr_debug("%s doesn't need map definition\n",
- obj->path);
+ /*
+ * If there's only 1 map, the only error case should have been
+ * catched in bpf_object__init_maps().
+ */
+ if (!obj->maps || !obj->nr_maps || (obj->nr_maps == 1))
return 0;
- }

- pr_debug("maps in %s: %zd bytes\n", obj->path, size);
+ for (i = 1; i < obj->nr_maps; i++) {
+ const struct bpf_map *a = &obj->maps[i - 1];
+ const struct bpf_map *b = &obj->maps[i];

- obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
- if (!obj->maps) {
- pr_warning("alloc maps for object failed\n");
- return -ENOMEM;
+ if (b->offset - a->offset < sizeof(struct bpf_map_def)) {
+ pr_warning("corrupted map section in %s: map \"%s\" too small\n",
+ obj->path, a->name);
+ return -EINVAL;
+ }
}
- obj->nr_maps = nr_maps;
-
- for (i = 0; i < nr_maps; i++) {
- struct bpf_map_def *def = &obj->maps[i].def;
+ return 0;
+}

- /*
- * fill all fd with -1 so won't close incorrect
- * fd (fd=0 is stdin) when failure (zclose won't close
- * negative fd)).
- */
- obj->maps[i].fd = -1;
+static int compare_bpf_map(const void *_a, const void *_b)
+{
+ const struct bpf_map *a = _a;
+ const struct bpf_map *b = _b;

- /* Save map definition into obj->maps */
- *def = ((struct bpf_map_def *)data)[i];
- }
- return 0;
+ return a->offset - b->offset;
}

static int
-bpf_object__init_maps_name(struct bpf_object *obj)
+bpf_object__init_maps(struct bpf_object *obj)
{
- int i;
+ int i, map_idx, nr_maps = 0;
+ Elf_Scn *scn;
+ Elf_Data *data;
Elf_Data *symbols = obj->efile.symbols;

- if (!symbols || obj->efile.maps_shndx < 0)
+ if (obj->efile.maps_shndx < 0)
+ return -EINVAL;
+ if (!symbols)
+ return -EINVAL;
+
+ scn = elf_getscn(obj->efile.elf, obj->efile.maps_shndx);
+ if (scn)
+ data = elf_getdata(scn, NULL);
+ if (!scn || !data) {
+ pr_warning("failed to get Elf_Data from map section %d\n",
+ obj->efile.maps_shndx);
return -EINVAL;
+ }

+ /*
+ * Count number of maps. Each map has a name.
+ * Array of maps is not supported: only the first element is
+ * considered.
+ *
+ * TODO: Detect array of map and report error.
+ */
for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
GElf_Sym sym;
- size_t map_idx;
+
+ if (!gelf_getsym(symbols, i, &sym))
+ continue;
+ if (sym.st_shndx != obj->efile.maps_shndx)
+ continue;
+ nr_maps++;
+ }
+
+ /* Alloc obj->maps and fill nr_maps. */
+ pr_debug("maps in %s: %d maps in %zd bytes\n", obj->path,
+ nr_maps, data->d_size);
+
+ if (!nr_maps)
+ return 0;
+
+ obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
+ if (!obj->maps) {
+ pr_warning("alloc maps for object failed\n");
+ return -ENOMEM;
+ }
+ obj->nr_maps = nr_maps;
+
+ /*
+ * fill all fd with -1 so won't close incorrect
+ * fd (fd=0 is stdin) when failure (zclose won't close
+ * negative fd)).
+ */
+ for (i = 0; i < nr_maps; i++)
+ obj->maps[i].fd = -1;
+
+ /*
+ * Fill obj->maps using data in "maps" section.
+ */
+ for (i = 0, map_idx = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
+ GElf_Sym sym;
const char *map_name;
+ struct bpf_map_def *def;

if (!gelf_getsym(symbols, i, &sym))
continue;
@@ -573,21 +623,27 @@ bpf_object__init_maps_name(struct bpf_object *obj)
map_name = elf_strptr(obj->efile.elf,
obj->efile.strtabidx,
sym.st_name);
- map_idx = sym.st_value / sizeof(struct bpf_map_def);
- if (map_idx >= obj->nr_maps) {
- pr_warning("index of map \"%s\" is buggy: %zu > %zu\n",
- map_name, map_idx, obj->nr_maps);
- continue;
+ obj->maps[map_idx].offset = sym.st_value;
+ if (sym.st_value + sizeof(struct bpf_map_def) > data->d_size) {
+ pr_warning("corrupted maps section in %s: last map \"%s\" too small\n",
+ obj->path, map_name);
+ return -EINVAL;
}
+
obj->maps[map_idx].name = strdup(map_name);
if (!obj->maps[map_idx].name) {
pr_warning("failed to alloc map name\n");
return -ENOMEM;
}
- pr_debug("map %zu is \"%s\"\n", map_idx,
+ pr_debug("map %d is \"%s\"\n", map_idx,
obj->maps[map_idx].name);
+ def = (struct bpf_map_def *)(data->d_buf + sym.st_value);
+ obj->maps[map_idx].def = *def;
+ map_idx++;
}
- return 0;
+
+ qsort(obj->maps, obj->nr_maps, sizeof(obj->maps[0]), compare_bpf_map);
+ return bpf_object__validate_maps(obj);
}

static int bpf_object__elf_collect(struct bpf_object *obj)
@@ -645,11 +701,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
err = bpf_object__init_kversion(obj,
data->d_buf,
data->d_size);
- else if (strcmp(name, "maps") == 0) {
- err = bpf_object__init_maps(obj, data->d_buf,
- data->d_size);
+ else if (strcmp(name, "maps") == 0)
obj->efile.maps_shndx = idx;
- } else if (sh.sh_type == SHT_SYMTAB) {
+ else if (sh.sh_type == SHT_SYMTAB) {
if (obj->efile.symbols) {
pr_warning("bpf: multiple SYMTAB in %s\n",
obj->path);
@@ -698,7 +752,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
return LIBBPF_ERRNO__FORMAT;
}
if (obj->efile.maps_shndx >= 0)
- err = bpf_object__init_maps_name(obj);
+ err = bpf_object__init_maps(obj);
out:
return err;
}
@@ -807,7 +861,7 @@ bpf_object__create_maps(struct bpf_object *obj)
zclose(obj->maps[j].fd);
return err;
}
- pr_debug("create map: fd=%d\n", *pfd);
+ pr_debug("create map %s: fd=%d\n", obj->maps[i].name, *pfd);
}

return 0;
--
2.7.4

Ingo Molnar

Nov 25, 2016, 12:20:06 PM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:05 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 2471cece40d61b0035360338569d338f9dea6099:

Merge tag 'perf-core-for-mingo-20161125' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-11-25 18:12:41 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161201

for you to fetch changes up to 0fcb1da4aba6e6c7b32de5e0948b740b31ad822d:

perf annotate: AArch64 support (2016-12-01 13:03:19 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Support AArch64 in the 'annotate' code, native/local and
cross-arch/remote (Kim Phillips)

- Allow considering just events in a given time interval, via the
'--time start.s.ms,end.s.ms' command line, added to 'perf kmem',
'perf report', 'perf sched timehist' and 'perf script' (David Ahern)

- Add option to stop printing a callchain at one of a given group of
symbol names (David Ahern)

- Handle cpu migration events in 'perf sched timehist' (David Ahern)

- Track memory freed in 'perf kmem stat' (David Ahern)

Infrastructure:

- Initial support (and perf test entry) for tooling hooks, starting with
'record_start' and 'record_end', that will have as its initial user the
eBPF infrastructure, where perf_ prefixed functions will be JITed and
run when such hooks are called (Wang Nan)

- Remove redundant "test" and similar strings from 'perf test' descriptions
(Arnaldo Carvalho de Melo)

- libbpf assorted improvements (Wang Nan)
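
For readers unfamiliar with the tooling hooks item above, the following is a minimal sketch of what a named hook table could look like. It is an illustration only: the sketch_ names, the lookup by string and the counting callback are assumptions, and this is not the interface added in tools/perf/util/perf-hooks.h.

```c
#include <stddef.h>
#include <string.h>

typedef void (*sketch_hook_fn)(void *ctx);

struct sketch_hook {
	const char *name;
	sketch_hook_fn fn; /* NULL until somebody registers a callback */
	void *ctx;
};

/* The two hook points named in the summary above. */
static struct sketch_hook sketch_hooks[] = {
	{ .name = "record_start" },
	{ .name = "record_end" },
};

static struct sketch_hook *sketch_hooks__find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_hooks) / sizeof(sketch_hooks[0]); i++) {
		if (strcmp(sketch_hooks[i].name, name) == 0)
			return &sketch_hooks[i];
	}
	return NULL;
}

/* Register a callback; returns -1 for an unknown hook name. */
static int sketch_hooks__set(const char *name, sketch_hook_fn fn, void *ctx)
{
	struct sketch_hook *hook = sketch_hooks__find(name);

	if (hook == NULL)
		return -1;
	hook->fn = fn;
	hook->ctx = ctx;
	return 0;
}

/* Run the callback if one is registered; an empty hook is a no-op. */
static int sketch_hooks__invoke(const char *name)
{
	struct sketch_hook *hook = sketch_hooks__find(name);

	if (hook == NULL)
		return -1;
	if (hook->fn != NULL)
		hook->fn(hook->ctx);
	return 0;
}

/* Example callback: counts how many times it was invoked. */
static void sketch_count_call(void *ctx)
{
	++*(int *)ctx;
}
```

A tool would call sketch_hooks__invoke("record_start") right before recording begins and the "record_end" counterpart once it finishes, with whoever registered the callback deciding what runs at that point.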

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (3):
perf ui helpline: Provide a printf variant
perf annotate: Show invalid jump offset in error message
perf test: Remove "test" and similar strings from test descriptions

David Ahern (10):
perf sched timehist: Handle cpu migration events
perf trace: Update tid/pid filtering option to leverage symbol_conf
perf kmem stat: Track memory freed
perf script: Add option to stop printing callchain
perf tools: Add time-based utility functions
perf tools: Move parse_nsec_time to time-utils.c
perf script: Add option to specify time window of interest
perf sched timehist: Add option to specify time window of interest
perf kmem: Add option to specify time window of interest
perf report: Add option to specify time window of interest

Kim Phillips (2):
perf annotate: Use arch->objdump.comment_char in dec__parse()
perf annotate: AArch64 support

Wang Nan (4):
tools lib bpf: Add missing BPF functions
tools lib bpf: Add private field for bpf_object
tools lib bpf: Retrive bpf_map through offset of bpf_map_def
perf tools: Introduce perf hooks

tools/lib/bpf/bpf.c | 56 ++++++++++
tools/lib/bpf/bpf.h | 7 ++
tools/lib/bpf/libbpf.c | 35 ++++++
tools/lib/bpf/libbpf.h | 13 +++
tools/perf/Documentation/perf-kmem.txt | 7 ++
tools/perf/Documentation/perf-report.txt | 7 ++
tools/perf/Documentation/perf-sched.txt | 12 +++
tools/perf/Documentation/perf-script.txt | 10 ++
tools/perf/arch/arm64/annotate/instructions.c | 62 +++++++++++
tools/perf/arch/x86/tests/arch-tests.c | 10 +-
tools/perf/builtin-kmem.c | 36 ++++++-
tools/perf/builtin-record.c | 11 ++
tools/perf/builtin-report.c | 14 ++-
tools/perf/builtin-sched.c | 148 ++++++++++++++++++++++++--
tools/perf/builtin-script.c | 17 ++-
tools/perf/builtin-trace.c | 49 ++-------
tools/perf/tests/Build | 1 +
tools/perf/tests/bpf.c | 6 +-
tools/perf/tests/builtin-test.c | 96 +++++++++--------
tools/perf/tests/llvm.c | 8 +-
tools/perf/tests/perf-hooks.c | 44 ++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/ui/browsers/annotate.c | 6 +-
tools/perf/ui/helpline.c | 10 ++
tools/perf/ui/helpline.h | 1 +
tools/perf/util/Build | 3 +
tools/perf/util/annotate.c | 7 +-
tools/perf/util/evsel_fprintf.c | 8 ++
tools/perf/util/perf-hooks-list.h | 3 +
tools/perf/util/perf-hooks.c | 84 +++++++++++++++
tools/perf/util/perf-hooks.h | 37 +++++++
tools/perf/util/symbol.c | 8 ++
tools/perf/util/symbol.h | 6 +-
tools/perf/util/time-utils.c | 119 +++++++++++++++++++++
tools/perf/util/time-utils.h | 14 +++
tools/perf/util/util.c | 33 ------
tools/perf/util/util.h | 2 -
37 files changed, 842 insertions(+), 149 deletions(-)
create mode 100644 tools/perf/arch/arm64/annotate/instructions.c
create mode 100644 tools/perf/tests/perf-hooks.c
create mode 100644 tools/perf/util/perf-hooks-list.h
create mode 100644 tools/perf/util/perf-hooks.c
create mode 100644 tools/perf/util/perf-hooks.h
create mode 100644 tools/perf/util/time-utils.c
create mode 100644 tools/perf/util/time-utils.h

# uname -a
Linux jouet 4.8.8-300.fc25.x86_64 #1 SMP Tue Nov 15 18:10:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF prologue generation : Ok
37.3: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Synthesize cpu map : Ok
40: Synthesize stat config : Ok
41: Synthesize stat : Ok
42: Synthesize stat round : Ok
43: Synthesize attr update : Ok
44: Event times : Ok
45: Read backward ring buffer : Ok
46: Print cpu map : Ok
47: Probe SDT events : Ok
48: is_printable_array : Ok
49: Print bitmap : Ok
50: perf hooks : Ok
51: x86 rdpmc : Ok
52: Convert perf time to TSC : Ok
53: DWARF unwind : Ok
54: x86 instruction decoder - new instructions : Ok
55: Intel cqm nmi context read : Skip
#
# uname -a
Linux zoo 4.7.3-200.fc24.x86_64 #1 SMP Wed Sep 7 17:31:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ grep PRETTY_NAME /etc/os-release
PRETTY_NAME="Fedora 25 (Workstation Edition)"
$
$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_slang_O: make NO_SLANG=1
make_util_map_o_O: make util/map.o
make_static_O: make LDFLAGS=-static
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_perf_o_O: make perf.o
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libelf_O: make NO_LIBELF=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libperl_O: make NO_LIBPERL=1
make_tags_O: make tags
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_install_bin_O: make install-bin
make_no_newt_O: make NO_NEWT=1
make_pure_O: make
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_clean_all_O: make clean all
make_no_gtk2_O: make NO_GTK2=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_help_O: make help
make_doc_O: make doc
make_debug_O: make DEBUG=1
make_install_O: make install

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Add an option to let the user control the analysis window, e.g., collect
data for a time window and analyze a segment of interest within that window.
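
A minimal sketch of how such a window could be parsed and applied follows. It is not the time-utils.c code: the sketch_ names are hypothetical, timestamps are assumed to be nanoseconds, a comma is assumed to be mandatory, and 0 is assumed to mean "no bound".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_interval {
	uint64_t start; /* ns; 0 = from the beginning of the file */
	uint64_t end;   /* ns; 0 = until the end of the file */
};

/* Parse "sec[.frac]" (up to 9 fractional digits) into nanoseconds. */
static int sketch_parse_ns(const char *s, size_t len, uint64_t *ns)
{
	uint64_t sec = 0, frac = 0;
	size_t i = 0, digits = 0;

	if (len == 0 || !isdigit((unsigned char)s[0]))
		return -1;
	for (; i < len && isdigit((unsigned char)s[i]); i++)
		sec = sec * 10 + (uint64_t)(s[i] - '0');
	if (i < len) {
		if (s[i] != '.')
			return -1;
		for (i++; i < len; i++, digits++) {
			if (!isdigit((unsigned char)s[i]) || digits == 9)
				return -1;
			frac = frac * 10 + (uint64_t)(s[i] - '0');
		}
		/* Scale e.g. ".782088" (microseconds) up to nanoseconds. */
		for (; digits < 9; digits++)
			frac *= 10;
	}
	*ns = sec * 1000000000ULL + frac;
	return 0;
}

/* Accepts "start,stop", ",stop" and "start,"; NULL means no filtering. */
static int sketch_interval__parse(struct sketch_interval *iv, const char *str)
{
	const char *comma;

	iv->start = iv->end = 0;
	if (str == NULL)
		return 0;
	comma = strchr(str, ',');
	if (comma == NULL)
		return -1;
	if (comma != str && sketch_parse_ns(str, (size_t)(comma - str), &iv->start))
		return -1;
	if (comma[1] != '\0' && sketch_parse_ns(comma + 1, strlen(comma + 1), &iv->end))
		return -1;
	if (iv->end != 0 && iv->end < iv->start)
		return -1;
	return 0;
}

/* True when the sample timestamp falls outside the requested window. */
static bool sketch_interval__skip(const struct sketch_interval *iv, uint64_t t)
{
	if (iv->start != 0 && t < iv->start)
		return true;
	if (iv->end != 0 && t > iv->end)
		return true;
	return false;
}
```

With "20119.782088," the start becomes 20119782088000 ns and the end stays open, so only samples at or after that timestamp are analyzed.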

Committer notes:

Testing it:

# perf kmem record usleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ]
# perf evlist
kmem:kmalloc
kmem:kmalloc_node
kmem:kfree
kmem:kmem_cache_alloc
kmem:kmem_cache_alloc_node
kmem:kmem_cache_free
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
#
# # Use 'perf script' for a first look, then select a chunk to use
# # with 'perf kmem stat --time'
#
# perf script | tail -15
usleep 9889 [0] 20119.782088: kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0
perf 9888 [3] 20119.782088: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782090: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
perf 9888 [3] 20119.782091: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782093: kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0
perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782098: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782101: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782103: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
#
# # stats for the whole perf.data file, i.e. no interval specified
#
# perf kmem stat

SUMMARY (SLAB allocator)
========================
Total bytes requested: 172,628
Total bytes allocated: 173,088
Total bytes freed: 161,280
Net total bytes allocated: 11,808
Total bytes wasted on internal fragmentation: 460
Internal fragmentation: 0.265761%
Cross CPU allocations: 0/851
#
# # stats for an open-ended interval, starting at a given time:
#
# perf kmem stat --time 20119.782088,

SUMMARY (SLAB allocator)
========================
Total bytes requested: 552
Total bytes allocated: 552
Total bytes freed: 448
Net total bytes allocated: 104
Total bytes wasted on internal fragmentation: 0
Internal fragmentation: 0.000000%
Cross CPU allocations: 0/8
#

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-kmem.txt | 7 +++++++
tools/perf/builtin-kmem.c | 24 ++++++++++++++++++++++++
2 files changed, 31 insertions(+)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index ff0f433b3fce..479fc3261a50 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -61,6 +61,13 @@ OPTIONS
default, but this option shows live (currently allocated) pages
instead. (This option works with --page option only)

+--time::
+ Only analyze samples within given time window: <start>,<stop>. Times
+ have the format seconds.microseconds. If start is not given (i.e., time
+ string is ',x.y') then analysis starts at the beginning of the file. If
+ stop time is not given (i.e, time string is 'x.y,') then analysis goes
+ to end of file.
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 7fd6f1e1e293..35a02f8e5a4a 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -11,6 +11,7 @@
#include "util/session.h"
#include "util/tool.h"
#include "util/callchain.h"
+#include "util/time-utils.h"

#include <subcmd/parse-options.h>
#include "util/trace-event.h"
@@ -66,6 +67,10 @@ static struct rb_root root_caller_sorted;
static unsigned long total_requested, total_allocated, total_freed;
static unsigned long nr_allocs, nr_cross_allocs;

+/* filters for controlling start and stop of time of analysis */
+static struct perf_time_interval ptime;
+const char *time_str;
+
static int insert_alloc_stat(unsigned long call_site, unsigned long ptr,
int bytes_req, int bytes_alloc, int cpu)
{
@@ -912,6 +917,15 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
return 0;
}

+static bool perf_kmem__skip_sample(struct perf_sample *sample)
+{
+ /* skip sample based on time? */
+ if (perf_time__skip_sample(&ptime, sample->time))
+ return true;
+
+ return false;
+}
+
typedef int (*tracepoint_handler)(struct perf_evsel *evsel,
struct perf_sample *sample);

@@ -931,6 +945,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return -1;
}

+ if (perf_kmem__skip_sample(sample))
+ return 0;
+
dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);

if (evsel->handler != NULL) {
@@ -1894,6 +1911,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK_NOOPT(0, "page", NULL, NULL, "Analyze page allocator",
parse_page_opt),
OPT_BOOLEAN(0, "live", &live_page, "Show live page stat"),
+ OPT_STRING(0, "time", &time_str, "str",
+ "Time span of interest (start,stop)"),
OPT_END()
};
const char *const kmem_subcommands[] = { "record", "stat", NULL };
@@ -1954,6 +1973,11 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)

symbol__init(&session->header.env);

+ if (perf_time__parse_str(&ptime, time_str) != 0) {
+ pr_err("Invalid time string\n");
+ return -EINVAL;
+ }
+
if (!strcmp(argv[0], "stat")) {
setlocale(LC_ALL, "");

--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

To help in debugging when the wrong offset is being used, like in:

│13d98: ↓ jne 13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1>

That is the full line from objdump; it seems the offset that should be
used is 13dd1, not 28e1.
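
For readers following along, the two candidate values in that objdump line can be separated mechanically. The sketch below is only an illustration: parse_jump_target() and struct jump_target are hypothetical names, not perf code. It assumes the operand has the shape "<hex addr> <symbol+0xoffset>" as in the line quoted above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the two numbers found in a jump operand. */
struct jump_target {
	uint64_t addr;       /* leading hex value, e.g. 0x13dd1 */
	uint64_t sym_offset; /* value after '+' inside <...>, e.g. 0x28e1 */
	int has_sym_offset;  /* non-zero when a "+0x..." part was found */
};

/*
 * Parse an operand such as "13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1>".
 * Returns 0 on success, -1 if the operand does not start with a hex number.
 */
static int parse_jump_target(const char *operand, struct jump_target *jt)
{
	char *end;
	const char *plus, *close;

	memset(jt, 0, sizeof(*jt));

	jt->addr = strtoull(operand, &end, 16);
	if (end == operand)
		return -1;

	/* The symbol-relative part is optional: use the last '+' before '>'. */
	close = strrchr(end, '>');
	plus = strrchr(end, '+');
	if (close && plus && plus < close) {
		jt->sym_offset = strtoull(plus + 1, NULL, 16);
		jt->has_sym_offset = 1;
	}
	return 0;
}
```

With the line above, addr comes out as 0x13dd1 and sym_offset as 0x28e1, which are the two values the new help line message makes it possible to tell apart.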

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-4nc0marsgs...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/annotate.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index cee0eee31ce6..ec7a30fad149 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -543,14 +543,16 @@ struct disasm_line *annotate_browser__find_offset(struct annotate_browser *brows
static bool annotate_browser__jump(struct annotate_browser *browser)
{
struct disasm_line *dl = browser->selection;
+ u64 offset;
s64 idx;

if (!ins__is_jump(&dl->ins))
return false;

- dl = annotate_browser__find_offset(browser, dl->ops.target.offset, &idx);
+ offset = dl->ops.target.offset;
+ dl = annotate_browser__find_offset(browser, offset, &idx);
if (dl == NULL) {
- ui_helpline__puts("Invalid jump offset");
+ ui_helpline__printf("Invalid jump offset: %" PRIx64, offset);
return true;
}

--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Add an option that lets the user control the analysis window, e.g.,
collect data for a time window and analyze a segment of interest within
that window.

Committer notes:

Testing it:

Using the perf.data file captured via 'perf kmem record':

# perf report --header-only
# ========
# captured on: Tue Nov 29 16:01:53 2016
# hostname : jouet
# os release : 4.8.8-300.fc25.x86_64
# perf version : 4.9.rc6.g5a6aca
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
# cpuid : GenuineIntel,6,61,4
# total memory : 20254660 kB
# cmdline : /home/acme/bin/perf kmem record usleep 1
# event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
# event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
# event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
# event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
# event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
# event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
# HEADER_CACHE info available, use -I to display
# missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
# ========
#
# # Looking at just the histogram entries for the first event:
#
# perf report | head -33
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 40 of event 'kmem:kmalloc'
# Event count (approx.): 40
#
# Overhead Trace output
# ........ ...............................................................................................................
#
37.50% call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
10.00% call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
7.50% call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
2.50% call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
2.50% call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO

#
# # And then limiting using the example for 'perf kmem stat --time' used
# # in the previous changeset committer note we see that there were no
# # kmem:kmalloc in that last part of the file, but there were some
# # kmem:kmem_cache_alloc ones:
#
# perf report --time 20119.782088, --stdio
#
# Total Lost Samples: 0
#
# Samples: 0 of event 'kmem:kmalloc'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 0 of event 'kmem:kmalloc_node'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 0 of event 'kmem:kfree'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 8 of event 'kmem:kmem_cache_alloc'
# Event count (approx.): 8
#
# Overhead Trace output
# ........ ..................................................................................................................
#
75.00% call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
12.50% call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
12.50% call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
#

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 7 +++++++
tools/perf/builtin-report.c | 14 +++++++++++++-
2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 2d1746295abf..3a166ae4a4d3 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -382,6 +382,13 @@ OPTIONS
--header-only::
Show only perf.data header (forces --stdio).

+--time::
+ Only analyze samples within given time window: <start>,<stop>. Times
+ have the format seconds.microseconds. If start is not given (i.e., time
+ string is ',x.y') then analysis starts at the beginning of the file. If
+ stop time is not given (i.e, time string is 'x.y,') then analysis goes
+ to end of file.
+
--itrace::
Options for decoding instruction tracing data. The options are:

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3dfbfffe2ecd..d2afbe4a240d 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -36,7 +36,7 @@
#include "util/hist.h"
#include "util/data.h"
#include "arch/common.h"
-
+#include "util/time-utils.h"
#include "util/auxtrace.h"

#include <dlfcn.h>
@@ -59,6 +59,8 @@ struct report {
const char *pretty_printing_style;
const char *cpu_list;
const char *symbol_filter_str;
+ const char *time_str;
+ struct perf_time_interval ptime;
float min_percent;
u64 nr_entries;
u64 queue_size;
@@ -158,6 +160,9 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;

+ if (perf_time__skip_sample(&rep->ptime, sample->time))
+ return 0;
+
if (machine__resolve(machine, &al, sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
event->header.type);
@@ -830,6 +835,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
"'always' (default), 'never' or 'auto' only applicable to --stdio mode",
stdio__config_color, "always"),
+ OPT_STRING(0, "time", &report.time_str, "str",
+ "Time span of interest (start,stop)"),
OPT_END()
};
struct perf_data_file file = {
@@ -1015,6 +1022,11 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
if (symbol__init(&session->header.env) < 0)
goto error;

+ if (perf_time__parse_str(&report.ptime, report.time_str) != 0) {
+ pr_err("Invalid time string\n");
+ return -EINVAL;
+ }
+
sort__setup_elide(stdout);

ret = __cmd_report(&report);
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Kim Phillips <kim.ph...@arm.com>

Presumably neglected in commit 786c1b5 ("perf annotate: Start supporting
cross arch annotation"). This doesn't fix a bug, since none of the
affected arches support parsing dec/inc instructions yet.
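
To illustrate why the comment character has to come from the arch, here is a minimal sketch. struct arch_sketch and operand_comment() are hypothetical stand-ins, not the perf structures, and the ';' comment character for ARM is an assumption about objdump's ARM output rather than something stated in this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for perf's per-arch objdump settings. */
struct arch_sketch {
	const char *name;
	char comment_char;
};

/*
 * Return a pointer to the text following the arch's comment character in an
 * operand string, skipping blanks, or NULL if there is no comment.
 */
static const char *operand_comment(const struct arch_sketch *arch, const char *s)
{
	const char *comment = strchr(s, arch->comment_char);

	if (comment == NULL)
		return NULL;

	comment++;
	while (*comment == ' ' || *comment == '\t')
		comment++;
	return comment;
}
```

With an operand like "r3, [pc, #16] ; 8400 <counter>", a hard-coded '#' would stop at the immediate "#16", while an arch whose comment_char is ';' finds "8400 <counter>".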

Signed-off-by: Kim Phillips <kim.ph...@arm.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Ryder <chris...@arm.com>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Pawel Moll <pawel...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Will Deacon <will....@arm.com>
Link: http://lkml.kernel.org/r/20161130092333.1cca...@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3e34ee0fde28..191599eca807 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -408,7 +408,7 @@ static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops
if (ops->target.raw == NULL)
return -1;

- comment = strchr(s, '#');
+ comment = strchr(s, arch->objdump.comment_char);
if (comment == NULL)
return 0;

--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Add an option that lets the user control the analysis window, e.g.,
collect data for some amount of time and analyze a segment of interest
within that window.

Committer notes:

Testing it:

# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
#
# perf script --hide-call-graph | head -15
swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370048: 126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370049: 2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370051: 58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370059: 1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370062: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
perf 5124 [2] 9693.370076: 1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
perf 5124 [2] 9693.370091: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
perf 5124 [2] 9693.370095: 3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
#
# perf script --hide-call-graph --time ,9693.370048
swapper 0 [0] 9693.370039: 1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370044: 1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [0] 9693.370046: 7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
# perf script --hide-call-graph --time 9693.370064,9693.370076
swapper 0 [1] 9693.370064: 13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370065: 250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370067: 5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
swapper 0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
#
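
A note on the boundaries in the output above: the body of perf_time__skip_sample() is not part of this patch, so the following is only a sketch of a window check that would be consistent with it. It assumes times are compared in nanoseconds and that 0 means "not set"; struct time_window and window_skips() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct perf_time_interval; times in nanoseconds, 0 = unset. */
struct time_window {
	uint64_t start;
	uint64_t end;
};

/* Return true when the sample falls outside the window and should be dropped. */
static bool window_skips(const struct time_window *w, uint64_t timestamp)
{
	if (timestamp == 0)
		return false; /* no timestamp recorded: keep the sample */

	if (w->start && timestamp < w->start)
		return true;
	if (w->end && timestamp > w->end)
		return true;
	return false;
}
```

Under those assumptions, a sample printed as 9693.370048 can still be dropped by '--time ,9693.370048': the printed value has microsecond resolution, so its full timestamp may be a few hundred nanoseconds past the requested stop time.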

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 7 +++++++
tools/perf/builtin-script.c | 15 ++++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 0f6ee09f7256..5dc5c6a09ac4 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -292,6 +292,13 @@ include::itrace.txt[]
--force::
Don't do ownership validation.

+--time::
+ Only analyze samples within given time window: <start>,<stop>. Times
+ have the format seconds.microseconds. If start is not given (i.e., time
+ string is ',x.y') then analysis starts at the beginning of the file. If
+ stop time is not given (i.e, time string is 'x.y,') then analysis goes
+ to end of file.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 066b4bf73780..2f3ff69fc4e7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -22,6 +22,7 @@
#include "util/thread_map.h"
#include "util/stat.h"
#include "util/thread-stack.h"
+#include "util/time-utils.h"
#include <linux/bitmap.h>
#include <linux/stringify.h>
#include <linux/time64.h>
@@ -833,6 +834,8 @@ struct perf_script {
struct cpu_map *cpus;
struct thread_map *threads;
int name_width;
+ const char *time_str;
+ struct perf_time_interval ptime;
};

static int perf_evlist__max_name_len(struct perf_evlist *evlist)
@@ -1014,6 +1017,9 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_script *scr = container_of(tool, struct perf_script, tool);
struct addr_location al;

+ if (perf_time__skip_sample(&scr->ptime, sample->time))
+ return 0;
+
if (debug_mode) {
if (sample->time < last_timestamp) {
pr_err("Samples misordered, previous: %" PRIu64
@@ -2186,7 +2192,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"Enable symbol demangling"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
-
+ OPT_STRING(0, "time", &script.time_str, "str",
+ "Time span of interest (start,stop)"),
OPT_END()
};
const char * const script_subcommands[] = { "record", "report", NULL };
@@ -2465,6 +2472,12 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (err < 0)
goto out_delete;

+ /* needs to be parsed after looking up reference time */
+ if (perf_time__parse_str(&script.ptime, script.time_str) != 0) {
+ pr_err("Invalid time string\n");
+ return -EINVAL;
+ }
+
err = __cmd_script(&script);

flush_scripting();
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Wang Nan <wang...@huawei.com>

Add a new API to libbpf that lets the caller get a bpf_map through the
offset of its bpf_map_def within the 'maps' section.

The API will be used to help jitted perf hook code find the fd of a map.
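
One thing worth spelling out for callers: on a miss the new lookup returns an error pointer, not NULL. The sketch below reproduces that contract outside libbpf; struct map_sketch, struct object_sketch and the sketch_* helpers are hypothetical stand-ins for the real types and for ERR_PTR()/IS_ERR().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel-style error-pointer helpers. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err)
{
	return (void *)err;
}

static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical, reduced versions of the map and object structures. */
struct map_sketch {
	size_t offset; /* offset of the map definition in the 'maps' section */
	int fd;
};

struct object_sketch {
	struct map_sketch *maps;
	size_t nr_maps;
};

/* Linear search by offset; a miss is reported as an error pointer. */
static struct map_sketch *find_map_by_offset(struct object_sketch *obj,
					     size_t offset)
{
	size_t i;

	for (i = 0; i < obj->nr_maps; i++) {
		if (obj->maps[i].offset == offset)
			return &obj->maps[i];
	}
	return sketch_err_ptr(-ENOENT);
}
```

A caller written for bpf_object__find_map_by_name(), which returns NULL on a miss, would need an IS_ERR()-style check here rather than a NULL check.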

Signed-off-by: Wang Nan <wang...@huawei.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/libbpf.c | 12 ++++++++++++
tools/lib/bpf/libbpf.h | 8 ++++++++
2 files changed, 20 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 866d5cdeffc7..2e974593f3e8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1524,3 +1524,15 @@ bpf_object__find_map_by_name(struct bpf_object *obj, const char *name)
}
return NULL;
}
+
+struct bpf_map *
+bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset)
+{
+ int i;
+
+ for (i = 0; i < obj->nr_maps; i++) {
+ if (obj->maps[i].offset == offset)
+ return &obj->maps[i];
+ }
+ return ERR_PTR(-ENOENT);
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 0c0b0127e03e..a5a8b86a06fe 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -24,6 +24,7 @@
#include <stdio.h>
#include <stdbool.h>
#include <linux/err.h>
+#include <sys/types.h> // for size_t

enum libbpf_errno {
__LIBBPF_ERRNO__START = 4000,
@@ -200,6 +201,13 @@ struct bpf_map;
struct bpf_map *
bpf_object__find_map_by_name(struct bpf_object *obj, const char *name);

+/*
+ * Get bpf_map through the offset of corresponding struct bpf_map_def
+ * in the bpf object file.
+ */
+struct bpf_map *
+bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset);
+
struct bpf_map *
bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
#define bpf_map__for_each(pos, obj) \
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Wang Nan <wang...@huawei.com>

Similar to other classes defined in libbpf.h (map and program), allow
the 'object' class to have its own private data.
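
The ownership rule this implies is that the clear callback runs whenever the private data is replaced and once more when the object is closed. A self-contained sketch of that rule follows; struct obj_sketch, obj_set_priv(), obj_close() and count_clear() are hypothetical names for illustration, not the libbpf API.

```c
#include <stddef.h>

struct obj_sketch;
typedef void (*clear_priv_fn)(struct obj_sketch *obj, void *priv);

/* Hypothetical object carrying caller-owned private data. */
struct obj_sketch {
	void *priv;
	clear_priv_fn clear_priv;
};

/* Attach new private data, releasing any previous data via its callback. */
static int obj_set_priv(struct obj_sketch *obj, void *priv,
			clear_priv_fn clear_priv)
{
	if (obj->priv && obj->clear_priv)
		obj->clear_priv(obj, obj->priv);

	obj->priv = priv;
	obj->clear_priv = clear_priv;
	return 0;
}

/* Mirror of the close path: run the callback before the object goes away. */
static void obj_close(struct obj_sketch *obj)
{
	if (obj->clear_priv)
		obj->clear_priv(obj, obj->priv);
	obj->priv = NULL;
	obj->clear_priv = NULL;
}

/* Example callback that counts invocations instead of freeing memory. */
static int clear_calls;

static void count_clear(struct obj_sketch *obj, void *priv)
{
	(void)obj;
	(void)priv;
	clear_calls++;
}
```

In real use the callback would typically free whatever priv points to, which is why replacing the data without a callback leaves the old pointer for the caller to manage.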

Signed-off-by: Wang Nan <wang...@huawei.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Cc: He Kuang <hek...@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/libbpf.c | 23 +++++++++++++++++++++++
tools/lib/bpf/libbpf.h | 5 +++++
2 files changed, 28 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 96a2b2ff1212..866d5cdeffc7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -229,6 +229,10 @@ struct bpf_object {
* all objects.
*/
struct list_head list;
+
+ void *priv;
+ bpf_object_clear_priv_t clear_priv;
+
char path[];
};
#define obj_elf_valid(o) ((o)->efile.elf)
@@ -1229,6 +1233,9 @@ void bpf_object__close(struct bpf_object *obj)
if (!obj)
return;

+ if (obj->clear_priv)
+ obj->clear_priv(obj, obj->priv);
+
bpf_object__elf_finish(obj);
bpf_object__unload(obj);

@@ -1282,6 +1289,22 @@ unsigned int bpf_object__kversion(struct bpf_object *obj)
return obj ? obj->kern_version : 0;
}

+int bpf_object__set_priv(struct bpf_object *obj, void *priv,
+ bpf_object_clear_priv_t clear_priv)
+{
+ if (obj->priv && obj->clear_priv)
+ obj->clear_priv(obj, obj->priv);
+
+ obj->priv = priv;
+ obj->clear_priv = clear_priv;
+ return 0;
+}
+
+void *bpf_object__priv(struct bpf_object *obj)
+{
+ return obj ? obj->priv : ERR_PTR(-EINVAL);
+}
+
struct bpf_program *
bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
{
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index dd7a513efb10..0c0b0127e03e 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -79,6 +79,11 @@ struct bpf_object *bpf_object__next(struct bpf_object *prev);
(pos) != NULL; \
(pos) = (tmp), (tmp) = bpf_object__next(tmp))

+typedef void (*bpf_object_clear_priv_t)(struct bpf_object *, void *);
+int bpf_object__set_priv(struct bpf_object *obj, void *priv,
+ bpf_object_clear_priv_t clear_priv);
+void *bpf_object__priv(struct bpf_object *prog);
+
/* Accessors of bpf_program. */
struct bpf_program;
struct bpf_program *bpf_program__next(struct bpf_program *prog,
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <dsa...@gmail.com>

Leverage pid/tid filtering done by symbol_conf hooks.

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Link: http://lkml.kernel.org/r/1480091392-35645-1...@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-trace.c | 49 +++++++++-------------------------------------
1 file changed, 9 insertions(+), 40 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 5f45166c892d..206bf72b77fc 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -74,8 +74,6 @@ struct trace {
size_t nr;
int *entries;
} ev_qualifier_ids;
- struct intlist *tid_list;
- struct intlist *pid_list;
struct {
size_t nr;
pid_t *entries;
@@ -1890,18 +1888,6 @@ static int trace__pgfault(struct trace *trace,
return err;
}

-static bool skip_sample(struct trace *trace, struct perf_sample *sample)
-{
- if ((trace->pid_list && intlist__find(trace->pid_list, sample->pid)) ||
- (trace->tid_list && intlist__find(trace->tid_list, sample->tid)))
- return false;
-
- if (trace->pid_list || trace->tid_list)
- return true;
-
- return false;
-}
-
static void trace__set_base_time(struct trace *trace,
struct perf_evsel *evsel,
struct perf_sample *sample)
@@ -1926,11 +1912,13 @@ static int trace__process_sample(struct perf_tool *tool,
struct machine *machine __maybe_unused)
{
struct trace *trace = container_of(tool, struct trace, tool);
+ struct thread *thread;
int err = 0;

tracepoint_handler handler = evsel->handler;

- if (skip_sample(trace, sample))
+ thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ if (thread && thread__is_filtered(thread))
return 0;

trace__set_base_time(trace, evsel, sample);
@@ -1943,27 +1931,6 @@ static int trace__process_sample(struct perf_tool *tool,
return err;
}

-static int parse_target_str(struct trace *trace)
-{
- if (trace->opts.target.pid) {
- trace->pid_list = intlist__new(trace->opts.target.pid);
- if (trace->pid_list == NULL) {
- pr_err("Error parsing process id string\n");
- return -EINVAL;
- }
- }
-
- if (trace->opts.target.tid) {
- trace->tid_list = intlist__new(trace->opts.target.tid);
- if (trace->tid_list == NULL) {
- pr_err("Error parsing thread id string\n");
- return -EINVAL;
- }
- }
-
- return 0;
-}
-
static int trace__record(struct trace *trace, int argc, const char **argv)
{
unsigned int rec_argc, i, j;
@@ -2460,6 +2427,12 @@ static int trace__replay(struct trace *trace)
if (session == NULL)
return -1;

+ if (trace->opts.target.pid)
+ symbol_conf.pid_list_str = strdup(trace->opts.target.pid);
+
+ if (trace->opts.target.tid)
+ symbol_conf.tid_list_str = strdup(trace->opts.target.tid);
+
if (symbol__init(&session->header.env) < 0)
goto out;

@@ -2503,10 +2476,6 @@ static int trace__replay(struct trace *trace)
evsel->handler = trace__pgfault;
}

- err = parse_target_str(trace);
- if (err != 0)
- goto out;
-
setup_pager();

err = perf_session__process_events(session);
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <dsa...@gmail.com>

Add handlers for the sched:sched_migrate_task event. The total number of
migrations is added to the summary display, and -M/--migrations can be
used to show migration events.
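
As a rough picture of what a migration line contains, the sketch below builds the CPU-visual column and the trailing text separately. cpu_visual_row() and format_migration() are hypothetical helpers that write into buffers instead of calling printf() directly, and the comm string in the usage note is made up.

```c
#include <stdio.h>

/*
 * Fill buf with one column per CPU, marking the CPU that recorded the
 * migration sample with 'm'. buf must hold at least max_cpus + 1 bytes.
 */
static void cpu_visual_row(char *buf, unsigned int max_cpus,
			   unsigned int sample_cpu)
{
	unsigned int i;

	for (i = 0; i < max_cpus; ++i)
		buf[i] = (i == sample_cpu) ? 'm' : ' ';
	buf[max_cpus] = '\0';
}

/* Format the trailing part of the line: migrated task and CPU change. */
static int format_migration(char *buf, size_t len, const char *comm,
			    unsigned int ocpu, unsigned int dcpu)
{
	return snprintf(buf, len, "migrated: %s cpu %u => %u", comm, ocpu, dcpu);
}
```

For example, on a 4-CPU system a sample taken on CPU 2 yields the row "  m ", and a task named "usleep[1234]" moving from CPU 1 to CPU 3 yields "migrated: usleep[1234] cpu 1 => 3".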

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Link: http://lkml.kernel.org/r/1480091321-35591-1...@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-sched.txt | 4 ++
tools/perf/builtin-sched.c | 97 ++++++++++++++++++++++++++++++++-
2 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index fb9e52d65fca..121c60da03e5 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -128,6 +128,10 @@ OPTIONS for 'perf sched timehist'
--wakeups::
Show wakeup events.

+-M::
+--migrations::
+ Show migration events.
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index a49a032f5b15..4f9e7cba4ebf 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -203,6 +203,7 @@ struct perf_sched {
unsigned int max_stack;
bool show_cpu_visual;
bool show_wakeups;
+ bool show_migrations;
u64 skipped_samples;
};

@@ -216,6 +217,8 @@ struct thread_runtime {

struct stats run_stats;
u64 total_run_time;
+
+ u64 migrations;
};

/* per event run time data */
@@ -2197,6 +2200,87 @@ static int timehist_sched_wakeup_event(struct perf_tool *tool,
return 0;
}

+static void timehist_print_migration_event(struct perf_sched *sched,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine,
+ struct thread *migrated)
+{
+ struct thread *thread;
+ char tstr[64];
+ u32 max_cpus = sched->max_cpu + 1;
+ u32 ocpu, dcpu;
+
+ if (sched->summary_only)
+ return;
+
+ max_cpus = sched->max_cpu + 1;
+ ocpu = perf_evsel__intval(evsel, sample, "orig_cpu");
+ dcpu = perf_evsel__intval(evsel, sample, "dest_cpu");
+
+ thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+ if (thread == NULL)
+ return;
+
+ if (timehist_skip_sample(sched, thread) &&
+ timehist_skip_sample(sched, migrated)) {
+ return;
+ }
+
+ timestamp__scnprintf_usec(sample->time, tstr, sizeof(tstr));
+ printf("%15s [%04d] ", tstr, sample->cpu);
+
+ if (sched->show_cpu_visual) {
+ u32 i;
+ char c;
+
+ printf(" ");
+ for (i = 0; i < max_cpus; ++i) {
+ c = (i == sample->cpu) ? 'm' : ' ';
+ printf("%c", c);
+ }
+ printf(" ");
+ }
+
+ printf(" %-*s ", comm_width, timehist_get_commstr(thread));
+
+ /* dt spacer */
+ printf(" %9s %9s %9s ", "", "", "");
+
+ printf("migrated: %s", timehist_get_commstr(migrated));
+ printf(" cpu %d => %d", ocpu, dcpu);
+
+ printf("\n");
+}
+
+static int timehist_migrate_task_event(struct perf_tool *tool,
+ union perf_event *event __maybe_unused,
+ struct perf_evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
+ struct thread *thread;
+ struct thread_runtime *tr = NULL;
+ /* want pid of migrated task not pid in sample */
+ const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+
+ thread = machine__findnew_thread(machine, 0, pid);
+ if (thread == NULL)
+ return -1;
+
+ tr = thread__get_runtime(thread);
+ if (tr == NULL)
+ return -1;
+
+ tr->migrations++;
+
+ /* show migrations if requested */
+ timehist_print_migration_event(sched, evsel, sample, machine, thread);
+
+ return 0;
+}
+
static int timehist_sched_change_event(struct perf_tool *tool,
union perf_event *event,
struct perf_evsel *evsel,
@@ -2295,6 +2379,7 @@ static void print_thread_runtime(struct thread *t,
print_sched_time(r->run_stats.max, 6);
printf(" ");
printf("%5.2f", stddev);
+ printf(" %5" PRIu64, r->migrations);
printf("\n");
}

@@ -2356,10 +2441,10 @@ static void timehist_print_summary(struct perf_sched *sched,

printf("\nRuntime summary\n");
printf("%*s parent sched-in ", comm_width, "comm");
- printf(" run-time min-run avg-run max-run stddev\n");
+ printf(" run-time min-run avg-run max-run stddev migrations\n");
printf("%*s (count) ", comm_width, "");
printf(" (msec) (msec) (msec) (msec) %%\n");
- printf("%.105s\n", graph_dotted_line);
+ printf("%.117s\n", graph_dotted_line);

machine__for_each_thread(m, show_thread_runtime, &totals);
task_count = totals.task_count;
@@ -2460,6 +2545,9 @@ static int perf_sched__timehist(struct perf_sched *sched)
{ "sched:sched_wakeup", timehist_sched_wakeup_event, },
{ "sched:sched_wakeup_new", timehist_sched_wakeup_event, },
};
+ const struct perf_evsel_str_handler migrate_handlers[] = {
+ { "sched:sched_migrate_task", timehist_migrate_task_event, },
+ };
struct perf_data_file file = {
.path = input_name,
.mode = PERF_DATA_MODE_READ,
@@ -2507,6 +2595,10 @@ static int perf_sched__timehist(struct perf_sched *sched)
if (!perf_session__has_traces(session, "record -R"))
goto out;

+ if (sched->show_migrations &&
+ perf_session__set_tracepoints_handlers(session, migrate_handlers))
+ goto out;
+
/* pre-allocate struct for per-CPU idle stats */
sched->max_cpu = session->header.env.nr_cpus_online;
if (sched->max_cpu == 0)
@@ -2903,6 +2995,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('S', "with-summary", &sched.summary,
"Show all syscalls and summary with statistics"),
OPT_BOOLEAN('w', "wakeups", &sched.show_wakeups, "Show wakeup events"),
+ OPT_BOOLEAN('M', "migrations", &sched.show_migrations, "Show migration events"),
OPT_BOOLEAN('V', "cpu-visual", &sched.show_cpu_visual, "Add CPU visual"),
OPT_PARENT(sched_options)
};
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Add a function to parse a user time string of the form <start>,<stop>
where start and stop are times in sec.nsec format. Both start and stop
times are optional.

Add a function to determine if a sample time is within a given time
window of interest.
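
The parsing described above can be sketched without any perf helpers. This is an illustration under stated assumptions, not the code in this patch: parse_sec_frac() and parse_time_window() are hypothetical names, up to nine fractional digits are accepted (so both microsecond and nanosecond input work), and the final stop-before-start check is an extra sanity check added for the sketch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Stand-in for struct perf_time_interval; times in nanoseconds, 0 = unset. */
struct time_window {
	uint64_t start;
	uint64_t end;
};

/* Parse "sec[.frac]" into nanoseconds; at most 9 fractional digits count. */
static int parse_sec_frac(const char *s, uint64_t *ns)
{
	char *end;
	uint64_t sec, frac = 0;
	int digits = 0;

	if (!isdigit((unsigned char)*s))
		return -1;

	sec = strtoull(s, &end, 10);
	if (*end == '.') {
		end++;
		while (isdigit((unsigned char)*end)) {
			if (digits < 9) {
				frac = frac * 10 + (uint64_t)(*end - '0');
				digits++;
			}
			end++;
		}
		for (; digits < 9; digits++)
			frac *= 10; /* scale e.g. "37" up to 370000000 ns */
	}
	if (*end != '\0')
		return -1;

	*ns = sec * SKETCH_NSEC_PER_SEC + frac;
	return 0;
}

/* Parse "<start>,<stop>", "<start>,", ",<stop>" or "," into a window. */
static int parse_time_window(const char *ostr, struct time_window *w)
{
	size_t len;
	char *str, *comma, *stop = NULL;
	int rc = -EINVAL;

	w->start = 0;
	w->end = 0;
	if (ostr == NULL || *ostr == '\0')
		return 0;

	/* copy the string because the comma is overwritten below */
	len = strlen(ostr) + 1;
	str = malloc(len);
	if (str == NULL)
		return -ENOMEM;
	memcpy(str, ostr, len);

	comma = strchr(str, ',');
	if (comma) {
		*comma = '\0';
		stop = comma + 1;
	}

	if (*str != '\0' && parse_sec_frac(str, &w->start) != 0)
		goto out;
	if (stop && *stop != '\0' && parse_sec_frac(stop, &w->end) != 0)
		goto out;
	if (w->end && w->end < w->start)
		goto out; /* sketch-only check: stop must not precede start */
	rc = 0;
out:
	free(str);
	return rc;
}
```

With this sketch, "9693.370064,9693.370076" gives start = 9693370064000 ns and end = 9693370076000 ns, while ",9693.370048" leaves start at 0, meaning "from the beginning of the file".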

Signed-off-by: David Ahern <dsa...@gmail.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-2-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/Build | 1 +
tools/perf/util/time-utils.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/time-utils.h | 12 +++++++
3 files changed, 98 insertions(+)
create mode 100644 tools/perf/util/time-utils.c
create mode 100644 tools/perf/util/time-utils.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index b2a47aac8d1c..bdad82a9812d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -87,6 +87,7 @@ libperf-y += help-unknown-cmd.o
libperf-y += mem-events.o
libperf-y += vsprintf.o
libperf-y += drv_configs.o
+libperf-y += time-utils.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
new file mode 100644
index 000000000000..0443b2afd0cf
--- /dev/null
+++ b/tools/perf/util/time-utils.c
@@ -0,0 +1,85 @@
+#include <string.h>
+#include <sys/time.h>
+#include <time.h>
+#include <errno.h>
+#include <inttypes.h>
+
+#include "perf.h"
+#include "debug.h"
+#include "time-utils.h"
+#include "util.h"
+
+static int parse_timestr_sec_nsec(struct perf_time_interval *ptime,
+ char *start_str, char *end_str)
+{
+ if (start_str && (*start_str != '\0') &&
+ (parse_nsec_time(start_str, &ptime->start) != 0)) {
+ return -1;
+ }
+
+ if (end_str && (*end_str != '\0') &&
+ (parse_nsec_time(end_str, &ptime->end) != 0)) {
+ return -1;
+ }
+
+ return 0;
+}
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+{
+ char *start_str, *end_str;
+ char *d, *str;
+ int rc = 0;
+
+ if (ostr == NULL || *ostr == '\0')
+ return 0;
+
+ /* copy original string because we need to modify it */
+ str = strdup(ostr);
+ if (str == NULL)
+ return -ENOMEM;
+
+ ptime->start = 0;
+ ptime->end = 0;
+
+ /* str has the format: <start>,<stop>
+ * variations: <start>,
+ * ,<stop>
+ * ,
+ */
+ start_str = str;
+ d = strchr(start_str, ',');
+ if (d) {
+ *d = '\0';
+ ++d;
+ }
+ end_str = d;
+
+ rc = parse_timestr_sec_nsec(ptime, start_str, end_str);
+
+ free(str);
+
+ /* make sure end time is after start time if it was given */
+ if (rc == 0 && ptime->end && ptime->end < ptime->start)
+ return -EINVAL;
+
+ pr_debug("start time %" PRIu64 ", ", ptime->start);
+ pr_debug("end time %" PRIu64 "\n", ptime->end);
+
+ return rc;
+}
+
+bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
+{
+ /* if time is not set don't drop sample */
+ if (timestamp == 0)
+ return false;
+
+ /* otherwise compare sample time to time window */
+ if ((ptime->start && timestamp < ptime->start) ||
+ (ptime->end && timestamp > ptime->end)) {
+ return true;
+ }
+
+ return false;
+}
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
new file mode 100644
index 000000000000..8f3e0e370be8
--- /dev/null
+++ b/tools/perf/util/time-utils.h
@@ -0,0 +1,12 @@
+#ifndef _TIME_UTILS_H_
+#define _TIME_UTILS_H_
+
+struct perf_time_interval {
+ u64 start, end;
+};
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);
+
+bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
+
+#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Having "test" in almost all test descriptions is redundant; simplify by
removing the word and rewording the affected descriptions.

End result:
Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-rx2lbfcrri...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/x86/tests/arch-tests.c | 10 ++--
tools/perf/tests/bpf.c | 6 +--
tools/perf/tests/builtin-test.c | 94 +++++++++++++++++-----------------
tools/perf/tests/llvm.c | 8 +--
4 files changed, 59 insertions(+), 59 deletions(-)

diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index 2218cb64f840..99d66191e56c 100644
--- a/tools/perf/arch/x86/tests/arch-tests.c
+++ b/tools/perf/arch/x86/tests/arch-tests.c
@@ -4,27 +4,27 @@

struct test arch_tests[] = {
{
- .desc = "x86 rdpmc test",
+ .desc = "x86 rdpmc",
.func = test__rdpmc,
},
{
- .desc = "Test converting perf time to TSC",
+ .desc = "Convert perf time to TSC",
.func = test__perf_time_to_tsc,
},
#ifdef HAVE_DWARF_UNWIND_SUPPORT
{
- .desc = "Test dwarf unwind",
+ .desc = "DWARF unwind",
.func = test__dwarf_unwind,
},
#endif
#ifdef HAVE_AUXTRACE_SUPPORT
{
- .desc = "Test x86 instruction decoder - new instructions",
+ .desc = "x86 instruction decoder - new instructions",
.func = test__insn_x86,
},
#endif
{
- .desc = "Test intel cqm nmi context read",
+ .desc = "Intel cqm nmi context read",
.func = test__intel_cqm_count_nmi_context,
},
{
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 8f0298aff222..92343f43e44a 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -57,7 +57,7 @@ static struct {
} bpf_testcase_table[] = {
{
LLVM_TESTCASE_BASE,
- "Test basic BPF filtering",
+ "Basic BPF filtering",
"[basic_bpf_test]",
"fix 'perf test LLVM' first",
"load bpf object failed",
@@ -67,7 +67,7 @@ static struct {
#ifdef HAVE_BPF_PROLOGUE
{
LLVM_TESTCASE_BPF_PROLOGUE,
- "Test BPF prologue generation",
+ "BPF prologue generation",
"[bpf_prologue_test]",
"fix kbuild first",
"check your vmlinux setting?",
@@ -77,7 +77,7 @@ static struct {
#endif
{
LLVM_TESTCASE_BPF_RELOCATION,
- "Test BPF relocation checker",
+ "BPF relocation checker",
"[bpf_relocation_test]",
"fix 'perf test LLVM' first",
"libbpf error when dealing with relocation",
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index dab83f7042fa..d1bec0444be7 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -28,119 +28,119 @@ static struct test generic_tests[] = {
.func = test__vmlinux_matches_kallsyms,
},
{
- .desc = "detect openat syscall event",
+ .desc = "Detect openat syscall event",
.func = test__openat_syscall_event,
},
{
- .desc = "detect openat syscall event on all cpus",
+ .desc = "Detect openat syscall event on all cpus",
.func = test__openat_syscall_event_on_all_cpus,
},
{
- .desc = "read samples using the mmap interface",
+ .desc = "Read samples using the mmap interface",
.func = test__basic_mmap,
},
{
- .desc = "parse events tests",
+ .desc = "Parse event definition strings",
.func = test__parse_events,
},
{
- .desc = "Validate PERF_RECORD_* events & perf_sample fields",
+ .desc = "PERF_RECORD_* events & perf_sample fields",
.func = test__PERF_RECORD,
},
{
- .desc = "Test perf pmu format parsing",
+ .desc = "Parse perf pmu format",
.func = test__pmu,
},
{
- .desc = "Test dso data read",
+ .desc = "DSO data read",
.func = test__dso_data,
},
{
- .desc = "Test dso data cache",
+ .desc = "DSO data cache",
.func = test__dso_data_cache,
},
{
- .desc = "Test dso data reopen",
+ .desc = "DSO data reopen",
.func = test__dso_data_reopen,
},
{
- .desc = "roundtrip evsel->name check",
+ .desc = "Roundtrip evsel->name",
.func = test__perf_evsel__roundtrip_name_test,
},
{
- .desc = "Check parsing of sched tracepoints fields",
+ .desc = "Parse sched tracepoints fields",
.func = test__perf_evsel__tp_sched_test,
},
{
- .desc = "Generate and check syscalls:sys_enter_openat event fields",
+ .desc = "syscalls:sys_enter_openat event fields",
.func = test__syscall_openat_tp_fields,
},
{
- .desc = "struct perf_event_attr setup",
+ .desc = "Setup struct perf_event_attr",
.func = test__attr,
},
{
- .desc = "Test matching and linking multiple hists",
+ .desc = "Match and link multiple hists",
.func = test__hists_link,
},
{
- .desc = "Try 'import perf' in python, checking link problems",
+ .desc = "'import perf' in python",
.func = test__python_use,
},
{
- .desc = "Test breakpoint overflow signal handler",
+ .desc = "Breakpoint overflow signal handler",
.func = test__bp_signal,
},
{
- .desc = "Test breakpoint overflow sampling",
+ .desc = "Breakpoint overflow sampling",
.func = test__bp_signal_overflow,
},
{
- .desc = "Test number of exit event of a simple workload",
+ .desc = "Number of exit events of a simple workload",
.func = test__task_exit,
},
{
- .desc = "Test software clock events have valid period values",
+ .desc = "Software clock events period values",
.func = test__sw_clock_freq,
},
{
- .desc = "Test object code reading",
+ .desc = "Object code reading",
.func = test__code_reading,
},
{
- .desc = "Test sample parsing",
+ .desc = "Sample parsing",
.func = test__sample_parsing,
},
{
- .desc = "Test using a dummy software event to keep tracking",
+ .desc = "Use a dummy software event to keep tracking",
.func = test__keep_tracking,
},
{
- .desc = "Test parsing with no sample_id_all bit set",
+ .desc = "Parse with no sample_id_all bit set",
.func = test__parse_no_sample_id_all,
},
{
- .desc = "Test filtering hist entries",
+ .desc = "Filter hist entries",
.func = test__hists_filter,
},
{
- .desc = "Test mmap thread lookup",
+ .desc = "Lookup mmap thread",
.func = test__mmap_thread_lookup,
},
{
- .desc = "Test thread mg sharing",
+ .desc = "Share thread mg",
.func = test__thread_mg_share,
},
{
- .desc = "Test output sorting of hist entries",
+ .desc = "Sort output of hist entries",
.func = test__hists_output,
},
{
- .desc = "Test cumulation of child hist entries",
+ .desc = "Cumulate child hist entries",
.func = test__hists_cumulate,
},
{
- .desc = "Test tracking with sched_switch",
+ .desc = "Track with sched_switch",
.func = test__switch_tracking,
},
{
@@ -152,15 +152,15 @@ static struct test generic_tests[] = {
.func = test__fdarray__add,
},
{
- .desc = "Test kmod_path__parse function",
+ .desc = "kmod_path__parse",
.func = test__kmod_path__parse,
},
{
- .desc = "Test thread map",
+ .desc = "Thread map",
.func = test__thread_map,
},
{
- .desc = "Test LLVM searching and compiling",
+ .desc = "LLVM search and compile",
.func = test__llvm,
.subtest = {
.skip_if_fail = true,
@@ -169,11 +169,11 @@ static struct test generic_tests[] = {
},
},
{
- .desc = "Test topology in session",
+ .desc = "Session topology",
.func = test_session_topology,
},
{
- .desc = "Test BPF filter",
+ .desc = "BPF filter",
.func = test__bpf,
.subtest = {
.skip_if_fail = true,
@@ -182,55 +182,55 @@ static struct test generic_tests[] = {
},
},
{
- .desc = "Test thread map synthesize",
+ .desc = "Synthesize thread map",
.func = test__thread_map_synthesize,
},
{
- .desc = "Test cpu map synthesize",
+ .desc = "Synthesize cpu map",
.func = test__cpu_map_synthesize,
},
{
- .desc = "Test stat config synthesize",
+ .desc = "Synthesize stat config",
.func = test__synthesize_stat_config,
},
{
- .desc = "Test stat synthesize",
+ .desc = "Synthesize stat",
.func = test__synthesize_stat,
},
{
- .desc = "Test stat round synthesize",
+ .desc = "Synthesize stat round",
.func = test__synthesize_stat_round,
},
{
- .desc = "Test attr update synthesize",
+ .desc = "Synthesize attr update",
.func = test__event_update,
},
{
- .desc = "Test events times",
+ .desc = "Event times",
.func = test__event_times,
},
{
- .desc = "Test backward reading from ring buffer",
+ .desc = "Read backward ring buffer",
.func = test__backward_ring_buffer,
},
{
- .desc = "Test cpu map print",
+ .desc = "Print cpu map",
.func = test__cpu_map_print,
},
{
- .desc = "Test SDT event probing",
+ .desc = "Probe SDT events",
.func = test__sdt_event,
},
{
- .desc = "Test is_printable_array function",
+ .desc = "is_printable_array",
.func = test__is_printable_array,
},
{
- .desc = "Test bitmap print",
+ .desc = "Print bitmap",
.func = test__bitmap_print,
},
{
- .desc = "Test perf hooks",
+ .desc = "perf hooks",
.func = test__perf_hooks,
},
{
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index b798a4bfd238..02a33ebcd992 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -34,19 +34,19 @@ static struct {
} bpf_source_table[__LLVM_TESTCASE_MAX] = {
[LLVM_TESTCASE_BASE] = {
.source = test_llvm__bpf_base_prog,
- .desc = "Basic BPF llvm compiling test",
+ .desc = "Basic BPF llvm compile",
},
[LLVM_TESTCASE_KBUILD] = {
.source = test_llvm__bpf_test_kbuild_prog,
- .desc = "Test kbuild searching",
+ .desc = "kbuild searching",
},
[LLVM_TESTCASE_BPF_PROLOGUE] = {
.source = test_llvm__bpf_test_prologue_prog,
- .desc = "Compile source for BPF prologue generation test",
+ .desc = "Compile source for BPF prologue generation",
},
[LLVM_TESTCASE_BPF_RELOCATION] = {
.source = test_llvm__bpf_test_relocation,
- .desc = "Compile source for BPF relocation test",
+ .desc = "Compile source for BPF relocation",
.should_load_fail = true,
},
};
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Wang Nan <wang...@huawei.com>

Add more BPF map operations to libbpf. Also add bpf_obj_{pin,get}(). They
can be used not only on BPF maps but also on BPF programs.

Signed-off-by: Wang Nan <wang...@huawei.com>
Acked-by: Alexei Starovoitov <a...@kernel.org>
Cc: He Kuang <hek...@huawei.com>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/bpf.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++
tools/lib/bpf/bpf.h | 7 +++++++
2 files changed, 63 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 4212ed62235b..8143536b462a 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -110,3 +110,59 @@ int bpf_map_update_elem(int fd, void *key, void *value,

return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}
+
+int bpf_map_lookup_elem(int fd, void *key, void *value)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.map_fd = fd;
+ attr.key = ptr_to_u64(key);
+ attr.value = ptr_to_u64(value);
+
+ return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
+}
+
+int bpf_map_delete_elem(int fd, void *key)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.map_fd = fd;
+ attr.key = ptr_to_u64(key);
+
+ return sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
+}
+
+int bpf_map_get_next_key(int fd, void *key, void *next_key)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.map_fd = fd;
+ attr.key = ptr_to_u64(key);
+ attr.next_key = ptr_to_u64(next_key);
+
+ return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
+}
+
+int bpf_obj_pin(int fd, const char *pathname)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.pathname = ptr_to_u64((void *)pathname);
+ attr.bpf_fd = fd;
+
+ return sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr));
+}
+
+int bpf_obj_get(const char *pathname)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.pathname = ptr_to_u64((void *)pathname);
+
+ return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index e8ba54087497..253c3dbb06b4 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -35,4 +35,11 @@ int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,

int bpf_map_update_elem(int fd, void *key, void *value,
u64 flags);
+
+int bpf_map_lookup_elem(int fd, void *key, void *value);
+int bpf_map_delete_elem(int fd, void *key);
+int bpf_map_get_next_key(int fd, void *key, void *next_key);
+int bpf_obj_pin(int fd, const char *pathname);
+int bpf_obj_get(const char *pathname);
+
#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Allow the user to specify a list of symbols; when one of them is reached,
the callchain dump stops at that symbol.
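
The stop condition can be sketched in isolation. The snippet is a minimal
illustration, not the perf code: frames are plain strings ordered from the
innermost caller outwards, and the demo_* names are hypothetical. As in the
patch, the matching frame itself is still printed before the walk ends.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 when name is one of the n entries in list. */
static int demo_in_list(const char *name, const char *const *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(name, list[i]) == 0)
			return 1;
	}
	return 0;
}

/*
 * Print one frame per line and stop right after the first frame whose
 * name is in the stop list. Returns the number of frames printed.
 */
static int demo_print_callchain(FILE *fp,
				const char *const *frames, size_t nr_frames,
				const char *const *stop, size_t nr_stop)
{
	int printed = 0;

	assert(fp != NULL);
	for (size_t i = 0; i < nr_frames; i++) {
		fprintf(fp, "\t%s\n", frames[i]);
		printed++;
		if (nr_stop && demo_in_list(frames[i], stop, nr_stop))
			break;
	}
	return printed;
}
```

Passing an empty stop list prints the whole chain, which corresponds to
running 'perf script' without --stop-bt.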

Committer notes:

Testing it:

# perf record -ag usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
#
# # Without it:
#
# perf script
swapper 0 [000] 9693.370039: 1 cycles:ppp:
2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
137ffeb start_kernel ([kernel.vmlinux].init.text)
137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)

swapper 0 [000] 9693.370044: 1 cycles:ppp:
20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
137ffeb start_kernel ([kernel.vmlinux].init.text)
137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
#
# # Using it to see just what are the calls from the 'remote_function' function:
#
# perf script --stop-bt remote_function
swapper 0 [000] 9693.370039: 1 cycles:ppp:
2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

swapper 0 [000] 9693.370044: 1 cycles:ppp:
20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

Signed-off-by: David Ahern <d...@cumulusnetworks.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480104021-36275-1-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 3 +++
tools/perf/builtin-script.c | 2 ++
tools/perf/util/evsel_fprintf.c | 8 ++++++++
tools/perf/util/symbol.c | 8 ++++++++
tools/perf/util/symbol.h | 6 ++++--
5 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index c01904f388ce..0f6ee09f7256 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -212,6 +212,9 @@ OPTIONS
--hide-call-graph::
When printing symbols do not display call chain.

+--stop-bt::
+ Stop display of callgraph at these symbols
+
-C::
--cpu:: Only report samples for the list of CPUs provided. Multiple CPUs can
be provided as a comma-separated list with no space: 0,1. Ranges of
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e1daff36d070..066b4bf73780 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2151,6 +2151,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
"only consider these symbols"),
+ OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
+ "Stop display of callgraph at these symbols"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
OPT_STRING('c', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
"only display events for these comms"),
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index 5a6f52284452..6b2925542c0a 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -166,6 +166,14 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
if (!print_oneline)
printed += fprintf(fp, "\n");

+ if (symbol_conf.bt_stop_list &&
+ node->sym &&
+ node->sym->name &&
+ strlist__has_entry(symbol_conf.bt_stop_list,
+ node->sym->name)) {
+ break;
+ }
+
first = false;
next:
callchain_cursor_advance(cursor);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 420ada9de22f..df2482b2ba45 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2032,6 +2032,10 @@ int symbol__init(struct perf_env *env)
symbol_conf.sym_list_str, "symbol") < 0)
goto out_free_tid_list;

+ if (setup_list(&symbol_conf.bt_stop_list,
+ symbol_conf.bt_stop_list_str, "symbol") < 0)
+ goto out_free_sym_list;
+
/*
* A path to symbols of "/" is identical to ""
* reset here for simplicity.
@@ -2049,6 +2053,8 @@ int symbol__init(struct perf_env *env)
symbol_conf.initialized = true;
return 0;

+out_free_sym_list:
+ strlist__delete(symbol_conf.sym_list);
out_free_tid_list:
intlist__delete(symbol_conf.tid_list);
out_free_pid_list:
@@ -2064,6 +2070,7 @@ void symbol__exit(void)
{
if (!symbol_conf.initialized)
return;
+ strlist__delete(symbol_conf.bt_stop_list);
strlist__delete(symbol_conf.sym_list);
strlist__delete(symbol_conf.dso_list);
strlist__delete(symbol_conf.comm_list);
@@ -2071,6 +2078,7 @@ void symbol__exit(void)
intlist__delete(symbol_conf.pid_list);
vmlinux_path__exit();
symbol_conf.sym_list = symbol_conf.dso_list = symbol_conf.comm_list = NULL;
+ symbol_conf.bt_stop_list = NULL;
symbol_conf.initialized = false;
}

diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1bcbefc0c325..6c358b7ed336 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -132,14 +132,16 @@ struct symbol_conf {
*pid_list_str,
*tid_list_str,
*sym_list_str,
- *col_width_list_str;
+ *col_width_list_str,
+ *bt_stop_list_str;
struct strlist *dso_list,
*comm_list,
*sym_list,
*dso_from_list,
*dso_to_list,
*sym_from_list,
- *sym_to_list;
+ *sym_to_list,
+ *bt_stop_list;
struct intlist *pid_list,
*tid_list;
const char *symfs;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: David Ahern <d...@cumulusnetworks.com>

Track freed memory as well as allocations and show the net in the
summary.

Committer notes:

Testing it:

# perf kmem record usleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.626 MB perf.data (4208 samples) ]
[root@jouet ~]# perf kmem stat --slab

SUMMARY (SLAB allocator)
========================
Total bytes requested: 234,011
Total bytes allocated: 234,504
Total bytes freed: 213,328 <------
Net total bytes allocated: 21,176
Total bytes wasted on internal fragmentation: 493
Internal fragmentation: 0.210231%
Cross CPU allocations: 4/1,963
#

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480110133-37039-1-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-kmem.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index d426dcb18ce9..7fd6f1e1e293 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -49,6 +49,7 @@ struct alloc_stat {
u64 ptr;
u64 bytes_req;
u64 bytes_alloc;
+ u64 last_alloc;
u32 hit;
u32 pingpong;

@@ -62,7 +63,7 @@ static struct rb_root root_alloc_sorted;
static struct rb_root root_caller_stat;
static struct rb_root root_caller_sorted;

-static unsigned long total_requested, total_allocated;
+static unsigned long total_requested, total_allocated, total_freed;
static unsigned long nr_allocs, nr_cross_allocs;

static int insert_alloc_stat(unsigned long call_site, unsigned long ptr,
@@ -105,6 +106,8 @@ static int insert_alloc_stat(unsigned long call_site, unsigned long ptr,
}
data->call_site = call_site;
data->alloc_cpu = cpu;
+ data->last_alloc = bytes_alloc;
+
return 0;
}

@@ -223,6 +226,8 @@ static int perf_evsel__process_free_event(struct perf_evsel *evsel,
if (!s_alloc)
return 0;

+ total_freed += s_alloc->last_alloc;
+
if ((short)sample->cpu != s_alloc->alloc_cpu) {
s_alloc->pingpong++;

@@ -1128,6 +1133,11 @@ static void print_slab_summary(void)
printf("\n========================\n");
printf("Total bytes requested: %'lu\n", total_requested);
printf("Total bytes allocated: %'lu\n", total_allocated);
+ printf("Total bytes freed: %'lu\n", total_freed);
+ if (total_allocated > total_freed) {
+ printf("Net total bytes allocated: %'lu\n",
+ total_allocated - total_freed);
+ }
printf("Total bytes wasted on internal fragmentation: %'lu\n",
total_allocated - total_requested);
printf("Internal fragmentation: %f%%\n",
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM
From: Wang Nan <wang...@huawei.com>

Perf hooks allow hooking user code at perf events. They can be used for
manipulating BPF maps, taking snapshots and reporting results. This
patch introduces two perf hook points: record_start and record_end.

To guard against buggy user actions, a SIGSEGV signal handler is introduced
into 'perf record'. It turns off a perf hook that causes a segfault and
reports an error to help debugging.
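
One way such a recovery can work is with sigsetjmp()/siglongjmp(): the
invoker records a jump point before calling the hook, and the SIGSEGV
handler jumps back to it. The sketch below illustrates that technique only;
it is an assumption about a possible mechanism, not a copy of perf-hooks.c,
and every demo_* name is hypothetical. The buggy hook uses raise(SIGSEGV)
in place of a real invalid access so that the example stays well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

typedef void (*demo_hook_fn)(void);

static demo_hook_fn demo_hook;			/* registered hook, or NULL */
static sigjmp_buf demo_jmpbuf;			/* where to resume after a fault */
static volatile sig_atomic_t demo_in_hook;	/* set while a hook is running */
static volatile sig_atomic_t demo_hook_ran;

static void demo_sigsegv_handler(int sig)
{
	(void)sig;
	if (demo_in_hook)
		siglongjmp(demo_jmpbuf, 1);	/* fault came from the hook: recover */
	/* fault came from elsewhere: fall back to the default action */
	signal(SIGSEGV, SIG_DFL);
	raise(SIGSEGV);
}

static int demo_install_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = demo_sigsegv_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}

static void demo_set_hook(demo_hook_fn fn)
{
	demo_hook = fn;
}

/* Run the hook. Returns 0 if it completed (or none is set), -1 if it faulted. */
static int demo_invoke_hook(void)
{
	if (demo_hook == NULL)
		return 0;
	if (sigsetjmp(demo_jmpbuf, 1)) {
		/* reached via siglongjmp(): disable the faulty hook */
		demo_in_hook = 0;
		demo_hook = NULL;
		return -1;
	}
	demo_in_hook = 1;
	demo_hook();
	demo_in_hook = 0;
	return 0;
}

/* Stand-in for a buggy user hook. */
static void demo_buggy_hook(void)
{
	demo_hook_ran = 1;
	raise(SIGSEGV);
}
```

Saving the signal mask in sigsetjmp() (second argument 1) matters here:
jumping out of the handler restores the mask, so SIGSEGV is not left
blocked for the rest of the run.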

A test case for perf hook is introduced.

Test result:
$ ./buildperf/perf test -v hook
50: Test perf hooks :
--- start ---
test child forked, pid 10311
SIGSEGV is observed as expected, try to recover.
Fatal error (SEGFAULT) in perf hook 'test'
test child finished with 0
---- end ----
Test perf hooks: Ok

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-record.c | 11 +++++
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 ++
tools/perf/tests/perf-hooks.c | 44 ++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/util/Build | 2 +
tools/perf/util/perf-hooks-list.h | 3 ++
tools/perf/util/perf-hooks.c | 84 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/perf-hooks.h | 37 +++++++++++++++++
9 files changed, 187 insertions(+)
create mode 100644 tools/perf/tests/perf-hooks.c
create mode 100644 tools/perf/util/perf-hooks-list.h
create mode 100644 tools/perf/util/perf-hooks.c
create mode 100644 tools/perf/util/perf-hooks.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 67d2a9003294..fa26865364b6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
#include "util/llvm-utils.h"
#include "util/bpf-loader.h"
#include "util/trigger.h"
+#include "util/perf-hooks.h"
#include "asm/bug.h"

#include <unistd.h>
@@ -206,6 +207,12 @@ static void sig_handler(int sig)
done = 1;
}

+static void sigsegv_handler(int sig)
+{
+ perf_hooks__recover();
+ sighandler_dump_stack(sig);
+}
+
static void record__sig_exit(void)
{
if (signr == -1)
@@ -833,6 +840,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGCHLD, sig_handler);
signal(SIGINT, sig_handler);
signal(SIGTERM, sig_handler);
+ signal(SIGSEGV, sigsegv_handler);

if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
signal(SIGUSR2, snapshot_sig_handler);
@@ -970,6 +978,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)

trigger_ready(&auxtrace_snapshot_trigger);
trigger_ready(&switch_output_trigger);
+ perf_hooks__invoke_record_start();
for (;;) {
unsigned long long hits = rec->samples;

@@ -1114,6 +1123,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
}

+ perf_hooks__invoke_record_end();
+
if (!err && !quiet) {
char samples[128];
const char *postfix = rec->timestamp_filename ?
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 8a4ce492f7b2..af3ec94869aa 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -42,6 +42,7 @@ perf-y += backward-ring-buffer.o
perf-y += sdt.o
perf-y += is_printable_array.o
perf-y += bitmap.o
+perf-y += perf-hooks.o

$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 778668a2a966..dab83f7042fa 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -230,6 +230,10 @@ static struct test generic_tests[] = {
.func = test__bitmap_print,
},
{
+ .desc = "Test perf hooks",
+ .func = test__perf_hooks,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/perf-hooks.c b/tools/perf/tests/perf-hooks.c
new file mode 100644
index 000000000000..9338cb2c25ab
--- /dev/null
+++ b/tools/perf/tests/perf-hooks.c
@@ -0,0 +1,44 @@
+#include <signal.h>
+#include <stdlib.h>
+
+#include "tests.h"
+#include "debug.h"
+#include "util.h"
+#include "perf-hooks.h"
+
+static void sigsegv_handler(int sig __maybe_unused)
+{
+ pr_debug("SIGSEGV is observed as expected, try to recover.\n");
+ perf_hooks__recover();
+ signal(SIGSEGV, SIG_DFL);
+ raise(SIGSEGV);
+ exit(-1);
+}
+
+static int hook_flags;
+
+static void the_hook(void)
+{
+ int *p = NULL;
+
+ hook_flags = 1234;
+
+ /* Generate a segfault, test perf_hooks__recover */
+ *p = 0;
+}
+
+int test__perf_hooks(int subtest __maybe_unused)
+{
+ signal(SIGSEGV, sigsegv_handler);
+ perf_hooks__set_hook("test", the_hook);
+ perf_hooks__invoke_test();
+
+ /* hook is triggered? */
+ if (hook_flags != 1234)
+ return TEST_FAIL;
+
+ /* the buggy hook is removed? */
+ if (perf_hooks__get_hook("test"))
+ return TEST_FAIL;
+ return TEST_OK;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 7c196c585472..3a1f98f291ba 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -91,6 +91,7 @@ int test__cpu_map_print(int subtest);
int test__sdt_event(int subtest);
int test__is_printable_array(int subtest);
int test__bitmap_print(int subtest);
+int test__perf_hooks(int subtest);

#if defined(__arm__) || defined(__aarch64__)
#ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 1dc67efad634..b2a47aac8d1c 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -123,6 +123,8 @@ libperf-$(CONFIG_LIBELF) += genelf.o
libperf-$(CONFIG_DWARF) += genelf_debug.o
endif

+libperf-y += perf-hooks.o
+
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
# avoid compiler warnings in 32-bit mode
CFLAGS_genelf_debug.o += -Wno-packed
diff --git a/tools/perf/util/perf-hooks-list.h b/tools/perf/util/perf-hooks-list.h
new file mode 100644
index 000000000000..2867c07ee84e
--- /dev/null
+++ b/tools/perf/util/perf-hooks-list.h
@@ -0,0 +1,3 @@
+PERF_HOOK(record_start)
+PERF_HOOK(record_end)
+PERF_HOOK(test)
diff --git a/tools/perf/util/perf-hooks.c b/tools/perf/util/perf-hooks.c
new file mode 100644
index 000000000000..4ce88e37dd63
--- /dev/null
+++ b/tools/perf/util/perf-hooks.c
@@ -0,0 +1,84 @@
+/*
+ * perf_hooks.c
+ *
+ * Copyright (C) 2016 Wang Nan <wang...@huawei.com>
+ * Copyright (C) 2016 Huawei Inc.
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <setjmp.h>
+#include <linux/err.h>
+#include "util/util.h"
+#include "util/debug.h"
+#include "util/perf-hooks.h"
+
+static sigjmp_buf jmpbuf;
+static const struct perf_hook_desc *current_perf_hook;
+
+void perf_hooks__invoke(const struct perf_hook_desc *desc)
+{
+ if (!(desc && desc->p_hook_func && *desc->p_hook_func))
+ return;
+
+ if (sigsetjmp(jmpbuf, 1)) {
+ pr_warning("Fatal error (SEGFAULT) in perf hook '%s'\n",
+ desc->hook_name);
+ *(current_perf_hook->p_hook_func) = NULL;
+ } else {
+ current_perf_hook = desc;
+ (**desc->p_hook_func)();
+ }
+ current_perf_hook = NULL;
+}
+
+void perf_hooks__recover(void)
+{
+ if (current_perf_hook)
+ siglongjmp(jmpbuf, 1);
+}
+
+#define PERF_HOOK(name) \
+perf_hook_func_t __perf_hook_func_##name = NULL; \
+struct perf_hook_desc __perf_hook_desc_##name = \
+ {.hook_name = #name, .p_hook_func = &__perf_hook_func_##name};
+#include "perf-hooks-list.h"
+#undef PERF_HOOK
+
+#define PERF_HOOK(name) \
+ &__perf_hook_desc_##name,
+
+static struct perf_hook_desc *perf_hooks[] = {
+#include "perf-hooks-list.h"
+};
+#undef PERF_HOOK
+
+int perf_hooks__set_hook(const char *hook_name,
+ perf_hook_func_t hook_func)
+{
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(perf_hooks); i++) {
+ if (strcmp(hook_name, perf_hooks[i]->hook_name) != 0)
+ continue;
+
+ if (*(perf_hooks[i]->p_hook_func))
+ pr_warning("Overwrite existing hook: %s\n", hook_name);
+ *(perf_hooks[i]->p_hook_func) = hook_func;
+ return 0;
+ }
+ return -ENOENT;
+}
+
+perf_hook_func_t perf_hooks__get_hook(const char *hook_name)
+{
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(perf_hooks); i++) {
+ if (strcmp(hook_name, perf_hooks[i]->hook_name) != 0)
+ continue;
+
+ return *(perf_hooks[i]->p_hook_func);
+ }
+ return ERR_PTR(-ENOENT);
+}
diff --git a/tools/perf/util/perf-hooks.h b/tools/perf/util/perf-hooks.h
new file mode 100644
index 000000000000..1d482b26b4b9
--- /dev/null
+++ b/tools/perf/util/perf-hooks.h
@@ -0,0 +1,37 @@
+#ifndef PERF_UTIL_PERF_HOOKS_H
+#define PERF_UTIL_PERF_HOOKS_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef void (*perf_hook_func_t)(void);
+struct perf_hook_desc {
+ const char * const hook_name;
+ perf_hook_func_t * const p_hook_func;
+};
+
+extern void perf_hooks__invoke(const struct perf_hook_desc *);
+extern void perf_hooks__recover(void);
+
+#define PERF_HOOK(name) \
+extern struct perf_hook_desc __perf_hook_desc_##name; \
+static inline void perf_hooks__invoke_##name(void) \
+{ \
+ perf_hooks__invoke(&__perf_hook_desc_##name); \
+}
+
+#include "perf-hooks-list.h"
+#undef PERF_HOOK
+
+extern int
+perf_hooks__set_hook(const char *hook_name,
+ perf_hook_func_t hook_func);
+
+extern perf_hook_func_t
+perf_hooks__get_hook(const char *hook_name);
+
+#ifdef __cplusplus
+}
+#endif
+#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:07 PM12/1/16
to
From: David Ahern <d...@cumulusnetworks.com>

Add an option that lets the user control the analysis window, e.g. collect
data for a time window and then analyze only a segment of interest within it.

Committer notes:

Testing it:

# perf sched record -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.593 MB perf.data (25 samples) ]
#
# perf sched timehist | head -18
Samples do not have callchains.
         time    cpu  task name         wait time  sch delay  run time
                      [tid/pid]            (msec)     (msec)    (msec)
------------- ------  ---------------   ---------  ---------  --------
 19818.635579 [0002]  <idle>                0.000      0.000     0.000
 19818.635613 [0000]  perf[9116]            0.000      0.000     0.000
 19818.635676 [0000]  <idle>                0.000      0.000     0.063
 19818.635678 [0000]  rcuos/2[29]           0.000      0.002     0.001
 19818.635696 [0002]  perf[9117]            0.000      0.004     0.116
 19818.635702 [0000]  <idle>                0.001      0.000     0.024
 19818.635709 [0002]  migration/2[25]       0.000      0.003     0.012
 19818.636263 [0000]  usleep[9117]          0.005      0.000     0.560
 19818.636316 [0000]  <idle>                0.560      0.000     0.053
 19818.636358 [0002]  <idle>                0.129      0.000     0.649
 19818.636358 [0000]  usleep[9117]          0.053      0.002     0.042
#

# perf sched timehist --time 19818.635696,
Samples do not have callchains.
         time    cpu  task name         wait time  sch delay  run time
                      [tid/pid]            (msec)     (msec)    (msec)
------------- ------  ---------------   ---------  ---------  --------
 19818.635696 [0002]  perf[9117]            0.000      0.120     0.000
 19818.635702 [0000]  <idle>                0.019      0.000     0.006
 19818.635709 [0002]  migration/2[25]       0.000      0.003     0.012
 19818.636263 [0000]  usleep[9117]          0.005      0.000     0.560
 19818.636316 [0000]  <idle>                0.560      0.000     0.053
 19818.636358 [0002]  <idle>                0.129      0.000     0.649
 19818.636358 [0000]  usleep[9117]          0.053      0.002     0.042
#
# perf sched timehist --time 19818.635696,19818.635709
Samples do not have callchains.
         time    cpu  task name         wait time  sch delay  run time
                      [tid/pid]            (msec)     (msec)    (msec)
------------- ------  ---------------   ---------  ---------  --------
 19818.635696 [0002]  perf[9117]            0.000      0.120     0.000
 19818.635702 [0000]  <idle>                0.019      0.000     0.006
 19818.635709 [0002]  migration/2[25]       0.000      0.003     0.012
 19818.635709 [0000]  usleep[9117]          0.005      0.000     0.006
#

Signed-off-by: David Ahern <dsa...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-5-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-sched.txt | 8 ++++++
tools/perf/builtin-sched.c | 51 +++++++++++++++++++++++++++++----
2 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 121c60da03e5..7775b1eb2bee 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -132,6 +132,14 @@ OPTIONS for 'perf sched timehist'
--migrations::
Show migration events.

+--time::
+ Only analyze samples within given time window: <start>,<stop>. Times
+ have the format seconds.microseconds. If start is not given (i.e., time
+ string is ',x.y') then analysis starts at the beginning of the file. If
+ stop time is not given (i.e, time string is 'x.y,') then analysis goes
+ to end of file.
+
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4f9e7cba4ebf..870d94cd20ba 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -15,6 +15,7 @@
#include "util/color.h"
#include "util/stat.h"
#include "util/callchain.h"
+#include "util/time-utils.h"

#include <subcmd/parse-options.h>
#include "util/trace-event.h"
@@ -205,6 +206,8 @@ struct perf_sched {
bool show_wakeups;
bool show_migrations;
u64 skipped_samples;
+ const char *time_str;
+ struct perf_time_interval ptime;
};

/* per thread run time data */
@@ -1837,13 +1840,14 @@ static void timehist_header(struct perf_sched *sched)
static void timehist_print_sample(struct perf_sched *sched,
struct perf_sample *sample,
struct addr_location *al,
- struct thread *thread)
+ struct thread *thread,
+ u64 t)
{
struct thread_runtime *tr = thread__priv(thread);
u32 max_cpus = sched->max_cpu + 1;
char tstr[64];

- timestamp__scnprintf_usec(sample->time, tstr, sizeof(tstr));
+ timestamp__scnprintf_usec(t, tstr, sizeof(tstr));
printf("%15s [%04d] ", tstr, sample->cpu);

if (sched->show_cpu_visual) {
@@ -2194,7 +2198,8 @@ static int timehist_sched_wakeup_event(struct perf_tool *tool,
tr->ready_to_run = sample->time;

/* show wakeups if requested */
- if (sched->show_wakeups)
+ if (sched->show_wakeups &&
+ !perf_time__skip_sample(&sched->ptime, sample->time))
timehist_print_wakeup_event(sched, sample, machine, thread);

return 0;
@@ -2288,10 +2293,11 @@ static int timehist_sched_change_event(struct perf_tool *tool,
struct machine *machine)
{
struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
+ struct perf_time_interval *ptime = &sched->ptime;
struct addr_location al;
struct thread *thread;
struct thread_runtime *tr = NULL;
- u64 tprev;
+ u64 tprev, t = sample->time;
int rc = 0;

if (machine__resolve(machine, &al, sample) < 0) {
@@ -2318,9 +2324,35 @@ static int timehist_sched_change_event(struct perf_tool *tool,

tprev = perf_evsel__get_time(evsel, sample->cpu);

- timehist_update_runtime_stats(tr, sample->time, tprev);
+ /*
+ * If start time given:
+ * - sample time is under window user cares about - skip sample
+ * - tprev is under window user cares about - reset to start of window
+ */
+ if (ptime->start && ptime->start > t)
+ goto out;
+
+ if (ptime->start > tprev)
+ tprev = ptime->start;
+
+ /*
+ * If end time given:
+ * - previous sched event is out of window - we are done
+ * - sample time is beyond window user cares about - reset it
+ * to close out stats for time window interest
+ */
+ if (ptime->end) {
+ if (tprev > ptime->end)
+ goto out;
+
+ if (t > ptime->end)
+ t = ptime->end;
+ }
+
+ timehist_update_runtime_stats(tr, t, tprev);
+
if (!sched->summary_only)
- timehist_print_sample(sched, sample, &al, thread);
+ timehist_print_sample(sched, sample, &al, thread, t);

out:
if (tr) {
@@ -2583,6 +2615,11 @@ static int perf_sched__timehist(struct perf_sched *sched)

symbol__init(&session->header.env);

+ if (perf_time__parse_str(&sched->ptime, sched->time_str) != 0) {
+ pr_err("Invalid time string\n");
+ return -EINVAL;
+ }
+
if (timehist_check_attr(sched, evlist) != 0)
goto out;

@@ -2997,6 +3034,8 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('w', "wakeups", &sched.show_wakeups, "Show wakeup events"),
OPT_BOOLEAN('M', "migrations", &sched.show_migrations, "Show migration events"),
OPT_BOOLEAN('V', "cpu-visual", &sched.show_cpu_visual, "Add CPU visual"),
+ OPT_STRING(0, "time", &sched.time_str, "str",
+ "Time span for analysis (start,stop)"),
OPT_PARENT(sched_options)
};

--
2.9.3
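
The window handling added to timehist_sched_change_event() above can be read
as a small clamping step: skip events entirely outside the window, otherwise
trim the [tprev, t] span to fit inside it. A minimal standalone sketch of that
step follows. The names struct time_window and clamp_to_window() are
hypothetical and not part of perf; as in the patch, a value of 0 for start or
end is assumed to mean "not given".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_time_interval; 0 means "unset". */
struct time_window {
	uint64_t start, end;
};

/*
 * Clamp the span [*tprev, *t] to the window.
 * Returns false if the event should be skipped, true if the (possibly
 * adjusted) span should be accounted and printed.
 */
static bool clamp_to_window(const struct time_window *w,
			    uint64_t *tprev, uint64_t *t)
{
	/* Sample is before the window: skip it. */
	if (w->start && w->start > *t)
		return false;

	/* Previous event is before the window: count from window start. */
	if (w->start > *tprev)
		*tprev = w->start;

	if (w->end) {
		/* Previous event is already past the window: done. */
		if (*tprev > w->end)
			return false;

		/* Sample is past the window: close the span at window end. */
		if (*t > w->end)
			*t = w->end;
	}
	return true;
}
```

This matches the last test run in the committer notes, where the usleep
sample that originally ended at 19818.636263 is reported at the window end,
19818.635709, with a correspondingly shorter run time.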

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:08 PM12/1/16
to
From: David Ahern <d...@cumulusnetworks.com>

Code move only; no functional change intended.

Committer notes:

Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this
set of auto-detected features:

... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libelf: [ on ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libslang: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ on ]

Where it was failing with:

CC /tmp/build/perf/util/time-utils.o
util/time-utils.c: In function 'parse_nsec_time':
util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
time_sec = strtoul(str, &end, 10);
^
util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
time_sec = strtoul(str, &end, 10);
^
util/time-utils.c: In function 'perf_time__parse_str':
util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
free(str);
^
util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'

Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul()
declarations and fix the build.

Signed-off-by: David Ahern <dsa...@gmail.com>
Acked-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-3-g...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/time-utils.c | 36 +++++++++++++++++++++++++++++++++++-
tools/perf/util/time-utils.h | 2 ++
tools/perf/util/util.c | 33 ---------------------------------
tools/perf/util/util.h | 2 --
4 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 0443b2afd0cf..d1b21c72206d 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -1,5 +1,7 @@
+#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
+#include <linux/time64.h>
#include <time.h>
#include <errno.h>
#include <inttypes.h>
@@ -7,7 +9,39 @@
#include "perf.h"
#include "debug.h"
#include "time-utils.h"
-#include "util.h"
+
+int parse_nsec_time(const char *str, u64 *ptime)
+{
+ u64 time_sec, time_nsec;
+ char *end;
+
+ time_sec = strtoul(str, &end, 10);
+ if (*end != '.' && *end != '\0')
+ return -1;
+
+ if (*end == '.') {
+ int i;
+ char nsec_buf[10];
+
+ if (strlen(++end) > 9)
+ return -1;
+
+ strncpy(nsec_buf, end, 9);
+ nsec_buf[9] = '\0';
+
+ /* make it nsec precision */
+ for (i = strlen(nsec_buf); i < 9; i++)
+ nsec_buf[i] = '0';
+
+ time_nsec = strtoul(nsec_buf, &end, 10);
+ if (*end != '\0')
+ return -1;
+ } else
+ time_nsec = 0;
+
+ *ptime = time_sec * NSEC_PER_SEC + time_nsec;
+ return 0;
+}

static int parse_timestr_sec_nsec(struct perf_time_interval *ptime,
char *start_str, char *end_str)
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 8f3e0e370be8..c1f197c4af6c 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -5,6 +5,8 @@ struct perf_time_interval {
u64 start, end;
};

+int parse_nsec_time(const char *str, u64 *ptime);
+
int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);

bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 67ac765da27a..9ddd98827d12 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -400,39 +400,6 @@ void sighandler_dump_stack(int sig)
raise(sig);
}

-int parse_nsec_time(const char *str, u64 *ptime)
-{
- u64 time_sec, time_nsec;
- char *end;
-
- time_sec = strtoul(str, &end, 10);
- if (*end != '.' && *end != '\0')
- return -1;
-
- if (*end == '.') {
- int i;
- char nsec_buf[10];
-
- if (strlen(++end) > 9)
- return -1;
-
- strncpy(nsec_buf, end, 9);
- nsec_buf[9] = '\0';
-
- /* make it nsec precision */
- for (i = strlen(nsec_buf); i < 9; i++)
- nsec_buf[i] = '0';
-
- time_nsec = strtoul(nsec_buf, &end, 10);
- if (*end != '\0')
- return -1;
- } else
- time_nsec = 0;
-
- *ptime = time_sec * NSEC_PER_SEC + time_nsec;
- return 0;
-}
-
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
{
u64 sec = timestamp / NSEC_PER_SEC;
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 79662d67891e..1d639e38aa82 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -179,8 +179,6 @@ static inline void *zalloc(size_t size)
#undef tolower
#undef toupper

-int parse_nsec_time(const char *str, u64 *ptime);
-
extern unsigned char sane_ctype[256];
#define GIT_SPACE 0x01
#define GIT_DIGIT 0x02
--
2.9.3
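
For readers who want to try the "seconds.fraction" parsing outside the perf
tree, here is a minimal standalone sketch of the same idea as
parse_nsec_time(): parse the integer seconds, right-pad the fractional digits
with zeros to nine places, and combine both into nanoseconds. It is an
approximation under stated assumptions, not the perf function:
parse_sec_nsec() is a hypothetical name, it uses strtoull() and uint64_t
instead of strtoul() and u64, and NSEC_PER_SEC is defined locally.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Parse "<sec>[.<frac>]" into nanoseconds.
 * Returns 0 on success, -1 on malformed input or more than 9
 * fractional digits.
 */
static int parse_sec_nsec(const char *str, uint64_t *out)
{
	uint64_t sec, nsec = 0;
	char frac[10];
	char *end;

	sec = strtoull(str, &end, 10);
	if (*end != '.' && *end != '\0')
		return -1;

	if (*end == '.') {
		size_t len = strlen(++end);

		if (len > 9)
			return -1;

		/* Right-pad with zeros so "5" means 500000000 ns. */
		memcpy(frac, end, len);
		memset(frac + len, '0', 9 - len);
		frac[9] = '\0';

		nsec = strtoull(frac, &end, 10);
		if (*end != '\0')
			return -1;
	}

	*out = sec * NSEC_PER_SEC + nsec;
	return 0;
}
```

With this sketch the --time argument used in the timehist example,
"19818.635696", would parse to 19818635696000 ns.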

Arnaldo Carvalho de Melo

Dec 1, 2016, 1:10:09 PM12/1/16
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Provide a printf-style variant so that callers can show formatted values on
the helpline, as the annotation code needs to do for invalid jump offsets.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-1vk0g5twas...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/helpline.c | 10 ++++++++++
tools/perf/ui/helpline.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/tools/perf/ui/helpline.c b/tools/perf/ui/helpline.c
index 5b74a7eba210..379039ab00d8 100644
--- a/tools/perf/ui/helpline.c
+++ b/tools/perf/ui/helpline.c
@@ -72,3 +72,13 @@ int ui_helpline__vshow(const char *fmt, va_list ap)
{
return helpline_fns->show(fmt, ap);
}
+
+void ui_helpline__printf(const char *fmt, ...)
+{
+ va_list ap;
+
+ ui_helpline__pop();
+ va_start(ap, fmt);
+ ui_helpline__vpush(fmt, ap);
+ va_end(ap);
+}
diff --git a/tools/perf/ui/helpline.h b/tools/perf/ui/helpline.h
index 46181f4fc07e..d52d0a1a881b 100644
--- a/tools/perf/ui/helpline.h
+++ b/tools/perf/ui/helpline.h
@@ -21,6 +21,7 @@ void ui_helpline__push(const char *msg);
void ui_helpline__vpush(const char *fmt, va_list ap);
void ui_helpline__fpush(const char *fmt, ...);
void ui_helpline__puts(const char *msg);
+void ui_helpline__printf(const char *fmt, ...);
int ui_helpline__vshow(const char *fmt, va_list ap);

extern char ui_helpline__current[512];
--
2.9.3
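
The pattern used by ui_helpline__printf() above, a variadic front end that
forwards its arguments to a va_list based worker, is easy to try on its own.
The sketch below is an assumption-laden stand-in and not the perf UI code:
status_line, status_vset() and status_printf() are hypothetical names, and
the "helpline" is just a static buffer.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical one-line status buffer standing in for the helpline. */
static char status_line[512];

/* va_list worker: formats into the buffer, truncating if needed. */
static void status_vset(const char *fmt, va_list ap)
{
	vsnprintf(status_line, sizeof(status_line), fmt, ap);
}

/* printf-style front end that forwards to the va_list worker. */
static void status_printf(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	status_vset(fmt, ap);
	va_end(ap);
}
```

Keeping the va_list worker separate means other wrappers can reuse it, which
is how the existing ui_helpline__vpush()/ui_helpline__fpush() pair is
structured as well.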

Ingo Molnar

Dec 2, 2016, 4:20:05 AM12/2/16
to

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Arnaldo Carvalho de Melo

Dec 2, 2016, 5:40:05 AM12/2/16
to
Commit-ID: 5252b1aeabd0ae794cfaf323c10968443f10a363
Gitweb: http://git.kernel.org/tip/5252b1aeabd0ae794cfaf323c10968443f10a363
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Fri, 25 Nov 2016 15:56:34 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 25 Nov 2016 15:56:34 -0300

perf annotate: Show invalid jump offset in error message

To help in debugging when the wrong offset is being used, like in:

│13d98: ↓ jne 13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1>

That is the full line from objdump, and it seems what should be used is
13dd1, not 28e1.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-4nc0marsgs...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/annotate.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index cee0eee..ec7a30f 100644
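
To make the two numbers in the objdump line above concrete, here is a small
sketch that splits a jump operand of the form "<hex address> <symbol+0xoff>"
into its absolute address and its symbol-relative offset. It is illustrative
only: parse_jump_operand() is a hypothetical helper, not the parser used by
perf annotate, and it assumes the operand has exactly that shape.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split e.g. "13dd1 <sym+0x28e1>" into addr = 0x13dd1, sym_off = 0x28e1.
 * Returns 0 on success, -1 if the operand does not start with a hex number.
 * sym_off is set to 0 when no "+offset" part is present.
 */
static int parse_jump_operand(const char *operand,
			      uint64_t *addr, uint64_t *sym_off)
{
	const char *plus;
	char *end;

	*addr = strtoull(operand, &end, 16);
	if (end == operand)
		return -1;

	plus = strchr(end, '+');
	*sym_off = plus ? strtoull(plus + 1, NULL, 16) : 0;
	return 0;
}
```

For the line quoted in the commit message this yields 0x13dd1 and 0x28e1,
the two candidate values the message is contrasting.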

tip-bot for Arnaldo Carvalho de Melo

Dec 2, 2016, 5:40:07 AM12/2/16
to
Commit-ID: 030910c085467c7f08f49735c19c66c1baa53f76
Gitweb: http://git.kernel.org/tip/030910c085467c7f08f49735c19c66c1baa53f76
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 29 Nov 2016 12:38:14 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Tue, 29 Nov 2016 12:46:11 -0300

perf test: Remove "test" and similar strings from test descriptions

Having "test" in almost all test descriptions is redundant. Simplify them by
removing the word and rewording the descriptions that relied on it.

End result:

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-rx2lbfcrri...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/x86/tests/arch-tests.c | 10 ++--
tools/perf/tests/bpf.c | 6 +--
tools/perf/tests/builtin-test.c | 94 +++++++++++++++++-----------------
tools/perf/tests/llvm.c | 8 +--
4 files changed, 59 insertions(+), 59 deletions(-)

diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index 2218cb6..99d6619 100644
index 8f0298a..92343f4 100644
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index dab83f7..d1bec04 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
index b798a4b..02a33eb 100644

tip-bot for Arnaldo Carvalho de Melo

Dec 2, 2016, 5:40:07 AM12/2/16
to
Commit-ID: 9484b86e9cad903d3295d75c03961c3bdd1444a8
Gitweb: http://git.kernel.org/tip/9484b86e9cad903d3295d75c03961c3bdd1444a8
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Fri, 25 Nov 2016 15:48:25 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 25 Nov 2016 15:49:16 -0300

perf ui helpline: Provide a printf variant

Provide a printf-style variant so that callers can show formatted values on
the helpline, as the annotation code needs to do for invalid jump offsets.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-1vk0g5twas...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/helpline.c | 10 ++++++++++
tools/perf/ui/helpline.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/tools/perf/ui/helpline.c b/tools/perf/ui/helpline.c
index 5b74a7e..379039a 100644
--- a/tools/perf/ui/helpline.c
+++ b/tools/perf/ui/helpline.c
@@ -72,3 +72,13 @@ int ui_helpline__vshow(const char *fmt, va_list ap)
{
return helpline_fns->show(fmt, ap);
}
+
+void ui_helpline__printf(const char *fmt, ...)
+{
+ va_list ap;
+
+ ui_helpline__pop();
+ va_start(ap, fmt);
+ ui_helpline__vpush(fmt, ap);
+ va_end(ap);
+}
diff --git a/tools/perf/ui/helpline.h b/tools/perf/ui/helpline.h
index 46181f4..d52d0a1 100644

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:05 PM12/5/16
to
From: Jiri Olsa <jo...@kernel.org>

We've been hit several times by a Makefile bug where a line indented with a
tab was wrongly treated as a target command.

Prevent this by always using space indentation for everything except the
target commands themselves.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-3-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/Build.include | 12 ++--
tools/build/Makefile.feature | 138 +++++++++++++++++++++----------------------
tools/build/feature/Makefile | 102 ++++++++++++++++----------------
3 files changed, 126 insertions(+), 126 deletions(-)

diff --git a/tools/build/Build.include b/tools/build/Build.include
index 475152c52871..418871d02ebf 100644
--- a/tools/build/Build.include
+++ b/tools/build/Build.include
@@ -72,15 +72,15 @@ dep-cmd = $(if $(wildcard $(fixdep)),
# target, or command line has changed and update
# dependencies in the cmd file
if_changed_dep = $(if $(strip $(any-prereq) $(arg-check)), \
- @set -e; \
- $(echo-cmd) $(cmd_$(1)) && $(dep-cmd))
+ @set -e; \
+ $(echo-cmd) $(cmd_$(1)) && $(dep-cmd))

# if_changed - execute command if any prerequisite is newer than
# target, or command line has changed
-if_changed = $(if $(strip $(any-prereq) $(arg-check)), \
- @set -e; \
- $(echo-cmd) $(cmd_$(1)); \
- printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd)
+if_changed = $(if $(strip $(any-prereq) $(arg-check)), \
+ @set -e; \
+ $(echo-cmd) $(cmd_$(1)); \
+ printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd)

###
# C flags to be used in rule definitions, includes:
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index ae52e029dd22..e3fb5ecbdcb6 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -27,58 +27,58 @@ endef
# the rule that uses them - an example for that is the 'bionic'
# feature check. ]
#
-FEATURE_TESTS_BASIC := \
- backtrace \
- dwarf \
- dwarf_getlocations \
- fortify-source \
- sync-compare-and-swap \
- glibc \
- gtk2 \
- gtk2-infobar \
- libaudit \
- libbfd \
- libelf \
- libelf-getphdrnum \
- libelf-gelf_getnote \
- libelf-getshdrstrndx \
- libelf-mmap \
- libnuma \
- numa_num_possible_cpus \
- libperl \
- libpython \
- libpython-version \
- libslang \
- libcrypto \
- libunwind \
- libunwind-x86 \
- libunwind-x86_64 \
- libunwind-arm \
- libunwind-aarch64 \
- pthread-attr-setaffinity-np \
- stackprotector-all \
- timerfd \
- libdw-dwarf-unwind \
- zlib \
- lzma \
- get_cpuid \
- bpf \
- sdt
+FEATURE_TESTS_BASIC := \
+ backtrace \
+ dwarf \
+ dwarf_getlocations \
+ fortify-source \
+ sync-compare-and-swap \
+ glibc \
+ gtk2 \
+ gtk2-infobar \
+ libaudit \
+ libbfd \
+ libelf \
+ libelf-getphdrnum \
+ libelf-gelf_getnote \
+ libelf-getshdrstrndx \
+ libelf-mmap \
+ libnuma \
+ numa_num_possible_cpus \
+ libperl \
+ libpython \
+ libpython-version \
+ libslang \
+ libcrypto \
+ libunwind \
+ libunwind-x86 \
+ libunwind-x86_64 \
+ libunwind-arm \
+ libunwind-aarch64 \
+ pthread-attr-setaffinity-np \
+ stackprotector-all \
+ timerfd \
+ libdw-dwarf-unwind \
+ zlib \
+ lzma \
+ get_cpuid \
+ bpf \
+ sdt

# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
# of all feature tests
-FEATURE_TESTS_EXTRA := \
- bionic \
- compile-32 \
- compile-x32 \
- cplus-demangle \
- hello \
- libbabeltrace \
- liberty \
- liberty-z \
- libunwind-debug-frame \
- libunwind-debug-frame-arm \
- libunwind-debug-frame-aarch64
+FEATURE_TESTS_EXTRA := \
+ bionic \
+ compile-32 \
+ compile-x32 \
+ cplus-demangle \
+ hello \
+ libbabeltrace \
+ liberty \
+ liberty-z \
+ libunwind-debug-frame \
+ libunwind-debug-frame-arm \
+ libunwind-debug-frame-aarch64

FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)

@@ -86,26 +86,26 @@ ifeq ($(FEATURE_TESTS),all)
FEATURE_TESTS := $(FEATURE_TESTS_BASIC) $(FEATURE_TESTS_EXTRA)
endif

-FEATURE_DISPLAY ?= \
- dwarf \
- dwarf_getlocations \
- glibc \
- gtk2 \
- libaudit \
- libbfd \
- libelf \
- libnuma \
- numa_num_possible_cpus \
- libperl \
- libpython \
- libslang \
- libcrypto \
- libunwind \
- libdw-dwarf-unwind \
- zlib \
- lzma \
- get_cpuid \
- bpf
+FEATURE_DISPLAY ?= \
+ dwarf \
+ dwarf_getlocations \
+ glibc \
+ gtk2 \
+ libaudit \
+ libbfd \
+ libelf \
+ libnuma \
+ numa_num_possible_cpus \
+ libperl \
+ libpython \
+ libslang \
+ libcrypto \
+ libunwind \
+ libdw-dwarf-unwind \
+ zlib \
+ lzma \
+ get_cpuid \
+ bpf

# Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
# If in the future we need per-feature checks/flags for features not
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 871d5536951d..303196c16019 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -1,54 +1,54 @@
-FILES= \
- test-all.bin \
- test-backtrace.bin \
- test-bionic.bin \
- test-dwarf.bin \
- test-dwarf_getlocations.bin \
- test-fortify-source.bin \
- test-sync-compare-and-swap.bin \
- test-glibc.bin \
- test-gtk2.bin \
- test-gtk2-infobar.bin \
- test-hello.bin \
- test-libaudit.bin \
- test-libbfd.bin \
- test-liberty.bin \
- test-liberty-z.bin \
- test-cplus-demangle.bin \
- test-libelf.bin \
- test-libelf-getphdrnum.bin \
- test-libelf-gelf_getnote.bin \
- test-libelf-getshdrstrndx.bin \
- test-libelf-mmap.bin \
- test-libnuma.bin \
- test-numa_num_possible_cpus.bin \
- test-libperl.bin \
- test-libpython.bin \
- test-libpython-version.bin \
- test-libslang.bin \
- test-libcrypto.bin \
- test-libunwind.bin \
- test-libunwind-debug-frame.bin \
- test-libunwind-x86.bin \
- test-libunwind-x86_64.bin \
- test-libunwind-arm.bin \
- test-libunwind-aarch64.bin \
- test-libunwind-debug-frame-arm.bin \
- test-libunwind-debug-frame-aarch64.bin \
- test-pthread-attr-setaffinity-np.bin \
- test-stackprotector-all.bin \
- test-timerfd.bin \
- test-libdw-dwarf-unwind.bin \
- test-libbabeltrace.bin \
- test-compile-32.bin \
- test-compile-x32.bin \
- test-zlib.bin \
- test-lzma.bin \
- test-bpf.bin \
- test-get_cpuid.bin \
- test-sdt.bin \
- test-cxx.bin \
- test-jvmti.bin
+FILES= \
+ test-all.bin \
+ test-backtrace.bin \
+ test-bionic.bin \
+ test-dwarf.bin \
+ test-dwarf_getlocations.bin \
+ test-fortify-source.bin \
+ test-sync-compare-and-swap.bin \
+ test-glibc.bin \
+ test-gtk2.bin \
+ test-gtk2-infobar.bin \
+ test-hello.bin \
+ test-libaudit.bin \
+ test-libbfd.bin \
+ test-liberty.bin \
+ test-liberty-z.bin \
+ test-cplus-demangle.bin \
+ test-libelf.bin \
+ test-libelf-getphdrnum.bin \
+ test-libelf-gelf_getnote.bin \
+ test-libelf-getshdrstrndx.bin \
+ test-libelf-mmap.bin \
+ test-libnuma.bin \
+ test-numa_num_possible_cpus.bin \
+ test-libperl.bin \
+ test-libpython.bin \
+ test-libpython-version.bin \
+ test-libslang.bin \
+ test-libcrypto.bin \
+ test-libunwind.bin \
+ test-libunwind-debug-frame.bin \
+ test-libunwind-x86.bin \
+ test-libunwind-x86_64.bin \
+ test-libunwind-arm.bin \
+ test-libunwind-aarch64.bin \
+ test-libunwind-debug-frame-arm.bin \
+ test-libunwind-debug-frame-aarch64.bin \
+ test-pthread-attr-setaffinity-np.bin \
+ test-stackprotector-all.bin \
+ test-timerfd.bin \
+ test-libdw-dwarf-unwind.bin \
+ test-libbabeltrace.bin \
+ test-compile-32.bin \
+ test-compile-x32.bin \
+ test-zlib.bin \
+ test-lzma.bin \
+ test-bpf.bin \
+ test-get_cpuid.bin \
+ test-sdt.bin \
+ test-cxx.bin \
+ test-jvmti.bin

FILES := $(addprefix $(OUTPUT),$(FILES))

--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:05 PM12/5/16
to
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit e7af7b15121ca08c31a0ab9df71a41b4c53365b4:

Merge tag 'perf-core-for-mingo-20161201' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-12-02 10:08:03 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161205

for you to fetch changes up to bec60e50af83741cde1786ab475d4bf472aed6f9:

perf annotate: Show raw form for jump instruction with indirect target (2016-12-05 17:21:57 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

Fixes:

- Do not show a bogus target address in 'perf annotate' for targetless powerpc
jump instructions such as 'bctr' (Ravi Bangoria)

- tools/build fixes related to race conditions with the fixdep utility (Jiri Olsa)

- Fix building objtool with clang (Peter Foley)

Infrastructure:

- Support linking perf with the clang and LLVM libraries. Linking is static
for now; once that limitation is lifted, shared libraries will be preferred
when available. As with other features, the static build should be enabled
explicitly (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Jiri Olsa (7):
tools build: Make fixdep parsing wait for last target
tools build: Make the .cmd file more readable
tools build: Move tabs to spaces where suitable
perf tools: Move install-gtk target into rules area
perf tools: Move python/perf.so target into rules area
perf tools: Cleanup build directory before each test
perf tools: Add non config targets

Peter Foley (1):
tools build: Fix objtool build with clang

Ravi Bangoria (1):
perf annotate: Show raw form for jump instruction with indirect target

Wang Nan (11):
perf tools: Pass context to perf hook functions
perf llvm: Extract helpers in llvm-utils.c
tools build: Add feature detection for LLVM
tools build: Add feature detection for clang
perf build: Add clang and llvm compile and linking support
perf clang: Add builtin clang support and test case
perf clang: Use real file system for #include
perf clang: Allow passing CFLAGS to builtin clang
perf clang: Update test case to use real BPF script
perf clang: Support compile IR to BPF object and add testcase
perf clang: Compile BPF script using builtin clang support

tools/build/Build.include | 20 ++--
tools/build/Makefile.feature | 138 +++++++++++++-------------
tools/build/feature/Makefile | 120 +++++++++++++----------
tools/build/feature/test-clang.cpp | 21 ++++
tools/build/feature/test-llvm.cpp | 8 ++
tools/build/fixdep.c | 5 +-
tools/perf/Makefile.config | 62 +++++++++---
tools/perf/Makefile.perf | 56 +++++++----
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 9 ++
tools/perf/tests/clang.c | 46 +++++++++
tools/perf/tests/llvm.h | 7 ++
tools/perf/tests/make | 4 +-
tools/perf/tests/perf-hooks.c | 14 ++-
tools/perf/tests/tests.h | 3 +
tools/perf/util/Build | 2 +
tools/perf/util/annotate.c | 3 +
tools/perf/util/bpf-loader.c | 19 +++-
tools/perf/util/c++/Build | 2 +
tools/perf/util/c++/clang-c.h | 43 ++++++++
tools/perf/util/c++/clang-test.cpp | 62 ++++++++++++
tools/perf/util/c++/clang.cpp | 195 +++++++++++++++++++++++++++++++++++++
tools/perf/util/c++/clang.h | 26 +++++
tools/perf/util/llvm-utils.c | 76 +++++++++++----
tools/perf/util/llvm-utils.h | 6 ++
tools/perf/util/perf-hooks.c | 10 +-
tools/perf/util/perf-hooks.h | 6 +-
tools/perf/util/util-cxx.h | 26 +++++
28 files changed, 795 insertions(+), 195 deletions(-)
create mode 100644 tools/build/feature/test-clang.cpp
create mode 100644 tools/build/feature/test-llvm.cpp
create mode 100644 tools/perf/tests/clang.c
create mode 100644 tools/perf/util/c++/Build
create mode 100644 tools/perf/util/c++/clang-c.h
create mode 100644 tools/perf/util/c++/clang-test.cpp
create mode 100644 tools/perf/util/c++/clang.cpp
create mode 100644 tools/perf/util/c++/clang.h
create mode 100644 tools/perf/util/util-cxx.h

# uname -a
Linux jouet 4.8.8-300.fc25.x86_64 #1 SMP Tue Nov 15 18:10:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
51: builtin clang support : Skip (not compiled in)
52: x86 rdpmc : Ok
53: Convert perf time to TSC : Ok
54: DWARF unwind : Ok
55: x86 instruction decoder - new instructions : Ok
56: Intel cqm nmi context read : Skip
#
# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:25: Ok
17 fedora:rawhide: Ok
18 mageia:5: Ok
19 opensuse:13.2: Ok
20 opensuse:42.1: Ok
21 opensuse:tumbleweed: Ok
22 ubuntu:12.04.5: Ok
23 ubuntu:14.04.4-x-linaro-arm64: Ok
24 ubuntu:16.04: Ok
25 ubuntu:16.04-x-arm: Ok
26 ubuntu:16.04-x-arm64: Ok
27 ubuntu:16.04-x-powerpc: Ok
28 ubuntu:16.04-x-powerpc64: Ok
29 ubuntu:16.04-x-powerpc64el: Ok
30 ubuntu:16.04-x-s390: Ok
31 ubuntu:16.10: Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_gtk2_O: make NO_GTK2=1
make_static_O: make LDFLAGS=-static
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_perf_o_O: make perf.o
make_no_slang_O: make NO_SLANG=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_newt_O: make NO_NEWT=1
make_debug_O: make DEBUG=1
make_tags_O: make tags
make_no_libbionic_O: make NO_LIBBIONIC=1
make_help_O: make help
make_install_O: make install
make_no_libunwind_O: make NO_LIBUNWIND=1
make_pure_O: make
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_doc_O: make doc
make_no_libaudit_O: make NO_LIBAUDIT=1
make_clean_all_O: make clean all
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_util_map_o_O: make util/map.o
make_install_bin_O: make install-bin
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libelf_O: make NO_LIBELF=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_with_clangllvm_O: make LIBCLANGLLVM=1

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:07 PM
From: Wang Nan <wang...@huawei.com>

Check whether a basic LLVM compiling environment is ready.

Use llvm-config to detect include and library directories. Avoid using
'llvm-config --cxxflags' because its result contains some unwanted flags
like --sysroot (if LLVM is built by yocto).

Use '?=' to set LLVM_CONFIG, so that explicitly passing LLVM_CONFIG to
make overrides it.

Use 'llvm-config --libs BPF' to check if the BPF backend is compiled in.
Since BPF bytecode is currently the only required backend, there is no
need to waste time linking llvm and clang if the BPF backend is missing.
This also introduces an implicit requirement that LLVM be new enough:
old LLVM doesn't support the BPF backend.

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/feature/Makefile | 8 ++++++++
tools/build/feature/test-llvm.cpp | 8 ++++++++
2 files changed, 16 insertions(+)
create mode 100644 tools/build/feature/test-llvm.cpp

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 8f668bce8996..c09de59affc9 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -55,6 +55,7 @@ FILES := $(addprefix $(OUTPUT),$(FILES))
CC := $(CROSS_COMPILE)gcc -MD
CXX := $(CROSS_COMPILE)g++ -MD
PKG_CONFIG := $(CROSS_COMPILE)pkg-config
+LLVM_CONFIG ?= llvm-config

all: $(FILES)

@@ -229,6 +230,13 @@ $(OUTPUT)test-cxx.bin:
$(OUTPUT)test-jvmti.bin:
$(BUILD)

+$(OUTPUT)test-llvm.bin:
+ $(BUILDXX) -std=gnu++11 \
+ -I$(shell $(LLVM_CONFIG) --includedir) \
+ -L$(shell $(LLVM_CONFIG) --libdir) \
+ $(shell $(LLVM_CONFIG) --libs Core BPF) \
+ $(shell $(LLVM_CONFIG) --system-libs)
+
-include $(OUTPUT)*.d

###############################
diff --git a/tools/build/feature/test-llvm.cpp b/tools/build/feature/test-llvm.cpp
new file mode 100644
index 000000000000..d8d2cee35345
--- /dev/null
+++ b/tools/build/feature/test-llvm.cpp
@@ -0,0 +1,8 @@
+#include "llvm/Support/ManagedStatic.h"
+#include "llvm/Support/raw_ostream.h"
+int main()
+{
+ llvm::errs() << "Hello World!\n";
+ llvm::llvm_shutdown();
+ return 0;
+}
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:07 PM
From: Wang Nan <wang...@huawei.com>

After this patch, perf uses builtin clang support to build BPF scripts
and no longer depends on an external clang, falling back to it only if
for some reason the builtin compiling framework fails.
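
The "try builtin, then fall back to the external compiler" flow described
above can be sketched in isolation. This is only an illustration, not the
perf code: compile_fn, compile_with_fallback() and the two stub compilers
are hypothetical names made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical signature shared by the builtin and external compilers. */
typedef int (*compile_fn)(const char *filename, void **buf, size_t *sz);

/*
 * Try the builtin compiler first; only if it fails, try the external one.
 * *used_fallback tells the caller which path produced the result.
 */
static int compile_with_fallback(compile_fn builtin, compile_fn external,
				 const char *filename, void **buf, size_t *sz,
				 int *used_fallback)
{
	int err;

	assert(builtin && external && used_fallback);

	*used_fallback = 0;
	err = builtin(filename, buf, sz);
	if (!err)
		return 0;

	*used_fallback = 1;
	return external(filename, buf, sz);
}

/* Stub: behaves like a build without builtin clang support. */
static int stub_unsupported(const char *filename, void **buf, size_t *sz)
{
	(void)filename; (void)buf; (void)sz;
	return -ENOTSUP;
}

/* Stub: always "compiles" to a fixed dummy object buffer. */
static int stub_ok(const char *filename, void **buf, size_t *sz)
{
	static char obj[] = "obj";

	(void)filename;
	*buf = obj;
	*sz = sizeof(obj);
	return 0;
}
```

With stub_unsupported as the builtin compiler the external one is used,
which mirrors what happens in a build without LIBCLANGLLVM=1.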

Test:

$ type clang
-bash: type: clang: not found
$ cat ~/.perfconfig
$ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
$ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
$ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
bpf: successfull builtin compilation
$

CFLAGS can't be passed yet, so kernel headers can't be included for now.
This will be fixed by the following commits.

Committer notes:

Make sure '-v' comes before '-e ./test.c' on the command line, otherwise the
'verbose' variable will not be set when the bpf event is parsed and thus the
pr_debug indicating a 'successfull builtin compilation' will not be output, as
the debug level (1) will be greater than what 'verbose' has at that point (0).
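
The ordering effect can be reproduced with a toy option parser. This is a
minimal sketch under stated assumptions, not perf's option handling:
parse_args() and debug_enabled() are hypothetical, and the only property
borrowed from the note above is that a debug message of level 1 is emitted
only if 'verbose' is already at least 1 when the event is parsed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a global verbosity level. */
static int verbose;

/* A message at 'level' is printed only if verbose has reached it. */
static int debug_enabled(int level)
{
	return verbose >= level;
}

/*
 * Toy parser: handles arguments strictly left to right, so whatever
 * "-e" triggers runs with the value 'verbose' has at that moment.
 * Returns the number of level 1 debug messages emitted while parsing.
 */
static int parse_args(int nargs, const char **args)
{
	int emitted = 0;
	int i;

	assert(args != NULL);

	verbose = 0;
	for (i = 0; i < nargs; i++) {
		if (strcmp(args[i], "-v") == 0) {
			verbose++;
		} else if (strcmp(args[i], "-e") == 0 && i + 1 < nargs) {
			i++; /* consume the event argument */
			if (debug_enabled(1)) {
				printf("debug: parsed event %s\n", args[i]);
				emitted++;
			}
		}
	}
	return emitted;
}
```

Parsing "-v -e ./test.c" emits the message, while "-e ./test.c -v" does
not, because 'verbose' is still 0 when the event is handled.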

Signed-off-by: Wang Nan <wang...@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
[ Spell check/reflow successfull pr_debug string ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/util/bpf-loader.c | 15 +++++++++++----
tools/perf/util/c++/clang-c.h | 26 ++++++++++++++++++++++++++
tools/perf/util/c++/clang.cpp | 29 +++++++++++++++++++++++++++++
3 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index cf16b94115b5..36c861103291 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -14,11 +14,11 @@
#include "debug.h"
#include "bpf-loader.h"
#include "bpf-prologue.h"
-#include "llvm-utils.h"
#include "probe-event.h"
#include "probe-finder.h" // for MAX_PROBES
#include "parse-events.h"
#include "llvm-utils.h"
+#include "c++/clang-c.h"

#define DEFINE_PRINT_FN(name, level) \
static int libbpf_##name(const char *fmt, ...) \
@@ -86,9 +86,16 @@ struct bpf_object *bpf__prepare_load(const char *filename, bool source)
void *obj_buf;
size_t obj_buf_sz;

- err = llvm__compile_bpf(filename, &obj_buf, &obj_buf_sz);
- if (err)
- return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE);
+ perf_clang__init();
+ err = perf_clang__compile_bpf(filename, &obj_buf, &obj_buf_sz);
+ perf_clang__cleanup();
+ if (err) {
+ pr_warning("bpf: builtin compilation failed: %d, try external compiler\n", err);
+ err = llvm__compile_bpf(filename, &obj_buf, &obj_buf_sz);
+ if (err)
+ return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE);
+ } else
+ pr_debug("bpf: successfull builtin compilation\n");
obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename);

if (!IS_ERR(obj) && llvm_param.dump_obj)
diff --git a/tools/perf/util/c++/clang-c.h b/tools/perf/util/c++/clang-c.h
index 22b3936d1f09..0eadd792ab1f 100644
--- a/tools/perf/util/c++/clang-c.h
+++ b/tools/perf/util/c++/clang-c.h
@@ -1,16 +1,42 @@
#ifndef PERF_UTIL_CLANG_C_H
#define PERF_UTIL_CLANG_C_H

+#include <stddef.h> /* for size_t */
+#include <util-cxx.h> /* for __maybe_unused */
+
#ifdef __cplusplus
extern "C" {
#endif

+#ifdef HAVE_LIBCLANGLLVM_SUPPORT
extern void perf_clang__init(void);
extern void perf_clang__cleanup(void);

extern int test__clang_to_IR(void);
extern int test__clang_to_obj(void);

+extern int perf_clang__compile_bpf(const char *filename,
+ void **p_obj_buf,
+ size_t *p_obj_buf_sz);
+#else
+
+
+static inline void perf_clang__init(void) { }
+static inline void perf_clang__cleanup(void) { }
+
+static inline int test__clang_to_IR(void) { return -1; }
+static inline int test__clang_to_obj(void) { return -1;}
+
+static inline int
+perf_clang__compile_bpf(const char *filename __maybe_unused,
+ void **p_obj_buf __maybe_unused,
+ size_t *p_obj_buf_sz __maybe_unused)
+{
+ return -ENOTSUP;
+}
+
+#endif
+
#ifdef __cplusplus
}
#endif
diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index 2a1a75df204f..1e974152cac2 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -163,4 +163,33 @@ void perf_clang__cleanup(void)
perf::LLVMCtx.reset(nullptr);
llvm::llvm_shutdown();
}
+
+int perf_clang__compile_bpf(const char *filename,
+ void **p_obj_buf,
+ size_t *p_obj_buf_sz)
+{
+ using namespace perf;
+
+ if (!p_obj_buf || !p_obj_buf_sz)
+ return -EINVAL;
+
+ llvm::opt::ArgStringList CFlags;
+ auto M = getModuleFromSource(std::move(CFlags), filename);
+ if (!M)
+ return -EINVAL;
+ auto O = getBPFObjectFromModule(&*M);
+ if (!O)
+ return -EINVAL;
+
+ size_t size = O->size_in_bytes();
+ void *buffer;
+
+ buffer = malloc(size);
+ if (!buffer)
+ return -ENOMEM;
+ memcpy(buffer, O->data(), size);
+ *p_obj_buf = buffer;
+ *p_obj_buf_sz = size;

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:08 PM
From: Wang Nan <wang...@huawei.com>

Add basic clang support in clang.cpp and test__clang() testcase. The
first testcase checks if builtin clang is able to generate LLVM IR.

tests/clang.c is a proxy. The real testcase resides in
util/c++/clang-test.cpp, is written in C++ and exports a C interface to
the perf test subsystem.

Test result:

$ perf test -v clang
51: builtin clang support :
51.1: Test builtin clang compile C source to IR :
--- start ---
test child forked, pid 13215
test child finished with 0
---- end ----
Test builtin clang support subtest 0: Ok

Committer note:

Make sure you've enabled CLANG and LLVM builtin support by setting
the LIBCLANGLLVM variable on the make command line, e.g.:

make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin

Otherwise you'll get this when trying to do the 'perf test' call above:

# perf test clang
51: builtin clang support : Skip (not compiled in)
#

Signed-off-by: Wang Nan <wang...@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
[ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 9 ++++
tools/perf/tests/clang.c | 42 +++++++++++++++++
tools/perf/tests/tests.h | 3 ++
tools/perf/util/Build | 2 +
tools/perf/util/c++/Build | 2 +
tools/perf/util/c++/clang-c.h | 16 +++++++
tools/perf/util/c++/clang-test.cpp | 31 ++++++++++++
tools/perf/util/c++/clang.cpp | 96 ++++++++++++++++++++++++++++++++++++++
tools/perf/util/c++/clang.h | 16 +++++++
10 files changed, 218 insertions(+)
create mode 100644 tools/perf/tests/clang.c
create mode 100644 tools/perf/util/c++/Build
create mode 100644 tools/perf/util/c++/clang-c.h
create mode 100644 tools/perf/util/c++/clang-test.cpp
create mode 100644 tools/perf/util/c++/clang.cpp
create mode 100644 tools/perf/util/c++/clang.h

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index af3ec94869aa..6676c2dd6dcb 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -43,6 +43,7 @@ perf-y += sdt.o
perf-y += is_printable_array.o
perf-y += bitmap.o
perf-y += perf-hooks.o
+perf-y += clang.o

$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index d1bec0444be7..23605202d4a1 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -234,6 +234,15 @@ static struct test generic_tests[] = {
.func = test__perf_hooks,
},
{
+ .desc = "builtin clang support",
+ .func = test__clang,
+ .subtest = {
+ .skip_if_fail = true,
+ .get_nr = test__clang_subtest_get_nr,
+ .get_desc = test__clang_subtest_get_desc,
+ }
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/clang.c b/tools/perf/tests/clang.c
new file mode 100644
index 000000000000..636d6d0e9037
--- /dev/null
+++ b/tools/perf/tests/clang.c
@@ -0,0 +1,42 @@
+#include "tests.h"
+#include "debug.h"
+#include "util.h"
+#include "c++/clang-c.h"
+
+static struct {
+ int (*func)(void);
+ const char *desc;
+} clang_testcase_table[] = {
+#ifdef HAVE_LIBCLANGLLVM_SUPPORT
+ {
+ .func = test__clang_to_IR,
+ .desc = "builtin clang compile C source to IR",
+ },
+#endif
+};
+
+int test__clang_subtest_get_nr(void)
+{
+ return (int)ARRAY_SIZE(clang_testcase_table);
+}
+
+const char *test__clang_subtest_get_desc(int i)
+{
+ if (i < 0 || i >= (int)ARRAY_SIZE(clang_testcase_table))
+ return NULL;
+ return clang_testcase_table[i].desc;
+}
+
+#ifndef HAVE_LIBCLANGLLVM_SUPPORT
+int test__clang(int i __maybe_unused)
+{
+ return TEST_SKIP;
+}
+#else
+int test__clang(int i __maybe_unused)
+{
+ if (i < 0 || i >= (int)ARRAY_SIZE(clang_testcase_table))
+ return TEST_FAIL;
+ return clang_testcase_table[i].func();
+}
+#endif
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 3a1f98f291ba..0d7b251305af 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -92,6 +92,9 @@ int test__sdt_event(int subtest);
int test__is_printable_array(int subtest);
int test__bitmap_print(int subtest);
int test__perf_hooks(int subtest);
+int test__clang(int subtest);
+const char *test__clang_subtest_get_desc(int subtest);
+int test__clang_subtest_get_nr(void);

#if defined(__arm__) || defined(__aarch64__)
#ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index bdad82a9812d..3840e3a87057 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -126,6 +126,8 @@ endif

libperf-y += perf-hooks.o

+libperf-$(CONFIG_CXX) += c++/
+
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
# avoid compiler warnings in 32-bit mode
CFLAGS_genelf_debug.o += -Wno-packed
diff --git a/tools/perf/util/c++/Build b/tools/perf/util/c++/Build
new file mode 100644
index 000000000000..988fef1b11d7
--- /dev/null
+++ b/tools/perf/util/c++/Build
@@ -0,0 +1,2 @@
+libperf-$(CONFIG_CLANGLLVM) += clang.o
+libperf-$(CONFIG_CLANGLLVM) += clang-test.o
diff --git a/tools/perf/util/c++/clang-c.h b/tools/perf/util/c++/clang-c.h
new file mode 100644
index 000000000000..dcde4b564f3b
--- /dev/null
+++ b/tools/perf/util/c++/clang-c.h
@@ -0,0 +1,16 @@
+#ifndef PERF_UTIL_CLANG_C_H
+#define PERF_UTIL_CLANG_C_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern void perf_clang__init(void);
+extern void perf_clang__cleanup(void);
+
+extern int test__clang_to_IR(void);
+
+#ifdef __cplusplus
+}
+#endif
+#endif
diff --git a/tools/perf/util/c++/clang-test.cpp b/tools/perf/util/c++/clang-test.cpp
new file mode 100644
index 000000000000..3da6bfa4bc54
--- /dev/null
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -0,0 +1,31 @@
+#include "clang.h"
+#include "clang-c.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/LLVMContext.h"
+
+class perf_clang_scope {
+public:
+ explicit perf_clang_scope() {perf_clang__init();}
+ ~perf_clang_scope() {perf_clang__cleanup();}
+};
+
+extern "C" {
+
+int test__clang_to_IR(void)
+{
+ perf_clang_scope _scope;
+
+ std::unique_ptr<llvm::Module> M =
+ perf::getModuleFromSource("perf-test.c",
+ "int myfunc(void) {return 1;}");
+
+ if (!M)
+ return -1;
+
+ for (llvm::Function& F : *M)
+ if (F.getName() == "myfunc")
+ return 0;
+ return -1;
+}
+
+}
diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
new file mode 100644
index 000000000000..c17b1176e25d
--- /dev/null
+++ b/tools/perf/util/c++/clang.cpp
@@ -0,0 +1,96 @@
+/*
+ * llvm C frontend for perf. Support dynamically compile C file
+ *
+ * Inspired by clang example code:
+ * http://llvm.org/svn/llvm-project/cfe/trunk/examples/clang-interpreter/main.cpp
+ *
+ * Copyright (C) 2016 Wang Nan <wang...@huawei.com>
+ * Copyright (C) 2016 Huawei Inc.
+ */
+
+#include "clang/CodeGen/CodeGenAction.h"
+#include "clang/Frontend/CompilerInvocation.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "clang/Frontend/TextDiagnosticPrinter.h"
+#include "clang/Tooling/Tooling.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Option/Option.h"
+#include "llvm/Support/ManagedStatic.h"
+#include <memory>
+
+#include "clang.h"
+#include "clang-c.h"
+
+namespace perf {
+
+static std::unique_ptr<llvm::LLVMContext> LLVMCtx;
+
+using namespace clang;
+
+static vfs::InMemoryFileSystem *
+buildVFS(StringRef& Name, StringRef& Content)
+{
+ vfs::InMemoryFileSystem *VFS = new vfs::InMemoryFileSystem(true);
+ VFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));
+ return VFS;
+}
+
+static CompilerInvocation *
+createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
+{
+ llvm::opt::ArgStringList CCArgs {
+ "-cc1",
+ "-triple", "bpf-pc-linux",
+ "-fsyntax-only",
+ "-ferror-limit", "19",
+ "-fmessage-length", "127",
+ "-O2",
+ "-nostdsysteminc",
+ "-nobuiltininc",
+ "-vectorize-loops",
+ "-vectorize-slp",
+ "-Wno-unused-value",
+ "-Wno-pointer-sign",
+ "-x", "c"};
+ CompilerInvocation *CI = tooling::newInvocation(&Diags, CCArgs);
+
+ FrontendOptions& Opts = CI->getFrontendOpts();
+ Opts.Inputs.clear();
+ Opts.Inputs.emplace_back(Path, IK_C);
+ return CI;
+}
+
+std::unique_ptr<llvm::Module>
+getModuleFromSource(StringRef Name, StringRef Content)
+{
+ CompilerInstance Clang;
+ Clang.createDiagnostics();
+
+ IntrusiveRefCntPtr<vfs::FileSystem> VFS = buildVFS(Name, Content);
+ Clang.setVirtualFileSystem(&*VFS);
+
+ IntrusiveRefCntPtr<CompilerInvocation> CI =
+ createCompilerInvocation(Name, Clang.getDiagnostics());
+ Clang.setInvocation(&*CI);
+
+ std::unique_ptr<CodeGenAction> Act(new EmitLLVMOnlyAction(&*LLVMCtx));
+ if (!Clang.ExecuteAction(*Act))
+ return std::unique_ptr<llvm::Module>(nullptr);
+
+ return Act->takeModule();
+}
+
+}
+
+extern "C" {
+void perf_clang__init(void)
+{
+ perf::LLVMCtx.reset(new llvm::LLVMContext());
+}
+
+void perf_clang__cleanup(void)
+{
+ perf::LLVMCtx.reset(nullptr);
+ llvm::llvm_shutdown();
+}
+}
diff --git a/tools/perf/util/c++/clang.h b/tools/perf/util/c++/clang.h
new file mode 100644
index 000000000000..f64483be43d0
--- /dev/null
+++ b/tools/perf/util/c++/clang.h
@@ -0,0 +1,16 @@
+#ifndef PERF_UTIL_CLANG_H
+#define PERF_UTIL_CLANG_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include <memory>
+namespace perf {
+
+using namespace llvm;
+
+std::unique_ptr<Module>
+getModuleFromSource(StringRef Name, StringRef Content);
+
+}
+#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:08 PM
From: Jiri Olsa <jo...@redhat.com>

The fixdep tool, among other things, replaces the target of the object
in the gcc generated dependency output file.

The parsing code assumes there's only a single target in the rule, but
this is not always the case, as described here:

https://gcc.gnu.org/ml/gcc-help/2016-11/msg00099.html

Make the fixdep code smart enough to skip all the possible targets.
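
The idea can be shown with a standalone sketch. It is a simplified
tokenizer written for illustration, not the fixdep code: count_deps() is a
hypothetical helper that ignores every token seen before the first one
ending in ':' and counts only what follows as dependencies.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Separators in a dependency rule: whitespace and line continuations. */
static int is_sep(char c)
{
	return isspace((unsigned char)c) || c == '\\';
}

/*
 * Count dependency names in a make-style rule. Tokens that appear before
 * a target (a token ending in ':') has been seen are extra targets and
 * are skipped instead of being mistaken for dependencies.
 */
static int count_deps(const char *rule)
{
	const char *m = rule;
	int has_target = 0;
	int deps = 0;

	assert(rule != NULL);

	while (*m) {
		const char *p;

		while (*m && is_sep(*m))
			m++;
		if (!*m)
			break;

		/* Find the end of the current token (p > m here). */
		p = m;
		while (*p && !is_sep(*p))
			p++;

		if (p[-1] == ':')
			has_target = 1;	/* dependencies start after this */
		else if (has_target)
			deps++;

		m = p;
	}
	return deps;
}
```

For a rule such as "foo.o \ bar.o: foo.c \ foo.h" the leading "foo.o" is
skipped and only foo.c and foo.h are counted.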

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Peter Foley <pefo...@pefoley.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20161201130025.GA16430@krava
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/fixdep.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/build/fixdep.c b/tools/build/fixdep.c
index 1521d36cef0d..734d1547cbae 100644
--- a/tools/build/fixdep.c
+++ b/tools/build/fixdep.c
@@ -49,7 +49,7 @@ static void parse_dep_file(void *map, size_t len)
char *end = m + len;
char *p;
char s[PATH_MAX];
- int is_target;
+ int is_target, has_target = 0;
int saw_any_target = 0;
int is_first_dep = 0;

@@ -67,7 +67,8 @@ static void parse_dep_file(void *map, size_t len)
if (is_target) {
/* The /next/ file is the first dependency */
is_first_dep = 1;
- } else {
+ has_target = 1;
+ } else if (has_target) {
/* Save this token/filename */
memcpy(s, m, p-m);
s[p - m] = 0;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:09 PM
From: Wang Nan <wang...@huawei.com>

Improve getModuleFromSource() API to accept a cflags list. This feature
will be used to pass LINUX_VERSION_CODE and -I flags.
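
What "accepting a cflags list" amounts to is appending caller-supplied
flags to a fixed set of default compiler arguments. The sketch below shows
that step in plain C; merge_args() is a hypothetical helper, and the patch
itself does this in C++ with llvm::opt::ArgStringList.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a NULL-terminated argument vector holding the default arguments
 * followed by the caller's extra flags. Only the array is allocated; the
 * strings are shared with the inputs. The caller frees the result.
 */
static const char **merge_args(const char *const *base, size_t nbase,
			       const char *const *extra, size_t nextra)
{
	const char **out;

	assert(base || nbase == 0);
	assert(extra || nextra == 0);

	out = malloc((nbase + nextra + 1) * sizeof(*out));
	if (!out)
		return NULL;

	if (nbase)
		memcpy(out, base, nbase * sizeof(*out));
	if (nextra)
		memcpy(out + nbase, extra, nextra * sizeof(*out));
	out[nbase + nextra] = NULL;
	return out;
}
```

Because the extra flags come last, a define such as -DRESULT=1 from the
caller is seen by the compiler after the built-in defaults.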

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/c++/clang-test.cpp | 5 +++--
tools/perf/util/c++/clang.cpp | 21 +++++++++++++--------
tools/perf/util/c++/clang.h | 8 ++++++--
3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/c++/clang-test.cpp b/tools/perf/util/c++/clang-test.cpp
index 3da6bfa4bc54..0f484fbb2b58 100644
--- a/tools/perf/util/c++/clang-test.cpp
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -16,8 +16,9 @@ int test__clang_to_IR(void)
perf_clang_scope _scope;

std::unique_ptr<llvm::Module> M =
- perf::getModuleFromSource("perf-test.c",
- "int myfunc(void) {return 1;}");
+ perf::getModuleFromSource({"-DRESULT=1"},
+ "perf-test.c",
+ "int myfunc(void) {return RESULT;}");

if (!M)
return -1;
diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index cf96199b4b6f..715ca0a3dee0 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -29,7 +29,8 @@ static std::unique_ptr<llvm::LLVMContext> LLVMCtx;
using namespace clang;

static CompilerInvocation *
-createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
+createCompilerInvocation(llvm::opt::ArgStringList CFlags, StringRef& Path,
+ DiagnosticsEngine& Diags)
{
llvm::opt::ArgStringList CCArgs {
"-cc1",
@@ -45,6 +46,8 @@ createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
"-Wno-unused-value",
"-Wno-pointer-sign",
"-x", "c"};
+
+ CCArgs.append(CFlags.begin(), CFlags.end());
CompilerInvocation *CI = tooling::newInvocation(&Diags, CCArgs);

FrontendOptions& Opts = CI->getFrontendOpts();
@@ -54,8 +57,8 @@ createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
}

static std::unique_ptr<llvm::Module>
-getModuleFromSource(StringRef Path,
- IntrusiveRefCntPtr<vfs::FileSystem> VFS)
+getModuleFromSource(llvm::opt::ArgStringList CFlags,
+ StringRef Path, IntrusiveRefCntPtr<vfs::FileSystem> VFS)
{
CompilerInstance Clang;
Clang.createDiagnostics();
@@ -63,7 +66,8 @@ getModuleFromSource(StringRef Path,
Clang.setVirtualFileSystem(&*VFS);

IntrusiveRefCntPtr<CompilerInvocation> CI =
- createCompilerInvocation(Path, Clang.getDiagnostics());
+ createCompilerInvocation(std::move(CFlags), Path,
+ Clang.getDiagnostics());
Clang.setInvocation(&*CI);

std::unique_ptr<CodeGenAction> Act(new EmitLLVMOnlyAction(&*LLVMCtx));
@@ -74,7 +78,8 @@ getModuleFromSource(StringRef Path,
}

std::unique_ptr<llvm::Module>
-getModuleFromSource(StringRef Name, StringRef Content)
+getModuleFromSource(llvm::opt::ArgStringList CFlags,
+ StringRef Name, StringRef Content)
{
using namespace vfs;

@@ -90,14 +95,14 @@ getModuleFromSource(StringRef Name, StringRef Content)
OverlayFS->pushOverlay(MemFS);
MemFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));

- return getModuleFromSource(Name, OverlayFS);
+ return getModuleFromSource(std::move(CFlags), Name, OverlayFS);
}

std::unique_ptr<llvm::Module>
-getModuleFromSource(StringRef Path)
+getModuleFromSource(llvm::opt::ArgStringList CFlags, StringRef Path)
{
IntrusiveRefCntPtr<vfs::FileSystem> VFS(vfs::getRealFileSystem());
- return getModuleFromSource(Path, VFS);
+ return getModuleFromSource(std::move(CFlags), Path, VFS);
}

}
diff --git a/tools/perf/util/c++/clang.h b/tools/perf/util/c++/clang.h
index 90aff0162f1c..b4fc2a96b79d 100644
--- a/tools/perf/util/c++/clang.h
+++ b/tools/perf/util/c++/clang.h
@@ -4,16 +4,20 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
+#include "llvm/Option/Option.h"
#include <memory>
+
namespace perf {

using namespace llvm;

std::unique_ptr<Module>
-getModuleFromSource(StringRef Name, StringRef Content);
+getModuleFromSource(opt::ArgStringList CFlags,
+ StringRef Name, StringRef Content);

std::unique_ptr<Module>
-getModuleFromSource(StringRef Path);
+getModuleFromSource(opt::ArgStringList CFlags,
+ StringRef Path);

}
#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:40:12 PM
From: Peter Foley <pefo...@pefoley.com>

Clang doesn't support multiple arguments being passed to -Wp, so split
them.

Fixes this error:
HOSTCC tools/objtool/fixdep.o
cat: tools/objtool/.fixdep.o.d: No such file or directory

Signed-off-by: Peter Foley <pefo...@pefoley.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@redhat.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20161128024346....@pefoley.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/Build.include | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/build/Build.include b/tools/build/Build.include
index c4ae12a5d0a5..62dcf0c7aac2 100644
--- a/tools/build/Build.include
+++ b/tools/build/Build.include
@@ -89,12 +89,12 @@ if_changed = $(if $(strip $(any-prereq) $(arg-check)), \
# - per target C flags
# - per object C flags
# - BUILD_STR macro to allow '-D"$(variable)"' constructs
-c_flags_1 = -Wp,-MD,$(depfile),-MT,$@ $(CFLAGS) -D"BUILD_STR(s)=\#s" $(CFLAGS_$(basetarget).o) $(CFLAGS_$(obj))
+c_flags_1 = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CFLAGS) -D"BUILD_STR(s)=\#s" $(CFLAGS_$(basetarget).o) $(CFLAGS_$(obj))
c_flags_2 = $(filter-out $(CFLAGS_REMOVE_$(basetarget).o), $(c_flags_1))
c_flags = $(filter-out $(CFLAGS_REMOVE_$(obj)), $(c_flags_2))
-cxx_flags = -Wp,-MD,$(depfile),-MT,$@ $(CXXFLAGS) -D"BUILD_STR(s)=\#s" $(CXXFLAGS_$(basetarget).o) $(CXXFLAGS_$(obj))
+cxx_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CXXFLAGS) -D"BUILD_STR(s)=\#s" $(CXXFLAGS_$(basetarget).o) $(CXXFLAGS_$(obj))

###
## HOSTCC C flags

-host_c_flags = -Wp,-MD,$(depfile),-MT,$@ $(CHOSTFLAGS) -D"BUILD_STR(s)=\#s" $(CHOSTFLAGS_$(basetarget).o) $(CHOSTFLAGS_$(obj))
+host_c_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CHOSTFLAGS) -D"BUILD_STR(s)=\#s" $(CHOSTFLAGS_$(basetarget).o) $(CHOSTFLAGS_$(obj))
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:05 PM
From: Jiri Olsa <jo...@kernel.org>

Clean up the fixdep tool before every test.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-8-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/make | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index aa49b6600d1f..0784748f1670 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -280,7 +280,7 @@ endif

MAKEFLAGS := --no-print-directory

-clean := @(cd $(PERF); $(MAKE_F) -s $(O_OPT) clean >/dev/null)
+clean := @(cd $(PERF); $(MAKE_F) -s $(O_OPT) clean >/dev/null && $(MAKE) -s $(O_OPT) -C ../build clean >/dev/null)

$(run):
$(call clean)
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:05 PM
From: Jiri Olsa <jo...@kernel.org>

Put an extra blank line between the dependencies and the cmd_* definition
to make the .cmd file more readable.

Before:

$ cat .builtin-top.o.cmd
...
/home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
/home/jolsa/kernel/linux-perf/tools/include/linux/time64.h
cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-...
...

After:

$ cat .builtin-top.o.cmd
...
/home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
/home/jolsa/kernel/linux-perf/tools/include/linux/time64.h

cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-...
...

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-2-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/Build.include | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/build/Build.include b/tools/build/Build.include
index 62dcf0c7aac2..475152c52871 100644
--- a/tools/build/Build.include
+++ b/tools/build/Build.include
@@ -65,7 +65,7 @@ dep-cmd = $(if $(wildcard $(fixdep)),
printf '\# cannot find fixdep (%s)\n' $(fixdep) > $(dot-target).cmd; \
printf '\# using basic dep data\n\n' >> $(dot-target).cmd; \
cat $(depfile) >> $(dot-target).cmd; \
- printf '%s\n' 'cmd_$@ := $(make-cmd)' >> $(dot-target).cmd)
+ printf '\n%s\n' 'cmd_$@ := $(make-cmd)' >> $(dot-target).cmd)

###
# if_changed_dep - execute command if any prerequisite is newer than
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:05 PM
to
From: Wang Nan <wang...@huawei.com>

Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a
real BPF script.
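
A minimal sketch of the wrapper pattern used by util-cxx.h, assuming a
hypothetical C declaration block that uses 'new' as an identifier (the
struct and function names below are illustrative only, they are not taken
from util.h). The rename is active only while the C declarations are parsed,
so a C++ compiler never sees its keyword; code that touches the member later
has to use the renamed spelling:

```c
#include <assert.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Rename the C++ keyword only while the C declarations are parsed. */
#define new _new
/* Hypothetical stand-in for a C header that uses "new" as an identifier. */
struct node_link {
	int old;
	int new;	/* seen as "_new" by both C and C++ compilers */
};

int relink(struct node_link *link, int new);
#undef new

#ifdef __cplusplus
}
#endif

/* After the #undef, the member is spelled with its renamed form. */
int relink(struct node_link *link, int value)
{
	int previous;

	assert(link != NULL);
	previous = link->_new;
	link->old = previous;
	link->_new = value;
	return previous;
}
```

The system headers are included before the #define on purpose, so that the
rename cannot leak into them.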

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.config | 27 +++++++++++++++------------
tools/perf/tests/llvm.h | 7 +++++++
tools/perf/util/c++/clang-test.cpp | 17 ++++++++++++++---
tools/perf/util/util-cxx.h | 26 ++++++++++++++++++++++++++
4 files changed, 62 insertions(+), 15 deletions(-)
create mode 100644 tools/perf/util/util-cxx.h

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index b7c9c8051a33..09c2a9874f2f 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -212,24 +212,27 @@ ifeq ($(DEBUG),0)
endif
endif

-CFLAGS += -I$(src-perf)/util/include
-CFLAGS += -I$(src-perf)/arch/$(ARCH)/include
-CFLAGS += -I$(srctree)/tools/include/uapi
-CFLAGS += -I$(srctree)/tools/include/
-CFLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/uapi
-CFLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/
-CFLAGS += -I$(srctree)/tools/arch/$(ARCH)/
+INC_FLAGS += -I$(src-perf)/util/include
+INC_FLAGS += -I$(src-perf)/arch/$(ARCH)/include
+INC_FLAGS += -I$(srctree)/tools/include/uapi
+INC_FLAGS += -I$(srctree)/tools/include/
+INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/uapi
+INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/include/
+INC_FLAGS += -I$(srctree)/tools/arch/$(ARCH)/

# $(obj-perf) for generated common-cmds.h
# $(obj-perf)/util for generated bison/flex headers
ifneq ($(OUTPUT),)
-CFLAGS += -I$(obj-perf)/util
-CFLAGS += -I$(obj-perf)
+INC_FLAGS += -I$(obj-perf)/util
+INC_FLAGS += -I$(obj-perf)
endif

-CFLAGS += -I$(src-perf)/util
-CFLAGS += -I$(src-perf)
-CFLAGS += -I$(srctree)/tools/lib/
+INC_FLAGS += -I$(src-perf)/util
+INC_FLAGS += -I$(src-perf)
+INC_FLAGS += -I$(srctree)/tools/lib/
+
+CFLAGS += $(INC_FLAGS)
+CXXFLAGS += $(INC_FLAGS)

CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE

diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 0eaa604be99d..b83571758d83 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -1,6 +1,10 @@
#ifndef PERF_TEST_LLVM_H
#define PERF_TEST_LLVM_H

+#ifdef __cplusplus
+extern "C" {
+#endif
+
#include <stddef.h> /* for size_t */
#include <stdbool.h> /* for bool */

@@ -20,4 +24,7 @@ enum test_llvm__testcase {
int test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz,
enum test_llvm__testcase index, bool force,
bool *should_load_fail);
+#ifdef __cplusplus
+}
+#endif
#endif
diff --git a/tools/perf/util/c++/clang-test.cpp b/tools/perf/util/c++/clang-test.cpp
index 0f484fbb2b58..d84e760d2aab 100644
--- a/tools/perf/util/c++/clang-test.cpp
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -3,6 +3,10 @@
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"

+#include <util-cxx.h>
+#include <tests/llvm.h>
+#include <string>
+
class perf_clang_scope {
public:
explicit perf_clang_scope() {perf_clang__init();}
@@ -14,17 +18,24 @@ extern "C" {
int test__clang_to_IR(void)
{
perf_clang_scope _scope;
+ unsigned int kernel_version;
+
+ if (fetch_kernel_version(&kernel_version, NULL, 0))
+ return -1;
+
+ std::string cflag_kver("-DLINUX_VERSION_CODE=" +
+ std::to_string(kernel_version));

std::unique_ptr<llvm::Module> M =
- perf::getModuleFromSource({"-DRESULT=1"},
+ perf::getModuleFromSource({cflag_kver.c_str()},
"perf-test.c",
- "int myfunc(void) {return RESULT;}");
+ test_llvm__bpf_base_prog);

if (!M)
return -1;

for (llvm::Function& F : *M)
- if (F.getName() == "myfunc")
+ if (F.getName() == "bpf_func__SyS_epoll_wait")
return 0;
return -1;
}
diff --git a/tools/perf/util/util-cxx.h b/tools/perf/util/util-cxx.h
new file mode 100644
index 000000000000..0e0e019c9f34
--- /dev/null
+++ b/tools/perf/util/util-cxx.h
@@ -0,0 +1,26 @@
+/*
+ * Support C++ source use utilities defined in util.h
+ */
+
+#ifndef PERF_UTIL_UTIL_CXX_H
+#define PERF_UTIL_UTIL_CXX_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * Now 'new' is the only C++ keyword found in util.h:
+ * in tools/include/linux/rbtree.h
+ *
+ * Other keywords, like class and delete, should be
+ * redefined if necessary.
+ */
+#define new _new
+#include "util.h"
+#undef new
+
+#ifdef __cplusplus
+}
+#endif
+#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:05 PM
to
From: Wang Nan <wang...@huawei.com>

Add the C++ flags and link libraries needed to support builtin clang and
LLVM. Link against all LLVM and clang libraries, so that we don't need to
worry about clang changing its library layout. The drawback is that linking
perf takes much longer than usual.

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.config | 35 +++++++++++++++++++++++++++++++++++
tools/perf/Makefile.perf | 23 ++++++++++++++++++++++-
tools/perf/tests/make | 2 ++
3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 8a493d46fab9..b7c9c8051a33 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -136,6 +136,7 @@ endif
# Treat warnings as errors unless directed not to
ifneq ($(WERROR),0)
CFLAGS += -Werror
+ CXXFLAGS += -Werror
endif

ifndef DEBUG
@@ -182,6 +183,13 @@ CFLAGS += -Wall
CFLAGS += -Wextra
CFLAGS += -std=gnu99

+CXXFLAGS += -std=gnu++11 -fno-exceptions -fno-rtti
+CXXFLAGS += -Wall
+CXXFLAGS += -fno-omit-frame-pointer
+CXXFLAGS += -ggdb3
+CXXFLAGS += -funwind-tables
+CXXFLAGS += -Wno-strict-aliasing
+
# Enforce a non-executable stack, as we may regress (again) in the future by
# adding assembler files missing the .GNU-stack linker note.
LDFLAGS += -Wl,-z,noexecstack
@@ -783,6 +791,33 @@ ifndef NO_JVMTI
endif
endif

+USE_CXX = 0
+USE_CLANGLLVM = 0
+ifdef LIBCLANGLLVM
+ $(call feature_check,cxx)
+ ifneq ($(feature-cxx), 1)
+ msg := $(warning No g++ found, disable clang and llvm support. Please install g++)
+ else
+ $(call feature_check,llvm)
+ ifneq ($(feature-llvm), 1)
+ msg := $(warning No libLLVM found, disable clang and llvm support. Please install llvm-dev)
+ else
+ $(call feature_check,clang)
+ ifneq ($(feature-clang), 1)
+ msg := $(warning No libclang found, disable clang and llvm support. Please install libclang-dev)
+ else
+ CFLAGS += -DHAVE_LIBCLANGLLVM_SUPPORT
+ CXXFLAGS += -DHAVE_LIBCLANGLLVM_SUPPORT -I$(shell $(LLVM_CONFIG) --includedir)
+ $(call detected,CONFIG_CXX)
+ $(call detected,CONFIG_CLANGLLVM)
+ USE_CXX = 1
+ USE_LLVM = 1
+ USE_CLANG = 1
+ endif
+ endif
+ endif
+endif
+
# Among the variables below, these:
# perfexecdir
# template_dir
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 3cb1df43ad3e..dfb20dd31865 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -88,6 +88,10 @@ include ../scripts/utilities.mak
# and bypass the feature detection
#
# Define NO_JVMTI if you do not want jvmti agent built
+#
+# Define LIBCLANGLLVM if you DO want builtin clang and llvm support.
+# When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if
+# llvm-config is not in $PATH.

# As per kernel Makefile, avoid funny character set dependencies
unexport LC_ALL
@@ -143,6 +147,7 @@ endef
$(call allow-override,CC,$(CROSS_COMPILE)gcc)
$(call allow-override,AR,$(CROSS_COMPILE)ar)
$(call allow-override,LD,$(CROSS_COMPILE)ld)
+$(call allow-override,CXX,$(CROSS_COMPILE)g++)

LD += $(EXTRA_LDFLAGS)

@@ -151,6 +156,7 @@ HOSTLD ?= ld
HOSTAR ?= ar

PKG_CONFIG = $(CROSS_COMPILE)pkg-config
+LLVM_CONFIG ?= llvm-config

RM = rm -f
LN = ln -f
@@ -338,6 +344,21 @@ endif

LIBS = -Wl,--whole-archive $(PERFLIBS) -Wl,--no-whole-archive -Wl,--start-group $(EXTLIBS) -Wl,--end-group

+ifeq ($(USE_CLANG), 1)
+ CLANGLIBS_LIST = AST Basic CodeGen Driver Frontend Lex Tooling Edit Sema Analysis Parse Serialization
+ LIBCLANG = $(foreach l,$(CLANGLIBS_LIST),$(wildcard $(shell $(LLVM_CONFIG) --libdir)/libclang$(l).a))
+ LIBS += -Wl,--start-group $(LIBCLANG) -Wl,--end-group
+endif
+
+ifeq ($(USE_LLVM), 1)
+ LIBLLVM = $(shell $(LLVM_CONFIG) --libs all) $(shell $(LLVM_CONFIG) --system-libs)
+ LIBS += -L$(shell $(LLVM_CONFIG) --libdir) $(LIBLLVM)
+endif
+
+ifeq ($(USE_CXX), 1)
+ LIBS += -lstdc++
+endif
+
export INSTALL SHELL_PATH

### Build rules
@@ -356,7 +377,7 @@ strip: $(PROGRAMS) $(OUTPUT)perf

PERF_IN := $(OUTPUT)perf-in.o

-export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX AWK
+export srctree OUTPUT RM CC CXX LD AR CFLAGS CXXFLAGS V BISON FLEX AWK
export HOSTCC HOSTLD HOSTAR
include $(srctree)/tools/build/Makefile.include

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index 08ed7f12cc37..aa49b6600d1f 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -83,6 +83,7 @@ make_no_libbpf := NO_LIBBPF=1
make_no_libcrypto := NO_LIBCRYPTO=1
make_with_babeltrace:= LIBBABELTRACE=1
make_no_sdt := NO_SDT=1
+make_with_clangllvm := LIBCLANGLLVM=1
make_tags := tags
make_cscope := cscope
make_help := help
@@ -139,6 +140,7 @@ run += make_no_libbionic
run += make_no_auxtrace
run += make_no_libbpf
run += make_with_babeltrace
+run += make_with_clangllvm
run += make_help
run += make_doc
run += make_perf_o
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:06 PM
to
From: Wang Nan <wang...@huawei.com>

Pass a pointer to perf hook functions so they receive context
information during setup.
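
As an illustration of the idea, here is a reduced sketch of a hook table
that stores a context pointer next to each callback and hands it back on
invocation. It is not the perf implementation: the table layout and the
set_hook(), invoke_hook() and count_hook() names are assumptions made for
this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified registry; not the perf implementation. */
typedef void (*hook_func_t)(void *ctx);

struct hook_desc {
	const char *name;
	hook_func_t func;
	void *ctx;	/* opaque pointer handed back to func on invocation */
};

static struct hook_desc hooks[] = {
	{ .name = "record_start" },
	{ .name = "record_end" },
};

#define NR_HOOKS (sizeof(hooks) / sizeof(hooks[0]))

/* Register func and its context under name; returns 0, or -1 if unknown. */
int set_hook(const char *name, hook_func_t func, void *ctx)
{
	for (size_t i = 0; i < NR_HOOKS; i++) {
		if (strcmp(hooks[i].name, name) == 0) {
			hooks[i].func = func;
			hooks[i].ctx = ctx;
			return 0;
		}
	}
	return -1;
}

/* Call the named hook, if one is set, passing back the stored context. */
void invoke_hook(const char *name)
{
	for (size_t i = 0; i < NR_HOOKS; i++) {
		if (strcmp(hooks[i].name, name) == 0 && hooks[i].func)
			hooks[i].func(hooks[i].ctx);
	}
}

/* Example hook: counts invocations through the caller-owned context. */
void count_hook(void *ctx)
{
	int *counter = ctx;

	assert(counter != NULL);
	(*counter)++;
}
```

With the context owned by the caller, the hook needs no global state, which
is what lets the test in this patch replace the static hook_flags with a
local variable.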

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/perf-hooks.c | 14 +++++++++-----
tools/perf/util/perf-hooks.c | 10 +++++++---
tools/perf/util/perf-hooks.h | 6 ++++--
3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/perf-hooks.c b/tools/perf/tests/perf-hooks.c
index 9338cb2c25ab..665ecc19671c 100644
--- a/tools/perf/tests/perf-hooks.c
+++ b/tools/perf/tests/perf-hooks.c
@@ -15,13 +15,13 @@ static void sigsegv_handler(int sig __maybe_unused)
exit(-1);
}

-static int hook_flags;

-static void the_hook(void)
+static void the_hook(void *_hook_flags)
{
+ int *hook_flags = _hook_flags;
int *p = NULL;

- hook_flags = 1234;
+ *hook_flags = 1234;

/* Generate a segfault, test perf_hooks__recover */
*p = 0;
@@ -29,13 +29,17 @@ static void the_hook(void)

int test__perf_hooks(int subtest __maybe_unused)
{
+ int hook_flags = 0;
+
signal(SIGSEGV, sigsegv_handler);
- perf_hooks__set_hook("test", the_hook);
+ perf_hooks__set_hook("test", the_hook, &hook_flags);
perf_hooks__invoke_test();

/* hook is triggered? */
- if (hook_flags != 1234)
+ if (hook_flags != 1234) {
+ pr_debug("Setting failed: %d (%p)\n", hook_flags, &hook_flags);
return TEST_FAIL;
+ }

/* the buggy hook is removed? */
if (perf_hooks__get_hook("test"))
diff --git a/tools/perf/util/perf-hooks.c b/tools/perf/util/perf-hooks.c
index 4ce88e37dd63..cb368306b12b 100644
--- a/tools/perf/util/perf-hooks.c
+++ b/tools/perf/util/perf-hooks.c
@@ -27,7 +27,7 @@ void perf_hooks__invoke(const struct perf_hook_desc *desc)
*(current_perf_hook->p_hook_func) = NULL;
} else {
current_perf_hook = desc;
- (**desc->p_hook_func)();
+ (**desc->p_hook_func)(desc->hook_ctx);
}
current_perf_hook = NULL;
}
@@ -41,7 +41,9 @@ void perf_hooks__recover(void)
#define PERF_HOOK(name) \
perf_hook_func_t __perf_hook_func_##name = NULL; \
struct perf_hook_desc __perf_hook_desc_##name = \
- {.hook_name = #name, .p_hook_func = &__perf_hook_func_##name};
+ {.hook_name = #name, \
+ .p_hook_func = &__perf_hook_func_##name, \
+ .hook_ctx = NULL};
#include "perf-hooks-list.h"
#undef PERF_HOOK

@@ -54,7 +56,8 @@ static struct perf_hook_desc *perf_hooks[] = {
#undef PERF_HOOK

int perf_hooks__set_hook(const char *hook_name,
- perf_hook_func_t hook_func)
+ perf_hook_func_t hook_func,
+ void *hook_ctx)
{
unsigned int i;

@@ -65,6 +68,7 @@ int perf_hooks__set_hook(const char *hook_name,
if (*(perf_hooks[i]->p_hook_func))
pr_warning("Overwrite existing hook: %s\n", hook_name);
*(perf_hooks[i]->p_hook_func) = hook_func;
+ perf_hooks[i]->hook_ctx = hook_ctx;
return 0;
}
return -ENOENT;
diff --git a/tools/perf/util/perf-hooks.h b/tools/perf/util/perf-hooks.h
index 1d482b26b4b9..838d5797bc1e 100644
--- a/tools/perf/util/perf-hooks.h
+++ b/tools/perf/util/perf-hooks.h
@@ -5,10 +5,11 @@
extern "C" {
#endif

-typedef void (*perf_hook_func_t)(void);
+typedef void (*perf_hook_func_t)(void *ctx);
struct perf_hook_desc {
const char * const hook_name;
perf_hook_func_t * const p_hook_func;
+ void *hook_ctx;
};

extern void perf_hooks__invoke(const struct perf_hook_desc *);
@@ -26,7 +27,8 @@ static inline void perf_hooks__invoke_##name(void) \

extern int
perf_hooks__set_hook(const char *hook_name,
- perf_hook_func_t hook_func);
+ perf_hook_func_t hook_func,
+ void *hook_ctx);

extern perf_hook_func_t
perf_hooks__get_hook(const char *hook_name);
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:06 PM
to
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

For jump instructions that do not include the target address as a direct
operand, show the original disassembled line. This is needed for certain
powerpc jump instructions that take the target address from a register
(such as bctr, btar, ...).
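
The fallback can be sketched in isolation as follows. This is a reduced
stand-in, not the annotate.c code: struct jump_ops and format_jump() are
hypothetical names, and only the "no parsed target means print the raw
operands" decision mirrors the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical, reduced stand-in for perf's parsed jump operands. */
struct jump_ops {
	const char *raw;	/* operand text as disassembled, may be empty */
	uint64_t target_addr;	/* 0 when no direct target was parsed */
	uint64_t target_offset;	/* offset of the target within the symbol */
};

/*
 * Format a jump for display. With a parsed target, print the offset;
 * otherwise fall back to the raw disassembly so that register-indirect
 * jumps such as "bctr" are not shown with a bogus address.
 */
int format_jump(char *buf, size_t size, const char *name,
		const struct jump_ops *ops)
{
	assert(buf && name && ops && ops->raw);

	if (!ops->target_addr)
		return snprintf(buf, size, "%-6.6s %s", name, ops->raw);

	return snprintf(buf, size, "%-6.6s %" PRIx64, name, ops->target_offset);
}
```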

Before:
ld r12,32088(r12)
mtctr r12
v bctr ffffffffffffca2c
std r2,24(r1)
addis r12,r2,-1

After:
ld r12,32088(r12)
mtctr r12
v bctr
std r2,24(r1)
addis r12,r2,-1

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

.__bpf_prog_run vmlinux.powerpc
│ std r10,512(r9) ▒
│ lbz r9,0(r31) ▒
│ rldicr r9,r9,3,60 ▒
│ ldx r9,r30,r9 ▒
│ mtctr r9 ▒
100.00 │ ↓ bctr 3fffffffffe01510 ▒
│ lwa r10,4(r31) ▒
│ lwz r9,0(r31) ▒
<SNIP>
Invalid jump offset: 3fffffffffe01510

After:

.__bpf_prog_run vmlinux.powerpc
│ std r10,512(r9) ▒
│ lbz r9,0(r31) ▒
│ rldicr r9,r9,3,60 ▒
│ ldx r9,r30,r9 ▒
│ mtctr r9 ▒
100.00 │ ↓ bctr ▒
│ lwa r10,4(r31) ▒
│ lwz r9,0(r31) ▒
<SNIP>
Invalid jump offset: 3fffffffffe01510

This, in turn, uncovers another problem with jumps without operands: the
ENTER/-> operation, used to jump to the target, still uses the bogus
target :-)

BTW, this was the file used for the above tests:

[acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev
# ========
# captured on: Thu Nov 24 12:40:38 2016
# hostname : pdev-f22-qemu
# os release : 4.4.10-200.fc22.ppc64
# perf version : 4.9.rc1.g6298ce
# arch : ppc64
# nrcpus online : 48
# nrcpus avail : 48
# cpudesc : POWER7 (architected), altivec supported
# cpuid : 74,513
# total memory : 4158976 kB
# cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
# event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
# missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
# ========
#
[acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Taeung Song <treeze...@gmail.com>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-1-git-s...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4012b1de2813..ea7e0de4b9c1 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op
static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops)
{
+ if (!ops->target.addr)
+ return ins__raw_scnprintf(ins, bf, size, ops);
+
return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target.offset);
}

--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:07 PM
to
From: Wang Nan <wang...@huawei.com>

Use clang's OverlayFileSystem facility to allow CompilerInstance to
access the real file system.

With this patch the '#include' directive can be used.

Add a new getModuleFromSource() variant that takes the path of a real file.
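
To illustrate the layering idea only, below is a conceptual sketch in plain
C of an overlay lookup: the most recently pushed layer is consulted first
and lower layers act as a fallback. It does not use the clang API; all names
are hypothetical, and both layers are in-memory tables here, with the base
layer standing in for the real file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mem_file {
	const char *path;
	const char *content;
};

/* One layer: a lookup function plus its private data. */
struct fs_layer {
	const char *(*lookup)(const void *data, const char *path);
	const void *data;
};

#define MAX_LAYERS 4

struct overlay_fs {
	struct fs_layer layers[MAX_LAYERS];
	size_t nr;
};

/* Put a new layer on top of the stack; returns 0, or -1 if full. */
int overlay_push(struct overlay_fs *fs, struct fs_layer layer)
{
	if (fs->nr == MAX_LAYERS)
		return -1;
	fs->layers[fs->nr++] = layer;
	return 0;
}

/* Search from the most recently pushed layer down to the base layer. */
const char *overlay_lookup(const struct overlay_fs *fs, const char *path)
{
	assert(fs && path);

	for (size_t i = fs->nr; i-- > 0; ) {
		const char *content;

		content = fs->layers[i].lookup(fs->layers[i].data, path);
		if (content)
			return content;
	}
	return NULL;
}

/* Lookup for a layer backed by an array of mem_file ending in a NULL path. */
const char *mem_lookup(const void *data, const char *path)
{
	for (const struct mem_file *f = data; f->path; f++) {
		if (strcmp(f->path, path) == 0)
			return f->content;
	}
	return NULL;
}
```

In this model the in-memory source shadows a same-named file in the base
layer, while headers that exist only in the base layer remain reachable.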

Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161126070354.1...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/c++/clang.cpp | 44 +++++++++++++++++++++++++++++++------------
tools/perf/util/c++/clang.h | 3 +++
2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index c17b1176e25d..cf96199b4b6f 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -15,6 +15,7 @@
#include "clang/Tooling/Tooling.h"
#include "llvm/IR/Module.h"
#include "llvm/Option/Option.h"
+#include "llvm/Support/FileSystem.h"
#include "llvm/Support/ManagedStatic.h"
#include <memory>

@@ -27,14 +28,6 @@ static std::unique_ptr<llvm::LLVMContext> LLVMCtx;

using namespace clang;

-static vfs::InMemoryFileSystem *
-buildVFS(StringRef& Name, StringRef& Content)
-{
- vfs::InMemoryFileSystem *VFS = new vfs::InMemoryFileSystem(true);
- VFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));
- return VFS;
-}
-
static CompilerInvocation *
createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
{
@@ -60,17 +53,17 @@ createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
return CI;
}

-std::unique_ptr<llvm::Module>
-getModuleFromSource(StringRef Name, StringRef Content)
+static std::unique_ptr<llvm::Module>
+getModuleFromSource(StringRef Path,
+ IntrusiveRefCntPtr<vfs::FileSystem> VFS)
{
CompilerInstance Clang;
Clang.createDiagnostics();

- IntrusiveRefCntPtr<vfs::FileSystem> VFS = buildVFS(Name, Content);
Clang.setVirtualFileSystem(&*VFS);

IntrusiveRefCntPtr<CompilerInvocation> CI =
- createCompilerInvocation(Name, Clang.getDiagnostics());
+ createCompilerInvocation(Path, Clang.getDiagnostics());
Clang.setInvocation(&*CI);

std::unique_ptr<CodeGenAction> Act(new EmitLLVMOnlyAction(&*LLVMCtx));
@@ -80,6 +73,33 @@ getModuleFromSource(StringRef Name, StringRef Content)
return Act->takeModule();
}

+std::unique_ptr<llvm::Module>
+getModuleFromSource(StringRef Name, StringRef Content)
+{
+ using namespace vfs;
+
+ llvm::IntrusiveRefCntPtr<OverlayFileSystem> OverlayFS(
+ new OverlayFileSystem(getRealFileSystem()));
+ llvm::IntrusiveRefCntPtr<InMemoryFileSystem> MemFS(
+ new InMemoryFileSystem(true));
+
+ /*
+ * pushOverlay helps setting working dir for MemFS. Must call
+ * before addFile.
+ */
+ OverlayFS->pushOverlay(MemFS);
+ MemFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));
+
+ return getModuleFromSource(Name, OverlayFS);
+}
+
+std::unique_ptr<llvm::Module>
+getModuleFromSource(StringRef Path)
+{
+ IntrusiveRefCntPtr<vfs::FileSystem> VFS(vfs::getRealFileSystem());
+ return getModuleFromSource(Path, VFS);
+}
+
}

extern "C" {
diff --git a/tools/perf/util/c++/clang.h b/tools/perf/util/c++/clang.h
index f64483be43d0..90aff0162f1c 100644
--- a/tools/perf/util/c++/clang.h
+++ b/tools/perf/util/c++/clang.h
@@ -12,5 +12,8 @@ using namespace llvm;
std::unique_ptr<Module>
getModuleFromSource(StringRef Name, StringRef Content);

+std::unique_ptr<Module>
+getModuleFromSource(StringRef Path);
+
}
#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:07 PM
to
From: Jiri Olsa <jo...@kernel.org>

Add some non-config targets that were missing from
NON_CONFIG_TARGETS.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-7-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 2784b5843aef..10495c9dbe71 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -177,7 +177,7 @@ SUBCMD_DIR = $(srctree)/tools/lib/subcmd/
# non-config cases
config := 1

-NON_CONFIG_TARGETS := clean TAGS tags cscope help install-doc
+NON_CONFIG_TARGETS := clean TAGS tags cscope help install-doc install-man install-html install-info install-pdf doc man html info pdf

ifdef MAKECMDGOALS
ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:07 PM
to
From: Jiri Olsa <jo...@kernel.org>

The upcoming fixdep fix needs all targets in the same area, so that they
fit under a single condition block.

Move install-gtk target into the rules section.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-4-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index dfb20dd31865..593d765d62ab 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -331,11 +331,6 @@ endif
ifndef NO_GTK2
ALL_PROGRAMS += $(OUTPUT)libperf-gtk.so
GTK_IN := $(OUTPUT)gtk-in.o
-
-install-gtk: $(OUTPUT)libperf-gtk.so
- $(call QUIET_INSTALL, 'GTK UI') \
- $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(libdir_SQ)'; \
- $(INSTALL) $(OUTPUT)libperf-gtk.so '$(DESTDIR_SQ)$(libdir_SQ)'
endif

ifdef ASCIIDOC8
@@ -712,7 +707,14 @@ check: $(OUTPUT)common-cmds.h

### Installation rules

+ifndef NO_GTK2
+install-gtk: $(OUTPUT)libperf-gtk.so
+ $(call QUIET_INSTALL, 'GTK UI') \
+ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(libdir_SQ)'; \
+ $(INSTALL) $(OUTPUT)libperf-gtk.so '$(DESTDIR_SQ)$(libdir_SQ)'
+else
install-gtk:
+endif

install-tools: all install-gtk
$(call QUIET_INSTALL, binaries) \
--
2.9.3

Arnaldo Carvalho de Melo

Dec 5, 2016, 4:50:09 PM
to
From: Jiri Olsa <jo...@kernel.org>

The upcoming fixdep fix needs all targets in the same area, so that they
fit under a single condition block.

Move the python/perf.so target into the rules section and intentionally
remove the comment about Perl scripts.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-5-g...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 593d765d62ab..2784b5843aef 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -268,17 +268,6 @@ python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT
PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
PYTHON_EXT_DEPS := util/python-ext-sources util/setup.py $(LIBTRACEEVENT) $(LIBAPI)

-$(OUTPUT)python/perf.so: $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) $(LIBTRACEEVENT_DYNAMIC_LIST)
- $(QUIET_GEN)LDSHARED="$(CC) -pthread -shared" \
- CFLAGS='$(CFLAGS)' LDFLAGS='$(LDFLAGS) $(LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS)' \
- $(PYTHON_WORD) util/setup.py \
- --quiet build_ext; \
- mkdir -p $(OUTPUT)python && \
- cp $(PYTHON_EXTBUILD_LIB)perf.so $(OUTPUT)python/
-#
-# No Perl scripts right now:
-#
-
SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH))

PROGRAMS += $(OUTPUT)perf
@@ -362,6 +351,14 @@ SHELL = $(SHELL_PATH)

all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)

+$(OUTPUT)python/perf.so: $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) $(LIBTRACEEVENT_DYNAMIC_LIST)
+ $(QUIET_GEN)LDSHARED="$(CC) -pthread -shared" \
+ CFLAGS='$(CFLAGS)' LDFLAGS='$(LDFLAGS) $(LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS)' \
+ $(PYTHON_WORD) util/setup.py \
+ --quiet build_ext; \
+ mkdir -p $(OUTPUT)python && \
+ cp $(PYTHON_EXTBUILD_LIB)perf.so $(OUTPUT)python/
+
please_set_SHELL_PATH_to_a_more_modern_shell:
$(Q)$$(:)

--
2.9.3

Ingo Molnar

Dec 6, 2016, 3:20:04 AM
to

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

Dec 7, 2016, 12:00:06 PM
to
From: Namhyung Kim <namh...@kernel.org>

The callchain_cursor__copy() function saves the callchain currently
captured by a cursor. It will be used to keep the callchain when switching
to the idle task on each cpu.
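
The copy loop can be sketched with a reduced cursor type. This is an
assumption-laden model, not the perf code: the chain_* and cursor_* names
are hypothetical, nodes carry only an ip, and reset frees the nodes here,
whereas this sketch makes no claim about how perf manages its nodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced node: only the instruction pointer is kept. */
struct chain_node {
	uint64_t ip;
	struct chain_node *next;
};

struct chain_cursor {
	struct chain_node *first;
	struct chain_node **last;	/* where the next append is linked in */
	size_t nr;
};

void cursor_init(struct chain_cursor *c)
{
	c->first = NULL;
	c->last = &c->first;
	c->nr = 0;
}

/* Free all nodes and return the cursor to its empty state. */
void cursor_reset(struct chain_cursor *c)
{
	struct chain_node *n = c->first;

	while (n) {
		struct chain_node *next = n->next;

		free(n);
		n = next;
	}
	cursor_init(c);
}

/* Append one entry at the tail; returns 0, or -1 on allocation failure. */
int cursor_append(struct chain_cursor *c, uint64_t ip)
{
	struct chain_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->ip = ip;
	n->next = NULL;
	*c->last = n;
	c->last = &n->next;
	c->nr++;
	return 0;
}

/* Replace dst's contents with a deep copy of src; dst must be initialized. */
int cursor_copy(struct chain_cursor *dst, const struct chain_cursor *src)
{
	assert(dst != src);

	cursor_reset(dst);
	for (const struct chain_node *n = src->first; n; n = n->next) {
		if (cursor_append(dst, n->ip))
			return -1;
	}
	return 0;
}
```

Because the copy owns its nodes, the source cursor can be reset and reused
for the next sample without disturbing the saved callchain.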

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/callchain.c | 27 +++++++++++++++++++++++++++
tools/perf/util/callchain.h | 3 +++
2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 823befd8209a..42922512c1c6 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1234,3 +1234,30 @@ int callchain_node__make_parent_list(struct callchain_node *node)
}
return -ENOMEM;
}
+
+int callchain_cursor__copy(struct callchain_cursor *dst,
+ struct callchain_cursor *src)
+{
+ int rc = 0;
+
+ callchain_cursor_reset(dst);
+ callchain_cursor_commit(src);
+
+ while (true) {
+ struct callchain_cursor_node *node;
+
+ node = callchain_cursor_current(src);
+ if (node == NULL)
+ break;
+
+ rc = callchain_cursor_append(dst, node->ip, node->map, node->sym,
+ node->branch, &node->branch_flags,
+ node->nr_loop_iter, node->samples);
+ if (rc)
+ break;
+
+ callchain_cursor_advance(src);
+ }
+
+ return rc;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index d9c70dccf06a..35c8e379530f 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -216,6 +216,9 @@ static inline void callchain_cursor_advance(struct callchain_cursor *cursor)
cursor->pos++;
}

+int callchain_cursor__copy(struct callchain_cursor *dst,
+ struct callchain_cursor *src);
+
struct option;
struct hist_entry;

--
2.9.3

Arnaldo Carvalho de Melo

Dec 7, 2016, 12:00:07 PM
to
From: Wang Nan <wang...@huawei.com>

Disable builtin LLVM and clang support when the LLVM version is older
than 3.9.0: the following commits use a newer API.

Since Clang/LLVM's API is not guaranteed to be stable, add a
test-llvm-version.cpp feature checker and issue a warning if the LLVM found
in the build environment has not been tested yet.
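
The version comparison relies on packing major, minor and patch into one
integer. A small sketch of that arithmetic follows; the function names are
made up for this example, and the feature test itself does the same
computation with preprocessor macros at compile time:

```c
#include <assert.h>

/* Minimum accepted version, 3.9.0, in the packed encoding. */
#define MIN_VERSION_CODE 0x030900u

/* Pack major.minor.patch as (major << 16) + (minor << 8) + patch. */
unsigned int version_code(unsigned int major, unsigned int minor,
			  unsigned int patch)
{
	/* minor and patch must each fit in 8 bits for ordering to hold */
	assert(minor < 256 && patch < 256);
	return (major << 16) + (minor << 8) + patch;
}

/* Non-zero when the given version is at least 3.9.0. */
int version_is_supported(unsigned int major, unsigned int minor,
			 unsigned int patch)
{
	return version_code(major, minor, patch) >= MIN_VERSION_CODE;
}
```

With this encoding 3.8.0 becomes 0x030800 and is rejected, while 3.9.0
(0x030900) and 4.0.0 (0x040000) are accepted.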

Committer Notes:

Testing it:

Environment:

$ cat /etc/fedora-release
Fedora release 25 (Twenty Five)
$ rpm -q llvm-devel clang-devel
llvm-devel-3.8.0-1.fc25.x86_64
clang-devel-3.8.0-2.fc25.x86_64
$

Before:

$ make -k LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin
make: Entering directory '/home/acme/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
Warning: tools/include/uapi/linux/bpf.h differs from kernel
Warning: tools/arch/arm/include/uapi/asm/kvm.h differs from kernel
INSTALL GTK UI
LINK /tmp/build/perf/perf
/tmp/build/perf/libperf.a(libperf-in.o): In function `perf::createCompilerInvocation(llvm::SmallVector<char const*, 16u>, llvm::StringRef&, clang::DiagnosticsEngine&)':
/home/acme/git/linux/tools/perf/util/c++/clang.cpp:56: undefined reference to `clang::tooling::newInvocation(clang::DiagnosticsEngine*, llvm::SmallVector<char const*, 16u> const&)'
/tmp/build/perf/libperf.a(libperf-in.o): In function `perf::getModuleFromSource(llvm::SmallVector<char const*, 16u>, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)':
/home/acme/git/linux/tools/perf/util/c++/clang.cpp:68: undefined reference to `clang::CompilerInstance::CompilerInstance(std::shared_ptr<clang::PCHContainerOperations>, bool)'
/home/acme/git/linux/tools/perf/util/c++/clang.cpp:69: undefined reference to `clang::CompilerInstance::createDiagnostics(clang::DiagnosticConsumer*, bool)'
<SNIP>

After:

Makefile.config:807: No suitable libLLVM found, disabling builtin clang and llvm support. Please install llvm-dev(el) (>= 3.9.0)

After updating the environment to a locally built LLVM 4.0 + clang 3.9
(forgot to git pull, duh) combo, all works as expected: it is properly
detected and built into the resulting perf binary.

Signed-off-by: Wang Nan <wang...@huawei.com>
Reported-and-Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Stringer <j...@ovn.org>
Cc: Zefan Li <liz...@huawei.com>
Cc: pi3o...@163.com
Link: http://lkml.kernel.org/r/20161206072230....@huawei.com
[ Change the warning message a bit (add 'suitable' and 'builtin'), clarifying it, see committer notes above ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/build/feature/Makefile | 8 ++++++--
tools/build/feature/test-llvm-version.cpp | 11 +++++++++++
tools/build/feature/test-llvm.cpp | 5 +++++
tools/perf/Makefile.config | 8 ++++++--
4 files changed, 28 insertions(+), 4 deletions(-)
create mode 100644 tools/build/feature/test-llvm-version.cpp

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 303196c16019..b564a2eea039 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -231,14 +231,18 @@ $(OUTPUT)test-jvmti.bin:
$(BUILD)

$(OUTPUT)test-llvm.bin:
- $(BUILDXX) -std=gnu++11 \
+ $(BUILDXX) -std=gnu++11 \
-I$(shell $(LLVM_CONFIG) --includedir) \
-L$(shell $(LLVM_CONFIG) --libdir) \
$(shell $(LLVM_CONFIG) --libs Core BPF) \
$(shell $(LLVM_CONFIG) --system-libs)

+$(OUTPUT)test-llvm-version.bin:
+ $(BUILDXX) -std=gnu++11 \
+ -I$(shell $(LLVM_CONFIG) --includedir)
+
$(OUTPUT)test-clang.bin:
- $(BUILDXX) -std=gnu++11 \
+ $(BUILDXX) -std=gnu++11 \
-I$(shell $(LLVM_CONFIG) --includedir) \
-L$(shell $(LLVM_CONFIG) --libdir) \
-Wl,--start-group -lclangBasic -lclangDriver \
diff --git a/tools/build/feature/test-llvm-version.cpp b/tools/build/feature/test-llvm-version.cpp
new file mode 100644
index 000000000000..896d31724568
--- /dev/null
+++ b/tools/build/feature/test-llvm-version.cpp
@@ -0,0 +1,11 @@
+#include <cstdio>
+#include "llvm/Config/llvm-config.h"
+
+#define NUM_VERSION (((LLVM_VERSION_MAJOR) << 16) + (LLVM_VERSION_MINOR << 8) + LLVM_VERSION_PATCH)
+#define pass int main() {printf("%x\n", NUM_VERSION); return 0;}
+
+#if NUM_VERSION >= 0x030900
+pass
+#else
+# error This LLVM is not tested yet.
+#endif
diff --git a/tools/build/feature/test-llvm.cpp b/tools/build/feature/test-llvm.cpp
index d8d2cee35345..455a332dc8a8 100644
--- a/tools/build/feature/test-llvm.cpp
+++ b/tools/build/feature/test-llvm.cpp
@@ -1,5 +1,10 @@
#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/raw_ostream.h"
+#define NUM_VERSION (((LLVM_VERSION_MAJOR) << 16) + (LLVM_VERSION_MINOR << 8) + LLVM_VERSION_PATCH)
+
+#if NUM_VERSION < 0x030900
+# error "LLVM version too low"
+#endif
int main()
{
llvm::errs() << "Hello World!\n";
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 09c2a9874f2f..76c84f0eec52 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -802,12 +802,13 @@ ifdef LIBCLANGLLVM
msg := $(warning No g++ found, disable clang and llvm support. Please install g++)
else
$(call feature_check,llvm)
+ $(call feature_check,llvm-version)
ifneq ($(feature-llvm), 1)
- msg := $(warning No libLLVM found, disable clang and llvm support. Please install llvm-dev)
+ msg := $(warning No suitable libLLVM found, disabling builtin clang and LLVM support. Please install llvm-dev(el) (>= 3.9.0))
else
$(call feature_check,clang)
ifneq ($(feature-clang), 1)
- msg := $(warning No libclang found, disable clang and llvm support. Please install libclang-dev)
+ msg := $(warning No suitable libclang found, disabling builtin clang and LLVM support. Please install libclang-dev(el) (>= 3.9.0))
else
CFLAGS += -DHAVE_LIBCLANGLLVM_SUPPORT
CXXFLAGS += -DHAVE_LIBCLANGLLVM_SUPPORT -I$(shell $(LLVM_CONFIG) --includedir)
@@ -816,6 +817,9 @@ ifdef LIBCLANGLLVM
USE_CXX = 1
USE_LLVM = 1
USE_CLANG = 1
+ ifneq ($(feature-llvm-version),1)
+ msg := $(warning This version of LLVM is not tested. May cause build errors)
+ endif
endif
endif
endif
--
2.9.3

Arnaldo Carvalho de Melo
Dec 7, 2016, 12:00:07 PM
From: Jiri Olsa <jo...@kernel.org>

The fixdep tool needs to be built before everything else, because it fixes
every object dependency file.

We currently handle this by making all objects depend on fixdep, which is
error prone and easily forgotten when a new object is added.

Instead, this patch forces the fixdep tool to be built as the first target
in a separate make session. This way we don't need to handle extra fixdep
dependencies and we are certain there's no fixdep race with any parallel make
job.

Committer notes:

Testing it:

Before:

$ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
make: Entering directory '/home/acme/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build

Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]

GEN /tmp/build/perf/common-cmds.h
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
MKDIR /tmp/build/perf/pmu-events/
HOSTCC /tmp/build/perf/pmu-events/json.o
MKDIR /tmp/build/perf/pmu-events/
HOSTCC /tmp/build/perf/pmu-events/jsmn.o
HOSTCC /tmp/build/perf/pmu-events/jevents.o
HOSTLD /tmp/build/perf/pmu-events/jevents-in.o
PERF_VERSION = 4.9.rc8.g868cd5
CC /tmp/build/perf/perf-read-vdso32
<SNIP>

After:

$ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
make: Entering directory '/home/acme/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep

Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]

GEN /tmp/build/perf/common-cmds.h
MKDIR /tmp/build/perf/fd/
CC /tmp/build/perf/fd/array.o
LD /tmp/build/perf/fd/libapi-in.o
MKDIR /tmp/build/perf/fs/
CC /tmp/build/perf/event-parse.o
CC /tmp/build/perf/fs/fs.o
PERF_VERSION = 4.9.rc8.g57a92f
CC /tmp/build/perf/event-plugin.o
MKDIR /tmp/build/perf/fs/
CC /tmp/build/perf/fs/tracing_path.o
<SNIP>

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-3-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 50 ++++++++++++++++++++++++++++++++++++------------
1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 9e5a6e1a387d..33b1d9f8555f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -181,6 +181,35 @@ ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
endif
endif

+# The fixdep build - we force fixdep tool to be built as
+# the first target in the separate make session not to be
+# disturbed by any parallel make jobs. Once fixdep is done
+# we issue the requested build with FIXDEP=1 variable.
+#
+# The fixdep build is disabled for $(NON_CONFIG_TARGETS)
+# targets, because it's not necessary.
+
+ifdef FIXDEP
+ force_fixdep := 0
+else
+ force_fixdep := $(config)
+endif
+
+export srctree OUTPUT RM CC CXX LD AR CFLAGS CXXFLAGS V BISON FLEX AWK
+export HOSTCC HOSTLD HOSTAR
+
+include $(srctree)/tools/build/Makefile.include
+
+ifeq ($(force_fixdep),1)
+goals := $(filter-out all sub-make, $(MAKECMDGOALS))
+
+$(goals) all: sub-make
+
+sub-make: fixdep
+ $(Q)$(MAKE) FIXDEP=1 -f Makefile.perf $(goals)
+
+else # force_fixdep
+
# Set FEATURE_TESTS to 'all' so all possible feature checkers are executed.
# Without this setting the output feature dump file misses some features, for
# example, liberty. Select all checkers so we won't get an incomplete feature
@@ -365,10 +394,6 @@ strip: $(PROGRAMS) $(OUTPUT)perf

PERF_IN := $(OUTPUT)perf-in.o

-export srctree OUTPUT RM CC CXX LD AR CFLAGS CXXFLAGS V BISON FLEX AWK
-export HOSTCC HOSTLD HOSTAR
-include $(srctree)/tools/build/Makefile.include
-
JEVENTS := $(OUTPUT)pmu-events/jevents
JEVENTS_IN := $(OUTPUT)pmu-events/jevents-in.o

@@ -487,7 +512,7 @@ $(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) $(LIBTRACEEVENT_DYNAMIC_L
$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS) \
$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@

-$(GTK_IN): fixdep FORCE
+$(GTK_IN): FORCE
$(Q)$(MAKE) $(build)=gtk

$(OUTPUT)libperf-gtk.so: $(GTK_IN) $(PERFLIBS)
@@ -536,7 +561,7 @@ endif
__build-dir = $(subst $(OUTPUT),,$(dir $@))
build-dir = $(if $(__build-dir),$(__build-dir),.)

-prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h fixdep archheaders
+prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders

$(OUTPUT)%.o: %.c prepare FORCE
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
@@ -586,7 +611,7 @@ $(patsubst perf-%,%.o,$(PROGRAMS)): $(wildcard */*.h)

LIBPERF_IN := $(OUTPUT)libperf-in.o

-$(LIBPERF_IN): prepare fixdep FORCE
+$(LIBPERF_IN): prepare FORCE
$(Q)$(MAKE) $(build)=libperf

$(LIB_FILE): $(LIBPERF_IN)
@@ -594,10 +619,10 @@ $(LIB_FILE): $(LIBPERF_IN)

LIBTRACEEVENT_FLAGS += plugin_dir=$(plugindir_SQ)

-$(LIBTRACEEVENT): fixdep FORCE
+$(LIBTRACEEVENT): FORCE
$(Q)$(MAKE) -C $(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) O=$(OUTPUT) $(OUTPUT)libtraceevent.a

-libtraceevent_plugins: fixdep FORCE
+libtraceevent_plugins: FORCE
$(Q)$(MAKE) -C $(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) O=$(OUTPUT) plugins

$(LIBTRACEEVENT_DYNAMIC_LIST): libtraceevent_plugins
@@ -610,21 +635,21 @@ $(LIBTRACEEVENT)-clean:
install-traceevent-plugins: libtraceevent_plugins
$(Q)$(MAKE) -C $(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) O=$(OUTPUT) install_plugins

-$(LIBAPI): fixdep FORCE
+$(LIBAPI): FORCE
$(Q)$(MAKE) -C $(LIB_DIR) O=$(OUTPUT) $(OUTPUT)libapi.a

$(LIBAPI)-clean:
$(call QUIET_CLEAN, libapi)
$(Q)$(MAKE) -C $(LIB_DIR) O=$(OUTPUT) clean >/dev/null

-$(LIBBPF): fixdep FORCE
+$(LIBBPF): FORCE
$(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) $(OUTPUT)libbpf.a FEATURES_DUMP=$(FEATURE_DUMP_EXPORT)

$(LIBBPF)-clean:
$(call QUIET_CLEAN, libbpf)
$(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) clean >/dev/null

-$(LIBSUBCMD): fixdep FORCE
+$(LIBSUBCMD): FORCE
$(Q)$(MAKE) -C $(SUBCMD_DIR) O=$(OUTPUT) $(OUTPUT)libsubcmd.a

$(LIBSUBCMD)-clean:
@@ -832,3 +857,4 @@ FORCE:
.PHONY: $(GIT-HEAD-PHONY) TAGS tags cscope FORCE prepare
.PHONY: libtraceevent_plugins archheaders

+endif # force_fixdep
--
2.9.3

Arnaldo Carvalho de Melo
Dec 7, 2016, 12:00:08 PM
Hi Ingo,

Please consider pulling; this should get linux-next free of perf build
fixdep-related race conditions on high core count machines,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 34c4a42791bbc455e65a15d12dcd0b6b3c52ad13:

Merge tag 'perf-core-for-mingo-20161205' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-12-06 09:14:56 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161207

for you to fetch changes up to 108a7c103b761309ccbd997002e8428808cf1e04:

perf tools: Explicitly document that --children is enabled by default (2016-12-07 12:00:35 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

Improvements:

- Improve error message when analyzing a file without the required events in
'perf sched timehist' (David Ahern)

Fixes:

- Force fixdep compilation to be done at the start of the build, fixing
some build race conditions on high core count machines (Jiri Olsa)

- Fix handling a zero sample->tid in 'perf sched timehist', as
sometimes that isn't the idle thread (Namhyung Kim)

Infrastructure:

- Check minimal accepted LLVM version in its feature check, 3.9 at this
time (Wang Nan)

Documentation:

- Explicitly document that --children is enabled by default (Yannick Brosseau)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
David Ahern (1):
perf sched timehist: Improve error message when analyzing wrong file

Jiri Olsa (3):
perf tools: Move PERF-VERSION-FILE target into rules area
perf tools: Force fixdep compilation at the start of the build
perf tools: Move perf build related variables under non fixdep leg

Namhyung Kim (4):
perf sched: Cleanup option processing
perf callchain: Introduce callchain_cursor__copy()
perf sched timehist: Handle zero sample->tid properly
perf sched timehist: Cleanup idle_max_cpu handling

Wang Nan (1):
perf build: Check LLVM version in feature check

Yannick Brosseau (1):
perf tools: Explicitly document that --children is enabled by default

tools/build/feature/Makefile | 8 +++-
tools/build/feature/test-llvm-version.cpp | 11 +++++
tools/build/feature/test-llvm.cpp | 5 +++
tools/perf/Documentation/perf-report.txt | 3 +-
tools/perf/Documentation/perf-top.txt | 1 +
tools/perf/Makefile.config | 8 +++-
tools/perf/Makefile.perf | 68 +++++++++++++++++++++----------
tools/perf/builtin-sched.c | 26 ++++++------
tools/perf/util/callchain.c | 27 ++++++++++++
tools/perf/util/callchain.h | 3 ++
10 files changed, 122 insertions(+), 38 deletions(-)
create mode 100644 tools/build/feature/test-llvm-version.cpp
51.1: builtin clang compile C source to IR : Ok
51.2: builtin clang compile C source to ELF object: Ok
52: x86 rdpmc : Ok
53: Convert perf time to TSC : Ok
54: DWARF unwind : Ok
55: x86 instruction decoder - new instructions : Ok
56: Intel cqm nmi context read : Skip
#
# uname -a
Linux zoo 4.7.3-200.fc24.x86_64 #1 SMP Wed Sep 7 17:31:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:rawhide: Ok
17 mageia:5: Ok
18 opensuse:13.2: Ok
19 opensuse:42.1: Ok
20 opensuse:tumbleweed: Ok
21 ubuntu:12.04.5: Ok
22 ubuntu:14.04: Ok
23 ubuntu:14.04.4: Ok
24 ubuntu:15.10: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_help_O: make help
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_install_O: make install
make_static_O: make LDFLAGS=-static
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_doc_O: make doc
make_no_libbpf_O: make NO_LIBBPF=1
make_util_map_o_O: make util/map.o
make_install_bin_O: make install-bin
make_no_auxtrace_O: make NO_AUXTRACE=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_debug_O: make DEBUG=1
make_no_libelf_O: make NO_LIBELF=1
make_clean_all_O: make clean all
make_no_libperl_O: make NO_LIBPERL=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_slang_O: make NO_SLANG=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_gtk2_O: make NO_GTK2=1
make_pure_O: make
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_install_prefix_O: make install prefix=/tmp/krava
make_tags_O: make tags
make_no_demangle_O: make NO_DEMANGLE=1
make_perf_o_O: make perf.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_newt_O: make NO_NEWT=1

Ingo Molnar
Dec 7, 2016, 1:20:05 PM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo
Dec 13, 2016, 10:20:08 AM
From: Namhyung Kim <namh...@kernel.org>

The is_idle_sample() function actually does more than determine
whether a sample comes from the idle task. Split the callchain part into
save_task_callchain() to make it clearer.

Also, checking prev_pid from the trace data is preferable to just
checking sample->pid, since it's possible, although rare, to have an
invalid pid/tid of 0 when scheduling an exiting task.
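
The decision described above can be sketched in isolation as follows. This is a simplified illustration, not the perf code: struct sample_sketch and is_idle_sample_sketch() are hypothetical stand-ins for struct perf_sample, the "prev_pid" tracepoint field and the real helper, under the assumption that only the event name and the two pids matter for the decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced view of a sample for this sketch only. */
struct sample_sketch {
	int pid;	/* stands in for sample->pid */
	int prev_pid;	/* stands in for the "prev_pid" tracepoint field */
};

/*
 * For sched_switch events trust prev_pid from the trace data; for any
 * other event fall back to the sample's own pid. pid 0 is the idle task.
 */
static bool is_idle_sample_sketch(const char *event_name,
				  const struct sample_sketch *s)
{
	assert(event_name != NULL && s != NULL);

	if (strcmp(event_name, "sched:sched_switch") == 0)
		return s->prev_pid == 0;

	return s->pid == 0;
}
```

With this ordering, a sched_switch sample carrying a bogus pid of 0 but a non-zero prev_pid is no longer attributed to the idle task.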

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: David Ahern <dsa...@gmail.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755....@kernel.org
[ Remove some needless () in some return statements ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-sched.c | 39 ++++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 1a3f1be93372..966eddce1609 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1939,39 +1939,40 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
r->total_run_time += r->dt_run;
}

-static bool is_idle_sample(struct perf_sched *sched,
- struct perf_sample *sample,
- struct perf_evsel *evsel,
- struct machine *machine)
+static bool is_idle_sample(struct perf_sample *sample,
+ struct perf_evsel *evsel)
{
- struct thread *thread;
- struct callchain_cursor *cursor = &callchain_cursor;
-
/* pid 0 == swapper == idle task */
- if (sample->pid == 0)
- return true;
+ if (strcmp(perf_evsel__name(evsel), "sched:sched_switch") == 0)
+ return perf_evsel__intval(evsel, sample, "prev_pid") == 0;

- if (strcmp(perf_evsel__name(evsel), "sched:sched_switch") == 0) {
- if (perf_evsel__intval(evsel, sample, "prev_pid") == 0)
- return true;
- }
+ return sample->pid == 0;
+}
+
+static void save_task_callchain(struct perf_sched *sched,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
+{
+ struct callchain_cursor *cursor = &callchain_cursor;
+ struct thread *thread;

/* want main thread for process - has maps */
thread = machine__findnew_thread(machine, sample->pid, sample->pid);
if (thread == NULL) {
pr_debug("Failed to get thread for pid %d.\n", sample->pid);
- return false;
+ return;
}

if (!symbol_conf.use_callchain || sample->callchain == NULL)
- return false;
+ return;

if (thread__resolve_callchain(thread, cursor, evsel, sample,
NULL, NULL, sched->max_stack + 2) != 0) {
if (verbose)
error("Failed to resolve callchain. Skipping\n");

- return false;
+ return;
}

callchain_cursor_commit(cursor);
@@ -1994,8 +1995,6 @@ static bool is_idle_sample(struct perf_sched *sched,

callchain_cursor_advance(cursor);
}
-
- return false;
}

/*
@@ -2111,7 +2110,7 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
{
struct thread *thread;

- if (is_idle_sample(sched, sample, evsel, machine)) {
+ if (is_idle_sample(sample, evsel)) {
thread = get_idle_thread(sample->cpu);
if (thread == NULL)
pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
@@ -2124,6 +2123,8 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
pr_debug("Failed to get thread for tid %d. skipping sample.\n",
sample->tid);
}
+
+ save_task_callchain(sched, sample, evsel, machine);
}

return thread;
--
2.9.3

Arnaldo Carvalho de Melo
Dec 13, 2016, 10:20:10 AM
From: Alexis Berlemont <alexis.b...@gmail.com>

During a "perf buildid-cache --add" command, the section ".note.stapsdt"
of the "added" binary is scanned in order to list the SDT markers
available in that binary. The parts containing the probe arguments
were left unscanned.

The whole section is now parsed; the probe arguments are extracted for
later use.
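
As a rough illustration of the scan described above, the sketch below locates the argument string that follows the NUL-terminated probe name inside a note buffer, applying the same three "no argument" conditions as the patch. The helper name sdt_note_args() and its signature are assumptions made for this example; the patch itself does the work inline in populate_sdt_note() and duplicates the string with strdup().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: `data` is a note buffer of `len` bytes and `name`
 * points at the NUL-terminated probe name inside it. Returns a pointer to
 * the argument string that follows the name, or NULL when there is none.
 */
static const char *sdt_note_args(const char *data, size_t len,
				 const char *name)
{
	const char *end = data + len;
	const char *nul;

	/* The name must lie inside the buffer. */
	assert(name >= data && name <= end);

	/* Find the terminator of the name without reading past the note. */
	nul = memchr(name, '\0', (size_t)(end - name));

	/* No terminator, or no room for even one character after it. */
	if (nul == NULL || end - nul < 2)
		return NULL;

	/* Empty argument string, or one that starts with ':'. */
	if (nul[1] == '\0' || nul[1] == ':')
		return NULL;

	return nul + 1;
}
```

Returning a pointer into the buffer keeps the sketch allocation-free; a caller that needs to keep the string beyond the lifetime of the note would copy it, as the patch does.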

Signed-off-by: Alexis Berlemont <alexis.b...@gmail.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Hemant Kumar <hem...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161126005803.2569...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/symbol-elf.c | 25 +++++++++++++++++++++++--
tools/perf/util/symbol.h | 1 +
2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 99400b0e8f2a..7725c3f9d6a2 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1822,7 +1822,7 @@ void kcore_extract__delete(struct kcore_extract *kce)
static int populate_sdt_note(Elf **elf, const char *data, size_t len,
struct list_head *sdt_notes)
{
- const char *provider, *name;
+ const char *provider, *name, *args;
struct sdt_note *tmp = NULL;
GElf_Ehdr ehdr;
GElf_Addr base_off = 0;
@@ -1881,6 +1881,25 @@ static int populate_sdt_note(Elf **elf, const char *data, size_t len,
goto out_free_prov;
}

+ args = memchr(name, '\0', data + len - name);
+
+ /*
+ * There is no argument if:
+ * - We reached the end of the note;
+ * - There is not enough room to hold a potential string;
+ * - The argument string is empty or just contains ':'.
+ */
+ if (args == NULL || data + len - args < 2 ||
+ args[1] == ':' || args[1] == '\0')
+ tmp->args = NULL;
+ else {
+ tmp->args = strdup(++args);
+ if (!tmp->args) {
+ ret = -ENOMEM;
+ goto out_free_name;
+ }
+ }
+
if (gelf_getclass(*elf) == ELFCLASS32) {
memcpy(&tmp->addr, &buf, 3 * sizeof(Elf32_Addr));
tmp->bit32 = true;
@@ -1892,7 +1911,7 @@ static int populate_sdt_note(Elf **elf, const char *data, size_t len,
if (!gelf_getehdr(*elf, &ehdr)) {
pr_debug("%s : cannot get elf header.\n", __func__);
ret = -EBADF;
- goto out_free_name;
+ goto out_free_args;
}

/* Adjust the prelink effect :
@@ -1917,6 +1936,8 @@ static int populate_sdt_note(Elf **elf, const char *data, size_t len,
list_add_tail(&tmp->note_list, sdt_notes);
return 0;

+out_free_args:
+ free(tmp->args);
out_free_name:
free(tmp->name);
out_free_prov:
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 6c358b7ed336..9222c7e702f3 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -351,6 +351,7 @@ int arch__choose_best_symbol(struct symbol *syma, struct symbol *symb);
struct sdt_note {
char *name; /* name of the note*/
char *provider; /* provider name */
+ char *args;
bool bit32; /* whether the location is 32 bits? */
union { /* location, base and semaphore addrs */
Elf64_Addr a64[3];
--
2.9.3

Arnaldo Carvalho de Melo
Dec 13, 2016, 10:20:10 AM
From: Jiri Olsa <jo...@kernel.org>

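Move the header checks out of Makefile.perf and into a separate
check-headers.sh script, to make them nicer and more easily maintainable.

Also move the check into the fixdep sub-make, so its output is not
scattered around the build output.

Remove the extra $$ from the mman*.h checks.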

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-5-...@kernel.org
[ Use /bin/sh, and 'function check() {' -> 'check () {' to make it work with busybox, in Alpine Linux, for instance ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 94 +--------------------------------------------
tools/perf/check-headers.sh | 59 ++++++++++++++++++++++++++++
2 files changed, 60 insertions(+), 93 deletions(-)
create mode 100755 tools/perf/check-headers.sh

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 8f1c258b151a..e9ec531131ca 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -201,6 +201,7 @@ goals := $(filter-out all sub-make, $(MAKECMDGOALS))
$(goals) all: sub-make

sub-make: fixdep
+ @./check-headers.sh
$(Q)$(MAKE) FIXDEP=1 -f Makefile.perf $(goals)

else # force_fixdep
@@ -404,99 +405,6 @@ export JEVENTS
build := -f $(srctree)/tools/build/Makefile.build dir=. obj

$(PERF_IN): prepare FORCE
- @(test -f ../../include/uapi/linux/perf_event.h && ( \
- (diff -B ../include/uapi/linux/perf_event.h ../../include/uapi/linux/perf_event.h >/dev/null) \
- || echo "Warning: tools/include/uapi/linux/perf_event.h differs from kernel" >&2 )) || true
- @(test -f ../../include/linux/hash.h && ( \
- (diff -B ../include/linux/hash.h ../../include/linux/hash.h >/dev/null) \
- || echo "Warning: tools/include/linux/hash.h differs from kernel" >&2 )) || true
- @(test -f ../../include/uapi/linux/hw_breakpoint.h && ( \
- (diff -B ../include/uapi/linux/hw_breakpoint.h ../../include/uapi/linux/hw_breakpoint.h >/dev/null) \
- || echo "Warning: tools/include/uapi/linux/hw_breakpoint.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/asm/disabled-features.h && ( \
- (diff -B ../arch/x86/include/asm/disabled-features.h ../../arch/x86/include/asm/disabled-features.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/asm/disabled-features.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/asm/required-features.h && ( \
- (diff -B ../arch/x86/include/asm/required-features.h ../../arch/x86/include/asm/required-features.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/asm/required-features.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/asm/cpufeatures.h && ( \
- (diff -B ../arch/x86/include/asm/cpufeatures.h ../../arch/x86/include/asm/cpufeatures.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/asm/cpufeatures.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/lib/memcpy_64.S && ( \
- (diff -B -I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>" ../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memcpy_64.S >/dev/null) \
- || echo "Warning: tools/arch/x86/lib/memcpy_64.S differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/lib/memset_64.S && ( \
- (diff -B -I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>" ../arch/x86/lib/memset_64.S ../../arch/x86/lib/memset_64.S >/dev/null) \
- || echo "Warning: tools/arch/x86/lib/memset_64.S differs from kernel" >&2 )) || true
- @(test -f ../../arch/arm/include/uapi/asm/perf_regs.h && ( \
- (diff -B ../arch/arm/include/uapi/asm/perf_regs.h ../../arch/arm/include/uapi/asm/perf_regs.h >/dev/null) \
- || echo "Warning: tools/arch/arm/include/uapi/asm/perf_regs.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/arm64/include/uapi/asm/perf_regs.h && ( \
- (diff -B ../arch/arm64/include/uapi/asm/perf_regs.h ../../arch/arm64/include/uapi/asm/perf_regs.h >/dev/null) \
- || echo "Warning: tools/arch/arm64/include/uapi/asm/perf_regs.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/powerpc/include/uapi/asm/perf_regs.h && ( \
- (diff -B ../arch/powerpc/include/uapi/asm/perf_regs.h ../../arch/powerpc/include/uapi/asm/perf_regs.h >/dev/null) \
- || echo "Warning: tools/arch/powerpc/include/uapi/asm/perf_regs.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/uapi/asm/perf_regs.h && ( \
- (diff -B ../arch/x86/include/uapi/asm/perf_regs.h ../../arch/x86/include/uapi/asm/perf_regs.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/uapi/asm/perf_regs.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/uapi/asm/kvm.h && ( \
- (diff -B ../arch/x86/include/uapi/asm/kvm.h ../../arch/x86/include/uapi/asm/kvm.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/uapi/asm/kvm.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/uapi/asm/kvm_perf.h && ( \
- (diff -B ../arch/x86/include/uapi/asm/kvm_perf.h ../../arch/x86/include/uapi/asm/kvm_perf.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/uapi/asm/kvm_perf.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/uapi/asm/svm.h && ( \
- (diff -B ../arch/x86/include/uapi/asm/svm.h ../../arch/x86/include/uapi/asm/svm.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/uapi/asm/svm.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/x86/include/uapi/asm/vmx.h && ( \
- (diff -B ../arch/x86/include/uapi/asm/vmx.h ../../arch/x86/include/uapi/asm/vmx.h >/dev/null) \
- || echo "Warning: tools/arch/x86/include/uapi/asm/vmx.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/powerpc/include/uapi/asm/kvm.h && ( \
- (diff -B ../arch/powerpc/include/uapi/asm/kvm.h ../../arch/powerpc/include/uapi/asm/kvm.h >/dev/null) \
- || echo "Warning: tools/arch/powerpc/include/uapi/asm/kvm.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/s390/include/uapi/asm/kvm.h && ( \
- (diff -B ../arch/s390/include/uapi/asm/kvm.h ../../arch/s390/include/uapi/asm/kvm.h >/dev/null) \
- || echo "Warning: tools/arch/s390/include/uapi/asm/kvm.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/s390/include/uapi/asm/kvm_perf.h && ( \
- (diff -B ../arch/s390/include/uapi/asm/kvm_perf.h ../../arch/s390/include/uapi/asm/kvm_perf.h >/dev/null) \
- || echo "Warning: tools/arch/s390/include/uapi/asm/kvm_perf.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/s390/include/uapi/asm/sie.h && ( \
- (diff -B ../arch/s390/include/uapi/asm/sie.h ../../arch/s390/include/uapi/asm/sie.h >/dev/null) \
- || echo "Warning: tools/arch/s390/include/uapi/asm/sie.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/arm/include/uapi/asm/kvm.h && ( \
- (diff -B ../arch/arm/include/uapi/asm/kvm.h ../../arch/arm/include/uapi/asm/kvm.h >/dev/null) \
- || echo "Warning: tools/arch/arm/include/uapi/asm/kvm.h differs from kernel" >&2 )) || true
- @(test -f ../../arch/arm64/include/uapi/asm/kvm.h && ( \
- (diff -B ../arch/arm64/include/uapi/asm/kvm.h ../../arch/arm64/include/uapi/asm/kvm.h >/dev/null) \
- || echo "Warning: tools/arch/arm64/include/uapi/asm/kvm.h differs from kernel" >&2 )) || true
- @(test -f ../../include/asm-generic/bitops/arch_hweight.h && ( \
- (diff -B ../include/asm-generic/bitops/arch_hweight.h ../../include/asm-generic/bitops/arch_hweight.h >/dev/null) \
- || echo "Warning: tools/include/asm-generic/bitops/arch_hweight.h differs from kernel" >&2 )) || true
- @(test -f ../../include/asm-generic/bitops/const_hweight.h && ( \
- (diff -B ../include/asm-generic/bitops/const_hweight.h ../../include/asm-generic/bitops/const_hweight.h >/dev/null) \
- || echo "Warning: tools/include/asm-generic/bitops/const_hweight.h differs from kernel" >&2 )) || true
- @(test -f ../../include/asm-generic/bitops/__fls.h && ( \
- (diff -B ../include/asm-generic/bitops/__fls.h ../../include/asm-generic/bitops/__fls.h >/dev/null) \
- || echo "Warning: tools/include/asm-generic/bitops/__fls.h differs from kernel" >&2 )) || true
- @(test -f ../../include/asm-generic/bitops/fls.h && ( \
- (diff -B ../include/asm-generic/bitops/fls.h ../../include/asm-generic/bitops/fls.h >/dev/null) \
- || echo "Warning: tools/include/asm-generic/bitops/fls.h differs from kernel" >&2 )) || true
- @(test -f ../../include/asm-generic/bitops/fls64.h && ( \
- (diff -B ../include/asm-generic/bitops/fls64.h ../../include/asm-generic/bitops/fls64.h >/dev/null) \
- || echo "Warning: tools/include/asm-generic/bitops/fls64.h differs from kernel" >&2 )) || true
- @(test -f ../../include/linux/coresight-pmu.h && ( \
- (diff -B ../include/linux/coresight-pmu.h ../../include/linux/coresight-pmu.h >/dev/null) \
- || echo "Warning: tools/include/linux/coresight-pmu.h differs from kernel" >&2 )) || true
- @(test -f ../../include/uapi/asm-generic/mman-common.h && ( \
- (diff -B ../include/uapi/asm-generic/mman-common.h ../../include/uapi/asm-generic/mman-common.h >/dev/null) \
- || echo "Warning: tools/include/uapi/asm-generic/mman-common.h differs from kernel" >&2 )) || true
- @(test -f ../../include/uapi/asm-generic/mman.h && ( \
- (diff -B -I "^#include <\(uapi/\)*asm-generic/mman-common.h>$$" ../include/uapi/asm-generic/mman.h ../../include/uapi/asm-generic/mman.h >/dev/null) \
- || echo "Warning: tools/include/uapi/asm-generic/mman.h differs from kernel" >&2 )) || true
- @(test -f ../../include/uapi/linux/mman.h && ( \
- (diff -B -I "^#include <\(uapi/\)*asm/mman.h>$$" ../include/uapi/linux/mman.h ../../include/uapi/linux/mman.h >/dev/null) \
- || echo "Warning: tools/include/uapi/linux/mman.h differs from kernel" >&2 )) || true
$(Q)$(MAKE) $(build)=perf

$(JEVENTS_IN): FORCE
diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
new file mode 100755
index 000000000000..c747bfd7f14d
--- /dev/null
+++ b/tools/perf/check-headers.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+HEADERS='
+include/uapi/linux/perf_event.h
+include/linux/hash.h
+include/uapi/linux/hw_breakpoint.h
+arch/x86/include/asm/disabled-features.h
+arch/x86/include/asm/required-features.h
+arch/x86/include/asm/cpufeatures.h
+arch/arm/include/uapi/asm/perf_regs.h
+arch/arm64/include/uapi/asm/perf_regs.h
+arch/powerpc/include/uapi/asm/perf_regs.h
+arch/x86/include/uapi/asm/perf_regs.h
+arch/x86/include/uapi/asm/kvm.h
+arch/x86/include/uapi/asm/kvm_perf.h
+arch/x86/include/uapi/asm/svm.h
+arch/x86/include/uapi/asm/vmx.h
+arch/powerpc/include/uapi/asm/kvm.h
+arch/s390/include/uapi/asm/kvm.h
+arch/s390/include/uapi/asm/kvm_perf.h
+arch/s390/include/uapi/asm/sie.h
+arch/arm/include/uapi/asm/kvm.h
+arch/arm64/include/uapi/asm/kvm.h
+include/asm-generic/bitops/arch_hweight.h
+include/asm-generic/bitops/const_hweight.h
+include/asm-generic/bitops/__fls.h
+include/asm-generic/bitops/fls.h
+include/asm-generic/bitops/fls64.h
+include/linux/coresight-pmu.h
+include/uapi/asm-generic/mman-common.h
+'
+
+check () {
+ file=$1
+ opts=
+
+ shift
+ while [ -n "$*" ]; do
+ opts="$opts \"$1\""
+ shift
+ done
+
+ cmd="diff $opts ../$file ../../$file > /dev/null"
+
+ test -f ../../$file &&
+ eval $cmd || echo "Warning: $file differs from kernel" >&2
+}
+
+
+# simple diff check
+for i in $HEADERS; do
+ check $i -B
+done
+
+# diff with extra ignore lines
+check arch/x86/lib/memcpy_64.S -B -I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>"
+check arch/x86/lib/memset_64.S -B -I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>"
+check include/uapi/asm-generic/mman.h -B -I "^#include <\(uapi/\)*asm-generic/mman-common.h>"
+check include/uapi/linux/mman.h -B -I "^#include <\(uapi/\)*asm/mman.h>"
--
2.9.3

Arnaldo Carvalho de Melo

Dec 13, 2016, 10:20:11 AM
From: Namhyung Kim <namh...@kernel.org>

When the --idle-hist option is used together with --summary, the summary
now also shows idle stats broken down by callchain, as below:

Idle stats by callchain:
CPU 0: 902.195 msec
Idle time (msec) Count Callchains
---------------- ------- --------------------------------------------------
370.589 69 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
178.799 17 worker_thread <- kthread <- ret_from_fork
128.352 17 schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
125.111 19 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
71.599 50 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
23.146 1 rcu_gp_kthread <- kthread <- ret_from_fork
4.510 1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
0.085 1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- do_restart_poll
...
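
The folded format above puts one callchain per line, with its entries
joined by " <- ". A minimal sketch of that idea follows. It assumes a much
simplified node type: struct chain_node and chain_fold() are hypothetical
names made up for this note, not perf's struct callchain_node nor the
function added by the patch. As in the patch, the parent is visited first
and the current entry is appended after it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's struct callchain_node. */
struct chain_node {
    const char *sym;                 /* symbol name of this entry */
    const struct chain_node *parent; /* NULL at the root of the tree */
};

/*
 * Write the chain ending at 'node' into 'buf', root first, with the
 * entries separated by " <- ". Returns the number of characters in 'buf'.
 */
static size_t chain_fold(char *buf, size_t size, const struct chain_node *node)
{
    size_t len;

    assert(buf != NULL && size > 0);

    if (node == NULL) {
        buf[0] = '\0';
        return 0;
    }

    /* Emit the ancestors first, then append this entry. */
    len = chain_fold(buf, size, node->parent);
    if (len >= size - 1)
        return len; /* buffer already full, truncate silently */

    snprintf(buf + len, size - len, "%s%s", len ? " <- " : "", node->sym);
    return strlen(buf);
}
```

Writing into a caller-supplied buffer instead of a FILE pointer is a choice
made only to keep the sketch easy to check; the patch itself prints to
stdout with fprintf().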

Committer notes:

Extra testing:

# uname -a
Linux jouet 4.8.8-300.fc25.x86_64 #1 SMP Tue Nov 15 18:10:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

1) Run 'perf sched record -g'

2) Run 'perf sched timehist --idle --summary'

<SNIP>
Idle stats by callchain:
CPU 0: 13456.840 msec
Idle time (msec) Count Callchains
---------------- ----- --------------------------------------------------
5386.637 3283 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
2750.238 2299 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- do_syscall_64
1275.672 1287 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- entry_SYSCALL_64_fastpath
936.322 452 worker_thread <- kthread <- ret_from_fork
741.311 385 rcu_nocb_kthread <- kthread <- ret_from_fork
729.385 248 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_ppoll
365.386 229 irq_thread <- kthread <- ret_from_fork
338.934 265 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
219.488 201 schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
186.839 410 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
142.541 59 kvm_vcpu_block <- kvm_arch_vcpu_ioctl_run <- kvm_vcpu_ioctl <- do_vfs_ioctl <- sys_ioctl
83.887 92 smpboot_thread_fn <- kthread <- ret_from_fork
62.722 96 do_exit <- do_group_exit <- 0x2a5594 <- entry_SYSCALL_64_fastpath
47.894 83 pipe_wait <- pipe_read <- __vfs_read <- vfs_read <- sys_read
46.554 61 rcu_gp_kthread <- kthread <- ret_from_fork
34.337 21 schedule_timeout <- intel_fbc_work_fn <- process_one_work <- worker_thread <- kthread
29.521 14 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
20.274 10 schedule_timeout <- io_schedule_timeout <- bit_wait_io <- __wait_on_bit <- out_of_line_wait_on_bit
15.085 55 schedule_timeout <- unix_stream_read_generic <- unix_stream_recvmsg <- sock_recvmsg <- SYSC_recvfrom
<SNIP>

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: David Ahern <dsa...@gmail.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-sched.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 86 insertions(+)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0b14265432df..c1c07bfe132c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2448,6 +2448,9 @@ static int timehist_sched_change_event(struct perf_tool *tool,
last_tr->dt_wait = 0;
last_tr->dt_delay = 0;

+ if (itr->cursor.nr)
+ callchain_append(&itr->callchain, &itr->cursor, t - tprev);
+
itr->last_thread = NULL;
}
}
@@ -2557,6 +2560,60 @@ static int show_deadthread_runtime(struct thread *t, void *priv)
return __show_thread_runtime(t, priv);
}

+static size_t callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
+{
+ const char *sep = " <- ";
+ struct callchain_list *chain;
+ size_t ret = 0;
+ char bf[1024];
+ bool first;
+
+ if (node == NULL)
+ return 0;
+
+ ret = callchain__fprintf_folded(fp, node->parent);
+ first = (ret == 0);
+
+ list_for_each_entry(chain, &node->val, list) {
+ if (chain->ip >= PERF_CONTEXT_MAX)
+ continue;
+ if (chain->ms.sym && chain->ms.sym->ignore)
+ continue;
+ ret += fprintf(fp, "%s%s", first ? "" : sep,
+ callchain_list__sym_name(chain, bf, sizeof(bf),
+ false));
+ first = false;
+ }
+
+ return ret;
+}
+
+static size_t timehist_print_idlehist_callchain(struct rb_root *root)
+{
+ size_t ret = 0;
+ FILE *fp = stdout;
+ struct callchain_node *chain;
+ struct rb_node *rb_node = rb_first(root);
+
+ printf(" %16s %8s %s\n", "Idle time (msec)", "Count", "Callchains");
+ printf(" %.16s %.8s %.50s\n", graph_dotted_line, graph_dotted_line,
+ graph_dotted_line);
+
+ while (rb_node) {
+ chain = rb_entry(rb_node, struct callchain_node, rb_node);
+ rb_node = rb_next(rb_node);
+
+ ret += fprintf(fp, " ");
+ print_sched_time(chain->hit, 12);
+ ret += 16; /* print_sched_time returns 2nd arg + 4 */
+ ret += fprintf(fp, " %8d ", chain->count);
+ ret += callchain__fprintf_folded(fp, chain);
+ ret += fprintf(fp, "\n");
+ }
+
+ return ret;
+}
+
static void timehist_print_summary(struct perf_sched *sched,
struct perf_session *session)
{
@@ -2615,6 +2672,35 @@ static void timehist_print_summary(struct perf_sched *sched,
printf(" CPU %2d idle entire time window\n", i);
}

+ if (sched->idle_hist && symbol_conf.use_callchain) {
+ callchain_param.mode = CHAIN_FOLDED;
+ callchain_param.value = CCVAL_PERIOD;
+
+ callchain_register_param(&callchain_param);
+
+ printf("\nIdle stats by callchain:\n");
+ for (i = 0; i < idle_max_cpu; ++i) {
+ struct idle_thread_runtime *itr;
+
+ t = idle_threads[i];
+ if (!t)
+ continue;
+
+ itr = thread__priv(t);
+ if (itr == NULL)
+ continue;
+
+ callchain_param.sort(&itr->sorted_root, &itr->callchain,
+ 0, &callchain_param);
+
+ printf(" CPU %2d:", i);
+ print_sched_time(itr->tr.total_run_time, 6);
+ printf(" msec\n");
+ timehist_print_idlehist_callchain(&itr->sorted_root);
+ printf("\n");
+ }
+ }
+
printf("\n"
" Total number of unique tasks: %" PRIu64 "\n"
"Total number of context switches: %" PRIu64 "\n"
--
2.9.3

Arnaldo Carvalho de Melo

Dec 13, 2016, 10:20:13 AM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Hi Ingo,

Please consider pulling. I had most of this queued before your first
pull request to Linus for 4.10. Most are fixes, with 'perf sched timehist
--idle' as a follow-up new feature for the 'perf sched timehist' command
introduced in this window.

Thanks,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit b0c1ef52959582144bbea9a2b37db7f4c9e399f7:

perf/x86: Fix exclusion of BTS and LBR for Goldmont (2016-12-11 13:06:09 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161213

for you to fetch changes up to a03f73547fb6e0f7f2942c46cce9b48df50238ba:

samples/bpf: Drop unnecessary build targets. (2016-12-13 10:38:10 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Introduce 'perf sched timehist --idle', to analyse processes
going to/from idle state (Namhyung Kim)

- Add scanning of SDT (Statically Defined Tracing) probe arguments (Alexis Berlemont)

Fixes:

- Allow 'perf record -u user' to continue when facing races with threads
going away after having scanned them via /proc (Jiri Olsa)

- Fix 'perf mem' --all-user/--all-kernel options (Jiri Olsa)

Infrastructure:

- Switch over samples/bpf/ to tools/lib/bpf, removing libbpf duplication (Joe Stringer)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Alexis Berlemont (1):
perf sdt: Add scanning of sdt probles arguments

Arnaldo Carvalho de Melo (1):
perf tools: Remove some needless __maybe_unused

Jiri Olsa (6):
perf tools: Move headers check into bash script
perf mem: Fix --all-user/--all-kernel options
perf evsel: Use variable instead of repeating lengthy FD macro
perf thread_map: Add thread_map__remove function
perf evsel: Allow to ignore missing pid
perf record: Force ignore_missing_thread for uid option

Joe Stringer (8):
tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
tools lib bpf: use __u32 from linux/types.h
tools lib bpf: Add flags to bpf_create_map()
samples/bpf: Make samples more libbpf-centric
samples/bpf: Switch over to libbpf
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Drop unnecessary build targets.

Namhyung Kim (6):
perf sched timehist: Split is_idle_sample()
perf sched timehist: Introduce struct idle_time_data
perf sched timehist: Save callchain when entering idle
perf sched timehist: Skip non-idle events when necessary
perf sched timehist: Add -I/--idle-hist option
perf sched timehist: Show callchains for idle stat

samples/bpf/Makefile | 60 ++---
samples/bpf/README.rst | 4 +-
samples/bpf/bpf_load.c | 20 +-
samples/bpf/fds_example.c | 10 +-
samples/bpf/lathist_user.c | 3 +-
samples/bpf/libbpf.c | 155 -------------
samples/bpf/libbpf.h | 25 +--
samples/bpf/map_perf_test_user.c | 1 +
samples/bpf/offwaketime_user.c | 10 +-
samples/bpf/sampleip_user.c | 8 +-
samples/bpf/sock_example.c | 11 +-
samples/bpf/sock_example.h | 35 +++
samples/bpf/sockex1_user.c | 9 +-
samples/bpf/sockex2_user.c | 7 +-
samples/bpf/sockex3_user.c | 7 +-
samples/bpf/spintest_user.c | 10 +-
samples/bpf/tc_l2_redirect_user.c | 4 +-
samples/bpf/test_cgrp2_array_pin.c | 4 +-
samples/bpf/test_current_task_under_cgroup_user.c | 10 +-
samples/bpf/test_maps.c | 142 ++++++------
samples/bpf/test_overhead_user.c | 2 +
samples/bpf/test_probe_write_user_user.c | 4 +-
samples/bpf/test_verifier.c | 8 +-
samples/bpf/trace_event_user.c | 24 +-
samples/bpf/trace_output_user.c | 6 +-
samples/bpf/tracex1_user.c | 2 +
samples/bpf/tracex2_user.c | 12 +-
samples/bpf/tracex3_user.c | 6 +-
samples/bpf/tracex4_user.c | 6 +-
samples/bpf/tracex5_user.c | 2 +
samples/bpf/tracex6_user.c | 7 +-
samples/bpf/xdp1_user.c | 4 +-
tools/include/uapi/linux/bpf.h | 51 +++++
tools/lib/bpf/bpf.c | 7 +-
tools/lib/bpf/bpf.h | 6 +-
tools/lib/bpf/libbpf.c | 3 +-
tools/perf/Documentation/perf-sched.txt | 4 +
tools/perf/Makefile.perf | 94 +-------
tools/perf/builtin-c2c.c | 13 +-
tools/perf/builtin-mem.c | 4 +-
tools/perf/builtin-record.c | 3 +
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 261 +++++++++++++++++++---
tools/perf/builtin-stat.c | 6 +-
tools/perf/check-headers.sh | 59 +++++
tools/perf/perf.h | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/thread-map.c | 44 ++++
tools/perf/util/evsel.c | 61 ++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/symbol-elf.c | 25 ++-
tools/perf/util/symbol.h | 1 +
tools/perf/util/thread_map.c | 22 ++
tools/perf/util/thread_map.h | 1 +
55 files changed, 786 insertions(+), 506 deletions(-)
delete mode 100644 samples/bpf/libbpf.c
create mode 100644 samples/bpf/sock_example.h
create mode 100755 tools/perf/check-headers.sh

# uname -a
Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: x86 rdpmc : Ok
54: Convert perf time to TSC : Ok
55: DWARF unwind : Ok
56: x86 instruction decoder - new instructions : Ok
57: Intel cqm nmi context read : Skip
#
# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-mips64: Ok
11 fedora:20: Ok
12 fedora:21: Ok
13 fedora:22: Ok
14 fedora:23: Ok
15 fedora:24: Ok
16 fedora:24-x-ARC-uClibc: Ok
17 fedora:25: Ok
18 fedora:rawhide: Ok
19 mageia:5: Ok
20 opensuse:13.2: Ok
21 opensuse:42.1: Ok
22 opensuse:tumbleweed: Ok
23 ubuntu:12.04.5: Ok
24 ubuntu:14.04.4-x-linaro-arm64: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok
#
[acme@felicio linux]$ make -C tools/perf build-test
make: Entering directory `/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_doc_O: make doc
make_cscope_O: make cscope
make_debug_O: make DEBUG=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
make_install_bin_O: make install-bin
make_no_newt_O: make NO_NEWT=1
make_no_slang_O: make NO_SLANG=1
make_clean_all_O: make clean all
make_help_O: make help
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libelf_O: make NO_LIBELF=1
make_pure_O: make
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_static_O: make LDFLAGS=-static
make_util_map_o_O: make util/map.o
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_install_prefix_O: make install prefix=/tmp/krava
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_tags_O: make tags
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_install_O: make install
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libunwind_O: make NO_LIBUNWIND=1
OK
make: Leaving directory `/home/acme/git/linux/tools/perf'
[acme@felicio linux]$

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:05 PM
From: Jiri Olsa <jo...@kernel.org>

Move the headers check into a shell script, to make it nicer and easier
to maintain.

Also move the check into the fixdep sub make, so its output is not
scattered around the build output.

Remove the extra $$ from the mman*.h checks.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-5-...@kernel.org
[ Use /bin/sh, and 'function check() {' -> 'check () {' to make it work with busybox, in Alpine Linux, for instance ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>


Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:06 PM
From: Joe Stringer <j...@ovn.org>

This declaration was made in samples/bpf/libbpf.c for convenience, but
there's already one in tools/perf/perf-sys.h. Reuse that one.
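
For readers unfamiliar with this kind of helper: perf_event_open has no
libc wrapper, so callers go through syscall(2). The sketch below shows
the general shape of such a wrapper. The demo_ names are hypothetical and
chosen for this note; this is not the code in tools/perf/perf-sys.h. It
assumes a Linux system with the kernel uapi headers installed.

```c
#include <assert.h>
#include <errno.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: returns a new fd, or -1 with errno set on failure. */
static int demo_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
{
    assert(attr != NULL);
    return (int)syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Example caller: a software task-clock counter on the calling thread. */
static int demo_open_task_clock(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_SOFTWARE;
    attr.config = PERF_COUNT_SW_TASK_CLOCK;

    /* pid 0 = calling thread, cpu -1 = any CPU, no group leader, no flags */
    return demo_perf_event_open(&attr, 0, -1, -1, 0);
}
```

Whether the open succeeds depends on the running kernel and on settings
such as perf_event_paranoid, so callers should always check for -1.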

Committer notes:

Testing it:

$ make -j4 O=../build/v4.9.0-rc8+ samples/bpf/
make[1]: Entering directory '/home/build/v4.9.0-rc8+'
CHK include/config/kernel.release
GEN ./Makefile
CHK include/generated/uapi/linux/version.h
Using /home/acme/git/linux as source for kernel
CHK include/generated/utsrelease.h
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL /home/acme/git/linux/scripts/checksyscalls.sh
HOSTCC samples/bpf/test_verifier.o
HOSTCC samples/bpf/libbpf.o
HOSTCC samples/bpf/../../tools/lib/bpf/bpf.o
HOSTCC samples/bpf/test_maps.o
HOSTCC samples/bpf/sock_example.o
HOSTCC samples/bpf/bpf_load.o
<SNIP>
HOSTLD samples/bpf/trace_event
HOSTLD samples/bpf/sampleip
HOSTLD samples/bpf/tc_l2_redirect
make[1]: Leaving directory '/home/build/v4.9.0-rc8+'
$

Also tested the offwaketime resulting from the rebuild, seems to work as
before.

Signed-off-by: Joe Stringer <j...@ovn.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/2016120902462...@ovn.org
[ Use -I$(srctree)/tools/lib/ to support out of source code tree builds ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
samples/bpf/Makefile | 2 ++
samples/bpf/bpf_load.c | 3 ++-
samples/bpf/libbpf.c | 7 -------
samples/bpf/libbpf.h | 3 ---
samples/bpf/sampleip_user.c | 3 ++-
samples/bpf/trace_event_user.c | 9 +++++----
samples/bpf/trace_output_user.c | 3 ++-
samples/bpf/tracex6_user.c | 3 ++-
8 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 81b0ef2f7994..5a73f5a7ace1 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -109,6 +109,8 @@ always += xdp_tx_iptunnel_kern.o
HOSTCFLAGS += -I$(objtree)/usr/include
HOSTCFLAGS += -I$(srctree)/tools/lib/
HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/
+HOSTCFLAGS += -I$(srctree)/tools/lib/ -I$(srctree)/tools/include
+HOSTCFLAGS += -I$(srctree)/tools/perf

HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
HOSTLOADLIBES_fds_example += -lelf
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 1bfb43394013..396e204888b3 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -23,6 +23,7 @@
#include <ctype.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "perf-sys.h"

#define DEBUGFS "/sys/kernel/debug/tracing/"

@@ -179,7 +180,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
id = atoi(buf);
attr.config = id;

- efd = perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
+ efd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
if (efd < 0) {
printf("event %d fd %d err %s\n", id, efd, strerror(errno));
return -1;
diff --git a/samples/bpf/libbpf.c b/samples/bpf/libbpf.c
index d9af876b4a2c..bee473a494f1 100644
--- a/samples/bpf/libbpf.c
+++ b/samples/bpf/libbpf.c
@@ -34,10 +34,3 @@ int open_raw_sock(const char *name)

return sock;
}
-
-int perf_event_open(struct perf_event_attr *attr, int pid, int cpu,
- int group_fd, unsigned long flags)
-{
- return syscall(__NR_perf_event_open, attr, pid, cpu,
- group_fd, flags);
-}
diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
index cc815624aacf..09aedc320009 100644
--- a/samples/bpf/libbpf.h
+++ b/samples/bpf/libbpf.h
@@ -188,7 +188,4 @@ struct bpf_insn;
/* create RAW socket and bind to interface 'name' */
int open_raw_sock(const char *name);

-struct perf_event_attr;
-int perf_event_open(struct perf_event_attr *attr, int pid, int cpu,
- int group_fd, unsigned long flags);
#endif
diff --git a/samples/bpf/sampleip_user.c b/samples/bpf/sampleip_user.c
index 5ac5adf75931..be59d7dcbdde 100644
--- a/samples/bpf/sampleip_user.c
+++ b/samples/bpf/sampleip_user.c
@@ -21,6 +21,7 @@
#include <sys/ioctl.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "perf-sys.h"

#define DEFAULT_FREQ 99
#define DEFAULT_SECS 5
@@ -49,7 +50,7 @@ static int sampling_start(int *pmu_fd, int freq)
};

for (i = 0; i < nr_cpus; i++) {
- pmu_fd[i] = perf_event_open(&pe_sample_attr, -1 /* pid */, i,
+ pmu_fd[i] = sys_perf_event_open(&pe_sample_attr, -1 /* pid */, i,
-1 /* group_fd */, 0 /* flags */);
if (pmu_fd[i] < 0) {
fprintf(stderr, "ERROR: Initializing perf sampling\n");
diff --git a/samples/bpf/trace_event_user.c b/samples/bpf/trace_event_user.c
index 704fe9fa77b2..0c5561d193a4 100644
--- a/samples/bpf/trace_event_user.c
+++ b/samples/bpf/trace_event_user.c
@@ -20,6 +20,7 @@
#include <sys/resource.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "perf-sys.h"

#define SAMPLE_FREQ 50

@@ -125,9 +126,9 @@ static void test_perf_event_all_cpu(struct perf_event_attr *attr)

/* open perf_event on all cpus */
for (i = 0; i < nr_cpus; i++) {
- pmu_fd[i] = perf_event_open(attr, -1, i, -1, 0);
+ pmu_fd[i] = sys_perf_event_open(attr, -1, i, -1, 0);
if (pmu_fd[i] < 0) {
- printf("perf_event_open failed\n");
+ printf("sys_perf_event_open failed\n");
goto all_cpu_err;
}
assert(ioctl(pmu_fd[i], PERF_EVENT_IOC_SET_BPF, prog_fd[0]) == 0);
@@ -146,9 +147,9 @@ static void test_perf_event_task(struct perf_event_attr *attr)
int pmu_fd;

/* open task bound event */
- pmu_fd = perf_event_open(attr, 0, -1, -1, 0);
+ pmu_fd = sys_perf_event_open(attr, 0, -1, -1, 0);
if (pmu_fd < 0) {
- printf("perf_event_open failed\n");
+ printf("sys_perf_event_open failed\n");
return;
}
assert(ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd[0]) == 0);
diff --git a/samples/bpf/trace_output_user.c b/samples/bpf/trace_output_user.c
index 1a1da7bddb93..f4fa6af22def 100644
--- a/samples/bpf/trace_output_user.c
+++ b/samples/bpf/trace_output_user.c
@@ -21,6 +21,7 @@
#include <signal.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "perf-sys.h"

static int pmu_fd;

@@ -159,7 +160,7 @@ static void test_bpf_perf_event(void)
};
int key = 0;

- pmu_fd = perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
+ pmu_fd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);

assert(pmu_fd >= 0);
assert(bpf_map_update_elem(map_fd[0], &key, &pmu_fd, BPF_ANY) == 0);
diff --git a/samples/bpf/tracex6_user.c b/samples/bpf/tracex6_user.c
index 179297cb4d35..ca7874ed77f4 100644
--- a/samples/bpf/tracex6_user.c
+++ b/samples/bpf/tracex6_user.c
@@ -10,6 +10,7 @@
#include <linux/bpf.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "perf-sys.h"

#define SAMPLE_PERIOD 0x7fffffffffffffffULL

@@ -30,7 +31,7 @@ static void test_bpf_perf_event(void)
};

for (i = 0; i < nr_cpus; i++) {
- pmu_fd[i] = perf_event_open(&attr_insn_pmu, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
+ pmu_fd[i] = sys_perf_event_open(&attr_insn_pmu, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
if (pmu_fd[i] < 0) {
printf("event syscall failed\n");
goto exit;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:06 PM
From: Joe Stringer <j...@ovn.org>

Commit d8c5b17f2bc0 ("samples: bpf: add userspace example for attaching
eBPF programs to cgroups") added these functions to samples/libbpf, but
during this merge all of the samples libbpf functionality is shifting to
tools/lib/bpf. Shift these functions there.

Committer notes:

Use bzero + attr.FIELD = value instead of 'attr = { .FIELD = value }',
just like the other wrapper calls to sys_bpf with bpf_attr, to make this
build with older toolchains, such as the ones in CentOS 5 and 6.
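
The pattern in the note can be sketched on a toy type. Here union
demo_attr and demo_attr_init_attach() are hypothetical stand-ins written
for this note, not the kernel's union bpf_attr. The note does not say why
older toolchains fail; one plausible reading, stated here as an
assumption, is that they reject designated initializers that name members
of anonymous structs. memset() is used in place of bzero() only because it
is the standard C spelling.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, much reduced stand-in for union bpf_attr: a union of
 * anonymous structs, one per command family.
 */
union demo_attr {
    struct {                    /* "map create" style commands */
        unsigned int map_type;
        unsigned int key_size;
    };
    struct {                    /* "attach"/"detach" style commands */
        unsigned int target_fd;
        unsigned int attach_bpf_fd;
        unsigned int attach_type;
    };
    unsigned char raw[64];
};

/* Zero the whole union, then assign only the fields this command uses. */
static void demo_attr_init_attach(union demo_attr *attr, unsigned int prog_fd,
                                  unsigned int target_fd, unsigned int type)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof(*attr));
    attr->target_fd = target_fd;
    attr->attach_bpf_fd = prog_fd;
    attr->attach_type = type;
}
```

Zeroing the full union first also clears the bytes that the chosen struct
does not cover, which matters when the whole object is handed to a
syscall.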

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-au2zvtsh55...@git.kernel.org
Link: https://github.com/joestringer/linux/commit/353e6f298c3d0a92fa8bfa61ff898c5050261a12.patch
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
samples/bpf/libbpf.c | 21 ---------------------
samples/bpf/libbpf.h | 3 ---
tools/lib/bpf/bpf.c | 23 +++++++++++++++++++++++
tools/lib/bpf/bpf.h | 3 +++
4 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/samples/bpf/libbpf.c b/samples/bpf/libbpf.c
index 3391225ad7e9..d9af876b4a2c 100644
--- a/samples/bpf/libbpf.c
+++ b/samples/bpf/libbpf.c
@@ -11,27 +11,6 @@
#include <arpa/inet.h>
#include "libbpf.h"

-int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type)
-{
- union bpf_attr attr = {
- .target_fd = target_fd,
- .attach_bpf_fd = prog_fd,
- .attach_type = type,
- };
-
- return syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
-}
-
-int bpf_prog_detach(int target_fd, enum bpf_attach_type type)
-{
- union bpf_attr attr = {
- .target_fd = target_fd,
- .attach_type = type,
- };
-
- return syscall(__NR_bpf, BPF_PROG_DETACH, &attr, sizeof(attr));
-}
-
int open_raw_sock(const char *name)
{
struct sockaddr_ll sll;
diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
index cf7d2386d1f9..cc815624aacf 100644
--- a/samples/bpf/libbpf.h
+++ b/samples/bpf/libbpf.h
@@ -6,9 +6,6 @@

struct bpf_insn;

-int bpf_prog_attach(int prog_fd, int attachable_fd, enum bpf_attach_type type);
-int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
-
/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */

#define BPF_ALU64_REG(OP, DST, SRC) \
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index d0afb26c2e0f..3ddb58a36d3c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -167,3 +167,26 @@ int bpf_obj_get(const char *pathname)

return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
}
+
+int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.target_fd = target_fd;
+ attr.attach_bpf_fd = prog_fd;
+ attr.attach_type = type;
+
+ return sys_bpf(BPF_PROG_ATTACH, &attr, sizeof(attr));
+}
+
+int bpf_prog_detach(int target_fd, enum bpf_attach_type type)
+{
+ union bpf_attr attr;
+
+ bzero(&attr, sizeof(attr));
+ attr.target_fd = target_fd;
+ attr.attach_type = type;
+
+ return sys_bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 7fcdce16fd62..a2f9853dd882 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -41,5 +41,8 @@ int bpf_map_delete_elem(int fd, void *key);
int bpf_map_get_next_key(int fd, void *key, void *next_key);
int bpf_obj_pin(int fd, const char *pathname);
int bpf_obj_get(const char *pathname);
+int bpf_prog_attach(int prog_fd, int attachable_fd, enum bpf_attach_type type);
+int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
+

#endif
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Namhyung Kim <namh...@kernel.org>

The is_idle_sample() function actually does more than determine whether
a sample comes from the idle task. Split the callchain part into
save_task_callchain() to make it clearer.

Also, checking prev_pid from the trace data is preferable to just
checking sample->pid, since it's possible, although rare, to have an
invalid 0 pid/tid when scheduling an exiting task.

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: David Ahern <dsa...@gmail.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755....@kernel.org
[ Remove some needless () in some return statements ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/builtin-sched.c | 39 ++++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 1a3f1be93372..966eddce1609 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Namhyung Kim <namh...@kernel.org>

To investigate the reason for idleness, it is necessary to keep the
callchain of the task that was running when the CPU entered idle. That
moment can be identified by a sched:sched_switch event whose next_pid
field is 0.
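
A rough sketch of the bookkeeping this enables, under simplifying
assumptions: the demo_ types below are hypothetical, a callchain is
reduced to a small integer id, and only one CPU is tracked. The chain seen
on the switch into idle (next_pid == 0) is remembered, and the idle
period is charged to it when the CPU later switches away from idle
(prev_pid == 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

#define DEMO_NR_CHAINS 4

/* Hypothetical, reduced view of one sched:sched_switch sample. */
struct demo_switch {
    uint64_t time;      /* timestamp of the switch */
    pid_t prev_pid;     /* task being switched out */
    pid_t next_pid;     /* task being switched in, 0 means idle */
    int chain_id;       /* stand-in for the sample's callchain */
};

/* Per-CPU idle bookkeeping, keyed by the chain that led into idle. */
struct demo_idle_cpu {
    bool in_idle;
    uint64_t idle_start;
    int chain_id;
    uint64_t idle_time[DEMO_NR_CHAINS];
    unsigned int count[DEMO_NR_CHAINS];
};

static void demo_idle_account(struct demo_idle_cpu *cpu,
                              const struct demo_switch *ev)
{
    assert(ev->chain_id >= 0 && ev->chain_id < DEMO_NR_CHAINS);

    /* Leaving idle: charge the elapsed time to the saved chain. */
    if (cpu->in_idle && ev->prev_pid == 0) {
        cpu->idle_time[cpu->chain_id] += ev->time - cpu->idle_start;
        cpu->count[cpu->chain_id]++;
        cpu->in_idle = false;
    }

    /* Entering idle: remember when, and via which chain. */
    if (ev->next_pid == 0) {
        cpu->in_idle = true;
        cpu->idle_start = ev->time;
        cpu->chain_id = ev->chain_id;
    }
}
```

The real code keeps full callchain cursors per idle thread rather than an
integer id; the id only keeps the example short.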

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: David Ahern <dsa...@gmail.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755....@kernel.org
Link: http://lkml.kernel.org/r/20161213080632....@kernel.org
[ Merged fix from Namhyung, see second Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/builtin-sched.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index e108b0f6a246..dc83b803ca54 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -200,6 +200,7 @@ struct perf_sched {
/* options for timehist command */
bool summary;
bool summary_only;
+ bool idle_hist;
bool show_callchain;
unsigned int max_stack;
bool show_cpu_visual;
@@ -2101,6 +2102,15 @@ static struct thread *get_idle_thread(int cpu)
return idle_threads[cpu];
}

+static void save_idle_callchain(struct idle_thread_runtime *itr,
+ struct perf_sample *sample)
+{
+ if (!symbol_conf.use_callchain || sample->callchain == NULL)
+ return;
+
+ callchain_cursor__copy(&itr->cursor, &callchain_cursor);
+}
+
/*
* handle runtime stats saved per thread
*/
@@ -2154,6 +2164,26 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
}

save_task_callchain(sched, sample, evsel, machine);
+ if (sched->idle_hist) {
+ struct thread *idle;
+ struct idle_thread_runtime *itr;
+
+ idle = get_idle_thread(sample->cpu);
+ if (idle == NULL) {
+ pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
+ return NULL;
+ }
+
+ itr = thread__priv(idle);
+ if (itr == NULL)
+ return NULL;
+
+ itr->last_thread = thread;
+
+ /* copy task callchain when entering to idle */
+ if (perf_evsel__intval(evsel, sample, "next_pid") == 0)
+ save_idle_callchain(itr, sample);
+ }
}

return thread;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Jiri Olsa <jo...@redhat.com>

Add the perf_evsel::ignore_missing_thread bool.

When set to true, it lets perf ignore a perf event syscall failure caused
by the pid it was asked to monitor no longer existing.

The missing thread id is removed from the thread_map, so the rest of the
processing, like ioctl and mmap, is not disturbed by a -1 fd.

The reason for supporting this is to ease monitoring a group of pids
where some 'disappear' before perf opens their event. This currently
makes perf report an error and exit, which makes perf record's -u option
unusable under certain setups.

With this change we allow this race and ignore such failures, emitting
the following warning:

WARNING: Ignored open failure for pid 8605
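
The index handling this relies on (drop the failed entry, then step the loop index and the upper bound back by one) can be sketched in isolation. This is only an illustration under stated assumptions, not the perf code: struct pid_map, pid_map_remove(), fake_open() and open_all() are hypothetical stand-ins, and fake_open() simply pretends that pid 8605 has already exited.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's thread_map; not the real structure. */
struct pid_map {
	int nr;
	int pid[8];
};

/* Remove entry idx by shifting the tail down; returns 0 on success. */
static int pid_map_remove(struct pid_map *map, int idx)
{
	if (idx < 0 || idx >= map->nr)
		return -1;

	memmove(&map->pid[idx], &map->pid[idx + 1],
		(size_t)(map->nr - idx - 1) * sizeof(map->pid[0]));
	map->nr--;
	return 0;
}

/* Hypothetical open routine: pretends that pid 8605 already exited. */
static int fake_open(int pid)
{
	return pid == 8605 ? -1 : 0;
}

/* Open every pid, dropping vanished ones; returns the count or -1. */
static int open_all(struct pid_map *map)
{
	int nthreads, thread, opened = 0;

	assert(map != NULL);
	nthreads = map->nr;

	for (thread = 0; thread < nthreads; thread++) {
		if (fake_open(map->pid[thread]) < 0) {
			/* As in the patch, a lone pid is allowed to fail. */
			if (map->nr == 1 || pid_map_remove(map, thread))
				return -1;
			/*
			 * One entry is gone: revisit the same index on the
			 * next iteration and lower the upper bound.
			 */
			nthreads--;
			thread--;
			continue;
		}
		opened++;
	}

	return opened;
}
```

Calling open_all() on a map holding { 100, 8605, 200 } leaves { 100, 200 } behind, because the entry that slid into the freed slot is visited on the next pass instead of being skipped.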

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/20161213074622.GA3084@krava
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/perf.h | 1 +
tools/perf/util/evsel.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/evsel.h | 1 +
3 files changed, 46 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 9a0236a4cf95..1c27d947c2fe 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -55,6 +55,7 @@ struct record_opts {
bool all_user;
bool tail_synthesize;
bool overwrite;
+ bool ignore_missing_thread;
unsigned int freq;
unsigned int mmap_pages;
unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index fd61ebd77c26..04e536ae4d88 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -990,6 +990,8 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
* it overloads any global configuration.
*/
apply_config_terms(evsel, opts);
+
+ evsel->ignore_missing_thread = opts->ignore_missing_thread;
}

static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
@@ -1419,6 +1421,33 @@ static int __open_attr__fprintf(FILE *fp, const char *name, const char *val,
return fprintf(fp, " %-32s %s\n", name, val);
}

+static bool ignore_missing_thread(struct perf_evsel *evsel,
+ struct thread_map *threads,
+ int thread, int err)
+{
+ if (!evsel->ignore_missing_thread)
+ return false;
+
+ /* The system wide setup does not work with threads. */
+ if (evsel->system_wide)
+ return false;
+
+ /* The -ESRCH is perf event syscall errno for pid's not found. */
+ if (err != -ESRCH)
+ return false;
+
+ /* If there's only one thread, let it fail. */
+ if (threads->nr == 1)
+ return false;
+
+ if (thread_map__remove(threads, thread))
+ return false;
+
+ pr_warning("WARNING: Ignored open failure for pid %d\n",
+ thread_map__pid(threads, thread));
+ return true;
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
@@ -1491,6 +1520,21 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,

if (fd < 0) {
err = -errno;
+
+ if (ignore_missing_thread(evsel, threads, thread, err)) {
+ /*
+ * We just removed 1 thread, so take a step
+ * back on thread index and lower the upper
+ * nthreads limit.
+ */
+ nthreads--;
+ thread--;
+
+ /* ... and pretend like nothing have happened. */
+ err = 0;
+ continue;
+ }
+
pr_debug2("\nsys_perf_event_open failed, error %d\n",
err);
goto try_fallback;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6abb89cd27f9..06ef6f29efa1 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -120,6 +120,7 @@ struct perf_evsel {
bool tracking;
bool per_pkg;
bool precise_max;
+ bool ignore_missing_thread;
/* parse modifier helper */
int exclude_GH;
int nr_members;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

'perf report --tui' exits with an error when it finds a sample for a
zero-length symbol (i.e. addr == sym->start == sym->end). These are
valid samples, so don't exit the TUI; show the report with such symbols.
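
The relaxed range check can be written as a standalone predicate. This is a minimal sketch, assuming a simplified symbol with only start and end addresses; struct sym_range and addr_out_of_range() are hypothetical names used for illustration and do not exist in perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol: the half-open range [start, end). */
struct sym_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Returns true when addr must be rejected. A zero-length symbol
 * (start == end) accepts exactly one address, its own start.
 */
static bool addr_out_of_range(const struct sym_range *sym, uint64_t addr)
{
	bool outside, zero_len_hit;

	assert(sym != NULL);

	outside = addr < sym->start || addr >= sym->end;
	zero_len_hit = sym->start == sym->end && addr == sym->end;

	return outside && !zero_len_hit;
}
```

For a regular symbol the behaviour is unchanged: addr == sym->end is still rejected, since the end address is exclusive.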

Reported-and-Tested-by: Anton Blanchard <an...@samba.org>
Link: https://lkml.org/lkml/2016/10/8/189
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Cc: Chris Riyder <chris...@arm.com>
Cc: linuxp...@lists.ozlabs.org
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Nicholas Piggin <npi...@gmail.com>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: sta...@kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1479804050-5028-1-git-s...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c81a3950a7fe..06cc04e5806a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -647,7 +647,8 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,

pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map->unmap_ip(map, addr));

- if (addr < sym->start || addr >= sym->end) {
+ if ((addr < sym->start || addr >= sym->end) &&
+ (addr != sym->end || sym->start != sym->end)) {
pr_debug("%s(%d): ERANGE! sym->name=%s, start=%#" PRIx64 ", addr=%#" PRIx64 ", end=%#" PRIx64 "\n",
__func__, __LINE__, sym->name, sym->start, addr, sym->end);
return -ERANGE;
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Kan Liang <kan....@intel.com>

Fix a perf diff regression introduced by commit 5baecbcd9c9a ("perf
symbols: we can now read separate debug-info files based on a build ID").

Different binaries can share the same name when perf diff compares them,
so the build id is used to distinguish between them. However, the
previous patch assumes that the same binary name implies the same build
id, so it overwrites the build id according to the binary name,
regardless of whether the build id is already set.

Check has_build_id in dso__load(). If the build id is already set, use
it.

Before the fix:

$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% -99.80% tchain_edit [.] f2
0.12% +99.81% tchain_edit [.] f3
0.02% -0.01% [ixgbe] [k] ixgbe_read_reg

After the fix:
$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% +0.10% tchain_edit [.] f3
0.12% -0.08% tchain_edit [.] f2
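
The guard amounts to "only adopt the on-disk build id when none was recorded". A minimal sketch of that rule follows, assuming a reduced dso with just the two relevant fields; struct dso_sketch, dso_sketch_set_build_id() and dso_sketch_load() are hypothetical names, not perf functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUILD_ID_SIZE 20

/* Hypothetical, reduced dso: only the fields the guard looks at. */
struct dso_sketch {
	bool has_build_id;
	unsigned char build_id[SKETCH_BUILD_ID_SIZE];
};

/* Record a build id and mark it as set. */
static void dso_sketch_set_build_id(struct dso_sketch *dso,
				    const unsigned char *id)
{
	memcpy(dso->build_id, id, SKETCH_BUILD_ID_SIZE);
	dso->has_build_id = true;
}

/*
 * Adopt the id read from the file on disk only if no id was recorded
 * earlier. Returns true when the on-disk id was adopted.
 */
static bool dso_sketch_load(struct dso_sketch *dso,
			    const unsigned char *on_disk_id)
{
	assert(dso != NULL);

	if (dso->has_build_id || on_disk_id == NULL)
		return false;

	dso_sketch_set_build_id(dso, on_disk_id);
	return true;
}
```

With this ordering, an id that was already recorded for the dso wins over whatever binary currently sits at the same path.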

Signed-off-by: Kan Liang <kan....@intel.com>
Cc: Andi Kleen <an...@firstfloor.org>
Cc: Dima Kogan <di...@secretsauce.net>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Fixes: 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-gi...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/symbol.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index df2482b2ba45..dc93940de351 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1459,7 +1459,8 @@ int dso__load(struct dso *dso, struct map *map)
* Read the build id if possible. This is required for
* DSO_BINARY_TYPE__BUILDID_DEBUGINFO to work
*/
- if (is_regular_file(dso->long_name) &&
+ if (!dso->has_build_id &&
+ is_regular_file(dso->long_name) &&
filename__read_build_id(dso->long_name, build_id, BUILD_ID_SIZE) > 0)
dso__set_build_id(dso, build_id);

--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Joe Stringer <j...@ovn.org>

Switch all of the sample code to use the function names from
tools/lib/bpf so that they're consistent with that library, and to
declare their own log buffers. This allows the next commit to be purely
devoted to getting rid of the duplicate library in samples/bpf.
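
The "declare their own log buffers" part is a move from a library-owned global buffer to one passed in by the caller. A minimal sketch of that calling convention follows, assuming a mock loader that never touches the bpf syscall; load_program_sketch() and SKETCH_LOG_BUF_SIZE are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_LOG_BUF_SIZE 256

/*
 * Hypothetical loader: diagnostics go into the caller's buffer rather
 * than into a global owned by the library. Returns a pretend fd on
 * success and -1 on failure.
 */
static int load_program_sketch(int insn_cnt, char *log_buf, size_t log_buf_sz)
{
	assert(log_buf != NULL && log_buf_sz > 0);

	/* Start from an empty log so stale text is never reported. */
	log_buf[0] = '\0';

	if (insn_cnt <= 0) {
		snprintf(log_buf, log_buf_sz,
			 "rejected: %d instructions", insn_cnt);
		return -1;
	}

	return 3; /* pretend file descriptor */
}
```

A caller then declares its own storage, for example `char log[SKETCH_LOG_BUF_SIZE];`, and passes `log, sizeof(log)`, so the library side no longer has to export a buffer.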

Committer notes:

Testing it:

On a fedora rawhide container, with clang/llvm 3.9, sharing the host
linux kernel git tree:

# make O=/tmp/build/linux/ headers_install
# make O=/tmp/build/linux -C samples/bpf/

Since I forgot to make the container privileged, I just tested it
outside the container, using what it generated:

# uname -a
Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
# cd /var/lib/docker/devicemapper/mnt/c43e09a53ff56c86a07baf79847f00e2cc2a17a1e2220e1adbf8cbc62734feda/rootfs/tmp/build/linux/samples/bpf/
# ls -la offwaketime
-rwxr-xr-x. 1 root root 24200 Dec 15 12:19 offwaketime
# file offwaketime
offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=c940d3f127d5e66cdd680e42d885cb0b64f8a0e4, not stripped
# readelf -SW offwaketime_kern.o | grep PROGBITS
[ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
[ 3] kprobe/try_to_wake_up PROGBITS 0000000000000000 000040 0000d8 00 AX 0 0 8
[ 5] tracepoint/sched/sched_switch PROGBITS 0000000000000000 000118 000318 00 AX 0 0 8
[ 7] maps PROGBITS 0000000000000000 000430 000050 00 WA 0 0 4
[ 8] license PROGBITS 0000000000000000 000480 000004 00 WA 0 0 1
[ 9] version PROGBITS 0000000000000000 000484 000004 00 WA 0 0 4
# ./offwaketime | head -5
swapper/1;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 106
CPU 0/KVM;entry_SYSCALL_64_fastpath;sys_ioctl;do_vfs_ioctl;kvm_vcpu_ioctl;kvm_arch_vcpu_ioctl_run;kvm_vcpu_block;schedule;__schedule;-;try_to_wake_up;swake_up_locked;swake_up;apic_timer_expired;apic_timer_fn;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary;;swapper/3 2
Compositor;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;futex_requeue;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;SoftwareVsyncTh 5
firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 13
JS Helper;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;firefox 2
#

Signed-off-by: Joe Stringer <j...@ovn.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/2016121422434...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
samples/bpf/bpf_load.c | 17 +++++++++---
samples/bpf/bpf_load.h | 3 +++
samples/bpf/fds_example.c | 9 ++++---
samples/bpf/lathist_user.c | 2 +-
samples/bpf/libbpf.c | 23 ++++++++--------
samples/bpf/libbpf.h | 18 ++++++-------
samples/bpf/lwt_len_hist_user.c | 6 +++--
samples/bpf/offwaketime_user.c | 8 +++---
samples/bpf/sampleip_user.c | 4 +--
samples/bpf/sock_example.c | 12 +++++----
samples/bpf/sockex1_user.c | 6 ++---
samples/bpf/sockex2_user.c | 4 +--
samples/bpf/sockex3_user.c | 4 +--
samples/bpf/spintest_user.c | 8 +++---
samples/bpf/tc_l2_redirect_user.c | 4 +--
samples/bpf/test_cgrp2_array_pin.c | 4 +--
samples/bpf/test_cgrp2_attach.c | 11 +++++---
samples/bpf/test_cgrp2_attach2.c | 7 +++--
samples/bpf/test_cgrp2_sock.c | 6 +++--
samples/bpf/test_current_task_under_cgroup_user.c | 8 +++---
samples/bpf/test_lru_dist.c | 32 +++++++++++------------
samples/bpf/test_probe_write_user_user.c | 2 +-
samples/bpf/trace_event_user.c | 14 +++++-----
samples/bpf/trace_output_user.c | 2 +-
samples/bpf/tracex2_user.c | 10 +++----
samples/bpf/tracex3_user.c | 4 +--
samples/bpf/tracex4_user.c | 4 +--
samples/bpf/tracex6_user.c | 2 +-
samples/bpf/xdp1_user.c | 2 +-
samples/bpf/xdp_tx_iptunnel_user.c | 6 ++---
30 files changed, 133 insertions(+), 109 deletions(-)

diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index e30b6de94f2e..f5b186c46b7c 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -22,7 +22,6 @@
#include <poll.h>
#include <ctype.h>
#include "libbpf.h"
-#include "bpf_helpers.h"
#include "bpf_load.h"

#define DEBUGFS "/sys/kernel/debug/tracing/"
@@ -30,17 +29,26 @@
static char license[128];
static int kern_version;
static bool processed_sec[128];
+char bpf_log_buf[BPF_LOG_BUF_SIZE];
int map_fd[MAX_MAPS];
int prog_fd[MAX_PROGS];
int event_fd[MAX_PROGS];
int prog_cnt;
int prog_array_fd = -1;

+struct bpf_map_def {
+ unsigned int type;
+ unsigned int key_size;
+ unsigned int value_size;
+ unsigned int max_entries;
+ unsigned int map_flags;
+};
+
static int populate_prog_array(const char *event, int prog_fd)
{
int ind = atoi(event), err;

- err = bpf_update_elem(prog_array_fd, &ind, &prog_fd, BPF_ANY);
+ err = bpf_map_update_elem(prog_array_fd, &ind, &prog_fd, BPF_ANY);
if (err < 0) {
printf("failed to store prog_fd in prog_array\n");
return -1;
@@ -87,9 +95,10 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
return -1;
}

- fd = bpf_prog_load(prog_type, prog, size, license, kern_version);
+ fd = bpf_load_program(prog_type, prog, size, license, kern_version,
+ bpf_log_buf, BPF_LOG_BUF_SIZE);
if (fd < 0) {
- printf("bpf_prog_load() err=%d\n%s", errno, bpf_log_buf);
+ printf("bpf_load_program() err=%d\n%s", errno, bpf_log_buf);
return -1;
}

diff --git a/samples/bpf/bpf_load.h b/samples/bpf/bpf_load.h
index fb46a421ab41..c827827299b3 100644
--- a/samples/bpf/bpf_load.h
+++ b/samples/bpf/bpf_load.h
@@ -1,12 +1,15 @@
#ifndef __BPF_LOAD_H
#define __BPF_LOAD_H

+#include "libbpf.h"
+
#define MAX_MAPS 32
#define MAX_PROGS 32

extern int map_fd[MAX_MAPS];
extern int prog_fd[MAX_PROGS];
extern int event_fd[MAX_PROGS];
+extern char bpf_log_buf[BPF_LOG_BUF_SIZE];
extern int prog_cnt;

/* parses elf file compiled by llvm .c->.o
diff --git a/samples/bpf/fds_example.c b/samples/bpf/fds_example.c
index 625e797be6ef..8a4fc4ef3993 100644
--- a/samples/bpf/fds_example.c
+++ b/samples/bpf/fds_example.c
@@ -58,8 +58,9 @@ static int bpf_prog_create(const char *object)
assert(!load_bpf_file((char *)object));
return prog_fd[0];
} else {
- return bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER,
- insns, sizeof(insns), "GPL", 0);
+ return bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER,
+ insns, sizeof(insns), "GPL", 0,
+ bpf_log_buf, BPF_LOG_BUF_SIZE);
}
}

@@ -83,12 +84,12 @@ static int bpf_do_map(const char *file, uint32_t flags, uint32_t key,
}

if ((flags & BPF_F_KEY_VAL) == BPF_F_KEY_VAL) {
- ret = bpf_update_elem(fd, &key, &value, 0);
+ ret = bpf_map_update_elem(fd, &key, &value, 0);
printf("bpf: fd:%d u->(%u:%u) ret:(%d,%s)\n", fd, key, value,
ret, strerror(errno));
assert(ret == 0);
} else if (flags & BPF_F_KEY) {
- ret = bpf_lookup_elem(fd, &key, &value);
+ ret = bpf_map_lookup_elem(fd, &key, &value);
printf("bpf: fd:%d l->(%u):%u ret:(%d,%s)\n", fd, key, value,
ret, strerror(errno));
assert(ret == 0);
diff --git a/samples/bpf/lathist_user.c b/samples/bpf/lathist_user.c
index 65da8c1576de..6477bad5b4e2 100644
--- a/samples/bpf/lathist_user.c
+++ b/samples/bpf/lathist_user.c
@@ -73,7 +73,7 @@ static void get_data(int fd)
for (c = 0; c < MAX_CPU; c++) {
for (i = 0; i < MAX_ENTRIES; i++) {
key = c * MAX_ENTRIES + i;
- bpf_lookup_elem(fd, &key, &value);
+ bpf_map_lookup_elem(fd, &key, &value);

cpu_hist[c].data[i] = value;
if (value > cpu_hist[c].max)
diff --git a/samples/bpf/libbpf.c b/samples/bpf/libbpf.c
index 9ce707bf02a7..6f076abdca35 100644
--- a/samples/bpf/libbpf.c
+++ b/samples/bpf/libbpf.c
@@ -32,7 +32,7 @@ int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}

-int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags)
+int bpf_map_update_elem(int fd, void *key, void *value, unsigned long long flags)
{
union bpf_attr attr = {
.map_fd = fd,
@@ -44,7 +44,7 @@ int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags)
return syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}

-int bpf_lookup_elem(int fd, void *key, void *value)
+int bpf_map_lookup_elem(int fd, void *key, void *value)
{
union bpf_attr attr = {
.map_fd = fd,
@@ -55,7 +55,7 @@ int bpf_lookup_elem(int fd, void *key, void *value)
return syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}

-int bpf_delete_elem(int fd, void *key)
+int bpf_map_delete_elem(int fd, void *key)
{
union bpf_attr attr = {
.map_fd = fd,
@@ -65,7 +65,7 @@ int bpf_delete_elem(int fd, void *key)
return syscall(__NR_bpf, BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
}

-int bpf_get_next_key(int fd, void *key, void *next_key)
+int bpf_map_get_next_key(int fd, void *key, void *next_key)
{
union bpf_attr attr = {
.map_fd = fd,
@@ -78,19 +78,18 @@ int bpf_get_next_key(int fd, void *key, void *next_key)

#define ROUND_UP(x, n) (((x) + (n) - 1u) & ~((n) - 1u))

-char bpf_log_buf[LOG_BUF_SIZE];
-
-int bpf_prog_load(enum bpf_prog_type prog_type,
- const struct bpf_insn *insns, int prog_len,
- const char *license, int kern_version)
+int bpf_load_program(enum bpf_prog_type prog_type,
+ const struct bpf_insn *insns, int prog_len,
+ const char *license, int kern_version,
+ char *log_buf, size_t log_buf_sz)
{
union bpf_attr attr = {
.prog_type = prog_type,
.insns = ptr_to_u64((void *) insns),
.insn_cnt = prog_len / sizeof(struct bpf_insn),
.license = ptr_to_u64((void *) license),
- .log_buf = ptr_to_u64(bpf_log_buf),
- .log_size = LOG_BUF_SIZE,
+ .log_buf = ptr_to_u64(log_buf),
+ .log_size = log_buf_sz,
.log_level = 1,
};

@@ -99,7 +98,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
*/
attr.kern_version = kern_version;

- bpf_log_buf[0] = 0;
+ log_buf[0] = 0;

return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
index 94a901d86fc2..20e3457857ca 100644
--- a/samples/bpf/libbpf.h
+++ b/samples/bpf/libbpf.h
@@ -6,14 +6,15 @@ struct bpf_insn;

int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
int max_entries, int map_flags);
-int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags);
-int bpf_lookup_elem(int fd, void *key, void *value);
-int bpf_delete_elem(int fd, void *key);
-int bpf_get_next_key(int fd, void *key, void *next_key);
+int bpf_map_update_elem(int fd, void *key, void *value, unsigned long long flags);
+int bpf_map_lookup_elem(int fd, void *key, void *value);
+int bpf_map_delete_elem(int fd, void *key);
+int bpf_map_get_next_key(int fd, void *key, void *next_key);

-int bpf_prog_load(enum bpf_prog_type prog_type,
- const struct bpf_insn *insns, int insn_len,
- const char *license, int kern_version);
+int bpf_load_program(enum bpf_prog_type prog_type,
+ const struct bpf_insn *insns, int insn_len,
+ const char *license, int kern_version,
+ char *log_buf, size_t log_buf_sz);

int bpf_prog_attach(int prog_fd, int attachable_fd, enum bpf_attach_type type);
int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
@@ -21,8 +22,7 @@ int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
int bpf_obj_pin(int fd, const char *pathname);
int bpf_obj_get(const char *pathname);

-#define LOG_BUF_SIZE (256 * 1024)
-extern char bpf_log_buf[LOG_BUF_SIZE];
+#define BPF_LOG_BUF_SIZE (256 * 1024)

/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */

diff --git a/samples/bpf/lwt_len_hist_user.c b/samples/bpf/lwt_len_hist_user.c
index 05d783fc5daf..ec8f3bbcbef3 100644
--- a/samples/bpf/lwt_len_hist_user.c
+++ b/samples/bpf/lwt_len_hist_user.c
@@ -14,6 +14,8 @@
#define MAX_INDEX 64
#define MAX_STARS 38

+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
static void stars(char *str, long val, long max, int width)
{
int i;
@@ -41,13 +43,13 @@ int main(int argc, char **argv)
return -1;
}

- while (bpf_get_next_key(map_fd, &key, &next_key) == 0) {
+ while (bpf_map_get_next_key(map_fd, &key, &next_key) == 0) {
if (next_key >= MAX_INDEX) {
fprintf(stderr, "Key %lu out of bounds\n", next_key);
continue;
}

- bpf_lookup_elem(map_fd, &next_key, values);
+ bpf_map_lookup_elem(map_fd, &next_key, values);

sum = 0;
for (i = 0; i < nr_cpus; i++)
diff --git a/samples/bpf/offwaketime_user.c b/samples/bpf/offwaketime_user.c
index 6f002a9c24fa..9cce2a66bd66 100644
--- a/samples/bpf/offwaketime_user.c
+++ b/samples/bpf/offwaketime_user.c
@@ -49,14 +49,14 @@ static void print_stack(struct key_t *key, __u64 count)
int i;

printf("%s;", key->target);
- if (bpf_lookup_elem(map_fd[3], &key->tret, ip) != 0) {
+ if (bpf_map_lookup_elem(map_fd[3], &key->tret, ip) != 0) {
printf("---;");
} else {
for (i = PERF_MAX_STACK_DEPTH - 1; i >= 0; i--)
print_ksym(ip[i]);
}
printf("-;");
- if (bpf_lookup_elem(map_fd[3], &key->wret, ip) != 0) {
+ if (bpf_map_lookup_elem(map_fd[3], &key->wret, ip) != 0) {
printf("---;");
} else {
for (i = 0; i < PERF_MAX_STACK_DEPTH; i++)
@@ -77,8 +77,8 @@ static void print_stacks(int fd)
struct key_t key = {}, next_key;
__u64 value;

- while (bpf_get_next_key(fd, &key, &next_key) == 0) {
- bpf_lookup_elem(fd, &next_key, &value);
+ while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {
+ bpf_map_lookup_elem(fd, &next_key, &value);
print_stack(&next_key, value);
key = next_key;
}
diff --git a/samples/bpf/sampleip_user.c b/samples/bpf/sampleip_user.c
index 260a6bdd6413..5ac5adf75931 100644
--- a/samples/bpf/sampleip_user.c
+++ b/samples/bpf/sampleip_user.c
@@ -95,8 +95,8 @@ static void print_ip_map(int fd)

/* fetch IPs and counts */
key = 0, i = 0;
- while (bpf_get_next_key(fd, &key, &next_key) == 0) {
- bpf_lookup_elem(fd, &next_key, &value);
+ while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {
+ bpf_map_lookup_elem(fd, &next_key, &value);
counts[i].ip = next_key;
counts[i++].count = value;
key = next_key;
diff --git a/samples/bpf/sock_example.c b/samples/bpf/sock_example.c
index 28b60baa9fa8..d6b91e9a38ad 100644
--- a/samples/bpf/sock_example.c
+++ b/samples/bpf/sock_example.c
@@ -28,6 +28,8 @@
#include <stddef.h>
#include "libbpf.h"

+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
static int test_sock(void)
{
int sock = -1, map_fd, prog_fd, i, key;
@@ -55,8 +57,8 @@ static int test_sock(void)
BPF_EXIT_INSN(),
};

- prog_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, prog, sizeof(prog),
- "GPL", 0);
+ prog_fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER, prog, sizeof(prog),
+ "GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);
if (prog_fd < 0) {
printf("failed to load prog '%s'\n", strerror(errno));
goto cleanup;
@@ -72,13 +74,13 @@ static int test_sock(void)

for (i = 0; i < 10; i++) {
key = IPPROTO_TCP;
- assert(bpf_lookup_elem(map_fd, &key, &tcp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &tcp_cnt) == 0);

key = IPPROTO_UDP;
- assert(bpf_lookup_elem(map_fd, &key, &udp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &udp_cnt) == 0);

key = IPPROTO_ICMP;
- assert(bpf_lookup_elem(map_fd, &key, &icmp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &icmp_cnt) == 0);

printf("TCP %lld UDP %lld ICMP %lld packets\n",
tcp_cnt, udp_cnt, icmp_cnt);
diff --git a/samples/bpf/sockex1_user.c b/samples/bpf/sockex1_user.c
index 678ce4693551..9454448bf198 100644
--- a/samples/bpf/sockex1_user.c
+++ b/samples/bpf/sockex1_user.c
@@ -32,13 +32,13 @@ int main(int ac, char **argv)
int key;

key = IPPROTO_TCP;
- assert(bpf_lookup_elem(map_fd[0], &key, &tcp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd[0], &key, &tcp_cnt) == 0);

key = IPPROTO_UDP;
- assert(bpf_lookup_elem(map_fd[0], &key, &udp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd[0], &key, &udp_cnt) == 0);

key = IPPROTO_ICMP;
- assert(bpf_lookup_elem(map_fd[0], &key, &icmp_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd[0], &key, &icmp_cnt) == 0);

printf("TCP %lld UDP %lld ICMP %lld bytes\n",
tcp_cnt, udp_cnt, icmp_cnt);
diff --git a/samples/bpf/sockex2_user.c b/samples/bpf/sockex2_user.c
index 8a4085c2d117..6a40600d5a83 100644
--- a/samples/bpf/sockex2_user.c
+++ b/samples/bpf/sockex2_user.c
@@ -39,8 +39,8 @@ int main(int ac, char **argv)
int key = 0, next_key;
struct pair value;

- while (bpf_get_next_key(map_fd[0], &key, &next_key) == 0) {
- bpf_lookup_elem(map_fd[0], &next_key, &value);
+ while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0) {
+ bpf_map_lookup_elem(map_fd[0], &next_key, &value);
printf("ip %s bytes %lld packets %lld\n",
inet_ntoa((struct in_addr){htonl(next_key)}),
value.bytes, value.packets);
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index 3fcfd8c4b2a3..9099c4255f23 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -54,8 +54,8 @@ int main(int argc, char **argv)

sleep(1);
printf("IP src.port -> dst.port bytes packets\n");
- while (bpf_get_next_key(map_fd[2], &key, &next_key) == 0) {
- bpf_lookup_elem(map_fd[2], &next_key, &value);
+ while (bpf_map_get_next_key(map_fd[2], &key, &next_key) == 0) {
+ bpf_map_lookup_elem(map_fd[2], &next_key, &value);
printf("%s.%05d -> %s.%05d %12lld %12lld\n",
inet_ntoa((struct in_addr){htonl(next_key.src)}),
next_key.port16[0],
diff --git a/samples/bpf/spintest_user.c b/samples/bpf/spintest_user.c
index 311ede532230..80676c25fa50 100644
--- a/samples/bpf/spintest_user.c
+++ b/samples/bpf/spintest_user.c
@@ -31,8 +31,8 @@ int main(int ac, char **argv)
for (i = 0; i < 5; i++) {
key = 0;
printf("kprobing funcs:");
- while (bpf_get_next_key(map_fd[0], &key, &next_key) == 0) {
- bpf_lookup_elem(map_fd[0], &next_key, &value);
+ while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0) {
+ bpf_map_lookup_elem(map_fd[0], &next_key, &value);
assert(next_key == value);
sym = ksym_search(value);
printf(" %s", sym->name);
@@ -41,8 +41,8 @@ int main(int ac, char **argv)
if (key)
printf("\n");
key = 0;
- while (bpf_get_next_key(map_fd[0], &key, &next_key) == 0)
- bpf_delete_elem(map_fd[0], &next_key);
+ while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0)
+ bpf_map_delete_elem(map_fd[0], &next_key);
sleep(1);
}

diff --git a/samples/bpf/tc_l2_redirect_user.c b/samples/bpf/tc_l2_redirect_user.c
index 4013c5337b91..28995a776560 100644
--- a/samples/bpf/tc_l2_redirect_user.c
+++ b/samples/bpf/tc_l2_redirect_user.c
@@ -60,9 +60,9 @@ int main(int argc, char **argv)
}

/* bpf_tunnel_key.remote_ipv4 expects host byte orders */
- ret = bpf_update_elem(array_fd, &array_key, &ifindex, 0);
+ ret = bpf_map_update_elem(array_fd, &array_key, &ifindex, 0);
if (ret) {
- perror("bpf_update_elem");
+ perror("bpf_map_update_elem");
goto out;
}

diff --git a/samples/bpf/test_cgrp2_array_pin.c b/samples/bpf/test_cgrp2_array_pin.c
index 70e86f7be69d..8a1b8b5d8def 100644
--- a/samples/bpf/test_cgrp2_array_pin.c
+++ b/samples/bpf/test_cgrp2_array_pin.c
@@ -85,9 +85,9 @@ int main(int argc, char **argv)
}
}

- ret = bpf_update_elem(array_fd, &array_key, &cg2_fd, 0);
+ ret = bpf_map_update_elem(array_fd, &array_key, &cg2_fd, 0);
if (ret) {
- perror("bpf_update_elem");
+ perror("bpf_map_update_elem");
goto out;
}

diff --git a/samples/bpf/test_cgrp2_attach.c b/samples/bpf/test_cgrp2_attach.c
index a19484c45b79..8283ef86d392 100644
--- a/samples/bpf/test_cgrp2_attach.c
+++ b/samples/bpf/test_cgrp2_attach.c
@@ -36,6 +36,8 @@ enum {
MAP_KEY_BYTES,
};

+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
static int prog_load(int map_fd, int verdict)
{
struct bpf_insn prog[] = {
@@ -67,8 +69,9 @@ static int prog_load(int map_fd, int verdict)
BPF_EXIT_INSN(),
};

- return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SKB,
- prog, sizeof(prog), "GPL", 0);
+ return bpf_load_program(BPF_PROG_TYPE_CGROUP_SKB,
+ prog, sizeof(prog), "GPL", 0,
+ bpf_log_buf, BPF_LOG_BUF_SIZE);
}

static int usage(const char *argv0)
@@ -108,10 +111,10 @@ static int attach_filter(int cg_fd, int type, int verdict)
}
while (1) {
key = MAP_KEY_PACKETS;
- assert(bpf_lookup_elem(map_fd, &key, &pkt_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &pkt_cnt) == 0);

key = MAP_KEY_BYTES;
- assert(bpf_lookup_elem(map_fd, &key, &byte_cnt) == 0);
+ assert(bpf_map_lookup_elem(map_fd, &key, &byte_cnt) == 0);

printf("cgroup received %lld packets, %lld bytes\n",
pkt_cnt, byte_cnt);
diff --git a/samples/bpf/test_cgrp2_attach2.c b/samples/bpf/test_cgrp2_attach2.c
index ddfac42ed4df..fc6092fdc3b0 100644
--- a/samples/bpf/test_cgrp2_attach2.c
+++ b/samples/bpf/test_cgrp2_attach2.c
@@ -32,6 +32,8 @@
#define BAR "/foo/bar/"
#define PING_CMD "ping -c1 -w1 127.0.0.1"

+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
static int prog_load(int verdict)
{
int ret;
@@ -40,8 +42,9 @@ static int prog_load(int verdict)
BPF_EXIT_INSN(),
};

- ret = bpf_prog_load(BPF_PROG_TYPE_CGROUP_SKB,
- prog, sizeof(prog), "GPL", 0);
+ ret = bpf_load_program(BPF_PROG_TYPE_CGROUP_SKB,
+ prog, sizeof(prog), "GPL", 0,
+ bpf_log_buf, BPF_LOG_BUF_SIZE);

if (ret < 0) {
log_err("Loading program");
diff --git a/samples/bpf/test_cgrp2_sock.c b/samples/bpf/test_cgrp2_sock.c
index d467b3c1c55c..43b4bde5d05c 100644
--- a/samples/bpf/test_cgrp2_sock.c
+++ b/samples/bpf/test_cgrp2_sock.c
@@ -23,6 +23,8 @@

#include "libbpf.h"

+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
static int prog_load(int idx)
{
struct bpf_insn prog[] = {
@@ -34,8 +36,8 @@ static int prog_load(int idx)
BPF_EXIT_INSN(),
};

- return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
- "GPL", 0);
+ return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+ "GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);
}

static int usage(const char *argv0)
diff --git a/samples/bpf/test_current_task_under_cgroup_user.c b/samples/bpf/test_current_task_under_cgroup_user.c
index 95aaaa846130..65b5fb51c1db 100644
--- a/samples/bpf/test_current_task_under_cgroup_user.c
+++ b/samples/bpf/test_current_task_under_cgroup_user.c
@@ -36,7 +36,7 @@ int main(int argc, char **argv)
if (!cg2)
goto err;

- if (bpf_update_elem(map_fd[0], &idx, &cg2, BPF_ANY)) {
+ if (bpf_map_update_elem(map_fd[0], &idx, &cg2, BPF_ANY)) {
log_err("Adding target cgroup to map");
goto err;
}
@@ -50,7 +50,7 @@ int main(int argc, char **argv)
*/

sync();
- bpf_lookup_elem(map_fd[1], &idx, &remote_pid);
+ bpf_map_lookup_elem(map_fd[1], &idx, &remote_pid);

if (local_pid != remote_pid) {
fprintf(stderr,
@@ -64,10 +64,10 @@ int main(int argc, char **argv)
goto err;

remote_pid = 0;
- bpf_update_elem(map_fd[1], &idx, &remote_pid, BPF_ANY);
+ bpf_map_update_elem(map_fd[1], &idx, &remote_pid, BPF_ANY);

sync();
- bpf_lookup_elem(map_fd[1], &idx, &remote_pid);
+ bpf_map_lookup_elem(map_fd[1], &idx, &remote_pid);

if (local_pid == remote_pid) {
fprintf(stderr, "BPF cgroup negative test did not work\n");
diff --git a/samples/bpf/test_lru_dist.c b/samples/bpf/test_lru_dist.c
index 316230a0ed23..d96dc88d3b04 100644
--- a/samples/bpf/test_lru_dist.c
+++ b/samples/bpf/test_lru_dist.c
@@ -134,7 +134,7 @@ static int pfect_lru_lookup_or_insert(struct pfect_lru *lru,
int seen = 0;

lru->total++;
- if (!bpf_lookup_elem(lru->map_fd, &key, &node)) {
+ if (!bpf_map_lookup_elem(lru->map_fd, &key, &node)) {
if (node) {
list_move(&node->list, &lru->list);
return 1;
@@ -151,7 +151,7 @@ static int pfect_lru_lookup_or_insert(struct pfect_lru *lru,
node = list_last_entry(&lru->list,
struct pfect_lru_node,
list);
- bpf_update_elem(lru->map_fd, &node->key, &null_node, BPF_EXIST);
+ bpf_map_update_elem(lru->map_fd, &node->key, &null_node, BPF_EXIST);
}

node->key = key;
@@ -159,10 +159,10 @@ static int pfect_lru_lookup_or_insert(struct pfect_lru *lru,

lru->nr_misses++;
if (seen) {
- assert(!bpf_update_elem(lru->map_fd, &key, &node, BPF_EXIST));
+ assert(!bpf_map_update_elem(lru->map_fd, &key, &node, BPF_EXIST));
} else {
lru->nr_unique++;
- assert(!bpf_update_elem(lru->map_fd, &key, &node, BPF_NOEXIST));
+ assert(!bpf_map_update_elem(lru->map_fd, &key, &node, BPF_NOEXIST));
}

return seen;
@@ -285,11 +285,11 @@ static void do_test_lru_dist(int task, void *data)

pfect_lru_lookup_or_insert(&pfect_lru, key);

- if (!bpf_lookup_elem(lru_map_fd, &key, &value))
+ if (!bpf_map_lookup_elem(lru_map_fd, &key, &value))
continue;

- if (bpf_update_elem(lru_map_fd, &key, &value, BPF_NOEXIST)) {
- printf("bpf_update_elem(lru_map_fd, %llu): errno:%d\n",
+ if (bpf_map_update_elem(lru_map_fd, &key, &value, BPF_NOEXIST)) {
+ printf("bpf_map_update_elem(lru_map_fd, %llu): errno:%d\n",
key, errno);
assert(0);
}
@@ -358,19 +358,19 @@ static void test_lru_loss0(int map_type, int map_flags)
for (key = 1; key <= 1000; key++) {
int start_key, end_key;

- assert(bpf_update_elem(map_fd, &key, value, BPF_NOEXIST) == 0);
+ assert(bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST) == 0);

start_key = 101;
end_key = min(key, 900);

while (start_key <= end_key) {
- bpf_lookup_elem(map_fd, &start_key, value);
+ bpf_map_lookup_elem(map_fd, &start_key, value);
start_key++;
}
}

for (key = 1; key <= 1000; key++) {
- if (bpf_lookup_elem(map_fd, &key, value)) {
+ if (bpf_map_lookup_elem(map_fd, &key, value)) {
if (key <= 100)
old_unused_losses++;
else if (key <= 900)
@@ -408,10 +408,10 @@ static void test_lru_loss1(int map_type, int map_flags)
value[0] = 1234;

for (key = 1; key <= 1000; key++)
- assert(!bpf_update_elem(map_fd, &key, value, BPF_NOEXIST));
+ assert(!bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST));

for (key = 1; key <= 1000; key++) {
- if (bpf_lookup_elem(map_fd, &key, value))
+ if (bpf_map_lookup_elem(map_fd, &key, value))
nr_losses++;
}

@@ -436,7 +436,7 @@ static void do_test_parallel_lru_loss(int task, void *data)
next_ins_key = stable_base;
value[0] = 1234;
for (i = 0; i < nr_stable_elems; i++) {
- assert(bpf_update_elem(map_fd, &next_ins_key, value,
+ assert(bpf_map_update_elem(map_fd, &next_ins_key, value,
BPF_NOEXIST) == 0);
next_ins_key++;
}
@@ -448,9 +448,9 @@ static void do_test_parallel_lru_loss(int task, void *data)

if (rn % 10) {
key = rn % nr_stable_elems + stable_base;
- bpf_lookup_elem(map_fd, &key, value);
+ bpf_map_lookup_elem(map_fd, &key, value);
} else {
- bpf_update_elem(map_fd, &next_ins_key, value,
+ bpf_map_update_elem(map_fd, &next_ins_key, value,
BPF_NOEXIST);
next_ins_key++;
}
@@ -458,7 +458,7 @@ static void do_test_parallel_lru_loss(int task, void *data)

key = stable_base;
for (i = 0; i < nr_stable_elems; i++) {
- if (bpf_lookup_elem(map_fd, &key, value))
+ if (bpf_map_lookup_elem(map_fd, &key, value))
nr_losses++;
key++;
}
diff --git a/samples/bpf/test_probe_write_user_user.c b/samples/bpf/test_probe_write_user_user.c
index a44bf347bedd..b5bf178a6ecc 100644
--- a/samples/bpf/test_probe_write_user_user.c
+++ b/samples/bpf/test_probe_write_user_user.c
@@ -50,7 +50,7 @@ int main(int ac, char **argv)
mapped_addr_in->sin_port = htons(5555);
mapped_addr_in->sin_addr.s_addr = inet_addr("255.255.255.255");

- assert(!bpf_update_elem(map_fd[0], &mapped_addr, &serv_addr, BPF_ANY));
+ assert(!bpf_map_update_elem(map_fd[0], &mapped_addr, &serv_addr, BPF_ANY));

assert(listen(serverfd, 5) == 0);

diff --git a/samples/bpf/trace_event_user.c b/samples/bpf/trace_event_user.c
index 9a130d31ecf2..704fe9fa77b2 100644
--- a/samples/bpf/trace_event_user.c
+++ b/samples/bpf/trace_event_user.c
@@ -61,14 +61,14 @@ static void print_stack(struct key_t *key, __u64 count)
int i;

printf("%3lld %s;", count, key->comm);
- if (bpf_lookup_elem(map_fd[1], &key->kernstack, ip) != 0) {
+ if (bpf_map_lookup_elem(map_fd[1], &key->kernstack, ip) != 0) {
printf("---;");
} else {
for (i = PERF_MAX_STACK_DEPTH - 1; i >= 0; i--)
print_ksym(ip[i]);
}
printf("-;");
- if (bpf_lookup_elem(map_fd[1], &key->userstack, ip) != 0) {
+ if (bpf_map_lookup_elem(map_fd[1], &key->userstack, ip) != 0) {
printf("---;");
} else {
for (i = PERF_MAX_STACK_DEPTH - 1; i >= 0; i--)
@@ -98,10 +98,10 @@ static void print_stacks(void)
int fd = map_fd[0], stack_map = map_fd[1];

sys_read_seen = sys_write_seen = false;
- while (bpf_get_next_key(fd, &key, &next_key) == 0) {
- bpf_lookup_elem(fd, &next_key, &value);
+ while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {
+ bpf_map_lookup_elem(fd, &next_key, &value);
print_stack(&next_key, value);
- bpf_delete_elem(fd, &next_key);
+ bpf_map_delete_elem(fd, &next_key);
key = next_key;
}

@@ -111,8 +111,8 @@ static void print_stacks(void)
}

/* clear stack map */
- while (bpf_get_next_key(stack_map, &stackid, &next_id) == 0) {
- bpf_delete_elem(stack_map, &next_id);
+ while (bpf_map_get_next_key(stack_map, &stackid, &next_id) == 0) {
+ bpf_map_delete_elem(stack_map, &next_id);
stackid = next_id;
}
}
diff --git a/samples/bpf/trace_output_user.c b/samples/bpf/trace_output_user.c
index 661a7d052f2c..3bedd945def1 100644
--- a/samples/bpf/trace_output_user.c
+++ b/samples/bpf/trace_output_user.c
@@ -162,7 +162,7 @@ static void test_bpf_perf_event(void)
pmu_fd = perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);

assert(pmu_fd >= 0);
- assert(bpf_update_elem(map_fd[0], &key, &pmu_fd, BPF_ANY) == 0);
+ assert(bpf_map_update_elem(map_fd[0], &key, &pmu_fd, BPF_ANY) == 0);
ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
}

diff --git a/samples/bpf/tracex2_user.c b/samples/bpf/tracex2_user.c
index 3e225e331f66..ded9804c5034 100644
--- a/samples/bpf/tracex2_user.c
+++ b/samples/bpf/tracex2_user.c
@@ -48,12 +48,12 @@ static void print_hist_for_pid(int fd, void *task)
long max_value = 0;
int i, ind;

- while (bpf_get_next_key(fd, &key, &next_key) == 0) {
+ while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {
if (memcmp(&next_key, task, SIZE)) {
key = next_key;
continue;
}
- bpf_lookup_elem(fd, &next_key, values);
+ bpf_map_lookup_elem(fd, &next_key, values);
value = 0;
for (i = 0; i < nr_cpus; i++)
value += values[i];
@@ -83,7 +83,7 @@ static void print_hist(int fd)
int task_cnt = 0;
int i;

- while (bpf_get_next_key(fd, &key, &next_key) == 0) {
+ while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {
int found = 0;

for (i = 0; i < task_cnt; i++)
@@ -136,8 +136,8 @@ int main(int ac, char **argv)

for (i = 0; i < 5; i++) {
key = 0;
- while (bpf_get_next_key(map_fd[0], &key, &next_key) == 0) {
- bpf_lookup_elem(map_fd[0], &next_key, &value);
+ while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0) {
+ bpf_map_lookup_elem(map_fd[0], &next_key, &value);
printf("location 0x%lx count %ld\n", next_key, value);
key = next_key;
}
diff --git a/samples/bpf/tracex3_user.c b/samples/bpf/tracex3_user.c
index d0851cb4fa8d..8f7d199d5945 100644
--- a/samples/bpf/tracex3_user.c
+++ b/samples/bpf/tracex3_user.c
@@ -28,7 +28,7 @@ static void clear_stats(int fd)

memset(values, 0, sizeof(values));
for (key = 0; key < SLOTS; key++)
- bpf_update_elem(fd, &key, values, BPF_ANY);
+ bpf_map_update_elem(fd, &key, values, BPF_ANY);
}

const char *color[] = {
@@ -89,7 +89,7 @@ static void print_hist(int fd)
int i;

for (key = 0; key < SLOTS; key++) {
- bpf_lookup_elem(fd, &key, values);
+ bpf_map_lookup_elem(fd, &key, values);
value = 0;
for (i = 0; i < nr_cpus; i++)
value += values[i];
diff --git a/samples/bpf/tracex4_user.c b/samples/bpf/tracex4_user.c
index bc4a3bdea6ed..03449f773cb1 100644
--- a/samples/bpf/tracex4_user.c
+++ b/samples/bpf/tracex4_user.c
@@ -37,8 +37,8 @@ static void print_old_objects(int fd)
key = write(1, "\e[1;1H\e[2J", 12); /* clear screen */

key = -1;
- while (bpf_get_next_key(map_fd[0], &key, &next_key) == 0) {
- bpf_lookup_elem(map_fd[0], &next_key, &v);
+ while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0) {
+ bpf_map_lookup_elem(map_fd[0], &next_key, &v);
key = next_key;
if (val - v.val < 1000000000ll)
/* object was allocated more then 1 sec ago */
diff --git a/samples/bpf/tracex6_user.c b/samples/bpf/tracex6_user.c
index 8ea4976cfcf1..179297cb4d35 100644
--- a/samples/bpf/tracex6_user.c
+++ b/samples/bpf/tracex6_user.c
@@ -36,7 +36,7 @@ static void test_bpf_perf_event(void)
goto exit;
}

- bpf_update_elem(map_fd[0], &i, &pmu_fd[i], BPF_ANY);
+ bpf_map_update_elem(map_fd[0], &i, &pmu_fd[i], BPF_ANY);
ioctl(pmu_fd[i], PERF_EVENT_IOC_ENABLE, 0);
}

diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
index 5f040a0d7712..d2be65d1fd86 100644
--- a/samples/bpf/xdp1_user.c
+++ b/samples/bpf/xdp1_user.c
@@ -43,7 +43,7 @@ static void poll_stats(int interval)
for (key = 0; key < nr_keys; key++) {
__u64 sum = 0;

- assert(bpf_lookup_elem(map_fd[0], &key, values) == 0);
+ assert(bpf_map_lookup_elem(map_fd[0], &key, values) == 0);
for (i = 0; i < nr_cpus; i++)
sum += (values[i] - prev[key][i]);
if (sum)
diff --git a/samples/bpf/xdp_tx_iptunnel_user.c b/samples/bpf/xdp_tx_iptunnel_user.c
index 7a71f5c74684..70e192fc61aa 100644
--- a/samples/bpf/xdp_tx_iptunnel_user.c
+++ b/samples/bpf/xdp_tx_iptunnel_user.c
@@ -51,7 +51,7 @@ static void poll_stats(unsigned int kill_after_s)
for (proto = 0; proto < nr_protos; proto++) {
__u64 sum = 0;

- assert(bpf_lookup_elem(map_fd[0], &proto, values) == 0);
+ assert(bpf_map_lookup_elem(map_fd[0], &proto, values) == 0);
for (i = 0; i < nr_cpus; i++)
sum += (values[i] - prev[proto][i]);

@@ -237,8 +237,8 @@ int main(int argc, char **argv)

while (min_port <= max_port) {
vip.dport = htons(min_port++);
- if (bpf_update_elem(map_fd[1], &vip, &tnl, BPF_NOEXIST)) {
- perror("bpf_update_elem(&vip2tnl)");
+ if (bpf_map_update_elem(map_fd[1], &vip, &tnl, BPF_NOEXIST)) {
+ perror("bpf_map_update_elem(&vip2tnl)");
return 1;
}
}
--
2.9.3

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:08 PM
From: Joe Stringer <j...@ovn.org>

The tools version of this header is out of date; update it to the latest
version from the kernel headers.

Signed-off-by: Joe Stringer <j...@ovn.org>
Acked-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Link: http://lkml.kernel.org/r/2016120902462...@ovn.org
[ Sync it harder, after merging with what was in net-next via perf/urgent via torvalds/master to get BPF_PROG_(AT|DE)TACH, etc ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/uapi/linux/bpf.h | 593 +++++++++++++++++++++++++----------------
1 file changed, 364 insertions(+), 229 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 9e5fc168c8a3..0eb0e87dbe9f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -73,6 +73,8 @@ enum bpf_cmd {
BPF_PROG_LOAD,
BPF_OBJ_PIN,
BPF_OBJ_GET,
+ BPF_PROG_ATTACH,
+ BPF_PROG_DETACH,
};

enum bpf_map_type {
@@ -85,6 +87,8 @@ enum bpf_map_type {
BPF_MAP_TYPE_PERCPU_ARRAY,
BPF_MAP_TYPE_STACK_TRACE,
BPF_MAP_TYPE_CGROUP_ARRAY,
+ BPF_MAP_TYPE_LRU_HASH,
+ BPF_MAP_TYPE_LRU_PERCPU_HASH,
};

enum bpf_prog_type {
@@ -95,8 +99,23 @@ enum bpf_prog_type {
BPF_PROG_TYPE_SCHED_ACT,
BPF_PROG_TYPE_TRACEPOINT,
BPF_PROG_TYPE_XDP,
+ BPF_PROG_TYPE_PERF_EVENT,
+ BPF_PROG_TYPE_CGROUP_SKB,
+ BPF_PROG_TYPE_CGROUP_SOCK,
+ BPF_PROG_TYPE_LWT_IN,
+ BPF_PROG_TYPE_LWT_OUT,
+ BPF_PROG_TYPE_LWT_XMIT,
};

+enum bpf_attach_type {
+ BPF_CGROUP_INET_INGRESS,
+ BPF_CGROUP_INET_EGRESS,
+ BPF_CGROUP_INET_SOCK_CREATE,
+ __MAX_BPF_ATTACH_TYPE
+};
+
+#define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
+
#define BPF_PSEUDO_MAP_FD 1

/* flags for BPF_MAP_UPDATE_ELEM command */
@@ -105,6 +124,13 @@ enum bpf_prog_type {
#define BPF_EXIST 2 /* update existing element */

#define BPF_F_NO_PREALLOC (1U << 0)
+/* Instead of having one common LRU list in the
+ * BPF_MAP_TYPE_LRU_[PERCPU_]HASH map, use a percpu LRU list
+ * which can scale and perform better.
+ * Note, the LRU nodes (including free nodes) cannot be moved
+ * across different LRU lists.
+ */
+#define BPF_F_NO_COMMON_LRU (1U << 1)

union bpf_attr {
struct { /* anonymous struct used by BPF_MAP_CREATE command */
@@ -140,243 +166,327 @@ union bpf_attr {
__aligned_u64 pathname;
__u32 bpf_fd;
};
+
+ struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
+ __u32 target_fd; /* container object to attach to */
+ __u32 attach_bpf_fd; /* eBPF program to attach */
+ __u32 attach_type;
+ };
} __attribute__((aligned(8)));

+/* BPF helper function descriptions:
+ *
+ * void *bpf_map_lookup_elem(&map, &key)
+ * Return: Map value or NULL
+ *
+ * int bpf_map_update_elem(&map, &key, &value, flags)
+ * Return: 0 on success or negative error
+ *
+ * int bpf_map_delete_elem(&map, &key)
+ * Return: 0 on success or negative error
+ *
+ * int bpf_probe_read(void *dst, int size, void *src)
+ * Return: 0 on success or negative error
+ *
+ * u64 bpf_ktime_get_ns(void)
+ * Return: current ktime
+ *
+ * int bpf_trace_printk(const char *fmt, int fmt_size, ...)
+ * Return: length of buffer written or negative error
+ *
+ * u32 bpf_prandom_u32(void)
+ * Return: random value
+ *
+ * u32 bpf_raw_smp_processor_id(void)
+ * Return: SMP processor ID
+ *
+ * int bpf_skb_store_bytes(skb, offset, from, len, flags)
+ * store bytes into packet
+ * @skb: pointer to skb
+ * @offset: offset within packet from skb->mac_header
+ * @from: pointer where to copy bytes from
+ * @len: number of bytes to store into packet
+ * @flags: bit 0 - if true, recompute skb->csum
+ * other bits - reserved
+ * Return: 0 on success or negative error
+ *
+ * int bpf_l3_csum_replace(skb, offset, from, to, flags)
+ * recompute IP checksum
+ * @skb: pointer to skb
+ * @offset: offset within packet where IP checksum is located
+ * @from: old value of header field
+ * @to: new value of header field
+ * @flags: bits 0-3 - size of header field
+ * other bits - reserved
+ * Return: 0 on success or negative error
+ *
+ * int bpf_l4_csum_replace(skb, offset, from, to, flags)
+ * recompute TCP/UDP checksum
+ * @skb: pointer to skb
+ * @offset: offset within packet where TCP/UDP checksum is located
+ * @from: old value of header field
+ * @to: new value of header field
+ * @flags: bits 0-3 - size of header field
+ * bit 4 - is pseudo header
+ * other bits - reserved
+ * Return: 0 on success or negative error
+ *
+ * int bpf_tail_call(ctx, prog_array_map, index)
+ * jump into another BPF program
+ * @ctx: context pointer passed to next program
+ * @prog_array_map: pointer to map which type is BPF_MAP_TYPE_PROG_ARRAY
+ * @index: index inside array that selects specific program to run
+ * Return: 0 on success or negative error
+ *
+ * int bpf_clone_redirect(skb, ifindex, flags)
+ * redirect to another netdev
+ * @skb: pointer to skb
+ * @ifindex: ifindex of the net device
+ * @flags: bit 0 - if set, redirect to ingress instead of egress
+ * other bits - reserved
+ * Return: 0 on success or negative error
+ *
+ * u64 bpf_get_current_pid_tgid(void)
+ * Return: current->tgid << 32 | current->pid
+ *
+ * u64 bpf_get_current_uid_gid(void)
+ * Return: current_gid << 32 | current_uid
+ *
+ * int bpf_get_current_comm(char *buf, int size_of_buf)
+ * stores current->comm into buf
+ * Return: 0 on success or negative error
+ *
+ * u32 bpf_get_cgroup_classid(skb)
+ * retrieve a proc's classid
+ * @skb: pointer to skb
+ * Return: classid if != 0
+ *
+ * int bpf_skb_vlan_push(skb, vlan_proto, vlan_tci)
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_vlan_pop(skb)
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_get_tunnel_key(skb, key, size, flags)
+ * int bpf_skb_set_tunnel_key(skb, key, size, flags)
+ * retrieve or populate tunnel metadata
+ * @skb: pointer to skb
+ * @key: pointer to 'struct bpf_tunnel_key'
+ * @size: size of 'struct bpf_tunnel_key'
+ * @flags: room for future extensions
+ * Return: 0 on success or negative error
+ *
+ * u64 bpf_perf_event_read(&map, index)
+ * Return: Number events read or error code
+ *
+ * int bpf_redirect(ifindex, flags)
+ * redirect to another netdev
+ * @ifindex: ifindex of the net device
+ * @flags: bit 0 - if set, redirect to ingress instead of egress
+ * other bits - reserved
+ * Return: TC_ACT_REDIRECT
+ *
+ * u32 bpf_get_route_realm(skb)
+ * retrieve a dst's tclassid
+ * @skb: pointer to skb
+ * Return: realm if != 0
+ *
+ * int bpf_perf_event_output(ctx, map, index, data, size)
+ * output perf raw sample
+ * @ctx: struct pt_regs*
+ * @map: pointer to perf_event_array map
+ * @index: index of event in the map
+ * @data: data on stack to be output as raw data
+ * @size: size of data
+ * Return: 0 on success or negative error
+ *
+ * int bpf_get_stackid(ctx, map, flags)
+ * walk user or kernel stack and return id
+ * @ctx: struct pt_regs*
+ * @map: pointer to stack_trace map
+ * @flags: bits 0-7 - numer of stack frames to skip
+ * bit 8 - collect user stack instead of kernel
+ * bit 9 - compare stacks by hash only
+ * bit 10 - if two different stacks hash into the same stackid
+ * discard old
+ * other bits - reserved
+ * Return: >= 0 stackid on success or negative error
+ *
+ * s64 bpf_csum_diff(from, from_size, to, to_size, seed)
+ * calculate csum diff
+ * @from: raw from buffer
+ * @from_size: length of from buffer
+ * @to: raw to buffer
+ * @to_size: length of to buffer
+ * @seed: optional seed
+ * Return: csum result or negative error code
+ *
+ * int bpf_skb_get_tunnel_opt(skb, opt, size)
+ * retrieve tunnel options metadata
+ * @skb: pointer to skb
+ * @opt: pointer to raw tunnel option data
+ * @size: size of @opt
+ * Return: option size
+ *
+ * int bpf_skb_set_tunnel_opt(skb, opt, size)
+ * populate tunnel options metadata
+ * @skb: pointer to skb
+ * @opt: pointer to raw tunnel option data
+ * @size: size of @opt
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_change_proto(skb, proto, flags)
+ * Change protocol of the skb. Currently supported is v4 -> v6,
+ * v6 -> v4 transitions. The helper will also resize the skb. eBPF
+ * program is expected to fill the new headers via skb_store_bytes
+ * and lX_csum_replace.
+ * @skb: pointer to skb
+ * @proto: new skb->protocol type
+ * @flags: reserved
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_change_type(skb, type)
+ * Change packet type of skb.
+ * @skb: pointer to skb
+ * @type: new skb->pkt_type type
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_under_cgroup(skb, map, index)
+ * Check cgroup2 membership of skb
+ * @skb: pointer to skb
+ * @map: pointer to bpf_map in BPF_MAP_TYPE_CGROUP_ARRAY type
+ * @index: index of the cgroup in the bpf_map
+ * Return:
+ * == 0 skb failed the cgroup2 descendant test
+ * == 1 skb succeeded the cgroup2 descendant test
+ * < 0 error
+ *
+ * u32 bpf_get_hash_recalc(skb)
+ * Retrieve and possibly recalculate skb->hash.
+ * @skb: pointer to skb
+ * Return: hash
+ *
+ * u64 bpf_get_current_task(void)
+ * Returns current task_struct
+ * Return: current
+ *
+ * int bpf_probe_write_user(void *dst, void *src, int len)
+ * safely attempt to write to a location
+ * @dst: destination address in userspace
+ * @src: source address on stack
+ * @len: number of bytes to copy
+ * Return: 0 on success or negative error
+ *
+ * int bpf_current_task_under_cgroup(map, index)
+ * Check cgroup2 membership of current task
+ * @map: pointer to bpf_map in BPF_MAP_TYPE_CGROUP_ARRAY type
+ * @index: index of the cgroup in the bpf_map
+ * Return:
+ * == 0 current failed the cgroup2 descendant test
+ * == 1 current succeeded the cgroup2 descendant test
+ * < 0 error
+ *
+ * int bpf_skb_change_tail(skb, len, flags)
+ * The helper will resize the skb to the given new size, to be used f.e.
+ * with control messages.
+ * @skb: pointer to skb
+ * @len: new skb length
+ * @flags: reserved
+ * Return: 0 on success or negative error
+ *
+ * int bpf_skb_pull_data(skb, len)
+ * The helper will pull in non-linear data in case the skb is non-linear
+ * and not all of len are part of the linear section. Only needed for
+ * read/write with direct packet access.
+ * @skb: pointer to skb
+ * @len: len to make read/writeable
+ * Return: 0 on success or negative error
+ *
+ * s64 bpf_csum_update(skb, csum)
+ * Adds csum into skb->csum in case of CHECKSUM_COMPLETE.
+ * @skb: pointer to skb
+ * @csum: csum to add
+ * Return: csum on success or negative error
+ *
+ * void bpf_set_hash_invalid(skb)
+ * Invalidate current skb->hash.
+ * @skb: pointer to skb
+ *
+ * int bpf_get_numa_node_id()
+ * Return: Id of current NUMA node.
+ *
+ * int bpf_skb_change_head()
+ * Grows headroom of skb and adjusts MAC header offset accordingly.
+ * Will extends/reallocae as required automatically.
+ * May change skb data pointer and will thus invalidate any check
+ * performed for direct packet access.
+ * @skb: pointer to skb
+ * @len: length of header to be pushed in front
+ * @flags: Flags (unused for now)
+ * Return: 0 on success or negative error
+ *
+ * int bpf_xdp_adjust_head(xdp_md, delta)
+ * Adjust the xdp_md.data by delta
+ * @xdp_md: pointer to xdp_md
+ * @delta: An positive/negative integer to be added to xdp_md.data
+ * Return: 0 on success or negative on error
+ */
+#define __BPF_FUNC_MAPPER(FN) \
+ FN(unspec), \
+ FN(map_lookup_elem), \
+ FN(map_update_elem), \
+ FN(map_delete_elem), \
+ FN(probe_read), \
+ FN(ktime_get_ns), \
+ FN(trace_printk), \
+ FN(get_prandom_u32), \
+ FN(get_smp_processor_id), \
+ FN(skb_store_bytes), \
+ FN(l3_csum_replace), \
+ FN(l4_csum_replace), \
+ FN(tail_call), \
+ FN(clone_redirect), \
+ FN(get_current_pid_tgid), \
+ FN(get_current_uid_gid), \
+ FN(get_current_comm), \
+ FN(get_cgroup_classid), \
+ FN(skb_vlan_push), \
+ FN(skb_vlan_pop), \
+ FN(skb_get_tunnel_key), \
+ FN(skb_set_tunnel_key), \
+ FN(perf_event_read), \
+ FN(redirect), \
+ FN(get_route_realm), \
+ FN(perf_event_output), \
+ FN(skb_load_bytes), \
+ FN(get_stackid), \
+ FN(csum_diff), \
+ FN(skb_get_tunnel_opt), \
+ FN(skb_set_tunnel_opt), \
+ FN(skb_change_proto), \
+ FN(skb_change_type), \
+ FN(skb_under_cgroup), \
+ FN(get_hash_recalc), \
+ FN(get_current_task), \
+ FN(probe_write_user), \
+ FN(current_task_under_cgroup), \
+ FN(skb_change_tail), \
+ FN(skb_pull_data), \
+ FN(csum_update), \
+ FN(set_hash_invalid), \
+ FN(get_numa_node_id), \
+ FN(skb_change_head), \
+ FN(xdp_adjust_head),
+
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
* function eBPF program intends to call
*/
+#define __BPF_ENUM_FN(x) BPF_FUNC_ ## x
enum bpf_func_id {
- BPF_FUNC_unspec,
- BPF_FUNC_map_lookup_elem, /* void *map_lookup_elem(&map, &key) */
- BPF_FUNC_map_update_elem, /* int map_update_elem(&map, &key, &value, flags) */
- BPF_FUNC_map_delete_elem, /* int map_delete_elem(&map, &key) */
- BPF_FUNC_probe_read, /* int bpf_probe_read(void *dst, int size, void *src) */
- BPF_FUNC_ktime_get_ns, /* u64 bpf_ktime_get_ns(void) */
- BPF_FUNC_trace_printk, /* int bpf_trace_printk(const char *fmt, int fmt_size, ...) */
- BPF_FUNC_get_prandom_u32, /* u32 prandom_u32(void) */
- BPF_FUNC_get_smp_processor_id, /* u32 raw_smp_processor_id(void) */
-
- /**
- * skb_store_bytes(skb, offset, from, len, flags) - store bytes into packet
- * @skb: pointer to skb
- * @offset: offset within packet from skb->mac_header
- * @from: pointer where to copy bytes from
- * @len: number of bytes to store into packet
- * @flags: bit 0 - if true, recompute skb->csum
- * other bits - reserved
- * Return: 0 on success
- */
- BPF_FUNC_skb_store_bytes,
-
- /**
- * l3_csum_replace(skb, offset, from, to, flags) - recompute IP checksum
- * @skb: pointer to skb
- * @offset: offset within packet where IP checksum is located
- * @from: old value of header field
- * @to: new value of header field
- * @flags: bits 0-3 - size of header field
- * other bits - reserved
- * Return: 0 on success
- */
- BPF_FUNC_l3_csum_replace,
-
- /**
- * l4_csum_replace(skb, offset, from, to, flags) - recompute TCP/UDP checksum
- * @skb: pointer to skb
- * @offset: offset within packet where TCP/UDP checksum is located
- * @from: old value of header field
- * @to: new value of header field
- * @flags: bits 0-3 - size of header field
- * bit 4 - is pseudo header
- * other bits - reserved
- * Return: 0 on success
- */
- BPF_FUNC_l4_csum_replace,
-
- /**
- * bpf_tail_call(ctx, prog_array_map, index) - jump into another BPF program
- * @ctx: context pointer passed to next program
- * @prog_array_map: pointer to map which type is BPF_MAP_TYPE_PROG_ARRAY
- * @index: index inside array that selects specific program to run
- * Return: 0 on success
- */
- BPF_FUNC_tail_call,
-
- /**
- * bpf_clone_redirect(skb, ifindex, flags) - redirect to another netdev
- * @skb: pointer to skb
- * @ifindex: ifindex of the net device
- * @flags: bit 0 - if set, redirect to ingress instead of egress
- * other bits - reserved
- * Return: 0 on success
- */
- BPF_FUNC_clone_redirect,
-
- /**
- * u64 bpf_get_current_pid_tgid(void)
- * Return: current->tgid << 32 | current->pid
- */
- BPF_FUNC_get_current_pid_tgid,
-
- /**
- * u64 bpf_get_current_uid_gid(void)
- * Return: current_gid << 32 | current_uid
- */
- BPF_FUNC_get_current_uid_gid,
-
- /**
- * bpf_get_current_comm(char *buf, int size_of_buf)
- * stores current->comm into buf
- * Return: 0 on success
- */
- BPF_FUNC_get_current_comm,
-
- /**
- * bpf_get_cgroup_classid(skb) - retrieve a proc's classid
- * @skb: pointer to skb
- * Return: classid if != 0
- */
- BPF_FUNC_get_cgroup_classid,
- BPF_FUNC_skb_vlan_push, /* bpf_skb_vlan_push(skb, vlan_proto, vlan_tci) */
- BPF_FUNC_skb_vlan_pop, /* bpf_skb_vlan_pop(skb) */
-
- /**
- * bpf_skb_[gs]et_tunnel_key(skb, key, size, flags)
- * retrieve or populate tunnel metadata
- * @skb: pointer to skb
- * @key: pointer to 'struct bpf_tunnel_key'
- * @size: size of 'struct bpf_tunnel_key'
- * @flags: room for future extensions
- * Retrun: 0 on success
- */
- BPF_FUNC_skb_get_tunnel_key,
- BPF_FUNC_skb_set_tunnel_key,
- BPF_FUNC_perf_event_read, /* u64 bpf_perf_event_read(&map, index) */
- /**
- * bpf_redirect(ifindex, flags) - redirect to another netdev
- * @ifindex: ifindex of the net device
- * @flags: bit 0 - if set, redirect to ingress instead of egress
- * other bits - reserved
- * Return: TC_ACT_REDIRECT
- */
- BPF_FUNC_redirect,
-
- /**
- * bpf_get_route_realm(skb) - retrieve a dst's tclassid
- * @skb: pointer to skb
- * Return: realm if != 0
- */
- BPF_FUNC_get_route_realm,
-
- /**
- * bpf_perf_event_output(ctx, map, index, data, size) - output perf raw sample
- * @ctx: struct pt_regs*
- * @map: pointer to perf_event_array map
- * @index: index of event in the map
- * @data: data on stack to be output as raw data
- * @size: size of data
- * Return: 0 on success
- */
- BPF_FUNC_perf_event_output,
- BPF_FUNC_skb_load_bytes,
-
- /**
- * bpf_get_stackid(ctx, map, flags) - walk user or kernel stack and return id
- * @ctx: struct pt_regs*
- * @map: pointer to stack_trace map
- * @flags: bits 0-7 - numer of stack frames to skip
- * bit 8 - collect user stack instead of kernel
- * bit 9 - compare stacks by hash only
- * bit 10 - if two different stacks hash into the same stackid
- * discard old
- * other bits - reserved
- * Return: >= 0 stackid on success or negative error
- */
- BPF_FUNC_get_stackid,
-
- /**
- * bpf_csum_diff(from, from_size, to, to_size, seed) - calculate csum diff
- * @from: raw from buffer
- * @from_size: length of from buffer
- * @to: raw to buffer
- * @to_size: length of to buffer
- * @seed: optional seed
- * Return: csum result
- */
- BPF_FUNC_csum_diff,
-
- /**
- * bpf_skb_[gs]et_tunnel_opt(skb, opt, size)
- * retrieve or populate tunnel options metadata
- * @skb: pointer to skb
- * @opt: pointer to raw tunnel option data
- * @size: size of @opt
- * Return: 0 on success for set, option size for get
- */
- BPF_FUNC_skb_get_tunnel_opt,
- BPF_FUNC_skb_set_tunnel_opt,
-
- /**
- * bpf_skb_change_proto(skb, proto, flags)
- * Change protocol of the skb. Currently supported is
- * v4 -> v6, v6 -> v4 transitions. The helper will also
- * resize the skb. eBPF program is expected to fill the
- * new headers via skb_store_bytes and lX_csum_replace.
- * @skb: pointer to skb
- * @proto: new skb->protocol type
- * @flags: reserved
- * Return: 0 on success or negative error
- */
- BPF_FUNC_skb_change_proto,
-
- /**
- * bpf_skb_change_type(skb, type)
- * Change packet type of skb.
- * @skb: pointer to skb
- * @type: new skb->pkt_type type
- * Return: 0 on success or negative error
- */
- BPF_FUNC_skb_change_type,
-
- /**
- * bpf_skb_under_cgroup(skb, map, index) - Check cgroup2 membership of skb
- * @skb: pointer to skb
- * @map: pointer to bpf_map in BPF_MAP_TYPE_CGROUP_ARRAY type
- * @index: index of the cgroup in the bpf_map
- * Return:
- * == 0 skb failed the cgroup2 descendant test
- * == 1 skb succeeded the cgroup2 descendant test
- * < 0 error
- */
- BPF_FUNC_skb_under_cgroup,
-
- /**
- * bpf_get_hash_recalc(skb)
- * Retrieve and possibly recalculate skb->hash.
- * @skb: pointer to skb
- * Return: hash
- */
- BPF_FUNC_get_hash_recalc,
-
- /**
- * u64 bpf_get_current_task(void)
- * Returns current task_struct
- * Return: current
- */
- BPF_FUNC_get_current_task,
-
- /**
- * bpf_probe_write_user(void *dst, void *src, int len)
- * safely attempt to write to a location
- * @dst: destination address in userspace
- * @src: source address on stack
- * @len: number of bytes to copy
- * Return: 0 on success or negative error
- */
- BPF_FUNC_probe_write_user,
-
+ __BPF_FUNC_MAPPER(__BPF_ENUM_FN)
__BPF_FUNC_MAX_ID,
};
+#undef __BPF_ENUM_FN

/* All flags used by eBPF helper functions, placed here. */

@@ -450,6 +560,31 @@ struct bpf_tunnel_key {
__u32 tunnel_label;
};

+/* Generic BPF return codes which all BPF program types may support.
+ * The values are binary compatible with their TC_ACT_* counter-part to
+ * provide backwards compatibility with existing SCHED_CLS and SCHED_ACT
+ * programs.
+ *
+ * XDP is handled seprately, see XDP_*.
+ */
+enum bpf_ret_code {
+ BPF_OK = 0,
+ /* 1 reserved */
+ BPF_DROP = 2,
+ /* 3-6 reserved */
+ BPF_REDIRECT = 7,
+ /* >127 are reserved for prog type specific return codes */
+};
+
+struct bpf_sock {
+ __u32 bound_dev_if;
+ __u32 family;
+ __u32 type;
+ __u32 protocol;
+};
+
+#define XDP_PACKET_HEADROOM 256
+
/* User return codes for XDP prog type.
* A valid XDP program must return one of these defined values. All other
* return codes are reserved for future use. Unknown return codes will result
--
2.9.3
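
The __BPF_FUNC_MAPPER(FN) macro added in the bpf.h hunk above follows the X-macro pattern: the helper list is written once and expanded with different FN arguments. The sketch below is only an illustration, assuming a shortened helper list; every DEMO_* name and demo_func_name() are hypothetical and not part of the kernel header. It expands one list into both the enum and a matching name table.

```c
/* Hypothetical, shortened helper list; the real header lists every helper. */
#define DEMO_FUNC_MAPPER(FN)	\
	FN(unspec),		\
	FN(map_lookup_elem),	\
	FN(map_update_elem),	\
	FN(map_delete_elem),

/* Expansion 1: enum constants, the same way the patch uses __BPF_ENUM_FN. */
#define DEMO_ENUM_FN(x) DEMO_FUNC_ ## x
enum demo_func_id {
	DEMO_FUNC_MAPPER(DEMO_ENUM_FN)
	DEMO_FUNC_MAX_ID,
};
#undef DEMO_ENUM_FN

/* Expansion 2: a string table indexed by the ids generated above. */
#define DEMO_NAME_FN(x) [DEMO_FUNC_ ## x] = #x
static const char *const demo_func_names[] = {
	DEMO_FUNC_MAPPER(DEMO_NAME_FN)
};
#undef DEMO_NAME_FN

/* Map an id back to its name; out-of-range ids yield "unknown". */
static const char *demo_func_name(unsigned int id)
{
	return id < DEMO_FUNC_MAX_ID ? demo_func_names[id] : "unknown";
}
```

Because both expansions come from the same list, adding a helper to the mapper keeps the enum and the name table in step without touching either definition.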

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:09 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

If a jump target lies outside the function's address range, perf does not
handle it correctly. In particular, when the target address is lower than
the function's start address, the target offset is negative, but because
it is stored as an unsigned value, the negative number wraps around to its
two's complement representation. See the example below: the target of the
'jmpq' instruction at 34cf8 is 34ac0, which is lower than the function's
start address (34cf0).

34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0

Objdump output:

0000000000034cf0 <__sigaction>:
__GI___sigaction():
34cf0: lea -0x20(%rdi),%eax
34cf3: cmp $0x1,%eax
34cf6: jbe 34d00 <__sigaction+0x10>
34cf8: jmpq 34ac0 <__GI___libc_sigaction>
34cfd: nopl (%rax)
34d00: mov 0x386161(%rip),%rax # 3bae68 <_DYNAMIC+0x2e8>
34d07: movl $0x16,%fs:(%rax)
34d0e: mov $0xffffffff,%eax
34d13: retq

perf annotate before applying patch:

__GI___sigaction /usr/lib64/libc-2.22.so
lea -0x20(%rdi),%eax
cmp $0x1,%eax
v jbe 10
v jmpq fffffffffffffdd0
nop
10: mov _DYNAMIC+0x2e8,%rax
movl $0x16,%fs:(%rax)
mov $0xffffffff,%eax
retq

perf annotate after applying patch:

__GI___sigaction /usr/lib64/libc-2.22.so
lea -0x20(%rdi),%eax
cmp $0x1,%eax
v jbe 10
^ jmpq 34ac0 <__GI___libc_sigaction>
nop
10: mov _DYNAMIC+0x2e8,%rax
movl $0x16,%fs:(%rax)
mov $0xffffffff,%eax
retq
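
The wraparound described above can be reproduced outside perf. The following is a minimal sketch, not perf's code: the helper names are hypothetical, and the function size of 0x24 assumes __sigaction ends right after the one-byte retq at 34d13. It contrasts the unsigned and signed views of the same offset, using the addresses from the objdump listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of a jump target relative to the function start. The result is
 * signed so that a target below the start stays negative. (Converting the
 * wrapped unsigned difference to int64_t is implementation-defined in C,
 * but yields the two's complement value on mainstream compilers.) */
static int64_t target_offset(uint64_t target_addr, uint64_t func_start)
{
	return (int64_t)(target_addr - func_start);
}

/* Unsigned comparison: a wrapped negative offset looks like a huge
 * forward jump. */
static bool is_forward_unsigned(uint64_t target_off, uint64_t cur_off)
{
	return target_off > cur_off;
}

/* Signed comparison: the same offset is correctly seen as backward. */
static bool is_forward_signed(int64_t target_off, int64_t cur_off)
{
	return target_off > cur_off;
}

/* A jump stays inside the function only if 0 <= offset < size. */
static bool offset_in_function(int64_t offset, uint64_t func_size)
{
	return offset >= 0 && offset < (int64_t)func_size;
}
```

With target 0x34ac0 and function start 0x34cf0, target_offset() gives -0x230. Reinterpreted as unsigned, that is 0xfffffffffffffdd0, which the unsigned comparison treats as a forward jump from offset 8 (the jmpq at 34cf8), whereas the signed comparison reports a backward jump.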

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Chris Riyder <chris...@arm.com>
Cc: Kim Phillips <kim.ph...@arm.com>
Cc: Markus Trippelsdorf <mar...@trippelsdorf.de>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Taeung Song <treeze...@gmail.com>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-3-git-s...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/annotate.c | 5 +++--
tools/perf/util/annotate.c | 14 +++++++++-----
tools/perf/util/annotate.h | 5 +++--
3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index ec7a30fad149..ba36aac340bc 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -215,7 +215,7 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
ui_browser__set_color(browser, color);
if (dl->ins.ops && dl->ins.ops->scnprintf) {
if (ins__is_jump(&dl->ins)) {
- bool fwd = dl->ops.target.offset > (u64)dl->offset;
+ bool fwd = dl->ops.target.offset > dl->offset;

ui_browser__write_graph(browser, fwd ? SLSMG_DARROW_CHAR :
SLSMG_UARROW_CHAR);
@@ -245,7 +245,8 @@ static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sy
{
if (!dl || !dl->ins.ops || !ins__is_jump(&dl->ins)
|| !disasm_line__has_offset(dl)
- || dl->ops.target.offset >= symbol__size(sym))
+ || dl->ops.target.offset < 0
+ || dl->ops.target.offset >= (s64)symbol__size(sym))
return false;

return true;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 590244e5781e..c81a3950a7fe 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -230,10 +230,12 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op
else
ops->target.addr = strtoull(ops->raw, NULL, 16);

- if (s++ != NULL)
+ if (s++ != NULL) {
ops->target.offset = strtoull(s, NULL, 16);
- else
- ops->target.offset = UINT64_MAX;
+ ops->target.offset_avail = true;
+ } else {
+ ops->target.offset_avail = false;
+ }

return 0;
}
@@ -241,7 +243,7 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op
static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops)
{
- if (!ops->target.addr)
+ if (!ops->target.addr || ops->target.offset < 0)
return ins__raw_scnprintf(ins, bf, size, ops);

return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target.offset);
@@ -1209,9 +1211,11 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
if (dl == NULL)
return -1;

- if (dl->ops.target.offset == UINT64_MAX)
+ if (!disasm_line__has_offset(dl)) {
dl->ops.target.offset = dl->ops.target.addr -
map__rip_2objdump(map, sym->start);
+ dl->ops.target.offset_avail = true;
+ }

/* kcore has no symbols, so add the call target name */
if (dl->ins.ops && ins__is_call(&dl->ins) && !dl->ops.target.name) {
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 87e4cadc5d27..09776b5af991 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -24,7 +24,8 @@ struct ins_operands {
char *raw;
char *name;
u64 addr;
- u64 offset;
+ s64 offset;
+ bool offset_avail;
} target;
union {
struct {
@@ -68,7 +69,7 @@ struct disasm_line {

static inline bool disasm_line__has_offset(const struct disasm_line *dl)
{
- return dl->ops.target.offset != UINT64_MAX;
+ return dl->ops.target.offset_avail;
}

void disasm_line__free(struct disasm_line *dl);
--
2.9.3
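
For readers following the diff, the move from a UINT64_MAX sentinel to
a signed offset plus an availability flag can be sketched in isolation.
This is only an illustration under assumptions: struct jump_target and
both functions are hypothetical stand-ins, not the perf definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 'target' member of struct ins_operands. */
struct jump_target {
	uint64_t addr;         /* absolute target address */
	int64_t  offset;       /* negative when the target precedes the symbol */
	bool     offset_avail; /* takes over the role of the UINT64_MAX sentinel */
};

/* Derive the offset from absolute addresses when none was parsed. */
void jump_target_resolve(struct jump_target *t, uint64_t sym_start)
{
	if (!t->offset_avail) {
		t->offset = (int64_t)(t->addr - sym_start);
		t->offset_avail = true;
	}
}

/* A jump stays inside the symbol only if 0 <= offset < size. */
bool jump_target_in_symbol(const struct jump_target *t, uint64_t sym_size)
{
	return t->offset_avail &&
	       t->offset >= 0 &&
	       t->offset < (int64_t)sym_size;
}
```

Keeping validity in a separate flag lets every value of the signed
offset, including negative ones, carry meaning of its own.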

Arnaldo Carvalho de Melo

Dec 20, 2016, 12:10:10 PM
From: Joe Stringer <j...@ovn.org>

This function was declared in libbpf.c and was the only remaining
function in this library, but has nothing to do with BPF. Shift it out
into a new header, sock_example.h, and include it from the relevant
samples.
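
As a rough sketch of what such a header-only helper can look like: the
function below is named example_open_raw_sock to mark it as an
illustration, and its body is an assumption modelled on the one-line
comment removed from libbpf.h ("create RAW socket and bind to interface
'name'"), not a copy of sock_example.h. The check for an unknown
interface is an addition made for this sketch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

/*
 * Illustrative only: create a raw packet socket and bind it to the
 * interface 'name'. Returns the socket fd, or -1 on any failure.
 * 'static inline' gives each including .c file its own copy, so the
 * header can be included from several samples without link clashes.
 */
static inline int example_open_raw_sock(const char *name)
{
	struct sockaddr_ll sll;
	unsigned int ifindex;
	int sock;

	sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (sock < 0) {
		fprintf(stderr, "cannot create raw socket: %s\n", strerror(errno));
		return -1;
	}

	/* Assumption of this sketch: refuse names that do not resolve. */
	ifindex = if_nametoindex(name);
	if (ifindex == 0) {
		fprintf(stderr, "unknown interface %s\n", name);
		close(sock);
		return -1;
	}

	memset(&sll, 0, sizeof(sll));
	sll.sll_family = AF_PACKET;
	sll.sll_ifindex = ifindex;
	sll.sll_protocol = htons(ETH_P_ALL);
	if (bind(sock, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
		fprintf(stderr, "bind to %s: %s\n", name, strerror(errno));
		close(sock);
		return -1;
	}

	return sock;
}
```

Creating a packet socket normally needs CAP_NET_RAW, so an unprivileged
caller should expect -1 from the first step.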

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/2016120902462...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
samples/bpf/Makefile | 2 +-
samples/bpf/fds_example.c | 1 +
samples/bpf/libbpf.h | 3 ---
samples/bpf/sock_example.c | 1 +
samples/bpf/{libbpf.c => sock_example.h} | 3 +--
samples/bpf/sockex1_user.c | 1 +
samples/bpf/sockex2_user.c | 1 +
samples/bpf/sockex3_user.c | 1 +
8 files changed, 7 insertions(+), 6 deletions(-)
rename samples/bpf/{libbpf.c => sock_example.h} (92%)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 5a73f5a7ace1..f01b66f277b0 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -36,7 +36,7 @@ hostprogs-y += lwt_len_hist
hostprogs-y += xdp_tx_iptunnel

# Libbpf dependencies
-LIBBPF := libbpf.o ../../tools/lib/bpf/bpf.o
+LIBBPF := ../../tools/lib/bpf/bpf.o

test_lru_dist-objs := test_lru_dist.o $(LIBBPF)
sock_example-objs := sock_example.o $(LIBBPF)
diff --git a/samples/bpf/fds_example.c b/samples/bpf/fds_example.c
index a5cddc99cccd..e29bd52ff9e8 100644
--- a/samples/bpf/fds_example.c
+++ b/samples/bpf/fds_example.c
@@ -14,6 +14,7 @@

#include "bpf_load.h"
#include "libbpf.h"
+#include "sock_example.h"

#define BPF_F_PIN (1 << 0)
#define BPF_F_GET (1 << 1)
diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
index 09aedc320009..3705fba453a0 100644
--- a/samples/bpf/libbpf.h
+++ b/samples/bpf/libbpf.h
@@ -185,7 +185,4 @@ struct bpf_insn;
.off = 0, \
.imm = 0 })

-/* create RAW socket and bind to interface 'name' */
-int open_raw_sock(const char *name);
-
#endif
diff --git a/samples/bpf/sock_example.c b/samples/bpf/sock_example.c
index 5546f8aac37e..6fc6e193ef1b 100644
--- a/samples/bpf/sock_example.c
+++ b/samples/bpf/sock_example.c
@@ -27,6 +27,7 @@
#include <linux/ip.h>
#include <stddef.h>
#include "libbpf.h"
+#include "sock_example.h"

char bpf_log_buf[BPF_LOG_BUF_SIZE];

diff --git a/samples/bpf/libbpf.c b/samples/bpf/sock_example.h
similarity index 92%
rename from samples/bpf/libbpf.c
rename to samples/bpf/sock_example.h
index bee473a494f1..09f7fe7e5fd7 100644
--- a/samples/bpf/libbpf.c
+++ b/samples/bpf/sock_example.h
@@ -1,4 +1,3 @@
-/* eBPF mini library */
#include <stdlib.h>
#include <stdio.h>
#include <linux/unistd.h>
@@ -11,7 +10,7 @@
#include <arpa/inet.h>
#include "libbpf.h"

-int open_raw_sock(const char *name)
+static inline int open_raw_sock(const char *name)
{
struct sockaddr_ll sll;
int sock;
diff --git a/samples/bpf/sockex1_user.c b/samples/bpf/sockex1_user.c
index 9454448bf198..6cd2feb3e9b3 100644
--- a/samples/bpf/sockex1_user.c
+++ b/samples/bpf/sockex1_user.c
@@ -3,6 +3,7 @@
#include <linux/bpf.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "sock_example.h"
#include <unistd.h>
#include <arpa/inet.h>

diff --git a/samples/bpf/sockex2_user.c b/samples/bpf/sockex2_user.c
index 6a40600d5a83..0e0207c90841 100644
--- a/samples/bpf/sockex2_user.c
+++ b/samples/bpf/sockex2_user.c
@@ -3,6 +3,7 @@
#include <linux/bpf.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "sock_example.h"
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/resource.h>
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index 9099c4255f23..b5524d417eb5 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -3,6 +3,7 @@
#include <linux/bpf.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include "sock_example.h"
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/resource.h>
--
2.9.3