Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[GIT PULL 00/13] perf/core improvements and fixes

192 views
Skip to first unread message

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:05 PM1/11/17
to
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

* The following description will move to the end in the next pull requests *

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.

Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

The following changes since commit ad5013d5699d30ded0cdbbc68b93b2aa28222c6e:

perf/x86/intel: Use ULL constant to prevent undefined shift behaviour (2017-01-11 16:43:30 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170111

for you to fetch changes up to 675f52b23743f396c585fc9d135435be37f320d8:

tools: Sync x86's vmx.h with the kernel (2017-01-11 16:48:02 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add more triggers to switch the output file (perf.data.TIMESTAMP).

Now, in addition to switching to a different output file when
receiving a SIGUSR2, one can also specify file size and time based
triggers:

perf record -a --switch-output=signal

is equivalent to what we had before:

perf record -a --switch-output

While we can also ask for the file to be "sliced" by size, taking
into account that that will happen only when we get woken up by
the kernel, i.e. one has to take into account the --mmap-pages (the
size of the perf mmap ring buffer):

perf record -a --switch-output=2G

will break the perf.data output into multiple files limited to 2GB
of samples, right when generating the output.

For time based samples, alert() will be used, so to have 1 minute
limited perf.data output files:

perf record -a --switch-output=1m

(Jiri Olsa)

- Remove the need to use -e only for syscalls and --event only for
tracepoints/HW/SW/etc events, i.e. now one can use:

perf trace -e nanosleep,futex,sched:sched_switch ./workload

or:

perf trace --event nanosleep,futex,sched:sched_switch ./workload

And have it tracing raw_syscalls:sys_{enter,exit} for the nanosleep
and futex syscalls, formatting those as strace does while also
tracing sched:sched_switch, ordering it all into one strace like
output.

Using '!' as the first character in the -e/--event argument remains
a way to negate the list of syscalls, i.e. all syscalls except for
the ones specified, doesn't affect the other kinds of events.

E.g:

[root@jouet ~] # perf trace -e sched:sched_switch,nanosleep usleep 1
0.000 ( 0.028 ms): usleep/28150 nanosleep(rqtp: 0x7ffe4201b9f0) ...
0.028 ( ): sched:sched_switch:usleep:28150 [120] S ==> swapper/0:0 [120])
0.000 ( 0.065 ms): usleep/28150 ... [continued]: nanosleep()) = 0
[root@jouet ~]#

(Arnaldo Carvalho de Melo)

- 'perf kallsyms' toy tool to look for extended symbol information on
the running kernel and demonstrate the machine/thread/symbol APIs for
use in other tools, such as 'perf probe' (Arnaldo Carvalho de Melo)

Infrastructure:

- Add missing linux/kernel.h include to subcmd.h (Arnaldo Carvalho de Melo)
tools: Sync x86's vmx.h with the kernel

- Create libdir directory before installing libperf-jvmti.so (Laura Abbott)

- Fix typo in perf_evlist__start_workload() (Soramichi Akiyama)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (5):
tools lib subcmd: Add missing linux/kernel.h include to subcmd.h
perf machine: Add a kallsyms loading constructor
perf kallsyms: Introduce tool to look for extended symbol information on the running kernel
perf trace: Allow specifying list of syscalls and events in -e/--expr/--event
tools: Sync x86's vmx.h with the kernel

Jiri Olsa (6):
perf tools: Add unit_number__scnprintf function
perf record: Add struct switch_output
perf record: Change switch-output option to take optional argument
perf record: Add switch-output size option argument
perf record: Add switch-output size warning
perf record: Add switch-output time option argument

Laura Abbott (1):
perf jvmti: Create libdir directory before installing libperf-jvmti.so

Soramichi Akiyama (1):
perf evlist: Fix typo in perf_evlist__start_workload()

tools/arch/x86/include/uapi/asm/vmx.h | 5 +
tools/lib/subcmd/parse-options.h | 1 +
tools/perf/Build | 1 +
tools/perf/Documentation/perf-kallsyms.txt | 24 +++++
tools/perf/Documentation/perf-record.txt | 14 ++-
tools/perf/Documentation/perf-trace.txt | 8 +-
tools/perf/Makefile.perf | 1 +
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-kallsyms.c | 67 +++++++++++++
tools/perf/builtin-record.c | 154 ++++++++++++++++++++++++++---
tools/perf/builtin-trace.c | 120 ++++++++++++++++------
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/unit_number__scnprintf.c | 37 +++++++
tools/perf/util/evlist.c | 12 ++-
tools/perf/util/evlist.h | 2 +
tools/perf/util/machine.c | 19 ++++
tools/perf/util/machine.h | 1 +
tools/perf/util/util.c | 13 +++
tools/perf/util/util.h | 1 +
24 files changed, 439 insertions(+), 52 deletions(-)
create mode 100644 tools/perf/Documentation/perf-kallsyms.txt
create mode 100644 tools/perf/builtin-kallsyms.c
create mode 100644 tools/perf/tests/unit_number__scnprintf.c

# uname -a
Linux jouet 4.9.0+ #2 SMP Wed Dec 21 11:54:44 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF prologue generation : Ok
37.3: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip
#

[root@jouet ~]# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips64: Ok
12 debian:experimental-x-mipsel: Ok
13 fedora:20: Ok
14 fedora:21: Ok
15 fedora:22: Ok
16 fedora:23: Ok
17 fedora:24: Ok
18 fedora:24-x-ARC-uClibc: Ok
19 fedora:25: Ok
20 fedora:rawhide: Ok
21 mageia:5: Ok
22 opensuse:13.2: Ok
23 opensuse:42.1: Ok
24 opensuse:tumbleweed: Ok
25 ubuntu:12.04.5: Ok
26 ubuntu:14.04.4-x-linaro-arm64: Ok
27 ubuntu:15.10: Ok
28 ubuntu:16.04: Ok
29 ubuntu:16.04-x-arm: Ok
30 ubuntu:16.04-x-arm64: Ok
31 ubuntu:16.04-x-powerpc: Ok
32 ubuntu:16.04-x-powerpc64: Ok
33 ubuntu:16.04-x-powerpc64el: Ok
34 ubuntu:16.04-x-s390: Ok
35 ubuntu:16.10: Ok
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libaudit_O: make NO_LIBAUDIT=1
make_static_O: make LDFLAGS=-static
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_debug_O: make DEBUG=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_help_O: make help
make_doc_O: make doc
make_no_newt_O: make NO_NEWT=1
make_no_demangle_O: make NO_DEMANGLE=1
make_install_bin_O: make install-bin
make_no_libbpf_O: make NO_LIBBPF=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_perf_o_O: make perf.o
make_pure_O: make
make_clean_all_O: make clean all
make_no_libbionic_O: make NO_LIBBIONIC=1
make_install_O: make install
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_util_map_o_O: make util/map.o
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_slang_O: make NO_SLANG=1
make_tags_O: make tags
make_no_libelf_O: make NO_LIBELF=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:05 PM1/11/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Its similar to doing grep on a /proc/kallsyms, but it also shows extra
information like the path to the kernel module and the unrelocated
addresses in it, to help in diagnosing problems.

It is also helps demonstrate the use of the symbols routines so that
tool writers can use them more effectively.

Using it:

$ perf kallsyms e1000_xmit_frame netif_rx usb_stor_set_xfer_buf
e1000_xmit_frame: [e1000e] /lib/modules/4.9.0+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 0xffffffffc046fc10-0xffffffffc0470bb0 (0x19c80-0x1ac20)
netif_rx: [kernel] [kernel.kallsyms] 0xffffffff916f03a0-0xffffffff916f0410 (0xffffffff916f03a0-0xffffffff916f0410)
usb_stor_set_xfer_buf: [usb_storage] /lib/modules/4.9.0+/kernel/drivers/usb/storage/usb-storage.ko 0xffffffffc057aea0-0xffffffffc057af19 (0xf10-0xf89)
$

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-79bk9pakuj...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/Documentation/perf-kallsyms.txt | 24 +++++++++++
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-kallsyms.c | 67 ++++++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
7 files changed, 96 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/Documentation/perf-kallsyms.txt
create mode 100644 tools/perf/builtin-kallsyms.c

diff --git a/tools/perf/Build b/tools/perf/Build
index b12d5d1666e3..0b48806f93c2 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -7,6 +7,7 @@ perf-y += builtin-help.o
perf-y += builtin-sched.o
perf-y += builtin-buildid-list.o
perf-y += builtin-buildid-cache.o
+perf-y += builtin-kallsyms.o
perf-y += builtin-list.o
perf-y += builtin-record.o
perf-y += builtin-report.o
diff --git a/tools/perf/Documentation/perf-kallsyms.txt b/tools/perf/Documentation/perf-kallsyms.txt
new file mode 100644
index 000000000000..954ea9e21236
--- /dev/null
+++ b/tools/perf/Documentation/perf-kallsyms.txt
@@ -0,0 +1,24 @@
+perf-kallsyms(1)
+==============
+
+NAME
+----
+perf-kallsyms - Searches running kernel for symbols
+
+SYNOPSIS
+--------
+[verse]
+'perf kallsyms <options> symbol_name[,symbol_name...]'
+
+DESCRIPTION
+-----------
+This command searches the running kernel kallsyms file for the given symbol(s)
+and prints information about it, including the DSO, the kallsyms begin/end
+addresses and the addresses in the ELF kallsyms symbol table (for symbols in
+modules).
+
+OPTIONS
+-------
+-v::
+--verbose=::
+ Increase verbosity level, showing details about symbol table loading, etc.
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 3bdb2c78a21b..93da24a638be 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -434,7 +434,7 @@ int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
const char * const builtin_help_subcommands[] = {
"buildid-cache", "buildid-list", "diff", "evlist", "help", "list",
"record", "report", "bench", "stat", "timechart", "top", "annotate",
- "script", "sched", "kmem", "lock", "kvm", "test", "inject", "mem", "data",
+ "script", "sched", "kallsyms", "kmem", "lock", "kvm", "test", "inject", "mem", "data",
#ifdef HAVE_LIBELF_SUPPORT
"probe",
#endif
diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
new file mode 100644
index 000000000000..224bfc454b4a
--- /dev/null
+++ b/tools/perf/builtin-kallsyms.c
@@ -0,0 +1,67 @@
+/*
+ * builtin-kallsyms.c
+ *
+ * Builtin command: Look for a symbol in the running kernel and its modules
+ *
+ * Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <ac...@redhat.com>
+ *
+ * Released under the GPL v2. (and only v2, not any later version)
+ */
+#include "builtin.h"
+#include <linux/compiler.h>
+#include <subcmd/parse-options.h>
+#include "debug.h"
+#include "machine.h"
+#include "symbol.h"
+
+static int __cmd_kallsyms(int argc, const char **argv)
+{
+ int i;
+ struct machine *machine = machine__new_kallsyms();
+
+ if (machine == NULL) {
+ pr_err("Couldn't read /proc/kallsyms\n");
+ return -1;
+ }
+
+ for (i = 0; i < argc; ++i) {
+ struct map *map;
+ struct symbol *symbol = machine__find_kernel_function_by_name(machine, argv[i], &map);
+
+ if (symbol == NULL) {
+ printf("%s: not found\n", argv[i]);
+ continue;
+ }
+
+ printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ symbol->name, map->dso->short_name, map->dso->long_name,
+ map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+ symbol->start, symbol->end);
+ }
+
+ machine__delete(machine);
+ return 0;
+}
+
+int cmd_kallsyms(int argc, const char **argv, const char *prefix __maybe_unused)
+{
+ const struct option options[] = {
+ OPT_INCR('v', "verbose", &verbose, "be more verbose (show counter open errors, etc)"),
+ OPT_END()
+ };
+ const char * const kallsyms_usage[] = {
+ "perf kallsyms [<options>] symbol_name",
+ NULL
+ };
+
+ argc = parse_options(argc, argv, options, kallsyms_usage, 0);
+ if (argc < 1)
+ usage_with_options(kallsyms_usage, options);
+
+ symbol_conf.sort_by_name = true;
+ symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
+ if (symbol__init(NULL) < 0)
+ return -1;
+
+ return __cmd_kallsyms(argc, argv);
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 0bcf68e98ccc..b55f5be486a1 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -23,6 +23,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix);
int cmd_evlist(int argc, const char **argv, const char *prefix);
int cmd_help(int argc, const char **argv, const char *prefix);
int cmd_sched(int argc, const char **argv, const char *prefix);
+int cmd_kallsyms(int argc, const char **argv, const char *prefix);
int cmd_list(int argc, const char **argv, const char *prefix);
int cmd_record(int argc, const char **argv, const char *prefix);
int cmd_report(int argc, const char **argv, const char *prefix);
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index ab5cbaa170d0..fb45613dba9e 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -12,6 +12,7 @@ perf-diff mainporcelain common
perf-config mainporcelain common
perf-evlist mainporcelain common
perf-inject mainporcelain common
+perf-kallsyms mainporcelain common
perf-kmem mainporcelain common
perf-kvm mainporcelain common
perf-list mainporcelain common
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index aa23b3347d6b..13c8a7db055e 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -47,6 +47,7 @@ static struct cmd_struct commands[] = {
{ "diff", cmd_diff, 0 },
{ "evlist", cmd_evlist, 0 },
{ "help", cmd_help, 0 },
+ { "kallsyms", cmd_kallsyms, 0 },
{ "list", cmd_list, 0 },
{ "record", cmd_record, 0 },
{ "report", cmd_report, 0 },
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:06 PM1/11/17
to
From: Jiri Olsa <jo...@kernel.org>

Next patches will add --switch-output option arguments, changing the
option to allow that and adding its default value to 'signal'.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Acked-by: Wang Nan <wang...@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-4-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-record.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f7e805b30527..2bf811acaf8d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -48,6 +48,8 @@

struct switch_output {
bool signal;
+ const char *str;
+ bool set;
};

struct record {
@@ -1356,6 +1358,22 @@ static int record__parse_mmap_pages(const struct option *opt,
return ret;
}

+static int switch_output_setup(struct record *rec)
+{
+ struct switch_output *s = &rec->switch_output;
+
+ if (!s->set)
+ return 0;
+
+ if (!strcmp(s->str, "signal")) {
+ s->signal = true;
+ pr_debug("switch-output with SIGUSR2 signal\n");
+ return 0;
+ }
+
+ return -1;
+}
+
static const char * const __record_usage[] = {
"perf record [<options>] [<command>]",
"perf record [<options>] -- <command> [<options>]",
@@ -1523,8 +1541,9 @@ static struct option __record_options[] = {
"Record build-id of all DSOs regardless of hits"),
OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
"append timestamp to output filename"),
- OPT_BOOLEAN(0, "switch-output", &record.switch_output.signal,
- "Switch output when receive SIGUSR2"),
+ OPT_STRING_OPTARG_SET(0, "switch-output", &record.switch_output.str,
+ &record.switch_output.set, "signal",
+ "Switch output when receive SIGUSR2", "signal"),
OPT_BOOLEAN(0, "dry-run", &dry_run,
"Parse options then exit"),
OPT_END()
@@ -1582,6 +1601,11 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
return -EINVAL;
}

+ if (switch_output_setup(rec)) {
+ parse_options_usage(record_usage, record_options, "switch-output", 0);
+ return -EINVAL;
+ }
+
if (rec->switch_output.signal)
rec->timestamp_filename = true;

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:06 PM1/11/17
to
From: Jiri Olsa <jo...@kernel.org>

Next patches will add more --switch-output option arguments,
so preparing the data holder.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Acked-by: Wang Nan <wang...@huawei.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-3-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-record.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4ec10e9427d9..f7e805b30527 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -46,6 +46,10 @@
#include <asm/bug.h>
#include <linux/time64.h>

+struct switch_output {
+ bool signal;
+};
+
struct record {
struct perf_tool tool;
struct record_opts opts;
@@ -62,7 +66,7 @@ struct record {
bool no_buildid_cache_set;
bool buildid_all;
bool timestamp_filename;
- bool switch_output;
+ struct switch_output switch_output;
unsigned long long samples;
};

@@ -842,11 +846,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGTERM, sig_handler);
signal(SIGSEGV, sigsegv_handler);

- if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
+ if (rec->opts.auxtrace_snapshot_mode || rec->switch_output.signal) {
signal(SIGUSR2, snapshot_sig_handler);
if (rec->opts.auxtrace_snapshot_mode)
trigger_on(&auxtrace_snapshot_trigger);
- if (rec->switch_output)
+ if (rec->switch_output.signal)
trigger_on(&switch_output_trigger);
} else {
signal(SIGUSR2, SIG_IGN);
@@ -1519,7 +1523,7 @@ static struct option __record_options[] = {
"Record build-id of all DSOs regardless of hits"),
OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
"append timestamp to output filename"),
- OPT_BOOLEAN(0, "switch-output", &record.switch_output,
+ OPT_BOOLEAN(0, "switch-output", &record.switch_output.signal,
"Switch output when receive SIGUSR2"),
OPT_BOOLEAN(0, "dry-run", &dry_run,
"Parse options then exit"),
@@ -1578,7 +1582,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
return -EINVAL;
}

- if (rec->switch_output)
+ if (rec->switch_output.signal)
rec->timestamp_filename = true;

if (!rec->itr) {
@@ -1629,7 +1633,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)

if (rec->no_buildid_cache || rec->no_buildid) {
disable_buildid_cache();
- } else if (rec->switch_output) {
+ } else if (rec->switch_output.signal) {
/*
* In 'perf record --switch-output', disable buildid
* generation by default to reduce data file switching
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:06 PM1/11/17
to
From: Jiri Olsa <jo...@kernel.org>

Adding switch-output size warning if the requested
size of lower than the wakeup ring buffer size.

$ perf record --switch-output=1K ls
WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes
...

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-6-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-record.c | 21 +++++++++++++++++++++
tools/perf/util/evlist.c | 2 +-
tools/perf/util/evlist.h | 2 ++
3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3fa64492ee62..93319e1be3ac 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1377,6 +1377,23 @@ static int record__parse_mmap_pages(const struct option *opt,
return ret;
}

+static void switch_output_size_warn(struct record *rec)
+{
+ u64 wakeup_size = perf_evlist__mmap_size(rec->opts.mmap_pages);
+ struct switch_output *s = &rec->switch_output;
+
+ wakeup_size /= 2;
+
+ if (s->size < wakeup_size) {
+ char buf[100];
+
+ unit_number__scnprintf(buf, sizeof(buf), wakeup_size);
+ pr_warning("WARNING: switch-output data size lower than "
+ "wakeup kernel buffer size (%s) "
+ "expect bigger perf.data sizes\n", buf);
+ }
+}
+
static int switch_output_setup(struct record *rec)
{
struct switch_output *s = &rec->switch_output;
@@ -1410,6 +1427,10 @@ static int switch_output_setup(struct record *rec)
enabled:
rec->timestamp_filename = true;
s->enabled = true;
+
+ if (s->size && !rec->opts.no_buffering)
+ switch_output_size_warn(rec);
+
return 0;
}

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc4df3d2660e..b601f2814a30 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1184,7 +1184,7 @@ unsigned long perf_event_mlock_kb_in_pages(void)
return pages;
}

-static size_t perf_evlist__mmap_size(unsigned long pages)
+size_t perf_evlist__mmap_size(unsigned long pages)
{
if (pages == UINT_MAX)
pages = perf_event_mlock_kb_in_pages();
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 4fd034f22d2f..389b9ccdf8c7 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -218,6 +218,8 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
bool overwrite);
void perf_evlist__munmap(struct perf_evlist *evlist);

+size_t perf_evlist__mmap_size(unsigned long pages);
+
void perf_evlist__disable(struct perf_evlist *evlist);
void perf_evlist__enable(struct perf_evlist *evlist);
void perf_evlist__toggle_enable(struct perf_evlist *evlist);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:06 PM1/11/17
to
From: Jiri Olsa <jo...@kernel.org>

It's now possible to specify the threshold time for perf.data like:

$ perf record --switch-output=30s ...

Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.

$ perf record --switch-output=30s ...
[ perf record: dump data: Woken up 44 times ]
[ perf record: Dump perf.data.2017010213043746 ]
...

The time is expected to be a number with appended unit
character - s/m/h/d.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Wang Nan <wang...@huawei.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-7-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-record.txt | 2 ++
tools/perf/builtin-record.c | 44 ++++++++++++++++++++++++++++++--
2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 3d55d2fd48b3..27256bc68eda 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -427,6 +427,8 @@ based on 'mode' value:
"signal" - when receiving a SIGUSR2 (default value) or
<size> - when reaching the size threshold, size is expected to
be a number with appended unit character - B/K/M/G
+ <time> - when reaching the time threshold, size is expected to
+ be a number with appended unit character - s/m/h/d

Note: the precision of the size threshold hugely depends
on your configuration - the number and size of your ring
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 93319e1be3ac..33a9eaaf9db4 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,6 +50,7 @@ struct switch_output {
bool enabled;
bool signal;
unsigned long size;
+ unsigned long time;
const char *str;
bool set;
};
@@ -91,6 +92,12 @@ static bool switch_output_size(struct record *rec)
(rec->bytes_written >= rec->switch_output.size);
}

+static bool switch_output_time(struct record *rec)
+{
+ return rec->switch_output.time &&
+ trigger_is_ready(&switch_output_trigger);
+}
+
static int record__write(struct record *rec, void *bf, size_t size)
{
if (perf_data_file__write(rec->session->file, bf, size) < 0) {
@@ -737,6 +744,7 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
}

static void snapshot_sig_handler(int sig);
+static void alarm_sig_handler(int sig);

int __weak
perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused,
@@ -1068,6 +1076,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
err = fd;
goto out_child;
}
+
+ /* re-arm the alarm */
+ if (rec->switch_output.time)
+ alarm(rec->switch_output.time);
}

if (hits == rec->samples) {
@@ -1404,6 +1416,13 @@ static int switch_output_setup(struct record *rec)
{ .tag = 'G', .mult = 1 << 30 },
{ .tag = 0 },
};
+ static struct parse_tag tags_time[] = {
+ { .tag = 's', .mult = 1 },
+ { .tag = 'm', .mult = 60 },
+ { .tag = 'h', .mult = 60*60 },
+ { .tag = 'd', .mult = 60*60*24 },
+ { .tag = 0 },
+ };
unsigned long val;

if (!s->set)
@@ -1422,6 +1441,14 @@ static int switch_output_setup(struct record *rec)
goto enabled;
}

+ val = parse_tag_value(s->str, tags_time);
+ if (val != (unsigned long) -1) {
+ s->time = val;
+ pr_debug("switch-output with %s time threshold (%lu seconds)\n",
+ s->str, s->time);
+ goto enabled;
+ }
+
return -1;

enabled:
@@ -1602,8 +1629,8 @@ static struct option __record_options[] = {
OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
"append timestamp to output filename"),
OPT_STRING_OPTARG_SET(0, "switch-output", &record.switch_output.str,
- &record.switch_output.set, "signal,size",
- "Switch output when receive SIGUSR2 or cross size threshold",
+ &record.switch_output.set, "signal,size,time",
+ "Switch output when receive SIGUSR2 or cross size,time threshold",
"signal"),
OPT_BOOLEAN(0, "dry-run", &dry_run,
"Parse options then exit"),
@@ -1667,6 +1694,11 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
return -EINVAL;
}

+ if (rec->switch_output.time) {
+ signal(SIGALRM, alarm_sig_handler);
+ alarm(rec->switch_output.time);
+ }
+
if (!rec->itr) {
rec->itr = auxtrace_record__init(rec->evlist, &err);
if (err)
@@ -1819,3 +1851,11 @@ static void snapshot_sig_handler(int sig __maybe_unused)
if (switch_output_signal(rec))
trigger_hit(&switch_output_trigger);
}
+
+static void alarm_sig_handler(int sig __maybe_unused)
+{
+ struct record *rec = &record;
+
+ if (switch_output_time(rec))
+ trigger_hit(&switch_output_trigger);
+}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:09 PM1/11/17
to
From: Laura Abbott <lab...@redhat.com>

The install command for libperf-jvmti.so does not check if libdir exists
before installing. This means that when the install command is run:

install libperf-jvmti.so '/tmp/test_root/usr/lib64';

libperf-jvmti.so will get installed to /usr/lib64 as a file and break
further installation. Fix this by ensuring the directory gets created
first.

See https://bugzilla.redhat.com/show_bug.cgi?id=1410296

Signed-off-by: Laura Abbott <lab...@redhat.com>
Acked-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Fixes: d4dfdf00d43e ("perf jvmti: Plug compilation into perf build")
Link: http://lkml.kernel.org/r/1483741088-13543-1-g...@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.perf | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 8bb16aa9d661..4da19b6ba94a 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -661,6 +661,7 @@ ifndef NO_PERF_READ_VDSOX32
endif
ifndef NO_JVMTI
$(call QUIET_INSTALL, $(LIBJVMTI)) \
+ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(libdir_SQ)'; \
$(INSTALL) $(OUTPUT)$(LIBJVMTI) '$(DESTDIR_SQ)$(libdir_SQ)';
endif
$(call QUIET_INSTALL, libexec) \
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 11, 2017, 3:30:12 PM1/11/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

To reduce the boilerplate for searching for functions in the running
kernel and modules.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-93iqzayafp...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/machine.c | 19 +++++++++++++++++++
tools/perf/util/machine.h | 1 +
2 files changed, 20 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9b33bef54581..747a034d1ff3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -87,6 +87,25 @@ struct machine *machine__new_host(void)
return NULL;
}

+struct machine *machine__new_kallsyms(void)
+{
+ struct machine *machine = machine__new_host();
+ /*
+ * FIXME:
+ * 1) MAP__FUNCTION will go away when we stop loading separate maps for
+ * functions and data objects.
+ * 2) We should switch to machine__load_kallsyms(), i.e. not explicitely
+ * ask for not using the kcore parsing code, once this one is fixed
+ * to create a map per module.
+ */
+ if (machine && __machine__load_kallsyms(machine, "/proc/kallsyms", MAP__FUNCTION, true) <= 0) {
+ machine__delete(machine);
+ machine = NULL;
+ }
+
+ return machine;
+}
+
static void dsos__purge(struct dsos *dsos)
{
struct dso *pos, *n;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 354de6e56109..a28305029711 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -129,6 +129,7 @@ char *machine__mmap_name(struct machine *machine, char *bf, size_t size);
void machines__set_comm_exec(struct machines *machines, bool comm_exec);

struct machine *machine__new_host(void);
+struct machine *machine__new_kallsyms(void);
int machine__init(struct machine *machine, const char *root_dir, pid_t pid);
void machine__exit(struct machine *machine);
void machine__delete_threads(struct machine *machine);
--
2.9.3

Ingo Molnar

unread,
Jan 12, 2017, 3:30:04 AM1/12/17
to
Pulled, thanks a lot Arnaldo!

Ingo

tip-bot for Arnaldo Carvalho de Melo

unread,
Jan 12, 2017, 3:30:04 AM1/12/17
to
Commit-ID: 7d132caaf9392853ad637c8e6e53333cbeb99aa5
Gitweb: http://git.kernel.org/tip/7d132caaf9392853ad637c8e6e53333cbeb99aa5
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:31:48 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 11 Jan 2017 16:48:00 -0300

perf machine: Add a kallsyms loading constructor

To reduce the boilerplate for searching for functions in the running
kernel and modules.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-93iqzayafp...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/machine.c | 19 +++++++++++++++++++
tools/perf/util/machine.h | 1 +
2 files changed, 20 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9b33bef..747a034 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -87,6 +87,25 @@ out_delete:
return NULL;
}

+struct machine *machine__new_kallsyms(void)
+{
+ struct machine *machine = machine__new_host();
+ /*
+ * FIXME:
+ * 1) MAP__FUNCTION will go away when we stop loading separate maps for
+ * functions and data objects.
+ * 2) We should switch to machine__load_kallsyms(), i.e. not explicitely
+ * ask for not using the kcore parsing code, once this one is fixed
+ * to create a map per module.
+ */
+ if (machine && __machine__load_kallsyms(machine, "/proc/kallsyms", MAP__FUNCTION, true) <= 0) {
+ machine__delete(machine);
+ machine = NULL;
+ }
+
+ return machine;
+}
+
static void dsos__purge(struct dsos *dsos)
{
struct dso *pos, *n;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 354de6e..a283050 100644

tip-bot for Arnaldo Carvalho de Melo

unread,
Jan 12, 2017, 3:40:05 AM1/12/17
to
Commit-ID: 355637717d575f14169954c3ed31536d45778f08
Gitweb: http://git.kernel.org/tip/355637717d575f14169954c3ed31536d45778f08
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:33:32 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 11 Jan 2017 16:48:01 -0300

perf kallsyms: Introduce tool to look for extended symbol information on the running kernel

Its similar to doing grep on a /proc/kallsyms, but it also shows extra
information like the path to the kernel module and the unrelocated
addresses in it, to help in diagnosing problems.

It is also helps demonstrate the use of the symbols routines so that
tool writers can use them more effectively.

Using it:

$ perf kallsyms e1000_xmit_frame netif_rx usb_stor_set_xfer_buf
e1000_xmit_frame: [e1000e] /lib/modules/4.9.0+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 0xffffffffc046fc10-0xffffffffc0470bb0 (0x19c80-0x1ac20)
netif_rx: [kernel] [kernel.kallsyms] 0xffffffff916f03a0-0xffffffff916f0410 (0xffffffff916f03a0-0xffffffff916f0410)
usb_stor_set_xfer_buf: [usb_storage] /lib/modules/4.9.0+/kernel/drivers/usb/storage/usb-storage.ko 0xffffffffc057aea0-0xffffffffc057af19 (0xf10-0xf89)
$

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-79bk9pakuj...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/Documentation/perf-kallsyms.txt | 24 +++++++++++
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-kallsyms.c | 67 ++++++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
7 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Build b/tools/perf/Build
index b12d5d1..0b48806 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -7,6 +7,7 @@ perf-y += builtin-help.o
perf-y += builtin-sched.o
perf-y += builtin-buildid-list.o
perf-y += builtin-buildid-cache.o
+perf-y += builtin-kallsyms.o
perf-y += builtin-list.o
perf-y += builtin-record.o
perf-y += builtin-report.o
diff --git a/tools/perf/Documentation/perf-kallsyms.txt b/tools/perf/Documentation/perf-kallsyms.txt
new file mode 100644
index 0000000..954ea9e
index 3bdb2c7..93da24a 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -434,7 +434,7 @@ int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
const char * const builtin_help_subcommands[] = {
"buildid-cache", "buildid-list", "diff", "evlist", "help", "list",
"record", "report", "bench", "stat", "timechart", "top", "annotate",
- "script", "sched", "kmem", "lock", "kvm", "test", "inject", "mem", "data",
+ "script", "sched", "kallsyms", "kmem", "lock", "kvm", "test", "inject", "mem", "data",
#ifdef HAVE_LIBELF_SUPPORT
"probe",
#endif
diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
new file mode 100644
index 0000000..224bfc4
+ return -1;
+ }
+
+ for (i = 0; i < argc; ++i) {
+ struct map *map;
+ struct symbol *symbol = machine__find_kernel_function_by_name(machine, argv[i], &map);
+
+ if (symbol == NULL) {
+ printf("%s: not found\n", argv[i]);
+ continue;
+ }
+
+ printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+ symbol->name, map->dso->short_name, map->dso->long_name,
+ map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+ symbol->start, symbol->end);
+ }
+
+ machine__delete(machine);
+ return 0;
+}
+
+int cmd_kallsyms(int argc, const char **argv, const char *prefix __maybe_unused)
+{
+ const struct option options[] = {
+ OPT_INCR('v', "verbose", &verbose, "be more verbose (show counter open errors, etc)"),
+ OPT_END()
+ };
+ const char * const kallsyms_usage[] = {
+ "perf kallsyms [<options>] symbol_name",
+ NULL
+ };
+
+ argc = parse_options(argc, argv, options, kallsyms_usage, 0);
+ if (argc < 1)
+ usage_with_options(kallsyms_usage, options);
+
+ symbol_conf.sort_by_name = true;
+ symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
+ if (symbol__init(NULL) < 0)
+ return -1;
+
+ return __cmd_kallsyms(argc, argv);
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 0bcf68e..b55f5be 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -23,6 +23,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix);
int cmd_evlist(int argc, const char **argv, const char *prefix);
int cmd_help(int argc, const char **argv, const char *prefix);
int cmd_sched(int argc, const char **argv, const char *prefix);
+int cmd_kallsyms(int argc, const char **argv, const char *prefix);
int cmd_list(int argc, const char **argv, const char *prefix);
int cmd_record(int argc, const char **argv, const char *prefix);
int cmd_report(int argc, const char **argv, const char *prefix);
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index ab5cbaa..fb45613 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -12,6 +12,7 @@ perf-diff mainporcelain common
perf-config mainporcelain common
perf-evlist mainporcelain common
perf-inject mainporcelain common
+perf-kallsyms mainporcelain common
perf-kmem mainporcelain common
perf-kvm mainporcelain common
perf-list mainporcelain common
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index aa23b33..13c8a7d 100644

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:08 AM1/17/17
to
From: Namhyung Kim <namh...@kernel.org>

Separate thread wait time into 3 parts - sleep, iowait and preempt based
on the prev_state of the last event.

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523....@kernel.org
[ Fix the build on centos:5 where 'wait' shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-sched.c | 50 ++++++++++++++++++++++++++++++++++++++++------
1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5b134b0d1ff3..6d3c3e84881a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -77,6 +77,22 @@ struct sched_atom {

#define TASK_STATE_TO_CHAR_STR "RSDTtZXxKWP"

+/* task state bitmask, copied from include/linux/sched.h */
+#define TASK_RUNNING 0
+#define TASK_INTERRUPTIBLE 1
+#define TASK_UNINTERRUPTIBLE 2
+#define __TASK_STOPPED 4
+#define __TASK_TRACED 8
+/* in tsk->exit_state */
+#define EXIT_DEAD 16
+#define EXIT_ZOMBIE 32
+#define EXIT_TRACE (EXIT_ZOMBIE | EXIT_DEAD)
+/* in tsk->state again */
+#define TASK_DEAD 64
+#define TASK_WAKEKILL 128
+#define TASK_WAKING 256
+#define TASK_PARKED 512
+
enum thread_state {
THREAD_SLEEPING = 0,
THREAD_WAIT_CPU,
@@ -216,13 +232,16 @@ struct perf_sched {
struct thread_runtime {
u64 last_time; /* time of previous sched in/out event */
u64 dt_run; /* run time */
- u64 dt_wait; /* time between CPU access (off cpu) */
+ u64 dt_sleep; /* time between CPU access by sleep (off cpu) */
+ u64 dt_iowait; /* time between CPU access by iowait (off cpu) */
+ u64 dt_preempt; /* time between CPU access by preempt (off cpu) */
u64 dt_delay; /* time between wakeup and sched-in */
u64 ready_to_run; /* time of wakeup */

struct stats run_stats;
u64 total_run_time;

+ int last_state;
u64 migrations;
};

@@ -1858,6 +1877,7 @@ static void timehist_print_sample(struct perf_sched *sched,
struct thread_runtime *tr = thread__priv(thread);
u32 max_cpus = sched->max_cpu + 1;
char tstr[64];
+ u64 wait_time;

timestamp__scnprintf_usec(t, tstr, sizeof(tstr));
printf("%15s [%04d] ", tstr, sample->cpu);
@@ -1880,7 +1900,9 @@ static void timehist_print_sample(struct perf_sched *sched,

printf(" %-*s ", comm_width, timehist_get_commstr(thread));

- print_sched_time(tr->dt_wait, 6);
+ wait_time = tr->dt_sleep + tr->dt_iowait + tr->dt_preempt;
+ print_sched_time(wait_time, 6);
+
print_sched_time(tr->dt_delay, 6);
print_sched_time(tr->dt_run, 6);

@@ -1930,8 +1952,11 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
u64 t, u64 tprev)
{
r->dt_delay = 0;
- r->dt_wait = 0;
+ r->dt_sleep = 0;
+ r->dt_iowait = 0;
+ r->dt_preempt = 0;
r->dt_run = 0;
+
if (tprev) {
r->dt_run = t - tprev;
if (r->ready_to_run) {
@@ -1943,8 +1968,16 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,

if (r->last_time > tprev)
pr_debug("time travel: last sched out time for task > previous sched_switch event\n");
- else if (r->last_time)
- r->dt_wait = tprev - r->last_time;
+ else if (r->last_time) {
+ u64 dt_wait = tprev - r->last_time;
+
+ if (r->last_state == TASK_RUNNING)
+ r->dt_preempt = dt_wait;
+ else if (r->last_state == TASK_UNINTERRUPTIBLE)
+ r->dt_iowait = dt_wait;
+ else
+ r->dt_sleep = dt_wait;
+ }
}

update_stats(&r->run_stats, r->dt_run);
@@ -2447,8 +2480,10 @@ static int timehist_sched_change_event(struct perf_tool *tool,
* time. we only care total run time and run stat.
*/
last_tr->dt_run = 0;
- last_tr->dt_wait = 0;
last_tr->dt_delay = 0;
+ last_tr->dt_sleep = 0;
+ last_tr->dt_iowait = 0;
+ last_tr->dt_preempt = 0;

if (itr->cursor.nr)
callchain_append(&itr->callchain, &itr->cursor, t - tprev);
@@ -2470,6 +2505,9 @@ static int timehist_sched_change_event(struct perf_tool *tool,
/* time of this sched_switch event becomes last time task seen */
tr->last_time = sample->time;

+ /* last state is used to determine where to account wait time */
+ tr->last_state = perf_evsel__intval(evsel, sample, "prev_state");
+
/* sched out event for task so reset ready to run time */
tr->ready_to_run = 0;
}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:08 AM1/17/17
to
From: David Carrillo-Cisneros <dav...@google.com>

Don't warn for feature-dwarf==0 if user explicitily disabled DWARF by
using NO_DWARF=1.

Signed-off-by: David Carrillo-Cisneros <dav...@google.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <era...@google.com>
Link: http://lkml.kernel.org/r/20170112210159....@google.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Makefile.config | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 76c84f0eec52..03cf947755b9 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -291,8 +291,10 @@ else
endif
endif
ifneq ($(feature-dwarf), 1)
- msg := $(warning No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev);
- NO_DWARF := 1
+ ifndef NO_DWARF
+ msg := $(warning No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev);
+ NO_DWARF := 1
+ endif
else
ifneq ($(feature-dwarf_getlocations), 1)
msg := $(warning Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:08 AM1/17/17
to
From: Namhyung Kim <namh...@kernel.org>

When --state option is given, the summary will show total run, sleep,
iowait, preempt and delay time instead of statistics of runtime.

$ perf sched timehist -s --state

Wait-time summary
comm parent sched-in run-time sleep iowait preempt delay
(count) (msec) (msec) (msec) (msec) (msec)
---------------------------------------------------------------------
systemd[1] 0 3 0.497 1.685 0.000 0.000 0.061
ksoftirqd/0[3] 2 21 0.434 989.948 0.000 0.000 0.325
rcu_preempt[7] 2 28 0.386 993.211 0.000 0.000 0.712
migration/0[10] 2 12 0.126 50.174 0.000 0.000 0.044
watchdog/0[11] 2 1 0.009 0.000 0.000 0.000 0.016
migration/1[13] 2 2 0.029 11.755 0.000 0.000 0.007
<SNIP>

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-sched.c | 44 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index a8ac76602187..daceb3202200 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -241,6 +241,10 @@ struct thread_runtime {

struct stats run_stats;
u64 total_run_time;
+ u64 total_sleep_time;
+ u64 total_iowait_time;
+ u64 total_preempt_time;
+ u64 total_delay_time;

int last_state;
u64 migrations;
@@ -2008,7 +2012,12 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
}

update_stats(&r->run_stats, r->dt_run);
- r->total_run_time += r->dt_run;
+
+ r->total_run_time += r->dt_run;
+ r->total_delay_time += r->dt_delay;
+ r->total_sleep_time += r->dt_sleep;
+ r->total_iowait_time += r->dt_iowait;
+ r->total_preempt_time += r->dt_preempt;
}

static bool is_idle_sample(struct perf_sample *sample,
@@ -2593,7 +2602,26 @@ static void print_thread_runtime(struct thread *t,
printf("\n");
}

+static void print_thread_waittime(struct thread *t,
+ struct thread_runtime *r)
+{
+ printf("%*s %5d %9" PRIu64 " ",
+ comm_width, timehist_get_commstr(t), t->ppid,
+ (u64) r->run_stats.n);
+
+ print_sched_time(r->total_run_time, 8);
+ print_sched_time(r->total_sleep_time, 6);
+ printf(" ");
+ print_sched_time(r->total_iowait_time, 6);
+ printf(" ");
+ print_sched_time(r->total_preempt_time, 6);
+ printf(" ");
+ print_sched_time(r->total_delay_time, 6);
+ printf("\n");
+}
+
struct total_run_stats {
+ struct perf_sched *sched;
u64 sched_count;
u64 task_count;
u64 total_run_time;
@@ -2612,7 +2640,11 @@ static int __show_thread_runtime(struct thread *t, void *priv)
stats->task_count++;
stats->sched_count += r->run_stats.n;
stats->total_run_time += r->total_run_time;
- print_thread_runtime(t, r);
+
+ if (stats->sched->show_state)
+ print_thread_waittime(t, r);
+ else
+ print_thread_runtime(t, r);
}

return 0;
@@ -2700,18 +2732,24 @@ static void timehist_print_summary(struct perf_sched *sched,
u64 hist_time = sched->hist_time.end - sched->hist_time.start;

memset(&totals, 0, sizeof(totals));
+ totals.sched = sched;

if (sched->idle_hist) {
printf("\nIdle-time summary\n");
printf("%*s parent sched-out ", comm_width, "comm");
printf(" idle-time min-idle avg-idle max-idle stddev migrations\n");
+ } else if (sched->show_state) {
+ printf("\nWait-time summary\n");
+ printf("%*s parent sched-in ", comm_width, "comm");
+ printf(" run-time sleep iowait preempt delay\n");
} else {
printf("\nRuntime summary\n");
printf("%*s parent sched-in ", comm_width, "comm");
printf(" run-time min-run avg-run max-run stddev migrations\n");
}
printf("%*s (count) ", comm_width, "");
- printf(" (msec) (msec) (msec) (msec) %%\n");
+ printf(" (msec) (msec) (msec) (msec) %s\n",
+ sched->show_state ? "(msec)" : "%");
printf("%.117s\n", graph_dotted_line);

machine__for_each_thread(m, show_thread_runtime, &totals);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:08 AM1/17/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 5b485629ba0d5d027880769ff467c587b24b4bde:

kprobes, extable: Identify kprobes trampolines as kernel text area (2017-01-14 08:38:05 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170117

for you to fetch changes up to d94386f28abad0c5879f0760712e34e71f88a7da:

perf evlist: Fix typo in deliver_sample() (2017-01-17 11:36:45 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New feature:

- Account thread wait time (off cpu time) separately: sleep, iowait and
preempt, based on the prev_state of the last event, show the breakdown
when using "perf sched timehist --state" (Namhyumg Kim)

Infrastructure:

- Factor out pmu scale conversion code (Andi Kleen)

- Remove unnecessary feature-dwarf warning (David Carrillo-Cisneros)

- Add missing member name in OPT_() macros (Soramichi AKIYAMA)

- Move variables referenced in libperf.a object files from perf's main()
file, so that other tools can use libperf.a with a different main()
(Soramichi AKIYAMA)

Documentation:

- Fix 'perf script' man page about --dump-raw-trace option (Michael Petlan)

- Also allow forcing reading of non-root owned files by root in 'perf
script' (Yannick Brosseau)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
perf pmu: Factor out scale conversion code

David Carrillo-Cisneros (1):
perf tools: Remove unneccessary feature-dwarf warning

Michael Petlan (1):
perf script: Fix man page about --dump-raw-trace option

Namhyung Kim (3):
perf sched timehist: Account thread wait time separately
perf sched timehist: Add --state option
perf sched timehist: Show total wait times for summary

Soramichi AKIYAMA (3):
tools lib subcmd: Fix missing member name
perf tools: Move two variables usied in libperf from perf.c
perf evlist: Fix typo in deliver_sample()

Yannick Brosseau (1):
perf script: Also allow forcing reading of non-root owned files by root

tools/lib/subcmd/parse-options.h | 18 ++---
tools/perf/Build | 3 +-
tools/perf/Documentation/perf-sched.txt | 2 +
tools/perf/Documentation/perf-script.txt | 4 +-
tools/perf/Makefile.config | 6 +-
tools/perf/builtin-sched.c | 130 ++++++++++++++++++++++++++++---
tools/perf/builtin-script.c | 3 +-
tools/perf/perf.c | 3 -
tools/perf/ui/setup.c | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/header.c | 2 +
tools/perf/util/pmu.c | 62 ++++++++-------
tools/perf/util/session.c | 2 +-
13 files changed, 177 insertions(+), 60 deletions(-)

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.

Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips: Ok
12 debian:experimental-x-mips64: Ok
13 debian:experimental-x-mipsel: Ok
14 fedora:20: Ok
15 fedora:21: Ok
16 fedora:22: Ok
17 fedora:23: Ok
18 fedora:24: Ok
19 fedora:24-x-ARC-uClibc: Ok
20 fedora:25: Ok
21 fedora:rawhide: Ok
22 mageia:5: Ok
23 opensuse:13.2: Ok
24 opensuse:42.1: Ok
25 opensuse:tumbleweed: Ok
26 ubuntu:12.04.5: Ok
27 ubuntu:14.04.4-x-linaro-arm64: Ok
28 ubuntu:15.10: Ok
29 ubuntu:16.04: Ok
30 ubuntu:16.04-x-arm: Ok
31 ubuntu:16.04-x-arm64: Ok
32 ubuntu:16.04-x-powerpc: Ok
33 ubuntu:16.04-x-powerpc64: Ok
34 ubuntu:16.04-x-powerpc64el: Ok
35 ubuntu:16.04-x-s390: Ok
36 ubuntu:16.10: Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
make_no_libperl_O: make NO_LIBPERL=1
make_doc_O: make doc
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_help_O: make help
make_no_gtk2_O: make NO_GTK2=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_clean_all_O: make clean all
make_install_bin_O: make install-bin
make_no_libbionic_O: make NO_LIBBIONIC=1
make_install_O: make install
make_debug_O: make DEBUG=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_install_prefix_O: make install prefix=/tmp/krava
make_perf_o_O: make perf.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_slang_O: make NO_SLANG=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_newt_O: make NO_NEWT=1
make_no_libbpf_O: make NO_LIBBPF=1
make_util_map_o_O: make util/map.o
make_static_O: make LDFLAGS=-static
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_tags_O: make tags
make_pure_O: make
make_no_libnuma_O: make NO_LIBNUMA=1

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:09 AM1/17/17
to
From: Michael Petlan <mpe...@redhat.com>

The "--dump-raw-script" is not a valid option, replace it with the valid
one, "--dump-raw-trace"

Signed-off-by: Michael Petlan <mpe...@redhat.com>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Thomas Gleixner <tg...@linutronix.de>
Fixes: 133dc4c39c57 ("perf: Rename 'perf trace' to 'perf script'")
LPU-Reference: 728644547.14560155.14843...@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 5dc5c6a09ac4..4ed5f239ba7d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -36,7 +36,7 @@ There are several variants of perf script:

'perf script report <script> [args]' to run and display the results
of <script>. <script> is the name displayed in the output of 'perf
- trace --list' i.e. the actual script name minus any language
+ script --list' i.e. the actual script name minus any language
extension. The perf.data output from a previous run of 'perf script
record <script>' is used and should be present for this command to
succeed. [args] refers to the (mainly optional) args expected by
@@ -76,7 +76,7 @@ OPTIONS
Any command you can specify in a shell.

-D::
---dump-raw-script=::
+--dump-raw-trace=::
Display verbose dump of the trace data.

-L::
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:10 AM1/17/17
to
From: Soramichi AKIYAMA <aki...@m.soramichi.jp>

This patch adds missing member names to struct initializations.

Although in C99 for struct S {int x, int y} two init codes struct S s =
{.x = (a), (b)} and struct S s = {.x = (a), .y = (b)} are the same, it
is better to explicitly write .y (.argh in this patch) for readability
and robustness against language/compiler evolutions.

Signed-off-by: Soramichi Akiyama <aki...@m.soramichi.jp>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170113215623.32fb...@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/subcmd/parse-options.h | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/lib/subcmd/parse-options.h b/tools/lib/subcmd/parse-options.h
index 37e2d1a6fc2a..f054ca1b899d 100644
--- a/tools/lib/subcmd/parse-options.h
+++ b/tools/lib/subcmd/parse-options.h
@@ -133,32 +133,32 @@ struct option {
#define OPT_UINTEGER(s, l, v, h) { .type = OPTION_UINTEGER, .short_name = (s), .long_name = (l), .value = check_vtype(v, unsigned int *), .help = (h) }
#define OPT_LONG(s, l, v, h) { .type = OPTION_LONG, .short_name = (s), .long_name = (l), .value = check_vtype(v, long *), .help = (h) }
#define OPT_U64(s, l, v, h) { .type = OPTION_U64, .short_name = (s), .long_name = (l), .value = check_vtype(v, u64 *), .help = (h) }
-#define OPT_STRING(s, l, v, a, h) { .type = OPTION_STRING, .short_name = (s), .long_name = (l), .value = check_vtype(v, const char **), (a), .help = (h) }
+#define OPT_STRING(s, l, v, a, h) { .type = OPTION_STRING, .short_name = (s), .long_name = (l), .value = check_vtype(v, const char **), .argh = (a), .help = (h) }
#define OPT_STRING_OPTARG(s, l, v, a, h, d) \
{ .type = OPTION_STRING, .short_name = (s), .long_name = (l), \
- .value = check_vtype(v, const char **), (a), .help = (h), \
+ .value = check_vtype(v, const char **), .argh =(a), .help = (h), \
.flags = PARSE_OPT_OPTARG, .defval = (intptr_t)(d) }
#define OPT_STRING_OPTARG_SET(s, l, v, os, a, h, d) \
{ .type = OPTION_STRING, .short_name = (s), .long_name = (l), \
- .value = check_vtype(v, const char **), (a), .help = (h), \
+ .value = check_vtype(v, const char **), .argh = (a), .help = (h), \
.flags = PARSE_OPT_OPTARG, .defval = (intptr_t)(d), \
.set = check_vtype(os, bool *)}
-#define OPT_STRING_NOEMPTY(s, l, v, a, h) { .type = OPTION_STRING, .short_name = (s), .long_name = (l), .value = check_vtype(v, const char **), (a), .help = (h), .flags = PARSE_OPT_NOEMPTY}
+#define OPT_STRING_NOEMPTY(s, l, v, a, h) { .type = OPTION_STRING, .short_name = (s), .long_name = (l), .value = check_vtype(v, const char **), .argh = (a), .help = (h), .flags = PARSE_OPT_NOEMPTY}
#define OPT_DATE(s, l, v, h) \
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = "time", .help = (h), .callback = parse_opt_approxidate_cb }
#define OPT_CALLBACK(s, l, v, a, h, f) \
- { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), (a), .help = (h), .callback = (f) }
+ { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f) }
#define OPT_CALLBACK_NOOPT(s, l, v, a, h, f) \
- { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), (a), .help = (h), .callback = (f), .flags = PARSE_OPT_NOARG }
+ { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f), .flags = PARSE_OPT_NOARG }
#define OPT_CALLBACK_DEFAULT(s, l, v, a, h, f, d) \
- { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), (a), .help = (h), .callback = (f), .defval = (intptr_t)d, .flags = PARSE_OPT_LASTARG_DEFAULT }
+ { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f), .defval = (intptr_t)d, .flags = PARSE_OPT_LASTARG_DEFAULT }
#define OPT_CALLBACK_DEFAULT_NOOPT(s, l, v, a, h, f, d) \
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l),\
- .value = (v), (a), .help = (h), .callback = (f), .defval = (intptr_t)d,\
+ .value = (v), .arg = (a), .help = (h), .callback = (f), .defval = (intptr_t)d,\
.flags = PARSE_OPT_LASTARG_DEFAULT | PARSE_OPT_NOARG}
#define OPT_CALLBACK_OPTARG(s, l, v, d, a, h, f) \
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), \
- .value = (v), (a), .help = (h), .callback = (f), \
+ .value = (v), .argh = (a), .help = (h), .callback = (f), \
.flags = PARSE_OPT_OPTARG, .data = (d) }

/* parse_options() will filter out the processed options and leave the
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:10 AM1/17/17
to
From: Andi Kleen <a...@linux.intel.com>

Move the scale factor parsing code to an own function to reuse it in an
upcoming patch.

v2: Return error in case strdup returns NULL.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017010315083...@firstfloor.org
[ Keep returning -ENOMEM when strdup() fails in perf_pmu__parse_scale()/convert_scale() ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/pmu.c | 62 ++++++++++++++++++++++++++++-----------------------
1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index dc6ccaa4e927..78b16100567d 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -94,32 +94,10 @@ static int pmu_format(const char *name, struct list_head *format)
return 0;
}

-static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *name)
+static int convert_scale(const char *scale, char **end, double *sval)
{
- struct stat st;
- ssize_t sret;
- char scale[128];
- int fd, ret = -1;
- char path[PATH_MAX];
char *lc;
-
- snprintf(path, PATH_MAX, "%s/%s.scale", dir, name);
-
- fd = open(path, O_RDONLY);
- if (fd == -1)
- return -1;
-
- if (fstat(fd, &st) < 0)
- goto error;
-
- sret = read(fd, scale, sizeof(scale)-1);
- if (sret < 0)
- goto error;
-
- if (scale[sret - 1] == '\n')
- scale[sret - 1] = '\0';
- else
- scale[sret] = '\0';
+ int ret = 0;

/*
* save current locale
@@ -134,7 +112,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
lc = strdup(lc);
if (!lc) {
ret = -ENOMEM;
- goto error;
+ goto out;
}

/*
@@ -144,14 +122,42 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
*/
setlocale(LC_NUMERIC, "C");

- alias->scale = strtod(scale, NULL);
+ *sval = strtod(scale, end);

+out:
/* restore locale */
setlocale(LC_NUMERIC, lc);
-
free(lc);
+ return ret;
+}
+
+static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *name)
+{
+ struct stat st;
+ ssize_t sret;
+ char scale[128];
+ int fd, ret = -1;
+ char path[PATH_MAX];
+
+ snprintf(path, PATH_MAX, "%s/%s.scale", dir, name);
+
+ fd = open(path, O_RDONLY);
+ if (fd == -1)
+ return -1;
+
+ if (fstat(fd, &st) < 0)
+ goto error;
+
+ sret = read(fd, scale, sizeof(scale)-1);
+ if (sret < 0)
+ goto error;
+
+ if (scale[sret - 1] == '\n')
+ scale[sret - 1] = '\0';
+ else
+ scale[sret] = '\0';

- ret = 0;
+ ret = convert_scale(scale, NULL, &alias->scale);
error:
close(fd);
return ret;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:10 AM1/17/17
to
From: Soramichi AKIYAMA <aki...@m.soramichi.jp>

This patch fixes a typo: s/delievery/delivery/

Signed-off-by: Soramichi Akiyama <aki...@m.soramichi.jp>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170117222233.dfd9...@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/session.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f268201048a0..349c68144e55 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1191,7 +1191,7 @@ static int
u64 sample_type = evsel->attr.sample_type;
u64 read_format = evsel->attr.read_format;

- /* Standard sample delievery. */
+ /* Standard sample delivery. */
if (!(sample_type & PERF_SAMPLE_READ))
return tool->sample(tool, event, sample, evsel, machine);

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:11 AM1/17/17
to
From: Namhyung Kim <namh...@kernel.org>

The --state option is to show task state when switched out. The state
is printed as a single character like in the /proc but I added 'I' for
idle state rather than 'R'.

$ perf sched timehist --state | head
Samples do not have callchains.
time cpu task name wait time sch delay run time state
[tid/pid] (msec) (msec) (msec)
-------- --- ----------------------- -------- ------------------ -----
1.753791 [3] <idle> 0.000 0.000 0.000 I
1.753834 [1] perf[27469] 0.000 0.000 0.000 S
1.753904 [3] perf[27470] 0.000 0.006 0.112 S
1.753914 [1] <idle> 0.000 0.000 0.079 I
1.753915 [3] migration/3[23] 0.000 0.002 0.011 S
1.754287 [2] <idle> 0.000 0.000 0.000 I
1.754335 [2] transmission[1773/1739] 0.000 0.004 0.047 S

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Minchan Kim <min...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-sched.txt | 2 ++
tools/perf/builtin-sched.c | 38 +++++++++++++++++++++++++++++----
2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 76173969ab80..d33deddb0146 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -143,6 +143,8 @@ OPTIONS for 'perf sched timehist'
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.

+--state::
+ Show task state when it switched out.

SEE ALSO
--------
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 6d3c3e84881a..a8ac76602187 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -222,6 +222,7 @@ struct perf_sched {
bool show_cpu_visual;
bool show_wakeups;
bool show_migrations;
+ bool show_state;
u64 skipped_samples;
const char *time_str;
struct perf_time_interval ptime;
@@ -1840,6 +1841,9 @@ static void timehist_header(struct perf_sched *sched)
printf(" %-*s %9s %9s %9s", comm_width,
"task name", "wait time", "sch delay", "run time");

+ if (sched->show_state)
+ printf(" %s", "state");
+
printf("\n");

/*
@@ -1850,9 +1854,14 @@ static void timehist_header(struct perf_sched *sched)
if (sched->show_cpu_visual)
printf(" %*s ", ncpus, "");

- printf(" %-*s %9s %9s %9s\n", comm_width,
+ printf(" %-*s %9s %9s %9s", comm_width,
"[tid/pid]", "(msec)", "(msec)", "(msec)");

+ if (sched->show_state)
+ printf(" %5s", "");
+
+ printf("\n");
+
/*
* separator
*/
@@ -1865,14 +1874,29 @@ static void timehist_header(struct perf_sched *sched)
graph_dotted_line, graph_dotted_line, graph_dotted_line,
graph_dotted_line);

+ if (sched->show_state)
+ printf(" %.5s", graph_dotted_line);
+
printf("\n");
}

+static char task_state_char(struct thread *thread, int state)
+{
+ static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
+ unsigned bit = state ? ffs(state) : 0;
+
+ /* 'I' for idle */
+ if (thread->tid == 0)
+ return 'I';
+
+ return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
+}
+
static void timehist_print_sample(struct perf_sched *sched,
struct perf_sample *sample,
struct addr_location *al,
struct thread *thread,
- u64 t)
+ u64 t, int state)
{
struct thread_runtime *tr = thread__priv(thread);
u32 max_cpus = sched->max_cpu + 1;
@@ -1906,6 +1930,9 @@ static void timehist_print_sample(struct perf_sched *sched,
print_sched_time(tr->dt_delay, 6);
print_sched_time(tr->dt_run, 6);

+ if (sched->show_state)
+ printf(" %5c ", task_state_char(thread, state));
+
if (sched->show_wakeups)
printf(" %-*s", comm_width, "");

@@ -2406,6 +2433,8 @@ static int timehist_sched_change_event(struct perf_tool *tool,
struct thread_runtime *tr = NULL;
u64 tprev, t = sample->time;
int rc = 0;
+ int state = perf_evsel__intval(evsel, sample, "prev_state");
+

if (machine__resolve(machine, &al, sample) < 0) {
pr_err("problem processing %d event. skipping it\n",
@@ -2493,7 +2522,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
}

if (!sched->summary_only)
- timehist_print_sample(sched, sample, &al, thread, t);
+ timehist_print_sample(sched, sample, &al, thread, t, state);

out:
if (sched->hist_time.start == 0 && t >= ptime->start)
@@ -2506,7 +2535,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
tr->last_time = sample->time;

/* last state is used to determine where to account wait time */
- tr->last_state = perf_evsel__intval(evsel, sample, "prev_state");
+ tr->last_state = state;

/* sched out event for task so reset ready to run time */
tr->ready_to_run = 0;
@@ -3278,6 +3307,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('I', "idle-hist", &sched.idle_hist, "Show idle events only"),
OPT_STRING(0, "time", &sched.time_str, "str",
"Time span for analysis (start,stop)"),
+ OPT_BOOLEAN(0, "state", &sched.show_state, "Show task state when sched-out"),
OPT_PARENT(sched_options)
};

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 17, 2017, 11:10:11 AM1/17/17
to
From: Yannick Brosseau <scie...@fb.com>

In 2059fc7a5a9e ("perf symbols: Allow forcing reading of non-root owned
files by root") 'perf report' was added the option of forcing reading of
non-root owned symbol file.

This add the same behavior for perf script.

Reported-by: Mark Drayton <m...@fb.com>
Signed-off-by: Yannick Brosseau <scie...@fb.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: kerne...@fb.com
Link: http://lkml.kernel.org/r/20170113182527.1...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-script.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 2f3ff69fc4e7..c0783b4f7b6c 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2180,7 +2180,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"Show the mmap events"),
OPT_BOOLEAN('\0', "show-switch-events", &script.show_switch_events,
"Show context switch events (if recorded)"),
- OPT_BOOLEAN('f', "force", &file.force, "don't complain, do it"),
+ OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_BOOLEAN(0, "ns", &nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
@@ -2212,6 +2212,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
PARSE_OPT_STOP_AT_NON_OPTION);

file.path = input_name;
+ file.force = symbol_conf.force;

if (argc > 1 && !strncmp(argv[0], "rec", strlen("rec"))) {
rec_script_path = get_script_path(argv[1], RECORD_SUFFIX);
--
2.9.3

Ingo Molnar

unread,
Jan 18, 2017, 4:20:06 AM1/18/17
to

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:06 AM1/25/17
to
From: Jiri Olsa <jo...@kernel.org>

Currently we allow only to expand or collapse all entries in the browser
with 'E' or 'C' keys. Allow user to expand or collapse only current
entry in the browser with e or c key.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-3-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/hists.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 8bf18afe2a1f..fc4fb669ceee 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -571,6 +571,15 @@ static void hist_browser__set_folding(struct hist_browser *browser, bool unfold)
ui_browser__reset_index(&browser->b);
}

+static void hist_browser__set_folding_selected(struct hist_browser *browser, bool unfold)
+{
+ if (!browser->he_selection)
+ return;
+
+ hist_entry__set_folding(browser->he_selection, browser, unfold);
+ browser->b.nr_entries = hist_browser__nr_entries(browser);
+}
+
static void ui_browser__warn_lost_events(struct ui_browser *browser)
{
ui_browser__warning(browser, 4,
@@ -644,10 +653,18 @@ int hist_browser__run(struct hist_browser *browser, const char *help)
/* Collapse the whole world. */
hist_browser__set_folding(browser, false);
break;
+ case 'c':
+ /* Collapse the selected entry. */
+ hist_browser__set_folding_selected(browser, false);
+ break;
case 'E':
/* Expand the whole world. */
hist_browser__set_folding(browser, true);
break;
+ case 'e':
+ /* Expand the selected entry. */
+ hist_browser__set_folding_selected(browser, true);
+ break;
case 'H':
browser->show_headers = !browser->show_headers;
hist_browser__update_rows(browser);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:06 AM1/25/17
to
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9f6f941e25bad8fcffc24d10762962d62edba767:

Merge tag 'perf-core-for-mingo-4.11-20170117' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-01-18 10:06:20 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170125

for you to fetch changes up to bb6457b8af267b92ba3e752d9ccb3a4d4965a912:

perf ftrace: Make 'function_graph' be the default tracer (2017-01-25 10:37:27 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Disassemble x86 branch stacks using "-F brstackasm" and Intel PT
traces with "-F asm" in 'perf script' using Intel's XED library.

Since this is not widely available in pre-packaged forms in distros,
it will not be automatically feature probed, needing to be explicitly
enabled at perf build time using the XED=1 and optionally the XED_DIR
make command line variables.

See the changeset log messages and committer notes to see it in action
(Andi Kleen)

- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
function and function_graph tracer, defaulting to the "function_graph"
tracer, more work will be done in reviving this effort, forward porting
it from its initial patch submission (Namhyung Kim)

- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)

Fixes:

- Fix wrong register name for arm64, used in 'perf probe' (He Kuang)

- Fix map offsets in relocation in libbpf (Joe Stringer)

- Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure:

- libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial:

- Remove unnecessary checks and assignments in 'perf probe's
try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Andi Kleen (5):
perf tools: Add probing for the XED disassembler library
perf tools: Add one liner warning for disabled features
perf tools: Add disassembler for x86 using the XED library
perf script: Add support for printing assembler
perf script: Add "brstackasm" output for branch stacks

Arnaldo Carvalho de Melo (3):
perf scripting perl: Do not die() when not founding event for a type
perf tools: Propagate perf_config() errors
perf ftrace: Make 'function_graph' be the default tracer

He Kuang (1):
perf probe: Fix wrong register name for arm64

Jiri Olsa (4):
perf hists browser: Put hist_entry folding logic into single function
perf hists browser: Add e/c hotkeys to expand/collapse callchain for current entry
perf c2c report: Display Total records column in offset view
perf c2c report: Coalesce by default only by pid,iaddr

Joe Stringer (4):
tools lib bpf: Fix map offsets in relocation
tools lib bpf: Define prog_type fns with macro
tools lib bpf: Add set/is helpers for all prog types
tools lib bpf: Add libbpf_get_error()

Markus Elfring (2):
perf probe: Delete an unnecessary check in try_to_find_absolute_address()
perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()

Matija Glavinic Pecotic (1):
perf unwind: Fix looking up dwarf unwind stack info

Namhyung Kim (3):
perf util: Save pid-cmdline mapping into tracing header
perf util: Add more debug message on failure path
perf ftrace: Introduce new 'ftrace' tool

tools/build/Makefile.feature | 4 +-
tools/build/feature/Makefile | 6 +-
tools/build/feature/test-all.c | 14 +
tools/build/feature/test-xed.c | 9 +
tools/lib/bpf/libbpf.c | 69 +++--
tools/lib/bpf/libbpf.h | 14 +-
tools/perf/Build | 1 +
tools/perf/Documentation/perf-c2c.txt | 2 +-
tools/perf/Documentation/perf-ftrace.txt | 36 +++
tools/perf/Documentation/perf-script.txt | 15 +-
tools/perf/Makefile.config | 29 ++
tools/perf/Makefile.perf | 3 +
tools/perf/arch/arm64/include/dwarf-regs-table.h | 12 +-
tools/perf/arch/x86/util/Build | 3 +
tools/perf/arch/x86/util/dis.c | 86 ++++++
tools/perf/builtin-c2c.c | 3 +-
tools/perf/builtin-ftrace.c | 243 +++++++++++++++
tools/perf/builtin-help.c | 6 +-
tools/perf/builtin-kmem.c | 8 +-
tools/perf/builtin-record.c | 4 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-script.c | 338 ++++++++++++++++++++-
tools/perf/builtin-top.c | 4 +-
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 16 +-
tools/perf/tests/llvm.c | 2 +-
tools/perf/ui/browsers/hists.c | 60 ++--
tools/perf/util/Build | 1 +
tools/perf/util/callchain.c | 14 +-
tools/perf/util/config.c | 9 +-
tools/perf/util/data-convert-bt.c | 7 +-
tools/perf/util/dis.c | 15 +
tools/perf/util/dis.h | 23 ++
tools/perf/util/dso.c | 48 ++-
tools/perf/util/header.c | 4 +-
tools/perf/util/hist.c | 4 +-
tools/perf/util/intel-pt.c | 4 +-
tools/perf/util/llvm-utils.c | 4 +-
tools/perf/util/probe-event.c | 11 +-
.../perf/util/scripting-engines/trace-event-perl.c | 6 +-
tools/perf/util/trace-event-info.c | 33 +-
tools/perf/util/trace-event-parse.c | 17 ++
tools/perf/util/trace-event-read.c | 77 ++++-
tools/perf/util/trace-event.h | 1 +
tools/perf/util/unwind-libunwind-local.c | 54 +++-
46 files changed, 1192 insertions(+), 133 deletions(-)
create mode 100644 tools/build/feature/test-xed.c
create mode 100644 tools/perf/Documentation/perf-ftrace.txt
create mode 100644 tools/perf/arch/x86/util/dis.c
create mode 100644 tools/perf/builtin-ftrace.c
create mode 100644 tools/perf/util/dis.c
create mode 100644 tools/perf/util/dis.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.

Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

# dm
make_no_libbpf_O: make NO_LIBBPF=1
make_no_demangle_O: make NO_DEMANGLE=1
make_tags_O: make tags
make_no_libbionic_O: make NO_LIBBIONIC=1
make_help_O: make help
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_newt_O: make NO_NEWT=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_auxtrace_O: make NO_AUXTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_debug_O: make DEBUG=1
make_util_map_o_O: make util/map.o
make_no_libelf_O: make NO_LIBELF=1
make_install_bin_O: make install-bin
make_no_libpython_O: make NO_LIBPYTHON=1
make_install_O: make install
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_gtk2_O: make NO_GTK2=1
make_pure_O: make
make_no_libperl_O: make NO_LIBPERL=1
make_no_slang_O: make NO_SLANG=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libaudit_O: make NO_LIBAUDIT=1
make_clean_all_O: make clean all
make_perf_o_O: make perf.o
make_static_O: make LDFLAGS=-static
make_doc_O: make doc
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:06 AM1/25/17
to
From: Jiri Olsa <jo...@kernel.org>

It seems to be the most used argument for -c option so far. In the
beginning when you want to have the overall process report, so it makes
sense to make it the default one.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-5-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-c2c.txt | 2 +-
tools/perf/builtin-c2c.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 3f06730c7f47..2da07e51e119 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -248,7 +248,7 @@ output fields set for caheline offsets output:
Code address, Code symbol, Shared Object, Source line
dso - coalesced by shared object

-By default the coalescing is setup with 'pid,tid,iaddr'.
+By default the coalescing is setup with 'pid,iaddr'.

STDIO OUTPUT
------------
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 616cb1418c3f..e2b21723bbf8 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -58,7 +58,7 @@ struct c2c_hist_entry {
struct hist_entry he;
};

-static char const *coalesce_default = "pid,tid,iaddr";
+static char const *coalesce_default = "pid,iaddr";

struct perf_c2c {
struct perf_tool tool;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:06 AM1/25/17
to
From: Joe Stringer <j...@ovn.org>

Turning this into a macro allows future prog types to be added with a
single line per type.

Signed-off-by: Joe Stringer <j...@ovn.org>
Acked-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/2017012301112...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/libbpf.c | 41 ++++++++++++++++-------------------------
1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 671d5ad07cf1..371cb40a2304 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1428,37 +1428,28 @@ static void bpf_program__set_type(struct bpf_program *prog,
prog->type = type;
}

-int bpf_program__set_tracepoint(struct bpf_program *prog)
-{
- if (!prog)
- return -EINVAL;
- bpf_program__set_type(prog, BPF_PROG_TYPE_TRACEPOINT);
- return 0;
-}
-
-int bpf_program__set_kprobe(struct bpf_program *prog)
-{
- if (!prog)
- return -EINVAL;
- bpf_program__set_type(prog, BPF_PROG_TYPE_KPROBE);
- return 0;
-}
-
static bool bpf_program__is_type(struct bpf_program *prog,
enum bpf_prog_type type)
{
return prog ? (prog->type == type) : false;
}

-bool bpf_program__is_tracepoint(struct bpf_program *prog)
-{
- return bpf_program__is_type(prog, BPF_PROG_TYPE_TRACEPOINT);
-}
-
-bool bpf_program__is_kprobe(struct bpf_program *prog)
-{
- return bpf_program__is_type(prog, BPF_PROG_TYPE_KPROBE);
-}
+#define BPF_PROG_TYPE_FNS(NAME, TYPE) \
+int bpf_program__set_##NAME(struct bpf_program *prog) \
+{ \
+ if (!prog) \
+ return -EINVAL; \
+ bpf_program__set_type(prog, TYPE); \
+ return 0; \
+} \
+ \
+bool bpf_program__is_##NAME(struct bpf_program *prog) \
+{ \
+ return bpf_program__is_type(prog, TYPE); \
+} \
+
+BPF_PROG_TYPE_FNS(kprobe, BPF_PROG_TYPE_KPROBE);
+BPF_PROG_TYPE_FNS(tracepoint, BPF_PROG_TYPE_TRACEPOINT);

int bpf_map__fd(struct bpf_map *map)
{
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:07 AM1/25/17
to
From: Andi Kleen <a...@linux.intel.com>

Add a generic disassembler function for x86 using the XED library, and a
fallback function for architectures that don't implement one. Other
architectures can implement their own disassembler functions.

The previous version of this patch used udis86, but was
rejected because udis86 was unmaintained and a runtime dependency.
Using the recently released xed avoids both of these problems:

- XED is well maintained, uptodate, and used by many Intel tools

- XED is linked statically so there is no runtime dependency.

The XED library can be downloaded from http://github.com/intelxed/xed

v2: Clean up includes.

Committer notes:

- Aligned struct member definitions;

- Added missing includes to dis.h

- Went back to the feature detection patch to make sure the xed
path is added to CFLAGS so that this patch can actually build.

Disable -Werror=old-style-declaration for tools/perf/arch/x86/util/dis.o
due to xed header problems, for which we don't have control:

$ gcc --version
gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
$ make XED=1
XED_DIR=/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64 O=/tmp/build/perf -C tools/perf

CC /tmp/build/perf/tests/openat-syscall.o
In file included from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-inst.h:41:0,
from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-decoded-inst.h:28,
from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-decode.h:24,
from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-interface.h:40,
from arch/x86/util/dis.c:3:
/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-iform-map.h:74:1: error: ‘inline’ is not at beginning of declaration [-Werror=old-style-declaration]
xed_iclass_enum_t XED_INLINE xed_iform_to_iclass(xed_iform_enum_t iform) {
^~~~~~~~~~~~~~~~~
In file included from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-interface.h:43:0,
from arch/x86/util/dis.c:3:

Ditto for -Werror=switch-enum:

In file included from /home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-interface.h:43:0,
from arch/x86/util/dis.c:8:
/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-state.h: In function ‘xed_state_get_address_width’:
/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-state.h:144:5: error: enumeration value ‘XED_MACHINE_MODE_INVALID’ not handled in switch [-Werror=switch-enum]
switch(xed_state_get_machine_mode(p)) {
^~~~~~
/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/include/xed/xed-state.h:144:5: error: enumeration value ‘XED_MACHINE_MODE_LAST’ not handled in switch [-Werror=switch-enum]
cc1: all warnings being treated as errors
mv: cannot stat '/tmp/build/perf/arch/x86/util/.dis.o.tmp': No such file or directory
/home/acme/git/linux/tools/build/Makefile.build:91: recipe for target '/tmp/build/perf/arch/x86/util/dis.o' failed

Now we have the static xed library linked with perf:

$ nm /tmp/build/perf/perf | grep "T xed" | head
0000000000682761 T xed3_decode_operands
0000000000682700 T xed3_dynamic_decode_part2
0000000000742002 T xed3_get_generic_operand
0000000000742ca4 T xed3_set_generic_operand
000000000068380e T xed3_static_decode
000000000064a6d0 T xed_attribute
000000000064a6ca T xed_attribute_max
00000000007409ab T xed_chip_enum_t2str
00000000007409d8 T xed_chip_enum_t_last
000000000074c8a7 T xed_dec_lu_ASZ_NONTERM_EASZ_MAP_MOD3_REXW_RM4_VEXVALID_VEX_PREFIX_VL
$

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017011901415...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/x86/util/Build | 3 ++
tools/perf/arch/x86/util/dis.c | 86 ++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/Build | 1 +
tools/perf/util/dis.c | 15 ++++++++
tools/perf/util/dis.h | 23 +++++++++++
5 files changed, 128 insertions(+)
create mode 100644 tools/perf/arch/x86/util/dis.c
create mode 100644 tools/perf/util/dis.c
create mode 100644 tools/perf/util/dis.h

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index f95e6f46ef0d..fd1dfd7c6321 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -14,3 +14,6 @@ libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
libperf-$(CONFIG_AUXTRACE) += auxtrace.o
libperf-$(CONFIG_AUXTRACE) += intel-pt.o
libperf-$(CONFIG_AUXTRACE) += intel-bts.o
+libperf-$(CONFIG_XED) += dis.o
+
+CFLAGS_dis.o += -Wno-old-style-declaration -Wno-switch-enum
diff --git a/tools/perf/arch/x86/util/dis.c b/tools/perf/arch/x86/util/dis.c
new file mode 100644
index 000000000000..39703512fe17
--- /dev/null
+++ b/tools/perf/arch/x86/util/dis.c
@@ -0,0 +1,86 @@
+/* Disassembler using the XED library */
+#include "perf.h"
+#include "util/session.h"
+#include "util/symbol.h"
+#include "util/thread.h"
+#include "util/dis.h"
+
+#include <xed/xed-interface.h>
+#include <xed/xed-decode.h>
+#include <xed/xed-decoded-inst-api.h>
+
+static int dis_resolve(xed_uint64_t addr, char *buf, xed_uint32_t buflen,
+ xed_uint64_t *off, void *data)
+{
+ struct perf_dis *x = data;
+ struct addr_location al;
+
+ memset(&al, 0, sizeof(struct addr_location));
+
+ thread__find_addr_map(x->thread, x->cpumode, MAP__FUNCTION, addr, &al);
+ if (!al.map)
+ thread__find_addr_map(x->thread, x->cpumode, MAP__VARIABLE,
+ addr, &al);
+ al.cpu = x->cpu;
+ al.sym = NULL;
+
+ if (al.map)
+ al.sym = map__find_symbol(al.map, al.addr);
+
+ if (!al.sym)
+ return 0;
+
+ if (al.addr < al.sym->end)
+ *off = al.addr - al.sym->start;
+ else
+ *off = al.addr - al.map->start - al.sym->start;
+ snprintf(buf, buflen, "%s", al.sym->name);
+ return 1;
+}
+
+/* x must be set up earlier */
+char *disas_inst(struct perf_dis *x, uint64_t ip, u8 *inbuf, int inlen,
+ int *lenp)
+{
+ xed_decoded_inst_t inst;
+ xed_print_info_t info;
+ xed_error_enum_t err;
+ static bool init;
+
+ if (!init) {
+ xed_tables_init();
+ init = true;
+ }
+
+ if (lenp)
+ *lenp = 0;
+
+ xed_init_print_info(&info);
+ info.syntax = XED_SYNTAX_ATT;
+ info.disassembly_callback = dis_resolve;
+ info.context = x;
+
+ xed_decoded_inst_zero(&inst);
+ if (x->is64bit)
+ xed_decoded_inst_set_mode(&inst, XED_MACHINE_MODE_LONG_64,
+ XED_ADDRESS_WIDTH_64b);
+ else
+ xed_decoded_inst_set_mode(&inst, XED_MACHINE_MODE_LEGACY_32,
+ XED_ADDRESS_WIDTH_32b);
+
+ err = xed_decode(&inst, (uint8_t *)inbuf, inlen);
+ if (err != XED_ERROR_NONE) {
+ snprintf(x->out, sizeof(x->out), "err: %s for %d bytes",
+ xed_error_enum_t2str(err), inlen);
+ return x->out;
+ }
+ if (lenp)
+ *lenp = xed_decoded_inst_get_length(&inst);
+ info.p = &inst;
+ info.buf = x->out;
+ info.blen = sizeof(x->out);
+ info.runtime_address = ip;
+ if (!xed_format_generic(&info))
+ strcpy(x->out, "err: cannot format");
+ return x->out;
+}
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 5da376bc1afc..cdaeb4764fee 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -88,6 +88,7 @@ libperf-y += mem-events.o
libperf-y += vsprintf.o
libperf-y += drv_configs.o
libperf-y += time-utils.o
+libperf-y += dis.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/dis.c b/tools/perf/util/dis.c
new file mode 100644
index 000000000000..61cf96fbcd1b
--- /dev/null
+++ b/tools/perf/util/dis.c
@@ -0,0 +1,15 @@
+#include "perf.h"
+#include "dis.h"
+#include "util.h"
+
+/* Fallback for architectures with no disassembler */
+
+__weak char *disas_inst(struct perf_dis *x, uint64_t ip __maybe_unused,
+ u8 *inbuf __maybe_unused, int inlen __maybe_unused,
+ int *lenp)
+{
+ if (lenp)
+ *lenp = 0;
+ strcpy(x->out, "?");
+ return x->out;
+}
diff --git a/tools/perf/util/dis.h b/tools/perf/util/dis.h
new file mode 100644
index 000000000000..79ff8d915d3b
--- /dev/null
+++ b/tools/perf/util/dis.h
@@ -0,0 +1,23 @@
+#ifndef DIS_H
+#define DIS_H 1
+
+#include <stdbool.h>
+#include <linux/types.h>
+
+struct thread;
+
+#define MAXINSN 15
+
+struct perf_dis {
+ /* Initialized by callers: */
+ struct thread *thread;
+ u8 cpumode;
+ int cpu;
+ bool is64bit;
+ /* Temporary */
+ char out[256];
+};
+
+char *disas_inst(struct perf_dis *x, uint64_t ip, u8 *inbuf, int inlen, int *lenp);
+
+#endif
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:09 AM1/25/17
to
From: Jiri Olsa <jo...@kernel.org>

Adding "Total records" column into cacheline pareto table, between
cycles and cpu info.

$ perf c2c report
...

--- ---------- cycles ---------- Total cpu
rmt hitm lcl hitm load records cnt
... ........ ........ ........ ....... ........

0 112 71 34 4
0 0 0 18 1
0 0 0 2 1
0 132 0 3 3

...

It's useful to see how many recorded samples represent each offset.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Don Zickus <dzi...@redhat.com>
Cc: Joe Mario <jma...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-4-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-c2c.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f8ca7a4ebabc..616cb1418c3f 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2476,6 +2476,7 @@ static int build_cl_output(char *cl_sort, bool no_source)
"mean_rmt,"
"mean_lcl,"
"mean_load,"
+ "tot_recs,"
"cpucnt,",
add_sym ? "symbol," : "",
add_dso ? "dso," : "",
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:10 AM1/25/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Do just like handling other cases i.e. print some debug message and
ignore the sample.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxy...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/scripting-engines/trace-event-perl.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index e55a132f69b7..014ecd6f67c4 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -350,8 +350,10 @@ static void perl_process_tracepoint(struct perf_sample *sample,
if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
return;

- if (!event)
- die("ug! no event found for type %" PRIu64, (u64)evsel->attr.config);
+ if (!event) {
+ pr_debug("ug! no event found for type %" PRIu64, (u64)evsel->attr.config);
+ return;
+ }

pid = raw_field_value(event, "common_pid", data);

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:11 AM1/25/17
to
From: Andi Kleen <a...@linux.intel.com>

Can be downloaded from https://github.com/intelxed/xed

v2: Hide. Require XED=1 to enable. Add XED_DIR
v3: Remove -lxed from probe all. Don't touch FEATURE_DISPLAY.
v4: Move to FEATURE_FLAGS_BASIC

Committer Notes:

From https://intelxed.github.io/:

---
The X86 Encoder Decoder (XED), is a software library (and associated
headers) for encoding and decoding X86 (IA32 and Intel64) instructions.
The decoder takes sequences of 1-15 bytes along with machine mode
information and produces a data structure describing the opcode,
operands, and flags. The encoder takes a similar data structure and
produces a sequence of 1 to 15 bytes. Disassembly is essentially a
printing pass on the data structure.

XED is the encoder/decoder used by Pin and the Intel® Software
Development Emulator (SDE). XED is used by the ASIM performance model,
ZSIM, Intel® VTune Amplifier, IACA, as well as in projects at McAfee,
Wind River, and many other projects inside and outside Intel.

XED supports the notion of layers of instructions. As new instructions
are made public corresponding to future processors, the associated
layers are added to the tables of instructions in the "datafiles"
subdirectory. (Intel announces and documents instructions in the ISE and
SDM documents typically).
---

XED License: Apache, version 2.0.

Also remove it from test-all.c, since it is not normally available on
distros at this time and not selected by default.

For the same reason, move it from FEATURE_TESTS_BASIC to
FEATURE_TESTS_EXTRA, to avoid always trying to build
tools/build/feature/test-xed.c.

One has to define XED=1 to build WITH the XED disassembler to be used in
'perf script'. It is also possible to set XED_DIR=/path to set the XED
directory.

Added the XED_DIR to CFLAGS in addition to the feature detection
specific CFLAGS, so that the following patches can work.

Testing it:

WITHOUT XED in the system, NOT explicitely selecting it:

$ make O=/tmp/build/perf -C tools/perf install-bin
$ cat /tmp/build/perf/feature/test-xed.make.output
cat: /tmp/build/perf/feature/test-xed.make.output: No such file or directory
$

The feature test is NOT performed, ok.

WITHOUT XED in the system, EXPLICITELY selecting it:

$ make XED=1 O=/tmp/build/perf -C tools/perf install-bin
$ cat /tmp/build/perf/feature/test-xed.make.output
test-xed.c:1:31: fatal error: xed/xed-interface.h: No such file or directory
#include <xed/xed-interface.h>
^
compilation terminated.
$

Ok, the test _IS_ performed, the required devel files are not found, the
build is done without XED.

Now lets try building and installing xed in this system:

$ git clone https://github.com/intelxed/mbuild.git
$ git clone https://github.com/intelxed/xed.git
$ cd xed
$ ./mfile.py
$ ./mfile.py install

And lets try enabling XED, informing where 'mfile.py install' installed
it:

$ make XED=1 XED_DIR=/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/ O=/tmp/build/perf -C tools/perf install-bin
$ ls -la /tmp/build/perf/feature/test-xed.make.output
-rw-rw-r--. 1 acme acme 0 Jan 23 10:57 /tmp/build/perf/feature/test-xed.make.output
$

But it still doesn't link with
/home/acme/git/xed/kits/xed-install-base-2017-01-23-lin-x86-64/lib/libxed.a
because the feature will only be used in the following csets.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Link: http://lkml.kernel.org/r/2017011901415...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/build/Makefile.feature | 4 +++-
tools/build/feature/Makefile | 6 +++++-
tools/build/feature/test-all.c | 14 ++++++++++++++
tools/build/feature/test-xed.c | 9 +++++++++
tools/perf/Makefile.config | 19 +++++++++++++++++++
tools/perf/Makefile.perf | 3 +++
6 files changed, 53 insertions(+), 2 deletions(-)
create mode 100644 tools/build/feature/test-xed.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index e3fb5ecbdcb6..a3fa78296cb5 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -78,7 +78,8 @@ FEATURE_TESTS_EXTRA := \
liberty-z \
libunwind-debug-frame \
libunwind-debug-frame-arm \
- libunwind-debug-frame-aarch64
+ libunwind-debug-frame-aarch64 \
+ xed

FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)

@@ -140,6 +141,7 @@ ifeq ($(feature-all), 1)
$(call feature_check,compile-x32)
$(call feature_check,bionic)
$(call feature_check,libbabeltrace)
+ $(call feature_check,xed)
else
$(foreach feat,$(FEATURE_TESTS),$(call feature_check,$(feat)))
endif
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index b564a2eea039..4f1aa82b867a 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -48,7 +48,8 @@ FILES= \
test-get_cpuid.bin \
test-sdt.bin \
test-cxx.bin \
- test-jvmti.bin
+ test-jvmti.bin \
+ test-xed.bin

FILES := $(addprefix $(OUTPUT),$(FILES))

@@ -123,6 +124,9 @@ $(OUTPUT)test-numa_num_possible_cpus.bin:
$(OUTPUT)test-libunwind.bin:
$(BUILD) -lelf

+$(OUTPUT)test-xed.bin:
+ $(BUILD) -lxed
+
$(OUTPUT)test-libunwind-debug-frame.bin:
$(BUILD) -lelf
$(OUTPUT)test-libunwind-x86.bin:
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 699e43627397..cffe9fcda7c4 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -129,6 +129,19 @@
#undef main
#endif

+#if 0
+/*
+ * Disable xed check for test-all, because it is not available
+ * in most distributions. Will reenable when it becomes generally
+ * available in major distros.
+ */
+#define main main_test_xed
+# include "test-xed.c"
+#endif
+#else
+#define main_test_xed while(0)
+#endif
+
#define main main_test_lzma
# include "test-lzma.c"
#undef main
@@ -183,6 +196,7 @@ int main(int argc, char *argv[])
main_test_bpf();
main_test_libcrypto();
main_test_sdt();
+ main_test_xed();

return 0;
}
diff --git a/tools/build/feature/test-xed.c b/tools/build/feature/test-xed.c
new file mode 100644
index 000000000000..ef9aebf1559d
--- /dev/null
+++ b/tools/build/feature/test-xed.c
@@ -0,0 +1,9 @@
+#include <xed/xed-interface.h>
+#include <xed/xed-decode.h>
+#include <xed/xed-decoded-inst-api.h>
+
+int main(void)
+{
+ xed_tables_init();
+ return 0;
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 03cf947755b9..fd798d118739 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -684,6 +684,25 @@ ifndef NO_ZLIB
endif
endif

+ifdef XED
+ ifdef XED_DIR
+ XED_CFLAGS := -DHAVE_XED_SUPPORT -I$(XED_DIR)/include
+ XED_LDFLAGS := -L$(XED_DIR)/lib
+ endif
+ FEATURE_CHECK_CFLAGS-xed := $(XED_CFLAGS)
+ # override for lib64?
+ FEATURE_CHECK_LDFLAGS-xed := $(XED_LDFLAGS) -lxed
+ $(call feature_check,xed)
+ ifeq ($(feature-xed), 1)
+ CFLAGS += $(XED_CFLAGS)
+ LDFLAGS += $(XED_LDFLAGS)
+ EXTLIBS += -lxed
+ $(call detected,CONFIG_XED)
+ else
+ msg := $(warning No xed found, disables x86 disassembler support, please install xed);
+ endif
+endif
+
ifndef NO_LZMA
ifeq ($(feature-lzma), 1)
CFLAGS += -DHAVE_LZMA_SUPPORT
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 4da19b6ba94a..f6760309edd3 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -84,6 +84,9 @@ include ../scripts/utilities.mak
# Define NO_SDT if you do not want to define SDT event in perf tools,
# note that it doesn't disable SDT scanning support.
#
+# Define XED=1 to build WITH the XED disassembler for perf script
+# Can also set XED_DIR=/path to set XED directory.
+#
# Define FEATURES_DUMP to provide features detection dump file
# and bypass the feature detection
#
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:11 AM1/25/17
to
From: Andi Kleen <a...@linux.intel.com>

Implement printing full disassembled sequences for branch stacks in perf
script.

This allows to directly print hot paths for individual samples, together
with branch misprediction and cycle count / IPC information if available
(on Skylake systems).

This only works when no special branch filters are specified.

E.g.:

% perf record -b ...
% perf script -F brstackasm
...
000055b55d1147d0 pushq %rbp
000055b55d1147d1 pushq %r15
000055b55d1147d3 pushq %r14
000055b55d1147d5 pushq %r13
000055b55d1147d7 pushq %r12
000055b55d1147d9 pushq %rbx
000055b55d1147da sub $0x18, %rsp
000055b55d1147de mov %r8, %r13
000055b55d1147e1 mov %rcx, %rbp
000055b55d1147e4 mov %rdx, %r14
000055b55d1147e7 mov %rsi, %r15
000055b55d1147ea mov %rdi, %rbx
000055b55d1147ed movl $0x0, 0xc(%rsp)
000055b55d1147f5 movq (%rbp), %rax
000055b55d1147f9 test $0x1, %al
000055b55d1147fb jnz 0x55b55d114890 # PRED 4 cycles 3.75 IPC
000055b55d114890 mov %eax, %ecx
000055b55d114892 and $0x3, %ecx
000055b55d114895 cmp $0x1, %rcx
000055b55d114899 jnz 0x55b55d1148f8
000055b55d11489b movq -0x1(%rax), %rcx
000055b55d11489f cmpb $0x81, 0xb(%rcx)
000055b55d1148a3 jnz 0x55b55d1148fe # PRED 1 cycles 6.00 IPC
...

Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.

v2:
Use low level abstracted disassembler interface.
Print symbols and source lines as labels.
Print first jump in LBR too.
Patch up blocks with filtered ring transfers.
Show IPC
Lots of cleanups and improvements.

v3:
Print special message for branches frozen early.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017011901415...@firstfloor.org
[ Fix up u64 formatting on PRIu64, use {} on multiline if/else blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 13 +-
tools/perf/builtin-script.c | 268 +++++++++++++++++++++++++++++++
2 files changed, 279 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 497989ea9768..15a80815941e 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, brstack, brstacksym, flags, bpf-output, asm.
+ srcline, period, iregs, brstack, brstacksym, flags, bpf-output, asm, brstackasm,
callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -189,17 +189,22 @@ OPTIONS
i.e., -F "" is not allowed.

The brstack output includes branch related information with raw addresses using the
- /v/v/v/v/ syntax in the following order:
+ /v/v/v/v/cycles syntax in the following order:
FROM: branch source instruction
TO : branch target instruction
M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
X/- : X=branch inside a transactional region, -=not in transaction region or not supported
A/- : A=TSX abort entry, -=not aborted region or not supported
+ cycles

The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.

When asm is specified the assembler instruction of each sample is printed in disassembled form.

+ When brstackasm is specified the full assembler sequences of branch sequences for each sample
+ is printed. This is the full execution path leading to the sample. This is only supported when the
+ sample was recorded with perf record -b or -j any.
+
-k::
--vmlinux=<file>::
vmlinux pathname
@@ -301,6 +306,10 @@ include::itrace.txt[]
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.

+--max-blocks::
+ Set the maximum number of program blocks to print with brstackasm for
+ each sample.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7a09c4f7df3f..512d298031b9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -43,6 +43,7 @@ static bool nanosecs;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
+static int max_blocks;

unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;

@@ -71,6 +72,7 @@ enum perf_output_field {
PERF_OUTPUT_INSN = 1U << 21,
PERF_OUTPUT_INSNLEN = 1U << 22,
PERF_OUTPUT_ASM = 1U << 23,
+ PERF_OUTPUT_BRSTACKASM = 1U << 24,
};

struct output_option {
@@ -101,6 +103,7 @@ struct output_option {
{.str = "insn", .field = PERF_OUTPUT_INSN},
{.str = "insnlen", .field = PERF_OUTPUT_INSNLEN},
{.str = "asm", .field = PERF_OUTPUT_ASM},
+ {.str = "brstackasm", .field = PERF_OUTPUT_BRSTACKASM},
};

/* default set to maintain compatibility with current format */
@@ -300,6 +303,13 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected.\n");
return -EINVAL;
}
+ if (PRINT_FIELD(BRSTACKASM) &&
+ !(perf_evlist__combined_branch_type(session->evlist) &
+ PERF_SAMPLE_BRANCH_ANY)) {
+ pr_err("Display of branch stack assembler requested, but non all-branch filter set\n");
+ return -EINVAL;
+ }
+
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID",
PERF_OUTPUT_TID|PERF_OUTPUT_PID))
@@ -586,6 +596,260 @@ static void print_sample_brstacksym(struct perf_sample *sample,
}
}

+#define MAXBB 16384UL
+
+static int grab_bb(u8 *buffer, u64 start, u64 end,
+ struct machine *machine, struct thread *thread,
+ bool *is64bit, u8 *cpumode, bool last)
+{
+ int offset, len;
+ struct addr_location al;
+ bool kernel;
+
+ if (!start || !end)
+ return 0;
+
+ kernel = machine__kernel_ip(machine, start);
+ if (kernel)
+ *cpumode = PERF_RECORD_MISC_KERNEL;
+ else
+ *cpumode = PERF_RECORD_MISC_USER;
+
+ /*
+ * Block overlaps between kernel and user.
+ * This can happen due to ring filtering
+ * On Intel CPUs the entry into the kernel is filtered,
+ * but the exit is not. Let the caller patch it up.
+ */
+ if (kernel != machine__kernel_ip(machine, end)) {
+ printf("\tblock %" PRIx64 "-%" PRIx64 " transfers between kernel and user\n",
+ start, end);
+ return -ENXIO;
+ }
+
+ memset(&al, 0, sizeof(al));
+ if (end - start > MAXBB - MAXINSN) {
+ if (last) {
+ printf("\tbrstack does not reach to final jump (%" PRIx64 "-%" PRIx64 ")\n",
+ start, end);
+ } else {
+ printf("\tblock %" PRIx64 "-%" PRIx64 " (%" PRIu64 ") too long to dump\n",
+ start, end, end - start);
+ }
+ return 0;
+ }
+
+ thread__find_addr_map(thread, *cpumode, MAP__FUNCTION, start, &al);
+ if (!al.map || !al.map->dso) {
+ printf("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n",
+ start, end);
+ return 0;
+ }
+ if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR) {
+ printf("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n",
+ start, end);
+ return 0;
+ }
+
+ /* Load maps to ensure dso->is_64_bit has been updated */
+ map__load(al.map);
+
+ offset = al.map->map_ip(al.map, start);
+ len = dso__data_read_offset(al.map->dso, machine,
+ offset, (u8 *)buffer,
+ end - start + MAXINSN);
+
+ *is64bit = al.map->dso->is_64_bit;
+ if (len <= 0)
+ printf("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n",
+ start, end);
+ return len;
+}
+
+static void print_jump(uint64_t ip, struct branch_entry *en,
+ struct perf_dis *x, u8 *inbuf, int len,
+ int insn)
+{
+ printf("\t%016" PRIx64 "\t%-30s\t#%s%s%s%s",
+ ip,
+ disas_inst(x, ip, inbuf, len, NULL),
+ en->flags.predicted ? " PRED" : "",
+ en->flags.mispred ? " MISPRED" : "",
+ en->flags.in_tx ? " INTX" : "",
+ en->flags.abort ? " ABORT" : "");
+ if (en->flags.cycles) {
+ printf(" %d cycles", en->flags.cycles);
+ if (insn)
+ printf(" %.2f IPC", (float)insn / en->flags.cycles);
+ }
+ putchar('\n');
+}
+
+static void print_ip_sym(struct thread *thread,
+ u8 cpumode, int cpu,
+ uint64_t addr,
+ struct symbol **lastsym,
+ struct perf_event_attr *attr)
+{
+ struct addr_location al;
+ int off;
+
+ memset(&al, 0, sizeof(struct addr_location));
+
+ thread__find_addr_map(thread, cpumode, MAP__FUNCTION, addr, &al);
+ if (!al.map)
+ thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
+ addr, &al);
+ if ((*lastsym) && al.addr >= (*lastsym)->start && al.addr < (*lastsym)->end)
+ return;
+
+ al.cpu = cpu;
+ al.sym = NULL;
+ if (al.map)
+ al.sym = map__find_symbol(al.map, al.addr);
+
+ if (!al.sym)
+ return;
+
+ if (al.addr < al.sym->end)
+ off = al.addr - al.sym->start;
+ else
+ off = al.addr - al.map->start - al.sym->start;
+ printf("\t%s", al.sym->name);
+ if (off)
+ printf("%+d", off);
+ putchar(':');
+ if (PRINT_FIELD(SRCLINE))
+ map__fprintf_srcline(al.map, al.addr, "\t", stdout);
+ putchar('\n');
+ *lastsym = al.sym;
+}
+
+static void print_sample_brstackasm(struct perf_sample *sample,
+ struct thread *thread,
+ struct perf_event_attr *attr,
+ struct machine *machine)
+{
+ struct branch_stack *br = sample->branch_stack;
+ u64 start, end;
+ int i, insn;
+ struct perf_dis x;
+ u8 buffer[MAXBB];
+ int len;
+ int nr;
+ unsigned off;
+ int ilen;
+ struct symbol *lastsym = NULL;
+
+ if (!(br && br->nr))
+ return;
+ nr = br->nr;
+ if (max_blocks && nr > max_blocks + 1)
+ nr = max_blocks + 1;
+
+ x.thread = thread;
+ x.cpu = sample->cpu;
+
+ putchar('\n');
+
+ /* Handle first from jump, of which we don't know the entry. */
+ len = grab_bb(buffer, br->entries[nr-1].from,
+ br->entries[nr-1].from,
+ machine, thread, &x.is64bit, &x.cpumode, false);
+ if (len > 0) {
+ print_ip_sym(thread, x.cpumode, x.cpu,
+ br->entries[nr - 1].from,
+ &lastsym, attr);
+ print_jump(br->entries[nr - 1].from, &br->entries[nr - 1],
+ &x, buffer, len, 0);
+ }
+
+ /* Print all blocks */
+ for (i = nr - 2; i >= 0; i--) {
+ if (br->entries[i].from || br->entries[i].to)
+ pr_debug("%d: %" PRIx64 "-%" PRIx64 "\n", i,
+ br->entries[i].from,
+ br->entries[i].to);
+ start = br->entries[i + 1].to;
+ end = br->entries[i].from;
+
+ len = grab_bb(buffer, start, end,
+ machine, thread, &x.is64bit,
+ &x.cpumode, false);
+ /* Patch up missing kernel transfers due to ring filters */
+ if (len == -ENXIO && i > 0) {
+ end = br->entries[--i].from;
+ pr_debug("\tpatching up to %" PRIx64 "-%" PRIx64 "\n",
+ start, end);
+ len = grab_bb(buffer, start, end,
+ machine, thread, &x.is64bit,
+ &x.cpumode, false);
+ }
+ if (len <= 0)
+ continue;
+
+ insn = 0;
+ for (off = 0;; off += ilen) {
+ uint64_t ip = start + off;
+
+ print_ip_sym(thread, x.cpumode, x.cpu,
+ ip,
+ &lastsym, attr);
+ if (ip == end) {
+ print_jump(ip, &br->entries[i], &x,
+ buffer + off,
+ len - off, insn);
+ break;
+ } else {
+ printf("\t%016" PRIx64 "\t%s\n", ip,
+ disas_inst(&x, ip, buffer + off,
+ len - off, &ilen));
+ if (ilen == 0)
+ break;
+ insn++;
+ }
+ }
+ }
+
+ /*
+ * Hit the branch? In this case we are already done, and the target
+ * has not been executed yet.
+ */
+ if (br->entries[0].from == sample->ip)
+ return;
+ if (br->entries[0].flags.abort)
+ return;
+
+ /*
+ * Print final block upto sample
+ */
+ start = br->entries[0].to;
+ end = sample->ip;
+ len = grab_bb(buffer, start, end, machine, thread, &x.is64bit,
+ &x.cpumode, true);
+ print_ip_sym(thread, x.cpumode, x.cpu,
+ start,
+ &lastsym, attr);
+ if (len <= 0) {
+ /* Print at least last IP if basic block did not work */
+ len = grab_bb(buffer, sample->ip, sample->ip,
+ machine, thread, &x.is64bit, &x.cpumode,
+ false);
+ if (len <= 0)
+ return;
+
+ printf("\t%016" PRIx64 "\t%s\n", sample->ip,
+ disas_inst(&x, sample->ip, buffer, len, NULL));
+ return;
+ }
+ for (off = 0; off <= end - start; off += ilen) {
+ printf("\t%016" PRIx64 "\t%s\n", start + off,
+ disas_inst(&x, start + off, buffer + off,
+ len - off, &ilen));
+ if (ilen == 0)
+ break;
+ }
+}

static void print_sample_addr(struct perf_sample *sample,
struct thread *thread,
@@ -689,6 +953,8 @@ static void print_insn(union perf_event *event,
}
if (PRINT_FIELD(ASM))
print_sample_asm(event, sample, thread, al, machine);
+ if (PRINT_FIELD(BRSTACKASM))
+ print_sample_brstackasm(sample, thread, attr, machine);
}

static void print_sample_bts(union perf_event *event,
@@ -2231,6 +2497,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('\0', "show-switch-events", &script.show_switch_events,
"Show context switch events (if recorded)"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
+ OPT_INTEGER(0, "max-blocks", &max_blocks,
+ "Maximum number of code blocks to dump with brstackasm"),
OPT_BOOLEAN(0, "ns", &nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:13 AM1/25/17
to
From: Jiri Olsa <jo...@kernel.org>

It will be used in following patch to expand or collapse only the
current browser entry.

Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-2-...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/hists.c | 43 ++++++++++++++++++++++++------------------
1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 641b40234a9d..8bf18afe2a1f 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -501,8 +501,8 @@ static int hierarchy_set_folding(struct hist_browser *hb, struct hist_entry *he,
return n;
}

-static void hist_entry__set_folding(struct hist_entry *he,
- struct hist_browser *hb, bool unfold)
+static void __hist_entry__set_folding(struct hist_entry *he,
+ struct hist_browser *hb, bool unfold)
{
hist_entry__init_have_children(he);
he->unfolded = unfold ? he->has_children : false;
@@ -520,12 +520,34 @@ static void hist_entry__set_folding(struct hist_entry *he,
he->nr_rows = 0;
}

+static void hist_entry__set_folding(struct hist_entry *he,
+ struct hist_browser *browser, bool unfold)
+{
+ double percent;
+
+ percent = hist_entry__get_percent_limit(he);
+ if (he->filtered || percent < browser->min_pcnt)
+ return;
+
+ __hist_entry__set_folding(he, browser, unfold);
+
+ if (!he->depth || unfold)
+ browser->nr_hierarchy_entries++;
+ if (he->leaf)
+ browser->nr_callchain_rows += he->nr_rows;
+ else if (unfold && !hist_entry__has_hierarchy_children(he, browser->min_pcnt)) {
+ browser->nr_hierarchy_entries++;
+ he->has_no_entry = true;
+ he->nr_rows = 1;
+ } else
+ he->has_no_entry = false;
+}
+
static void
__hist_browser__set_folding(struct hist_browser *browser, bool unfold)
{
struct rb_node *nd;
struct hist_entry *he;
- double percent;

nd = rb_first(&browser->hists->entries);
while (nd) {
@@ -535,21 +557,6 @@ __hist_browser__set_folding(struct hist_browser *browser, bool unfold)
nd = __rb_hierarchy_next(nd, HMD_FORCE_CHILD);

hist_entry__set_folding(he, browser, unfold);
-
- percent = hist_entry__get_percent_limit(he);
- if (he->filtered || percent < browser->min_pcnt)
- continue;
-
- if (!he->depth || unfold)
- browser->nr_hierarchy_entries++;
- if (he->leaf)
- browser->nr_callchain_rows += he->nr_rows;
- else if (unfold && !hist_entry__has_hierarchy_children(he, browser->min_pcnt)) {
- browser->nr_hierarchy_entries++;
- he->has_no_entry = true;
- he->nr_rows = 1;
- } else
- he->has_no_entry = false;
}
}

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:36 AM1/25/17
to
From: Namhyung Kim <namhyu...@lge.com>

It's helpful for debugging on tracing features.

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>,
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/n/tip-rjysr9ljie...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 4 ++-
tools/perf/util/trace-event-read.c | 53 ++++++++++++++++++++++++++------------
2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 00fd8a8850d3..c567d9f0aa92 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2803,8 +2803,10 @@ static int perf_evsel__prepare_tracepoint_event(struct perf_evsel *evsel,
}

event = pevent_find_event(pevent, evsel->attr.config);
- if (event == NULL)
+ if (event == NULL) {
+ pr_debug("cannot find event format for %d\n", (int)evsel->attr.config);
return -1;
+ }

if (!evsel->name) {
snprintf(bf, sizeof(bf), "%s:%s", event->system, event->name);
diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c
index 61c7162c31c7..27420159bf69 100644
--- a/tools/perf/util/trace-event-read.c
+++ b/tools/perf/util/trace-event-read.c
@@ -260,39 +260,53 @@ static int read_header_files(struct pevent *pevent)

static int read_ftrace_file(struct pevent *pevent, unsigned long long size)
{
+ int ret;
char *buf;

buf = malloc(size);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
- free(buf);
- return -1;
+ ret = do_read(buf, size);
+ if (ret < 0) {
+ pr_debug("error reading ftrace file.\n");
+ goto out;
}

- parse_ftrace_file(pevent, buf, size);
+ ret = parse_ftrace_file(pevent, buf, size);
+ if (ret < 0)
+ pr_debug("error parsing ftrace file.\n");
+out:
free(buf);
- return 0;
+ return ret;
}

static int read_event_file(struct pevent *pevent, char *sys,
unsigned long long size)
{
+ int ret;
char *buf;

buf = malloc(size);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
+ ret = do_read(buf, size);
+ if (ret < 0) {
free(buf);
- return -1;
+ goto out;
}

- parse_event_file(pevent, buf, size, sys);
+ ret = parse_event_file(pevent, buf, size, sys);
+ if (ret < 0)
+ pr_debug("error parsing event file.\n");
+out:
free(buf);
- return 0;
+ return ret;
}

static int read_ftrace_files(struct pevent *pevent)
@@ -345,6 +359,7 @@ static int read_saved_cmdline(struct pevent *pevent)
{
unsigned long long size;
char *buf;
+ int ret;

/* it can have 0 size */
size = read8(pevent);
@@ -352,18 +367,22 @@ static int read_saved_cmdline(struct pevent *pevent)
return 0;

buf = malloc(size + 1);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
- free(buf);
- return -1;
+ ret = do_read(buf, size);
+ if (ret < 0) {
+ pr_debug("error reading saved cmdlines\n");
+ goto out;
}

parse_saved_cmdline(pevent, buf, size);
-
+ ret = 0;
+out:
free(buf);
- return 0;
+ return ret;
}

ssize_t trace_report(int fd, struct trace_event *tevent, bool __repipe)
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:36 AM1/25/17
to
From: Andi Kleen <a...@linux.intel.com>

When dumping PT traces with perf script it is very useful to see the
assembler for each sample, so that it is easily possible to follow
the control flow.

As using objdump is difficult and inefficient from perf script this
patch uses the Intel xed library to implement assembler output.
The library can be downloaded from http://github.com/intelxed/xed

The previous version of this patch used udis86, but was
rejected because udis86 was unmaintained and a runtime dependency.
Using the recently released xed avoids both of these problems:
- XED is well maintained and used by many Intel tools
- XED is linked statically so there is no runtime dependency.

The library is probed as an external dependency in the usual way. Then perf
script calls into it when needed, and handles callbacks to resolve
symbols.

% perf record -e intel_pt//u true
% perf script -F sym,symoff,ip,asm --itrace=i0ns | head
7fc7188b4190 _start+0x0 mov %rsp, %rdi
7fc7188b4193 _start+0x3 call _dl_start
7fc7188b7710 _dl_start+0x0 push %rbp
7fc7188b7711 _dl_start+0x1 mov %rsp, %rbp
7fc7188b7714 _dl_start+0x4 push %r15
7fc7188b7716 _dl_start+0x6 push %r14
7fc7188b7718 _dl_start+0x8 push %r13
7fc7188b771a _dl_start+0xa push %r12
7fc7188b771c _dl_start+0xc mov %rdi, %r12
7fc7188b771f _dl_start+0xf push %rbx

v2:
Converted to use XED instead of udis86.
Separate disassembler interface into separate arch specific file.
Lots of cleanups and improvements.

Committer notes:

Testing it:

Disassembly of perf_evsel__enable() bits referenced in the samples
collected for this test

'perf script' with this patch: DIFF objdump -d

4b8506 jz 0x4b84d0 <perf_evsel__enable+0x70> 74 c8 je 4b84d0 <perf_evsel__enable+0x70>
4b84d0 add $0x1, %r14 add $0x1,%r14
4b84d4 cmp %r14d, %ebx cmp %r14d,%ebx
4b84d7 jle 0x4b8530 <perf_evsel__enable+0xd0> jle 4b8530 <perf_evsel__enable+0xd0>
4b8530 add $0x1, %r12 add $0x1,%r12
4b8534 cmp %r12d, %r13d cmp %r12d,%r13d
4b8537 jnle 0x4b84c2 <perf_evsel__enable+0x62> 7f 89 jg 4b84c2 <perf_evsel__enable+0x62>
4b84c2 xor %r14d, %r14d xor %r14d,%r14d
4b84c5 test %ebx, %ebx test %ebx,%ebx
4b84c7 jnle 0x4b84d9 <perf_evsel__enable+0x79> 7f 10 jg 4b84d9 <perf_evsel__enable+0x79>
4b84d9 movq 0x90(%r15), %rax 49 8b 87 90 00 00 00 mov 0x90(%r15),%rax
4b84e0 mov %r12, %rdx mov %r12,%rdx
4b84e3 mov %r14, %rcx mov %r14,%rcx
4b84e6 mov $0x2400, %esi mov $0x2400,%esi
4b84eb imulq (%rax), %rdx 48 0f af 10 imul (%rax),%rdx
4b84ef imulq 0x8(%rax), %rcx 48 0f af 48 08 imul 0x8(%rax),%rcx
4b84f4 add %rdx, %rax add %rdx,%rax
4b84f7 xor %edx, %edx xor %edx,%edx
4b84f9 movl 0x18(%rcx,%rax,1), %edi 8b 7c 01 18 mov 0x18(%rcx,%rax,1),%edi
4b84fd xor %eax, %eax xor %eax,%eax
4b84ff callq 0x42d990 <ioctl@plt> callq 42d990 <ioctl@plt>
4b8504 test %eax, %eax test %eax,%eax
4b8506 jz 0x4b84d0 <perf_evsel__enable+0x70> 74 c8 je 4b84d0 <perf_evsel__enable+0x70>

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Adrian Hunter <adrian...@intel.com>
Link: http://lkml.kernel.org/r/2017011901415...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 4 +-
tools/perf/builtin-script.c | 72 +++++++++++++++++++++++++++-----
2 files changed, 64 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 4ed5f239ba7d..497989ea9768 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, brstack, brstacksym, flags, bpf-output,
+ srcline, period, iregs, brstack, brstacksym, flags, bpf-output, asm.
callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -198,6 +198,8 @@ OPTIONS

The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.

+ When asm is specified the assembler instruction of each sample is printed in disassembled form.
+
-k::
--vmlinux=<file>::
vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c0783b4f7b6c..7a09c4f7df3f 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -28,6 +28,7 @@
#include <linux/time64.h>
#include "asm/bug.h"
#include "util/mem-events.h"
+#include "util/dis.h"

static char const *script_name;
static char const *generate_script_lang;
@@ -69,6 +70,7 @@ enum perf_output_field {
PERF_OUTPUT_CALLINDENT = 1U << 20,
PERF_OUTPUT_INSN = 1U << 21,
PERF_OUTPUT_INSNLEN = 1U << 22,
+ PERF_OUTPUT_ASM = 1U << 23,
};

struct output_option {
@@ -98,6 +100,7 @@ struct output_option {
{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
{.str = "insn", .field = PERF_OUTPUT_INSN},
{.str = "insnlen", .field = PERF_OUTPUT_INSNLEN},
+ {.str = "asm", .field = PERF_OUTPUT_ASM},
};

/* default set to maintain compatibility with current format */
@@ -292,7 +295,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected. Hence, no address to lookup the source line number.\n");
return -EINVAL;
}
-
+ if (PRINT_FIELD(ASM) && !PRINT_FIELD(IP)) {
+ pr_err("Display of assembler requested but sample IP is not\n"
+ "selected.\n");
+ return -EINVAL;
+ }
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID",
PERF_OUTPUT_TID|PERF_OUTPUT_PID))
@@ -436,6 +443,39 @@ static void print_sample_iregs(struct perf_sample *sample,
}
}

+static void print_sample_asm(union perf_event *event,
+ struct perf_sample *sample,
+ struct thread *thread,
+ struct addr_location *al,
+ struct machine *machine)
+{
+ struct perf_dis x;
+ u8 buffer[32];
+ int len;
+ u64 offset;
+
+ x.thread = thread;
+ x.cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+ x.cpu = sample->cpu;
+
+ if (!al->map || !al->map->dso)
+ return;
+ if (al->map->dso->data.status == DSO_DATA_STATUS_ERROR)
+ return;
+
+ /* Load maps to ensure dso->is_64_bit has been updated */
+ map__load(al->map);
+ x.is64bit = al->map->dso->is_64_bit;
+
+ offset = al->map->map_ip(al->map, sample->ip);
+ len = dso__data_read_offset(al->map->dso, machine,
+ offset, buffer, MAXINSN);
+ if (len <= 0)
+ return;
+
+ printf("\t%s", disas_inst(&x, sample->ip, buffer, len, NULL));
+}
+
static void print_sample_start(struct perf_sample *sample,
struct thread *thread,
struct perf_evsel *evsel)
@@ -631,8 +671,12 @@ static void print_sample_callindent(struct perf_sample *sample,
printf("%*s", spacing - len, "");
}

-static void print_insn(struct perf_sample *sample,
- struct perf_event_attr *attr)
+static void print_insn(union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_event_attr *attr,
+ struct thread *thread,
+ struct addr_location *al,
+ struct machine *machine)
{
if (PRINT_FIELD(INSNLEN))
printf(" ilen: %d", sample->insn_len);
@@ -643,12 +687,16 @@ static void print_insn(struct perf_sample *sample,
for (i = 0; i < sample->insn_len; i++)
printf(" %02x", (unsigned char)sample->insn[i]);
}
+ if (PRINT_FIELD(ASM))
+ print_sample_asm(event, sample, thread, al, machine);
}

-static void print_sample_bts(struct perf_sample *sample,
+static void print_sample_bts(union perf_event *event,
+ struct perf_sample *sample,
struct perf_evsel *evsel,
struct thread *thread,
- struct addr_location *al)
+ struct addr_location *al,
+ struct machine *machine)
{
struct perf_event_attr *attr = &evsel->attr;
bool print_srcline_last = false;
@@ -689,7 +737,7 @@ static void print_sample_bts(struct perf_sample *sample,
if (print_srcline_last)
map__fprintf_srcline(al->map, al->addr, "\n ", stdout);

- print_insn(sample, attr);
+ print_insn(event, sample, attr, thread, al, machine);

printf("\n");
}
@@ -871,7 +919,9 @@ static size_t data_src__printf(u64 data_src)

static void process_event(struct perf_script *script,
struct perf_sample *sample, struct perf_evsel *evsel,
- struct addr_location *al)
+ struct addr_location *al,
+ struct machine *machine,
+ union perf_event *event)
{
struct thread *thread = al->thread;
struct perf_event_attr *attr = &evsel->attr;
@@ -898,7 +948,7 @@ static void process_event(struct perf_script *script,
print_sample_flags(sample->flags);

if (is_bts_event(attr)) {
- print_sample_bts(sample, evsel, thread, al);
+ print_sample_bts(event, sample, evsel, thread, al, machine);
return;
}

@@ -936,7 +986,7 @@ static void process_event(struct perf_script *script,

if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
print_sample_bpf_output(sample);
- print_insn(sample, attr);
+ print_insn(event, sample, attr, thread, al, machine);
printf("\n");
}

@@ -1046,7 +1096,7 @@ static int process_sample_event(struct perf_tool *tool,
if (scripting_ops)
scripting_ops->process_event(event, sample, evsel, &al);
else
- process_event(scr, sample, evsel, &al);
+ process_event(scr, sample, evsel, &al, machine, event);

out_put:
addr_location__put(&al);
@@ -2152,7 +2202,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"Valid types: hw,sw,trace,raw. "
"Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
"addr,symoff,period,iregs,brstack,brstacksym,flags,"
- "bpf-output,callindent,insn,insnlen", parse_output_fields),
+ "bpf-output,callindent,insn,insnlen,asm", parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:37 AM1/25/17
to
From: Namhyung Kim <namhyu...@lge.com>

The 'perf ftrace' command is a simple wrapper of kernel's ftrace
functionality. It only supports single thread tracing currently and
just reads trace_pipe in text and then write it to stdout.

Committer notes:

Testing it:

# perf ftrace -f function_graph usleep 123456
<SNIP>
2) | SyS_nanosleep() {
2) | _copy_from_user() {
<SNIP>
2) 0.900 us | }
2) 1.354 us | }
2) | hrtimer_nanosleep() {
2) 0.062 us | __hrtimer_init();
2) | do_nanosleep() {
2) | hrtimer_start_range_ns() {
<SNIP>
2) 5.025 us | }
2) | schedule() {
2) 0.125 us | rcu_note_context_switch();
2) 0.057 us | _raw_spin_lock();
2) | deactivate_task() {
2) 0.369 us | update_rq_clock.part.77();
2) | dequeue_task_fair() {
<SNIP>
2) + 22.453 us | }
2) + 23.736 us | }
2) | pick_next_task_fair() {
<SNIP>
2) + 47.167 us | }
2) | pick_next_task_idle() {
<SNIP>
2) 4.462 us | }
------------------------------------------
2) usleep-20387 => <idle>-0
------------------------------------------

2) 0.806 us | switch_mm_irqs_off();
------------------------------------------
2) <idle>-0 => usleep-20387
------------------------------------------

2) 0.151 us | finish_task_switch();
2) @ 123597.2 us | }
2) 0.037 us | _cond_resched();
2) | hrtimer_try_to_cancel() {
2) 0.064 us | hrtimer_active();
2) 0.353 us | }
2) @ 123605.3 us | }
2) @ 123606.2 us | }
2) @ 123608.3 us | } /* SyS_nanosleep */
2) | __do_page_fault() {
<SNIP>

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>,
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/n/tip-r1hgmsj4dx...@git.kernel.org
[ Various foward port fixes, add man page ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/Documentation/perf-ftrace.txt | 36 +++++
tools/perf/builtin-ftrace.c | 242 +++++++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
6 files changed, 282 insertions(+)
create mode 100644 tools/perf/Documentation/perf-ftrace.txt
create mode 100644 tools/perf/builtin-ftrace.c

diff --git a/tools/perf/Build b/tools/perf/Build
index 7039ecb89170..9b79f8d7db50 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -3,6 +3,7 @@ perf-y += builtin-annotate.o
perf-y += builtin-config.o
perf-y += builtin-diff.o
perf-y += builtin-evlist.o
+perf-y += builtin-ftrace.o
perf-y += builtin-help.o
perf-y += builtin-sched.o
perf-y += builtin-buildid-list.o
diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt
new file mode 100644
index 000000000000..2d96de6132a9
--- /dev/null
+++ b/tools/perf/Documentation/perf-ftrace.txt
@@ -0,0 +1,36 @@
+perf-ftrace(1)
+=============
+
+NAME
+----
+perf-ftrace - simple wrapper for kernel's ftrace functionality
+
+
+SYNOPSIS
+--------
+[verse]
+'perf ftrace' <command>
+
+DESCRIPTION
+-----------
+The 'perf ftrace' command is a simple wrapper of kernel's ftrace
+functionality. It only supports single thread tracing currently and
+just reads trace_pipe in text and then write it to stdout.
+
+The following options apply to perf ftrace.
+
+OPTIONS
+-------
+
+-t::
+--tracer=::
+ Tracer to use: function_graph or function.
+
+-v::
+--verbose=::
+ Verbosity level.
+
+
+SEE ALSO
+--------
+linkperf:perf-record[1], linkperf:perf-trace[1]
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
new file mode 100644
index 000000000000..c320ae25c8e6
--- /dev/null
+++ b/tools/perf/builtin-ftrace.c
@@ -0,0 +1,242 @@
+/*
+ * builtin-ftrace.c
+ *
+ * Copyright (c) 2013 LG Electronics, Namhyung Kim <namh...@kernel.org>
+ *
+ * Released under the GPL v2.
+ */
+
+#include "builtin.h"
+#include "perf.h"
+
+#include <unistd.h>
+#include <signal.h>
+
+#include "debug.h"
+#include <subcmd/parse-options.h>
+#include "evlist.h"
+#include "target.h"
+#include "thread_map.h"
+
+
+#define DEFAULT_TRACER "function_graph"
+
+struct perf_ftrace {
+ struct perf_evlist *evlist;
+ struct target target;
+ const char *tracer;
+};
+
+static bool done;
+
+static void sig_handler(int sig __maybe_unused)
+{
+ done = true;
+}
+
+/*
+ * perf_evlist__prepare_workload will send a SIGUSR1 if the fork fails, since
+ * we asked by setting its exec_error to the function below,
+ * ftrace__workload_exec_failed_signal.
+ *
+ * XXX We need to handle this more appropriately, emitting an error, etc.
+ */
+static void ftrace__workload_exec_failed_signal(int signo __maybe_unused,
+ siginfo_t *info __maybe_unused,
+ void *ucontext __maybe_unused)
+{
+ /* workload_exec_errno = info->si_value.sival_int; */
+ done = true;
+}
+
+static int write_tracing_file(const char *name, const char *val)
+{
+ char *file;
+ int fd, ret = -1;
+ ssize_t size = strlen(val);
+
+ file = get_tracing_file(name);
+ if (!file) {
+ pr_debug("cannot get tracing file: %s\n", name);
+ return -1;
+ }
+
+ fd = open(file, O_WRONLY);
+ if (fd < 0) {
+ pr_debug("cannot open tracing file: %s\n", name);
+ goto out;
+ }
+
+ if (write(fd, val, size) == size)
+ ret = 0;
+ else
+ pr_debug("write '%s' to tracing/%s failed\n", val, name);
+
+ close(fd);
+out:
+ put_tracing_file(file);
+ return ret;
+}
+
+static int reset_tracing_files(struct perf_ftrace *ftrace __maybe_unused)
+{
+ if (write_tracing_file("tracing_on", "0") < 0)
+ return -1;
+
+ if (write_tracing_file("current_tracer", "nop") < 0)
+ return -1;
+
+ if (write_tracing_file("set_ftrace_pid", " ") < 0)
+ return -1;
+
+ return 0;
+}
+
+static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
+{
+ char *trace_file;
+ int trace_fd;
+ char *trace_pid;
+ char buf[4096];
+ struct pollfd pollfd = {
+ .events = POLLIN,
+ };
+
+ if (geteuid() != 0) {
+ pr_err("ftrace only works for root!\n");
+ return -1;
+ }
+
+ if (argc < 1)
+ return -1;
+
+ signal(SIGINT, sig_handler);
+ signal(SIGUSR1, sig_handler);
+ signal(SIGCHLD, sig_handler);
+
+ reset_tracing_files(ftrace);
+
+ /* reset ftrace buffer */
+ if (write_tracing_file("trace", "0") < 0)
+ goto out;
+
+ if (perf_evlist__prepare_workload(ftrace->evlist, &ftrace->target,
+ argv, false, ftrace__workload_exec_failed_signal) < 0)
+ goto out;
+
+ if (write_tracing_file("current_tracer", ftrace->tracer) < 0) {
+ pr_err("failed to set current_tracer to %s\n", ftrace->tracer);
+ goto out;
+ }
+
+ if (asprintf(&trace_pid, "%d", thread_map__pid(ftrace->evlist->threads, 0)) < 0) {
+ pr_err("failed to allocate pid string\n");
+ goto out;
+ }
+
+ if (write_tracing_file("set_ftrace_pid", trace_pid) < 0) {
+ pr_err("failed to set pid: %s\n", trace_pid);
+ goto out_free_pid;
+ }
+
+ trace_file = get_tracing_file("trace_pipe");
+ if (!trace_file) {
+ pr_err("failed to open trace_pipe\n");
+ goto out_free_pid;
+ }
+
+ trace_fd = open(trace_file, O_RDONLY);
+
+ put_tracing_file(trace_file);
+
+ if (trace_fd < 0) {
+ pr_err("failed to open trace_pipe\n");
+ goto out_free_pid;
+ }
+
+ fcntl(trace_fd, F_SETFL, O_NONBLOCK);
+ pollfd.fd = trace_fd;
+
+ if (write_tracing_file("tracing_on", "1") < 0) {
+ pr_err("can't enable tracing\n");
+ goto out_close_fd;
+ }
+
+ perf_evlist__start_workload(ftrace->evlist);
+
+ while (!done) {
+ if (poll(&pollfd, 1, -1) < 0)
+ break;
+
+ if (pollfd.revents & POLLIN) {
+ int n = read(trace_fd, buf, sizeof(buf));
+ if (n < 0)
+ break;
+ if (fwrite(buf, n, 1, stdout) != 1)
+ break;
+ }
+ }
+
+ write_tracing_file("tracing_on", "0");
+
+ /* read remaining buffer contents */
+ while (true) {
+ int n = read(trace_fd, buf, sizeof(buf));
+ if (n <= 0)
+ break;
+ if (fwrite(buf, n, 1, stdout) != 1)
+ break;
+ }
+
+out_close_fd:
+ close(trace_fd);
+out_free_pid:
+ free(trace_pid);
+out:
+ reset_tracing_files(ftrace);
+
+ return done ? 0 : -1;
+}
+
+int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
+{
+ int ret;
+ struct perf_ftrace ftrace = {
+ .target = { .uid = UINT_MAX, },
+ };
+ const char * const ftrace_usage[] = {
+ "perf ftrace [<options>] <command>",
+ "perf ftrace [<options>] -- <command> [<options>]",
+ NULL
+ };
+ const struct option ftrace_options[] = {
+ OPT_STRING('t', "tracer", &ftrace.tracer, "tracer",
+ "tracer to use: function_graph or function"),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose"),
+ OPT_END()
+ };
+
+ argc = parse_options(argc, argv, ftrace_options, ftrace_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+ if (!argc)
+ usage_with_options(ftrace_usage, ftrace_options);
+
+ ftrace.evlist = perf_evlist__new();
+ if (ftrace.evlist == NULL)
+ return -ENOMEM;
+
+ ret = perf_evlist__create_maps(ftrace.evlist, &ftrace.target);
+ if (ret < 0)
+ goto out_delete_evlist;
+
+ if (ftrace.tracer == NULL)
+ ftrace.tracer = DEFAULT_TRACER;
+
+ ret = __cmd_ftrace(&ftrace, argc, argv);
+
+out_delete_evlist:
+ perf_evlist__delete(ftrace.evlist);
+
+ return ret;
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index b55f5be486a1..036e1e35b1a8 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -41,6 +41,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix);
int cmd_inject(int argc, const char **argv, const char *prefix);
int cmd_mem(int argc, const char **argv, const char *prefix);
int cmd_data(int argc, const char **argv, const char *prefix);
+int cmd_ftrace(int argc, const char **argv, const char *prefix);

int find_scripts(char **scripts_array, char **scripts_path_array);
#endif
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index fb45613dba9e..ac3efd396a72 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -11,6 +11,7 @@ perf-data mainporcelain common
perf-diff mainporcelain common
perf-config mainporcelain common
perf-evlist mainporcelain common
+perf-ftrace mainporcelain common
perf-inject mainporcelain common
perf-kallsyms mainporcelain common
perf-kmem mainporcelain common
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index c71000e022d7..6d5479e03e0d 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -71,6 +71,7 @@ static struct cmd_struct commands[] = {
{ "inject", cmd_inject, 0 },
{ "mem", cmd_mem, 0 },
{ "data", cmd_data, 0 },
+ { "ftrace", cmd_ftrace, 0 },
};

struct pager_config {
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:37 AM1/25/17
to
From: Joe Stringer <j...@ovn.org>

Commit 4708bbda5cb2 ("tools lib bpf: Fix maps resolution") attempted to
fix map resolution by identifying the number of symbols that point to
maps, and using this number to resolve each of the maps.

However, during relocation the original definition of the map size was
still in use. For up to two maps, the calculation was correct if there
was a small difference in size between the map definition in libbpf and
the one that the client library uses. However if the difference was
large, particularly if more than two maps were used in the BPF program,
the relocation would fail.

For example, when using a map definition with size 28, with three maps,
map relocation would count:

(sym_offset / sizeof(struct bpf_map_def) => map_idx)
(0 / 16 => 0), ie map_idx = 0
(28 / 16 => 1), ie map_idx = 1
(56 / 16 => 3), ie map_idx = 3

So, libbpf reports:

libbpf: bpf relocation: map_idx 3 large than 2

Fix map relocation by checking the exact offset of maps when doing
relocation.

Signed-off-by: Joe Stringer <j...@ovn.org>
[Allow different map size in an object]
Signed-off-by: Wang Nan <wang...@huawei.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: net...@vger.kernel.org
Fixes: 4708bbda5cb2 ("tools lib bpf: Fix maps resolution")
Link: http://lkml.kernel.org/r/2017012301112...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/libbpf.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 84e6b35da4bd..671d5ad07cf1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -779,7 +779,7 @@ static int
bpf_program__collect_reloc(struct bpf_program *prog,
size_t nr_maps, GElf_Shdr *shdr,
Elf_Data *data, Elf_Data *symbols,
- int maps_shndx)
+ int maps_shndx, struct bpf_map *maps)
{
int i, nrels;

@@ -829,7 +829,15 @@ bpf_program__collect_reloc(struct bpf_program *prog,
return -LIBBPF_ERRNO__RELOC;
}

- map_idx = sym.st_value / sizeof(struct bpf_map_def);
+ /* TODO: 'maps' is sorted. We can use bsearch to make it faster. */
+ for (map_idx = 0; map_idx < nr_maps; map_idx++) {
+ if (maps[map_idx].offset == sym.st_value) {
+ pr_debug("relocation: find map %zd (%s) for insn %u\n",
+ map_idx, maps[map_idx].name, insn_idx);
+ break;
+ }
+ }
+
if (map_idx >= nr_maps) {
pr_warning("bpf relocation: map_idx %d large than %d\n",
(int)map_idx, (int)nr_maps - 1);
@@ -953,7 +961,8 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
err = bpf_program__collect_reloc(prog, nr_maps,
shdr, data,
obj->efile.symbols,
- obj->efile.maps_shndx);
+ obj->efile.maps_shndx,
+ obj->maps);
if (err)
return err;
}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:37 AM1/25/17
to
From: Matija Glavinic Pecotic <matija.glavin...@nokia.com>

Using perf with call graph method dwarf fails to provide backtrace
support for stripped binary even though .gnu_debuglink points to *.dbg
flavor with properly populated debug symbols.

Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and
4.10.rc3. Perf is configured with libunwind, and unwind dwarf support
[1]. Test code (stress_bt.c) can be found on [2].

Running (explicitly disable other unwinding methods):

$ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
-fno-asynchronous-unwind-tables stress_bt.c
$ perf record -N --call-graph dwarf ./stress_bt
$ perf report

results in properly generated call graph. Stripping the binary and running
it results with missing call graph. Expected result is to have call graph:

$ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
-fno-asynchronous-unwind-tables stress_bt.c
$ objcopy --only-keep-debug stress_bt stress_bt.dbg
$ objcopy --strip-debug stress_bt
$ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
$ perf record -N --call-graph dwarf ./stress_bt
$ perf report

Problem is that perf doesn't try to read symbols pointed by gnu
debuglink. Patch adds checking, and reading of the symbols from
debuglink and symsrc. Order of the check is to first check within dso,
then check whether symsrc is defined and try to read from it. Finally,
debuglink is checked. Default locations of debug files are discussed in
[3] and [4]. Comments on RFC are on [5].

[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding
[2] [1]#Backtrace_stress_application
[3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
[4] https://sourceware.org/binutils/docs/binutils/objcopy.html
[5] https://lkml.org/lkml/2016/8/22/473

Signed-off-by: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Alexander Sverdlin <alexander...@nokia.com>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Link: http://lkml.kernel.org/r/d309d40a-463f-482b...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/dso.c | 48 +++++++++++++++++++++-------
tools/perf/util/unwind-libunwind-local.c | 54 +++++++++++++++++++++++++++++---
2 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index d2c6cdd9d42b..28d41e709128 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -9,6 +9,13 @@
#include "debug.h"
#include "vdso.h"

+static const char * const debuglink_paths[] = {
+ "%.0s%s",
+ "%s/%s",
+ "%s/.debug/%s",
+ "/usr/lib/debug%s/%s"
+};
+
char dso__symtab_origin(const struct dso *dso)
{
static const char origin[] = {
@@ -44,24 +51,43 @@ int dso__read_binary_type_filename(const struct dso *dso,
size_t len;

switch (type) {
- case DSO_BINARY_TYPE__DEBUGLINK: {
- char *debuglink;
+ case DSO_BINARY_TYPE__DEBUGLINK:
+ {
+ const char *last_slash;
+ char dso_dir[PATH_MAX];
+ char symfile[PATH_MAX];
+ unsigned int i;

len = __symbol__join_symfs(filename, size, dso->long_name);
- debuglink = filename + len;
- while (debuglink != filename && *debuglink != '/')
- debuglink--;
- if (*debuglink == '/')
- debuglink++;
+ last_slash = filename + len;
+ while (last_slash != filename && *last_slash != '/')
+ last_slash--;

- ret = -1;
- if (!is_regular_file(filename))
+ strncpy(dso_dir, filename, last_slash - filename);
+ dso_dir[last_slash-filename] = '\0';
+
+ if (!is_regular_file(filename)) {
+ ret = -1;
+ break;
+ }
+
+ ret = filename__read_debuglink(filename, symfile, PATH_MAX);
+ if (ret)
break;

- ret = filename__read_debuglink(filename, debuglink,
- size - (debuglink - filename));
+ /* Check predefined locations where debug file might reside */
+ ret = -1;
+ for (i = 0; i < ARRAY_SIZE(debuglink_paths); i++) {
+ snprintf(filename, size,
+ debuglink_paths[i], dso_dir, symfile);
+ if (is_regular_file(filename)) {
+ ret = 0;
+ break;
+ }
}
+
break;
+ }
case DSO_BINARY_TYPE__BUILD_ID_CACHE:
if (dso__build_id_filename(dso, filename, size) == NULL)
ret = -1;
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 6fec84dff3f7..bfb9b7987692 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -35,6 +35,7 @@
#include "util.h"
#include "debug.h"
#include "asm/bug.h"
+#include "dso.h"

extern int
UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as,
@@ -297,15 +298,58 @@ static int read_unwind_spec_debug_frame(struct dso *dso,
int fd;
u64 ofs = dso->data.debug_frame_offset;

+ /* debug_frame can reside in:
+ * - dso
+ * - debug pointed by symsrc_filename
+ * - gnu_debuglink, which doesn't necessary
+ * has to be pointed by symsrc_filename
+ */
if (ofs == 0) {
fd = dso__data_get_fd(dso, machine);
- if (fd < 0)
- return -EINVAL;
+ if (fd >= 0) {
+ ofs = elf_section_offset(fd, ".debug_frame");
+ dso__data_put_fd(dso);
+ }
+
+ if (ofs <= 0) {
+ fd = open(dso->symsrc_filename, O_RDONLY);
+ if (fd >= 0) {
+ ofs = elf_section_offset(fd, ".debug_frame");
+ close(fd);
+ }
+ }
+
+ if (ofs <= 0) {
+ char *debuglink = malloc(PATH_MAX);
+ int ret = 0;
+
+ ret = dso__read_binary_type_filename(
+ dso, DSO_BINARY_TYPE__DEBUGLINK,
+ machine->root_dir, debuglink, PATH_MAX);
+ if (!ret) {
+ fd = open(debuglink, O_RDONLY);
+ if (fd >= 0) {
+ ofs = elf_section_offset(fd,
+ ".debug_frame");
+ close(fd);
+ }
+ }
+ if (ofs > 0) {
+ if (dso->symsrc_filename != NULL) {
+ pr_warning(
+ "%s: overwrite symsrc(%s,%s)\n",
+ __func__,
+ dso->symsrc_filename,
+ debuglink);
+ free(dso->symsrc_filename);
+ }
+ dso->symsrc_filename = debuglink;
+ } else {
+ free(debuglink);
+ }
+ }

- /* Check the .debug_frame section for unwinding info */
- ofs = elf_section_offset(fd, ".debug_frame");
dso->data.debug_frame_offset = ofs;
- dso__data_put_fd(dso);
}

*offset = ofs;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:37 AM1/25/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6x...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-help.c | 6 ++++--
tools/perf/builtin-kmem.c | 8 ++++++--
tools/perf/builtin-record.c | 4 +++-
tools/perf/builtin-report.c | 4 +++-
tools/perf/builtin-top.c | 4 +++-
tools/perf/perf.c | 15 ++++++++++-----
tools/perf/util/callchain.c | 14 ++++++++++++--
tools/perf/util/config.c | 9 +++++----
tools/perf/util/data-convert-bt.c | 7 +++++--
tools/perf/util/hist.c | 4 +++-
tools/perf/util/intel-pt.c | 4 +++-
tools/perf/util/llvm-utils.c | 4 +++-
12 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 93da24a638be..aed0d844e8c2 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -447,11 +447,13 @@ int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
const char *alias;
- int rc = 0;
+ int rc;

load_command_list("perf-", &main_cmds, &other_cmds);

- perf_config(perf_help_config, &help_format);
+ rc = perf_config(perf_help_config, &help_format);
+ if (rc)
+ return rc;

argc = parse_options_subcommand(argc, argv, builtin_help_options,
builtin_help_subcommands, builtin_help_usage, 0);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 35a02f8e5a4a..70ff88d8be7e 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1921,10 +1921,12 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
struct perf_session *session;
- int ret = -1;
const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";
+ int ret = perf_config(kmem_config, NULL);
+
+ if (ret)
+ return ret;

- perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);

@@ -1949,6 +1951,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ ret = -1;
+
if (kmem_slab) {
if (!perf_evlist__find_tracepoint_by_name(session->evlist,
"kmem:kmalloc")) {
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 33a9eaaf9db4..ffac8ca9fb01 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1670,7 +1670,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

- perf_config(perf_record_config, rec);
+ err = perf_config(perf_record_config, rec);
+ if (err)
+ return err;

argc = parse_options(argc, argv, record_options, record_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 06cc759a4597..dbd7fa028861 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -847,7 +847,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
if (ret < 0)
return ret;

- perf_config(report__config, &report);
+ ret = perf_config(report__config, &report);
+ if (ret)
+ return ret;

argc = parse_options(argc, argv, options, report_usage, 0);
if (argc) {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3df4178ba378..20aef9815cd8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1216,7 +1216,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
if (top.evlist == NULL)
return -ENOMEM;

- perf_config(perf_top_config, &top);
+ status = perf_config(perf_top_config, &top);
+ if (status)
+ return status;

argc = parse_options(argc, argv, options, top_usage, 0);
if (argc)
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 0324cd15e42a..c71000e022d7 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -89,11 +89,12 @@ static int pager_command_config(const char *var, const char *value, void *data)
/* returns 0 for "no pager", 1 for "use pager", and -1 for "not specified" */
int check_pager_config(const char *cmd)
{
+ int err;
struct pager_config c;
c.cmd = cmd;
c.val = -1;
- perf_config(pager_command_config, &c);
- return c.val;
+ err = perf_config(pager_command_config, &c);
+ return err ?: c.val;
}

static int browser_command_config(const char *var, const char *value, void *data)
@@ -112,11 +113,12 @@ static int browser_command_config(const char *var, const char *value, void *data
*/
static int check_browser_config(const char *cmd)
{
+ int err;
struct pager_config c;
c.cmd = cmd;
c.val = -1;
- perf_config(browser_command_config, &c);
- return c.val;
+ err = perf_config(browser_command_config, &c);
+ return err ?: c.val;
}

static void commit_pager_choice(void)
@@ -508,6 +510,7 @@ static void cache_line_size(int *cacheline_sizep)

int main(int argc, const char **argv)
{
+ int err;
const char *cmd;
char sbuf[STRERR_BUFSIZE];
int value;
@@ -533,7 +536,9 @@ int main(int argc, const char **argv)
srandom(time(NULL));

perf_config__init();
- perf_config(perf_default_config, NULL);
+ err = perf_config(perf_default_config, NULL);
+ if (err)
+ return err;
set_buildid_dir(NULL);

/* get debugfs/tracefs mount point from /proc/mounts */
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 42922512c1c6..f19aff042925 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -48,6 +48,8 @@ static int parse_callchain_mode(const char *value)
callchain_param.mode = CHAIN_FOLDED;
return 0;
}
+
+ pr_debug("Invalid callchain mode: %s\n", value);
return -1;
}

@@ -63,6 +65,8 @@ static int parse_callchain_order(const char *value)
callchain_param.order_set = true;
return 0;
}
+
+ pr_debug("Invalid callchain order: %s\n", value);
return -1;
}

@@ -80,6 +84,8 @@ static int parse_callchain_sort_key(const char *value)
callchain_param.branch_callstack = 1;
return 0;
}
+
+ pr_debug("Invalid callchain sort key: %s\n", value);
return -1;
}

@@ -210,13 +216,17 @@ int perf_callchain_config(const char *var, const char *value)
return parse_callchain_sort_key(value);
if (!strcmp(var, "threshold")) {
callchain_param.min_percent = strtod(value, &endptr);
- if (value == endptr)
+ if (value == endptr) {
+ pr_debug("Invalid callchain threshold: %s\n", value);
return -1;
+ }
}
if (!strcmp(var, "print-limit")) {
callchain_param.print_limit = strtod(value, &endptr);
- if (value == endptr)
+ if (value == endptr) {
+ pr_debug("Invalid callchain print limit: %s\n", value);
return -1;
+ }
}

return 0;
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 3d906dbbef74..b0b61af7c493 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -386,8 +386,10 @@ static int perf_buildid_config(const char *var, const char *value)
if (!strcmp(var, "buildid.dir")) {
const char *dir = perf_config_dirname(var, value);

- if (!dir)
+ if (!dir) {
+ pr_debug("Invalid buildid directory!\n");
return -1;
+ }
strncpy(buildid_dir, dir, MAXPATHLEN-1);
buildid_dir[MAXPATHLEN-1] = '\0';
}
@@ -405,10 +407,9 @@ static int perf_default_core_config(const char *var __maybe_unused,
static int perf_ui_config(const char *var, const char *value)
{
/* Add other config variables here. */
- if (!strcmp(var, "ui.show-headers")) {
+ if (!strcmp(var, "ui.show-headers"))
symbol_conf.show_hist_headers = perf_config_bool(var, value);
- return 0;
- }
+
return 0;
}

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 7123f4de32cc..4e6cbc99f08e 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1473,7 +1473,7 @@ int bt_convert__perf2ctf(const char *input, const char *path,
},
};
struct ctf_writer *cw = &c.writer;
- int err = -1;
+ int err;

if (opts->all) {
c.tool.comm = process_comm_event;
@@ -1481,12 +1481,15 @@ int bt_convert__perf2ctf(const char *input, const char *path,
c.tool.fork = process_fork_event;
}

- perf_config(convert__config, &c);
+ err = perf_config(convert__config, &c);
+ if (err)
+ return err;

/* CTF writer */
if (ctf_writer__init(cw, path))
return -1;

+ err = -1;
/* perf.data session */
session = perf_session__new(&file, 0, &c.tool);
if (!session)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6770a9645609..cff2e9041b15 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2439,8 +2439,10 @@ int parse_filter_percentage(const struct option *opt __maybe_unused,
symbol_conf.filter_relative = true;
else if (!strcmp(arg, "absolute"))
symbol_conf.filter_relative = false;
- else
+ else {
+ pr_debug("Invalud percentage: %s\n", arg);
return -1;
+ }

return 0;
}
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 85d5eeb66c75..da20cd5612e9 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2159,7 +2159,9 @@ int intel_pt_process_auxtrace_info(union perf_event *event,

addr_filters__init(&pt->filts);

- perf_config(intel_pt_perf_config, pt);
+ err = perf_config(intel_pt_perf_config, pt);
+ if (err)
+ goto err_free;

err = auxtrace_queues__init(&pt->queues);
if (err)
diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
index b23ff44cf214..824356488ce6 100644
--- a/tools/perf/util/llvm-utils.c
+++ b/tools/perf/util/llvm-utils.c
@@ -48,8 +48,10 @@ int perf_llvm_config(const char *var, const char *value)
llvm_param.kbuild_opts = strdup(value);
else if (!strcmp(var, "dump-obj"))
llvm_param.dump_obj = !!perf_config_bool(var, value);
- else
+ else {
+ pr_debug("Invalid LLVM config option: %s\n", value);
return -1;
+ }
llvm_param.user_set_param = true;
return 0;
}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:37 AM1/25/17
to
From: Markus Elfring <elf...@users.sourceforge.net>

Remove an error code assignment which is redundant in an if branch for
the handling of a memory allocation failure because the same value was
set for the local variable "err" before.

Signed-off-by: Markus Elfring <elf...@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Milian Wolff <milian...@kdab.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Wang Nan <wang...@huawei.com>
Cc: kernel-...@vger.kernel.org
Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd...@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-event.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index cdfc468d4d5f..ded1e7d88874 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2994,10 +2994,9 @@ static int try_to_find_absolute_address(struct perf_probe_event *pev,

tev->nargs = pev->nargs;
tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
- if (!tev->args) {
- err = -ENOMEM;
+ if (!tev->args)
goto errout;
- }
+
for (i = 0; i < tev->nargs; i++)
copy_to_probe_trace_arg(&tev->args[i], &pev->args[i]);

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:38 AM1/25/17
to
From: Markus Elfring <elf...@users.sourceforge.net>

Remove a condition check which is unnecessary at the end
because this source code place should usually only be reached
with a non-zero pointer.

Signed-off-by: Markus Elfring <elf...@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Milian Wolff <milian...@kdab.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Wang Nan <wang...@huawei.com>
Cc: kernel-...@vger.kernel.org
Link: http://lkml.kernel.org/r/a3f2473b-6383-a326...@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-event.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4a57c8a60bd9..cdfc468d4d5f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3004,10 +3004,8 @@ static int try_to_find_absolute_address(struct perf_probe_event *pev,
return 1;

errout:
- if (*tevs) {
- clear_probe_trace_events(*tevs, 1);
- *tevs = NULL;
- }
+ clear_probe_trace_events(*tevs, 1);
+ *tevs = NULL;
return err;
}

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Jan 25, 2017, 9:00:44 AM1/25/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

So that we can suppress the '-t function_graph' and get a more compact command
line:

# perf ftrace usleep 123456 | grep raw_spin_lock | sort -k2 -nr | head -5
2) 0.555 us | _raw_spin_lock();
2) 0.516 us | _raw_spin_lock();
2) 0.410 us | _raw_spin_lock_irq();
2) 0.374 us | _raw_spin_lock_irqsave();
#

Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-ss9xgx5htp...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-ftrace.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index c320ae25c8e6..d05658d2b8f1 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -202,6 +202,7 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
{
int ret;
struct perf_ftrace ftrace = {
+ .tracer = "function_graph",
.target = { .uid = UINT_MAX, },
};
const char * const ftrace_usage[] = {
@@ -211,7 +212,7 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
};
const struct option ftrace_options[] = {
OPT_STRING('t', "tracer", &ftrace.tracer, "tracer",
- "tracer to use: function_graph or function"),
+ "tracer to use: function_graph(default) or function"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose"),
OPT_END()
--
2.9.3

tip-bot for Matija Glavinic Pecotic

unread,
Jan 26, 2017, 10:30:06 AM1/26/17
to
Commit-ID: 9343e45bf6cc4a05f6e271e9f8d06bc87875c604
Gitweb: http://git.kernel.org/tip/9343e45bf6cc4a05f6e271e9f8d06bc87875c604
Author: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
AuthorDate: Tue, 17 Jan 2017 15:50:35 +0100
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 18 Jan 2017 12:29:52 -0300

perf unwind: Fix looking up dwarf unwind stack info

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/dso.c | 48 +++++++++++++++++++++-------
tools/perf/util/unwind-libunwind-local.c | 54 +++++++++++++++++++++++++++++---
2 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index d2c6cdd..28d41e7 100644
+ ret = -1;
+ break;
+ }
+
+ ret = filename__read_debuglink(filename, symfile, PATH_MAX);
+ if (ret)
break;

- ret = filename__read_debuglink(filename, debuglink,
- size - (debuglink - filename));
+ /* Check predefined locations where debug file might reside */
+ ret = -1;
+ for (i = 0; i < ARRAY_SIZE(debuglink_paths); i++) {
+ snprintf(filename, size,
+ debuglink_paths[i], dso_dir, symfile);
+ if (is_regular_file(filename)) {
+ ret = 0;
+ break;
+ }
}
+
break;
+ }
case DSO_BINARY_TYPE__BUILD_ID_CACHE:
if (dso__build_id_filename(dso, filename, size) == NULL)
ret = -1;
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 6fec84d..bfb9b79 100644

tip-bot for Namhyung Kim

unread,
Jan 26, 2017, 10:40:04 AM1/26/17
to
Commit-ID: a7619aef6dca913d830675dd914ca6fe7241a730
Gitweb: http://git.kernel.org/tip/a7619aef6dca913d830675dd914ca6fe7241a730
Author: Namhyung Kim <namhyu...@lge.com>
AuthorDate: Thu, 18 Apr 2013 21:24:16 +0900
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 26 Jan 2017 11:43:00 -0300

perf util: Add more debug message on failure path

It's helpful for debugging on tracing features.

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>,
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/n/tip-rjysr9ljie...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 4 ++-
tools/perf/util/trace-event-read.c | 53 ++++++++++++++++++++++++++------------
2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 00fd8a8..c567d9f 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2803,8 +2803,10 @@ static int perf_evsel__prepare_tracepoint_event(struct perf_evsel *evsel,
}

event = pevent_find_event(pevent, evsel->attr.config);
- if (event == NULL)
+ if (event == NULL) {
+ pr_debug("cannot find event format for %d\n", (int)evsel->attr.config);
return -1;
+ }

if (!evsel->name) {
snprintf(bf, sizeof(bf), "%s:%s", event->system, event->name);
diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c
index 61c7162..2742015 100644
--- a/tools/perf/util/trace-event-read.c
+++ b/tools/perf/util/trace-event-read.c
@@ -260,39 +260,53 @@ static int read_header_files(struct pevent *pevent)

static int read_ftrace_file(struct pevent *pevent, unsigned long long size)
{
+ int ret;
char *buf;

buf = malloc(size);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
- free(buf);
- return -1;
+ ret = do_read(buf, size);
+ if (ret < 0) {
+ pr_debug("error reading ftrace file.\n");
+ goto out;
}

- parse_ftrace_file(pevent, buf, size);
+ ret = parse_ftrace_file(pevent, buf, size);
+ if (ret < 0)
+ pr_debug("error parsing ftrace file.\n");
+out:
free(buf);
- return 0;
+ return ret;
}

static int read_event_file(struct pevent *pevent, char *sys,
unsigned long long size)
{
+ int ret;
char *buf;

buf = malloc(size);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
+ ret = do_read(buf, size);
+ if (ret < 0) {
free(buf);
- return -1;
+ goto out;
}

- parse_event_file(pevent, buf, size, sys);
+ ret = parse_event_file(pevent, buf, size, sys);
+ if (ret < 0)
+ pr_debug("error parsing event file.\n");
+out:
free(buf);
- return 0;
+ return ret;
}

static int read_ftrace_files(struct pevent *pevent)
@@ -345,6 +359,7 @@ static int read_saved_cmdline(struct pevent *pevent)
{
unsigned long long size;
char *buf;
+ int ret;

/* it can have 0 size */
size = read8(pevent);
@@ -352,18 +367,22 @@ static int read_saved_cmdline(struct pevent *pevent)
return 0;

buf = malloc(size + 1);
- if (buf == NULL)
+ if (buf == NULL) {
+ pr_debug("memory allocation failure\n");
return -1;
+ }

- if (do_read(buf, size) < 0) {
- free(buf);
- return -1;
+ ret = do_read(buf, size);
+ if (ret < 0) {
+ pr_debug("error reading saved cmdlines\n");
+ goto out;
}

tip-bot for Arnaldo Carvalho de Melo

unread,
Jan 26, 2017, 10:40:06 AM1/26/17
to
Commit-ID: 0a87e7bc6c55dd248270ee0ab4212cd0ef8ea04a
Gitweb: http://git.kernel.org/tip/0a87e7bc6c55dd248270ee0ab4212cd0ef8ea04a
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 24 Jan 2017 13:19:06 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 26 Jan 2017 11:42:59 -0300

perf scripting perl: Do not die() when not founding event for a type

Do just like handling other cases i.e. print some debug message and
ignore the sample.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxy...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/scripting-engines/trace-event-perl.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index e55a132..014ecd6 100644

tip-bot for Arnaldo Carvalho de Melo

unread,
Jan 26, 2017, 10:40:06 AM1/26/17
to
Commit-ID: ec347870a9d423a4b88657d6a85b5163b3f949ee
Gitweb: http://git.kernel.org/tip/ec347870a9d423a4b88657d6a85b5163b3f949ee
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 18 Jan 2017 21:49:14 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 26 Jan 2017 11:43:01 -0300

perf ftrace: Make 'function_graph' be the default tracer

So that we can suppress the '-t function_graph' and get a more compact command
line:

# perf ftrace usleep 123456 | grep raw_spin_lock | sort -k2 -nr | head -5
2) 0.555 us | _raw_spin_lock();
2) 0.516 us | _raw_spin_lock();
2) 0.410 us | _raw_spin_lock_irq();
2) 0.374 us | _raw_spin_lock_irqsave();
#

Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-ss9xgx5htp...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-ftrace.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index c320ae2..d05658d 100644

tip-bot for Namhyung Kim

unread,
Jan 26, 2017, 10:40:22 AM1/26/17
to
Commit-ID: d01f4e8db22cf4d04f6c86351d959b584eb1f5f7
Gitweb: http://git.kernel.org/tip/d01f4e8db22cf4d04f6c86351d959b584eb1f5f7
Author: Namhyung Kim <namhyu...@lge.com>
AuthorDate: Thu, 7 Mar 2013 21:45:20 +0900
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 26 Jan 2017 11:43:01 -0300

perf ftrace: Introduce new 'ftrace' tool
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Tested-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jeremy Eder <je...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>,
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/n/tip-r1hgmsj4dx...@git.kernel.org
[ Various foward port fixes, add man page ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/Documentation/perf-ftrace.txt | 36 +++++
tools/perf/builtin-ftrace.c | 242 +++++++++++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 1 +
6 files changed, 282 insertions(+)

diff --git a/tools/perf/Build b/tools/perf/Build
index 7039ecb..9b79f8d 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -3,6 +3,7 @@ perf-y += builtin-annotate.o
perf-y += builtin-config.o
perf-y += builtin-diff.o
perf-y += builtin-evlist.o
+perf-y += builtin-ftrace.o
perf-y += builtin-help.o
perf-y += builtin-sched.o
perf-y += builtin-buildid-list.o
diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt
new file mode 100644
index 0000000..2d96de6
index 0000000..c320ae2
+ ret = 0;
+ pr_err("failed to allocate pid string\n");
+ goto out;
+ break;
+ }
+ }
+
+ if (ret < 0)
+ goto out_delete_evlist;
+
+ if (ftrace.tracer == NULL)
+ ftrace.tracer = DEFAULT_TRACER;
+
+ ret = __cmd_ftrace(&ftrace, argc, argv);
+
+out_delete_evlist:
+ perf_evlist__delete(ftrace.evlist);
+
+ return ret;
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index b55f5be..036e1e3 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -41,6 +41,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix);
int cmd_inject(int argc, const char **argv, const char *prefix);
int cmd_mem(int argc, const char **argv, const char *prefix);
int cmd_data(int argc, const char **argv, const char *prefix);
+int cmd_ftrace(int argc, const char **argv, const char *prefix);

int find_scripts(char **scripts_array, char **scripts_path_array);
#endif
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index fb45613..ac3efd3 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -11,6 +11,7 @@ perf-data mainporcelain common
perf-diff mainporcelain common
perf-config mainporcelain common
perf-evlist mainporcelain common
+perf-ftrace mainporcelain common
perf-inject mainporcelain common
perf-kallsyms mainporcelain common
perf-kmem mainporcelain common
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 0324cd1..34bcf90 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:05 AM2/1/17
to
From: Ingo Molnar <mi...@kernel.org>

The following upstream headers were updated:

- The x86 cpufeatures.h file picked up a couple of new feature entries
- The PowerPC and ARM KVM headers picked up new features

None of which requires changes to perf tooling, so refresh the tooling copy.

Solves these build time warnings:

Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel

Signed-off-by: Ingo Molnar <mi...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/2017013008...@gmail.com
[ resync tools/arch/x86/include/asm/cpufeatures.h ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/arch/arm/include/uapi/asm/kvm.h | 9 +++++++++
tools/arch/powerpc/include/uapi/asm/kvm.h | 5 +++++
tools/arch/x86/include/asm/cpufeatures.h | 11 +++++++++++
3 files changed, 25 insertions(+)

diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h
index a2b3eb313a25..af05f8e0903e 100644
--- a/tools/arch/arm/include/uapi/asm/kvm.h
+++ b/tools/arch/arm/include/uapi/asm/kvm.h
@@ -84,6 +84,15 @@ struct kvm_regs {
#define KVM_VGIC_V2_DIST_SIZE 0x1000
#define KVM_VGIC_V2_CPU_SIZE 0x2000

+/* Supported VGICv3 address types */
+#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST 3
+#define KVM_VGIC_ITS_ADDR_TYPE 4
+
+#define KVM_VGIC_V3_DIST_SIZE SZ_64K
+#define KVM_VGIC_V3_REDIST_SIZE (2 * SZ_64K)
+#define KVM_VGIC_V3_ITS_SIZE (2 * SZ_64K)
+
#define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
#define KVM_ARM_VCPU_PSCI_0_2 1 /* CPU uses PSCI v0.2 */

diff --git a/tools/arch/powerpc/include/uapi/asm/kvm.h b/tools/arch/powerpc/include/uapi/asm/kvm.h
index c93cf35ce379..3603b6f51b11 100644
--- a/tools/arch/powerpc/include/uapi/asm/kvm.h
+++ b/tools/arch/powerpc/include/uapi/asm/kvm.h
@@ -573,6 +573,10 @@ struct kvm_get_htab_header {
#define KVM_REG_PPC_SPRG9 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xba)
#define KVM_REG_PPC_DBSR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbb)

+/* POWER9 registers */
+#define KVM_REG_PPC_TIDR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbc)
+#define KVM_REG_PPC_PSSCR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbd)
+
/* Transactional Memory checkpointed state:
* This is all GPRs, all VSX regs and a subset of SPRs
*/
@@ -596,6 +600,7 @@ struct kvm_get_htab_header {
#define KVM_REG_PPC_TM_VSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U32 | 0x67)
#define KVM_REG_PPC_TM_DSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x68)
#define KVM_REG_PPC_TM_TAR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x69)
+#define KVM_REG_PPC_TM_XER (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x6a)

/* PPC64 eXternal Interrupt Controller Specification */
#define KVM_DEV_XICS_GRP_SOURCES 1 /* 64-bit source attributes */
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index cddd5d06e1cb..eafee3161d1c 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -105,6 +105,7 @@
#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
+#define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */
@@ -188,10 +189,14 @@

#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
+#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */
+#define X86_FEATURE_CAT_L2 ( 7*32+ 5) /* Cache Allocation Technology L2 */
+#define X86_FEATURE_CDP_L3 ( 7*32+ 6) /* Code and Data Prioritization L3 */

#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

+#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
@@ -220,11 +225,13 @@
#define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */
#define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */
#define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */
+#define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */
#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */
#define X86_FEATURE_AVX512DQ ( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */
#define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */
#define X86_FEATURE_ADX ( 9*32+19) /* The ADCX and ADOX instructions */
#define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_AVX512IFMA ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
#define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */
#define X86_FEATURE_CLWB ( 9*32+24) /* CLWB instruction */
#define X86_FEATURE_AVX512PF ( 9*32+26) /* AVX-512 Prefetch */
@@ -278,8 +285,10 @@
#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */

/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
+#define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */
#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */
+#define X86_FEATURE_RDPID (16*32+ 22) /* RDPID instruction */

/* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */
#define X86_FEATURE_OVERFLOW_RECOV (17*32+0) /* MCA overflow recovery support */
@@ -310,4 +319,6 @@
#define X86_BUG_NULL_SEG X86_BUG(10) /* Nulling a selector preserves the base */
#define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */
#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */
+#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */
+
#endif /* _ASM_X86_CPUFEATURES_H */
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:06 AM2/1/17
to
From: Joe Stringer <j...@ovn.org>

rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/2017012621200...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/util.c | 2 +-
tools/perf/util/util.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index bf29aed16bd6..d8b45cea54d0 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -85,7 +85,7 @@ int mkdir_p(char *path, mode_t mode)
return (stat(path, &st) && mkdir(path, mode)) ? -1 : 0;
}

-int rm_rf(char *path)
+int rm_rf(const char *path)
{
DIR *dir;
int ret = 0;
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 6e8be174ec0b..c74708da8571 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -209,7 +209,7 @@ static inline int sane_case(int x, int high)
}

int mkdir_p(char *path, mode_t mode);
-int rm_rf(char *path);
+int rm_rf(const char *path);
struct strlist *lsdir(const char *name, bool (*filter)(const char *, struct dirent *));
bool lsdir_no_dot_filter(const char *name, struct dirent *d);
int copyfile(const char *from, const char *to);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:08 AM2/1/17
to
From: Joe Stringer <j...@ovn.org>

Add new APIs to pin a BPF program (or specific instances) to the
filesystem. The user can specify the path full path within a BPF
filesystem to pin the program.

bpf_program__pin_instance(prog, path, n) will pin the nth instance of
'prog' to the specified path.

bpf_program__pin(prog, path) will create the directory 'path' (if it
does not exist) and pin each instance within that directory. For
instance, path/0, path/1, path/2.

Committer notes:

- Add missing headers for mkdir()

- Check strdup() for failure

- Check snprintf >= size, not >, as == also means truncated, see 'man
snprintf', return value.

- Conditionally define BPF_FS_MAGIC, as it isn't in magic.h in older
systems and we're not yet having a tools/include/uapi/linux/magic.h
copy.

- Do not include linux/magic.h, not present in older distros.

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wang...@huawei.com>
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/2017012621200...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/bpf/libbpf.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
tools/lib/bpf/libbpf.h | 3 ++
2 files changed, 123 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e6cd62b1264b..c4465b2fddf6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4,6 +4,7 @@
* Copyright (C) 2013-2015 Alexei Starovoitov <a...@kernel.org>
* Copyright (C) 2015 Wang Nan <wang...@huawei.com>
* Copyright (C) 2015 Huawei Inc.
+ * Copyright (C) 2017 Nicira, Inc.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -22,6 +23,7 @@
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
+#include <libgen.h>
#include <inttypes.h>
#include <string.h>
#include <unistd.h>
@@ -32,6 +34,10 @@
#include <linux/kernel.h>
#include <linux/bpf.h>
#include <linux/list.h>
+#include <linux/limits.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/vfs.h>
#include <libelf.h>
#include <gelf.h>

@@ -42,6 +48,10 @@
#define EM_BPF 247
#endif

+#ifndef BPF_FS_MAGIC
+#define BPF_FS_MAGIC 0xcafe4a11
+#endif
+
#define __printf(a, b) __attribute__((format(printf, a, b)))

__printf(1, 2)
@@ -1237,6 +1247,116 @@ int bpf_object__load(struct bpf_object *obj)
return err;
}

+static int check_path(const char *path)
+{
+ struct statfs st_fs;
+ char *dname, *dir;
+ int err = 0;
+
+ if (path == NULL)
+ return -EINVAL;
+
+ dname = strdup(path);
+ if (dname == NULL)
+ return -ENOMEM;
+
+ dir = dirname(dname);
+ if (statfs(dir, &st_fs)) {
+ pr_warning("failed to statfs %s: %s\n", dir, strerror(errno));
+ err = -errno;
+ }
+ free(dname);
+
+ if (!err && st_fs.f_type != BPF_FS_MAGIC) {
+ pr_warning("specified path %s is not on BPF FS\n", path);
+ err = -EINVAL;
+ }
+
+ return err;
+}
+
+int bpf_program__pin_instance(struct bpf_program *prog, const char *path,
+ int instance)
+{
+ int err;
+
+ err = check_path(path);
+ if (err)
+ return err;
+
+ if (prog == NULL) {
+ pr_warning("invalid program pointer\n");
+ return -EINVAL;
+ }
+
+ if (instance < 0 || instance >= prog->instances.nr) {
+ pr_warning("invalid prog instance %d of prog %s (max %d)\n",
+ instance, prog->section_name, prog->instances.nr);
+ return -EINVAL;
+ }
+
+ if (bpf_obj_pin(prog->instances.fds[instance], path)) {
+ pr_warning("failed to pin program: %s\n", strerror(errno));
+ return -errno;
+ }
+ pr_debug("pinned program '%s'\n", path);
+
+ return 0;
+}
+
+static int make_dir(const char *path)
+{
+ int err = 0;
+
+ if (mkdir(path, 0700) && errno != EEXIST)
+ err = -errno;
+
+ if (err)
+ pr_warning("failed to mkdir %s: %s\n", path, strerror(-err));
+ return err;
+}
+
+int bpf_program__pin(struct bpf_program *prog, const char *path)
+{
+ int i, err;
+
+ err = check_path(path);
+ if (err)
+ return err;
+
+ if (prog == NULL) {
+ pr_warning("invalid program pointer\n");
+ return -EINVAL;
+ }
+
+ if (prog->instances.nr <= 0) {
+ pr_warning("no instances of prog %s to pin\n",
+ prog->section_name);
+ return -EINVAL;
+ }
+
+ err = make_dir(path);
+ if (err)
+ return err;
+
+ for (i = 0; i < prog->instances.nr; i++) {
+ char buf[PATH_MAX];
+ int len;
+
+ len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
+ if (len < 0)
+ return -EINVAL;
+ else if (len >= PATH_MAX)
+ return -ENAMETOOLONG;
+
+ err = bpf_program__pin_instance(prog, buf, i);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
void bpf_object__close(struct bpf_object *obj)
{
size_t i;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 4014d1ba5e3d..9f8aa63b95f4 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -106,6 +106,9 @@ void *bpf_program__priv(struct bpf_program *prog);
const char *bpf_program__title(struct bpf_program *prog, bool needs_copy);

int bpf_program__fd(struct bpf_program *prog);
+int bpf_program__pin_instance(struct bpf_program *prog, const char *path,
+ int instance);
+int bpf_program__pin(struct bpf_program *prog, const char *path);

struct bpf_insn;

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:08 AM2/1/17
to
From: Krister Johansen <kj...@templeofstupid.com>

If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.

Signed-off-by: Krister Johansen <kj...@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: sta...@kernel.org
Fixes: 84c2cafa2889 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/2017010606...@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/callchain.c | 11 +++++++++--
tools/perf/util/callchain.h | 6 ++++++
tools/perf/util/hist.c | 7 +++++++
3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e16db30dfe40..aba953421a03 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -449,7 +449,7 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
}
call->ip = cursor_node->ip;
call->ms.sym = cursor_node->sym;
- call->ms.map = cursor_node->map;
+ call->ms.map = map__get(cursor_node->map);

if (cursor_node->branch) {
call->branch_count = 1;
@@ -489,6 +489,7 @@ add_child(struct callchain_node *parent,

list_for_each_entry_safe(call, tmp, &new->val, list) {
list_del(&call->list);
+ map__zput(call->ms.map);
free(call);
}
free(new);
@@ -773,6 +774,7 @@ merge_chain_branch(struct callchain_cursor *cursor,
list->ms.map, list->ms.sym,
false, NULL, 0, 0);
list_del(&list->list);
+ map__zput(list->ms.map);
free(list);
}

@@ -823,7 +825,8 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
}

node->ip = ip;
- node->map = map;
+ map__zput(node->map);
+ node->map = map__get(map);
node->sym = sym;
node->branch = branch;
node->nr_loop_iter = nr_loop_iter;
@@ -1154,11 +1157,13 @@ static void free_callchain_node(struct callchain_node *node)

list_for_each_entry_safe(list, tmp, &node->parent_val, list) {
list_del(&list->list);
+ map__zput(list->ms.map);
free(list);
}

list_for_each_entry_safe(list, tmp, &node->val, list) {
list_del(&list->list);
+ map__zput(list->ms.map);
free(list);
}

@@ -1222,6 +1227,7 @@ int callchain_node__make_parent_list(struct callchain_node *node)
goto out;
*new = *chain;
new->has_children = false;
+ map__get(new->ms.map);
list_add_tail(&new->list, &head);
}
parent = parent->parent;
@@ -1242,6 +1248,7 @@ int callchain_node__make_parent_list(struct callchain_node *node)
out:
list_for_each_entry_safe(chain, new, &head, list) {
list_del(&chain->list);
+ map__zput(chain->ms.map);
free(chain);
}
return -ENOMEM;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 35c8e379530f..4f4b60f1558a 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -5,6 +5,7 @@
#include <linux/list.h>
#include <linux/rbtree.h>
#include "event.h"
+#include "map.h"
#include "symbol.h"

#define HELP_PAD "\t\t\t\t"
@@ -184,8 +185,13 @@ int callchain_merge(struct callchain_cursor *cursor,
*/
static inline void callchain_cursor_reset(struct callchain_cursor *cursor)
{
+ struct callchain_cursor_node *node;
+
cursor->nr = 0;
cursor->last = &cursor->first;
+
+ for (node = cursor->first; node != NULL; node = node->next)
+ map__zput(node->map);
}

int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cff2e9041b15..32c6a939e4cc 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1,6 +1,7 @@
#include "util.h"
#include "build-id.h"
#include "hist.h"
+#include "map.h"
#include "session.h"
#include "sort.h"
#include "evlist.h"
@@ -1019,6 +1020,10 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
int max_stack_depth, void *arg)
{
int err, err2;
+ struct map *alm = NULL;
+
+ if (al && al->map)
+ alm = map__get(al->map);

err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
iter->evsel, al, max_stack_depth);
@@ -1058,6 +1063,8 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
if (!err)
err = err2;

+ map__put(alm);
+
return err;
}

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:09 AM2/1/17
to
From: Joe Stringer <j...@ovn.org>

Add a test for the newly added BPF object pinning functionality.

For example:

# tools/perf/perf test 37
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok

# tools/perf/perf test 37 -v 2>&1 | grep pinned
libbpf: pinned map '/sys/fs/bpf/perf_test/flip_table'
libbpf: pinned program '/sys/fs/bpf/perf_test/func=SyS_epoll_wait/0'

Signed-off-by: Joe Stringer <j...@ovn.org>
Requested-and-Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/bpf.c | 42 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 92343f43e44a..1a04fe77487d 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -5,11 +5,13 @@
#include <util/evlist.h>
#include <linux/bpf.h>
#include <linux/filter.h>
+#include <api/fs/fs.h>
#include <bpf/bpf.h>
#include "tests.h"
#include "llvm.h"
#include "debug.h"
#define NR_ITERS 111
+#define PERF_TEST_BPF_PATH "/sys/fs/bpf/perf_test"

#ifdef HAVE_LIBBPF_SUPPORT

@@ -54,6 +56,7 @@ static struct {
const char *msg_load_fail;
int (*target_func)(void);
int expect_result;
+ bool pin;
} bpf_testcase_table[] = {
{
LLVM_TESTCASE_BASE,
@@ -63,6 +66,17 @@ static struct {
"load bpf object failed",
&epoll_wait_loop,
(NR_ITERS + 1) / 2,
+ false,
+ },
+ {
+ LLVM_TESTCASE_BASE,
+ "BPF pinning",
+ "[bpf_pinning]",
+ "fix kbuild first",
+ "check your vmlinux setting?",
+ &epoll_wait_loop,
+ (NR_ITERS + 1) / 2,
+ true,
},
#ifdef HAVE_BPF_PROLOGUE
{
@@ -73,6 +87,7 @@ static struct {
"check your vmlinux setting?",
&llseek_loop,
(NR_ITERS + 1) / 4,
+ false,
},
#endif
{
@@ -83,6 +98,7 @@ static struct {
"libbpf error when dealing with relocation",
NULL,
0,
+ false,
},
};

@@ -226,10 +242,34 @@ static int __test__bpf(int idx)
goto out;
}

- if (obj)
+ if (obj) {
ret = do_test(obj,
bpf_testcase_table[idx].target_func,
bpf_testcase_table[idx].expect_result);
+ if (ret != TEST_OK)
+ goto out;
+ if (bpf_testcase_table[idx].pin) {
+ int err;
+
+ if (!bpf_fs__mount()) {
+ pr_debug("BPF filesystem not mounted\n");
+ ret = TEST_FAIL;
+ goto out;
+ }
+ err = mkdir(PERF_TEST_BPF_PATH, 0777);
+ if (err && errno != EEXIST) {
+ pr_debug("Failed to make perf_test dir: %s\n",
+ strerror(errno));
+ ret = TEST_FAIL;
+ goto out;
+ }
+ if (bpf_object__pin(obj, PERF_TEST_BPF_PATH))
+ ret = TEST_FAIL;
+ if (rm_rf(PERF_TEST_BPF_PATH))
+ ret = TEST_FAIL;
+ }
+ }
+
out:
bpf__clear();
return ret;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:09 AM2/1/17
to
From: Taeung Song <treeze...@gmail.com>

As a result of commit a3497642c261 ("perf ftrace: Make 'function_graph'
be the default tracer") the ftrace.tracer variable can't be NULL but the
other code setting default tracer remained.

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1485423339-22780-1-git-...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-ftrace.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d05658d2b8f1..414444d0e919 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -202,7 +202,7 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
{
int ret;
struct perf_ftrace ftrace = {
- .tracer = "function_graph",
+ .tracer = DEFAULT_TRACER,
.target = { .uid = UINT_MAX, },
};
const char * const ftrace_usage[] = {
@@ -231,9 +231,6 @@ int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
if (ret < 0)
goto out_delete_evlist;

- if (ftrace.tracer == NULL)
- ftrace.tracer = DEFAULT_TRACER;
-
ret = __cmd_ftrace(&ftrace, argc, argv);

out_delete_evlist:
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:09 AM2/1/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

$ cat ~/.perfconfig
cat: /home/acme/.perfconfig: No such file or directory
$ perf stat -e cycles usleep 1

Performance counter stats for 'usleep 1':

938,996 cycles:u

0.003813731 seconds time elapsed

$ perf top --stdio
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
<SNIP>
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
[acme@jouet linux]$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
# Overhead Command Shared Object Symbol
# ........ ....... ................. .........................
71.77% usleep libc-2.24.so [.] _dl_addr
27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry
1.13% usleep [kernel.kallsyms] [k] page_fault
$
$ touch ~/.perfconfig
$ ls -la ~/.perfconfig
-rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
$
$ perf stat -e instructions usleep 1

Performance counter stats for 'usleep 1':

244,610 instructions:u

0.000805383 seconds time elapsed

$
[root@jouet ~]# chown acme.acme ~/.perfconfig
[root@jouet ~]# perf stat -e cycles usleep 1
Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

Performance counter stats for 'usleep 1':

937,615 cycles

0.000836931 seconds time elapsed
#

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6x...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-help.c | 6 ++++--
tools/perf/builtin-kmem.c | 8 ++++++--
tools/perf/builtin-record.c | 4 +++-
tools/perf/builtin-report.c | 4 +++-
tools/perf/builtin-top.c | 4 +++-
tools/perf/perf.c | 15 ++++++++++-----
tools/perf/util/callchain.c | 16 ++++++++++++++--
tools/perf/util/config.c | 9 +++++----
tools/perf/util/data-convert-bt.c | 7 +++++--
tools/perf/util/hist.c | 4 +++-
tools/perf/util/intel-pt.c | 4 +++-
tools/perf/util/llvm-utils.c | 4 +++-
12 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 93da24a638be..aed0d844e8c2 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -447,11 +447,13 @@ int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
const char *alias;
- int rc = 0;
+ int rc;

load_command_list("perf-", &main_cmds, &other_cmds);

- perf_config(perf_help_config, &help_format);
+ rc = perf_config(perf_help_config, &help_format);
+ if (rc)
+ return rc;

argc = parse_options_subcommand(argc, argv, builtin_help_options,
builtin_help_subcommands, builtin_help_usage, 0);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 915869e00d86..29f4751a3574 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1920,10 +1920,12 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
struct perf_session *session;
- int ret = -1;
const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";
+ int ret = perf_config(kmem_config, NULL);
+
+ if (ret)
+ return ret;

- perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);

@@ -1948,6 +1950,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ ret = -1;
+
if (kmem_slab) {
if (!perf_evlist__find_tracepoint_by_name(session->evlist,
"kmem:kmalloc")) {
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 33a9eaaf9db4..ffac8ca9fb01 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1670,7 +1670,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

- perf_config(perf_record_config, rec);
+ err = perf_config(perf_record_config, rec);
+ if (err)
+ return err;

argc = parse_options(argc, argv, record_options, record_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 06cc759a4597..dbd7fa028861 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -847,7 +847,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
if (ret < 0)
return ret;

- perf_config(report__config, &report);
+ ret = perf_config(report__config, &report);
+ if (ret)
+ return ret;

argc = parse_options(argc, argv, options, report_usage, 0);
if (argc) {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3df4178ba378..20aef9815cd8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1216,7 +1216,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
if (top.evlist == NULL)
return -ENOMEM;

- perf_config(perf_top_config, &top);
+ status = perf_config(perf_top_config, &top);
+ if (status)
+ return status;

argc = parse_options(argc, argv, options, top_usage, 0);
if (argc)
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 34bcf90f8871..6d5479e03e0d 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -90,11 +90,12 @@ static int pager_command_config(const char *var, const char *value, void *data)
/* returns 0 for "no pager", 1 for "use pager", and -1 for "not specified" */
int check_pager_config(const char *cmd)
{
+ int err;
struct pager_config c;
c.cmd = cmd;
c.val = -1;
- perf_config(pager_command_config, &c);
- return c.val;
+ err = perf_config(pager_command_config, &c);
+ return err ?: c.val;
}

static int browser_command_config(const char *var, const char *value, void *data)
@@ -113,11 +114,12 @@ static int browser_command_config(const char *var, const char *value, void *data
*/
static int check_browser_config(const char *cmd)
{
+ int err;
struct pager_config c;
c.cmd = cmd;
c.val = -1;
- perf_config(browser_command_config, &c);
- return c.val;
+ err = perf_config(browser_command_config, &c);
+ return err ?: c.val;
}

static void commit_pager_choice(void)
@@ -509,6 +511,7 @@ static void cache_line_size(int *cacheline_sizep)

int main(int argc, const char **argv)
{
+ int err;
const char *cmd;
char sbuf[STRERR_BUFSIZE];
int value;
@@ -534,7 +537,9 @@ int main(int argc, const char **argv)
srandom(time(NULL));

perf_config__init();
- perf_config(perf_default_config, NULL);
+ err = perf_config(perf_default_config, NULL);
+ if (err)
+ return err;
set_buildid_dir(NULL);

/* get debugfs/tracefs mount point from /proc/mounts */
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 42922512c1c6..e16db30dfe40 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -48,6 +48,8 @@ static int parse_callchain_mode(const char *value)
callchain_param.mode = CHAIN_FOLDED;
return 0;
}
+
+ pr_err("Invalid callchain mode: %s\n", value);
return -1;
}

@@ -63,6 +65,8 @@ static int parse_callchain_order(const char *value)
callchain_param.order_set = true;
return 0;
}
+
+ pr_err("Invalid callchain order: %s\n", value);
return -1;
}

@@ -80,6 +84,8 @@ static int parse_callchain_sort_key(const char *value)
callchain_param.branch_callstack = 1;
return 0;
}
+
+ pr_err("Invalid callchain sort key: %s\n", value);
return -1;
}

@@ -97,6 +103,8 @@ static int parse_callchain_value(const char *value)
callchain_param.value = CCVAL_COUNT;
return 0;
}
+
+ pr_err("Invalid callchain config key: %s\n", value);
return -1;
}

@@ -210,13 +218,17 @@ int perf_callchain_config(const char *var, const char *value)
return parse_callchain_sort_key(value);
if (!strcmp(var, "threshold")) {
callchain_param.min_percent = strtod(value, &endptr);
- if (value == endptr)
+ if (value == endptr) {
+ pr_err("Invalid callchain threshold: %s\n", value);
return -1;
+ }
}
if (!strcmp(var, "print-limit")) {
callchain_param.print_limit = strtod(value, &endptr);
- if (value == endptr)
+ if (value == endptr) {
+ pr_err("Invalid callchain print limit: %s\n", value);
return -1;
+ }
}

return 0;
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 615e8b4a693b..0c7d5a4975cd 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -386,8 +386,10 @@ static int perf_buildid_config(const char *var, const char *value)
if (!strcmp(var, "buildid.dir")) {
const char *dir = perf_config_dirname(var, value);

- if (!dir)
+ if (!dir) {
+ pr_err("Invalid buildid directory!\n");
return -1;
+ }
+ if (err)
+ return err;

/* CTF writer */
if (ctf_writer__init(cw, path))
return -1;

+ err = -1;
/* perf.data session */
session = perf_session__new(&file, 0, &c.tool);
if (!session)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6770a9645609..cff2e9041b15 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:10 AM2/1/17
to
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit e2cf00c257f5bbc071b489b1dfbeaa30b6f12da6:

Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-01-26 16:20:59 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170201

for you to fetch changes up to b05d1093987a78695766b71a2d723aa65b5c25c5:

perf ftrace: Add ftrace.tracer config option (2017-01-31 16:20:09 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

. Allow configuring a 'perf ftrace' default --tracer (Taeung Song)

Infrastructure:

. Sync tools/arch/{powerpc,arm}/include/uapi/asm/kvm.h and
tools/arch/x86/include/asm/cpufeatures.h (Ingo Molnar)

. Add BPF program file system pinning APIs and respective
'perf test' entry (Joe Stringer)

. Make tools tree support 'make -s' (Josh Poimboeuf)

. Reference count maps in callchains, fixing SEGFAULT when
referencing maps after it is freed (Krister Johansen)

. Create for_each_event trace points iterator (Taeung Song)

. Do not consider an error not to have any perfconfig file
(Arnaldo Carvalho de Melo

. Propagate perf_config() errors (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (2):
perf config: Do not consider an error not to have any perfconfig file
perf tools: Propagate perf_config() errors

Ingo Molnar (1):
tools headers: Sync {tools/,}arch/powerpc/include/uapi/asm/kvm.h, {tools/,}arch/x86/include/asm/cpufeatures.h and {tools/,}arch/arm/include/uapi/asm/kvm.h

Joe Stringer (6):
tools lib bpf: Add BPF program pinning APIs
tools lib bpf: Add bpf_map__pin()
tools lib bpf: Add bpf_object__pin()
tools perf util: Make rm_rf(path) argument const
tools lib api fs: Add bpf_fs filesystem detector
perf test: Add libbpf pinning test

Josh Poimboeuf (1):
tools build: Add tools tree support for 'make -s'

Krister Johansen (1):
perf callchain: Reference count maps

Taeung Song (3):
perf ftrace: Remove needless code setting default tracer
perf tools: Create for_each_event macro for tracepoints iteration
perf ftrace: Add ftrace.tracer config option

Makefile | 6 +-
tools/arch/arm/include/uapi/asm/kvm.h | 9 ++
tools/arch/powerpc/include/uapi/asm/kvm.h | 5 +
tools/arch/x86/include/asm/cpufeatures.h | 11 ++
tools/build/Makefile.build | 10 ++
tools/lib/api/fs/fs.c | 16 +++
tools/lib/api/fs/fs.h | 1 +
tools/lib/bpf/libbpf.c | 195 ++++++++++++++++++++++++++++++
tools/lib/bpf/libbpf.h | 5 +
tools/perf/builtin-ftrace.c | 30 ++++-
tools/perf/builtin-help.c | 6 +-
tools/perf/builtin-kmem.c | 8 +-
tools/perf/builtin-record.c | 4 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-top.c | 4 +-
tools/perf/perf.c | 15 ++-
tools/perf/tests/bpf.c | 42 ++++++-
tools/perf/util/callchain.c | 27 ++++-
tools/perf/util/callchain.h | 6 +
tools/perf/util/config.c | 23 ++--
tools/perf/util/data-convert-bt.c | 7 +-
tools/perf/util/hist.c | 11 +-
tools/perf/util/intel-pt.c | 4 +-
tools/perf/util/llvm-utils.c | 4 +-
tools/perf/util/trace-event-info.c | 38 +++---
tools/perf/util/util.c | 2 +-
tools/perf/util/util.h | 2 +-
tools/scripts/Makefile.include | 12 +-
28 files changed, 446 insertions(+), 61 deletions(-)

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.

Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips: Ok
12 debian:experimental-x-mips64: Ok
13 debian:experimental-x-mipsel: Ok
14 fedora:20: Ok
15 fedora:21: Ok
16 fedora:22: Ok
17 fedora:23: Ok
18 fedora:24: Ok
19 fedora:24-x-ARC-uClibc: Ok
20 fedora:25: Ok
21 fedora:rawhide: Ok
22 mageia:5: Ok
23 opensuse:13.2: Ok
24 opensuse:42.1: Ok
25 opensuse:tumbleweed: Ok
26 ubuntu:12.04.5: Ok
27 ubuntu:14.04.4-x-linaro-arm64: Ok
28 ubuntu:15.10: Ok
29 ubuntu:16.04: Ok
30 ubuntu:16.04-x-arm: Ok
31 ubuntu:16.04-x-arm64: Ok
32 ubuntu:16.04-x-powerpc: Ok
33 ubuntu:16.04-x-powerpc64: Ok
34 ubuntu:16.04-x-powerpc64el: Ok
35 ubuntu:16.04-x-s390: Ok
36 ubuntu:16.10: Ok
#

# uname -a
Linux jouet 4.9.6-200.fc25.x86_64 #1 SMP Thu Jan 26 10:17:45 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip

$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_debug_O: make DEBUG=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_doc_O: make doc
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_slang_O: make NO_SLANG=1
make_clean_all_O: make clean all
make_no_libperl_O: make NO_LIBPERL=1
make_install_prefix_O: make install prefix=/tmp/krava
make_help_O: make help
make_no_auxtrace_O: make NO_AUXTRACE=1
make_tags_O: make tags
make_perf_o_O: make perf.o
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_static_O: make LDFLAGS=-static
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_newt_O: make NO_NEWT=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_pure_O: make
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_gtk2_O: make NO_GTK2=1
make_util_map_o_O: make util/map.o
make_no_libaudit_O: make NO_LIBAUDIT=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_bin_O: make install-bin
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libbpf_O: make NO_LIBBPF=1
make_install_O: make install
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 7:30:11 AM2/1/17
to
From: Josh Poimboeuf <jpoi...@redhat.com>

When doing a kernel build with 'make -s', everything is silenced except
the objtool build. That's because the tools tree support for silent
builds is some combination of missing and broken.

Three changes are needed to fix it:

- Makefile: propagate '-s' to the sub-make's MAKEFLAGS variable so the
tools Makefiles can see it.

- tools/scripts/Makefile.include: fix the tools Makefiles' ability to
recognize '-s'. The MAKE_VERSION and MAKEFLAGS checks are copied from
the top-level Makefile. This silences the "DESCEND objtool" message.

- tools/build/Makefile.build: add support to the tools Build files for
recognizing '-s'. Again the MAKE_VERSION and MAKEFLAGS checks are
copied from the top-level Makefile. This silences all the object
compile/link messages.

Reported-and-Tested-by: Peter Zijlstra <pet...@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Michal Marek <mma...@suse.com>
Link: http://lkml.kernel.org/r/e8967562ef640c3ae9a76da4ae0f4e47...@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
Makefile | 6 ++++--
tools/build/Makefile.build | 10 ++++++++++
tools/scripts/Makefile.include | 12 +++++++++++-
3 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 098840012b9b..80341dfb2c50 100644
--- a/Makefile
+++ b/Makefile
@@ -87,10 +87,12 @@ endif
ifneq ($(filter 4.%,$(MAKE_VERSION)),) # make-4
ifneq ($(filter %s ,$(firstword x$(MAKEFLAGS))),)
quiet=silent_
+ tools_silent=s
endif
else # make-3.8x
ifneq ($(filter s% -s%,$(MAKEFLAGS)),)
quiet=silent_
+ tools_silent=-s
endif
endif

@@ -1607,11 +1609,11 @@ image_name:
# Clear a bunch of variables before executing the submake
tools/: FORCE
$(Q)mkdir -p $(objtree)/tools
- $(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(filter --j% -j,$(MAKEFLAGS))" O=$(shell cd $(objtree) && /bin/pwd) subdir=tools -C $(src)/tools/
+ $(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(shell cd $(objtree) && /bin/pwd) subdir=tools -C $(src)/tools/

tools/%: FORCE
$(Q)mkdir -p $(objtree)/tools
- $(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(filter --j% -j,$(MAKEFLAGS))" O=$(shell cd $(objtree) && /bin/pwd) subdir=tools -C $(src)/tools/ $*
+ $(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(shell cd $(objtree) && /bin/pwd) subdir=tools -C $(src)/tools/ $*

# Single targets
# ---------------------------------------------------------------------------
diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
index 99c0ccd2f176..e279a71c650d 100644
--- a/tools/build/Makefile.build
+++ b/tools/build/Makefile.build
@@ -19,6 +19,16 @@ else
Q=@
endif

+ifneq ($(filter 4.%,$(MAKE_VERSION)),) # make-4
+ifneq ($(filter %s ,$(firstword x$(MAKEFLAGS))),)
+ quiet=silent_
+endif
+else # make-3.8x
+ifneq ($(filter s% -s%,$(MAKEFLAGS)),)
+ quiet=silent_
+endif
+endif
+
build-dir := $(srctree)/tools/build

# Define $(fixdep) for dep-cmd function
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 8abbef164b4e..19edc1a7a232 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -46,6 +46,16 @@ else
NO_SUBDIR = :
endif

+ifneq ($(filter 4.%,$(MAKE_VERSION)),) # make-4
+ifneq ($(filter %s ,$(firstword x$(MAKEFLAGS))),)
+ silent=1
+endif
+else # make-3.8x
+ifneq ($(filter s% -s%,$(MAKEFLAGS)),)
+ silent=1
+endif
+endif
+
#
# Define a callable command for descending to a new directory
#
@@ -58,7 +68,7 @@ descend = \
QUIET_SUBDIR0 = +$(MAKE) $(COMMAND_O) -C # space to separate -C and subdir
QUIET_SUBDIR1 =

-ifneq ($(findstring $(MAKEFLAGS),s),s)
+ifneq ($(silent),1)
ifneq ($(V),1)
QUIET_CC = @echo ' CC '$@;
QUIET_CC_FPIC = @echo ' CC FPIC '$@;
--
2.9.3

Ingo Molnar

unread,
Feb 1, 2017, 9:40:05 AM2/1/17
to
Pulled, thanks a lot Arnaldo!

Ingo

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 1, 2017, 9:40:09 AM2/1/17
to
Commit-ID: ecc4c5614b24ee8ebaa35b834b5768dc9302ee3e
Gitweb: http://git.kernel.org/tip/ecc4c5614b24ee8ebaa35b834b5768dc9302ee3e
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 24 Jan 2017 13:44:10 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 27 Jan 2017 12:23:33 -0300

perf tools: Propagate perf_config() errors

Cc: Jiri Olsa <jo...@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-help.c | 6 ++++--
tools/perf/builtin-kmem.c | 8 ++++++--
tools/perf/builtin-record.c | 4 +++-
tools/perf/builtin-report.c | 4 +++-
tools/perf/builtin-top.c | 4 +++-
tools/perf/perf.c | 15 ++++++++++-----
tools/perf/util/callchain.c | 16 ++++++++++++++--
tools/perf/util/config.c | 9 +++++----
tools/perf/util/data-convert-bt.c | 7 +++++--
tools/perf/util/hist.c | 4 +++-
tools/perf/util/intel-pt.c | 4 +++-
tools/perf/util/llvm-utils.c | 4 +++-
12 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 93da24a..aed0d84 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -447,11 +447,13 @@ int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
const char *alias;
- int rc = 0;
+ int rc;

load_command_list("perf-", &main_cmds, &other_cmds);

- perf_config(perf_help_config, &help_format);
+ rc = perf_config(perf_help_config, &help_format);
+ if (rc)
+ return rc;

argc = parse_options_subcommand(argc, argv, builtin_help_options,
builtin_help_subcommands, builtin_help_usage, 0);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 915869e..29f4751 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1920,10 +1920,12 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
NULL
};
struct perf_session *session;
- int ret = -1;
const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";
+ int ret = perf_config(kmem_config, NULL);
+
+ if (ret)
+ return ret;

- perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);

@@ -1948,6 +1950,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ ret = -1;
+
if (kmem_slab) {
if (!perf_evlist__find_tracepoint_by_name(session->evlist,
"kmem:kmalloc")) {
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 33a9eaa..ffac8ca 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1670,7 +1670,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist == NULL)
return -ENOMEM;

- perf_config(perf_record_config, rec);
+ err = perf_config(perf_record_config, rec);
+ if (err)
+ return err;

argc = parse_options(argc, argv, record_options, record_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 06cc759..dbd7fa0 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -847,7 +847,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
if (ret < 0)
return ret;

- perf_config(report__config, &report);
+ ret = perf_config(report__config, &report);
+ if (ret)
+ return ret;

argc = parse_options(argc, argv, options, report_usage, 0);
if (argc) {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3df4178..20aef98 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1216,7 +1216,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
if (top.evlist == NULL)
return -ENOMEM;

- perf_config(perf_top_config, &top);
+ status = perf_config(perf_top_config, &top);
+ if (status)
+ return status;

argc = parse_options(argc, argv, options, top_usage, 0);
if (argc)
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 34bcf90..6d5479e 100644
index 4292251..e16db30 100644
index 615e8b4..0c7d5a4 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -386,8 +386,10 @@ static int perf_buildid_config(const char *var, const char *value)
if (!strcmp(var, "buildid.dir")) {
const char *dir = perf_config_dirname(var, value);

- if (!dir)
+ if (!dir) {
+ pr_err("Invalid buildid directory!\n");
return -1;
+ }
strncpy(buildid_dir, dir, MAXPATHLEN-1);
buildid_dir[MAXPATHLEN-1] = '\0';
}
@@ -405,10 +407,9 @@ static int perf_default_core_config(const char *var __maybe_unused,
static int perf_ui_config(const char *var, const char *value)
{
/* Add other config variables here. */
- if (!strcmp(var, "ui.show-headers")) {
+ if (!strcmp(var, "ui.show-headers"))
symbol_conf.show_hist_headers = perf_config_bool(var, value);
- return 0;
- }
+
return 0;
}

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 7123f4d..4e6cbc9 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1473,7 +1473,7 @@ int bt_convert__perf2ctf(const char *input, const char *path,
},
};
struct ctf_writer *cw = &c.writer;
- int err = -1;
+ int err;

if (opts->all) {
c.tool.comm = process_comm_event;
@@ -1481,12 +1481,15 @@ int bt_convert__perf2ctf(const char *input, const char *path,
c.tool.fork = process_fork_event;
}

- perf_config(convert__config, &c);
+ err = perf_config(convert__config, &c);
+ if (err)
+ return err;

/* CTF writer */
if (ctf_writer__init(cw, path))
return -1;

+ err = -1;
/* perf.data session */
session = perf_session__new(&file, 0, &c.tool);
if (!session)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6770a96..cff2e90 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2439,8 +2439,10 @@ int parse_filter_percentage(const struct option *opt __maybe_unused,
symbol_conf.filter_relative = true;
else if (!strcmp(arg, "absolute"))
symbol_conf.filter_relative = false;
- else
+ else {
+ pr_debug("Invalud percentage: %s\n", arg);
return -1;
+ }

return 0;
}
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 85d5eeb..da20cd5 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2159,7 +2159,9 @@ int intel_pt_process_auxtrace_info(union perf_event *event,

addr_filters__init(&pt->filts);

- perf_config(intel_pt_perf_config, pt);
+ err = perf_config(intel_pt_perf_config, pt);
+ if (err)
+ goto err_free;

err = auxtrace_queues__init(&pt->queues);
if (err)
diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
index b23ff44..8243564 100644

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 8:50:06 PM2/9/17
to
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 53e74a112ce5c1c9b6a6923bdd6612133625d579:

Merge tag 'perf-urgent-for-mingo-4.10-20170203' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2017-02-03 20:42:30 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170209

for you to fetch changes up to 7ea6856d6f5629d742edc23b8b76e6263371ef45:

perf intel-pt: Use __fallthrough (2017-02-09 16:32:03 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Add support for parsing Intel uncore vendor event files and add uncore
vendor events for the Intel server processors (Haswell, Broadwell,
IvyBridge), Xeon Phi (Knights Landing) and Broadwell DE (Andi Kleen)

- Support --symfs in 'perf probe' (Uwe Kleine-König)

- Add support for generating bpf prologue on the aarch64 architecture (He Kuang)

- Show proper hint when SDT event not yet in place via 'perf probe' (Ravi Bangoria)

- Take into account symfs setting when reading file build ID (Victor Kamensky)

Infrastructure:

- Map gcc7's '__attribute__ ((fallthrough))', that warns when code
associated to case blocks in switches continue into the next case entry,
to '__falltrough' and use it where warned by gcc, tested on Fedora Rawhide
(Arnaldo Carvalho de Melo)

- Fix buffer sizes used with snprintf that could lead to truncation,
another warning introduced in gcc7 (Arnaldo Carvalho de Melo)

- Robustify do_generate_dynamic_list_file in libtraceevent (David Carrillo-Cisneros)

- Use zfree() in more places (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Andi Kleen (11):
perf jevents: Parse eventcode as number
perf jevents: Add support for parsing uncore json files
perf pmu: Support per pmu json aliases
perf pmu: Support event aliases for non cpu// pmus
perf list: Add debug support for outputing alias string
perf vendor events intel: Add uncore events for Haswell Server processor
perf vendor events intel: Add uncore events for Broadwell Server
perf vendor events intel: Add uncore events for IvyBridge Server
perf vendor events intel: Add uncore events for Sandy Bridge Server
perf vendor events intel: Add uncore events for Xeon Phi (Knights Landing)
perf vendor events intel: Add uncore events for Broadwell DE

Arnaldo Carvalho de Melo (11):
Merge remote-tracking branch 'tip/perf/urgent' into perf/core
perf tools: Fix include of linux/mman.h
tools include: Add a __fallthrough statement
tools string: Use __fallthrough in perf_atoll()
tools strfilter: Use __fallthrough
perf top: Use __fallthrough
perf thread_map: Correctly size buffer used with dirent->dt_name
perf header: Fix handling of PERF_EVENT_UPDATE__SCALE
perf bench numa: Avoid possible truncation when using snprintf()
perf tests: Avoid possible truncation with dirent->d_name + snprintf
perf intel-pt: Use __fallthrough

David Carrillo-Cisneros (1):
tools lib traceevent: Robustify do_generate_dynamic_list_file

He Kuang (2):
perf tools arm64: Add support for generating bpf prologue
perf bpf: Add missing newline in debug messages

Mickaël Salaün (1):
tools lib bpf: Add missing header to the library

Ravi Bangoria (1):
perf sdt: Show proper hint when event not yet in place via 'perf probe'

Taeung Song (4):
perf tools: Only increase index if perf_evsel__new_idx() succeeds
perf tools: Add missing check for failure in a zalloc() call
perf tools: Use zfree() instead of ad hoc equivalent
perf tools: Use zfree() to avoid keeping dangling pointers

Uwe Kleine-König (1):
perf probe: Add option --symfs

Victor Kamensky (1):
perf symbols: Take into account symfs setting when reading file build ID

Makefile | 6 +-
arch/x86/events/Makefile | 13 +-
arch/x86/events/amd/Makefile | 7 +
arch/x86/events/amd/uncore.c | 204 ++++++++-----
arch/x86/events/intel/pt.c | 6 +
include/linux/kprobes.h | 30 +-
include/linux/perf_event.h | 2 +-
kernel/events/core.c | 223 ++++++++------
kernel/extable.c | 9 +-
kernel/kprobes.c | 73 +++--
tools/arch/arm/include/uapi/asm/kvm.h | 9 +
tools/arch/powerpc/include/uapi/asm/kvm.h | 5 +
tools/arch/x86/include/asm/cpufeatures.h | 11 +
tools/arch/x86/include/uapi/asm/vmx.h | 5 +
tools/build/Makefile.build | 10 +
tools/include/linux/compiler.h | 9 +
tools/lib/api/fs/fs.c | 16 +
tools/lib/api/fs/fs.h | 1 +
tools/lib/api/fs/tracing_path.c | 32 +-
tools/lib/bpf/bpf.h | 1 +
tools/lib/bpf/libbpf.c | 264 +++++++++++++++--
tools/lib/bpf/libbpf.h | 19 +-
tools/lib/subcmd/parse-options.h | 19 +-
tools/lib/traceevent/Makefile | 14 +-
tools/perf/Build | 5 +-
tools/perf/Documentation/perf-c2c.txt | 2 +-
tools/perf/Documentation/perf-ftrace.txt | 36 +++
tools/perf/Documentation/perf-kallsyms.txt | 24 ++
tools/perf/Documentation/perf-record.txt | 14 +-
tools/perf/Documentation/perf-sched.txt | 2 +
tools/perf/Documentation/perf-script.txt | 4 +-
tools/perf/Documentation/perf-trace.txt | 8 +-
tools/perf/Makefile.config | 6 +-
tools/perf/Makefile.perf | 1 +
tools/perf/arch/arm64/Makefile | 1 +
tools/perf/arch/arm64/include/dwarf-regs-table.h | 12 +-
tools/perf/arch/arm64/util/dwarf-regs.c | 15 +-
tools/perf/bench/numa.c | 6 +-
tools/perf/builtin-c2c.c | 3 +-
tools/perf/builtin-ftrace.c | 265 +++++++++++++++++
tools/perf/builtin-help.c | 8 +-
tools/perf/builtin-kallsyms.c | 67 +++++
tools/perf/builtin-kmem.c | 8 +-
tools/perf/builtin-list.c | 3 +
tools/perf/builtin-probe.c | 2 +
tools/perf/builtin-record.c | 158 +++++++++-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-sched.c | 130 ++++++++-
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-top.c | 6 +-
tools/perf/builtin-trace.c | 120 ++++++--
tools/perf/builtin.h | 2 +
tools/perf/command-list.txt | 2 +
tools/perf/perf.c | 20 +-
.../arch/x86/broadwellde/uncore-cache.json | 317 ++++++++++++++++++++
.../arch/x86/broadwellde/uncore-memory.json | 83 ++++++
.../arch/x86/broadwellde/uncore-power.json | 84 ++++++
.../arch/x86/broadwellx/uncore-cache.json | 317 ++++++++++++++++++++
.../arch/x86/broadwellx/uncore-interconnect.json | 28 ++
.../arch/x86/broadwellx/uncore-memory.json | 83 ++++++
.../arch/x86/broadwellx/uncore-power.json | 84 ++++++
.../pmu-events/arch/x86/haswellx/uncore-cache.json | 317 ++++++++++++++++++++
.../arch/x86/haswellx/uncore-interconnect.json | 28 ++
.../arch/x86/haswellx/uncore-memory.json | 83 ++++++
.../pmu-events/arch/x86/haswellx/uncore-power.json | 84 ++++++
.../pmu-events/arch/x86/ivytown/uncore-cache.json | 322 +++++++++++++++++++++
.../arch/x86/ivytown/uncore-interconnect.json | 46 +++
.../pmu-events/arch/x86/ivytown/uncore-memory.json | 75 +++++
.../pmu-events/arch/x86/ivytown/uncore-power.json | 249 ++++++++++++++++
.../pmu-events/arch/x86/jaketown/uncore-cache.json | 209 +++++++++++++
.../arch/x86/jaketown/uncore-interconnect.json | 46 +++
.../arch/x86/jaketown/uncore-memory.json | 79 +++++
.../pmu-events/arch/x86/jaketown/uncore-power.json | 248 ++++++++++++++++
.../arch/x86/knightslanding/uncore-memory.json | 42 +++
tools/perf/pmu-events/jevents.c | 84 +++++-
tools/perf/pmu-events/jevents.h | 4 +-
tools/perf/pmu-events/pmu-events.h | 3 +
tools/perf/tests/Build | 1 +
tools/perf/tests/bpf.c | 42 ++-
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/llvm.c | 2 +-
tools/perf/tests/parse-events.c | 8 +-
tools/perf/tests/tests.h | 1 +
tools/perf/tests/unit_number__scnprintf.c | 37 +++
tools/perf/ui/browsers/hists.c | 60 ++--
tools/perf/ui/setup.c | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/bpf-loader.c | 4 +-
tools/perf/util/callchain.c | 16 +-
tools/perf/util/config.c | 23 +-
tools/perf/util/data-convert-bt.c | 7 +-
tools/perf/util/dso.c | 48 ++-
tools/perf/util/event.c | 2 +-
tools/perf/util/evlist.c | 12 +-
tools/perf/util/evlist.h | 2 +
tools/perf/util/header.c | 7 +-
tools/perf/util/hist.c | 4 +-
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 +
.../util/intel-pt-decoder/intel-pt-pkt-decoder.c | 2 +
tools/perf/util/intel-pt.c | 4 +-
tools/perf/util/llvm-utils.c | 4 +-
tools/perf/util/machine.c | 19 ++
tools/perf/util/machine.h | 1 +
tools/perf/util/parse-events.c | 69 +++--
tools/perf/util/parse-events.y | 35 ++-
tools/perf/util/pmu.c | 109 ++++---
tools/perf/util/pmu.h | 1 +
tools/perf/util/probe-event.c | 11 +-
.../perf/util/scripting-engines/trace-event-perl.c | 6 +-
tools/perf/util/session.c | 2 +-
tools/perf/util/strfilter.c | 1 +
tools/perf/util/string.c | 2 +
tools/perf/util/symbol.c | 6 +-
tools/perf/util/thread_map.c | 2 +-
tools/perf/util/trace-event-info.c | 71 +++--
tools/perf/util/trace-event-parse.c | 17 ++
tools/perf/util/trace-event-read.c | 77 ++++-
tools/perf/util/trace-event.h | 1 +
tools/perf/util/unwind-libunwind-local.c | 54 +++-
tools/perf/util/util.c | 15 +-
tools/perf/util/util.h | 3 +-
tools/scripts/Makefile.include | 12 +-
122 files changed, 5101 insertions(+), 550 deletions(-)
create mode 100644 arch/x86/events/amd/Makefile
create mode 100644 tools/perf/Documentation/perf-ftrace.txt
create mode 100644 tools/perf/Documentation/perf-kallsyms.txt
create mode 100644 tools/perf/builtin-ftrace.c
create mode 100644 tools/perf/builtin-kallsyms.c
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json
create mode 100644 tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
create mode 100644 tools/perf/tests/unit_number__scnprintf.c
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libperl_O: make NO_LIBPERL=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_perf_o_O: make perf.o
make_pure_O: make
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libunwind_O: make NO_LIBUNWIND=1
make_static_O: make LDFLAGS=-static
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_help_O: make help
make_doc_O: make doc
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_tags_O: make tags
make_debug_O: make DEBUG=1
make_install_bin_O: make install-bin
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_util_map_o_O: make util/map.o
make_no_libelf_O: make NO_LIBELF=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_gtk2_O: make NO_GTK2=1
make_clean_all_O: make clean all
make_no_newt_O: make NO_NEWT=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_O: make install
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_slang_O: make NO_SLANG=1

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

This is not a full uncore event list, but a short list of useful and
understandable metrics.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../pmu-events/arch/x86/haswellx/uncore-cache.json | 317 +++++++++++++++++++++
.../arch/x86/haswellx/uncore-interconnect.json | 28 ++
.../arch/x86/haswellx/uncore-memory.json | 83 ++++++
.../pmu-events/arch/x86/haswellx/uncore-power.json | 84 ++++++
4 files changed, 512 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json

diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
new file mode 100644
index 000000000000..076459c51d4e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
@@ -0,0 +1,317 @@
+[
+ {
+ "BriefDescription": "Uncore cache clock ticks. Derived from unc_c_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_C_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "All LLC Misses (code+ data rd + data wr - including demand and prefetch). Derived from unc_c_llc_lookup.any",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x34",
+ "EventName": "UNC_C_LLC_LOOKUP.ANY",
+ "Filter": "filter_state=0x1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x11",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "M line evictions from LLC (writebacks to memory). Derived from unc_c_llc_victims.m_state",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x37",
+ "EventName": "UNC_C_LLC_VICTIMS.M_STATE",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - demand and prefetch data reads - excludes LLC prefetches. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.UNCACHEABLE",
+ "Filter": "filter_opc=0x187",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "MMIO reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.MMIO_READ",
+ "Filter": "filter_opc=0x187,filter_nc=1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "MMIO writes. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.MMIO_WRITE",
+ "Filter": "filter_opc=0x18f,filter_nc=1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for RFO. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.RFO_LLC_PREFETCH",
+ "Filter": "filter_opc=0x190",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for code reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.CODE_LLC_PREFETCH",
+ "Filter": "filter_opc=0x191",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for data reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_LLC_PREFETCH",
+ "Filter": "filter_opc=0x192",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for PCIe read current. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "ItoM write misses (as part of fast string memcpy stores) + PCIe full line writes. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe write misses (full cache line). Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_NON_SNOOP_WRITE",
+ "Filter": "filter_opc=0x1c8,filter_tid=0x3e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe writes (partial cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_PARTIAL_WRITE",
+ "Filter": "filter_opc=0x180,filter_tid=0x3e",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "L2 demand and L2 prefetch code references to LLC. Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.CODE_LLC_PREFETCH",
+ "Filter": "filter_opc=0x181",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (full cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_FULL",
+ "Filter": "filter_opc=0x18c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (partial cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
+ "Filter": "filter_opc=0x18d",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe read current. Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe write references (full cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_WRITE",
+ "Filter": "filter_opc=0x1c8,filter_tid=0x3e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy counter for LLC data reads (demand and L2 prefetch). Derived from unc_c_tor_occupancy.miss_opcode",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.LLC_DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "read requests to home agent. Derived from unc_h_requests.reads",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "read requests to local home agent. Derived from unc_h_requests.reads_local",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS_LOCAL",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "read requests to remote home agent. Derived from unc_h_requests.reads_remote",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS_REMOTE",
+ "PerPkg": "1",
+ "UMask": "0x2",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to home agent. Derived from unc_h_requests.writes",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES",
+ "PerPkg": "1",
+ "UMask": "0xC",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to local home agent. Derived from unc_h_requests.writes_local",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES_LOCAL",
+ "PerPkg": "1",
+ "UMask": "0x4",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to remote home agent. Derived from unc_h_requests.writes_remote",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES_REMOTE",
+ "PerPkg": "1",
+ "UMask": "0x8",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Conflict requests (requests for same address from multiple agents simultaneously). Derived from unc_h_snoop_resp.rspcnflct",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPCNFLCT",
+ "PerPkg": "1",
+ "UMask": "0x40",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "M line forwarded from remote cache along with writeback to memory. Derived from unc_h_snoop_resp.rsp_fwd_wb",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSP_FWD_WB",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x20",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "M line forwarded from remote cache with no writeback to memory. Derived from unc_h_snoop_resp.rspifwd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPIFWD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x4",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Shared line response from remote cache. Derived from unc_h_snoop_resp.rsps",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPS",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x2",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Shared line forwarded from remote cache. Derived from unc_h_snoop_resp.rspsfwd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPSFWD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x8",
+ "Unit": "HA"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json
new file mode 100644
index 000000000000..39387f7909b2
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json
@@ -0,0 +1,28 @@
+[
+ {
+ "BriefDescription": "QPI clock ticks. Derived from unc_q_clockticks",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x14",
+ "EventName": "UNC_Q_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of data flits transmitted . Derived from unc_q_txl_flits_g0.data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x2",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of non data (control) flits transmitted . Derived from unc_q_txl_flits_g0.non_data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.NON_DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x4",
+ "Unit": "QPI LL"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json
new file mode 100644
index 000000000000..d17dc235f734
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json
@@ -0,0 +1,83 @@
+[
+ {
+ "BriefDescription": "read requests to memory controller. Derived from unc_m_cas_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0xC",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Memory controller clock ticks. Derived from unc_m_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_M_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x85",
+ "EventName": "UNC_M_POWER_CHANNEL_PPD",
+ "MetricExpr": "(UNC_M_POWER_CHANNEL_PPD / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles all ranks are in critical thermal throttle. Derived from unc_m_power_critical_throttle_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x86",
+ "EventName": "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES",
+ "MetricExpr": "(UNC_M_POWER_CRITICAL_THROTTLE_CYCLES / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles Memory is in self refresh power mode. Derived from unc_m_power_self_refresh",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x43",
+ "EventName": "UNC_M_POWER_SELF_REFRESH",
+ "MetricExpr": "(UNC_M_POWER_SELF_REFRESH / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Pre-charges due to page misses. Derived from unc_m_pre_count.page_miss",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.PAGE_MISS",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Pre-charge for reads. Derived from unc_m_pre_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.RD",
+ "PerPkg": "1",
+ "UMask": "0x4",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Pre-charge for writes. Derived from unc_m_pre_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.WR",
+ "PerPkg": "1",
+ "UMask": "0x8",
+ "Unit": "iMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json
new file mode 100644
index 000000000000..b44d43088bbb
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json
@@ -0,0 +1,84 @@
+[
+ {
+ "BriefDescription": "PCU clock ticks. Use to get percentages of PCU cycles events. Derived from unc_p_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_P_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C0 and C1. Derived from unc_p_power_state_occupancy.cores_c0",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
+ "Filter": "occ_sel=1",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C0 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C3. Derived from unc_p_power_state_occupancy.cores_c3",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
+ "Filter": "occ_sel=2",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C3 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C6 and C7. Derived from unc_p_power_state_occupancy.cores_c6",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
+ "Filter": "occ_sel=3",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C6 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "External Prochot. Derived from unc_p_prochot_external_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xA",
+ "EventName": "UNC_P_PROCHOT_EXTERNAL_CYCLES",
+ "MetricExpr": "(UNC_P_PROCHOT_EXTERNAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Thermal Strongest Upper Limit Cycles. Derived from unc_p_freq_max_limit_thermal_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "OS Strongest Upper Limit Cycles. Derived from unc_p_freq_max_os_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x6",
+ "EventName": "UNC_P_FREQ_MAX_OS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_OS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Power Strongest Upper Limit Cycles. Derived from unc_p_freq_max_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x5",
+ "EventName": "UNC_P_FREQ_MAX_POWER_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_POWER_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Cycles spent changing Frequency. Derived from unc_p_freq_trans_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x74",
+ "EventName": "UNC_P_FREQ_TRANS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_TRANS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ }
+]
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

Add metrics for memory and MCDRAM. Minimal metrics only for now.
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../arch/x86/knightslanding/uncore-memory.json | 42 ++++++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json

diff --git a/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json b/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
new file mode 100644
index 000000000000..e3bcd86c4f56
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
@@ -0,0 +1,42 @@
+[
+ {
+ "BriefDescription": "ddr bandwidth read (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x03",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "imc"
+ },
+ {
+ "BriefDescription": "ddr bandwidth write (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x03",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x02",
+ "Unit": "imc"
+ },
+ {
+ "BriefDescription": "mcdram bandwidth read (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x01",
+ "EventName": "UNC_E_RPQ_INSERTS",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "edc_eclk"
+ },
+ {
+ "BriefDescription": "mcdram bandwidth write (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x02",
+ "EventName": "UNC_E_WPQ_INSERTS",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "edc_eclk"
+ }
+]
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: David Carrillo-Cisneros <dav...@google.com>

The dynamic-list-file used to export dynamic symbols introduced in

commit e3d09ec8126f ("tools lib traceevent: Export dynamic symbols
used by traceevent plugins")

is generated without any sort of error checking.

I experienced problems due to an old version of nm (v 0.158) that outputs
in a format distinct from the assumed by the script.

Robustify the built of dynamic symbol list by enforcing that the second
column of $(NM) -u <files> is either "U" (Undefined), "W" or "w" (undefined
weak), which are the possible outputs from non-ancient $(NM) versions.
Print an error if format is unexpected.

v2: Accept "W" and "w" symbol options.

Signed-off-by: David Carrillo-Cisneros <dav...@google.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Michal Marek <mma...@suse.com>
Cc: Paul Turner <p...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <era...@google.com>
Cc: Uwe Kleine-König <u.klein...@pengutronix.de>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20170208052840....@google.com
[ Use STRING1 = STRING1 instead of == to make this work on Ubuntu systems ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/traceevent/Makefile | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 2616c66e10c1..47076b15eebe 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -257,10 +257,16 @@ define do_install_plugins
endef

define do_generate_dynamic_list_file
- (echo '{'; \
- $(NM) -u -D $1 | awk 'NF>1 {print "\t"$$2";"}' | sort -u; \
- echo '};'; \
- ) > $2
+ symbol_type=`$(NM) -u -D $1 | awk 'NF>1 {print $$1}' | \
+ xargs echo "U W w" | tr ' ' '\n' | sort -u | xargs echo`;\
+ if [ "$$symbol_type" = "U W w" ];then \
+ (echo '{'; \
+ $(NM) -u -D $1 | awk 'NF>1 {print "\t"$$2";"}' | sort -u;\
+ echo '};'; \
+ ) > $2; \
+ else \
+ (echo Either missing one of [$1] or bad version of $(NM)) 1>&2;\
+ fi
endef

install_lib: all_cmd install_plugins
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

util/strfilter.c: In function 'strfilter_node__sprint':
util/strfilter.c:270:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (len < 0)
^
util/strfilter.c:272:2: note: here
case '!':
^~~~
cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-z2dpywg7u8...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/strfilter.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/strfilter.c b/tools/perf/util/strfilter.c
index bcae659b6546..efb53772e0ec 100644
--- a/tools/perf/util/strfilter.c
+++ b/tools/perf/util/strfilter.c
@@ -269,6 +269,7 @@ static int strfilter_node__sprint(struct strfilter_node *node, char *buf)
len = strfilter_node__sprint_pt(node->l, buf);
if (len < 0)
return len;
+ __fallthrough;
case '!':
if (buf) {
*(buf + len++) = *node->p;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

CC /tmp/build/perf/builtin-top.o
builtin-top.c: In function 'display_thread':
builtin-top.c:644:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (errno == EINTR)
^
builtin-top.c:647:3: note: here
default:
^~~~~~~

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-lmcfnnyx9i...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-top.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 20aef9815cd8..d90927f31ff6 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -643,7 +643,7 @@ static void *display_thread(void *arg)
case -1:
if (errno == EINTR)
continue;
- /* Fall trhu */
+ __fallthrough;
default:
c = getc(stdin);
tcsetattr(0, TCSAFLUSH, &save);
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ophb30v9a...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/string.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index d8dfaf64b32e..bddca519dd58 100644
--- a/tools/perf/util/string.c
+++ b/tools/perf/util/string.c
@@ -21,6 +21,8 @@ s64 perf_atoll(const char *str)
case 'b': case 'B':
if (*p)
goto out_err;
+
+ __fallthrough;
case '\0':
return length;
default:
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Uwe Kleine-König <u.klein...@pengutronix.de>

perf probe makes use of debug symbols, so add --symfs as the other
commands have.

Signed-off-by: Uwe Kleine-König <u.klein...@pengutronix.de>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: ker...@pengutronix.de
Link: http://lkml.kernel.org/r/1469094512-13440-2-git-s...@pengutronix.de
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-probe.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index f87996b0cb29..1fcebc31a508 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -552,6 +552,8 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_BOOLEAN(0, "cache", &probe_conf.cache, "Manipulate probe cache"),
+ OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
+ "Look for files with symbols relative to this directory"),
OPT_END()
};
int ret;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:10:06 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

The size of dirent->dt_name is NAME_MAX + 1, but the size for the 'path'
buffer is hard coded at 256, which may truncate it because we also
prepend "/proc/", so that all that into account and thank gcc 7 for this
warning:

/git/linux/tools/perf/util/thread_map.c: In function 'thread_map__new_by_uid':
/git/linux/tools/perf/util/thread_map.c:119:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 250 [-Werror=format-truncation=]
snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
^~
In file included from /usr/include/stdio.h:939:0,
from /git/linux/tools/perf/util/thread_map.c:5:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 7 and 262 bytes into a destination of size 256
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-csy0r8zrvz...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/thread_map.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index f9eab200fd75..7c3fcc538a70 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -93,7 +93,7 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
{
DIR *proc;
int max_threads = 32, items, i;
- char path[256];
+ char path[NAME_MAX + 1 + 6];
struct dirent *dirent, **namelist = NULL;
struct thread_map *threads = thread_map__alloc(max_threads);

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:40:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

The code for handling pmu aliases without specifying the PMU hardcoded
only supported the cpu PMU.

This patch extends it to work for all PMUs. We always duplicate the
event for all PMUs that have an matching alias. This allows to
automatically expand an alias for all instances of a PMU (so for example
you can monitor all cache boxes with a single event)

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017012802034...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.c | 46 ++++++++++++++++++++++++------------------
tools/perf/util/parse-events.y | 32 ++++++++++++++++++++++-------
2 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3c876b8ba4de..6dbcba7f0969 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1504,35 +1504,41 @@ static void perf_pmu__parse_init(void)
struct perf_pmu_alias *alias;
int len = 0;

- pmu = perf_pmu__find("cpu");
- if ((pmu == NULL) || list_empty(&pmu->aliases)) {
+ pmu = NULL;
+ while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+ list_for_each_entry(alias, &pmu->aliases, list) {
+ if (strchr(alias->name, '-'))
+ len++;
+ len++;
+ }
+ }
+
+ if (len == 0) {
perf_pmu_events_list_num = -1;
return;
}
- list_for_each_entry(alias, &pmu->aliases, list) {
- if (strchr(alias->name, '-'))
- len++;
- len++;
- }
perf_pmu_events_list = malloc(sizeof(struct perf_pmu_event_symbol) * len);
if (!perf_pmu_events_list)
return;
perf_pmu_events_list_num = len;

len = 0;
- list_for_each_entry(alias, &pmu->aliases, list) {
- struct perf_pmu_event_symbol *p = perf_pmu_events_list + len;
- char *tmp = strchr(alias->name, '-');
-
- if (tmp != NULL) {
- SET_SYMBOL(strndup(alias->name, tmp - alias->name),
- PMU_EVENT_SYMBOL_PREFIX);
- p++;
- SET_SYMBOL(strdup(++tmp), PMU_EVENT_SYMBOL_SUFFIX);
- len += 2;
- } else {
- SET_SYMBOL(strdup(alias->name), PMU_EVENT_SYMBOL);
- len++;
+ pmu = NULL;
+ while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+ list_for_each_entry(alias, &pmu->aliases, list) {
+ struct perf_pmu_event_symbol *p = perf_pmu_events_list + len;
+ char *tmp = strchr(alias->name, '-');
+
+ if (tmp != NULL) {
+ SET_SYMBOL(strndup(alias->name, tmp - alias->name),
+ PMU_EVENT_SYMBOL_PREFIX);
+ p++;
+ SET_SYMBOL(strdup(++tmp), PMU_EVENT_SYMBOL_SUFFIX);
+ len += 2;
+ } else {
+ SET_SYMBOL(strdup(alias->name), PMU_EVENT_SYMBOL);
+ len++;
+ }
}
}
qsort(perf_pmu_events_list, len,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 879115f93edc..f3b5ec901600 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -12,6 +12,7 @@
#include <linux/list.h>
#include <linux/types.h>
#include "util.h"
+#include "pmu.h"
#include "parse-events.h"
#include "parse-events-bison.h"

@@ -236,15 +237,32 @@ PE_KERNEL_PMU_EVENT sep_dc
struct list_head *head;
struct parse_events_term *term;
struct list_head *list;
+ struct perf_pmu *pmu = NULL;
+ int ok = 0;

- ALLOC_LIST(head);
- ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
- $1, 1, &@1, NULL));
- list_add_tail(&term->list, head);
-
+ /* Add it for all PMUs that support the alias */
ALLOC_LIST(list);
- ABORT_ON(parse_events_add_pmu(data, list, "cpu", head));
- parse_events_terms__delete(head);
+ while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+ struct perf_pmu_alias *alias;
+
+ list_for_each_entry(alias, &pmu->aliases, list) {
+ if (!strcasecmp(alias->name, $1)) {
+ ALLOC_LIST(head);
+ ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
+ $1, 1, &@1, NULL));
+ list_add_tail(&term->list, head);
+
+ if (!parse_events_add_pmu(data, list,
+ pmu->name, head)) {
+ ok++;
+ }
+
+ parse_events_terms__delete(head);
+ }
+ }
+ }
+ if (!ok)
+ YYABORT;
$$ = list;
}
|
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:40:05 PM2/9/17
to
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But, the flow is little different for SDT events.

Probe points for SDT event needs to be created using 'perf probe' before
recording it using 'perf record'.

Perf shows misleading hint when a user tries to record SDT event without
first creating a probe point. Show proper hint there.

Before patch:

$ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
\___ unknown tracepoint

Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
...

After patch:

$ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
\___ unknown tracepoint

Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint: SDT event cannot be directly recorded on.
Please first use 'perf probe sdt_glib:idle__add' before recording it.
...

$ perf probe sdt_glib:idle__add
Added new event:
sdt_glib:idle__add (on %idle__add in /usr/lib64/libglib-2.0.so.0.5000.2)

You can now use it in all perf tools, such as:

perf record -e sdt_glib:idle__add -aR sleep 1

$ perf record -a -e sdt_glib:idle__add
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.175 MB perf.data ]

Suggested-and-Acked-by: Ingo Molnar <mi...@redhat.com>
Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexis Berlemont <alexis.b...@gmail.com>
Cc: Madhavan Srinivasan <ma...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170203102642.172...@linux.vnet.ibm.com
[ s/Please use/Please first use/ and break the Hint line in two ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/lib/api/fs/tracing_path.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 251b7c342a87..3e606b9c443e 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -86,9 +86,13 @@ void put_tracing_file(char *file)
free(file);
}

-static int strerror_open(int err, char *buf, size_t size, const char *filename)
+int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
+ const char *sys, const char *name)
{
char sbuf[128];
+ char filename[PATH_MAX];
+
+ snprintf(filename, PATH_MAX, "%s/%s", sys, name ?: "*");

switch (err) {
case ENOENT:
@@ -99,10 +103,19 @@ static int strerror_open(int err, char *buf, size_t size, const char *filename)
* - jirka
*/
if (debugfs__configured() || tracefs__configured()) {
- snprintf(buf, size,
- "Error:\tFile %s/%s not found.\n"
- "Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
- tracing_events_path, filename);
+ /* sdt markers */
+ if (!strncmp(filename, "sdt_", 4)) {
+ snprintf(buf, size,
+ "Error:\tFile %s/%s not found.\n"
+ "Hint:\tSDT event cannot be directly recorded on.\n"
+ "\tPlease first use 'perf probe %s:%s' before recording it.\n",
+ tracing_events_path, filename, sys, name);
+ } else {
+ snprintf(buf, size,
+ "Error:\tFile %s/%s not found.\n"
+ "Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
+ tracing_events_path, filename);
+ }
break;
}
snprintf(buf, size, "%s",
@@ -125,12 +138,3 @@ static int strerror_open(int err, char *buf, size_t size, const char *filename)

return 0;
}
-
-int tracing_path__strerror_open_tp(int err, char *buf, size_t size, const char *sys, const char *name)
-{
- char path[PATH_MAX];
-
- snprintf(path, PATH_MAX, "%s/%s", sys, name ?: "*");
-
- return strerror_open(err, buf, size, path);
-}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:40:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

Handle the "Unit" field, which is needed to find the right PMU for an
event. We call it "pmu" and convert it to the perf pmu name with an
uncore prefix.

Handle the "ExtSel" field, which just extends the event mask with an
additional bit.

Handle the "Filter" field which adds parameters to the main event
to configure filtering.

Handle the "Unit" field which declares the unit the values should be
scaled too (similar to what the kernel exports)

Set up the "perpkg" field for uncore events so that perf knows they are
per package (similar to what the kernel exports)

Then output the fields into the pmu-events data structures which are
compiled into perf.

Filter out zero fields, except for the event itself.

v2: Fix compilation. Add uncore_ prefix at pre-processing time.
Move eventcode change to separate patch.

v3: Remove extra __maybe_unused

v4: dont duplicate aliases for cpu pmu events
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/pmu-events/jevents.c | 74 +++++++++++++++++++++++++++++++++++---
tools/perf/pmu-events/jevents.h | 4 ++-
tools/perf/pmu-events/pmu-events.h | 3 ++
tools/perf/util/pmu.c | 30 +++++++++++-----
4 files changed, 98 insertions(+), 13 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 551377bf8428..eed09346a72a 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -188,6 +188,27 @@ static struct msrmap *lookup_msr(char *map, jsmntok_t *val)
return NULL;
}

+static struct map {
+ const char *json;
+ const char *perf;
+} unit_to_pmu[] = {
+ { "CBO", "uncore_cbox" },
+ { "QPI LL", "uncore_qpi" },
+ { "SBO", "uncore_sbox" },
+ {}
+};
+
+static const char *field_to_perf(struct map *table, char *map, jsmntok_t *val)
+{
+ int i;
+
+ for (i = 0; table[i].json; i++) {
+ if (json_streq(map, val, table[i].json))
+ return table[i].perf;
+ }
+ return NULL;
+}
+
#define EXPECT(e, t, m) do { if (!(e)) { \
jsmntok_t *loc = (t); \
if (!(t)->start && (t) > tokens) \
@@ -269,7 +290,8 @@ static void print_events_table_prefix(FILE *fp, const char *tblname)
}

static int print_events_table_entry(void *data, char *name, char *event,
- char *desc, char *long_desc)
+ char *desc, char *long_desc,
+ char *pmu, char *unit, char *perpkg)
{
struct perf_entry_data *pd = data;
FILE *outfp = pd->outfp;
@@ -287,7 +309,12 @@ static int print_events_table_entry(void *data, char *name, char *event,
fprintf(outfp, "\t.topic = \"%s\",\n", topic);
if (long_desc && long_desc[0])
fprintf(outfp, "\t.long_desc = \"%s\",\n", long_desc);
-
+ if (pmu)
+ fprintf(outfp, "\t.pmu = \"%s\",\n", pmu);
+ if (unit)
+ fprintf(outfp, "\t.unit = \"%s\",\n", unit);
+ if (perpkg)
+ fprintf(outfp, "\t.perpkg = \"%s\",\n", perpkg);
fprintf(outfp, "},\n");

return 0;
@@ -334,7 +361,8 @@ static char *real_event(const char *name, char *event)
/* Call func with each event in the json file */
int json_events(const char *fn,
int (*func)(void *data, char *name, char *event, char *desc,
- char *long_desc),
+ char *long_desc,
+ char *pmu, char *unit, char *perpkg),
void *data)
{
int err = -EIO;
@@ -356,6 +384,10 @@ int json_events(const char *fn,
char *event = NULL, *desc = NULL, *name = NULL;
char *long_desc = NULL;
char *extra_desc = NULL;
+ char *pmu = NULL;
+ char *filter = NULL;
+ char *perpkg = NULL;
+ char *unit = NULL;
unsigned long long eventcode = 0;
struct msrmap *msr = NULL;
jsmntok_t *msrval = NULL;
@@ -382,6 +414,11 @@ int json_events(const char *fn,
addfield(map, &code, "", "", val);
eventcode |= strtoul(code, NULL, 0);
free(code);
+ } else if (json_streq(map, field, "ExtSel")) {
+ char *code = NULL;
+ addfield(map, &code, "", "", val);
+ eventcode |= strtoul(code, NULL, 0) << 21;
+ free(code);
} else if (json_streq(map, field, "EventName")) {
addfield(map, &name, "", "", val);
} else if (json_streq(map, field, "BriefDescription")) {
@@ -405,6 +442,28 @@ int json_events(const char *fn,
addfield(map, &extra_desc, ". ",
" Supports address when precise",
NULL);
+ } else if (json_streq(map, field, "Unit")) {
+ const char *ppmu;
+ char *s;
+
+ ppmu = field_to_perf(unit_to_pmu, map, val);
+ if (ppmu) {
+ pmu = strdup(ppmu);
+ } else {
+ if (!pmu)
+ pmu = strdup("uncore_");
+ addfield(map, &pmu, "", "", val);
+ for (s = pmu; *s; s++)
+ *s = tolower(*s);
+ }
+ addfield(map, &desc, ". ", "Unit: ", NULL);
+ addfield(map, &desc, "", pmu, NULL);
+ } else if (json_streq(map, field, "Filter")) {
+ addfield(map, &filter, "", "", val);
+ } else if (json_streq(map, field, "ScaleUnit")) {
+ addfield(map, &unit, "", "", val);
+ } else if (json_streq(map, field, "PerPkg")) {
+ addfield(map, &perpkg, "", "", val);
}
/* ignore unknown fields */
}
@@ -422,16 +481,23 @@ int json_events(const char *fn,
addfield(map, &desc, " ", extra_desc, NULL);
if (long_desc && extra_desc)
addfield(map, &long_desc, " ", extra_desc, NULL);
+ if (filter)
+ addfield(map, &event, ",", filter, NULL);
if (msr != NULL)
addfield(map, &event, ",", msr->pname, msrval);
fixname(name);

- err = func(data, name, real_event(name, event), desc, long_desc);
+ err = func(data, name, real_event(name, event), desc, long_desc,
+ pmu, unit, perpkg);
free(event);
free(desc);
free(name);
free(long_desc);
free(extra_desc);
+ free(pmu);
+ free(filter);
+ free(perpkg);
+ free(unit);
if (err)
break;
tok += j;
diff --git a/tools/perf/pmu-events/jevents.h b/tools/perf/pmu-events/jevents.h
index b0eb2744b498..71e13de31092 100644
--- a/tools/perf/pmu-events/jevents.h
+++ b/tools/perf/pmu-events/jevents.h
@@ -3,7 +3,9 @@

int json_events(const char *fn,
int (*func)(void *data, char *name, char *event, char *desc,
- char *long_desc),
+ char *long_desc,
+ char *pmu,
+ char *unit, char *perpkg),
void *data);
char *get_cpu_str(void);

diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index 2eaef595d8a0..c669a3cdb9f0 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -10,6 +10,9 @@ struct pmu_event {
const char *desc;
const char *topic;
const char *long_desc;
+ const char *pmu;
+ const char *unit;
+ const char *perpkg;
};

/*
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 78b16100567d..6dc3cc050105 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -229,11 +229,13 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias *alias,
}

static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
- char *desc, char *val, char *long_desc,
- char *topic)
+ char *desc, char *val,
+ char *long_desc, char *topic,
+ char *unit, char *perpkg)
{
struct perf_pmu_alias *alias;
int ret;
+ int num;

alias = malloc(sizeof(*alias));
if (!alias)
@@ -267,7 +269,12 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
alias->long_desc = long_desc ? strdup(long_desc) :
desc ? strdup(desc) : NULL;
alias->topic = topic ? strdup(topic) : NULL;
-
+ if (unit) {
+ if (convert_scale(unit, &unit, &alias->scale) < 0)
+ return -1;
+ snprintf(alias->unit, sizeof(alias->unit), "%s", unit);
+ }
+ alias->per_pkg = perpkg && sscanf(perpkg, "%d", &num) == 1 && num == 1;
list_add_tail(&alias->list, list);

return 0;
@@ -284,7 +291,8 @@ static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FI

buf[ret] = 0;

- return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL);
+ return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL, NULL,
+ NULL);
}

static inline bool pmu_alias_info_file(char *name)
@@ -504,7 +512,7 @@ char * __weak get_cpuid_str(void)
* to the current running CPU. Then, add all PMU events from that table
* as aliases.
*/
-static void pmu_add_cpu_aliases(struct list_head *head)
+static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
{
int i;
struct pmu_events_map *map;
@@ -540,14 +548,21 @@ static void pmu_add_cpu_aliases(struct list_head *head)
*/
i = 0;
while (1) {
+ const char *pname;
+
pe = &map->table[i++];
if (!pe->name)
break;

+ pname = pe->pmu ? pe->pmu : "cpu";
+ if (strncmp(pname, name, strlen(pname)))
+ continue;
+
/* need type casts to override 'const' */
__perf_pmu__new_alias(head, NULL, (char *)pe->name,
(char *)pe->desc, (char *)pe->event,
- (char *)pe->long_desc, (char *)pe->topic);
+ (char *)pe->long_desc, (char *)pe->topic,
+ (char *)pe->unit, (char *)pe->perpkg);
}

out:
@@ -578,8 +593,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
if (pmu_aliases(name, &aliases))
return NULL;

- if (!strcmp(name, "cpu"))
- pmu_add_cpu_aliases(&aliases);
+ pmu_add_cpu_aliases(&aliases, name);

if (pmu_type(name, &type))
return NULL;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:40:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

For debugging and testing it is useful to see the converted alias
string. Add support to perf stat/record and perf list to print the alias
conversion. The text string is saved in the alias structure. For perf
stat/record it is folded into the normal -v. For perf list -v was taken,
so we use --debug.

Before:

% perf list
...
cache:
l1d.replacement
[L1D data line replacements]
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]

After

% perf list --debug
...
cache:
l1d.replacement
[L1D data line replacements]
cpu/umask=0x1,period=2000003,event=0x51/
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
cpu/umask=0x2,period=2000003,cmask=1,event=0x48/

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017012802034...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-list.c | 3 +++
tools/perf/util/parse-events.y | 3 +++
tools/perf/util/pmu.c | 8 ++++++++
tools/perf/util/pmu.h | 1 +
4 files changed, 15 insertions(+)

diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index ba9322ff858b..3b9d98b5feef 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -14,6 +14,7 @@
#include "util/parse-events.h"
#include "util/cache.h"
#include "util/pmu.h"
+#include "util/debug.h"
#include <subcmd/parse-options.h>

static bool desc_flag = true;
@@ -29,6 +30,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
"Print extra event descriptions. --no-desc to not print."),
OPT_BOOLEAN('v', "long-desc", &long_desc_flag,
"Print longer event descriptions."),
+ OPT_INCR(0, "debug", &verbose,
+ "Enable debugging output"),
OPT_END()
};
const char * const list_usage[] = {
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index f3b5ec901600..3a5196380609 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -13,6 +13,7 @@
#include <linux/types.h>
#include "util.h"
#include "pmu.h"
+#include "debug.h"
#include "parse-events.h"
#include "parse-events-bison.h"

@@ -254,6 +255,8 @@ PE_KERNEL_PMU_EVENT sep_dc

if (!parse_events_add_pmu(data, list,
pmu->name, head)) {
+ pr_debug("%s -> %s/%s/\n", $1,
+ pmu->name, alias->str);
ok++;
}

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8e9d00fd418e..82a654dec666 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -275,6 +275,8 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
snprintf(alias->unit, sizeof(alias->unit), "%s", unit);
}
alias->per_pkg = perpkg && sscanf(perpkg, "%d", &num) == 1 && num == 1;
+ alias->str = strdup(val);
+
list_add_tail(&alias->list, list);

return 0;
@@ -1087,6 +1089,8 @@ struct sevent {
char *name;
char *desc;
char *topic;
+ char *str;
+ char *pmu;
};

static int cmp_sevent(const void *a, const void *b)
@@ -1183,6 +1187,8 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
aliases[j].desc = long_desc ? alias->long_desc :
alias->desc;
aliases[j].topic = alias->topic;
+ aliases[j].str = alias->str;
+ aliases[j].pmu = pmu->name;
j++;
}
if (pmu->selectable &&
@@ -1217,6 +1223,8 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
printf("%*s", 8, "[");
wordwrap(aliases[j].desc, 8, columns, 0);
printf("]\n");
+ if (verbose)
+ printf("%*s%s/%s/\n", 8, "", aliases[j].pmu, aliases[j].str);
} else
printf(" %-50s [Kernel PMU event]\n", aliases[j].name);
printed++;
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 25712034c815..00852ddc7741 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -43,6 +43,7 @@ struct perf_pmu_alias {
char *desc;
char *long_desc;
char *topic;
+ char *str;
struct list_head terms; /* HEAD struct parse_events_term -> list */
struct list_head list; /* ELEM */
char unit[UNIT_MAX_LEN+1];
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:40:06 PM2/9/17
to
From: He Kuang <hek...@huawei.com>

Since HAVE_KPROBES can be enabled in arm64, this patch introduces
regs_query_register_offset() to convert register name to offset for
arm64, so the BPF prologue feature is ready to use.

Signed-off-by: He Kuang <hek...@huawei.com>
Reviewed-by: Will Deacon <will....@arm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Bintian Wang <bintia...@huawei.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: linux-ar...@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412....@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/arm64/Makefile | 1 +
tools/perf/arch/arm64/util/dwarf-regs.c | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 18b13518d8d8..eebe1ec9d2ee 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -2,3 +2,4 @@ ifndef NO_DWARF
PERF_HAVE_DWARF_REGS := 1
endif
PERF_HAVE_JITDUMP := 1
+PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
diff --git a/tools/perf/arch/arm64/util/dwarf-regs.c b/tools/perf/arch/arm64/util/dwarf-regs.c
index d49efeb8172e..068b6189157b 100644
--- a/tools/perf/arch/arm64/util/dwarf-regs.c
+++ b/tools/perf/arch/arm64/util/dwarf-regs.c
@@ -10,17 +10,20 @@

#include <stddef.h>
#include <dwarf-regs.h>
+#include <linux/ptrace.h> /* for struct user_pt_regs */
+#include "util.h"

struct pt_regs_dwarfnum {
const char *name;
unsigned int dwarfnum;
};

-#define STR(s) #s
#define REG_DWARFNUM_NAME(r, num) {.name = r, .dwarfnum = num}
#define GPR_DWARFNUM_NAME(num) \
{.name = STR(%x##num), .dwarfnum = num}
#define REG_DWARFNUM_END {.name = NULL, .dwarfnum = 0}
+#define DWARFNUM2OFFSET(index) \
+ (index * sizeof((struct user_pt_regs *)0)->regs[0])

/*
* Reference:
@@ -78,3 +81,13 @@ const char *get_arch_regstr(unsigned int n)
return roff->name;
return NULL;
}
+
+int regs_query_register_offset(const char *name)
+{
+ const struct pt_regs_dwarfnum *roff;
+
+ for (roff = regdwarfnum_table; roff->name != NULL; roff++)
+ if (!strcmp(roff->name, name))
+ return DWARFNUM2OFFSET(roff->dwarfnum);
+ return -EINVAL;
+}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:04 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

For cases where implicit fall through case labels are intended,
to let us inform that to gcc >= 7:

CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~

So we introduce:

#define __fallthrough __attribute__ ((fallthrough))

And use it in such cases.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: William Cohen <wco...@redhat.com>
Link: http://lkml.kernel.org/n/tip-qnpig0xfop...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/linux/compiler.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index e33fc1df3935..d94179f94caa 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -126,4 +126,13 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
#define WRITE_ONCE(x, val) \
({ union { typeof(x) __val; char __c[1]; } __u = { .__val = (val) }; __write_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })

+
+#ifndef __fallthrough
+# if defined(__GNUC__) && __GNUC__ >= 7
+# define __fallthrough __attribute__ ((fallthrough))
+# else
+# define __fallthrough
+# endif
+#endif
+
#endif /* _TOOLS_LINUX_COMPILER_H */
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Taeung Song <treeze...@gmail.com>

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-2-git-s...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6dbcba7f0969..2720600b13a9 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -310,10 +310,11 @@ __add_event(struct list_head *list, int *idx,

event_attr_init(attr);

- evsel = perf_evsel__new_idx(attr, (*idx)++);
+ evsel = perf_evsel__new_idx(attr, *idx);
if (!evsel)
return NULL;

+ (*idx)++;
evsel->cpus = cpu_map__get(cpus);
evsel->own_cpus = cpu_map__get(cpus);

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

The next patch needs to modify event code. Previously eventcode was just
passed through as a string. Now parse it as a number.

v2: Don't special case 0

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017012802034...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/pmu-events/jevents.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 41611d7f9873..551377bf8428 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -135,7 +135,6 @@ static struct field {
const char *field;
const char *kernel;
} fields[] = {
- { "EventCode", "event=" },
{ "UMask", "umask=" },
{ "CounterMask", "cmask=" },
{ "Invert", "inv=" },
@@ -343,6 +342,7 @@ int json_events(const char *fn,
jsmntok_t *tokens, *tok;
int i, j, len;
char *map;
+ char buf[128];

if (!fn)
return -ENOENT;
@@ -356,6 +356,7 @@ int json_events(const char *fn,
char *event = NULL, *desc = NULL, *name = NULL;
char *long_desc = NULL;
char *extra_desc = NULL;
+ unsigned long long eventcode = 0;
struct msrmap *msr = NULL;
jsmntok_t *msrval = NULL;
jsmntok_t *precise = NULL;
@@ -376,6 +377,11 @@ int json_events(const char *fn,
nz = !json_streq(map, val, "0");
if (match_field(map, field, nz, &event, val)) {
/* ok */
+ } else if (json_streq(map, field, "EventCode")) {
+ char *code = NULL;
+ addfield(map, &code, "", "", val);
+ eventcode |= strtoul(code, NULL, 0);
+ free(code);
} else if (json_streq(map, field, "EventName")) {
addfield(map, &name, "", "", val);
} else if (json_streq(map, field, "BriefDescription")) {
@@ -410,6 +416,8 @@ int json_events(const char *fn,
addfield(map, &extra_desc, " ",
"(Precise event)", NULL);
}
+ snprintf(buf, sizeof buf, "event=%#llx", eventcode);
+ addfield(map, &event, ",", buf, NULL);
if (desc && extra_desc)
addfield(map, &desc, " ", extra_desc, NULL);
if (long_desc && extra_desc)
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

This is not a full uncore event list, but a short list of useful
and understandable metrics.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../pmu-events/arch/x86/jaketown/uncore-cache.json | 209 +++++++++++++++++
.../arch/x86/jaketown/uncore-interconnect.json | 46 ++++
.../arch/x86/jaketown/uncore-memory.json | 79 +++++++
.../pmu-events/arch/x86/jaketown/uncore-power.json | 248 +++++++++++++++++++++
4 files changed, 582 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json

diff --git a/tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json b/tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json
new file mode 100644
index 000000000000..2f23cf0129e7
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json
@@ -0,0 +1,209 @@
+[
+ {
+ "BriefDescription": "Uncore cache clock ticks. Derived from unc_c_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_C_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "All LLC Misses (code+ data rd + data wr - including demand and prefetch). Derived from unc_c_llc_lookup.any",
+ "Counter": "0,1",
+ "EventCode": "0x34",
+ "EventName": "UNC_C_LLC_LOOKUP.ANY",
+ "Filter": "filter_state=0x1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x11",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "M line evictions from LLC (writebacks to memory). Derived from unc_c_llc_victims.m_state",
+ "Counter": "0,1",
+ "EventCode": "0x37",
+ "EventName": "UNC_C_LLC_VICTIMS.M_STATE",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - demand and prefetch data reads - excludes LLC prefetches. Derived from unc_c_tor_inserts.miss_opcode.demand",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - Uncacheable reads. Derived from unc_c_tor_inserts.miss_opcode.uncacheable",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.UNCACHEABLE",
+ "Filter": "filter_opc=0x187",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe allocating writes that miss LLC - DDIO misses. Derived from unc_c_tor_inserts.miss_opcode.ddio_miss",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_WRITE",
+ "Filter": "filter_opc=0x19c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for ItoM writes (as part of fast string memcpy stores). Derived from unc_c_tor_inserts.miss_opcode.itom_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.ITOM_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (full cache line). Derived from unc_c_tor_inserts.opcode.streaming_full",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_FULL",
+ "Filter": "filter_opc=0x18c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (partial cache line). Derived from unc_c_tor_inserts.opcode.streaming_partial",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
+ "Filter": "filter_opc=0x18d",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Partial PCIe reads. Derived from unc_c_tor_inserts.opcode.pcie_partial",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_PARTIAL_READ",
+ "Filter": "filter_opc=0x195",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe allocating writes that hit in LLC (DDIO hits). Derived from unc_c_tor_inserts.opcode.ddio_hit",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_WRITE",
+ "Filter": "filter_opc=0x19c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe read current. Derived from unc_c_tor_inserts.opcode.pcie_read_current",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "ItoM write hits (as part of fast string memcpy stores). Derived from unc_c_tor_inserts.opcode.itom_write_hit",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.ITOM_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop reads. Derived from unc_c_tor_inserts.opcode.pcie_read",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_READ",
+ "Filter": "filter_opc=0x1e4",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop writes (partial). Derived from unc_c_tor_inserts.opcode.pcie_partial_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_PARTIAL_WRITE",
+ "Filter": "filter_opc=0x1e5",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop writes (full line). Derived from unc_c_tor_inserts.opcode.pcie_full_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_WRITE",
+ "Filter": "filter_opc=0x1e6",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy counter for all LLC misses; we divide this by UNC_C_CLOCKTICKS to get average Q depth. Derived from unc_c_tor_occupancy.miss_all",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.MISS_ALL",
+ "Filter": "filter_opc=0x182",
+ "MetricExpr": "(UNC_C_TOR_OCCUPANCY.MISS_ALL / UNC_C_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "UMask": "0xa",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy counter for LLC data reads (demand and L2 prefetch). Derived from unc_c_tor_occupancy.miss_opcode.llc_data_read",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.LLC_DATA_READ",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "read requests to home agent. Derived from unc_h_requests.reads",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to home agent. Derived from unc_h_requests.writes",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES",
+ "PerPkg": "1",
+ "UMask": "0xc",
+ "Unit": "HA"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json
new file mode 100644
index 000000000000..63351876eb57
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json
@@ -0,0 +1,46 @@
+[
+ {
+ "BriefDescription": "QPI clock ticks. Used to get percentages of QPI cycles events. Derived from unc_q_clockticks",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x14",
+ "EventName": "UNC_Q_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Cycles where receiving QPI link is in half-width mode. Derived from unc_q_rxl0p_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x10",
+ "EventName": "UNC_Q_RxL0P_POWER_CYCLES",
+ "MetricExpr": "(UNC_Q_RxL0P_POWER_CYCLES / UNC_Q_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Cycles where transmitting QPI link is in half-width mode. Derived from unc_q_txl0p_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_Q_TxL0P_POWER_CYCLES",
+ "MetricExpr": "(UNC_Q_TxL0P_POWER_CYCLES / UNC_Q_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of data flits transmitted . Derived from unc_q_txl_flits_g0.data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x2",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of non data (control) flits transmitted . Derived from unc_q_txl_flits_g0.non_data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.NON_DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x4",
+ "Unit": "QPI LL"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json b/tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json
new file mode 100644
index 000000000000..e2cf6daa7b37
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json
@@ -0,0 +1,79 @@
+[
+ {
+ "BriefDescription": "Memory page activates. Derived from unc_m_act_count",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_M_ACT_COUNT",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "read requests to memory controller. Derived from unc_m_cas_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "UMask": "0xc",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Memory controller clock ticks. Used to get percentages of memory controller cycles events. Derived from unc_m_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_M_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles where DRAM ranks are in power down (CKE) mode. Derived from unc_m_power_channel_ppd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x85",
+ "EventName": "UNC_M_POWER_CHANNEL_PPD",
+ "MetricExpr": "(UNC_M_POWER_CHANNEL_PPD / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles all ranks are in critical thermal throttle. Derived from unc_m_power_critical_throttle_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x86",
+ "EventName": "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES",
+ "MetricExpr": "(UNC_M_POWER_CRITICAL_THROTTLE_CYCLES / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Cycles Memory is in self refresh power mode. Derived from unc_m_power_self_refresh",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x43",
+ "EventName": "UNC_M_POWER_SELF_REFRESH",
+ "MetricExpr": "(UNC_M_POWER_SELF_REFRESH / UNC_M_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Memory page conflicts. Derived from unc_m_pre_count.page_miss",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.PAGE_MISS",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Occupancy counter for memory read queue. Derived from unc_m_rpq_occupancy",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_M_RPQ_OCCUPANCY",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json b/tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json
new file mode 100644
index 000000000000..bbe36d547386
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json
@@ -0,0 +1,248 @@
+[
+ {
+ "BriefDescription": "PCU clock ticks. Use to get percentages of PCU cycles events. Derived from unc_p_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_P_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band0=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_BAND0_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND0_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band1=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_BAND1_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND1_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band2=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_BAND2_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND2_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band3=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_BAND3_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND3_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned a frequency greater than or equal to the frequency that is configured in the filter. (filter_band0=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_BAND0_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND0_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transistioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band1=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_BAND1_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND1_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band2=XXX with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_BAND2_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND2_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band3=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_BAND3_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND3_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in C0. It can be used by itself to get the average number of cores in C0, with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details. Derived from unc_p_power_state_occupancy.cores_c0",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
+ "Filter": "occ_sel=1",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C0 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in C3. It can be used by itself to get the average number of cores in C0, with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details. Derived from unc_p_power_state_occupancy.cores_c3",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
+ "Filter": "occ_sel=2",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C3 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in C6. It can be used by itself to get the average number of cores in C0, with threshholding to generate histograms, or with other PCU events . Derived from unc_p_power_state_occupancy.cores_c6",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
+ "Filter": "occ_sel=3",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C6 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that we are in external PROCHOT mode. This mode is triggered when a sensor off the die determines that something off-die (like DRAM) is too hot and must throttle to avoid damaging the chip. Derived from unc_p_prochot_external_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xa",
+ "EventName": "UNC_P_PROCHOT_EXTERNAL_CYCLES",
+ "MetricExpr": "(UNC_P_PROCHOT_EXTERNAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when temperature is the upper limit on frequency. Derived from unc_p_freq_max_limit_thermal_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when the OS is the upper limit on frequency. Derived from unc_p_freq_max_os_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x6",
+ "EventName": "UNC_P_FREQ_MAX_OS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_OS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when power is the upper limit on frequency. Derived from unc_p_freq_max_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x5",
+ "EventName": "UNC_P_FREQ_MAX_POWER_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_POWER_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when current is the upper limit on frequency. Derived from unc_p_freq_max_current_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x7",
+ "EventName": "UNC_P_FREQ_MAX_CURRENT_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_CURRENT_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Cycles spent changing Frequency. Derived from unc_p_freq_trans_cycles",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_P_FREQ_TRANS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_TRANS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 1.2Ghz. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_GE_1200MHZ_CYCLES",
+ "Filter": "filter_band0=1200",
+ "MetricExpr": "(UNC_P_FREQ_GE_1200MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 2Ghz. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_GE_2000MHZ_CYCLES",
+ "Filter": "filter_band1=2000",
+ "MetricExpr": "(UNC_P_FREQ_GE_2000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 3Ghz. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_GE_3000MHZ_CYCLES",
+ "Filter": "filter_band2=3000",
+ "MetricExpr": "(UNC_P_FREQ_GE_3000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 4Ghz. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_GE_4000MHZ_CYCLES",
+ "Filter": "filter_band3=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_4000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned to a frequency greater than or equal to 1.2Ghz. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_GE_1200MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band0=1200",
+ "MetricExpr": "(UNC_P_FREQ_GE_1200MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned to a frequency greater than or equal to 2Ghz. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_GE_2000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band1=2000",
+ "MetricExpr": "(UNC_P_FREQ_GE_2000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to 3Ghz. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_GE_3000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band2=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_3000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to 4Ghz. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_GE_4000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band3=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_4000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

Add support for registering json aliases per PMU. Any alias with an unit
matching the prefix is registered to the PMU. Uncore has multiple
instances of most units, so all these aliases get registered for each
individual PMU (this is important later to run the event on every
instance of the PMU).

To avoid printing the events multiple times in perf list filter out
duplicated events during printing.

v2: Rely on uncore_ prefix already in unit
v3: Document why calls were reordered

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/r/2017012802034...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/pmu.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 6dc3cc050105..8e9d00fd418e 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -590,14 +590,16 @@ static struct perf_pmu *pmu_lookup(const char *name)
if (pmu_format(name, &format))
return NULL;

- if (pmu_aliases(name, &aliases))
+ /*
+ * Check the type first to avoid unnecessary work.
+ */
+ if (pmu_type(name, &type))
return NULL;

- pmu_add_cpu_aliases(&aliases, name);
-
- if (pmu_type(name, &type))
+ if (pmu_aliases(name, &aliases))
return NULL;

+ pmu_add_cpu_aliases(&aliases, name);
pmu = zalloc(sizeof(*pmu));
if (!pmu)
return NULL;
@@ -1195,6 +1197,9 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
len = j;
qsort(aliases, len, sizeof(struct sevent), cmp_sevent);
for (j = 0; j < len; j++) {
+ /* Skip duplicates */
+ if (j > 0 && !strcmp(aliases[j].name, aliases[j - 1].name))
+ continue;
if (name_only) {
printf("%s ", aliases[j].name);
continue;
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

To address new warnings emmited by gcc 7, e.g.::

CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
CC /tmp/build/perf/tests/parse-events.o
util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc':
util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (!(packet->count))
^
util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here
case INTEL_PT_CYC:
^~~~
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
cc1: all warnings being treated as errors

Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-mf0hw789pu...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 +++++
tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c | 2 ++
2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index e4e7dc781d21..7cf7f7aca4d2 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -22,6 +22,7 @@
#include <errno.h>
#include <stdint.h>
#include <inttypes.h>
+#include <linux/compiler.h>

#include "../cache.h"
#include "../util.h"
@@ -1746,6 +1747,7 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
switch (decoder->packet.type) {
case INTEL_PT_TIP_PGD:
decoder->continuous_period = false;
+ __fallthrough;
case INTEL_PT_TIP_PGE:
case INTEL_PT_TIP:
intel_pt_log("ERROR: Unexpected packet\n");
@@ -1799,6 +1801,8 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
decoder->pge = false;
decoder->continuous_period = false;
intel_pt_clear_tx_flags(decoder);
+ __fallthrough;
+
case INTEL_PT_TNT:
decoder->have_tma = false;
intel_pt_log("ERROR: Unexpected packet\n");
@@ -1839,6 +1843,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
switch (decoder->packet.type) {
case INTEL_PT_TIP_PGD:
decoder->continuous_period = false;
+ __fallthrough;
case INTEL_PT_TIP_PGE:
case INTEL_PT_TIP:
decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
index 4f7b32020487..7528ae4f7e28 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
@@ -17,6 +17,7 @@
#include <string.h>
#include <endian.h>
#include <byteswap.h>
+#include <linux/compiler.h>

#include "intel-pt-pkt-decoder.h"

@@ -498,6 +499,7 @@ int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf,
case INTEL_PT_FUP:
if (!(packet->count))
return snprintf(buf, buf_len, "%s no ip", name);
+ __fallthrough;
case INTEL_PT_CYC:
case INTEL_PT_VMCS:
case INTEL_PT_MTC:
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: He Kuang <hek...@huawei.com>

These two debug messages are missing the trailing newline.

Signed-off-by: He Kuang <hek...@huawei.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Bintian Wang <bintia...@huawei.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Cc: Will Deacon <will....@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/bpf-loader.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 36c861103291..bc6bc7062eb4 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -670,13 +670,13 @@ int bpf__probe(struct bpf_object *obj)

err = convert_perf_probe_events(pev, 1);
if (err < 0) {
- pr_debug("bpf_probe: failed to convert perf probe events");
+ pr_debug("bpf_probe: failed to convert perf probe events\n");
goto out;
}

err = apply_perf_probe_events(pev, 1);
if (err < 0) {
- pr_debug("bpf_probe: failed to apply perf probe events");
+ pr_debug("bpf_probe: failed to apply perf probe events\n");
goto out;
}

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Taeung Song <treeze...@gmail.com>

We have zfree(&ptr) for this very common pattern:

free(ptr);
ptr = NULL;

So use it in a few more places.

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-4-git-s...@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1f1f77d8d3ab..ab1ba22d879a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -254,8 +254,7 @@ struct tracepoint_path *tracepoint_name_to_path(const char *name)
if (path->system == NULL || path->name == NULL) {
zfree(&path->system);
zfree(&path->name);
- free(path);
- path = NULL;
+ zfree(&path);
}

return path;
@@ -1482,8 +1481,7 @@ static void perf_pmu__parse_cleanup(void)
p = perf_pmu_events_list + i;
free(p->symbol);
}
- free(perf_pmu_events_list);
- perf_pmu_events_list = NULL;
+ zfree(&perf_pmu_events_list);
perf_pmu_events_list_num = 0;
}
}
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

This is not a full uncore event list, but a short list of useful
and understandable metrics.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../arch/x86/broadwellx/uncore-cache.json | 317 +++++++++++++++++++++
.../arch/x86/broadwellx/uncore-interconnect.json | 28 ++
.../arch/x86/broadwellx/uncore-memory.json | 83 ++++++
.../arch/x86/broadwellx/uncore-power.json | 84 ++++++
4 files changed, 512 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json

diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json
new file mode 100644
index 000000000000..076459c51d4e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json
@@ -0,0 +1,317 @@
+[
+ {
+ "BriefDescription": "Uncore cache clock ticks. Derived from unc_c_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_C_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "All LLC Misses (code+ data rd + data wr - including demand and prefetch). Derived from unc_c_llc_lookup.any",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x34",
+ "EventName": "UNC_C_LLC_LOOKUP.ANY",
+ "Filter": "filter_state=0x1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x11",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "M line evictions from LLC (writebacks to memory). Derived from unc_c_llc_victims.m_state",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x37",
+ "EventName": "UNC_C_LLC_VICTIMS.M_STATE",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - demand and prefetch data reads - excludes LLC prefetches. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.UNCACHEABLE",
+ "Filter": "filter_opc=0x187",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "MMIO reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.MMIO_READ",
+ "Filter": "filter_opc=0x187,filter_nc=1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "MMIO writes. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.MMIO_WRITE",
+ "Filter": "filter_opc=0x18f,filter_nc=1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for RFO. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.RFO_LLC_PREFETCH",
+ "Filter": "filter_opc=0x190",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for code reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.CODE_LLC_PREFETCH",
+ "Filter": "filter_opc=0x191",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for data reads. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_LLC_PREFETCH",
+ "Filter": "filter_opc=0x192",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for PCIe read current. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "ItoM write misses (as part of fast string memcpy stores) + PCIe full line writes. Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe write misses (full cache line). Derived from unc_c_tor_inserts.miss_opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_NON_SNOOP_WRITE",
+ "Filter": "filter_opc=0x1c8,filter_tid=0x3e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe writes (partial cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_PARTIAL_WRITE",
+ "Filter": "filter_opc=0x180,filter_tid=0x3e",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "L2 demand and L2 prefetch code references to LLC. Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.CODE_LLC_PREFETCH",
+ "Filter": "filter_opc=0x181",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (full cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_FULL",
+ "Filter": "filter_opc=0x18c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (partial cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
+ "Filter": "filter_opc=0x18d",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe read current. Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe write references (full cache line). Derived from unc_c_tor_inserts.opcode",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_WRITE",
+ "Filter": "filter_opc=0x1c8,filter_tid=0x3e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy counter for LLC data reads (demand and L2 prefetch). Derived from unc_c_tor_occupancy.miss_opcode",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.LLC_DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "read requests to home agent. Derived from unc_h_requests.reads",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "read requests to local home agent. Derived from unc_h_requests.reads_local",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS_LOCAL",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "read requests to remote home agent. Derived from unc_h_requests.reads_remote",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS_REMOTE",
+ "PerPkg": "1",
+ "UMask": "0x2",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to home agent. Derived from unc_h_requests.writes",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES",
+ "PerPkg": "1",
+ "UMask": "0xC",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to local home agent. Derived from unc_h_requests.writes_local",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES_LOCAL",
+ "PerPkg": "1",
+ "UMask": "0x4",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "write requests to remote home agent. Derived from unc_h_requests.writes_remote",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES_REMOTE",
+ "PerPkg": "1",
+ "UMask": "0x8",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Conflict requests (requests for same address from multiple agents simultaneously). Derived from unc_h_snoop_resp.rspcnflct",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPCNFLCT",
+ "PerPkg": "1",
+ "UMask": "0x40",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "M line forwarded from remote cache along with writeback to memory. Derived from unc_h_snoop_resp.rsp_fwd_wb",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSP_FWD_WB",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x20",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "M line forwarded from remote cache with no writeback to memory. Derived from unc_h_snoop_resp.rspifwd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPIFWD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x4",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Shared line response from remote cache. Derived from unc_h_snoop_resp.rsps",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPS",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x2",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Shared line forwarded from remote cache. Derived from unc_h_snoop_resp.rspsfwd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x21",
+ "EventName": "UNC_H_SNOOP_RESP.RSPSFWD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x8",
+ "Unit": "HA"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json
new file mode 100644
index 000000000000..39387f7909b2
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json
@@ -0,0 +1,28 @@
+[
+ {
+ "BriefDescription": "QPI clock ticks. Derived from unc_q_clockticks",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x14",
+ "EventName": "UNC_Q_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of data flits transmitted . Derived from unc_q_txl_flits_g0.data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x2",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of non data (control) flits transmitted . Derived from unc_q_txl_flits_g0.non_data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.NON_DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x4",
+ "Unit": "QPI LL"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json
new file mode 100644
index 000000000000..d17dc235f734
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json
@@ -0,0 +1,83 @@
+[
+ {
+ "BriefDescription": "read requests to memory controller. Derived from unc_m_cas_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0xC",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Memory controller clock ticks. Derived from unc_m_clockticks",
+ "BriefDescription": "Pre-charges due to page misses. Derived from unc_m_pre_count.page_miss",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.PAGE_MISS",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Pre-charge for reads. Derived from unc_m_pre_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.RD",
+ "PerPkg": "1",
+ "UMask": "0x4",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Pre-charge for writes. Derived from unc_m_pre_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.WR",
+ "PerPkg": "1",
+ "UMask": "0x8",
+ "Unit": "iMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json
new file mode 100644
index 000000000000..b44d43088bbb
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json
@@ -0,0 +1,84 @@
+[
+ {
+ "BriefDescription": "PCU clock ticks. Use to get percentages of PCU cycles events. Derived from unc_p_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_P_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C0 and C1. Derived from unc_p_power_state_occupancy.cores_c0",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
+ "Filter": "occ_sel=1",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C0 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C3. Derived from unc_p_power_state_occupancy.cores_c3",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
+ "Filter": "occ_sel=2",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C3 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "C6 and C7. Derived from unc_p_power_state_occupancy.cores_c6",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
+ "Filter": "occ_sel=3",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C6 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "External Prochot. Derived from unc_p_prochot_external_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xA",
+ "EventName": "UNC_P_PROCHOT_EXTERNAL_CYCLES",
+ "MetricExpr": "(UNC_P_PROCHOT_EXTERNAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Thermal Strongest Upper Limit Cycles. Derived from unc_p_freq_max_limit_thermal_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "OS Strongest Upper Limit Cycles. Derived from unc_p_freq_max_os_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x6",
+ "EventName": "UNC_P_FREQ_MAX_OS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_OS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Power Strongest Upper Limit Cycles. Derived from unc_p_freq_max_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x5",
+ "EventName": "UNC_P_FREQ_MAX_POWER_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_POWER_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Cycles spent changing Frequency. Derived from unc_p_freq_trans_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x74",
+ "EventName": "UNC_P_FREQ_TRANS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_TRANS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ }
+]
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:06 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

In commit daeecbc0c431 ("perf tools: Add event_update event scale type"), the
handling of PERF_EVENT_UPDATE__SCALE cast struct event_update_event->data to a
pointer to event_update_event_scale, uses some field from this casted struct
and then ends up falling through to the handling of another event type,
PERF_EVENT_UPDATE__CPUS were it casts that ev->data to yet another type, oops,
fix it by inserting the missing break.

Noticed when building perf using gcc 7 on Fedora Rawhide:

util/header.c: In function 'perf_event__process_event_update':
util/header.c:3207:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
evsel->scale = ev_scale->scale;
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
util/header.c:3208:2: note: here
case PERF_EVENT_UPDATE__CPUS:
^~~~

This wasn't noticed because probably PERF_EVENT_UPDATE__CPUS comes after
PERF_EVENT_UPDATE__SCALE, so we would just create a bogus evsel->own_cpus when
processing a PERF_EVENT_UPDATE__SCALE to then leak it and create a new cpu map
with the correct data.

Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Fixes: daeecbc0c431 ("perf tools: Add event_update event scale type")
Link: http://lkml.kernel.org/n/tip-lukcf9hdj0...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c567d9f0aa92..3d12c16e5103 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3205,6 +3205,7 @@ int perf_event__process_event_update(struct perf_tool *tool __maybe_unused,
case PERF_EVENT_UPDATE__SCALE:
ev_scale = (struct event_update_event_scale *) ev->data;
evsel->scale = ev_scale->scale;
+ break;
case PERF_EVENT_UPDATE__CPUS:
ev_cpus = (struct event_update_event_cpus *) ev->data;

--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:06 PM2/9/17
to
From: Taeung Song <treeze...@gmail.com>

The cases changed in this patch are for when we free but keep the
pointer to the freed area, which is not always a good idea.

Be more defensive and zero the pointer to avoid possible use after
free bugs to take more time to be detected.

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-5-git-s...@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index ab1ba22d879a..07be0762e53e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1479,7 +1479,7 @@ static void perf_pmu__parse_cleanup(void)

for (i = 0; i < perf_pmu_events_list_num; i++) {
p = perf_pmu_events_list + i;
- free(p->symbol);
+ zfree(&p->symbol);
}
zfree(&perf_pmu_events_list);
perf_pmu_events_list_num = 0;
@@ -1570,7 +1570,7 @@ perf_pmu__parse_check(const char *name)
r = bsearch(&p, perf_pmu_events_list,
(size_t) perf_pmu_events_list_num,
sizeof(struct perf_pmu_event_symbol), comp_pmu);
- free(p.symbol);
+ zfree(&p.symbol);
return r ? r->type : PMU_EVENT_SYMBOL_ERR;
}

@@ -1717,8 +1717,8 @@ static void parse_events_print_error(struct parse_events_error *err,
fprintf(stderr, "%*s\\___ %s\n", idx + 1, "", err->str);
if (err->help)
fprintf(stderr, "\n%s\n", err->help);
- free(err->str);
- free(err->help);
+ zfree(&err->str);
+ zfree(&err->help);
}

fprintf(stderr, "Run 'perf list' for a list of valid events\n");
@@ -2413,7 +2413,7 @@ void parse_events_terms__purge(struct list_head *terms)

list_for_each_entry_safe(term, h, terms, list) {
if (term->array.nr_ranges)
- free(term->array.ranges);
+ zfree(&term->array.ranges);
list_del_init(&term->list);
free(term);
}
@@ -2429,7 +2429,7 @@ void parse_events_terms__delete(struct list_head *terms)

void parse_events__clear_array(struct parse_events_array *a)
{
- free(a->ranges);
+ zfree(&a->ranges);
}

void parse_events_evlist_error(struct parse_events_evlist *data,
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:06 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

This is not a full uncore event list, but a short list of useful
and understandable metrics.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../arch/x86/broadwellde/uncore-cache.json | 317 +++++++++++++++++++++
.../arch/x86/broadwellde/uncore-memory.json | 83 ++++++
.../arch/x86/broadwellde/uncore-power.json | 84 ++++++
3 files changed, 484 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json

diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json
new file mode 100644
index 000000000000..076459c51d4e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
new file mode 100644
index 000000000000..d17dc235f734
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json
new file mode 100644
index 000000000000..b44d43088bbb
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 9:50:09 PM2/9/17
to
From: Taeung Song <treeze...@gmail.com>

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-3-git-s...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 2720600b13a9..1f1f77d8d3ab 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -211,6 +211,8 @@ struct tracepoint_path *tracepoint_id_to_path(u64 config)
closedir(evt_dir);
closedir(sys_dir);
path = zalloc(sizeof(*path));
+ if (!path)
+ return NULL;
path->system = malloc(MAX_EVENT_LENGTH);
if (!path->system) {
free(path);
--
2.9.3

Joe Perches

unread,
Feb 9, 2017, 10:10:05 PM2/9/17
to
On Thu, 2017-02-09 at 22:39 -0300, Arnaldo Carvalho de Melo wrote:
> From: Arnaldo Carvalho de Melo <ac...@redhat.com>
>
> For cases where implicit fall through case labels are intended,
> to let us inform that to gcc >= 7:

I believe this should be added to compiler_gcc.h

> diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 10:10:05 PM2/9/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Addressing this warning from gcc 7:

CC /tmp/build/perf/bench/numa.o
bench/numa.c: In function '__bench_numa':
bench/numa.c:1582:42: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size between 8 and 17 [-Werror=format-truncation=]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~
bench/numa.c:1582:25: note: directive argument in the range [0, 2147483647]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:939:0,
from bench/../util/util.h:47,
from bench/../builtin.h:4,
from bench/numa.c:11:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 17 and 35 bytes into a destination of size 32
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Petr Holasek <phol...@redhat.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-twa37vsfqc...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/bench/numa.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 8efe904e486b..9e5a02d6b9a9 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -1573,13 +1573,13 @@ static int __bench_numa(const char *name)
"GB/sec,", "total-speed", "GB/sec total speed");

if (g->p.show_details >= 2) {
- char tname[32];
+ char tname[14 + 2 * 10 + 1];
struct thread_data *td;
for (p = 0; p < g->p.nr_proc; p++) {
for (t = 0; t < g->p.nr_threads; t++) {
- memset(tname, 0, 32);
+ memset(tname, 0, sizeof(tname));
td = g->threads + p*g->p.nr_threads + t;
- snprintf(tname, 32, "process%d:thread%d", p, t);
+ snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
print_res(tname, td->speed_gbs,
"GB/sec", "thread-speed", "GB/sec/thread speed");
print_res(tname, td->system_time_ns / NSEC_PER_SEC,
--
2.9.3

Arnaldo Carvalho de Melo

unread,
Feb 9, 2017, 10:50:05 PM2/9/17
to
From: Andi Kleen <a...@linux.intel.com>

This is not a full uncore event list, but a short list of useful
and understandable metrics.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprb...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../pmu-events/arch/x86/ivytown/uncore-cache.json | 322 +++++++++++++++++++++
.../arch/x86/ivytown/uncore-interconnect.json | 46 +++
.../pmu-events/arch/x86/ivytown/uncore-memory.json | 75 +++++
.../pmu-events/arch/x86/ivytown/uncore-power.json | 249 ++++++++++++++++
4 files changed, 692 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json

diff --git a/tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json b/tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json
new file mode 100644
index 000000000000..2efdc6772e0b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json
@@ -0,0 +1,322 @@
+[
+ {
+ "BriefDescription": "Uncore cache clock ticks. Derived from unc_c_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_C_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "All LLC Misses (code+ data rd + data wr - including demand and prefetch). Derived from unc_c_llc_lookup.any",
+ "Counter": "0,1",
+ "EventCode": "0x34",
+ "EventName": "UNC_C_LLC_LOOKUP.ANY",
+ "Filter": "filter_state=0x1",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x11",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "M line evictions from LLC (writebacks to memory). Derived from unc_c_llc_victims.m_state",
+ "Counter": "0,1",
+ "EventCode": "0x37",
+ "EventName": "UNC_C_LLC_VICTIMS.M_STATE",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - demand and prefetch data reads - excludes LLC prefetches. Derived from unc_c_tor_inserts.miss_opcode.demand",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses - Uncacheable reads. Derived from unc_c_tor_inserts.miss_opcode.uncacheable",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.UNCACHEABLE",
+ "Filter": "filter_opc=0x187",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for RFO. Derived from unc_c_tor_inserts.miss_opcode.rfo_prefetch",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.RFO_LLC_PREFETCH",
+ "Filter": "filter_opc=0x190",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for code reads. Derived from unc_c_tor_inserts.miss_opcode.code",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.CODE_LLC_PREFETCH",
+ "Filter": "filter_opc=0x191",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC prefetch misses for data reads. Derived from unc_c_tor_inserts.miss_opcode.data_read",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.DATA_LLC_PREFETCH",
+ "Filter": "filter_opc=0x192",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe allocating writes that miss LLC - DDIO misses. Derived from unc_c_tor_inserts.miss_opcode.ddio_miss",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_WRITE",
+ "Filter": "filter_opc=0x19c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for PCIe read current. Derived from unc_c_tor_inserts.miss_opcode.pcie_read",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for ItoM writes (as part of fast string memcpy stores). Derived from unc_c_tor_inserts.miss_opcode.itom_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.ITOM_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for PCIe non-snoop reads. Derived from unc_c_tor_inserts.miss_opcode.pcie_read",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_NON_SNOOP_READ",
+ "Filter": "filter_opc=0x1e4",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "LLC misses for PCIe non-snoop writes (full line). Derived from unc_c_tor_inserts.miss_opcode.pcie_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_MISSES.PCIE_NON_SNOOP_WRITE",
+ "Filter": "filter_opc=0x1e6",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (full cache line). Derived from unc_c_tor_inserts.opcode.streaming_full",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_FULL",
+ "Filter": "filter_opc=0x18c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Streaming stores (partial cache line). Derived from unc_c_tor_inserts.opcode.streaming_partial",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
+ "Filter": "filter_opc=0x18d",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Partial PCIe reads. Derived from unc_c_tor_inserts.opcode.pcie_partial",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_PARTIAL_READ",
+ "Filter": "filter_opc=0x195",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe allocating writes that hit in LLC (DDIO hits). Derived from unc_c_tor_inserts.opcode.ddio_hit",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_WRITE",
+ "Filter": "filter_opc=0x19c",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe read current. Derived from unc_c_tor_inserts.opcode.pcie_read_current",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_READ",
+ "Filter": "filter_opc=0x19e",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "ItoM write hits (as part of fast string memcpy stores). Derived from unc_c_tor_inserts.opcode.itom_write_hit",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.ITOM_WRITE",
+ "Filter": "filter_opc=0x1c8",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop reads. Derived from unc_c_tor_inserts.opcode.pcie_read",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_READ",
+ "Filter": "filter_opc=0x1e4",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop writes (partial). Derived from unc_c_tor_inserts.opcode.pcie_partial_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_PARTIAL_WRITE",
+ "Filter": "filter_opc=0x1e5",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "PCIe non-snoop writes (full line). Derived from unc_c_tor_inserts.opcode.pcie_full_write",
+ "Counter": "0,1",
+ "EventCode": "0x35",
+ "EventName": "LLC_REFERENCES.PCIE_NS_WRITE",
+ "Filter": "filter_opc=0x1e6",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x1",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy for all LLC misses that are addressed to local memory. Derived from unc_c_tor_occupancy.miss_local",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.MISS_LOCAL",
+ "PerPkg": "1",
+ "UMask": "0x2A",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy counter for LLC data reads (demand and L2 prefetch). Derived from unc_c_tor_occupancy.miss_opcode.llc_data_read",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.LLC_DATA_READ",
+ "Filter": "filter_opc=0x182",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Occupancy for all LLC misses that are addressed to remote memory. Derived from unc_c_tor_occupancy.miss_remote",
+ "EventCode": "0x36",
+ "EventName": "UNC_C_TOR_OCCUPANCY.MISS_REMOTE",
+ "PerPkg": "1",
+ "UMask": "0x8A",
+ "Unit": "CBO"
+ },
+ {
+ "BriefDescription": "Read requests to home agent. Derived from unc_h_requests.reads",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.READS",
+ "PerPkg": "1",
+ "UMask": "0x3",
+ "Unit": "HA"
+ },
+ {
+ "BriefDescription": "Write requests to home agent. Derived from unc_h_requests.writes",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_H_REQUESTS.WRITES",
+ "PerPkg": "1",
+ "UMask": "0xC",
+ "Unit": "HA"
+ },
+ {
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json
new file mode 100644
index 000000000000..d7e2fda1d695
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json
@@ -0,0 +1,46 @@
+[
+ {
+ "BriefDescription": "QPI clock ticks. Use to get percentages for QPI cycles events. Derived from unc_q_clockticks",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x14",
+ "EventName": "UNC_Q_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Cycles where receiving QPI link is in half-width mode. Derived from unc_q_rxl0p_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x10",
+ "EventName": "UNC_Q_RxL0P_POWER_CYCLES",
+ "MetricExpr": "(UNC_Q_RxL0P_POWER_CYCLES / UNC_Q_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Cycles where transmitting QPI link is in half-width mode. Derived from unc_q_txl0p_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_Q_TxL0P_POWER_CYCLES",
+ "MetricExpr": "(UNC_Q_TxL0P_POWER_CYCLES / UNC_Q_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of data flits transmitted . Derived from unc_q_txl_flits_g0.data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x2",
+ "Unit": "QPI LL"
+ },
+ {
+ "BriefDescription": "Number of non data (control) flits transmitted . Derived from unc_q_txl_flits_g0.non_data",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_Q_TxL_FLITS_G0.NON_DATA",
+ "PerPkg": "1",
+ "ScaleUnit": "8Bytes",
+ "UMask": "0x4",
+ "Unit": "QPI LL"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json b/tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json
new file mode 100644
index 000000000000..ac4ad4d6357b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json
@@ -0,0 +1,75 @@
+[
+ {
+ "BriefDescription": "Memory page activates for reads and writes. Derived from unc_m_act_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x1",
+ "EventName": "UNC_M_ACT_COUNT.RD",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Umask": "0x3",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Read requests to memory controller. Derived from unc_m_cas_count.rd",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0x3",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Write requests to memory controller. Derived from unc_m_cas_count.wr",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "ScaleUnit": "64Bytes",
+ "UMask": "0xC",
+ "Unit": "iMC"
+ },
+ {
+ "BriefDescription": "Memory controller clock ticks. Use to generate percentages for memory controller CYCLES events. Derived from unc_m_clockticks",
+ "BriefDescription": "Memory page conflicts. Derived from unc_m_pre_count.page_miss",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x2",
+ "EventName": "UNC_M_PRE_COUNT.PAGE_MISS",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "iMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json b/tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json
new file mode 100644
index 000000000000..dc2586db0dfc
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json
@@ -0,0 +1,249 @@
+[
+ {
+ "BriefDescription": "PCU clock ticks. Use to get percentages of PCU cycles events. Derived from unc_p_clockticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_P_CLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band0=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_BAND0_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND0_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band1=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_BAND1_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND1_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band2=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_BAND2_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND2_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to the frequency that is configured in the filter. (filter_band3=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_BAND3_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_BAND3_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned a frequency greater than or equal to the frequency that is configured in the filter. (filter_band0=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_BAND0_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND0_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band1=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_BAND1_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND1_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band2=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_BAND2_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND2_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to the frequency that is configured in the filter. (filter_band3=XXX, with XXX in 100Mhz units). One can also use inversion (filter_inv=1) to track cycles when we were less than the configured frequency. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_BAND3_TRANSITIONS",
+ "Filter": "edge=1",
+ "MetricExpr": "(UNC_P_FREQ_BAND3_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details. Derived from unc_p_power_state_occupancy.cores_c0",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
+ "Filter": "occ_sel=1",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C0 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details. Derived from unc_p_power_state_occupancy.cores_c3",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
+ "Filter": "occ_sel=2",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C3 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details. Derived from unc_p_power_state_occupancy.cores_c6",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x80",
+ "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
+ "Filter": "occ_sel=3",
+ "MetricExpr": "(UNC_P_POWER_STATE_OCCUPANCY.CORES_C6 / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that we are in external PROCHOT mode. This mode is triggered when a sensor off the die determines that something off-die (like DRAM) is too hot and must throttle to avoid damaging the chip. Derived from unc_p_prochot_external_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xa",
+ "EventName": "UNC_P_PROCHOT_EXTERNAL_CYCLES",
+ "MetricExpr": "(UNC_P_PROCHOT_EXTERNAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when thermal conditions are the upper limit on frequency. This is related to the THERMAL_THROTTLE CYCLES_ABOVE_TEMP event, which always counts cycles when we are above the thermal temperature. This event (STRONGEST_UPPER_LIMIT) is sampled at the output of the algorithm that determines the actual frequency, while THERMAL_THROTTLE looks at the input. Derived from unc_p_freq_max_limit_thermal_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x4",
+ "EventName": "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when the OS is the upper limit on frequency. Derived from unc_p_freq_max_os_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x6",
+ "EventName": "UNC_P_FREQ_MAX_OS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_OS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when power is the upper limit on frequency. Derived from unc_p_freq_max_power_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x5",
+ "EventName": "UNC_P_FREQ_MAX_POWER_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_POWER_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when current is the upper limit on frequency. Derived from unc_p_freq_max_current_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x7",
+ "EventName": "UNC_P_FREQ_MAX_CURRENT_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_MAX_CURRENT_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles when the system is changing frequency. This can not be filtered by thread ID. One can also use it with the occupancy counter that monitors number of threads in C0 to estimate the performance impact that frequency transitions had on the system. Derived from unc_p_freq_trans_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x60",
+ "EventName": "UNC_P_FREQ_TRANS_CYCLES",
+ "MetricExpr": "(UNC_P_FREQ_TRANS_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 1.2Ghz. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_GE_1200MHZ_CYCLES",
+ "Filter": "filter_band0=1200",
+ "MetricExpr": "(UNC_P_FREQ_GE_1200MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 2Ghz. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_GE_2000MHZ_CYCLES",
+ "Filter": "filter_band1=2000",
+ "MetricExpr": "(UNC_P_FREQ_GE_2000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 3Ghz. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_GE_3000MHZ_CYCLES",
+ "Filter": "filter_band2=3000",
+ "MetricExpr": "(UNC_P_FREQ_GE_3000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore was running at a frequency greater than or equal to 4Ghz. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_GE_4000MHZ_CYCLES",
+ "Filter": "filter_band3=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_4000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned to a frequency greater than or equal to 1.2Ghz. Derived from unc_p_freq_band0_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xb",
+ "EventName": "UNC_P_FREQ_GE_1200MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band0=1200",
+ "MetricExpr": "(UNC_P_FREQ_GE_1200MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of times that the uncore transitioned to a frequency greater than or equal to 2Ghz. Derived from unc_p_freq_band1_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xc",
+ "EventName": "UNC_P_FREQ_GE_2000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band1=2000",
+ "MetricExpr": "(UNC_P_FREQ_GE_2000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to 3Ghz. Derived from unc_p_freq_band2_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xd",
+ "EventName": "UNC_P_FREQ_GE_3000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band2=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_3000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",
+ "PerPkg": "1",
+ "Unit": "PCU"
+ },
+ {
+ "BriefDescription": "Counts the number of cycles that the uncore transitioned to a frequency greater than or equal to 4Ghz. Derived from unc_p_freq_band3_cycles",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xe",
+ "EventName": "UNC_P_FREQ_GE_4000MHZ_TRANSITIONS",
+ "Filter": "edge=1,filter_band3=4000",
+ "MetricExpr": "(UNC_P_FREQ_GE_4000MHZ_CYCLES / UNC_P_CLOCKTICKS) * 100.",

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:00:05 AM2/10/17
to
Commit-ID: 94bdd5edb34e472980d1e18b4600d6fb92bd6b0a
Gitweb: http://git.kernel.org/tip/94bdd5edb34e472980d1e18b4600d6fb92bd6b0a
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 8 Feb 2017 17:01:46 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 17:31:01 -0300

tools string: Use __fallthrough in perf_atoll()

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ophb30v9a...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/string.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index d8dfaf6..bddca51 100644

tip-bot for Andi Kleen

unread,
Feb 10, 2017, 3:00:05 AM2/10/17
to
Commit-ID: 22c8e5526b7bf33840c20b4e717e6560e5dfb294
Gitweb: http://git.kernel.org/tip/22c8e5526b7bf33840c20b4e717e6560e5dfb294
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Mon, 19 Sep 2016 09:36:31 -0700
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 16:38:03 -0300

perf vendor events intel: Add uncore events for Xeon Phi (Knights Landing)

Add metrics for memory and MCDRAM. Minimal metrics only for now.
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../arch/x86/knightslanding/uncore-memory.json | 42 ++++++++++++++++++++++
1 file changed, 42 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json b/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
new file mode 100644
index 0000000..e3bcd86
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
@@ -0,0 +1,42 @@
+[
+ {
+ "BriefDescription": "ddr bandwidth read (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x03",
+ "EventName": "UNC_M_CAS_COUNT.RD",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "imc"
+ },
+ {
+ "BriefDescription": "ddr bandwidth write (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x03",
+ "EventName": "UNC_M_CAS_COUNT.WR",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x02",
+ "Unit": "imc"
+ },
+ {
+ "BriefDescription": "mcdram bandwidth read (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x01",
+ "EventName": "UNC_E_RPQ_INSERTS",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "edc_eclk"
+ },
+ {
+ "BriefDescription": "mcdram bandwidth write (CPU traffic only) (MB/sec). ",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x02",
+ "EventName": "UNC_E_WPQ_INSERTS",
+ "PerPkg": "1",
+ "ScaleUnit": "6.4e-05MiB",
+ "UMask": "0x01",
+ "Unit": "edc_eclk"
+ }
+]

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:00:06 AM2/10/17
to
Commit-ID: b5bf1733d6a391c4e90ea8f8468d83023be74a2a
Gitweb: http://git.kernel.org/tip/b5bf1733d6a391c4e90ea8f8468d83023be74a2a
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 8 Feb 2017 17:01:46 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 17:30:58 -0300

tools include: Add a __fallthrough statement

For cases where implicit fall through case labels are intended,
to let us inform that to gcc >= 7:

CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~

So we introduce:

#define __fallthrough __attribute__ ((fallthrough))

And use it in such cases.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/linux/compiler.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index e33fc1d..d94179f 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:00:07 AM2/10/17
to
Commit-ID: 3aff8ba0a4c9c9191bb788171a1c54778e1246a2
Gitweb: http://git.kernel.org/tip/3aff8ba0a4c9c9191bb788171a1c54778e1246a2
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Thu, 9 Feb 2017 14:39:42 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 9 Feb 2017 14:39:42 -0300

perf bench numa: Avoid possible truncation when using snprintf()

Addressing this warning from gcc 7:

CC /tmp/build/perf/bench/numa.o
bench/numa.c: In function '__bench_numa':
bench/numa.c:1582:42: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size between 8 and 17 [-Werror=format-truncation=]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~
bench/numa.c:1582:25: note: directive argument in the range [0, 2147483647]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:939:0,
from bench/../util/util.h:47,
from bench/../builtin.h:4,
from bench/numa.c:11:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 17 and 35 bytes into a destination of size 32
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Petr Holasek <phol...@redhat.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-twa37vsfqc...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/bench/numa.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 8efe904..9e5a02d 100644

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:00:07 AM2/10/17
to
Commit-ID: 8434a2ec13d5c8cb25716950bfbf7c9d7b64628a
Gitweb: http://git.kernel.org/tip/8434a2ec13d5c8cb25716950bfbf7c9d7b64628a
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 8 Feb 2017 21:57:22 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 22:06:18 -0300

perf header: Fix handling of PERF_EVENT_UPDATE__SCALE

In commit daeecbc0c431 ("perf tools: Add event_update event scale type"), the
handling of PERF_EVENT_UPDATE__SCALE cast struct event_update_event->data to a
pointer to event_update_event_scale, uses some field from this casted struct
and then ends up falling through to the handling of another event type,
PERF_EVENT_UPDATE__CPUS were it casts that ev->data to yet another type, oops,
fix it by inserting the missing break.

Noticed when building perf using gcc 7 on Fedora Rawhide:

util/header.c: In function 'perf_event__process_event_update':
util/header.c:3207:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
evsel->scale = ev_scale->scale;
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
util/header.c:3208:2: note: here
case PERF_EVENT_UPDATE__CPUS:
^~~~

This wasn't noticed because probably PERF_EVENT_UPDATE__CPUS comes after
PERF_EVENT_UPDATE__SCALE, so we would just create a bogus evsel->own_cpus when
processing a PERF_EVENT_UPDATE__SCALE to then leak it and create a new cpu map
with the correct data.

Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Fixes: daeecbc0c431 ("perf tools: Add event_update event scale type")
Link: http://lkml.kernel.org/n/tip-lukcf9hdj0...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c567d9f..3d12c16 100644

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:00:07 AM2/10/17
to
Commit-ID: 7ea6856d6f5629d742edc23b8b76e6263371ef45
Gitweb: http://git.kernel.org/tip/7ea6856d6f5629d742edc23b8b76e6263371ef45
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Thu, 9 Feb 2017 15:22:22 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Thu, 9 Feb 2017 16:32:03 -0300

perf intel-pt: Use __fallthrough

To address new warnings emmited by gcc 7, e.g.::

CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
CC /tmp/build/perf/tests/parse-events.o
util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc':
util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (!(packet->count))
^
util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here
case INTEL_PT_CYC:
^~~~
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
cc1: all warnings being treated as errors

Acked-by: Andi Kleen <a...@linux.intel.com>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-mf0hw789pu...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 +++++
tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c | 2 ++
2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index e4e7dc7..7cf7f7a 100644
index 4f7b320..7528ae4 100644

Ingo Molnar

unread,
Feb 10, 2017, 3:20:06 AM2/10/17
to

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 53e74a112ce5c1c9b6a6923bdd6612133625d579:
>
> Merge tag 'perf-urgent-for-mingo-4.10-20170203' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2017-02-03 20:42:30 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170209
>
> for you to fetch changes up to 7ea6856d6f5629d742edc23b8b76e6263371ef45:
>
> perf intel-pt: Use __fallthrough (2017-02-09 16:32:03 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Add support for parsing Intel uncore vendor event files and add uncore
> vendor events for the Intel server processors (Haswell, Broadwell,
> IvyBridge), Xeon Phi (Knights Landing) and Broadwell DE (Andi Kleen)
>
> - Support --symfs in 'perf probe' (Uwe Kleine-König)
>
> - Add support for generating bpf prologue on the aarch64 architecture (He Kuang)
>
> - Show proper hint when SDT event not yet in place via 'perf probe' (Ravi Bangoria)
>
> - Take into account symfs setting when reading file build ID (Victor Kamensky)
>
> Infrastructure:
>
> - Map gcc7's '__attribute__ ((fallthrough))', that warns when code
> associated to case blocks in switches continue into the next case entry,
> to '__falltrough' and use it where warned by gcc, tested on Fedora Rawhide
> (Arnaldo Carvalho de Melo)
>
> - Fix buffer sizes used with snprintf that could lead to truncation,
> another warning introduced in gcc7 (Arnaldo Carvalho de Melo)
>
> - Robustify do_generate_dynamic_list_file in libtraceevent (David Carrillo-Cisneros)
>
> - Use zfree() in more places (Taeung Song)
>
> Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
>
> ----------------------------------------------------------------
> Andi Kleen (11):
> perf jevents: Parse eventcode as number
> perf jevents: Add support for parsing uncore json files
> perf pmu: Support per pmu json aliases
> perf pmu: Support event aliases for non cpu// pmus
> perf list: Add debug support for outputing alias string
> perf vendor events intel: Add uncore events for Haswell Server processor
> perf vendor events intel: Add uncore events for Broadwell Server
> perf vendor events intel: Add uncore events for IvyBridge Server
> perf vendor events intel: Add uncore events for Sandy Bridge Server
> perf vendor events intel: Add uncore events for Xeon Phi (Knights Landing)
> perf vendor events intel: Add uncore events for Broadwell DE
>
> Arnaldo Carvalho de Melo (11):
> Merge remote-tracking branch 'tip/perf/urgent' into perf/core
> perf tools: Fix include of linux/mman.h
> tools include: Add a __fallthrough statement
> tools string: Use __fallthrough in perf_atoll()
> tools strfilter: Use __fallthrough
> perf top: Use __fallthrough
> perf thread_map: Correctly size buffer used with dirent->dt_name
> perf header: Fix handling of PERF_EVENT_UPDATE__SCALE
> perf bench numa: Avoid possible truncation when using snprintf()
> perf tests: Avoid possible truncation with dirent->d_name + snprintf
> perf intel-pt: Use __fallthrough
>
> David Carrillo-Cisneros (1):
> tools lib traceevent: Robustify do_generate_dynamic_list_file
>
> He Kuang (2):
> perf tools arm64: Add support for generating bpf prologue
> perf bpf: Add missing newline in debug messages
>
> Mickaël Salaün (1):
> tools lib bpf: Add missing header to the library
>
> Ravi Bangoria (1):
> perf sdt: Show proper hint when event not yet in place via 'perf probe'
>
> Taeung Song (4):
> perf tools: Only increase index if perf_evsel__new_idx() succeeds
> perf tools: Add missing check for failure in a zalloc() call
> perf tools: Use zfree() instead of ad hoc equivalent
> perf tools: Use zfree() to avoid keeping dangling pointers
>
> Uwe Kleine-König (1):
> perf probe: Add option --symfs
>
> Victor Kamensky (1):
> perf symbols: Take into account symfs setting when reading file build ID
>
> Makefile | 6 +-
> arch/x86/events/Makefile | 13 +-
> arch/x86/events/amd/Makefile | 7 +
> arch/x86/events/amd/uncore.c | 204 ++++++++-----
> arch/x86/events/intel/pt.c | 6 +
> include/linux/kprobes.h | 30 +-
> include/linux/perf_event.h | 2 +-
> kernel/events/core.c | 223 ++++++++------
> kernel/extable.c | 9 +-
> kernel/kprobes.c | 73 +++--
> tools/arch/arm/include/uapi/asm/kvm.h | 9 +
> tools/arch/powerpc/include/uapi/asm/kvm.h | 5 +
> tools/arch/x86/include/asm/cpufeatures.h | 11 +
> tools/arch/x86/include/uapi/asm/vmx.h | 5 +
> tools/build/Makefile.build | 10 +
> tools/include/linux/compiler.h | 9 +
> tools/lib/api/fs/fs.c | 16 +
> tools/lib/api/fs/fs.h | 1 +
> tools/lib/api/fs/tracing_path.c | 32 +-
> tools/lib/bpf/bpf.h | 1 +
> tools/lib/bpf/libbpf.c | 264 +++++++++++++++--
> tools/lib/bpf/libbpf.h | 19 +-
> tools/lib/subcmd/parse-options.h | 19 +-
> tools/lib/traceevent/Makefile | 14 +-
> tools/perf/Build | 5 +-
> tools/perf/Documentation/perf-c2c.txt | 2 +-
> tools/perf/Documentation/perf-ftrace.txt | 36 +++
> tools/perf/Documentation/perf-kallsyms.txt | 24 ++
> tools/perf/Documentation/perf-record.txt | 14 +-
> tools/perf/Documentation/perf-sched.txt | 2 +
> tools/perf/Documentation/perf-script.txt | 4 +-
> tools/perf/Documentation/perf-trace.txt | 8 +-
> tools/perf/Makefile.config | 6 +-
> tools/perf/Makefile.perf | 1 +
> tools/perf/arch/arm64/Makefile | 1 +
> tools/perf/arch/arm64/include/dwarf-regs-table.h | 12 +-
> tools/perf/arch/arm64/util/dwarf-regs.c | 15 +-
> tools/perf/bench/numa.c | 6 +-
> tools/perf/builtin-c2c.c | 3 +-
> tools/perf/builtin-ftrace.c | 265 +++++++++++++++++
> tools/perf/builtin-help.c | 8 +-
> tools/perf/builtin-kallsyms.c | 67 +++++
> tools/perf/builtin-kmem.c | 8 +-
> tools/perf/builtin-list.c | 3 +
> tools/perf/builtin-probe.c | 2 +
> tools/perf/builtin-record.c | 158 +++++++++-
> tools/perf/builtin-report.c | 4 +-
> tools/perf/builtin-sched.c | 130 ++++++++-
> tools/perf/builtin-script.c | 3 +-
> tools/perf/builtin-top.c | 6 +-
> tools/perf/builtin-trace.c | 120 ++++++--
> tools/perf/builtin.h | 2 +
> tools/perf/command-list.txt | 2 +
> tools/perf/perf.c | 20 +-
> .../arch/x86/broadwellde/uncore-cache.json | 317 ++++++++++++++++++++
> .../arch/x86/broadwellde/uncore-memory.json | 83 ++++++
> .../arch/x86/broadwellde/uncore-power.json | 84 ++++++
> .../arch/x86/broadwellx/uncore-cache.json | 317 ++++++++++++++++++++
> .../arch/x86/broadwellx/uncore-interconnect.json | 28 ++
> .../arch/x86/broadwellx/uncore-memory.json | 83 ++++++
> .../arch/x86/broadwellx/uncore-power.json | 84 ++++++
> .../pmu-events/arch/x86/haswellx/uncore-cache.json | 317 ++++++++++++++++++++
> .../arch/x86/haswellx/uncore-interconnect.json | 28 ++
> .../arch/x86/haswellx/uncore-memory.json | 83 ++++++
> .../pmu-events/arch/x86/haswellx/uncore-power.json | 84 ++++++
> .../pmu-events/arch/x86/ivytown/uncore-cache.json | 322 +++++++++++++++++++++
> .../arch/x86/ivytown/uncore-interconnect.json | 46 +++
> .../pmu-events/arch/x86/ivytown/uncore-memory.json | 75 +++++
> .../pmu-events/arch/x86/ivytown/uncore-power.json | 249 ++++++++++++++++
> .../pmu-events/arch/x86/jaketown/uncore-cache.json | 209 +++++++++++++
> .../arch/x86/jaketown/uncore-interconnect.json | 46 +++
> .../arch/x86/jaketown/uncore-memory.json | 79 +++++
> .../pmu-events/arch/x86/jaketown/uncore-power.json | 248 ++++++++++++++++
> .../arch/x86/knightslanding/uncore-memory.json | 42 +++
> tools/perf/pmu-events/jevents.c | 84 +++++-
> tools/perf/pmu-events/jevents.h | 4 +-
> tools/perf/pmu-events/pmu-events.h | 3 +
> tools/perf/tests/Build | 1 +
> tools/perf/tests/bpf.c | 42 ++-
> tools/perf/tests/builtin-test.c | 4 +
> tools/perf/tests/llvm.c | 2 +-
> tools/perf/tests/parse-events.c | 8 +-
> tools/perf/tests/tests.h | 1 +
> tools/perf/tests/unit_number__scnprintf.c | 37 +++
> tools/perf/ui/browsers/hists.c | 60 ++--
> tools/perf/ui/setup.c | 1 +
> tools/perf/util/Build | 1 +
> tools/perf/util/bpf-loader.c | 4 +-
> tools/perf/util/callchain.c | 16 +-
> tools/perf/util/config.c | 23 +-
> tools/perf/util/data-convert-bt.c | 7 +-
> tools/perf/util/dso.c | 48 ++-
> tools/perf/util/event.c | 2 +-
> tools/perf/util/evlist.c | 12 +-
> tools/perf/util/evlist.h | 2 +
> tools/perf/util/header.c | 7 +-
> tools/perf/util/hist.c | 4 +-
> .../perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 +
> .../util/intel-pt-decoder/intel-pt-pkt-decoder.c | 2 +
> tools/perf/util/intel-pt.c | 4 +-
> tools/perf/util/llvm-utils.c | 4 +-
> tools/perf/util/machine.c | 19 ++
> tools/perf/util/machine.h | 1 +
> tools/perf/util/parse-events.c | 69 +++--
> tools/perf/util/parse-events.y | 35 ++-
> tools/perf/util/pmu.c | 109 ++++---
> tools/perf/util/pmu.h | 1 +
> tools/perf/util/probe-event.c | 11 +-
> .../perf/util/scripting-engines/trace-event-perl.c | 6 +-
> tools/perf/util/session.c | 2 +-
> tools/perf/util/strfilter.c | 1 +
> tools/perf/util/string.c | 2 +
> tools/perf/util/symbol.c | 6 +-
> tools/perf/util/thread_map.c | 2 +-
> tools/perf/util/trace-event-info.c | 71 +++--
> tools/perf/util/trace-event-parse.c | 17 ++
> tools/perf/util/trace-event-read.c | 77 ++++-
> tools/perf/util/trace-event.h | 1 +
> tools/perf/util/unwind-libunwind-local.c | 54 +++-
> tools/perf/util/util.c | 15 +-
> tools/perf/util/util.h | 3 +-
> tools/scripts/Makefile.include | 12 +-
> 122 files changed, 5101 insertions(+), 550 deletions(-)
> create mode 100644 arch/x86/events/amd/Makefile
> create mode 100644 tools/perf/Documentation/perf-ftrace.txt
> create mode 100644 tools/perf/Documentation/perf-kallsyms.txt
> create mode 100644 tools/perf/builtin-ftrace.c
> create mode 100644 tools/perf/builtin-kallsyms.c
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-cache.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/uncore-power.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-cache.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-interconnect.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-memory.json
> create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/uncore-power.json
> create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
> create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-interconnect.json
> create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-memory.json
> create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/uncore-power.json
> create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-cache.json
> create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-interconnect.json
> create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-memory.json
> create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/uncore-power.json
> create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-cache.json
> create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-interconnect.json
> create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-memory.json
> create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/uncore-power.json
> create mode 100644 tools/perf/pmu-events/arch/x86/knightslanding/uncore-memory.json
> create mode 100644 tools/perf/tests/unit_number__scnprintf.c

Pulled, thanks a lot Arnaldo!

Ingo

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:50:06 AM2/10/17
to
Commit-ID: 7b0214b702ad8e124e039a317beeebb3f020d125
Gitweb: http://git.kernel.org/tip/7b0214b702ad8e124e039a317beeebb3f020d125
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 8 Feb 2017 17:01:46 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 17:31:22 -0300

perf top: Use __fallthrough

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

CC /tmp/build/perf/builtin-top.o
builtin-top.c: In function 'display_thread':
builtin-top.c:644:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (errno == EINTR)
^
builtin-top.c:647:3: note: here
default:
^~~~~~~

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-lmcfnnyx9i...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-top.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 20aef98..d90927f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -643,7 +643,7 @@ repeat:
case -1:
if (errno == EINTR)
continue;
- /* Fall trhu */
+ __fallthrough;
default:
c = getc(stdin);
tcsetattr(0, TCSAFLUSH, &save);

tip-bot for Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 3:50:08 AM2/10/17
to
Commit-ID: d64b721d27aef3fbeb16ecda9dd22ee34818ff70
Gitweb: http://git.kernel.org/tip/d64b721d27aef3fbeb16ecda9dd22ee34818ff70
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Wed, 8 Feb 2017 17:01:46 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Wed, 8 Feb 2017 17:31:10 -0300

tools strfilter: Use __fallthrough

The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:

util/strfilter.c: In function 'strfilter_node__sprint':
util/strfilter.c:270:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (len < 0)
^
util/strfilter.c:272:2: note: here
case '!':
^~~~
cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-z2dpywg7u8...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/strfilter.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/strfilter.c b/tools/perf/util/strfilter.c
index bcae659..efb5377 100644
--- a/tools/perf/util/strfilter.c
+++ b/tools/perf/util/strfilter.c
@@ -269,6 +269,7 @@ static int strfilter_node__sprint(struct strfilter_node *node, char *buf)
len = strfilter_node__sprint_pt(node->l, buf);
if (len < 0)
return len;
+ __fallthrough;
case '!':
if (buf) {
*(buf + len++) = *node->p;

Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 9:50:06 AM2/10/17
to
Em Thu, Feb 09, 2017 at 07:02:06PM -0800, Joe Perches escreveu:
> On Thu, 2017-02-09 at 22:39 -0300, Arnaldo Carvalho de Melo wrote:
> > From: Arnaldo Carvalho de Melo <ac...@redhat.com>
> >
> > For cases where implicit fall through case labels are intended,
> > to let us inform that to gcc >= 7:
>
> I believe this should be added to compiler_gcc.h

We still don't have it in tools/include/linux, but yeah, its a good idea
to have the equivalent of include/linux/compiler-gcc.h.

Then, at some point, the kernel can grab the definition from tools, i.e.
when people start trying to build the kernel _and_ this specific warning
is enabled there, as it is in tools/.

- Arnaldo

Joe Perches

unread,
Feb 10, 2017, 1:00:06 PM2/10/17
to
gcc v7.0 can warn on missing break statements from case labels
using a special __attribute__((fallthrough))__ marker.

Add a __fallthrough convenience macro for gcc versions >= 7 and
make the generic use of __fallthrough a no-op.

Signed-off-by: Joe Perches <j...@perches.com>
---
include/linux/compiler-gcc.h | 7 +++++++
include/linux/compiler.h | 6 ++++++
2 files changed, 13 insertions(+)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d5e1fedbad24..6af8d6448f10 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -304,6 +304,13 @@
#define __no_sanitize_address __attribute__((no_sanitize_address))
#endif

+#if GCC_VERSION >= 70000
+/*
+ * Tell the compiler not to warn when a switch/case fallthrough marker exists
+ */
+#define __fallthrough __attribute__ ((fallthrough))
+#endif
+
#endif /* gcc version >= 40000 specific checks */

#if !defined(__noclone)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6e8e160b1e4b..16b6efc877f4 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -477,6 +477,12 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
#define __assume_aligned(a, ...)
#endif

+/*
+ * switch/case fallthrough checking
+ */
+#ifndef __fallthrough
+#define __fallthrough
+#endif

/* Are two types/vars the same type (ignoring qualifiers)? */
#ifndef __same_type
--
2.10.0.rc2.1.g053435c

Arnaldo Carvalho de Melo

unread,
Feb 10, 2017, 1:20:06 PM2/10/17
to
Em Fri, Feb 10, 2017 at 09:08:35AM -0800, Joe Perches escreveu:
> gcc v7.0 can warn on missing break statements from case labels
> using a special __attribute__((fallthrough))__ marker.
>
> Add a __fallthrough convenience macro for gcc versions >= 7 and
> make the generic use of __fallthrough a no-op.

Can you state in the log message were this idea came from? Say,
something like:

"This was introduced in the tools/include/linux/compiler.h, where it was
first noticed while buildint tools/perf/ in a Fedora Rawhide
environment."

BTW, I added this to my tree:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=e058554b1418e00864f1702c84bfa1d7cf314627

tools include: Introduce linux/compiler-gcc.h

To match the kernel headers structure, setting up things that are
specific to gcc or to some specific version of gcc.

It gets included by linux/compiler.h when gcc is the compiler being
used.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Joe Perches <j...@perches.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-pi5mxx176l...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

Thanks,

- Arnaldo

Joe Perches

unread,
Feb 10, 2017, 2:00:06 PM2/10/17
to
On Fri, 2017-02-10 at 14:18 -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Feb 10, 2017 at 09:08:35AM -0800, Joe Perches escreveu:
> > gcc v7.0 can warn on missing break statements from case labels
> > using a special __attribute__((fallthrough))__ marker.
> >
> > Add a __fallthrough convenience macro for gcc versions >= 7 and
> > make the generic use of __fallthrough a no-op.
>
> Can you state in the log message were this idea came from? Say,
> something like:
>
> "This was introduced in the tools/include/linux/compiler.h, where it was
> first noticed while buildint tools/perf/ in a Fedora Rawhide
> environment."

The request is from at least 2002.

Maybe Andrew could add a link like:

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7652

if he applies it.
It is loading more messages.
0 new messages