
[PATCH 02/35] perf stat: Issue a HW watchdog disable hint


Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:07 PM
From: Borislav Petkov <b...@suse.de>

When using perf stat on an AMD F15h system with the default hw events
attributes, some of the events don't get counted:

 Performance counter stats for 'sleep 1':

          0.749208      task-clock (msec)         #    0.001 CPUs utilized
                 1      context-switches          #    0.001 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                54      page-faults               #    0.072 M/sec
         1,122,815      cycles                    #    1.499 GHz
           286,740      stalled-cycles-frontend   #   25.54% frontend cycles idle
     <not counted>      stalled-cycles-backend                                   (0.00%)
     ^^^^^^^^^^^^
     <not counted>      instructions                                             (0.00%)
     ^^^^^^^^^^^^
     <not counted>      branches                                                 (0.00%)
     <not counted>      branch-misses                                            (0.00%)

       1.001550070 seconds time elapsed

The reason is that the HW watchdog consumes one PMU counter. When perf
then tries to schedule 6 events on 6 counters, and some of those events
are constrained by the hardware to a specific subset of PMCs, the event
scheduling fails.

So issue a hint to disable the HW watchdog around a perf stat session.
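
The decision behind the hint can be sketched as follows. This is a minimal, standalone illustration and not the code in builtin-stat.c: struct demo_counter and both demo_* functions are hypothetical names invented for this sketch. It only models the rule that the hint is worth printing when an event the PMU supports still ended up not counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a perf event counter. */
struct demo_counter {
	const char *name;
	bool supported;	/* the PMU knows this event */
	bool counted;	/* the event was actually scheduled and read */
};

/*
 * Return true when at least one supported event was not counted, i.e.
 * when freeing a PMU counter (for instance the one held by the NMI
 * watchdog) might help. Unsupported events never trigger the hint.
 */
static bool demo_needs_watchdog_hint(const struct demo_counter *c, size_t n)
{
	assert(c != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (c[i].supported && !c[i].counted)
			return true;
	return false;
}

/* Print the hint after the stats; returns 1 if it was printed, else 0. */
static int demo_print_footer(FILE *out, const struct demo_counter *c, size_t n)
{
	if (!demo_needs_watchdog_hint(c, n))
		return 0;

	fprintf(out,
		"Some events weren't counted. Try disabling the NMI watchdog:\n"
		"\techo 0 > /proc/sys/kernel/nmi_watchdog\n"
		"\tperf stat ...\n"
		"\techo 1 > /proc/sys/kernel/nmi_watchdog\n");
	return 1;
}
```

Calling demo_print_footer(stderr, counters, n) once all counters have been printed mirrors the idea of setting a flag while printing and consulting it in the footer.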

Committer note:

Testing it...

# perf stat -d usleep 1

 Performance counter stats for 'usleep 1':

          1.180203      task-clock (msec)         #    0.490 CPUs utilized
                 1      context-switches          #    0.847 K/sec
                 0      cpu-migrations            #    0.000 K/sec
                54      page-faults               #    0.046 M/sec
           184,754      cycles                    #    0.157 GHz
           714,553      instructions              #    3.87 insn per cycle
           154,661      branches                  #  131.046 M/sec
             7,247      branch-misses             #    4.69% of all branches
           219,984      L1-dcache-loads           #  186.395 M/sec
            17,600      L1-dcache-load-misses     #    8.00% of all L1-dcache hits    (90.16%)
     <not counted>      LLC-loads                                                     (0.00%)
     <not counted>      LLC-load-misses                                               (0.00%)

       0.002406823 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
#

Signed-off-by: Borislav Petkov <b...@suse.de>
Acked-by: Ingo Molnar <mi...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Robert Richter <rr...@kernel.org>
Cc: Vince Weaver <vi...@deater.net>
Link: http://lkml.kernel.org/r/20170211183218....@pd.tnic
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-stat.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 13b54999ad79..f4f555a67e9b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -146,6 +146,7 @@ static aggr_get_id_t aggr_get_id;
static bool append_file;
static const char *output_name;
static int output_fd;
+static int print_free_counters_hint;

struct perf_stat {
bool record;
@@ -1109,6 +1110,9 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
counter->supported ? CNTR_NOT_COUNTED : CNTR_NOT_SUPPORTED,
csv_sep);

+ if (counter->supported)
+ print_free_counters_hint = 1;
+
fprintf(stat_config.output, "%-*s%s",
csv_output ? 0 : unit_width,
counter->unit, csv_sep);
@@ -1477,6 +1481,13 @@ static void print_footer(void)
avg_stats(&walltime_nsecs_stats));
}
fprintf(output, "\n\n");
+
+ if (print_free_counters_hint)
+ fprintf(output,
+"Some events weren't counted. Try disabling the NMI watchdog:\n"
+" echo 0 > /proc/sys/kernel/nmi_watchdog\n"
+" perf stat ...\n"
+" echo 1 > /proc/sys/kernel/nmi_watchdog\n");
}

static void print_counters(struct timespec *ts, int argc, const char **argv)
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:07 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:

Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306

for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:

perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)

E.g.:

# perf report -s symbol_size,symbol

  Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
  Overhead  Symbol size  Symbol
    14.55%          326  [k] flush_tlb_mm_range
     7.20%         1045  [k] filemap_map_pages
     5.82%          124  [k] vma_interval_tree_insert
     5.18%         2430  [k] unmap_page_range
     2.57%          571  [k] vma_interval_tree_remove
     1.94%          494  [k] page_add_file_rmap
     1.82%          740  [k] page_remove_rmap
     1.66%         1017  [k] release_pages
     1.57%         1636  [k] update_blocked_averages
     1.57%           76  [k] unlock_page

- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)

Change in behaviour:

- Make system wide (-a) the default option if no target was specified and one
of the following conditions is met:

- No workload specified (current behaviour)

- A workload is specified but all requested events are system wide ones,
like uncore ones. (Jiri Olsa)

Fixes:

- Add missing initialization to the instruction decoder used in the
intel PT/BTS code, which was causing lots of failures in 'perf test',
looking for a value when there was none (Adrian Hunter)

Infrastructure:

- Add arch code needed to adopt the kernel's refcount_t to aid in
catching bugs when using atomic_t as a reference counter, basically
cmpxchg related functions (Arnaldo Carvalho de Melo)

- Convert the code using atomic_t as reference counts to refcount_t
(Elena Reshetova)

- Add feature test for sched_getcpu() to more easily check for its
presence in the many libc implementations and across different
versions of such C libraries (Arnaldo Carvalho de Melo)

- Issue a HW watchdog disable hint in 'perf stat' for when some of the
requested events can't get counted because a PMU counter is taken by that
watchdog (Borislav Petkov).

- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)

Documentation:

- Clarify the term 'convergence' in:

perf bench numa numa-mem -h --show_convergence (Jiri Olsa)

Kernel code:

- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)

- Allow return probes with offsets and absolute addresses (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Adrian Hunter (1):
perf intel-PT/BTS: Add missing initialization

Arnaldo Carvalho de Melo (12):
tools include: Adopt __compiletime_error
tools arch x86: Include asm/cmpxchg.h
tools arch x86: Introduce atomic_cmpxchg()
tools include: Introduce atomic_cmpxchg_{relaxed,release}()
tools include: Provide gcc based cmpxchg fallback for !x86
tools include: Add UINT_MAX def to kernel.h
tools include: Adopt kernel's refcount.h
perf evlist: Clarify a bit the use of perf_mmap->refcnt
tools build: Add test for sched_getcpu()
perf bench futex: Use __maybe_unused
perf bench futex: Fix build on musl + clang
tools build: Use the same CC for feature detection and actual build

Borislav Petkov (1):
perf stat: Issue a HW watchdog disable hint

Charles Baylis (1):
perf tools: Allow sorting by symbol size

Elena Reshetova (9):
perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
perf dso: Convert dso.refcnt from atomic_t to refcount_t
perf map: Convert map.refcnt from atomic_t to refcount_t
perf map: Convert map_groups.refcnt from atomic_t to refcount_t
perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
perf thread: convert thread.refcnt from atomic_t to refcount_t
perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t

Jiri Olsa (2):
perf tools: Force uncore events to system wide monitoring
perf bench numa: Add more comment for -c option

Karol Wachowski (1):
perf vendor events: Add mapping for KnightsMill PMU events

Namhyung Kim (4):
perf ftrace: Add support for --pid option
perf cpumap: Introduce cpu_map__snprint_mask()
perf ftrace: Add support for -a and -C option
perf ftrace: Use pager for displaying result

Naveen N. Rao (3):
kretprobes: Ensure probe location is at function entry
trace/kprobes: Allow return probes with offsets and absolute addresses
perf probe: Generalize probe event file open routine

Steven Rostedt (VMware) (1):
trace/kprobes: Add back warning about offset in return probes

include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 ++
kernel/trace/trace.c | 1 +
kernel/trace/trace_kprobe.c | 9 +-
tools/arch/x86/include/asm/atomic.h | 7 +
tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++
tools/build/Makefile.feature | 1 +
tools/build/feature/Makefile | 10 +-
tools/build/feature/test-all.c | 5 +
tools/build/feature/test-sched_getcpu.c | 7 +
tools/include/asm-generic/atomic-gcc.h | 8 ++
tools/include/linux/atomic.h | 6 +
tools/include/linux/compiler-gcc.h | 4 +
tools/include/linux/compiler.h | 4 +
tools/include/linux/kernel.h | 4 +
tools/include/linux/refcount.h | 151 ++++++++++++++++++++
tools/perf/Documentation/perf-ftrace.txt | 18 +++
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/MANIFEST | 2 +
tools/perf/Makefile.config | 4 +
tools/perf/bench/futex-hash.c | 1 +
tools/perf/bench/futex-lock-pi.c | 1 +
tools/perf/bench/futex-requeue.c | 1 +
tools/perf/bench/futex-wake-parallel.c | 1 +
tools/perf/bench/futex-wake.c | 1 +
tools/perf/bench/futex.h | 10 +-
tools/perf/bench/numa.c | 3 +-
tools/perf/builtin-ftrace.c | 152 +++++++++++++++++----
tools/perf/builtin-stat.c | 44 +++++-
tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
tools/perf/tests/cpumap.c | 2 +-
tools/perf/tests/thread-map.c | 6 +-
tools/perf/tests/thread-mg-share.c | 12 +-
tools/perf/util/cgroup.c | 6 +-
tools/perf/util/cgroup.h | 4 +-
tools/perf/util/cloexec.h | 6 -
tools/perf/util/comm.c | 15 +-
tools/perf/util/cpumap.c | 62 +++++++--
tools/perf/util/cpumap.h | 5 +-
tools/perf/util/dso.c | 6 +-
tools/perf/util/dso.h | 4 +-
tools/perf/util/evlist.c | 31 +++--
tools/perf/util/evlist.h | 4 +-
tools/perf/util/hist.h | 1 +
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +
tools/perf/util/machine.c | 2 +-
tools/perf/util/map.c | 10 +-
tools/perf/util/map.h | 10 +-
tools/perf/util/parse-events.c | 5 +-
tools/perf/util/probe-file.c | 20 +--
tools/perf/util/probe-file.h | 1 +
tools/perf/util/sort.c | 41 ++++++
tools/perf/util/sort.h | 1 +
tools/perf/util/thread.c | 6 +-
tools/perf/util/thread.h | 4 +-
tools/perf/util/thread_map.c | 20 +--
tools/perf/util/thread_map.h | 4 +-
tools/perf/util/util.h | 4 +-
tools/scripts/Makefile.include | 9 ++
59 files changed, 720 insertions(+), 143 deletions(-)
create mode 100644 tools/arch/x86/include/asm/cmpxchg.h
create mode 100644 tools/build/feature/test-sched_getcpu.c
create mode 100644 tools/include/linux/refcount.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.

Several are cross builds (the ones with -x-ARCH, plus the android one), and
those may not have all the features built due to the lack of multi-arch devel
packages, which are available and in use so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.

[root@jouet ~]# waitp `pidof perf` ; time dm
1 alpine:3.4: Ok
2 alpine:3.5: Ok
3 alpine:edge: Ok
4 android-ndk:r12b-arm: Ok
5 archlinux:latest: Ok
6 centos:5: Ok
7 centos:6: Ok
8 centos:7: Ok
9 debian:7: Ok
10 debian:8: Ok
11 debian:experimental: Ok
12 debian:experimental-x-arm64: Ok
13 debian:experimental-x-mips: Ok
14 debian:experimental-x-mips64: Ok
15 debian:experimental-x-mipsel: Ok
16 fedora:20: Ok
17 fedora:21: Ok
18 fedora:22: Ok
19 fedora:23: Ok
20 fedora:24: Ok
21 fedora:24-x-ARC-uClibc: Ok
22 fedora:25: Ok
23 fedora:rawhide: Ok
24 mageia:5: Ok
25 opensuse:13.2: Ok
26 opensuse:42.1: Ok
27 opensuse:tumbleweed: Ok
28 ubuntu:12.04.5: Ok
29 ubuntu:14.04.4: Ok
30 ubuntu:14.04.4-x-linaro-arm64: Ok
31 ubuntu:15.10: Ok
32 ubuntu:16.04: Ok
33 ubuntu:16.04-x-arm: Ok
34 ubuntu:16.04-x-arm64: Ok
35 ubuntu:16.04-x-powerpc: Ok
36 ubuntu:16.04-x-powerpc64: Ok
37 ubuntu:16.04-x-s390: Ok
38 ubuntu:16.10: Ok
39 ubuntu:17.04: Ok
[root@jouet ~]#

[root@zoo ~]# uname -a
Linux zoo 4.9.13-100.fc24.x86_64 #1 SMP Mon Feb 27 16:57:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@zoo ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip
[root@zoo ~]#

[acme@jouet linux]$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_pure_O: make
make_doc_O: make doc
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_static_O: make LDFLAGS=-static
make_help_O: make help
make_no_libnuma_O: make NO_LIBNUMA=1
make_clean_all_O: make clean all
make_no_libelf_O: make NO_LIBELF=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_tags_O: make tags
make_debug_O: make DEBUG=1
make_no_newt_O: make NO_NEWT=1
make_install_prefix_O: make install prefix=/tmp/krava
make_install_bin_O: make install-bin
make_perf_o_O: make perf.o
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_util_map_o_O: make util/map.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_install_O: make install
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
OK
[acme@jouet linux]$

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:07 PM
From: Elena Reshetova <elena.r...@intel.com>

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.

This helps avoid accidental refcounter overflows that might lead to
use-after-free situations.
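
To make the overflow hazard concrete, here is a small single-threaded sketch. It is only an illustration under stated assumptions: the demo_* helpers are hypothetical and use plain unsigned arithmetic, not the refcount_t API. A wrapping counter that reaches UINT_MAX goes back to 0 on the next get, which looks like "no references left" while users still hold the object; a saturating counter stays pinned at UINT_MAX, leaking the object instead of freeing it early.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded counters, for illustration only. */

/* Plain increment: UINT_MAX + 1 wraps to 0 (well defined for unsigned). */
static unsigned int demo_wrapping_get(unsigned int refs)
{
	return refs + 1;
}

/* Saturating increment: once at UINT_MAX the value no longer moves. */
static unsigned int demo_saturating_get(unsigned int refs)
{
	return refs == UINT_MAX ? UINT_MAX : refs + 1;
}

/*
 * Saturating put: a saturated or already-zero counter is left untouched.
 * Returns true only when this call dropped the last reference.
 */
static bool demo_saturating_put(unsigned int *refs)
{
	assert(refs != NULL);

	if (*refs == UINT_MAX || *refs == 0)
		return false;
	return --(*refs) == 0;
}
```

With these helpers, demo_wrapping_get(UINT_MAX) yields 0 while demo_saturating_get(UINT_MAX) stays at UINT_MAX, which is the difference the conversion is after.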

Signed-off-by: Elena Reshetova <elena.r...@intel.com>
Signed-off-by: David Windsor <dwin...@gmail.com>
Signed-off-by: Hans Liljestrand <ishk...@gmail.com>
Signed-off-by: Kees Kook <kees...@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Windsor <dwin...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Hans Liljestrand <ishk...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kees Kook <kees...@chromium.org>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: alsa-...@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-3-git-s...@intel.com
[ fixed mixed conversion to refcount in tests/cpumap.c ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/tests/cpumap.c | 2 +-
tools/perf/util/cpumap.c | 16 ++++++++--------
tools/perf/util/cpumap.h | 4 ++--
3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index f168a85992d0..4478773cdb97 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -66,7 +66,7 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
TEST_ASSERT_VAL("wrong nr", map->nr == 2);
TEST_ASSERT_VAL("wrong cpu", map->map[0] == 1);
TEST_ASSERT_VAL("wrong cpu", map->map[1] == 256);
- TEST_ASSERT_VAL("wrong refcnt", atomic_read(&map->refcnt) == 1);
+ TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
cpu_map__put(map);
return 0;
}
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 8c7504939113..39ad2caccf56 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -29,7 +29,7 @@ static struct cpu_map *cpu_map__default_new(void)
cpus->map[i] = i;

cpus->nr = nr_cpus;
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -43,7 +43,7 @@ static struct cpu_map *cpu_map__trim_new(int nr_cpus, int *tmp_cpus)
if (cpus != NULL) {
cpus->nr = nr_cpus;
memcpy(cpus->map, tmp_cpus, payload_size);
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -252,7 +252,7 @@ struct cpu_map *cpu_map__dummy_new(void)
if (cpus != NULL) {
cpus->nr = 1;
cpus->map[0] = -1;
- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -269,7 +269,7 @@ struct cpu_map *cpu_map__empty_new(int nr)
for (i = 0; i < nr; i++)
cpus->map[i] = -1;

- atomic_set(&cpus->refcnt, 1);
+ refcount_set(&cpus->refcnt, 1);
}

return cpus;
@@ -278,7 +278,7 @@ struct cpu_map *cpu_map__empty_new(int nr)
static void cpu_map__delete(struct cpu_map *map)
{
if (map) {
- WARN_ONCE(atomic_read(&map->refcnt) != 0,
+ WARN_ONCE(refcount_read(&map->refcnt) != 0,
"cpu_map refcnt unbalanced\n");
free(map);
}
@@ -287,13 +287,13 @@ static void cpu_map__delete(struct cpu_map *map)
struct cpu_map *cpu_map__get(struct cpu_map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

void cpu_map__put(struct cpu_map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
cpu_map__delete(map);
}

@@ -357,7 +357,7 @@ int cpu_map__build_map(struct cpu_map *cpus, struct cpu_map **res,
/* ensure we process id in increasing order */
qsort(c->map, c->nr, sizeof(int), cmp_ids);

- atomic_set(&c->refcnt, 1);
+ refcount_set(&c->refcnt, 1);
*res = c;
return 0;
}
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 1a0549af8f5c..e84491636c1b 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -3,13 +3,13 @@

#include <stdio.h>
#include <stdbool.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>

#include "perf.h"
#include "util/debug.h"

struct cpu_map {
- atomic_t refcnt;
+ refcount_t refcnt;
int nr;
int map[];
};
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:09 PM
From: Namhyung Kim <namh...@kernel.org>

It's convenient to use the pager when viewing many lines of output.

Note that setup_pager() should be called after perf_evlist__prepare_workload()
since they can interfere with each other regarding shared stdio streams.
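
Once output goes through a pager, the tool is writing into a pipe, and the reader can go away at any time (for example when the user quits the pager). The sketch below shows why a SIGPIPE handler is useful in that situation. It is a standalone illustration using only POSIX calls; demo_done, demo_sig_handler and demo_write_to_closed_pipe are hypothetical names, not part of perf.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

/* Set from the signal handler, polled by the main loop of a tool. */
static volatile sig_atomic_t demo_done;

static void demo_sig_handler(int sig)
{
	(void)sig;
	demo_done = 1;
}

/*
 * Write to a pipe whose read end is already closed, the situation a
 * tool is in after its pager exits. With a handler installed the
 * process is not killed: the handler runs and write() fails with EPIPE.
 * Returns true when exactly that happened.
 */
static bool demo_write_to_closed_pipe(void)
{
	int fds[2];
	ssize_t ret;
	int saved_errno;

	demo_done = 0;
	if (signal(SIGPIPE, demo_sig_handler) == SIG_ERR)
		return false;
	if (pipe(fds) < 0)
		return false;

	close(fds[0]);			/* the "pager" went away */
	ret = write(fds[1], "x", 1);	/* raises SIGPIPE, then fails */
	saved_errno = errno;
	close(fds[1]);

	return ret < 0 && saved_errno == EPIPE && demo_done == 1;
}
```

Without a handler, the default action for SIGPIPE terminates the process, so a loop of the form while (!done) would never get the chance to clean up.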

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: kerne...@lge.com
Link: http://lkml.kernel.org/r/20170224011251....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-ftrace.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d5b566ed7178..6087295f8827 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -195,6 +195,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
signal(SIGINT, sig_handler);
signal(SIGUSR1, sig_handler);
signal(SIGCHLD, sig_handler);
+ signal(SIGPIPE, sig_handler);

if (reset_tracing_files(ftrace) < 0)
goto out;
@@ -247,6 +248,8 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
goto out_close_fd;
}

+ setup_pager();
+
perf_evlist__start_workload(ftrace->evlist);

while (!done) {
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:09 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

Instead of assigning a variable to itself to silence the compiler, use
the attribute designed for that, avoiding this:

In file included from bench/futex-hash.c:24:
bench/futex.h:95:7: error: explicitly assigning value of variable of type 'pthread_attr_t *' to itself [-Werror,-Wself-assign]
attr = attr;
~~~~ ^ ~~~~
bench/futex.h:96:13: error: explicitly assigning value of variable of type 'size_t' (aka 'unsigned long') to itself [-Werror,-Wself-assign]
cpusetsize = cpusetsize;
~~~~~~~~~~ ^ ~~~~~~~~~~
bench/futex.h:97:9: error: explicitly assigning value of variable of type 'cpu_set_t *' (aka 'struct cpu_set_t *') to itself [-Werror,-Wself-assign]
cpuset = cpuset;
~~~~~~ ^ ~~~~~~

That is only triggered when HAVE_PTHREAD_ATTR_SETAFFINITY_NP isn't set.
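
The pattern can be sketched in isolation as follows. This is an illustration with hypothetical names: DEMO_MAYBE_UNUSED is a local stand-in for the __maybe_unused annotation, assumed here to expand to the GCC/clang "unused" attribute, and demo_setaffinity_stub takes void pointers so that the example does not depend on pthread types.

```c
#include <stddef.h>

/*
 * Local stand-in for the __maybe_unused annotation; assumed here to
 * expand to the GCC/clang "unused" attribute.
 */
#define DEMO_MAYBE_UNUSED __attribute__((unused))

/*
 * Hypothetical fallback stub: accepts the arguments and does nothing.
 * Annotating the parameters avoids unused-parameter warnings without
 * the "x = x;" self-assignments that clang rejects under -Wself-assign.
 */
static inline int demo_setaffinity_stub(void *attr DEMO_MAYBE_UNUSED,
					size_t cpusetsize DEMO_MAYBE_UNUSED,
					void *cpuset DEMO_MAYBE_UNUSED)
{
	return 0;
}
```

The attribute only tells the compiler that an unused parameter is intentional, so the stub behaves the same whether or not a caller passes meaningful values.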

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-14ws1d1elj...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/bench/futex.h | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h
index b2e06d1190d0..e44fd3239530 100644
--- a/tools/perf/bench/futex.h
+++ b/tools/perf/bench/futex.h
@@ -88,13 +88,11 @@ futex_cmp_requeue(u_int32_t *uaddr, u_int32_t val, u_int32_t *uaddr2, int nr_wak

#ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
#include <pthread.h>
-static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr,
- size_t cpusetsize,
- cpu_set_t *cpuset)
+#include <linux/compiler.h>
+static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
+ size_t cpusetsize __maybe_unused,
+ cpu_set_t *cpuset __maybe_unused)
{
- attr = attr;
- cpusetsize = cpusetsize;
- cpuset = cpuset;
return 0;
}
#endif
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:09 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

To aid in catching bugs when using atomics as a reference count.

This is a trimmed down version with just what is used by tools/ at
this point.
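
The core idea, a compare-and-exchange loop that refuses to move the counter away from 0 or from UINT_MAX, can be sketched in userspace with C11 atomics. This is a hedged analogue and not the header added by this patch: struct demo_refcount and the demo_refcount_* functions are hypothetical, and <stdatomic.h> stands in for the kernel-style atomic_cmpxchg helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of refcount_t, built on C11 atomics. */
struct demo_refcount {
	atomic_uint refs;
};

/* Take a reference unless the count is 0; saturate at UINT_MAX. */
static bool demo_refcount_inc_not_zero(struct demo_refcount *r)
{
	unsigned int val;

	assert(r != NULL);
	val = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (val == 0)
			return false;	/* object is already on its way out */
		if (val == UINT_MAX)
			return true;	/* saturated: leak rather than wrap */
		/* On failure, val is updated to the value currently stored. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &val, val + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}

/* Drop a reference; returns true only on the 1 -> 0 transition. */
static bool demo_refcount_dec_and_test(struct demo_refcount *r)
{
	unsigned int val;

	assert(r != NULL);
	val = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (val == UINT_MAX || val == 0)
			return false;	/* saturated or underflow: leave it alone */
		/* Release ordering: prior accesses complete before the drop. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &val, val - 1,
							  memory_order_release,
							  memory_order_relaxed))
			return val == 1;
	}
}
```

The increment uses relaxed ordering and the decrement uses release ordering, following the reasoning given in the header comment of the patch; the sketch omits the warnings the real header emits on saturation and underflow.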

After this, the patches submitted by Elena for tools/ doing the
conversion from atomic_ to refcount_ methods can be applied and tested.

To activate it, build perf with:

make DEBUG=1 -C tools/perf

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Elena Reshetova <elena.r...@intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-dqtxsumns9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/linux/refcount.h | 151 +++++++++++++++++++++++++++++++++++++++++
tools/perf/MANIFEST | 1 +
2 files changed, 152 insertions(+)
create mode 100644 tools/include/linux/refcount.h

diff --git a/tools/include/linux/refcount.h b/tools/include/linux/refcount.h
new file mode 100644
index 000000000000..a0177c1f55b1
--- /dev/null
+++ b/tools/include/linux/refcount.h
@@ -0,0 +1,151 @@
+#ifndef _TOOLS_LINUX_REFCOUNT_H
+#define _TOOLS_LINUX_REFCOUNT_H
+
+/*
+ * Variant of atomic_t specialized for reference counts.
+ *
+ * The interface matches the atomic_t interface (to aid in porting) but only
+ * provides the few functions one should use for reference counting.
+ *
+ * It differs in that the counter saturates at UINT_MAX and will not move once
+ * there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free issues.
+ *
+ * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
+ * and provide only what is strictly required for refcounts.
+ *
+ * The increments are fully relaxed; these will not provide ordering. The
+ * rationale is that whatever is used to obtain the object we're increasing the
+ * reference count on will provide the ordering. For locked data structures,
+ * its the lock acquire, for RCU/lockless data structures its the dependent
+ * load.
+ *
+ * Do note that inc_not_zero() provides a control dependency which will order
+ * future stores against the inc, this ensures we'll never modify the object
+ * if we did not in fact acquire a reference.
+ *
+ * The decrements will provide release order, such that all the prior loads and
+ * stores will be issued before, it also provides a control dependency, which
+ * will order us against the subsequent free().
+ *
+ * The control dependency is against the load of the cmpxchg (ll/sc) that
+ * succeeded. This means the stores aren't fully ordered, but this is fine
+ * because the 1->0 transition indicates no concurrency.
+ *
+ * Note that the allocator is responsible for ordering things between free()
+ * and alloc().
+ *
+ */
+
+#include <linux/atomic.h>
+#include <linux/kernel.h>
+
+#ifdef NDEBUG
+#define REFCOUNT_WARN(cond, str) (void)(cond)
+#define __refcount_check
+#else
+#define REFCOUNT_WARN(cond, str) BUG_ON(cond)
+#define __refcount_check __must_check
+#endif
+
+typedef struct refcount_struct {
+ atomic_t refs;
+} refcount_t;
+
+#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+
+static inline void refcount_set(refcount_t *r, unsigned int n)
+{
+ atomic_set(&r->refs, n);
+}
+
+static inline unsigned int refcount_read(const refcount_t *r)
+{
+ return atomic_read(&r->refs);
+}
+
+/*
+ * Similar to atomic_inc_not_zero(), will saturate at UINT_MAX and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ */
+static inline __refcount_check
+bool refcount_inc_not_zero(refcount_t *r)
+{
+ unsigned int old, new, val = atomic_read(&r->refs);
+
+ for (;;) {
+ new = val + 1;
+
+ if (!val)
+ return false;
+
+ if (unlikely(!new))
+ return true;
+
+ old = atomic_cmpxchg_relaxed(&r->refs, val, new);
+ if (old == val)
+ break;
+
+ val = old;
+ }
+
+ REFCOUNT_WARN(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/*
+ * Similar to atomic_inc(), will saturate at UINT_MAX and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller already has a
+ * reference on the object, will WARN when this is not so.
+ */
+static inline void refcount_inc(refcount_t *r)
+{
+ REFCOUNT_WARN(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+}
+
+/*
+ * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
+ * decrement when saturated at UINT_MAX.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides a control dependency such that free() must come after.
+ * See the comment on top.
+ */
+static inline __refcount_check
+bool refcount_sub_and_test(unsigned int i, refcount_t *r)
+{
+ unsigned int old, new, val = atomic_read(&r->refs);
+
+ for (;;) {
+ if (unlikely(val == UINT_MAX))
+ return false;
+
+ new = val - i;
+ if (new > val) {
+ REFCOUNT_WARN(new > val, "refcount_t: underflow; use-after-free.\n");
+ return false;
+ }
+
+ old = atomic_cmpxchg_release(&r->refs, val, new);
+ if (old == val)
+ break;
+
+ val = old;
+ }
+
+ return !new;
+}
+
+static inline __refcount_check
+bool refcount_dec_and_test(refcount_t *r)
+{
+ return refcount_sub_and_test(1, r);
+}
+
+
+#endif /* _TOOLS_LINUX_REFCOUNT_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index e2c52190cf28..28648c09dcd6 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -79,6 +79,7 @@ tools/include/uapi/linux/perf_event.h
tools/include/linux/poison.h
tools/include/linux/rbtree.h
tools/include/linux/rbtree_augmented.h
+tools/include/linux/refcount.h
tools/include/linux/string.h
tools/include/linux/stringify.h
tools/include/linux/types.h
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:10 PM
From: Elena Reshetova <elena.r...@intel.com>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental refcounter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <elena.r...@intel.com>
Signed-off-by: David Windsor <dwin...@gmail.com>
Signed-off-by: Hans Liljestrand <ishk...@gmail.com>
Signed-off-by: Kees Kook <kees...@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Windsor <dwin...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Hans Liljestrand <ishk...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kees Kook <kees...@chromium.org>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: alsa-...@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-8-git-s...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/evlist.c | 18 +++++++++---------
tools/perf/util/evlist.h | 4 ++--
2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b601f2814a30..564b924fb48a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -777,7 +777,7 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *md, bool check_messu
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
*/
- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return NULL;

head = perf_mmap__read_head(md);
@@ -794,7 +794,7 @@ perf_mmap__read_backward(struct perf_mmap *md)
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
*/
- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return NULL;

head = perf_mmap__read_head(md);
@@ -856,7 +856,7 @@ void perf_mmap__read_catchup(struct perf_mmap *md)
{
u64 head;

- if (!atomic_read(&md->refcnt))
+ if (!refcount_read(&md->refcnt))
return;

head = perf_mmap__read_head(md);
@@ -875,14 +875,14 @@ static bool perf_mmap__empty(struct perf_mmap *md)

static void perf_mmap__get(struct perf_mmap *map)
{
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
}

static void perf_mmap__put(struct perf_mmap *md)
{
- BUG_ON(md->base && atomic_read(&md->refcnt) == 0);
+ BUG_ON(md->base && refcount_read(&md->refcnt) == 0);

- if (atomic_dec_and_test(&md->refcnt))
+ if (refcount_dec_and_test(&md->refcnt))
perf_mmap__munmap(md);
}

@@ -894,7 +894,7 @@ void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
perf_mmap__write_tail(md, old);
}

- if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
+ if (refcount_read(&md->refcnt) == 1 && perf_mmap__empty(md))
perf_mmap__put(md);
}

@@ -937,7 +937,7 @@ static void perf_mmap__munmap(struct perf_mmap *map)
munmap(map->base, perf_mmap__mmap_len(map));
map->base = NULL;
map->fd = -1;
- atomic_set(&map->refcnt, 0);
+ refcount_set(&map->refcnt, 0);
}
auxtrace_mmap__munmap(&map->auxtrace_mmap);
}
@@ -1001,7 +1001,7 @@ static int perf_mmap__mmap(struct perf_mmap *map,
* evlist layer can't just drop it when filtering events in
* perf_evlist__filter_pollfd().
*/
- atomic_set(&map->refcnt, 2);
+ refcount_set(&map->refcnt, 2);
map->prev = 0;
map->mask = mp->mask;
map->base = mmap(NULL, perf_mmap__mmap_len(map), mp->prot,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 389b9ccdf8c7..39942995f537 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -1,7 +1,7 @@
#ifndef __PERF_EVLIST_H
#define __PERF_EVLIST_H 1

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/list.h>
#include <api/fd/array.h>
#include <stdio.h>
@@ -29,7 +29,7 @@ struct perf_mmap {
void *base;
int mask;
int fd;
- atomic_t refcnt;
+ refcount_t refcnt;
u64 prev;
struct auxtrace_mmap auxtrace_mmap;
char event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:10 PM
From: Elena Reshetova <elena.r...@intel.com>

The refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <elena.r...@intel.com>
Signed-off-by: David Windsor <dwin...@gmail.com>
Signed-off-by: Hans Liljestrand <ishk...@gmail.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Windsor <dwin...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Hans Liljestrand <ishk...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kees Cook <kees...@chromium.org>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: alsa-...@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-5-git-s...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/dso.c | 6 +++---
tools/perf/util/dso.h | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index d38b62a700ca..42db00d78573 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1109,7 +1109,7 @@ struct dso *dso__new(const char *name)
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
pthread_mutex_init(&dso->lock, NULL);
- atomic_set(&dso->refcnt, 1);
+ refcount_set(&dso->refcnt, 1);
}

return dso;
@@ -1147,13 +1147,13 @@ void dso__delete(struct dso *dso)
struct dso *dso__get(struct dso *dso)
{
if (dso)
- atomic_inc(&dso->refcnt);
+ refcount_inc(&dso->refcnt);
return dso;
}

void dso__put(struct dso *dso)
{
- if (dso && atomic_dec_and_test(&dso->refcnt))
+ if (dso && refcount_dec_and_test(&dso->refcnt))
dso__delete(dso);
}

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ecc4bbd3f82e..12350b171727 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -1,7 +1,7 @@
#ifndef __PERF_DSO
#define __PERF_DSO

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/types.h>
#include <linux/rbtree.h>
#include <sys/types.h>
@@ -187,7 +187,7 @@ struct dso {
void *priv;
u64 db_id;
};
- atomic_t refcnt;
+ refcount_t refcnt;
char name[0];
};

--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:10 PM
From: Elena Reshetova <elena.r...@intel.com>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <elena.r...@intel.com>
Signed-off-by: David Windsor <dwin...@gmail.com>
Signed-off-by: Hans Liljestrand <ishk...@gmail.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Windsor <dwin...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Hans Liljestrand <ishk...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kees Cook <kees...@chromium.org>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: alsa-...@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-10-git-s...@intel.com
[ Did missing tests/thread-map.c conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/thread-map.c | 6 +++---
tools/perf/util/thread_map.c | 20 ++++++++++----------
tools/perf/util/thread_map.h | 4 ++--
3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/perf/tests/thread-map.c b/tools/perf/tests/thread-map.c
index f2d2e542d0ee..a63d6945807b 100644
--- a/tools/perf/tests/thread-map.c
+++ b/tools/perf/tests/thread-map.c
@@ -29,7 +29,7 @@ int test__thread_map(int subtest __maybe_unused)
thread_map__comm(map, 0) &&
!strcmp(thread_map__comm(map, 0), NAME));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&map->refcnt) == 1);
+ refcount_read(&map->refcnt) == 1);
thread_map__put(map);

/* test dummy pid */
@@ -44,7 +44,7 @@ int test__thread_map(int subtest __maybe_unused)
thread_map__comm(map, 0) &&
!strcmp(thread_map__comm(map, 0), "dummy"));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&map->refcnt) == 1);
+ refcount_read(&map->refcnt) == 1);
thread_map__put(map);
return 0;
}
@@ -71,7 +71,7 @@ static int process_event(struct perf_tool *tool __maybe_unused,
thread_map__comm(threads, 0) &&
!strcmp(thread_map__comm(threads, 0), NAME));
TEST_ASSERT_VAL("wrong refcnt",
- atomic_read(&threads->refcnt) == 1);
+ refcount_read(&threads->refcnt) == 1);
thread_map__put(threads);
return 0;
}
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 7c3fcc538a70..9026408ea55b 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -66,7 +66,7 @@ struct thread_map *thread_map__new_by_pid(pid_t pid)
for (i = 0; i < items; i++)
thread_map__set_pid(threads, i, atoi(namelist[i]->d_name));
threads->nr = items;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

for (i=0; i<items; i++)
@@ -83,7 +83,7 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
if (threads != NULL) {
thread_map__set_pid(threads, 0, tid);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

return threads;
@@ -105,7 +105,7 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
goto out_free_threads;

threads->nr = 0;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);

while ((dirent = readdir(proc)) != NULL) {
char *end;
@@ -235,7 +235,7 @@ static struct thread_map *thread_map__new_by_pid_str(const char *pid_str)
out:
strlist__delete(slist);
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
return threads;

out_free_namelist:
@@ -255,7 +255,7 @@ struct thread_map *thread_map__new_dummy(void)
if (threads != NULL) {
thread_map__set_pid(threads, 0, -1);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}
return threads;
}
@@ -300,7 +300,7 @@ struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
}
out:
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
return threads;

out_free_threads:
@@ -326,7 +326,7 @@ static void thread_map__delete(struct thread_map *threads)
if (threads) {
int i;

- WARN_ONCE(atomic_read(&threads->refcnt) != 0,
+ WARN_ONCE(refcount_read(&threads->refcnt) != 0,
"thread map refcnt unbalanced\n");
for (i = 0; i < threads->nr; i++)
free(thread_map__comm(threads, i));
@@ -337,13 +337,13 @@ static void thread_map__delete(struct thread_map *threads)
struct thread_map *thread_map__get(struct thread_map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

void thread_map__put(struct thread_map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
thread_map__delete(map);
}

@@ -423,7 +423,7 @@ static void thread_map__copy_event(struct thread_map *threads,
threads->map[i].comm = strndup(event->entries[i].comm, 16);
}

- atomic_set(&threads->refcnt, 1);
+ refcount_set(&threads->refcnt, 1);
}

struct thread_map *thread_map__new_event(struct thread_map_event *event)
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index ea0ef08c6303..bd34d7a0b9fa 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -3,7 +3,7 @@

#include <sys/types.h>
#include <stdio.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>

struct thread_map_data {
pid_t pid;
@@ -11,7 +11,7 @@ struct thread_map_data {
};

struct thread_map {
- atomic_t refcnt;
+ refcount_t refcnt;
int nr;
struct thread_map_data map[];
};
--
2.9.3

Arnaldo Carvalho de Melo

Mar 6, 2017, 2:50:10 PM
From: Elena Reshetova <elena.r...@intel.com>

The refcount_t type and corresponding API should be used instead of
atomic_t when the variable is used as a reference counter.

This helps avoid accidental reference counter overflows that might lead to
use-after-free situations.

Signed-off-by: Elena Reshetova <elena.r...@intel.com>
Signed-off-by: David Windsor <dwin...@gmail.com>
Signed-off-by: Hans Liljestrand <ishk...@gmail.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Windsor <dwin...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Hans Liljestrand <ishk...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kees Cook <kees...@chromium.org>
Cc: Mark Rutland <mark.r...@arm.com>
Cc: Matija Glavinic Pecotic <matija.glavin...@nokia.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: alsa-...@alsa-project.org
Link: http://lkml.kernel.org/r/1487691303-31858-6-git-s...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/map.c | 6 +++---
tools/perf/util/map.h | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 0a943e7b1ea7..f0e2428efd0b 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -141,7 +141,7 @@ void map__init(struct map *map, enum map_type type,
RB_CLEAR_NODE(&map->rb_node);
map->groups = NULL;
map->erange_warned = false;
- atomic_set(&map->refcnt, 1);
+ refcount_set(&map->refcnt, 1);
}

struct map *map__new(struct machine *machine, u64 start, u64 len,
@@ -255,7 +255,7 @@ void map__delete(struct map *map)

void map__put(struct map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcount_dec_and_test(&map->refcnt))
map__delete(map);
}

@@ -354,7 +354,7 @@ struct map *map__clone(struct map *from)
struct map *map = memdup(from, sizeof(*map));

if (map != NULL) {
- atomic_set(&map->refcnt, 1);
+ refcount_set(&map->refcnt, 1);
RB_CLEAR_NODE(&map->rb_node);
dso__get(map->dso);
map->groups = NULL;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index abdacf800c98..9545ff343ec5 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -1,7 +1,7 @@
#ifndef __PERF_MAP_H
#define __PERF_MAP_H

-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/compiler.h>
#include <linux/list.h>
#include <linux/rbtree.h>
@@ -51,7 +51,7 @@ struct map {

struct dso *dso;
struct map_groups *groups;
- atomic_t refcnt;
+ refcount_t refcnt;
};

struct kmap {
@@ -150,7 +150,7 @@ struct map *map__clone(struct map *map);
static inline struct map *map__get(struct map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcount_inc(&map->refcnt);
return map;
}

--
2.9.3

Ingo Molnar

Mar 7, 2017, 2:20:05 AM
Pulled, thanks a lot Arnaldo!

Ingo

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
From: Changbin Du <chang...@intel.com>

Commit 2f3f9bcf000b ("perf tools: Add +field argument support for
--field option") by Jiri Olsa <jo...@kernel.org> introduced +field style
argument support for --field option.

This is useful, but the documentation was not updated. Add a short
description there.
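
To make the '+' prefix semantics concrete, here is a small sketch of how a
field list could be resolved: a leading '+' appends the given fields to the
default order, anything else replaces it. This is not perf's parser. The
function name and the default order string passed in by the caller are
assumptions made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve a -F/--fields style argument into 'out'.
 * Returns 0 on success, -1 on an empty "+" list or a too-small buffer.
 */
static int sketch_field_order(const char *arg, const char *defaults,
			      char *out, size_t size)
{
	int n;

	assert(arg && defaults && out);

	if (arg[0] == '+') {
		/* "+a,b" means: keep the defaults, then append a,b. */
		if (strlen(arg + 1) == 0)
			return -1;
		n = snprintf(out, size, "%s,%s", defaults, arg + 1);
	} else {
		/* No prefix: the argument replaces the defaults. */
		n = snprintf(out, size, "%s", arg);
	}

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with "+period,sample" and a made-up default of "overhead,comm", the
sketch yields "overhead,comm,period,sample". Called with "period" alone, it
yields just "period".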

Signed-off-by: Changbin Du <chang...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170313083252.23...@intel.com
[ Slightly improved the phrase structure ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 33f91906f5dc..672b149aa80a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -173,6 +173,9 @@ OPTIONS
By default, every sort keys not specified in -F will be appended
automatically.

+ If the keys starts with a prefix '+', then it will append the specified
+ field(s) to the default field order. For example: perf report -F +period,sample.
+
-p::
--parent=<regex>::
A regex filter to identify parent. The parent is a caller of this
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
From: Hari Bathini <hbat...@linux.vnet.ibm.com>

Introduce a new option to display events of type PERF_RECORD_NAMESPACES
and update perf-script documentation accordingly.

Shown below is the (trimmed) output of the perf script command with the
newly introduced option, on a perf.data file generated by perf record with
the --namespaces option.

$ perf script --show-namespace-events
swapper 0 [000] 0.000000: PERF_RECORD_NAMESPACES 1/1 - nr_namespaces: 7
[0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
swapper 0 [000] 0.000000: PERF_RECORD_NAMESPACES 2/2 - nr_namespaces: 7
[0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

Committer notes:

Testing it:

Investigating that double PERF_RECORD_NAMESPACES for the 19155
pid/tid... It's more than that: there are two PERF_RECORD_COMM as well,
and with zeroed timestamps, so probably a synthesizing artifact...

# perf script --show-task --show-namespace
<SNIP>
perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19154/19154
perf 0 [000] 0.000000: PERF_RECORD_FORK(19155:19155):(19154:19154)
perf 0 [000] 0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19155/19155
perf 0 [000] 0.000000: PERF_RECORD_COMM: perf:19155/19155
perf 0 [000] 0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
swapper 0 [000] 3110.881834: 1 cycles: ffffffffa7060bf6 native_write_msr (/lib/modules/4.11.0-rc1+/build/vmlinux)

<SNIP>

Signed-off-by: Hari Bathini <hbat...@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Aravinda Prasad <arav...@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Eric Biederman <ebie...@xmission.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sargun Dhillon <sar...@sargun.me>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/r/148891932627.25309.194...@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 3 +++
tools/perf/builtin-script.c | 40 ++++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 4ed5f239ba7d..62c9b0c77a3a 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -248,6 +248,9 @@ OPTIONS
--show-mmap-events
Display mmap related events (e.g. MMAP, MMAP2).

+--show-namespace-events
+ Display namespace events i.e. events of type PERF_RECORD_NAMESPACES.
+
--show-switch-events
Display context switch events i.e. events of type PERF_RECORD_SWITCH or
PERF_RECORD_SWITCH_CPU_WIDE.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index f1ce806a1f31..66d62c98dff9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -830,6 +830,7 @@ struct perf_script {
bool show_task_events;
bool show_mmap_events;
bool show_switch_events;
+ bool show_namespace_events;
bool allocated;
struct cpu_map *cpus;
struct thread_map *threads;
@@ -1118,6 +1119,41 @@ static int process_comm_event(struct perf_tool *tool,
return ret;
}

+static int process_namespaces_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ struct thread *thread;
+ struct perf_script *script = container_of(tool, struct perf_script, tool);
+ struct perf_session *session = script->session;
+ struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ int ret = -1;
+
+ thread = machine__findnew_thread(machine, event->namespaces.pid,
+ event->namespaces.tid);
+ if (thread == NULL) {
+ pr_debug("problem processing NAMESPACES event, skipping it.\n");
+ return -1;
+ }
+
+ if (perf_event__process_namespaces(tool, event, sample, machine) < 0)
+ goto out;
+
+ if (!evsel->attr.sample_id_all) {
+ sample->cpu = 0;
+ sample->time = 0;
+ sample->tid = event->namespaces.tid;
+ sample->pid = event->namespaces.pid;
+ }
+ print_sample_start(sample, thread, evsel);
+ perf_event__fprintf(event, stdout);
+ ret = 0;
+out:
+ thread__put(thread);
+ return ret;
+}
+
static int process_fork_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -1293,6 +1329,8 @@ static int __cmd_script(struct perf_script *script)
}
if (script->show_switch_events)
script->tool.context_switch = process_switch_event;
+ if (script->show_namespace_events)
+ script->tool.namespaces = process_namespaces_event;

ret = perf_session__process_events(script->session);

@@ -2181,6 +2219,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"Show the mmap events"),
OPT_BOOLEAN('\0', "show-switch-events", &script.show_switch_events,
"Show context switch events (if recorded)"),
+ OPT_BOOLEAN('\0', "show-namespace-events", &script.show_namespace_events,
+ "Show namespace events (if recorded)"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_BOOLEAN(0, "ns", &nanosecs,
"Use 9 decimal places when displaying time"),
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
From: Changbin Du <chang...@intel.com>

Skip samples that don't have branch_info, to avoid a segmentation
fault.

The fault can be reproduced by:

perf record -a
perf report -F cycles
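
The guard can be illustrated outside of perf with a reduced sketch. The
struct and function names below are hypothetical stand-ins, not the perf
types: an entry whose branch_info pointer is NULL is ordered before one that
has branch data, and only entries that both carry branch data are compared
by cycle count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for perf's branch_info and hist_entry. */
struct sketch_branch_info {
	unsigned short cycles;
};

struct sketch_entry {
	/* NULL when the sample carries no branch data. */
	struct sketch_branch_info *branch_info;
};

/*
 * Order NULL before non-NULL; two NULLs compare equal.
 * Only meant to be called when at least one pointer is NULL.
 */
static int64_t sketch_cmp_null(const void *l, const void *r)
{
	assert(!l || !r);

	if (!l && !r)
		return 0;
	return !l ? -1 : 1;
}

/* Compare by cycles, without dereferencing a missing branch_info. */
static int64_t sketch_cycles_cmp(const struct sketch_entry *left,
				 const struct sketch_entry *right)
{
	if (!left->branch_info || !right->branch_info)
		return sketch_cmp_null(left->branch_info, right->branch_info);

	return (int64_t)left->branch_info->cycles -
	       (int64_t)right->branch_info->cycles;
}
```

Without the NULL check, a recording made without branch sampling (such as a
plain 'perf record -a') would reach the final subtraction with a NULL
branch_info and crash.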

Signed-off-by: Changbin Du <chang...@intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Fixes: 0e332f033a82 ("perf tools: Add support for cycles, weight branch_info field")
Link: http://lkml.kernel.org/r/20170313083148.23...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/sort.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f8f16c0e20b6..93f755ac60ca 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -846,6 +846,9 @@ static int hist_entry__mispredict_snprintf(struct hist_entry *he, char *bf,
static int64_t
sort__cycles_cmp(struct hist_entry *left, struct hist_entry *right)
{
+ if (!left->branch_info || !right->branch_info)
+ return cmp_null(left->branch_info, right->branch_info);
+
return left->branch_info->flags.cycles -
right->branch_info->flags.cycles;
}
@@ -853,6 +856,8 @@ sort__cycles_cmp(struct hist_entry *left, struct hist_entry *right)
static int hist_entry__cycles_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
+ if (!he->branch_info)
+ return scnprintf(bf, size, "%-.*s", width, "N/A");
if (he->branch_info->flags.cycles == 0)
return repsep_snprintf(bf, size, "%-*s", width, "-");
return repsep_snprintf(bf, size, "%-*hd", width,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 84e5b549214f2160c12318aac549de85f600c79a:

Merge tag 'perf-core-for-mingo-4.11-20170306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-07 08:14:14 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170314

for you to fetch changes up to 5f6bee34707973ea7879a7857fd63ddccc92fff3:

kprobes: Convert kprobe_exceptions_notify to use NOKPROBE_SYMBOL (2017-03-14 15:17:40 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add PERF_RECORD_NAMESPACES so that the kernel can record information
required to associate samples to namespaces, helping in container
problem characterization.

Now 'perf record' has a '--namespaces' option to ask for such info,
and when present, it can be used, initially, via a new sort order,
'cgroup_id', allowing histogram entry bucketization by a (device, inode)
based cgroup identifier (Hari Bathini)

- Add --next option to 'perf sched timehist', showing which thread
runs next (Brendan Gregg)

Fixes:

- Fix segfault with basic block 'cycles' sort dimension (Changbin Du)

- Add c2c to command-list.txt, making it appear in the 'perf help'
output (Changbin Du)

- Fix zeroing of 'abs_path' variable in the perf hists browser switch
file code (Changbin Du)

- Hide tips messages when -q/--quiet is given to 'perf report' (Namhyung Kim)

Infrastructure:

- Use ref_reloc_sym + offset to setup kretprobes (Naveen Rao)

- Ignore generated files pmu-events/{jevents,pmu-events.c} for git (Changbin Du)

Documentation:

- Document +field style argument support for --field option (Changbin Du)

- Clarify 'perf c2c --stats' help message (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Brendan Gregg (1):
perf sched timehist: Add --next option

Changbin Du (5):
perf tools: Missing c2c command in command-list
perf tools: Ignore generated files pmu-events/{jevents,pmu-events.c} for git
perf sort: Fix segfault with basic block 'cycles' sort dimension
perf report: Document +field style argument support for --field option
perf hists browser: Fix typo in function switch_data_file

Hari Bathini (5):
perf: Add PERF_RECORD_NAMESPACES to include namespaces related info
perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info
perf record: Synthesize namespace events for current processes
perf script: Add script print support for namespace events
perf tools: Add 'cgroup_id' sort order keyword

Namhyung Kim (3):
perf report: Hide tip message when -q option is given
perf c2c: Clarify help message of --stats option
perf c2c: Fix display bug when using pipe

Naveen N. Rao (5):
perf probe: Factor out the ftrace README scanning
perf kretprobes: Offset from reloc_sym if kernel supports it
perf powerpc: Choose local entry point with kretprobes
doc: trace/kprobes: add information about NOKPROBE_SYMBOL
kprobes: Convert kprobe_exceptions_notify to use NOKPROBE_SYMBOL

Documentation/trace/kprobetrace.txt | 5 +-
include/linux/perf_event.h | 2 +
include/uapi/linux/perf_event.h | 32 +++++-
kernel/events/core.c | 139 ++++++++++++++++++++++++++
kernel/fork.c | 2 +
kernel/kprobes.c | 5 +-
kernel/nsproxy.c | 3 +
tools/include/uapi/linux/perf_event.h | 32 +++++-
tools/perf/.gitignore | 2 +
tools/perf/Documentation/perf-record.txt | 3 +
tools/perf/Documentation/perf-report.txt | 7 +-
tools/perf/Documentation/perf-sched.txt | 4 +
tools/perf/Documentation/perf-script.txt | 3 +
tools/perf/arch/powerpc/util/sym-handling.c | 14 ++-
tools/perf/builtin-annotate.c | 1 +
tools/perf/builtin-c2c.c | 4 +-
tools/perf/builtin-diff.c | 1 +
tools/perf/builtin-inject.c | 13 +++
tools/perf/builtin-kmem.c | 1 +
tools/perf/builtin-kvm.c | 2 +
tools/perf/builtin-lock.c | 1 +
tools/perf/builtin-mem.c | 1 +
tools/perf/builtin-record.c | 35 ++++++-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-sched.c | 26 ++++-
tools/perf/builtin-script.c | 41 ++++++++
tools/perf/builtin-trace.c | 3 +-
tools/perf/command-list.txt | 1 +
tools/perf/perf.h | 1 +
tools/perf/ui/browsers/hists.c | 2 +-
tools/perf/util/Build | 1 +
tools/perf/util/data-convert-bt.c | 1 +
tools/perf/util/event.c | 150 ++++++++++++++++++++++++++--
tools/perf/util/event.h | 19 ++++
tools/perf/util/evsel.c | 3 +
tools/perf/util/hist.c | 7 ++
tools/perf/util/hist.h | 1 +
tools/perf/util/machine.c | 34 +++++++
tools/perf/util/machine.h | 3 +
tools/perf/util/namespaces.c | 36 +++++++
tools/perf/util/namespaces.h | 26 +++++
tools/perf/util/probe-event.c | 12 +--
tools/perf/util/probe-file.c | 77 ++++++++------
tools/perf/util/probe-file.h | 1 +
tools/perf/util/session.c | 7 ++
tools/perf/util/sort.c | 46 +++++++++
tools/perf/util/sort.h | 7 ++
tools/perf/util/thread.c | 44 +++++++-
tools/perf/util/thread.h | 6 ++
tools/perf/util/tool.h | 2 +
50 files changed, 799 insertions(+), 74 deletions(-)
create mode 100644 tools/perf/util/namespaces.c
create mode 100644 tools/perf/util/namespaces.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.

Several are cross builds (the ones with -x-ARCH, and the android one). Those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event_open syscall to check that the perf_event_attr fields are set
up as expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build
tools/perf/ with a variety of feature sets, exercising the build with an
incomplete set of features as well as with a complete one. The plan is to run
it on each of the containers mentioned above, using some container
orchestration infrastructure. Get in touch if you are interested in helping
put this in place.

# dm
#

# uname -a
Linux zoo 4.9.13-100.fc24.x86_64 #1 SMP Mon Feb 27 16:57:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_debug_O: make DEBUG=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_pure_O: make
make_no_libbpf_O: make NO_LIBBPF=1
make_tags_O: make tags
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_perf_o_O: make perf.o
make_no_demangle_O: make NO_DEMANGLE=1
make_clean_all_O: make clean all
make_no_slang_O: make NO_SLANG=1
make_doc_O: make doc
make_no_newt_O: make NO_NEWT=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_install_bin_O: make install-bin
make_no_gtk2_O: make NO_GTK2=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_util_map_o_O: make util/map.o
make_no_libperl_O: make NO_LIBPERL=1
make_static_O: make LDFLAGS=-static
make_no_libunwind_O: make NO_LIBUNWIND=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_help_O: make help
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
OK
$

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
From: Changbin Du <chang...@intel.com>

Ignore two files: pmu-events/{jevents,pmu-events.c} which are generated
during the build.

Committer notes:

Testing it:

$ make -C tools/perf/
$ git status
On branch perf/core
Untracked files:
(use "git add <file>..." to include in what will be committed)

tools/perf/pmu-events/jevents
tools/perf/pmu-events/pmu-events.c

nothing added to commit but untracked files present (use "git add" to track)
$

After the patch:

$ git status
On branch perf/core
nothing to commit, working tree clean
$

Signed-off-by: Changbin Du <chang...@intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170313083026.23...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/.gitignore | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index 3db3db9278be..643cc4ba6872 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -31,3 +31,5 @@ config.mak.autogen
.config-detected
util/intel-pt-decoder/inat-tables.c
arch/*/include/generated/
+pmu-events/pmu-events.c
+pmu-events/jevents
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:05 PM
From: Hari Bathini <hbat...@linux.vnet.ibm.com>

Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior
to invocation of perf record. The data for this is taken from /proc/$PID/ns.
These changes make way for analyzing events with regard to namespaces.
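
As a rough illustration of where the data comes from, the sketch below
stat()s one /proc/<pid>/ns/<name> link and keeps its device and inode
numbers. The names get_ns_link() and struct ns_link are hypothetical
stand-ins for this example, not the functions or types added by the patch.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for perf's struct perf_ns_link_info. */
struct ns_link {
	unsigned long long dev;
	unsigned long long ino;
};

/*
 * Fill *out with the device and inode numbers behind /proc/<pid>/ns/<ns>.
 * Returns 0 on success, -1 if the link cannot be stat()ed; *out is left
 * untouched on failure.
 */
int get_ns_link(pid_t pid, const char *ns, struct ns_link *out)
{
	char path[128];
	struct stat st;

	snprintf(path, sizeof(path), "/proc/%d/ns/%s", (int)pid, ns);
	if (stat(path, &st) != 0)
		return -1;

	out->dev = (unsigned long long)st.st_dev;
	out->ino = (unsigned long long)st.st_ino;
	return 0;
}
```

Calling get_ns_link(getpid(), "mnt", &info) on a Linux system with /proc
mounted would be the typical use. Unlike the patch, this sketch reports
failure to the caller instead of silently leaving the fields zeroed.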

Committer notes:

Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the
test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread'.

Testing it:

# ps axH > /tmp/allthreads
# perf record -a --namespaces usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ]
# perf report -D | grep PERF_RECORD_NAMESPACES | wc -l
602
# wc -l /tmp/allthreads
601 /tmp/allthreads
# tail /tmp/allthreads
16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
17176 pts/4 T 0:00 git commit --amend --no-post-rewrite
17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG
18939 ? S 0:00 [kworker/2:1]
18947 ? S 0:00 [kworker/3:0]
18974 ? S 0:00 [kworker/1:0]
19047 ? S 0:00 [kworker/0:1]
19152 pts/6 S+ 0:00 weechat
19153 pts/7 R+ 0:00 ps axH
# perf report -D | grep PERF_RECORD_NAMESPACES | tail
0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7
0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7
0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7
0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7
0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7
0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7
0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7
0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7
0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
#

Hmm, we still need to investigate why we got two records for the 19155 pid/tid...

Signed-off-by: Hari Bathini <hbat...@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Aravinda Prasad <arav...@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Eric Biederman <ebie...@xmission.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sargun Dhillon <sar...@sargun.me>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/r/148891931111.25309.110...@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-record.c | 29 ++++++++++++--
tools/perf/util/event.c | 94 ++++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/event.h | 6 +++
3 files changed, 119 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 99562c7242b6..04faef79a548 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -986,6 +986,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
*/
if (forks) {
union perf_event *event;
+ pid_t tgid;

event = malloc(sizeof(event->comm) + machine->id_hdr_size);
if (event == NULL) {
@@ -999,10 +1000,30 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* cannot see a correct process name for those events.
* Synthesize COMM event to prevent it.
*/
- perf_event__synthesize_comm(tool, event,
- rec->evlist->workload.pid,
- process_synthesized_event,
- machine);
+ tgid = perf_event__synthesize_comm(tool, event,
+ rec->evlist->workload.pid,
+ process_synthesized_event,
+ machine);
+ free(event);
+
+ if (tgid == -1)
+ goto out_child;
+
+ event = malloc(sizeof(event->namespaces) +
+ (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
+ machine->id_hdr_size);
+ if (event == NULL) {
+ err = -ENOMEM;
+ goto out_child;
+ }
+
+ /*
+ * Synthesize NAMESPACES event for the command specified.
+ */
+ perf_event__synthesize_namespaces(tool, event,
+ rec->evlist->workload.pid,
+ tgid, process_synthesized_event,
+ machine);
free(event);

perf_evlist__start_workload(rec->evlist);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index fb52819023c7..d082cb70445d 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -221,6 +221,58 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
return tgid;
}

+static void perf_event__get_ns_link_info(pid_t pid, const char *ns,
+ struct perf_ns_link_info *ns_link_info)
+{
+ struct stat64 st;
+ char proc_ns[128];
+
+ sprintf(proc_ns, "/proc/%u/ns/%s", pid, ns);
+ if (stat64(proc_ns, &st) == 0) {
+ ns_link_info->dev = st.st_dev;
+ ns_link_info->ino = st.st_ino;
+ }
+}
+
+int perf_event__synthesize_namespaces(struct perf_tool *tool,
+ union perf_event *event,
+ pid_t pid, pid_t tgid,
+ perf_event__handler_t process,
+ struct machine *machine)
+{
+ u32 idx;
+ struct perf_ns_link_info *ns_link_info;
+
+ if (!tool || !tool->namespace_events)
+ return 0;
+
+ memset(&event->namespaces, 0, (sizeof(event->namespaces) +
+ (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
+ machine->id_hdr_size));
+
+ event->namespaces.pid = tgid;
+ event->namespaces.tid = pid;
+
+ event->namespaces.nr_namespaces = NR_NAMESPACES;
+
+ ns_link_info = event->namespaces.link_info;
+
+ for (idx = 0; idx < event->namespaces.nr_namespaces; idx++)
+ perf_event__get_ns_link_info(pid, perf_ns__name(idx),
+ &ns_link_info[idx]);
+
+ event->namespaces.header.type = PERF_RECORD_NAMESPACES;
+
+ event->namespaces.header.size = (sizeof(event->namespaces) +
+ (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
+ machine->id_hdr_size);
+
+ if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
+ return -1;
+
+ return 0;
+}
+
static int perf_event__synthesize_fork(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid, pid_t ppid,
@@ -452,8 +504,9 @@ int perf_event__synthesize_modules(struct perf_tool *tool,
static int __event__synthesize_thread(union perf_event *comm_event,
union perf_event *mmap_event,
union perf_event *fork_event,
+ union perf_event *namespaces_event,
pid_t pid, int full,
- perf_event__handler_t process,
+ perf_event__handler_t process,
struct perf_tool *tool,
struct machine *machine,
bool mmap_data,
@@ -473,6 +526,11 @@ static int __event__synthesize_thread(union perf_event *comm_event,
if (tgid == -1)
return -1;

+ if (perf_event__synthesize_namespaces(tool, namespaces_event, pid,
+ tgid, process, machine) < 0)
+ return -1;
+
+
return perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
process, machine, mmap_data,
proc_map_timeout);
@@ -506,6 +564,11 @@ static int __event__synthesize_thread(union perf_event *comm_event,
if (perf_event__synthesize_fork(tool, fork_event, _pid, tgid,
ppid, process, machine) < 0)
break;
+
+ if (perf_event__synthesize_namespaces(tool, namespaces_event, _pid,
+ tgid, process, machine) < 0)
+ break;
+
/*
* Send the prepared comm event
*/
@@ -534,6 +597,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
unsigned int proc_map_timeout)
{
union perf_event *comm_event, *mmap_event, *fork_event;
+ union perf_event *namespaces_event;
int err = -1, thread, j;

comm_event = malloc(sizeof(comm_event->comm) + machine->id_hdr_size);
@@ -548,10 +612,16 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
if (fork_event == NULL)
goto out_free_mmap;

+ namespaces_event = malloc(sizeof(namespaces_event->namespaces) +
+ (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
+ machine->id_hdr_size);
+ if (namespaces_event == NULL)
+ goto out_free_fork;
+
err = 0;
for (thread = 0; thread < threads->nr; ++thread) {
if (__event__synthesize_thread(comm_event, mmap_event,
- fork_event,
+ fork_event, namespaces_event,
thread_map__pid(threads, thread), 0,
process, tool, machine,
mmap_data, proc_map_timeout)) {
@@ -577,7 +647,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
/* if not, generate events for it */
if (need_leader &&
__event__synthesize_thread(comm_event, mmap_event,
- fork_event,
+ fork_event, namespaces_event,
comm_event->comm.pid, 0,
process, tool, machine,
mmap_data, proc_map_timeout)) {
@@ -586,6 +656,8 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
}
}
}
+ free(namespaces_event);
+out_free_fork:
free(fork_event);
out_free_mmap:
free(mmap_event);
@@ -605,6 +677,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
char proc_path[PATH_MAX];
struct dirent *dirent;
union perf_event *comm_event, *mmap_event, *fork_event;
+ union perf_event *namespaces_event;
int err = -1;

if (machine__is_default_guest(machine))
@@ -622,11 +695,17 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
if (fork_event == NULL)
goto out_free_mmap;

+ namespaces_event = malloc(sizeof(namespaces_event->namespaces) +
+ (NR_NAMESPACES * sizeof(struct perf_ns_link_info)) +
+ machine->id_hdr_size);
+ if (namespaces_event == NULL)
+ goto out_free_fork;
+
snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
proc = opendir(proc_path);

if (proc == NULL)
- goto out_free_fork;
+ goto out_free_namespaces;

while ((dirent = readdir(proc)) != NULL) {
char *end;
@@ -638,13 +717,16 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
* We may race with exiting thread, so don't stop just because
* one thread couldn't be synthesized.
*/
- __event__synthesize_thread(comm_event, mmap_event, fork_event, pid,
- 1, process, tool, machine, mmap_data,
+ __event__synthesize_thread(comm_event, mmap_event, fork_event,
+ namespaces_event, pid, 1, process,
+ tool, machine, mmap_data,
proc_map_timeout);
}

err = 0;
closedir(proc);
+out_free_namespaces:
+ free(namespaces_event);
out_free_fork:
free(fork_event);
out_free_mmap:
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index b39ff795b9a9..e1d8166ebbd5 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -648,6 +648,12 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine);

+int perf_event__synthesize_namespaces(struct perf_tool *tool,
+ union perf_event *event,
+ pid_t pid, pid_t tgid,
+ perf_event__handler_t process,
+ struct machine *machine);
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:06 PM
From: Brendan Gregg <bgr...@netflix.com>

The --next option shows the next task for each context switch, providing
more context for the sequence of scheduler events.

$ perf sched timehist --next | head
Samples do not have callchains.
time cpu task name waittime schdelay run time
[tid/pid] (msec) (msec) (msec)
---------- --- ---------- --------- ------ -----
374.793792 [0] <idle> 0.000 0.000 0.000 next: rngd[1524]
374.793801 [0] rngd[1524] 0.000 0.000 0.009 next: swapper/0[0]
374.794048 [7] <idle> 0.000 0.000 0.000 next: yes[30884]
374.794066 [7] yes[30884] 0.000 0.000 0.018 next: swapper/7[0]
374.794126 [2] <idle> 0.000 0.000 0.000 next: rngd[1524]
374.794140 [2] rngd[1524] 0.325 0.006 0.013 next: swapper/2[0]
374.794281 [3] <idle> 0.000 0.000 0.000 next: perf[31070]
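
The extra column in the output above is a bounded string of the form
"next: comm[pid]". A minimal sketch of that formatting step, assuming a
hypothetical helper name format_next_task() and a caller-supplied buffer:

```c
#include <stdio.h>

/*
 * Format the "next: comm[pid]" column into buf, truncating if the text
 * does not fit. Returns what snprintf() returns: the length the full
 * string would have had.
 */
int format_next_task(char *buf, size_t size, const char *next_comm,
		     int next_pid)
{
	return snprintf(buf, size, "next: %s[%d]", next_comm, next_pid);
}
```

With a 30-byte buffer, as used for nstr in the patch, long command names
are cut off rather than overflowing the buffer.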

Signed-off-by: Brendan Gregg <bgr...@netflix.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1489456589-32555-1-g...@netflix.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-sched.txt | 4 ++++
tools/perf/builtin-sched.c | 25 ++++++++++++++++++++-----
2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index d33deddb0146..a092a2499e8f 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -132,6 +132,10 @@ OPTIONS for 'perf sched timehist'
--migrations::
Show migration events.

+-n::
+--next::
+ Show next task.
+
-I::
--idle-hist::
Show idle-related events only.
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 16170e9b47e6..b92c4d97192c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -221,6 +221,7 @@ struct perf_sched {
unsigned int max_stack;
bool show_cpu_visual;
bool show_wakeups;
+ bool show_next;
bool show_migrations;
bool show_state;
u64 skipped_samples;
@@ -1897,14 +1898,18 @@ static char task_state_char(struct thread *thread, int state)
}

static void timehist_print_sample(struct perf_sched *sched,
+ struct perf_evsel *evsel,
struct perf_sample *sample,
struct addr_location *al,
struct thread *thread,
u64 t, int state)
{
struct thread_runtime *tr = thread__priv(thread);
+ const char *next_comm = perf_evsel__strval(evsel, sample, "next_comm");
+ const u32 next_pid = perf_evsel__intval(evsel, sample, "next_pid");
u32 max_cpus = sched->max_cpu + 1;
char tstr[64];
+ char nstr[30];
u64 wait_time;

timestamp__scnprintf_usec(t, tstr, sizeof(tstr));
@@ -1937,7 +1942,12 @@ static void timehist_print_sample(struct perf_sched *sched,
if (sched->show_state)
printf(" %5c ", task_state_char(thread, state));

- if (sched->show_wakeups)
+ if (sched->show_next) {
+ snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
+ printf(" %-*s", comm_width, nstr);
+ }
+
+ if (sched->show_wakeups && !sched->show_next)
printf(" %-*s", comm_width, "");

if (thread->tid == 0)
@@ -2531,7 +2541,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
}

if (!sched->summary_only)
- timehist_print_sample(sched, sample, &al, thread, t, state);
+ timehist_print_sample(sched, evsel, sample, &al, thread, t, state);

out:
if (sched->hist_time.start == 0 && t >= ptime->start)
@@ -3341,6 +3351,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('S', "with-summary", &sched.summary,
"Show all syscalls and summary with statistics"),
OPT_BOOLEAN('w', "wakeups", &sched.show_wakeups, "Show wakeup events"),
+ OPT_BOOLEAN('n', "next", &sched.show_next, "Show next task"),
OPT_BOOLEAN('M', "migrations", &sched.show_migrations, "Show migration events"),
OPT_BOOLEAN('V', "cpu-visual", &sched.show_cpu_visual, "Add CPU visual"),
OPT_BOOLEAN('I', "idle-hist", &sched.idle_hist, "Show idle events only"),
@@ -3438,10 +3449,14 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
if (argc)
usage_with_options(timehist_usage, timehist_options);
}
- if (sched.show_wakeups && sched.summary_only) {
- pr_err(" Error: -s and -w are mutually exclusive.\n");
+ if ((sched.show_wakeups || sched.show_next) &&
+ sched.summary_only) {
+ pr_err(" Error: -s and -[n|w] are mutually exclusive.\n");
parse_options_usage(timehist_usage, timehist_options, "s", true);
- parse_options_usage(NULL, timehist_options, "w", true);
+ if (sched.show_wakeups)
+ parse_options_usage(NULL, timehist_options, "w", true);
+ if (sched.show_next)
+ parse_options_usage(NULL, timehist_options, "n", true);
return -EINVAL;
}

--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:06 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

perf now uses an offset from _text/_stext for kretprobes if the kernel
supports it, rather than the actual function name. As such, let's choose
the LEP for powerpc ABIv2 so as to ensure the probe gets hit. Do it only
if the kernel supports specifying offsets with kretprobes.
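
For background, the ELFv2 ABI encodes the distance from the global entry
point (GEP) to the local entry point (LEP) in three bits of a symbol's
st_other field. The sketch below decodes that field the same way
PPC64_LOCAL_ENTRY_OFFSET() in <elf.h> does, but under local, hypothetical
names so it builds without the powerpc definitions.

```c
/* Bit layout of st_other for ELFv2, mirrored here under local names. */
#define LOCAL_BIT	5
#define LOCAL_MASK	(7 << LOCAL_BIT)

/* Return the byte offset of the local entry point from the symbol start. */
unsigned int local_entry_offset(unsigned char st_other)
{
	unsigned int field = (st_other & LOCAL_MASK) >> LOCAL_BIT;

	/* Field values 0 and 1 mean "no separate local entry point". */
	return ((1u << field) >> 2) << 2;
}
```

A function with the common two-instruction global entry prologue has the
field set to 3, which decodes to an offset of 8 bytes.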

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/7445b5334673ef5404ac1d12609bad4d73...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/powerpc/util/sym-handling.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index 1030a6e504bb..39dbe512b9fc 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -10,6 +10,7 @@
#include "symbol.h"
#include "map.h"
#include "probe-event.h"
+#include "probe-file.h"

#ifdef HAVE_LIBELF_SUPPORT
bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
@@ -79,13 +80,18 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
* However, if the user specifies an offset, we fall back to using the
* GEP since all userspace applications (objdump/readelf) show function
* disassembly with offsets from the GEP.
- *
- * In addition, we shouldn't specify an offset for kretprobes.
*/
- if (pev->point.offset || (!pev->uprobes && pev->point.retprobe) ||
- !map || !sym)
+ if (pev->point.offset || !map || !sym)
return;

+ /* For kretprobes, add an offset only if the kernel supports it */
+ if (!pev->uprobes && pev->point.retprobe) {
+#ifdef HAVE_LIBELF_SUPPORT
+ if (!kretprobe_offset_is_supported())
+#endif
+ return;
+ }
+
lep_offset = PPC64_LOCAL_ENTRY_OFFSET(sym->arch_sym);

if (map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:06 PM
From: Namhyung Kim <namh...@kernel.org>

Currently 'perf c2c report' determines the display mode using the --stdio
option, which is a problem when stdout is not a tty, since
setup_browser() falls back to stdio in that case.

perf c2c didn't know about the fallback and tried to use the TUI browser
anyway. It should check the "use_browser" variable instead.
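
The decision being described can be sketched as a small pure function.
pick_display_mode() is a hypothetical name used only for this
illustration; in perf the result of the fallback lives in the
use_browser variable, where 0 means stdio.

```c
/*
 * Hypothetical helper: return 0 for stdio output and 1 for the TUI.
 * The TUI is only usable when stdout is a terminal, so a missing tty
 * forces stdio even if --stdio was not given.
 */
int pick_display_mode(int want_stdio, int stdout_is_tty)
{
	if (want_stdio || !stdout_is_tty)
		return 0;
	return 1;
}
```

A caller would pass isatty(STDOUT_FILENO) as the second argument, so
piping the output into head selects stdio automatically.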

For example, the following command showed nothing and broke the terminal
settings. Now it's fixed:

$ perf c2c report | head
=================================================
Trace Event Information
=================================================
Total records : 136
Locked Load/Store Operations : 6
Load Operations : 62
Loads - uncacheable : 0
Loads - IO : 1
Loads - Miss : 7
Loads - no mapping : 2

Committer notes:

When trying it without a proper perf.data file it results in a stuck
terminal, just as Namhyung reported above:

[acme@jouet ~]$ perf c2c report | head
WARNING: no sample cpu value[acme@jouet ~]$

One has to kill it from some other xterm. Confirm that this patch fixes
it:

After:

$ perf c2c report | head
WARNING: no sample cpu value=================================================
Trace Event Information
=================================================
Total records : 14
Locked Load/Store Operations : 0
Load Operations : 0
Loads - uncacheable : 0
Loads - IO : 0
Loads - Miss : 0
Loads - no mapping : 0
$

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: kerne...@lge.com
Link: http://lkml.kernel.org/r/20170307150851....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-c2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3fac30ed92f1..5cd6d7a047b9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2334,7 +2334,7 @@ static int perf_c2c__hists_browse(struct hists *hists)

static void perf_c2c_display(struct perf_session *session)
{
- if (c2c.use_stdio)
+ if (use_browser == 0)
perf_c2c__hists_fprintf(stdout, session);
else
perf_c2c__hists_browse(&c2c.hists.hists);
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:06 PM
From: Namhyung Kim <namh...@kernel.org>

The tip message at the end was printed regardless of the -q option.

Originally, the message only suggested the '-s comm,dso' option for a
higher level view when no sort option and no parent option were given.

Now it shows a random help message regardless of the options, so the
condition can be simplified to honor the -q option.

Committer notes:

Before:

$ perf report --stdio -q
42.77% ls ls [.] _init
13.21% ls ld-2.24.so [.] match_symbol
12.55% ls libc-2.24.so [.] __strcoll_l
11.94% ls libc-2.24.so [.] _init

#
# (Tip: Show current config key-value pairs: perf config --list)
#
$

After:

$ perf report --stdio -q
42.77% ls ls [.] _init
13.21% ls ld-2.24.so [.] match_symbol
12.55% ls libc-2.24.so [.] __strcoll_l
11.94% ls libc-2.24.so [.] _init

$

We still have those two extra lines though (that git commit insists on
turning into one, or git commit --amend doesn't make me add), food for
another patch...

Reported-and-Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Signed-off-by: Namhyung Kim <namh...@kernel.org>
Cc: Jiri Olsa <jo...@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-report.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0a88670e56f3..f03a5eac2a62 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -394,8 +394,7 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
fprintf(stdout, "\n\n");
}

- if (sort_order == NULL &&
- parent_pattern == default_parent_pattern)
+ if (!quiet)
fprintf(stdout, "#\n# (%s)\n#\n", help);

if (rep->show_threads) {
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:06 PM
From: Changbin Du <chang...@intel.com>

Add the c2c command to command-list.txt so perf help can list this
command.

Committer notes:

Before:

# perf help | grep c2c
#

After:

# perf help | grep c2c
c2c Shared Data C2C/HITM Analyzer.
#

Signed-off-by: Changbin Du <chang...@intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170313082845.23...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/command-list.txt | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index ac3efd396a72..2d0caf20ff3a 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -9,6 +9,7 @@ perf-buildid-cache mainporcelain common
perf-buildid-list mainporcelain common
perf-data mainporcelain common
perf-diff mainporcelain common
+perf-c2c mainporcelain common
perf-config mainporcelain common
perf-evlist mainporcelain common
perf-ftrace mainporcelain common
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:08 PM
From: Hari Bathini <hbat...@linux.vnet.ibm.com>

This patch introduces a cgroup identifier entry field in perf report to
identify or distinguish data of different cgroups. It uses the device
number and inode number of cgroup namespace, included in perf data with
the new PERF_RECORD_NAMESPACES event, as cgroup identifier.

With the assumption that each container is created with its own cgroup
namespace, this allows assessment/analysis of multiple containers at
once.
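
To make the idea of a (device, inode) pair acting as an identifier
concrete, here is a small sketch that orders such pairs by device first
and inode second. struct ns_id and ns_id_cmp() are hypothetical names for
this example. Note that it uses explicit comparisons, which is a
different choice from the subtraction-based comparator in the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of the patch's struct namespace_id. */
struct ns_id {
	uint64_t dev;
	uint64_t ino;
};

/* qsort() comparator: order by device number, then by inode number. */
int ns_id_cmp(const void *a, const void *b)
{
	const struct ns_id *l = a;
	const struct ns_id *r = b;

	if (l->dev != r->dev)
		return l->dev < r->dev ? -1 : 1;
	if (l->ino != r->ino)
		return l->ino < r->ino ? -1 : 1;
	return 0;
}
```

Comparing instead of subtracting keeps the result correct even when two
64-bit values differ by more than a signed integer can represent.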

A simple test for this would be to clone a few processes, passing the
SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute a shell and run
different workloads in each of those contexts, while running the perf
record command with the --namespaces option.

Shown below is the output of perf report, sorted with cgroup identifier,
on perf.data generated with the above test scenario, clearly indicating
one context's considerable use of kernel memory in comparison with
others:

$ perf report -s cgroup_id,sample --stdio
#
# Total Lost Samples: 0
#
# Samples: 5K of event 'kmem:kmalloc'
# Event count (approx.): 5965
#
# Overhead cgroup id (dev/inode) Samples
# ........ ..................... ............
#
81.27% 3/0xeffffffb 4848
16.24% 3/0xf00000d0 969
1.16% 3/0xf00000ce 69
0.82% 3/0xf00000cf 49
0.50% 0/0x0 30

While this is a start, there is further scope for improving this. For
example, instead of the cgroup namespace's device and inode numbers, the
device and inode numbers of some or all namespaces may be used to
distinguish which processes are running in a given container context.

Also, scripts to map device and inode info to containers sound
plausible for better tracing of containers.

Signed-off-by: Hari Bathini <hbat...@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Aravinda Prasad <arav...@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Eric Biederman <ebie...@xmission.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sargun Dhillon <sar...@sargun.me>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/r/148891933338.25309.75...@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 4 +++-
tools/perf/util/hist.c | 7 ++++++
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 41 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 7 ++++++
5 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 672b149aa80a..e9a61f5485eb 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -72,7 +72,8 @@ OPTIONS
--sort=::
Sort histogram entries by given key(s) - multiple keys can be specified
in CSV format. Following sort keys are available:
- pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, local_weight.
+ pid, comm, dso, symbol, parent, cpu, socket, srcline, weight,
+ local_weight, cgroup_id.

Each key has following meaning:

@@ -92,6 +93,7 @@ OPTIONS
- weight: Event specific weight, e.g. memory latency or transaction
abort cost. This is the global weight.
- local_weight: Local weight version of the weight above.
+ - cgroup_id: ID derived from cgroup namespace device and inode numbers.
- transaction: Transaction abort flags.
- overhead: Overhead percentage of sample
- overhead_sys: Overhead percentage of sample running in system mode
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index eaf72a938fb4..e3b38f629504 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -3,6 +3,7 @@
#include "hist.h"
#include "map.h"
#include "session.h"
+#include "namespaces.h"
#include "sort.h"
#include "evlist.h"
#include "evsel.h"
@@ -169,6 +170,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
}

+ hists__new_col_len(hists, HISTC_CGROUP_ID, 20);
hists__new_col_len(hists, HISTC_CPU, 3);
hists__new_col_len(hists, HISTC_SOCKET, 6);
hists__new_col_len(hists, HISTC_MEM_LOCKED, 6);
@@ -574,9 +576,14 @@ __hists__add_entry(struct hists *hists,
bool sample_self,
struct hist_entry_ops *ops)
{
+ struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
.comm = thread__comm(al->thread),
+ .cgroup_id = {
+ .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
+ .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
+ },
.ms = {
.map = al->map,
.sym = al->sym,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2e839bf40bdd..ee3670a388df 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -30,6 +30,7 @@ enum hist_column {
HISTC_DSO,
HISTC_THREAD,
HISTC_COMM,
+ HISTC_CGROUP_ID,
HISTC_PARENT,
HISTC_CPU,
HISTC_SOCKET,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 93f755ac60ca..8b0d4e39f640 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -536,6 +536,46 @@ struct sort_entry sort_cpu = {
.se_width_idx = HISTC_CPU,
};

+/* --sort cgroup_id */
+
+static int64_t _sort__cgroup_dev_cmp(u64 left_dev, u64 right_dev)
+{
+ return (int64_t)(right_dev - left_dev);
+}
+
+static int64_t _sort__cgroup_inode_cmp(u64 left_ino, u64 right_ino)
+{
+ return (int64_t)(right_ino - left_ino);
+}
+
+static int64_t
+sort__cgroup_id_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ int64_t ret;
+
+ ret = _sort__cgroup_dev_cmp(right->cgroup_id.dev, left->cgroup_id.dev);
+ if (ret != 0)
+ return ret;
+
+ return _sort__cgroup_inode_cmp(right->cgroup_id.ino,
+ left->cgroup_id.ino);
+}
+
+static int hist_entry__cgroup_id_snprintf(struct hist_entry *he,
+ char *bf, size_t size,
+ unsigned int width __maybe_unused)
+{
+ return repsep_snprintf(bf, size, "%lu/0x%lx", he->cgroup_id.dev,
+ he->cgroup_id.ino);
+}
+
+struct sort_entry sort_cgroup_id = {
+ .se_header = "cgroup id (dev/inode)",
+ .se_cmp = sort__cgroup_id_cmp,
+ .se_snprintf = hist_entry__cgroup_id_snprintf,
+ .se_width_idx = HISTC_CGROUP_ID,
+};
+
/* --sort socket */

static int64_t
@@ -1464,6 +1504,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_TRANSACTION, "transaction", sort_transaction),
DIM(SORT_TRACE, "trace", sort_trace),
DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size),
+ DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
};

#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f583325a3743..baf20a399f34 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -54,6 +54,11 @@ struct he_stat {
u32 nr_events;
};

+struct namespace_id {
+ u64 dev;
+ u64 ino;
+};
+
struct hist_entry_diff {
bool computed;
union {
@@ -91,6 +96,7 @@ struct hist_entry {
struct map_symbol ms;
struct thread *thread;
struct comm *comm;
+ struct namespace_id cgroup_id;
u64 ip;
u64 transaction;
s32 socket;
@@ -212,6 +218,7 @@ enum sort_type {
SORT_TRANSACTION,
SORT_TRACE,
SORT_SYM_SIZE,
+ SORT_CGROUP_ID,

/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:08 PM
From: Changbin Du <chang...@intel.com>

switch_data_file() should clear the 'abs_path' buffer, not 'options' a
second time.
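
This kind of copy-and-paste slip can be made harder to write by tying the
size to the array being cleared. The sketch below is only an
illustration: CLEAR_ARRAY() and is_all_zero() are hypothetical helpers,
not part of perf.

```c
#include <string.h>

/* Clear a true array (not a pointer); the size always matches the target. */
#define CLEAR_ARRAY(arr) memset((arr), 0, sizeof(arr))

/* Return 1 if every byte in buf[0..len) is zero, 0 otherwise. */
int is_all_zero(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```

With such a macro, CLEAR_ARRAY(options) followed by CLEAR_ARRAY(abs_path)
leaves no room for naming one buffer and sizing another.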

Signed-off-by: Changbin Du <chang...@intel.com>
Cc: Feng Tang <feng...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Fixes: 341487ab561f ("perf hists browser: Add option for runtime switching perf data file")
Link: http://lkml.kernel.org/r/20170313114652.9...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/hists.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index fc4fb669ceee..2dc82bec10c0 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2308,7 +2308,7 @@ static int switch_data_file(void)
return ret;

memset(options, 0, sizeof(options));
- memset(options, 0, sizeof(abs_path));
+ memset(abs_path, 0, sizeof(abs_path));

while ((dent = readdir(pwd_dir))) {
char path[PATH_MAX];
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:09 PM
From: Namhyung Kim <namh...@kernel.org>

The --stats option does not strictly ask for stdio output only, but it
implies using it, so fix its description.

Signed-off-by: Namhyung Kim <namh...@kernel.org>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: kerne...@lge.com
Link: http://lkml.kernel.org/r/20170307150851....@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-c2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e2b21723bbf8..3fac30ed92f1 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2536,7 +2536,7 @@ static int perf_c2c__report(int argc, const char **argv)
OPT_BOOLEAN(0, "stdio", &c2c.use_stdio, "Use the stdio interface"),
#endif
OPT_BOOLEAN(0, "stats", &c2c.stats_only,
- "Use the stdio interface"),
+ "Display only statistic tables (implies --stdio)"),
OPT_BOOLEAN(0, "full-symbols", &c2c.symbol_full,
"Display full length of symbols"),
OPT_BOOLEAN(0, "no-source", &no_source,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:09 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

Support for accepting sym+offset with kretprobes is indicated through a
line in the ftrace README. Parse that line to identify support and choose
the appropriate format for kprobe_events.

As an example, without this perf patch, but with the ftrace changes:

naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/tracing/README | grep kretprobe
place (kretprobe): [<module>:]<symbol>[+<offset>]|<memaddr>
naveen@ubuntu:~/linux/tools/perf$
naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7d8]
Probe point found: do_open+0
Matched function: do_open [35d76b5]
found inline addr: 0xc0000000004ba984
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open do_open+0
Writing event: r:probe/do_open_1 do_open+0
Added new events:
probe:do_open (on do_open%return)
probe:do_open_1 (on do_open%return)

You can now use it in all perf tools, such as:

perf record -e probe:do_open_1 -aR sleep 1

naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED]
c0000000004433d0 r do_open+0x0 [DISABLED]
c0000000004433d0 r do_open+0x0 [DISABLED]

And after this patch (and the subsequent powerpc patch):

naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7d8]
Probe point found: do_open+0
Matched function: do_open [35d76b5]
found inline addr: 0xc0000000004ba984
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//README write=0
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open _text+4469712
Writing event: r:probe/do_open_1 _text+4956248
Added new events:
probe:do_open (on do_open%return)
probe:do_open_1 (on do_open%return)

You can now use it in all perf tools, such as:

perf record -e probe:do_open_1 -aR sleep 1

naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED]
c0000000004433d0 r do_open+0x0 [DISABLED]
c0000000004ba058 r do_open+0x8 [DISABLED]

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/496ef9f33c1ab16286ece9dd62aa672807...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-event.c | 12 +++++-------
tools/perf/util/probe-file.c | 7 +++++++
tools/perf/util/probe-file.h | 1 +
3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 28fb62c32678..c9bdc9ded0c3 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -757,7 +757,9 @@ post_process_kernel_probe_trace_events(struct probe_trace_event *tevs,
}

for (i = 0; i < ntevs; i++) {
- if (!tevs[i].point.address || tevs[i].point.retprobe)
+ if (!tevs[i].point.address)
+ continue;
+ if (tevs[i].point.retprobe && !kretprobe_offset_is_supported())
continue;
/* If we found a wrong one, mark it by NULL symbol */
if (kprobe_warn_out_range(tevs[i].point.symbol,
@@ -1528,11 +1530,6 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
return -EINVAL;
}

- if (pp->retprobe && !pp->function) {
- semantic_error("Return probe requires an entry function.\n");
- return -EINVAL;
- }
-
if ((pp->offset || pp->line || pp->lazy_line) && pp->retprobe) {
semantic_error("Offset/Line/Lazy pattern can't be used with "
"return probe.\n");
@@ -2841,7 +2838,8 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
}

/* Note that the symbols in the kmodule are not relocated */
- if (!pev->uprobes && !pp->retprobe && !pev->target) {
+ if (!pev->uprobes && !pev->target &&
+ (!pp->retprobe || kretprobe_offset_is_supported())) {
reloc_sym = kernel_get_ref_reloc_sym();
if (!reloc_sym) {
pr_warning("Relocated base symbol is not found!\n");
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 8a219cd831b7..1542cd0d6799 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -879,6 +879,7 @@ int probe_cache__show_all_caches(struct strfilter *filter)

enum ftrace_readme {
FTRACE_README_PROBE_TYPE_X = 0,
+ FTRACE_README_KRETPROBE_OFFSET,
FTRACE_README_END,
};

@@ -889,6 +890,7 @@ static struct {
#define DEFINE_TYPE(idx, pat) \
[idx] = {.pattern = pat, .avail = false}
DEFINE_TYPE(FTRACE_README_PROBE_TYPE_X, "*type: * x8/16/32/64,*"),
+ DEFINE_TYPE(FTRACE_README_KRETPROBE_OFFSET, "*place (kretprobe): *"),
};

static bool scan_ftrace_readme(enum ftrace_readme type)
@@ -939,3 +941,8 @@ bool probe_type_is_available(enum probe_type type)

return true;
}
+
+bool kretprobe_offset_is_supported(void)
+{
+ return scan_ftrace_readme(FTRACE_README_KRETPROBE_OFFSET);
+}
diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
index a17a82eff8a0..dbf95a00864a 100644
--- a/tools/perf/util/probe-file.h
+++ b/tools/perf/util/probe-file.h
@@ -65,6 +65,7 @@ struct probe_cache_entry *probe_cache__find_by_name(struct probe_cache *pcache,
const char *group, const char *event);
int probe_cache__show_all_caches(struct strfilter *filter);
bool probe_type_is_available(enum probe_type type);
+bool kretprobe_offset_is_supported(void);
#else /* ! HAVE_LIBELF_SUPPORT */
static inline struct probe_cache *probe_cache__new(const char *tgt __maybe_unused)
{
--
2.9.3
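
The README check described in this patch can be sketched in user space as
follows. This is only an illustration: it scans an in-memory string rather
than the tracing README file, and it uses POSIX fnmatch() as a stand-in
for perf's strglobmatch(), which is an assumption about equivalent
behaviour for this simple pattern. The helper name is hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <string.h>

/* Pattern string as added to ftrace_readme_table[] by the patch. */
#define KRETPROBE_OFFSET_PATTERN "*place (kretprobe): *"

/*
 * Return true if any line of 'readme' matches the kretprobe pattern.
 * Lines longer than the local buffer are truncated before matching.
 */
static bool readme_has_kretprobe_offset(const char *readme)
{
	char line[256];

	while (*readme) {
		size_t n = strcspn(readme, "\n");
		size_t copy = n < sizeof(line) - 1 ? n : sizeof(line) - 1;

		memcpy(line, readme, copy);
		line[copy] = '\0';
		if (fnmatch(KRETPROBE_OFFSET_PATTERN, line, 0) == 0)
			return true;

		readme += n;
		if (*readme == '\n')
			readme++;
	}
	return false;
}
```

A caller would use the result to decide whether a return probe may be
written with an offset from a relocation symbol, as in the "_text+4469712"
form shown in the example output above, or must fall back to the plain
symbol form.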

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:09 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

Commit fc62d0207ae0 ("kprobes: Introduce weak variant of
kprobe_exceptions_notify()") used the __kprobes annotation to exclude
kprobe_exceptions_notify from being probed. NOKPROBE_SYMBOL() is a
better way to do this, because it allows the symbol to be discovered as
being blacklisted, so change over to using NOKPROBE_SYMBOL().

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/3f25bf400da5c222cd9b10eec6ded2d6b5...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
kernel/kprobes.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 448759d4a263..4780ec236035 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1740,11 +1740,12 @@ void unregister_kprobes(struct kprobe **kps, int num)
}
EXPORT_SYMBOL_GPL(unregister_kprobes);

-int __weak __kprobes kprobe_exceptions_notify(struct notifier_block *self,
- unsigned long val, void *data)
+int __weak kprobe_exceptions_notify(struct notifier_block *self,
+ unsigned long val, void *data)
{
return NOTIFY_DONE;
}
+NOKPROBE_SYMBOL(kprobe_exceptions_notify);

static struct notifier_block kprobe_exceptions_nb = {
.notifier_call = kprobe_exceptions_notify,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:09 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

Simplify and separate out the ftrace README scanning logic into a
separate helper. This is used subsequently to scan for all patterns of
interest and to cache the result.

Since we are only interested in availability of probe argument type x,
we will only scan for that.

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/6dc30edc747ba82a236593be6cf3a046fa...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-file.c | 70 +++++++++++++++++++++++---------------------
1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 1a62daceb028..8a219cd831b7 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -877,35 +877,31 @@ int probe_cache__show_all_caches(struct strfilter *filter)
return 0;
}

+enum ftrace_readme {
+ FTRACE_README_PROBE_TYPE_X = 0,
+ FTRACE_README_END,
+};
+
static struct {
const char *pattern;
- bool avail;
- bool checked;
-} probe_type_table[] = {
-#define DEFINE_TYPE(idx, pat, def_avail) \
- [idx] = {.pattern = pat, .avail = (def_avail)}
- DEFINE_TYPE(PROBE_TYPE_U, "* u8/16/32/64,*", true),
- DEFINE_TYPE(PROBE_TYPE_S, "* s8/16/32/64,*", true),
- DEFINE_TYPE(PROBE_TYPE_X, "* x8/16/32/64,*", false),
- DEFINE_TYPE(PROBE_TYPE_STRING, "* string,*", true),
- DEFINE_TYPE(PROBE_TYPE_BITFIELD,
- "* b<bit-width>@<bit-offset>/<container-size>", true),
+ bool avail;
+} ftrace_readme_table[] = {
+#define DEFINE_TYPE(idx, pat) \
+ [idx] = {.pattern = pat, .avail = false}
+ DEFINE_TYPE(FTRACE_README_PROBE_TYPE_X, "*type: * x8/16/32/64,*"),
};

-bool probe_type_is_available(enum probe_type type)
+static bool scan_ftrace_readme(enum ftrace_readme type)
{
+ int fd;
FILE *fp;
char *buf = NULL;
size_t len = 0;
- bool target_line = false;
- bool ret = probe_type_table[type].avail;
- int fd;
+ bool ret = false;
+ static bool scanned = false;

- if (type >= PROBE_TYPE_END)
- return false;
- /* We don't have to check the type which supported by default */
- if (ret || probe_type_table[type].checked)
- return ret;
+ if (scanned)
+ goto result;

fd = open_trace_file("README", false);
if (fd < 0)
@@ -917,21 +913,29 @@ bool probe_type_is_available(enum probe_type type)
return ret;
}

- while (getline(&buf, &len, fp) > 0 && !ret) {
- if (!target_line) {
- target_line = !!strstr(buf, " type: ");
- if (!target_line)
- continue;
- } else if (strstr(buf, "\t ") != buf)
- break;
- ret = strglobmatch(buf, probe_type_table[type].pattern);
- }
- /* Cache the result */
- probe_type_table[type].checked = true;
- probe_type_table[type].avail = ret;
+ while (getline(&buf, &len, fp) > 0)
+ for (enum ftrace_readme i = 0; i < FTRACE_README_END; i++)
+ if (!ftrace_readme_table[i].avail)
+ ftrace_readme_table[i].avail =
+ strglobmatch(buf, ftrace_readme_table[i].pattern);
+ scanned = true;

fclose(fp);
free(buf);

- return ret;
+result:
+ if (type >= FTRACE_README_END)
+ return false;
+
+ return ftrace_readme_table[type].avail;
+}
+
+bool probe_type_is_available(enum probe_type type)
+{
+ if (type >= PROBE_TYPE_END)
+ return false;
+ else if (type == PROBE_TYPE_X)
+ return scan_ftrace_readme(FTRACE_README_PROBE_TYPE_X);
+
+ return true;
}
--
2.9.3
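
The scan-once-and-cache idea behind scan_ftrace_readme() can be sketched
outside of perf as below. This is a simplified illustration, not the perf
implementation: the input is an array of lines instead of the README file,
fnmatch() stands in for strglobmatch(), and all names (feature_table,
feature_is_available, scan_count) are hypothetical. scan_count exists only
to make the caching observable.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

enum readme_feature {
	FEATURE_PROBE_TYPE_X = 0,
	FEATURE_END,
};

/* One glob pattern per feature; 'avail' is filled in by the first scan. */
static struct {
	const char *pattern;
	bool avail;
} feature_table[] = {
	[FEATURE_PROBE_TYPE_X] = {
		.pattern = "*type: * x8/16/32/64,*",
		.avail = false,
	},
};

static int scan_count;	/* number of full scans performed so far */

/*
 * On the first call, test every line against every pattern and remember
 * the results. Later calls only read the cached table.
 */
static bool feature_is_available(enum readme_feature f,
				 const char *const *lines, size_t nr_lines)
{
	static bool scanned;

	if (!scanned) {
		for (size_t l = 0; l < nr_lines; l++)
			for (int i = 0; i < FEATURE_END; i++)
				if (!feature_table[i].avail)
					feature_table[i].avail =
						fnmatch(feature_table[i].pattern,
							lines[l], 0) == 0;
		scanned = true;
		scan_count++;
	}

	if (f >= FEATURE_END)
		return false;

	return feature_table[f].avail;
}
```

Scanning for all patterns in a single pass means that adding a new feature
check only needs a new enum value and table entry, which is how the
kretprobe offset patch in this series extends the table.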

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:10 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

Update kprobe tracer documentation to also mention that
NOKPROBE_SYMBOL() and nokprobe_inline add symbols to the kprobes
blacklist.

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/d924e20de099579ace4286e610304f054c...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
Documentation/trace/kprobetrace.txt | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index 41ef9d8efe95..5ea85059db3b 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -8,8 +8,9 @@ Overview
--------
These events are similar to tracepoint based events. Instead of Tracepoint,
this is based on kprobes (kprobe and kretprobe). So it can probe wherever
-kprobes can probe (this means, all functions body except for __kprobes
-functions). Unlike the Tracepoint based event, this can be added and removed
+kprobes can probe (this means, all functions except those with
+__kprobes/nokprobe_inline annotation and those marked NOKPROBE_SYMBOL).
+Unlike the Tracepoint based event, this can be added and removed
dynamically, on the fly.

To enable this feature, build your kernel with CONFIG_KPROBE_EVENTS=y.
--
2.9.3

Arnaldo Carvalho de Melo

Mar 14, 2017, 3:00:11 PM
From: Hari Bathini <hbat...@linux.vnet.ibm.com>

Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. Also update
the perf-record documentation with the new option to record namespace
events.

Committer notes:

Combined it with a later patch to allow printing it via 'perf report -D'
and to be able to test the feature introduced in this patch. Also had to
move perf_ns__name() here, which was introduced in another later patch.

Also used PRIu64 and PRIx64 to fix the build in some environments wrt:

util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:

# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#
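
The PRIu64/PRIx64 build fix mentioned in the committer notes can be shown
with a small standalone sketch. The helper name format_ns_entry() is
hypothetical; the format string mirrors the per-namespace part of the one
used in perf_event__fprintf_namespaces().

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format one "idx/name: dev/0xino" entry. PRIu64 and PRIx64 expand to
 * the right conversion for uint64_t on the target, so this builds
 * without -Wformat errors whether uint64_t is 'unsigned long' or
 * 'unsigned long long'.
 */
static int format_ns_entry(char *buf, size_t len, unsigned int idx,
			   const char *name, uint64_t dev, uint64_t ino)
{
	return snprintf(buf, len, "%u/%s: %" PRIu64 "/%#" PRIx64,
			idx, name, dev, ino);
}
```

Called with idx 0, name "net", dev 3 and ino 0xf0000081, this produces
the same "0/net: 3/0xf0000081" text as the first entry in the
'perf report -D' output above.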

Signed-off-by: Hari Bathini <hbat...@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Aravinda Prasad <arav...@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Eric Biederman <ebie...@xmission.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sargun Dhillon <sar...@sargun.me>
Cc: Steven Rostedt <ros...@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.184...@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/uapi/linux/perf_event.h | 32 +++++++++++++++++-
tools/perf/Documentation/perf-record.txt | 3 ++
tools/perf/builtin-annotate.c | 1 +
tools/perf/builtin-diff.c | 1 +
tools/perf/builtin-inject.c | 13 ++++++++
tools/perf/builtin-kmem.c | 1 +
tools/perf/builtin-kvm.c | 2 ++
tools/perf/builtin-lock.c | 1 +
tools/perf/builtin-mem.c | 1 +
tools/perf/builtin-record.c | 6 ++++
tools/perf/builtin-report.c | 1 +
tools/perf/builtin-sched.c | 1 +
tools/perf/builtin-script.c | 1 +
tools/perf/builtin-trace.c | 3 +-
tools/perf/perf.h | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/data-convert-bt.c | 1 +
tools/perf/util/event.c | 56 ++++++++++++++++++++++++++++++++
tools/perf/util/event.h | 13 ++++++++
tools/perf/util/evsel.c | 3 ++
tools/perf/util/machine.c | 34 +++++++++++++++++++
tools/perf/util/machine.h | 3 ++
tools/perf/util/namespaces.c | 36 ++++++++++++++++++++
tools/perf/util/namespaces.h | 26 +++++++++++++++
tools/perf/util/session.c | 7 ++++
tools/perf/util/thread.c | 44 +++++++++++++++++++++++--
tools/perf/util/thread.h | 6 ++++
tools/perf/util/tool.h | 2 ++
28 files changed, 296 insertions(+), 4 deletions(-)
create mode 100644 tools/perf/util/namespaces.c
create mode 100644 tools/perf/util/namespaces.h

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index c66a485a24ac..bec0aad0e15c 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -344,7 +344,8 @@ struct perf_event_attr {
use_clockid : 1, /* use @clockid for time fields */
context_switch : 1, /* context switch data */
write_backward : 1, /* Write ring buffer from end to beginning */
- __reserved_1 : 36;
+ namespaces : 1, /* include namespaces data */
+ __reserved_1 : 35;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -610,6 +611,23 @@ struct perf_event_header {
__u16 size;
};

+struct perf_ns_link_info {
+ __u64 dev;
+ __u64 ino;
+};
+
+enum {
+ NET_NS_INDEX = 0,
+ UTS_NS_INDEX = 1,
+ IPC_NS_INDEX = 2,
+ PID_NS_INDEX = 3,
+ USER_NS_INDEX = 4,
+ MNT_NS_INDEX = 5,
+ CGROUP_NS_INDEX = 6,
+
+ NR_NAMESPACES, /* number of available namespaces */
+};
+
enum perf_event_type {

/*
@@ -862,6 +880,18 @@ enum perf_event_type {
*/
PERF_RECORD_SWITCH_CPU_WIDE = 15,

+ /*
+ * struct {
+ * struct perf_event_header header;
+ * u32 pid;
+ * u32 tid;
+ * u64 nr_namespaces;
+ * { u64 dev, inode; } [nr_namespaces];
+ * struct sample_id sample_id;
+ * };
+ */
+ PERF_RECORD_NAMESPACES = 16,
+
PERF_RECORD_MAX, /* non-ABI */
};

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index b16003ec14a7..ea3789d05e5e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -347,6 +347,9 @@ Enable weightened sampling. An additional weight is recorded per sample and can
displayed with the weight and local_weight sort keys. This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.

+--namespaces::
+Record events of type PERF_RECORD_NAMESPACES.
+
--transaction::
Record transaction flags for transaction related events.

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 4f52d85f5ebc..e54b1f9fe1ee 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -393,6 +393,7 @@ int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
.comm = perf_event__process_comm,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 1b96a3122228..5e4803158672 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -364,6 +364,7 @@ static struct perf_tool tool = {
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.lost = perf_event__process_lost,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
.ordering_requires_timestamps = true,
};
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index b9bc7e39833a..8d1d13b9bab6 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -333,6 +333,18 @@ static int perf_event__repipe_comm(struct perf_tool *tool,
return err;
}

+static int perf_event__repipe_namespaces(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ int err = perf_event__process_namespaces(tool, event, sample, machine);
+
+ perf_event__repipe(tool, event, sample, machine);
+
+ return err;
+}
+
static int perf_event__repipe_exit(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -660,6 +672,7 @@ static int __cmd_inject(struct perf_inject *inject)
session->itrace_synth_opts = &inject->itrace_synth_opts;
inject->itrace_synth_opts.inject = true;
inject->tool.comm = perf_event__repipe_comm;
+ inject->tool.namespaces = perf_event__repipe_namespaces;
inject->tool.exit = perf_event__repipe_exit;
inject->tool.id_index = perf_event__repipe_id_index;
inject->tool.auxtrace_info = perf_event__process_auxtrace_info;
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 6da8d083e4e5..d509e74bc6e8 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -964,6 +964,7 @@ static struct perf_tool perf_kmem = {
.comm = perf_event__process_comm,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
};

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 08fa88f62a24..18e6c38864bc 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1044,6 +1044,7 @@ static int read_events(struct perf_kvm_stat *kvm)
struct perf_tool eops = {
.sample = process_sample_event,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
};
struct perf_data_file file = {
@@ -1348,6 +1349,7 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
kvm->tool.exit = perf_event__process_exit;
kvm->tool.fork = perf_event__process_fork;
kvm->tool.lost = process_lost_event;
+ kvm->tool.namespaces = perf_event__process_namespaces;
kvm->tool.ordered_events = true;
perf_tool__fill_defaults(&kvm->tool);

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index ce3bfb48b26f..d750ccaa978f 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -858,6 +858,7 @@ static int __cmd_report(bool display_info)
struct perf_tool eops = {
.sample = process_sample_event,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
};
struct perf_data_file file = {
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 6114e07ca613..030a6cfdda59 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -342,6 +342,7 @@ int cmd_mem(int argc, const char **argv, const char *prefix __maybe_unused)
.lost = perf_event__process_lost,
.fork = perf_event__process_fork,
.build_id = perf_event__process_build_id,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
},
.input_name = "perf.data",
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index bc84a375295d..99562c7242b6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -876,6 +876,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGTERM, sig_handler);
signal(SIGSEGV, sigsegv_handler);

+ if (rec->opts.record_namespaces)
+ tool->namespace_events = true;
+
if (rec->opts.auxtrace_snapshot_mode || rec->switch_output.enabled) {
signal(SIGUSR2, snapshot_sig_handler);
if (rec->opts.auxtrace_snapshot_mode)
@@ -1497,6 +1500,7 @@ static struct record record = {
.fork = perf_event__process_fork,
.exit = perf_event__process_exit,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.ordered_events = true,
@@ -1611,6 +1615,8 @@ static struct option __record_options[] = {
"opts", "AUX area tracing Snapshot Mode", ""),
OPT_UINTEGER(0, "proc-map-timeout", &record.opts.proc_map_timeout,
"per thread proc mmap processing timeout in ms"),
+ OPT_BOOLEAN(0, "namespaces", &record.opts.record_namespaces,
+ "Record namespaces events"),
OPT_BOOLEAN(0, "switch-events", &record.opts.record_switch_events,
"Record context switch events"),
OPT_BOOLEAN_FLAG(0, "all-kernel", &record.opts.all_kernel,
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index f03a5eac2a62..5ab8117c3bfd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -700,6 +700,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.lost = perf_event__process_lost,
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index b94cf0de715a..16170e9b47e6 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3272,6 +3272,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.tool = {
.sample = perf_sched__process_tracepoint_sample,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.lost = perf_event__process_lost,
.fork = perf_sched__process_fork_event,
.ordered_events = true,
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c0783b4f7b6c..f1ce806a1f31 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2097,6 +2097,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
+ .namespaces = perf_event__process_namespaces,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.attr = process_attr,
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 256f1fac6f7e..912fedc5b42d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2415,8 +2415,9 @@ static int trace__replay(struct trace *trace)
trace->tool.exit = perf_event__process_exit;
trace->tool.fork = perf_event__process_fork;
trace->tool.attr = perf_event__process_attr;
- trace->tool.tracing_data = perf_event__process_tracing_data;
+ trace->tool.tracing_data = perf_event__process_tracing_data;
trace->tool.build_id = perf_event__process_build_id;
+ trace->tool.namespaces = perf_event__process_namespaces;

trace->tool.ordered_events = true;
trace->tool.ordering_requires_timestamps = true;
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1c27d947c2fe..806c216a1078 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -50,6 +50,7 @@ struct record_opts {
bool running_time;
bool full_auxtrace;
bool auxtrace_snapshot_mode;
+ bool record_namespaces;
bool record_switch_events;
bool all_kernel;
bool all_user;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 5da376bc1afc..2ea5ee179a3b 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -42,6 +42,7 @@ libperf-y += pstack.o
libperf-y += session.o
libperf-$(CONFIG_AUDIT) += syscalltbl.o
libperf-y += ordered-events.o
+libperf-y += namespaces.o
libperf-y += comm.o
libperf-y += thread.o
libperf-y += thread_map.o
diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 4e6cbc99f08e..89ece2445713 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1468,6 +1468,7 @@ int bt_convert__perf2ctf(const char *input, const char *path,
.lost = perf_event__process_lost,
.tracing_data = perf_event__process_tracing_data,
.build_id = perf_event__process_build_id,
+ .namespaces = perf_event__process_namespaces,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 4ea7ce72ed9c..fb52819023c7 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -31,6 +31,7 @@ static const char *perf_event__names[] = {
[PERF_RECORD_LOST_SAMPLES] = "LOST_SAMPLES",
[PERF_RECORD_SWITCH] = "SWITCH",
[PERF_RECORD_SWITCH_CPU_WIDE] = "SWITCH_CPU_WIDE",
+ [PERF_RECORD_NAMESPACES] = "NAMESPACES",
[PERF_RECORD_HEADER_ATTR] = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA] = "TRACING_DATA",
@@ -49,6 +50,16 @@ static const char *perf_event__names[] = {
[PERF_RECORD_TIME_CONV] = "TIME_CONV",
};

+static const char *perf_ns__names[] = {
+ [NET_NS_INDEX] = "net",
+ [UTS_NS_INDEX] = "uts",
+ [IPC_NS_INDEX] = "ipc",
+ [PID_NS_INDEX] = "pid",
+ [USER_NS_INDEX] = "user",
+ [MNT_NS_INDEX] = "mnt",
+ [CGROUP_NS_INDEX] = "cgroup",
+};
+
const char *perf_event__name(unsigned int id)
{
if (id >= ARRAY_SIZE(perf_event__names))
@@ -58,6 +69,13 @@ const char *perf_event__name(unsigned int id)
return perf_event__names[id];
}

+static const char *perf_ns__name(unsigned int id)
+{
+ if (id >= ARRAY_SIZE(perf_ns__names))
+ return "UNKNOWN";
+ return perf_ns__names[id];
+}
+
static int perf_tool__process_synth_event(struct perf_tool *tool,
union perf_event *event,
struct machine *machine,
@@ -1008,6 +1026,33 @@ size_t perf_event__fprintf_comm(union perf_event *event, FILE *fp)
return fprintf(fp, "%s: %s:%d/%d\n", s, event->comm.comm, event->comm.pid, event->comm.tid);
}

+size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp)
+{
+ size_t ret = 0;
+ struct perf_ns_link_info *ns_link_info;
+ u32 nr_namespaces, idx;
+
+ ns_link_info = event->namespaces.link_info;
+ nr_namespaces = event->namespaces.nr_namespaces;
+
+ ret += fprintf(fp, " %d/%d - nr_namespaces: %u\n\t\t[",
+ event->namespaces.pid,
+ event->namespaces.tid,
+ nr_namespaces);
+
+ for (idx = 0; idx < nr_namespaces; idx++) {
+ if (idx && (idx % 4 == 0))
+ ret += fprintf(fp, "\n\t\t ");
+
+ ret += fprintf(fp, "%u/%s: %" PRIu64 "/%#" PRIx64 "%s", idx,
+ perf_ns__name(idx), (u64)ns_link_info[idx].dev,
+ (u64)ns_link_info[idx].ino,
+ ((idx + 1) != nr_namespaces) ? ", " : "]\n");
+ }
+
+ return ret;
+}
+
int perf_event__process_comm(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
@@ -1016,6 +1061,14 @@ int perf_event__process_comm(struct perf_tool *tool __maybe_unused,
return machine__process_comm_event(machine, event, sample);
}

+int perf_event__process_namespaces(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ return machine__process_namespaces_event(machine, event, sample);
+}
+
int perf_event__process_lost(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
@@ -1196,6 +1249,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
case PERF_RECORD_MMAP:
ret += perf_event__fprintf_mmap(event, fp);
break;
+ case PERF_RECORD_NAMESPACES:
+ ret += perf_event__fprintf_namespaces(event, fp);
+ break;
case PERF_RECORD_MMAP2:
ret += perf_event__fprintf_mmap2(event, fp);
break;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c735c53a26f8..b39ff795b9a9 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -39,6 +39,13 @@ struct comm_event {
char comm[16];
};

+struct namespaces_event {
+ struct perf_event_header header;
+ u32 pid, tid;
+ u64 nr_namespaces;
+ struct perf_ns_link_info link_info[];
+};
+
struct fork_event {
struct perf_event_header header;
u32 pid, ppid;
@@ -485,6 +492,7 @@ union perf_event {
struct mmap_event mmap;
struct mmap2_event mmap2;
struct comm_event comm;
+ struct namespaces_event namespaces;
struct fork_event fork;
struct lost_event lost;
struct lost_samples_event lost_samples;
@@ -587,6 +595,10 @@ int perf_event__process_switch(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine);
+int perf_event__process_namespaces(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine);
int perf_event__process_mmap(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -653,6 +665,7 @@ size_t perf_event__fprintf_itrace_start(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_cpu_map(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
size_t perf_event__fprintf(union perf_event *event, FILE *fp);

u64 kallsyms__get_function_start(const char *kallsyms_filename,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ac59710b79e0..175dc2305aa8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -932,6 +932,9 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm = track;

+ if (opts->record_namespaces)
+ attr->namespaces = track;
+
if (opts->record_switch_events)
attr->context_switch = track;

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b9974fe41bc1..dfc600446586 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -13,6 +13,7 @@
#include <symbol/kallsyms.h>
#include "unwind.h"
#include "linux/hash.h"
+#include "asm/bug.h"

static void __machine__remove_thread(struct machine *machine, struct thread *th, bool lock);

@@ -501,6 +502,37 @@ int machine__process_comm_event(struct machine *machine, union perf_event *event
return err;
}

+int machine__process_namespaces_event(struct machine *machine __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused)
+{
+ struct thread *thread = machine__findnew_thread(machine,
+ event->namespaces.pid,
+ event->namespaces.tid);
+ int err = 0;
+
+ WARN_ONCE(event->namespaces.nr_namespaces > NR_NAMESPACES,
+ "\nWARNING: kernel seems to support more namespaces than perf"
+ " tool.\nTry updating the perf tool..\n\n");
+
+ WARN_ONCE(event->namespaces.nr_namespaces < NR_NAMESPACES,
+ "\nWARNING: perf tool seems to support more namespaces than"
+ " the kernel.\nTry updating the kernel..\n\n");
+
+ if (dump_trace)
+ perf_event__fprintf_namespaces(event, stdout);
+
+ if (thread == NULL ||
+ thread__set_namespaces(thread, sample->time, &event->namespaces)) {
+ dump_printf("problem processing PERF_RECORD_NAMESPACES, skipping event.\n");
+ err = -1;
+ }
+
+ thread__put(thread);
+
+ return err;
+}
+
int machine__process_lost_event(struct machine *machine __maybe_unused,
union perf_event *event, struct perf_sample *sample __maybe_unused)
{
@@ -1538,6 +1570,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
ret = machine__process_comm_event(machine, event, sample); break;
case PERF_RECORD_MMAP:
ret = machine__process_mmap_event(machine, event, sample); break;
+ case PERF_RECORD_NAMESPACES:
+ ret = machine__process_namespaces_event(machine, event, sample); break;
case PERF_RECORD_MMAP2:
ret = machine__process_mmap2_event(machine, event, sample); break;
case PERF_RECORD_FORK:
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index a28305029711..3cdb1340f917 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -97,6 +97,9 @@ int machine__process_itrace_start_event(struct machine *machine,
union perf_event *event);
int machine__process_switch_event(struct machine *machine,
union perf_event *event);
+int machine__process_namespaces_event(struct machine *machine,
+ union perf_event *event,
+ struct perf_sample *sample);
int machine__process_mmap_event(struct machine *machine, union perf_event *event,
struct perf_sample *sample);
int machine__process_mmap2_event(struct machine *machine, union perf_event *event,
diff --git a/tools/perf/util/namespaces.c b/tools/perf/util/namespaces.c
new file mode 100644
index 000000000000..2de8da64d90c
--- /dev/null
+++ b/tools/perf/util/namespaces.c
@@ -0,0 +1,36 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2017 Hari Bathini, IBM Corporation
+ */
+
+#include "namespaces.h"
+#include "util.h"
+#include "event.h"
+#include <stdlib.h>
+#include <stdio.h>
+
+struct namespaces *namespaces__new(struct namespaces_event *event)
+{
+ struct namespaces *namespaces;
+ u64 link_info_size = ((event ? event->nr_namespaces : NR_NAMESPACES) *
+ sizeof(struct perf_ns_link_info));
+
+ namespaces = zalloc(sizeof(struct namespaces) + link_info_size);
+ if (!namespaces)
+ return NULL;
+
+ namespaces->end_time = -1;
+
+ if (event)
+ memcpy(namespaces->link_info, event->link_info, link_info_size);
+
+ return namespaces;
+}
+
+void namespaces__free(struct namespaces *namespaces)
+{
+ free(namespaces);
+}
diff --git a/tools/perf/util/namespaces.h b/tools/perf/util/namespaces.h
new file mode 100644
index 000000000000..468f1e9a1484
--- /dev/null
+++ b/tools/perf/util/namespaces.h
@@ -0,0 +1,26 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2017 Hari Bathini, IBM Corporation
+ */
+
+#ifndef __PERF_NAMESPACES_H
+#define __PERF_NAMESPACES_H
+
+#include "../perf.h"
+#include <linux/list.h>
+
+struct namespaces_event;
+
+struct namespaces {
+ struct list_head list;
+ u64 end_time;
+ struct perf_ns_link_info link_info[];
+};
+
+struct namespaces *namespaces__new(struct namespaces_event *event);
+void namespaces__free(struct namespaces *namespaces);
+
+#endif /* __PERF_NAMESPACES_H */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1dd617d116b5..ae42e742d461 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1239,6 +1239,8 @@ static int machines__deliver_event(struct machines *machines,
return tool->mmap2(tool, event, sample, machine);
case PERF_RECORD_COMM:
return tool->comm(tool, event, sample, machine);
+ case PERF_RECORD_NAMESPACES:
+ return tool->namespaces(tool, event, sample, machine);
case PERF_RECORD_FORK:
return tool->fork(tool, event, sample, machine);
case PERF_RECORD_EXIT:
@@ -1494,6 +1496,11 @@ int perf_session__register_idle_thread(struct perf_session *session)
err = -1;
}

+ if (thread == NULL || thread__set_namespaces(thread, 0, NULL)) {
+ pr_err("problem inserting idle task.\n");
+ err = -1;
+ }
+
/* machine__findnew_thread() got the thread, so put it */
thread__put(thread);
return err;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 74e79d26b421..dcdb87a5d0a1 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -7,6 +7,7 @@
#include "thread-stack.h"
#include "util.h"
#include "debug.h"
+#include "namespaces.h"
#include "comm.h"
#include "unwind.h"

@@ -40,6 +41,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
thread->tid = tid;
thread->ppid = -1;
thread->cpu = -1;
+ INIT_LIST_HEAD(&thread->namespaces_list);
INIT_LIST_HEAD(&thread->comm_list);

comm_str = malloc(32);
@@ -66,7 +68,8 @@ struct thread *thread__new(pid_t pid, pid_t tid)

void thread__delete(struct thread *thread)
{
- struct comm *comm, *tmp;
+ struct namespaces *namespaces, *tmp_namespaces;
+ struct comm *comm, *tmp_comm;

BUG_ON(!RB_EMPTY_NODE(&thread->rb_node));

@@ -76,7 +79,12 @@ void thread__delete(struct thread *thread)
map_groups__put(thread->mg);
thread->mg = NULL;
}
- list_for_each_entry_safe(comm, tmp, &thread->comm_list, list) {
+ list_for_each_entry_safe(namespaces, tmp_namespaces,
+ &thread->namespaces_list, list) {
+ list_del(&namespaces->list);
+ namespaces__free(namespaces);
+ }
+ list_for_each_entry_safe(comm, tmp_comm, &thread->comm_list, list) {
list_del(&comm->list);
comm__free(comm);
}
@@ -104,6 +112,38 @@ void thread__put(struct thread *thread)
}
}

+struct namespaces *thread__namespaces(const struct thread *thread)
+{
+ if (list_empty(&thread->namespaces_list))
+ return NULL;
+
+ return list_first_entry(&thread->namespaces_list, struct namespaces, list);
+}
+
+int thread__set_namespaces(struct thread *thread, u64 timestamp,
+ struct namespaces_event *event)
+{
+ struct namespaces *new, *curr = thread__namespaces(thread);
+
+ new = namespaces__new(event);
+ if (!new)
+ return -ENOMEM;
+
+ list_add(&new->list, &thread->namespaces_list);
+
+ if (timestamp && curr) {
+ /*
+ * setns syscall must have changed few or all the namespaces
+ * of this thread. Update end time for the namespaces
+ * previously used.
+ */
+ curr = list_next_entry(new, list);
+ curr->end_time = timestamp;
+ }
+
+ return 0;
+}
+
struct comm *thread__comm(const struct thread *thread)
{
if (list_empty(&thread->comm_list))
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index e57188546465..4eb849e9098f 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -28,6 +28,7 @@ struct thread {
bool comm_set;
int comm_len;
bool dead; /* if set thread has exited */
+ struct list_head namespaces_list;
struct list_head comm_list;
u64 db_id;

@@ -40,6 +41,7 @@ struct thread {
};

struct machine;
+struct namespaces;
struct comm;

struct thread *thread__new(pid_t pid, pid_t tid);
@@ -62,6 +64,10 @@ static inline void thread__exited(struct thread *thread)
thread->dead = true;
}

+struct namespaces *thread__namespaces(const struct thread *thread);
+int thread__set_namespaces(struct thread *thread, u64 timestamp,
+ struct namespaces_event *event);
+
int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
bool exec);
static inline int thread__set_comm(struct thread *thread, const char *comm,
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index ac2590a3de2d..829471a1c6d7 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -40,6 +40,7 @@ struct perf_tool {
event_op mmap,
mmap2,
comm,
+ namespaces,
fork,
exit,
lost,
@@ -66,6 +67,7 @@ struct perf_tool {
event_op3 auxtrace;
bool ordered_events;
bool ordering_requires_timestamps;
+ bool namespace_events;
};

#endif /* __PERF_TOOL_H */
--
2.9.3
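
A minimal userspace sketch of the bookkeeping that thread__set_namespaces() performs in the patch above, assuming a plain singly linked list in place of the list_head helpers. The names ns_entry, ns_history, ns_history_push and ns_history_free are hypothetical and exist only for illustration; they are not part of perf.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct namespaces; newest entry comes first. */
struct ns_entry {
	struct ns_entry *next;
	uint64_t end_time;	/* UINT64_MAX while the entry is still current */
	uint64_t ino;		/* one inode number standing in for link_info[] */
};

struct ns_history {
	struct ns_entry *head;
};

/*
 * Record a new set of namespaces seen at 'timestamp'. As in the patch, the
 * previous entry is closed only when a non-zero timestamp is given.
 * Returns 0 on success, -1 on allocation failure.
 */
static int ns_history_push(struct ns_history *h, uint64_t timestamp,
			   uint64_t ino)
{
	struct ns_entry *prev = h->head;
	struct ns_entry *new = calloc(1, sizeof(*new));

	if (!new)
		return -1;

	new->end_time = UINT64_MAX;	/* same bit pattern as (u64)-1 */
	new->ino = ino;
	new->next = prev;
	h->head = new;

	if (timestamp && prev)
		prev->end_time = timestamp;

	return 0;
}

/* Release every entry, mirroring the loop added to thread__delete(). */
static void ns_history_free(struct ns_history *h)
{
	while (h->head) {
		struct ns_entry *next = h->head->next;

		free(h->head);
		h->head = next;
	}
}
```

Keeping the newest entry at the head means the current namespaces are always the first element, which is what thread__namespaces() returns in the patch.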

Ingo Molnar

Mar 15, 2017, 2:40:08 PM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

Mar 16, 2017, 12:20:05 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit ffa86c2f1a8862cf58c873f6f14d4b2c3250fb48:

Merge tag 'perf-core-for-mingo-4.12-20170314' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-15 19:27:27 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170316

for you to fetch changes up to 61f35d750683b21e9e3836e309195c79c1daed74:

uprobes: Default UPROBES_EVENTS to Y (2017-03-16 12:42:02 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add 'brstackinsn' field in 'perf script' to reuse the x86 instruction
decoder used in the Intel PT code to study hot paths to samples (Andi Kleen)

Kernel:

- Default UPROBES_EVENTS to Y (Alexei Starovoitov)

- Fix check for kretprobe offset within function entry (Naveen N. Rao)

Infrastructure:

- Introduce util func is_sdt_event() (Ravi Bangoria)

- Make perf_event__synthesize_mmap_events() scale on older kernels where
reading /proc/pid/maps is way slower than reading /proc/pid/task/pid/maps (Stephane Eranian)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
perf script: Add 'brstackinsn' for branch stacks

Arnaldo Carvalho de Melo (2):
tools headers: Sync {tools/,}arch/x86/include/asm/cpufeatures.h
uprobes: Default UPROBES_EVENTS to Y

Naveen N. Rao (1):
trace/kprobes: Fix check for kretprobe offset within function entry

Ravi Bangoria (1):
perf probe: Introduce util func is_sdt_event()

Stephane Eranian (1):
perf tools: Make perf_event__synthesize_mmap_events() scale

include/linux/kprobes.h | 1 +
kernel/kprobes.c | 40 ++--
kernel/trace/Kconfig | 2 +-
kernel/trace/trace_kprobe.c | 2 +-
tools/arch/x86/include/asm/cpufeatures.h | 5 +-
tools/perf/Documentation/perf-script.txt | 13 +-
tools/perf/builtin-script.c | 264 ++++++++++++++++++++-
tools/perf/util/Build | 1 +
tools/perf/util/dump-insn.c | 14 ++
tools/perf/util/dump-insn.h | 22 ++
tools/perf/util/event.c | 4 +-
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 24 ++
tools/perf/util/parse-events.h | 20 ++
tools/perf/util/probe-event.c | 9 +-
14 files changed, 381 insertions(+), 40 deletions(-)
create mode 100644 tools/perf/util/dump-insn.c
create mode 100644 tools/perf/util/dump-insn.h
make_no_libelf_O: make NO_LIBELF=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_pure_O: make
make_perf_o_O: make perf.o
make_no_newt_O: make NO_NEWT=1
make_help_O: make help
make_no_libaudit_O: make NO_LIBAUDIT=1
make_static_O: make LDFLAGS=-static
make_doc_O: make doc
make_clean_all_O: make clean all
make_debug_O: make DEBUG=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_demangle_O: make NO_DEMANGLE=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_install_bin_O: make install-bin
make_util_map_o_O: make util/map.o
make_tags_O: make tags
make_no_libbpf_O: make NO_LIBBPF=1
make_install_O: make install
make_no_gtk2_O: make NO_GTK2=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'

Arnaldo Carvalho de Melo

Mar 16, 2017, 12:20:06 PM
From: Stephane Eranian <era...@google.com>

This patch significantly improves the execution time of
perf_event__synthesize_mmap_events() when running perf record on systems
where processes have lots of threads.

It just happens that the /proc/pid/maps support uses an O(N^2) algorithm to
generate each map line in the maps file. If you have 1000 threads, then you
necessarily have 1000 stacks. For each vma, you need to check if it
corresponds to a thread's stack. With a large number of threads, this can take
a very long time. I have seen latencies well over 10 minutes.

As of today, perf does not use the fact that a mapping is a stack, therefore we
can work around the issue by using /proc/pid/task/pid/maps. This entry does
not try to map a vma to a stack and is thus much faster with no loss of
functionality.

The proc-map-timeout logic is kept in case users still want some upper limit.
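
A minimal sketch of how the per-task maps path is put together, assuming a standalone helper outside of perf. build_task_maps_path() is a hypothetical name; the format string is the one used by the patch.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format the maps path of the main thread of 'pid'
 * under 'root_dir' (pass "" for the host). Returns 0 on success, or -1 if
 * the path did not fit in 'buf'.
 */
static int build_task_maps_path(char *buf, size_t len, const char *root_dir,
				pid_t pid)
{
	int n = snprintf(buf, len, "%s/proc/%d/task/%d/maps",
			 root_dir, (int)pid, (int)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The pid appears twice because the main thread's tid equals the pid, so the entry lists the same mappings as /proc/pid/maps.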

In V2, we fix the file path from /proc/pid/tasks/pid/maps to the actual
/proc/pid/task/pid/maps (tasks -> task). Thanks Arnaldo for catching this.

Committer note:

This problem seems to have been eliminated in the kernel since commit
b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").

Signed-off-by: Stephane Eranian <era...@google.com>
Acked-by: Jiri Olsa <jo...@redhat.com>
Cc: Andy Lutomirski <lu...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/2017031513...@redhat.com
Link: http://lkml.kernel.org/r/1489598233-25586-1-g...@google.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index d082cb70445d..33fc2e9c0b0c 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -325,8 +325,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
if (machine__is_default_guest(machine))
return 0;

- snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
- machine->root_dir, pid);
+ snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
+ machine->root_dir, pid, pid);

fp = fopen(filename, "r");
if (fp == NULL) {
--
2.9.3

Arnaldo Carvalho de Melo

Mar 16, 2017, 12:20:06 PM
From: "Naveen N. Rao" <naveen...@linux.vnet.ibm.com>

perf specifies an offset from _text and since this offset is fed
directly into the arch-specific helper, kprobes tracer rejects
installation of kretprobes through perf. Fix this by looking up the
actual offset from a function for the specified sym+offset.

Refactor and reuse existing routines to limit code duplication -- we
repurpose kprobe_addr() for determining final kprobe address and we
split out the function entry offset determination into a separate
generic helper.
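
The lookup described above can be sketched in userspace as follows, assuming a toy symbol table in place of kallsyms. Every name here (toy_sym, toy_syms, toy_lookup_name, toy_lookup_addr, toy_offset_within_entry) and every address is made up for illustration; only the rule that offset 0 counts as function entry comes from the generic arch_function_offset_within_entry().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy symbol table; all addresses and sizes are invented. */
struct toy_sym {
	const char *name;
	unsigned long addr;
	unsigned long size;
};

static const struct toy_sym toy_syms[] = {
	{ "_text",   0x1000, 0x30 },
	{ "do_open", 0x1030, 0x40 },
};

#define TOY_NR_SYMS (sizeof(toy_syms) / sizeof(toy_syms[0]))

static const struct toy_sym *toy_lookup_name(const char *name)
{
	for (size_t i = 0; i < TOY_NR_SYMS; i++)
		if (!strcmp(toy_syms[i].name, name))
			return &toy_syms[i];
	return NULL;
}

static const struct toy_sym *toy_lookup_addr(unsigned long addr)
{
	for (size_t i = 0; i < TOY_NR_SYMS; i++)
		if (addr >= toy_syms[i].addr &&
		    addr < toy_syms[i].addr + toy_syms[i].size)
			return &toy_syms[i];
	return NULL;
}

/*
 * Resolve sym+offset to an address, find the function containing that
 * address, and accept it only if it sits at offset 0 of that function.
 */
static bool toy_offset_within_entry(const char *sym, unsigned long offset)
{
	const struct toy_sym *base = toy_lookup_name(sym);
	const struct toy_sym *func;

	if (!base)
		return false;

	func = toy_lookup_addr(base->addr + offset);
	if (!func)
		return false;

	return base->addr + offset == func->addr;
}
```

With this table, "_text" plus 0x30 lands exactly on do_open and is accepted, whereas checking the raw offset 0x30 against zero would reject it, which is the failure the patch addresses.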

Before patch:

naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7ff]
Probe point found: do_open+0
Matched function: do_open [35d76dc]
found inline addr: 0xc0000000004ba9c4
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//README write=0
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open _text+4469776
Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)
naveen@ubuntu:~/linux/tools/perf$ dmesg | tail
<snip>
[ 33.568656] Given offset is not valid for return probe.

After patch:

naveen@ubuntu:~/linux/tools/perf$ sudo ./perf probe -v do_open%return
probe-definition(0): do_open%return
symbol:do_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /boot/vmlinux for symbols
Open Debuginfo file: /boot/vmlinux
Try to find probe point from debuginfo.
Matched function: do_open [2d0c7d6]
Probe point found: do_open+0
Matched function: do_open [35d76b3]
found inline addr: 0xc0000000004ba9e4
Failed to find "do_open%return",
because do_open is an inlined function and has no return point.
An error occurred in debuginfo analysis (-22).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//README write=0
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: r:probe/do_open _text+4469808
Writing event: r:probe/do_open_1 _text+4956344
Added new events:
probe:do_open (on do_open%return)
probe:do_open_1 (on do_open%return)

You can now use it in all perf tools, such as:

perf record -e probe:do_open_1 -aR sleep 1

naveen@ubuntu:~/linux/tools/perf$ sudo cat /sys/kernel/debug/kprobes/list
c000000000041370 k kretprobe_trampoline+0x0 [OPTIMIZED]
c0000000004ba0b8 r do_open+0x8 [DISABLED]
c000000000443430 r do_open+0x0 [DISABLED]

Signed-off-by: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Ananth N Mavinakayanahalli <ana...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: linuxp...@lists.ozlabs.org
Link: http://lkml.kernel.org/r/d8cd1ef420ec22e3643ac332fdabcffc77...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 40 ++++++++++++++++++++++++++--------------
kernel/trace/trace_kprobe.c | 2 +-
3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 177bdf6c6aeb..47e4da5b4fa2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -268,6 +268,7 @@ extern void show_registers(struct pt_regs *regs);
extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern bool arch_function_offset_within_entry(unsigned long offset);
+extern bool function_offset_within_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);

extern bool within_kprobe_blacklist(unsigned long addr);

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4780ec236035..d733479a10ee 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1391,21 +1391,19 @@ bool within_kprobe_blacklist(unsigned long addr)
* This returns encoded errors if it fails to look up symbol or invalid
* combination of parameters.
*/
-static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
+static kprobe_opcode_t *_kprobe_addr(kprobe_opcode_t *addr,
+ const char *symbol_name, unsigned int offset)
{
- kprobe_opcode_t *addr = p->addr;
-
- if ((p->symbol_name && p->addr) ||
- (!p->symbol_name && !p->addr))
+ if ((symbol_name && addr) || (!symbol_name && !addr))
goto invalid;

- if (p->symbol_name) {
- kprobe_lookup_name(p->symbol_name, addr);
+ if (symbol_name) {
+ kprobe_lookup_name(symbol_name, addr);
if (!addr)
return ERR_PTR(-ENOENT);
}

- addr = (kprobe_opcode_t *)(((char *)addr) + p->offset);
+ addr = (kprobe_opcode_t *)(((char *)addr) + offset);
if (addr)
return addr;

@@ -1413,6 +1411,11 @@ static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
return ERR_PTR(-EINVAL);
}

+static kprobe_opcode_t *kprobe_addr(struct kprobe *p)
+{
+ return _kprobe_addr(p->addr, p->symbol_name, p->offset);
+}
+
/* Check passed kprobe is valid and return kprobe in kprobe_table. */
static struct kprobe *__get_valid_kprobe(struct kprobe *p)
{
@@ -1881,19 +1884,28 @@ bool __weak arch_function_offset_within_entry(unsigned long offset)
return !offset;
}

+bool function_offset_within_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+{
+ kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
+
+ if (IS_ERR(kp_addr))
+ return false;
+
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
+ !arch_function_offset_within_entry(offset))
+ return false;
+
+ return true;
+}
+
int register_kretprobe(struct kretprobe *rp)
{
int ret = 0;
struct kretprobe_instance *inst;
int i;
void *addr;
- unsigned long offset;
-
- addr = kprobe_addr(&rp->kp);
- if (!kallsyms_lookup_size_offset((unsigned long)addr, NULL, &offset))
- return -EINVAL;

- if (!arch_function_offset_within_entry(offset))
+ if (!function_offset_within_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
return -EINVAL;

if (kretprobe_blacklist_size) {
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 12fb540da0e5..013f4e7146d4 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -697,7 +697,7 @@ static int create_trace_kprobe(int argc, char **argv)
return ret;
}
if (offset && is_return &&
- !arch_function_offset_within_entry(offset)) {
+ !function_offset_within_entry(NULL, symbol, offset)) {
pr_info("Given offset is not valid for return probe.\n");
return -EINVAL;
}
--
2.9.3

Arnaldo Carvalho de Melo

Mar 16, 2017, 12:20:07 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

Factor out the SDT event name checking routine as is_sdt_event().
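
A standalone sketch of the same check, assuming libelf support is enabled and using the hypothetical name looks_like_sdt_event() so it does not clash with the real helper:

```c
#include <stdbool.h>
#include <string.h>

/*
 * A probe point is treated as SDT/cached if it starts with '%', or if it
 * starts with "sdt_" and contains a ':' but no '='.
 */
static bool looks_like_sdt_event(const char *str)
{
	return str[0] == '%' ||
	       (!strncmp(str, "sdt_", 4) &&
		strchr(str, ':') != NULL && strchr(str, '=') == NULL);
}
```

For instance, "sdt_libc:setjmp" and "%sdt_libc:setjmp" both match, while "do_open%return" does not because it neither starts with '%' nor with "sdt_".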

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Ananth N Mavinakayanahalli <ana...@in.ibm.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Brendan Gregg <brendan...@gmail.com>
Cc: He Kuang <hek...@huawei.com>
Cc: Hemant Kumar <hem...@linux.vnet.ibm.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Mathieu Poirier <mathieu...@linaro.org>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sukadev Bhattiprolu <suk...@linux.vnet.ibm.com>
Cc: Taeung Song <treeze...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20170314150658.70...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/parse-events.h | 20 ++++++++++++++++++++
tools/perf/util/probe-event.c | 9 +--------
2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1af6a267c21b..8c72b0ff7fcb 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -8,6 +8,7 @@
#include <stdbool.h>
#include <linux/types.h>
#include <linux/perf_event.h>
+#include <string.h>

struct list_head;
struct perf_evsel;
@@ -196,4 +197,23 @@ int is_valid_tracepoint(const char *event_string);
int valid_event_mount(const char *eventfs);
char *parse_events_formats_error_string(char *additional_terms);

+#ifdef HAVE_LIBELF_SUPPORT
+/*
+ * If the probe point starts with '%',
+ * or starts with "sdt_" and has a ':' but no '=',
+ * then it should be a SDT/cached probe point.
+ */
+static inline bool is_sdt_event(char *str)
+{
+ return (str[0] == '%' ||
+ (!strncmp(str, "sdt_", 4) &&
+ !!strchr(str, ':') && !strchr(str, '=')));
+}
+#else
+static inline bool is_sdt_event(char *str __maybe_unused)
+{
+ return false;
+}
+#endif /* HAVE_LIBELF_SUPPORT */
+
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c9bdc9ded0c3..b19d17801beb 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1341,14 +1341,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (!arg)
return -EINVAL;

- /*
- * If the probe point starts with '%',
- * or starts with "sdt_" and has a ':' but no '=',
- * then it should be a SDT/cached probe point.
- */
- if (arg[0] == '%' ||
- (!strncmp(arg, "sdt_", 4) &&
- !!strchr(arg, ':') && !strchr(arg, '='))) {
+ if (is_sdt_event(arg)) {
pev->sdt = true;
if (arg[0] == '%')
arg++;
--
2.9.3

Arnaldo Carvalho de Melo

Mar 16, 2017, 12:20:08 PM
From: Andi Kleen <a...@linux.intel.com>

Implement printing instruction sequences as hex dump for branch stacks.

This relies on the x86 instruction decoder used by the PT decoder to
find the lengths of instructions to dump them individually.

This is good enough for pattern matching.

This allows studying hot paths for individual samples, together with
branch misprediction and cycle count / IPC information if available (on
Skylake systems).

% perf record -b ...
% perf script -F brstackinsn
...
read_hpet+67:
ffffffff9905b843 insn: 74 ea # PRED
ffffffff9905b82f insn: 85 c9
ffffffff9905b831 insn: 74 12
ffffffff9905b833 insn: f3 90
ffffffff9905b835 insn: 48 8b 0f
ffffffff9905b838 insn: 48 89 ca
ffffffff9905b83b insn: 48 c1 ea 20
ffffffff9905b83f insn: 39 f2
ffffffff9905b841 insn: 89 d0
ffffffff9905b843 insn: 74 ea # PRED

Only works when no special branch filters are specified.

Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.
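
A rough sketch of how straight-line blocks can be derived from a branch stack, assuming entries[0] is the most recent branch. The types toy_branch and toy_block and the function branches_to_blocks() are hypothetical and only illustrate the idea; this is not the code in builtin-script.c.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_branch {
	uint64_t from;	/* address of the branch instruction */
	uint64_t to;	/* address of the branch target */
};

struct toy_block {
	uint64_t start;	/* first instruction of the block */
	uint64_t end;	/* the branch instruction that ends the block */
};

/*
 * The code executed between two consecutive branches runs from the target
 * of the older branch up to the source of the newer one. Blocks are
 * written oldest first. Returns the number of blocks stored in 'out'.
 */
static size_t branches_to_blocks(const struct toy_branch *entries, size_t nr,
				 struct toy_block *out, size_t max_out)
{
	size_t n = 0;

	if (nr < 2)
		return 0;

	for (size_t i = nr - 1; i > 0 && n < max_out; i--) {
		out[n].start = entries[i].to;
		out[n].end = entries[i - 1].from;
		n++;
	}

	return n;
}
```

For the read_hpet loop shown above, two consecutive records of the branch at ffffffff9905b843 jumping back to ffffffff9905b82f yield one block spanning ffffffff9905b82f to ffffffff9905b843, which is the range whose instructions are dumped.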

The instruction dumper piggybacks on the existing infrastructure from
the Intel PT decoder.

An earlier iteration of this patch relied on a disassembler, but this
version only uses the existing instruction decoder.

Committer note:

Added hint about how to get suitable perf.data files for use with
'-F brstackinsn':

$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
$
$ perf script -F brstackinsn
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
$

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Michael Ellerman <m...@ellerman.id.au>
Link: http://lkml.kernel.org/r/201702232346...@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-script.txt | 13 +-
tools/perf/builtin-script.c | 264 ++++++++++++++++++++-
tools/perf/util/Build | 1 +
tools/perf/util/dump-insn.c | 14 ++
tools/perf/util/dump-insn.h | 22 ++
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 24 ++
6 files changed, 327 insertions(+), 11 deletions(-)
create mode 100644 tools/perf/util/dump-insn.c
create mode 100644 tools/perf/util/dump-insn.h

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 62c9b0c77a3a..cb0eda3925e6 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, brstack, brstacksym, flags, bpf-output,
+ srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@@ -189,15 +189,20 @@ OPTIONS
i.e., -F "" is not allowed.

The brstack output includes branch related information with raw addresses using the
- /v/v/v/v/ syntax in the following order:
+ /v/v/v/v/cycles syntax in the following order:
FROM: branch source instruction
TO : branch target instruction
M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported
X/- : X=branch inside a transactional region, -=not in transaction region or not supported
A/- : A=TSX abort entry, -=not aborted region or not supported
+ cycles

The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.

+ When brstackinsn is specified the full assembler sequences of branch sequences for each sample
+ is printed. This is the full execution path leading to the sample. This is only supported when the
+ sample was recorded with perf record -b or -j any.
+
-k::
--vmlinux=<file>::
vmlinux pathname
@@ -302,6 +307,10 @@ include::itrace.txt[]
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.

+--max-blocks::
+ Set the maximum number of program blocks to print with brstackasm for
+ each sample.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 66d62c98dff9..c98e16689b57 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -28,6 +28,7 @@
#include <linux/time64.h>
#include "asm/bug.h"
#include "util/mem-events.h"
+#include "util/dump-insn.h"

static char const *script_name;
static char const *generate_script_lang;
@@ -42,6 +43,7 @@ static bool nanosecs;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
+static int max_blocks;

unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;

@@ -69,6 +71,7 @@ enum perf_output_field {
PERF_OUTPUT_CALLINDENT = 1U << 20,
PERF_OUTPUT_INSN = 1U << 21,
PERF_OUTPUT_INSNLEN = 1U << 22,
+ PERF_OUTPUT_BRSTACKINSN = 1U << 23,
};

struct output_option {
@@ -98,6 +101,7 @@ struct output_option {
{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
{.str = "insn", .field = PERF_OUTPUT_INSN},
{.str = "insnlen", .field = PERF_OUTPUT_INSNLEN},
+ {.str = "brstackinsn", .field = PERF_OUTPUT_BRSTACKINSN},
};

/* default set to maintain compatibility with current format */
@@ -292,7 +296,13 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected. Hence, no address to lookup the source line number.\n");
return -EINVAL;
}
-
+ if (PRINT_FIELD(BRSTACKINSN) &&
+ !(perf_evlist__combined_branch_type(session->evlist) &
+ PERF_SAMPLE_BRANCH_ANY)) {
+ pr_err("Display of branch stack assembler requested, but non all-branch filter set\n"
+ "Hint: run 'perf record -b ...'\n");
+ return -EINVAL;
+ }
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID",
PERF_OUTPUT_TID|PERF_OUTPUT_PID))
@@ -546,6 +556,233 @@ static void print_sample_brstacksym(struct perf_sample *sample,
}
}

+#define MAXBB 16384UL
+
+static int grab_bb(u8 *buffer, u64 start, u64 end,
+ struct machine *machine, struct thread *thread,
+ bool *is64bit, u8 *cpumode, bool last)
+{
+ long offset, len;
+ struct addr_location al;
+ bool kernel;
+
+ if (!start || !end)
+ return 0;
+
+ kernel = machine__kernel_ip(machine, start);
+ if (kernel)
+ *cpumode = PERF_RECORD_MISC_KERNEL;
+ else
+ *cpumode = PERF_RECORD_MISC_USER;
+
+ /*
+ * Block overlaps between kernel and user.
+ * This can happen due to ring filtering.
+ * On Intel CPUs the entry into the kernel is filtered,
+ * but the exit is not. Let the caller patch it up.
+ */
+ if (kernel != machine__kernel_ip(machine, end)) {
+ printf("\tblock %" PRIx64 "-%" PRIx64 " transfers between kernel and user\n",
+ start, end);
+ return -ENXIO;
+ }
+
+ memset(&al, 0, sizeof(al));
+ if (end - start > MAXBB - MAXINSN) {
+ if (last)
+ printf("\tbrstack does not reach to final jump (%" PRIx64 "-%" PRIx64 ")\n", start, end);
+ else
+ printf("\tblock %" PRIx64 "-%" PRIx64 " (%" PRIu64 ") too long to dump\n", start, end, end - start);
+ return 0;
+ }
+
+ thread__find_addr_map(thread, *cpumode, MAP__FUNCTION, start, &al);
+ if (!al.map || !al.map->dso) {
+ printf("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
+ return 0;
+ }
+ if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR) {
+ printf("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
+ return 0;
+ }
+
+ /* Load maps to ensure dso->is_64_bit has been updated */
+ map__load(al.map);
+
+ offset = al.map->map_ip(al.map, start);
+ len = dso__data_read_offset(al.map->dso, machine, offset, (u8 *)buffer,
+ end - start + MAXINSN);
+
+ *is64bit = al.map->dso->is_64_bit;
+ if (len <= 0)
+ printf("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n",
+ start, end);
+ return len;
+}
+
+static void print_jump(uint64_t ip, struct branch_entry *en,
+ struct perf_insn *x, u8 *inbuf, int len,
+ int insn)
+{
+ printf("\t%016" PRIx64 "\t%-30s\t#%s%s%s%s",
+ ip,
+ dump_insn(x, ip, inbuf, len, NULL),
+ en->flags.predicted ? " PRED" : "",
+ en->flags.mispred ? " MISPRED" : "",
+ en->flags.in_tx ? " INTX" : "",
+ en->flags.abort ? " ABORT" : "");
+ if (en->flags.cycles) {
+ printf(" %d cycles", en->flags.cycles);
+ if (insn)
+ printf(" %.2f IPC", (float)insn / en->flags.cycles);
+ }
+ putchar('\n');
+}
+
+static void print_ip_sym(struct thread *thread, u8 cpumode, int cpu,
+ uint64_t addr, struct symbol **lastsym,
+ struct perf_event_attr *attr)
+{
+ struct addr_location al;
+ int off;
+
+ memset(&al, 0, sizeof(al));
+
+ thread__find_addr_map(thread, cpumode, MAP__FUNCTION, addr, &al);
+ if (!al.map)
+ thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
+ addr, &al);
+ if ((*lastsym) && al.addr >= (*lastsym)->start && al.addr < (*lastsym)->end)
+ return;
+
+ al.cpu = cpu;
+ al.sym = NULL;
+ if (al.map)
+ al.sym = map__find_symbol(al.map, al.addr);
+
+ if (!al.sym)
+ return;
+
+ if (al.addr < al.sym->end)
+ off = al.addr - al.sym->start;
+ else
+ off = al.addr - al.map->start - al.sym->start;
+ printf("\t%s", al.sym->name);
+ if (off)
+ printf("%+d", off);
+ putchar(':');
+ if (PRINT_FIELD(SRCLINE))
+ map__fprintf_srcline(al.map, al.addr, "\t", stdout);
+ putchar('\n');
+ *lastsym = al.sym;
+}
+
+static void print_sample_brstackinsn(struct perf_sample *sample,
+ struct thread *thread,
+ struct perf_event_attr *attr,
+ struct machine *machine)
+{
+ struct branch_stack *br = sample->branch_stack;
+ u64 start, end;
+ int i, insn, len, nr, ilen;
+ struct perf_insn x;
+ u8 buffer[MAXBB];
+ unsigned off;
+ struct symbol *lastsym = NULL;
+
+ if (!(br && br->nr))
+ return;
+ nr = br->nr;
+ if (max_blocks && nr > max_blocks + 1)
+ nr = max_blocks + 1;
+
+ x.thread = thread;
+ x.cpu = sample->cpu;
+
+ putchar('\n');
+
+ /* Handle first from jump, of which we don't know the entry. */
+ len = grab_bb(buffer, br->entries[nr-1].from,
+ br->entries[nr-1].from,
+ machine, thread, &x.is64bit, &x.cpumode, false);
+ if (len > 0) {
+ print_ip_sym(thread, x.cpumode, x.cpu,
+ br->entries[nr - 1].from, &lastsym, attr);
+ print_jump(br->entries[nr - 1].from, &br->entries[nr - 1],
+ &x, buffer, len, 0);
+ }
+
+ /* Print all blocks */
+ for (i = nr - 2; i >= 0; i--) {
+ if (br->entries[i].from || br->entries[i].to)
+ pr_debug("%d: %" PRIx64 "-%" PRIx64 "\n", i,
+ br->entries[i].from,
+ br->entries[i].to);
+ start = br->entries[i + 1].to;
+ end = br->entries[i].from;
+
+ len = grab_bb(buffer, start, end, machine, thread, &x.is64bit, &x.cpumode, false);
+ /* Patch up missing kernel transfers due to ring filters */
+ if (len == -ENXIO && i > 0) {
+ end = br->entries[--i].from;
+ pr_debug("\tpatching up to %" PRIx64 "-%" PRIx64 "\n", start, end);
+ len = grab_bb(buffer, start, end, machine, thread, &x.is64bit, &x.cpumode, false);
+ }
+ if (len <= 0)
+ continue;
+
+ insn = 0;
+ for (off = 0;; off += ilen) {
+ uint64_t ip = start + off;
+
+ print_ip_sym(thread, x.cpumode, x.cpu, ip, &lastsym, attr);
+ if (ip == end) {
+ print_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn);
+ break;
+ } else {
+ printf("\t%016" PRIx64 "\t%s\n", ip,
+ dump_insn(&x, ip, buffer + off, len - off, &ilen));
+ if (ilen == 0)
+ break;
+ insn++;
+ }
+ }
+ }
+
+ /*
+ * Hit the branch? In this case we are already done, and the target
+ * has not been executed yet.
+ */
+ if (br->entries[0].from == sample->ip)
+ return;
+ if (br->entries[0].flags.abort)
+ return;
+
+ /*
+ * Print final block up to the sample
+ */
+ start = br->entries[0].to;
+ end = sample->ip;
+ len = grab_bb(buffer, start, end, machine, thread, &x.is64bit, &x.cpumode, true);
+ print_ip_sym(thread, x.cpumode, x.cpu, start, &lastsym, attr);
+ if (len <= 0) {
+ /* Print at least last IP if basic block did not work */
+ len = grab_bb(buffer, sample->ip, sample->ip,
+ machine, thread, &x.is64bit, &x.cpumode, false);
+ if (len <= 0)
+ return;
+
+ printf("\t%016" PRIx64 "\t%s\n", sample->ip,
+ dump_insn(&x, sample->ip, buffer, len, NULL));
+ return;
+ }
+ for (off = 0; off <= end - start; off += ilen) {
+ printf("\t%016" PRIx64 "\t%s\n", start + off,
+ dump_insn(&x, start + off, buffer + off, len - off, &ilen));
+ if (ilen == 0)
+ break;
+ }
+}

static void print_sample_addr(struct perf_sample *sample,
struct thread *thread,
@@ -632,7 +869,9 @@ static void print_sample_callindent(struct perf_sample *sample,
}

static void print_insn(struct perf_sample *sample,
- struct perf_event_attr *attr)
+ struct perf_event_attr *attr,
+ struct thread *thread,
+ struct machine *machine)
{
if (PRINT_FIELD(INSNLEN))
printf(" ilen: %d", sample->insn_len);
@@ -643,12 +882,15 @@ static void print_insn(struct perf_sample *sample,
for (i = 0; i < sample->insn_len; i++)
printf(" %02x", (unsigned char)sample->insn[i]);
}
+ if (PRINT_FIELD(BRSTACKINSN))
+ print_sample_brstackinsn(sample, thread, attr, machine);
}

static void print_sample_bts(struct perf_sample *sample,
struct perf_evsel *evsel,
struct thread *thread,
- struct addr_location *al)
+ struct addr_location *al,
+ struct machine *machine)
{
struct perf_event_attr *attr = &evsel->attr;
bool print_srcline_last = false;
@@ -689,7 +931,7 @@ static void print_sample_bts(struct perf_sample *sample,
if (print_srcline_last)
map__fprintf_srcline(al->map, al->addr, "\n ", stdout);

- print_insn(sample, attr);
+ print_insn(sample, attr, thread, machine);

printf("\n");
}
@@ -872,7 +1114,8 @@ static size_t data_src__printf(u64 data_src)

static void process_event(struct perf_script *script,
struct perf_sample *sample, struct perf_evsel *evsel,
- struct addr_location *al)
+ struct addr_location *al,
+ struct machine *machine)
{
struct thread *thread = al->thread;
struct perf_event_attr *attr = &evsel->attr;
@@ -899,7 +1142,7 @@ static void process_event(struct perf_script *script,
print_sample_flags(sample->flags);

if (is_bts_event(attr)) {
- print_sample_bts(sample, evsel, thread, al);
+ print_sample_bts(sample, evsel, thread, al, machine);
return;
}

@@ -937,7 +1180,7 @@ static void process_event(struct perf_script *script,

if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
print_sample_bpf_output(sample);
- print_insn(sample, attr);
+ print_insn(sample, attr, thread, machine);
printf("\n");
}

@@ -1047,7 +1290,7 @@ static int process_sample_event(struct perf_tool *tool,
if (scripting_ops)
scripting_ops->process_event(event, sample, evsel, &al);
else
- process_event(scr, sample, evsel, &al);
+ process_event(scr, sample, evsel, &al, machine);

out_put:
addr_location__put(&al);
@@ -2191,7 +2434,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"Valid types: hw,sw,trace,raw. "
"Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
"addr,symoff,period,iregs,brstack,brstacksym,flags,"
- "bpf-output,callindent,insn,insnlen", parse_output_fields),
+ "bpf-output,callindent,insn,insnlen,brstackinsn",
+ parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
@@ -2222,6 +2466,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('\0', "show-namespace-events", &script.show_namespace_events,
"Show namespace events (if recorded)"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
+ OPT_INTEGER(0, "max-blocks", &max_blocks,
+ "Maximum number of code blocks to dump with brstackinsn"),
OPT_BOOLEAN(0, "ns", &nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 2ea5ee179a3b..fb4f42f1bb38 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -82,6 +82,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
libperf-$(CONFIG_AUXTRACE) += intel-pt.o
libperf-$(CONFIG_AUXTRACE) += intel-bts.o
libperf-y += parse-branch-options.o
+libperf-y += dump-insn.o
libperf-y += parse-regs-options.o
libperf-y += term.o
libperf-y += help-unknown-cmd.o
diff --git a/tools/perf/util/dump-insn.c b/tools/perf/util/dump-insn.c
new file mode 100644
index 000000000000..ffbdb19f05d0
--- /dev/null
+++ b/tools/perf/util/dump-insn.c
@@ -0,0 +1,14 @@
+#include <linux/compiler.h>
+#include "dump-insn.h"
+
+/* Fallback code */
+
+__weak
+const char *dump_insn(struct perf_insn *x __maybe_unused,
+ u64 ip __maybe_unused, u8 *inbuf __maybe_unused,
+ int inlen __maybe_unused, int *lenp)
+{
+ if (lenp)
+ *lenp = 0;
+ return "?";
+}
diff --git a/tools/perf/util/dump-insn.h b/tools/perf/util/dump-insn.h
new file mode 100644
index 000000000000..90fb115981cf
--- /dev/null
+++ b/tools/perf/util/dump-insn.h
@@ -0,0 +1,22 @@
+#ifndef __PERF_DUMP_INSN_H
+#define __PERF_DUMP_INSN_H 1
+
+#define MAXINSN 15
+
+#include <linux/types.h>
+
+struct thread;
+
+struct perf_insn {
+ /* Initialized by callers: */
+ struct thread *thread;
+ u8 cpumode;
+ bool is64bit;
+ int cpu;
+ /* Temporary */
+ char out[256];
+};
+
+const char *dump_insn(struct perf_insn *x, u64 ip,
+ u8 *inbuf, int inlen, int *lenp);
+#endif
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index 55b6250350d7..a5f35b21172f 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -26,6 +26,7 @@
#include "insn.c"

#include "intel-pt-insn-decoder.h"
+#include "dump-insn.h"

#if INTEL_PT_INSN_BUF_SZ < MAX_INSN_SIZE || INTEL_PT_INSN_BUF_SZ > MAX_INSN
#error Instruction buffer size too small
@@ -179,6 +180,29 @@ int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
return 0;
}

+const char *dump_insn(struct perf_insn *x, uint64_t ip __maybe_unused,
+ u8 *inbuf, int inlen, int *lenp)
+{
+ struct insn insn;
+ int n, i;
+ int left;
+
+ insn_init(&insn, inbuf, inlen, x->is64bit);
+ insn_get_length(&insn);
+ if (!insn_complete(&insn) || insn.length > inlen)
+ return "<bad>";
+ if (lenp)
+ *lenp = insn.length;
+ left = sizeof(x->out);
+ n = snprintf(x->out, left, "insn: ");
+ left -= n;
+ for (i = 0; i < insn.length; i++) {
+ n += snprintf(x->out + n, left, "%02x ", inbuf[i]);
+ left = sizeof(x->out) - n;
+ }
+ return x->out;
+}
+
const char *branch_name[] = {
[INTEL_PT_OP_OTHER] = "Other",
[INTEL_PT_OP_CALL] = "Call",
--
2.9.3
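
For readers following the hex-dump loop in dump_insn() above, here is a
minimal standalone sketch of the same idea: format a prefix and then each
instruction byte into a fixed buffer, with the remaining space always derived
from the total written so far. The helper name demo_hex_dump() and its
signature are hypothetical and not part of perf; only standard snprintf()
behaviour is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not perf code): write "insn: " followed by the
 * bytes of one instruction in hex into out[], never writing past
 * outlen bytes. Returns the number of characters stored, excluding
 * the terminating NUL.
 */
static size_t demo_hex_dump(char *out, size_t outlen,
			    const unsigned char *buf, size_t buflen)
{
	size_t n, i;
	int ret;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';

	ret = snprintf(out, outlen, "insn: ");
	if (ret < 0)
		return 0;
	/* snprintf returns the untruncated length; clamp to what fits. */
	n = (size_t)ret < outlen ? (size_t)ret : outlen - 1;

	for (i = 0; i < buflen && n + 1 < outlen; i++) {
		/* Remaining space is always outlen - n. */
		ret = snprintf(out + n, outlen - n, "%02x ", buf[i]);
		if (ret < 0)
			break;
		n += (size_t)ret < outlen - n ? (size_t)ret : outlen - n - 1;
	}
	return n;
}
```

With a 32-byte buffer and the three bytes 48 89 e5 this stores
"insn: 48 89 e5 "; with an 8-byte buffer the output is cut short but stays
NUL-terminated.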

Ingo Molnar

Mar 16, 2017, 12:40:09 PM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

Mar 20, 2017, 9:20:05 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

All subcommands need the -f/--force option, so move it to the array
referenced via OPT_PARENT.
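
As a rough illustration of why the move works, here is a minimal standalone
sketch of parent chaining in an option table. The names demo_option and
demo_find_short() and the table layout are hypothetical and are not perf's
struct option API; the sketch only assumes that a subcommand table can end
with a link to a common table, which is searched when the local table has no
match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified option entry; not perf's struct option. */
struct demo_option {
	char short_name;                  /* 0 terminates a table */
	const char *help;
	const struct demo_option *parent; /* set only on the terminator */
};

/* Common options shared by every subcommand, including -f. */
static const struct demo_option demo_lock_options[] = {
	{ 'i', "input file name", NULL },
	{ 'v', "be more verbose", NULL },
	{ 'f', "don't complain, do it", NULL },
	{ 0, NULL, NULL },
};

/* Subcommand table: local options, then a link to the common table. */
static const struct demo_option demo_report_options[] = {
	{ 'k', "key for sorting", NULL },
	{ 0, NULL, demo_lock_options },
};

/* Search a table, then follow the parent link on its terminator. */
static const struct demo_option *
demo_find_short(const struct demo_option *opts, char c)
{
	assert(opts != NULL);
	while (opts) {
		for (; opts->short_name; opts++)
			if (opts->short_name == c)
				return opts;
		opts = opts->parent; /* NULL when there is no parent */
	}
	return NULL;
}
```

In this sketch, looking up 'f' in demo_report_options succeeds through the
parent link, so the subcommand table does not need its own copy of the entry.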

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Changbin Du <chang...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-unbeionpi5...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-lock.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 4ce815bb360d..e992e7206993 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -952,6 +952,7 @@ int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_STRING('i', "input", &input_name, "file", "input file name"),
OPT_INCR('v', "verbose", &verbose, "be more verbose (show symbol address, etc)"),
OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace, "dump raw trace in ASCII"),
+ OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
OPT_END()
};

@@ -960,14 +961,12 @@ int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
"dump thread list in perf.data"),
OPT_BOOLEAN('m', "map", &info_map,
"map of lock instances (address:name table)"),
- OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
OPT_PARENT(lock_options)
};

const struct option report_options[] = {
OPT_STRING('k', "key", &sort_key, "acquired",
"key for sorting (acquired / contended / avg_wait / wait_total / wait_max / wait_min)"),
- OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
/* TODO: type */
OPT_PARENT(lock_options)
};
--
2.9.3

Arnaldo Carvalho de Melo

Mar 20, 2017, 9:20:06 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 61f63e383784bd0ab6529cfc95ddc59c713afcc9:

Merge tag 'perf-core-for-mingo-4.12-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-16 17:29:23 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170320

for you to fetch changes up to affa6c169bae8dc9cb1a2d070c7cd2fe1939c5b8:

tools headers: Sync {tools/,}arch/powerpc/include/uapi/asm/kvm.h (2017-03-20 15:02:29 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

Fixes:

- Fix concat_probe_trace_events() in 'perf probe', it should dereference a
pointer, not test its value (Ravi Bangoria)

User visible:

- Handle partial AUX records, checking if 'kvm_intel.ko' is loaded and
if its 'vmm_exclusive' parameter is set to 0, suggesting tweaking
it to reduce gaps (Alexander Shishkin)

Infrastructure:

- Sync the kvm.h, cpufeatures.h and perf_event.h tools/ headers copies
with the kernel (Arnaldo Carvalho de Melo, Alexander Shishkin)

- 'perf lock' subcommands should include common options, using
OPT_PARENT() (Changbin Du)

- Ditto for 'perf timechart' (Arnaldo Carvalho de Melo)

Documentation:

- Correct 'perf stat --no-aggr' description (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Alexander Shishkin (3):
tools lib api fs: Introduce sysfs__read_bool
tools include: Sync {,tools/}include/uapi/linux/perf_event.h
perf tools: Handle partial AUX records and print a warning

Arnaldo Carvalho de Melo (5):
perf lock: Make 'f' part of the common 'lock_options'
perf timechart: Use OPT_PARENT for common options
tools headers: Sync {tools/,}arch/x86/include/asm/cpufeatures.h
tools headers: Sync {tools/,}arch/arm{64}/include/uapi/asm/kvm.h
tools headers: Sync {tools/,}arch/powerpc/include/uapi/asm/kvm.h

Changbin Du (1):
perf lock: Subcommands should include common options

Ravi Bangoria (2):
perf stat: Correct --no-aggr description
perf probe: Fix concat_probe_trace_events

tools/arch/arm/include/uapi/asm/kvm.h | 13 +++++++++++++
tools/arch/arm64/include/uapi/asm/kvm.h | 13 +++++++++++++
tools/arch/powerpc/include/uapi/asm/kvm.h | 22 ++++++++++++++++++++++
tools/arch/x86/include/asm/cpufeatures.h | 3 ++-
tools/include/uapi/linux/perf_event.h | 1 +
tools/lib/api/fs/fs.c | 29 +++++++++++++++++++++++++++++
tools/lib/api/fs/fs.h | 1 +
tools/perf/Documentation/perf-stat.txt | 3 +--
tools/perf/builtin-lock.c | 22 ++++++++++++----------
tools/perf/builtin-timechart.c | 16 +++++++---------
tools/perf/util/event.c | 5 +++--
tools/perf/util/event.h | 1 +
tools/perf/util/probe-event.c | 2 +-
tools/perf/util/session.c | 27 ++++++++++++++++++++++++---
14 files changed, 130 insertions(+), 28 deletions(-)
Linux felicio.ghostprotocols.net 4.11.0-rc2+ #30 SMP Mon Mar 20 09:47:16 BRT 2017 x86_64 x86_64 x86_64 GNU/Linux
# Includes peterz's fix that makes "55: Convert perf time to TSC" pass,
# That fix should go via his tree.
37.1: Basic BPF filtering : FAILED!
37.2: BPF pinning : Skip
37.3: BPF prologue generation : Skip
37.4: BPF relocation checker : Skip
make_install_O: make install
make_doc_O: make doc
make_no_libelf_O: make NO_LIBELF=1
make_install_bin_O: make install-bin
make_util_map_o_O: make util/map.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_perf_o_O: make perf.o
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_pure_O: make
make_no_libpython_O: make NO_LIBPYTHON=1
make_debug_O: make DEBUG=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_static_O: make LDFLAGS=-static
make_no_slang_O: make NO_SLANG=1
make_no_newt_O: make NO_NEWT=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libperl_O: make NO_LIBPERL=1
make_clean_all_O: make clean all
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_gtk2_O: make NO_GTK2=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_tags_O: make tags
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_help_O: make help
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libbionic_O: make NO_LIBBIONIC=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

Arnaldo Carvalho de Melo

Mar 20, 2017, 9:30:04 PM
From: Alexander Shishkin <alexander...@linux.intel.com>

To get PERF_AUX_FLAG_PARTIAL, introduced in:

ae0c2d995d64 ("perf/core: Add a flag for partial AUX records")

and that will be used to warn the user about gaps in AUX records due
to VMX being used in KVM guests.
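
A minimal sketch of how a consumer might test the new bit, using the flag
values from the header being synced. The helpers demo_aux_has_gaps() and
demo_aux_flags_str() are hypothetical names for illustration, not functions
in perf:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Flag values as listed in the header being synced. */
#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */
#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */
#define PERF_AUX_FLAG_PARTIAL   0x04 /* record contains gaps */

/* Hypothetical helper: returns 1 if the record may contain gaps. */
static int demo_aux_has_gaps(uint64_t flags)
{
	return (flags & PERF_AUX_FLAG_PARTIAL) != 0;
}

/* Hypothetical helper: one letter per set flag, e.g. "TP". */
static const char *demo_aux_flags_str(uint64_t flags, char *buf, size_t len)
{
	assert(buf != NULL && len >= 4);
	snprintf(buf, len, "%s%s%s",
		 flags & PERF_AUX_FLAG_TRUNCATED ? "T" : "",
		 flags & PERF_AUX_FLAG_OVERWRITE ? "O" : "",
		 flags & PERF_AUX_FLAG_PARTIAL   ? "P" : "");
	return buf;
}
```

Because the three values are distinct bits, a record can carry any
combination of them, and each is tested with a bitwise AND.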

Silences the kernel/tools file copy detector:

Warning: include/uapi/linux/perf_event.h differs from kernel

Signed-off-by: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Cc: Stephane Eranian <era...@google.com>
Cc: Vince Weaver <vi...@deater.net>
Link: http://lkml.kernel.org/r/8760j94...@ashishki-desk.ger.corp.intel.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/include/uapi/linux/perf_event.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index bec0aad0e15c..d09a9cd021b1 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -915,6 +915,7 @@ enum perf_callchain_context {
*/
#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */
#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */
+#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */

#define PERF_FLAG_FD_NO_GROUP (1UL << 0)
#define PERF_FLAG_FD_OUTPUT (1UL << 1)
--
2.9.3

Arnaldo Carvalho de Melo

Mar 20, 2017, 9:30:04 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

The description of --no-aggr in the perf-stat man page is outdated. --no-aggr
can also be used while profiling a specific set of CPUs. For example:

$ perf stat -e cycles,instructions -C 1-2 --no-aggr -- sleep 1

Performance counter stats for 'CPU(s) 1-2':

CPU1 5,94,92,795 cycles
CPU2 2,69,72,403 cycles
CPU1 2,02,08,327 instructions # 0.34 insn per cycle
CPU2 73,17,123 instructions # 0.12 insn per cycle

1.000989132 seconds time elapsed

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Stephane Eranian <era...@google.com>
Link: http://lkml.kernel.org/r/1490013438-5713-1-git-s...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-stat.txt | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index aecf2a87e7d6..978548138624 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -94,8 +94,7 @@ to activate system-wide monitoring. Default is to count on all CPUs.

-A::
--no-aggr::
-Do not aggregate counts across all monitored CPUs in system-wide mode (-a).
-This option is only valid in system-wide mode.
+Do not aggregate counts across all monitored CPUs.

-n::
--null::
--
2.9.3

Arnaldo Carvalho de Melo

Mar 20, 2017, 9:30:04 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

'*ntevs' contains the number of elements present in the 'tevs' array. If
there are no elements in the array, 'tevs2' can be directly assigned to 'tevs'
without allocating more space. So the condition should be '*ntevs == 0',
not 'ntevs == 0'.
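
To make the difference concrete: comparing the pointer itself with 0 only
asks whether the pointer is NULL, which is false for any valid caller, so the
fast path is never taken. The following standalone sketch reduces the idea to
plain int arrays; demo_take_if_empty() is a hypothetical name and not the
perf function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reduction of the fast path: if the destination array is
 * empty, take over the source array instead of allocating a new one.
 * Returns 1 when the fast path was taken, 0 otherwise.
 */
static int demo_take_if_empty(int **dst, int *ndst, int **src, int nsrc)
{
	assert(dst && ndst && src);
	/* Test the element count, not the pointer that holds it. */
	if (*ndst == 0) {
		*dst = *src;
		*ndst = nsrc;
		*src = NULL;
		return 1;
	}
	return 0;
}
```

Writing the condition as 'ndst == 0' here would compile, but it would compare
a non-NULL pointer against zero and always fall through to the slow path.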

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Fixes: 42bba263eb58 ("perf probe: Allow wildcard for cached events")
Link: http://lkml.kernel.org/r/20170308065908.41...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b19d17801beb..6740d6812691 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3048,7 +3048,7 @@ concat_probe_trace_events(struct probe_trace_event **tevs, int *ntevs,
struct probe_trace_event *new_tevs;
int ret = 0;

- if (ntevs == 0) {
+ if (*ntevs == 0) {
*tevs = *tevs2;
*ntevs = ntevs2;
*tevs2 = NULL;
--
2.9.3

Ingo Molnar

Mar 21, 2017, 2:50:04 AM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Arnaldo Carvalho de Melo

Mar 21, 2017, 2:50:04 AM
Commit-ID: b40e36121e23031f1e8916a70110ffc841230670
Gitweb: http://git.kernel.org/tip/b40e36121e23031f1e8916a70110ffc841230670
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Fri, 17 Mar 2017 11:16:02 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 17 Mar 2017 11:49:07 -0300

perf lock: Make 'f' part of the common 'lock_options'

All subcommands need the -f/--force option, so move it to the array
referenced via OPT_PARENT.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Changbin Du <chang...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-unbeionpi5...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-lock.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 4ce815b..e992e72 100644

Arnaldo Carvalho de Melo

Mar 24, 2017, 12:10:05 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 267dd0a07eefbb37264fcfad984fffc8856898ad:

Merge tag 'perf-core-for-mingo-4.12-20170320' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-21 07:41:29 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170324

for you to fetch changes up to bf874fcf9f2fed58510dc83abcee388cee2b427e:

perf list: Move extra details printing to new option (2017-03-23 11:42:31 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Allow suppressing 'uncore_' when specifying PMU events (Andi Kleen)

- Collapse identically named PMU events in 'perf stat', allow
not merging them via --no-merge (Andi Kleen)

Fixes:

- Use more precise 'grep -v' to suppress unwanted 'objdump -dS'
disassembly output to not ditch line:number lines needed by
'perf annotate --print-lines' logic (Taeung Song)

Infrastructure:

- SDT (Statically Defined Tracing)/uprobes_events arguments improvements
(Alexis Berlemont, Ravi Bangoria)

- Improvements for the handling of JSON described vendor events,
including having an expression parser to calculate metrics
from multiple vendor events (Andi Kleen)

- Update Intel JSON vendor event files (Andi Kleen)

- Restore error reporting in 'perf probe -d' when none of the events
requested to be deleted exist. (Kefeng Wang)

- Bump MAX_CMDLEN in 'perf probe' to match what the kernel accepts
(Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Alexis Berlemont (2):
perf sdt: Add scanning of sdt probes arguments
perf probe: Add sdt probes arguments into the uprobe cmd string

Andi Kleen (13):
perf stat: Factor out callback for collecting event values
perf stat: Collapse identically named events
perf stat: Handle partially bad results with merging
perf tools: Factor out PMU matching in parser
perf pmu: Expand PMU events by prefix match
perf pmu: Special case uncore_ prefix
perf tools: Add a simple expression parser for JSON
perf vendor events intel: Update Intel uncore JSON event files
perf pmu: Support MetricExpr header in JSON event list
perf stat: Output JSON MetricExpr metric
perf list: Support printing MetricExpr with --debug
perf pmu: Add support for MetricName JSON attribute
perf list: Move extra details printing to new option

Arnaldo Carvalho de Melo (1):
perf annotate: Add comment clarifying how the source code line is parsed

Kefeng Wang (1):
perf probe: Return errno when not hitting any event

Ravi Bangoria (2):
perf probe: Change MAX_CMDLEN
perf sdt x86: Add renaming logic for rNN and other registers

Taeung Song (1):
perf annotate: More exactly grep -v of the objdump command

tools/perf/Documentation/perf-list.txt | 4 +
tools/perf/Documentation/perf-stat.txt | 3 +
tools/perf/arch/x86/util/perf_regs.c | 103 +++++++++++
tools/perf/builtin-list.c | 14 +-
tools/perf/builtin-probe.c | 6 +-
tools/perf/builtin-stat.c | 146 ++++++++++++---
.../arch/x86/broadwellde/uncore-cache.json | 28 +--
.../arch/x86/broadwellde/uncore-memory.json | 26 ++-
.../arch/x86/broadwellde/uncore-power.json | 26 ++-
.../arch/x86/broadwellx/uncore-cache.json | 28 +--
.../arch/x86/broadwellx/uncore-interconnect.json | 6 +-
.../arch/x86/broadwellx/uncore-memory.json | 21 ++-
.../arch/x86/broadwellx/uncore-power.json | 26 ++-
.../pmu-events/arch/x86/haswellx/uncore-cache.json | 28 +--
.../arch/x86/haswellx/uncore-interconnect.json | 6 +-
.../arch/x86/haswellx/uncore-memory.json | 21 ++-
.../pmu-events/arch/x86/haswellx/uncore-power.json | 26 ++-
.../pmu-events/arch/x86/ivytown/uncore-cache.json | 22 +--
.../arch/x86/ivytown/uncore-interconnect.json | 12 +-
.../pmu-events/arch/x86/ivytown/uncore-memory.json | 19 +-
.../pmu-events/arch/x86/ivytown/uncore-power.json | 53 ++++--
.../pmu-events/arch/x86/jaketown/uncore-cache.json | 13 +-
.../arch/x86/jaketown/uncore-interconnect.json | 12 +-
.../arch/x86/jaketown/uncore-memory.json | 21 ++-
.../pmu-events/arch/x86/jaketown/uncore-power.json | 53 ++++--
tools/perf/pmu-events/jevents.c | 26 ++-
tools/perf/pmu-events/jevents.h | 3 +-
tools/perf/pmu-events/pmu-events.h | 2 +
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/expr.c | 56 ++++++
tools/perf/tests/tests.h | 1 +
tools/perf/util/Build | 6 +
tools/perf/util/annotate.c | 8 +-
tools/perf/util/evsel.c | 4 +
tools/perf/util/evsel.h | 5 +
tools/perf/util/expr.h | 25 +++
tools/perf/util/expr.y | 173 ++++++++++++++++++
tools/perf/util/parse-events.c | 78 +++++++-
tools/perf/util/parse-events.h | 10 +-
tools/perf/util/parse-events.y | 73 ++++----
tools/perf/util/perf_regs.c | 6 +
tools/perf/util/perf_regs.h | 6 +
tools/perf/util/pmu.c | 32 +++-
tools/perf/util/pmu.h | 6 +-
tools/perf/util/probe-event.c | 1 -
tools/perf/util/probe-file.c | 173 +++++++++++++++++-
tools/perf/util/stat-shadow.c | 197 +++++++++++++++++++++
tools/perf/util/stat.h | 2 +
tools/perf/util/symbol-elf.c | 25 ++-
tools/perf/util/symbol.h | 1 +
51 files changed, 1370 insertions(+), 277 deletions(-)
create mode 100644 tools/perf/tests/expr.c
create mode 100644 tools/perf/util/expr.h
create mode 100644 tools/perf/util/expr.y
Linux felicio.ghostprotocols.net 4.11.0-rc3+ #1 SMP Thu Mar 23 14:32:00 BRT 2017 x86_64 x86_64 x86_64 GNU/Linux
# Has peterz's fix for 'perf test tsc'
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: Simple expression parser : Ok
7: PERF_RECORD_* events & perf_sample fields : Ok
8: Parse perf pmu format : Ok
9: DSO data read : Ok
10: DSO data cache : Ok
11: DSO data reopen : Ok
12: Roundtrip evsel->name : Ok
13: Parse sched tracepoints fields : Ok
14: syscalls:sys_enter_openat event fields : Ok
15: Setup struct perf_event_attr : Ok
16: Match and link multiple hists : Ok
17: 'import perf' in python : Ok
18: Breakpoint overflow signal handler : Ok
19: Breakpoint overflow sampling : Ok
20: Number of exit events of a simple workload : Ok
21: Software clock events period values : Ok
22: Object code reading : Ok
23: Sample parsing : Ok
24: Use a dummy software event to keep tracking: Ok
25: Parse with no sample_id_all bit set : Ok
26: Filter hist entries : Ok
27: Lookup mmap thread : Ok
28: Share thread mg : Ok
29: Sort output of hist entries : Ok
30: Cumulate child hist entries : Ok
31: Track with sched_switch : Ok
32: Filter fds with revents mask in a fdarray : Ok
33: Add fd to a fdarray, making it autogrow : Ok
34: kmod_path__parse : Ok
35: Thread map : Ok
36: LLVM search and compile :
36.1: Basic BPF llvm compile : Ok
36.2: kbuild searching : Ok
36.3: Compile source for BPF prologue generation: Ok
36.4: Compile source for BPF relocation : Ok
37: Session topology : Ok
38: BPF filter :
38.1: Basic BPF filtering : Ok
38.2: BPF pinning : Ok
38.3: BPF prologue generation : Ok
38.4: BPF relocation checker : Ok
39: Synthesize thread map : Ok
40: Remove thread map : Ok
41: Synthesize cpu map : Ok
42: Synthesize stat config : Ok
43: Synthesize stat : Ok
44: Synthesize stat round : Ok
45: Synthesize attr update : Ok
46: Event times : Ok
47: Read backward ring buffer : Ok
48: Print cpu map : Ok
49: Probe SDT events : Ok
50: is_printable_array : Ok
51: Print bitmap : Ok
52: perf hooks : Ok
53: builtin clang support : Skip (not compiled in)
54: unit_number__scnprintf : Ok
55: x86 rdpmc : Ok
56: Convert perf time to TSC : Ok
57: DWARF unwind : Ok
58: x86 instruction decoder - new instructions : Ok
59: Intel cqm nmi context read : Skip
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_install_O: make install
make_install_bin_O: make install-bin
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_pure_O: make
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_newt_O: make NO_NEWT=1
make_debug_O: make DEBUG=1
make_doc_O: make doc
make_no_slang_O: make NO_SLANG=1
make_static_O: make LDFLAGS=-static
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libbpf_O: make NO_LIBBPF=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_perf_o_O: make perf.o
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_help_O: make help
make_util_map_o_O: make util/map.o
make_no_libperl_O: make NO_LIBPERL=1
make_no_gtk2_O: make NO_GTK2=1
make_install_prefix_O: make install prefix=/tmp/krava
make_clean_all_O: make clean all
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_tags_O: make tags
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libelf_O: make NO_LIBELF=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/

Arnaldo Carvalho de Melo

Mar 24, 2017, 12:10:06 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

There are many SDT markers in powerpc whose uprobe definition goes
beyond the current MAX_CMDLEN, especially when the target filename is
long and the SDT marker has a long list of arguments. For example, the
definition of the SDT marker

method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28)

from file

/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so

is

p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\
1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\
arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\
arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\
arg9=+37(%gpr28):u8
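
To make the mapping between the marker arguments and the uprobe definition above concrete, here is a minimal sketch, not perf's actual parser. It assumes the descriptor forms seen in this example only ("SIZE@REG" and "SIZE@OFFSET(REG)", with a negative size meaning signed), and the helper name sdt_arg_to_uprobe() is hypothetical:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn one SDT argument descriptor, e.g. "8@17",
 * "-4@8" or "1@37(28)", into a powerpc uprobe fetch argument such as
 * "arg1=%gpr17:u64". Only the forms shown in the commit message are
 * handled. Returns 0 on success, -1 on a malformed descriptor or when
 * buf is too small.
 */
static int sdt_arg_to_uprobe(const char *desc, int argno, char *buf, size_t len)
{
	int size, first, second, matched, bits, n;
	char sign;

	/* "SIZE@REG" matches two fields, "SIZE@OFFSET(REG)" matches three */
	matched = sscanf(desc, "%d@%d(%d)", &size, &first, &second);
	if (matched < 2)
		return -1;

	/* A negative size marks a signed value; width is |size| bytes */
	sign = size < 0 ? 's' : 'u';
	bits = abs(size) * 8;
	if (bits != 8 && bits != 16 && bits != 32 && bits != 64)
		return -1;

	if (matched == 3)	/* memory operand: offset(register) */
		n = snprintf(buf, len, "arg%d=%+d(%%gpr%d):%c%d",
			     argno, first, second, sign, bits);
	else			/* plain register operand */
		n = snprintf(buf, len, "arg%d=%%gpr%d:%c%d",
			     argno, first, sign, bits);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Each converted argument adds roughly 15 to 20 characters, so nine arguments plus a long library path easily exceed a 256-byte limit.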

'perf probe' fails with a segfault for such markers. As the uprobe_events
file accepts definitions of up to 4094 characters (4096 - 2 for "\n\0"),
increase the value of MAX_CMDLEN to match that.
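
As an illustration of why the limit matters, the sketch below formats a probe definition into a bounded buffer and reports truncation instead of overrunning it. This is an assumption-laden example rather than the code in probe-file.c: build_probe_def() and its parameters are hypothetical, and only the MAX_CMDLEN value comes from the patch.

```c
#include <stdio.h>

/* 4096 - 2 ('\n' + '\0'), the value introduced by the patch */
#define MAX_CMDLEN 4094

/*
 * Hypothetical helper: write "p:GROUP/EVENT PATH:0xADDR" into buf.
 * Returns the definition length on success, or -1 if it would not fit
 * in len bytes (including the terminating NUL).
 */
static int build_probe_def(char *buf, size_t len, const char *group,
			   const char *event, const char *path,
			   unsigned long addr)
{
	int n = snprintf(buf, len, "p:%s/%s %s:0x%lx", group, event, path, addr);

	/* snprintf() returns the length it wanted to write; >= len means truncated */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

A caller would size the buffer as MAX_CMDLEN + 1 so that a definition of exactly MAX_CMDLEN characters still fits with its terminator.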

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexis Berlemont <alexis.b...@gmail.com>
Cc: Madhavan Srinivasan <ma...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170207054547.36...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/probe-event.c | 1 -
tools/perf/util/probe-file.c | 3 ++-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 6740d6812691..e4b889444447 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -47,7 +47,6 @@
#include "probe-file.h"
#include "session.h"

-#define MAX_CMDLEN 256
#define PERFPROBE_GROUP "probe"

bool probe_event_dry_run; /* Dry run flag */
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 1542cd0d6799..c3c287125be5 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -28,7 +28,8 @@
#include "probe-file.h"
#include "session.h"

-#define MAX_CMDLEN 256
+/* 4096 - 2 ('\n' + '\0') */
+#define MAX_CMDLEN 4094

static void print_open_warning(int err, bool uprobe)
{
--
2.9.3

Ingo Molnar

Mar 24, 2017, 2:50:10 PM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:06 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

We got the 'prefix' parameter from the git sources but never used it for
anything; the only place where it would have been used still reads:

static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
{
prefix = NULL;
if (p->option & RUN_SETUP)
prefix = NULL; /* setup_perf_directory(); */

Ditch it.
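
For readers unfamiliar with the dispatch pattern, the following simplified sketch shows what a two-argument builtin table looks like once 'prefix' is gone. The struct layout, find_cmd() and cmd_count_args() are assumptions made for illustration; only the names cmd_struct, run_builtin and RUN_SETUP appear in the excerpt above.

```c
#include <stddef.h>
#include <string.h>

#define RUN_SETUP (1 << 0)

/* Assumed layout: command name, handler without 'prefix', option flags */
struct cmd_struct {
	const char *cmd;
	int (*fn)(int argc, const char **argv);
	int option;
};

/* Hypothetical builtin: just reports how many arguments it received */
static int cmd_count_args(int argc, const char **argv)
{
	(void)argv;
	return argc;
}

/* Look up a builtin by name; returns NULL if there is no such command */
static const struct cmd_struct *find_cmd(const struct cmd_struct *cmds,
					 size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(cmds[i].cmd, name))
			return &cmds[i];
	return NULL;
}

/* With 'prefix' removed, dispatch is a plain two-argument call */
static int run_builtin(const struct cmd_struct *p, int argc, const char **argv)
{
	return p->fn(argc, argv);
}
```

Because every handler shares one function-pointer type, dropping the parameter means touching each cmd_*() and bench_*() definition, which is why the diffstat is wide but shallow.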

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-uw5swz05vo...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/bench/bench.h | 20 ++++++------
tools/perf/bench/futex-hash.c | 3 +-
tools/perf/bench/futex-lock-pi.c | 3 +-
tools/perf/bench/futex-requeue.c | 3 +-
tools/perf/bench/futex-wake-parallel.c | 3 +-
tools/perf/bench/futex-wake.c | 3 +-
tools/perf/bench/mem-functions.c | 4 +--
tools/perf/bench/numa.c | 2 +-
tools/perf/bench/sched-messaging.c | 3 +-
tools/perf/bench/sched-pipe.c | 2 +-
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-bench.c | 12 +++----
tools/perf/builtin-buildid-cache.c | 3 +-
tools/perf/builtin-buildid-list.c | 3 +-
tools/perf/builtin-c2c.c | 4 +--
tools/perf/builtin-config.c | 2 +-
tools/perf/builtin-data.c | 9 +++---
tools/perf/builtin-diff.c | 2 +-
tools/perf/builtin-evlist.c | 2 +-
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kallsyms.c | 2 +-
tools/perf/builtin-kmem.c | 4 +--
tools/perf/builtin-kvm.c | 16 +++++-----
tools/perf/builtin-list.c | 2 +-
tools/perf/builtin-lock.c | 6 ++--
tools/perf/builtin-mem.c | 6 ++--
tools/perf/builtin-probe.c | 6 ++--
tools/perf/builtin-record.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 6 ++--
tools/perf/builtin-script.c | 4 +--
tools/perf/builtin-stat.c | 2 +-
tools/perf/builtin-timechart.c | 7 ++--
tools/perf/builtin-top.c | 2 +-
tools/perf/builtin-trace.c | 4 +--
tools/perf/builtin-version.c | 3 +-
tools/perf/builtin.h | 58 +++++++++++++++++-----------------
tools/perf/perf.c | 11 ++-----
tools/perf/tests/builtin-test.c | 2 +-
41 files changed, 110 insertions(+), 126 deletions(-)

diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 579a592990dd..842ab2781cdc 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -25,17 +25,17 @@
# endif
#endif

-int bench_numa(int argc, const char **argv, const char *prefix);
-int bench_sched_messaging(int argc, const char **argv, const char *prefix);
-int bench_sched_pipe(int argc, const char **argv, const char *prefix);
-int bench_mem_memcpy(int argc, const char **argv, const char *prefix);
-int bench_mem_memset(int argc, const char **argv, const char *prefix);
-int bench_futex_hash(int argc, const char **argv, const char *prefix);
-int bench_futex_wake(int argc, const char **argv, const char *prefix);
-int bench_futex_wake_parallel(int argc, const char **argv, const char *prefix);
-int bench_futex_requeue(int argc, const char **argv, const char *prefix);
+int bench_numa(int argc, const char **argv);
+int bench_sched_messaging(int argc, const char **argv);
+int bench_sched_pipe(int argc, const char **argv);
+int bench_mem_memcpy(int argc, const char **argv);
+int bench_mem_memset(int argc, const char **argv);
+int bench_futex_hash(int argc, const char **argv);
+int bench_futex_wake(int argc, const char **argv);
+int bench_futex_wake_parallel(int argc, const char **argv);
+int bench_futex_requeue(int argc, const char **argv);
/* pi futexes */
-int bench_futex_lock_pi(int argc, const char **argv, const char *prefix);
+int bench_futex_lock_pi(int argc, const char **argv);

#define BENCH_FORMAT_DEFAULT_STR "default"
#define BENCH_FORMAT_DEFAULT 0
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 2499e1b0c6fb..fe16b310097f 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -114,8 +114,7 @@ static void print_summary(void)
(int) runtime.tv_sec);
}

-int bench_futex_hash(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_hash(int argc, const char **argv)
{
int ret = 0;
cpu_set_t cpu;
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index a20814d94af1..73a1c44ea63c 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -140,8 +140,7 @@ static void create_threads(struct worker *w, pthread_attr_t thread_attr)
}
}

-int bench_futex_lock_pi(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_lock_pi(int argc, const char **argv)
{
int ret = 0;
unsigned int i;
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index 9fad1e4fcd3e..41786cbea24c 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -109,8 +109,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_requeue(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_requeue(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 40f5fcf1d120..4ab12c8e016a 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -197,8 +197,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_wake_parallel(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_wake_parallel(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 789490281ae3..2fa49222ef8d 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -115,8 +115,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_wake(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_wake(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
index 52504a83b5a1..d1dea33dcfcf 100644
--- a/tools/perf/bench/mem-functions.c
+++ b/tools/perf/bench/mem-functions.c
@@ -284,7 +284,7 @@ static const char * const bench_mem_memcpy_usage[] = {
NULL
};

-int bench_mem_memcpy(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_mem_memcpy(int argc, const char **argv)
{
struct bench_mem_info info = {
.functions = memcpy_functions,
@@ -358,7 +358,7 @@ static const struct function memset_functions[] = {
{ .name = NULL, }
};

-int bench_mem_memset(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_mem_memset(int argc, const char **argv)
{
struct bench_mem_info info = {
.functions = memset_functions,
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 6bd0581de298..1fe43bd5a012 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -1767,7 +1767,7 @@ static int bench_all(void)
return 0;
}

-int bench_numa(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_numa(int argc, const char **argv)
{
init_params(&p0, "main,", argc, argv);
argc = parse_options(argc, argv, options, bench_numa_usage, 0);
diff --git a/tools/perf/bench/sched-messaging.c b/tools/perf/bench/sched-messaging.c
index 6a111e775210..4f961e74535b 100644
--- a/tools/perf/bench/sched-messaging.c
+++ b/tools/perf/bench/sched-messaging.c
@@ -260,8 +260,7 @@ static const char * const bench_sched_message_usage[] = {
NULL
};

-int bench_sched_messaging(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_sched_messaging(int argc, const char **argv)
{
unsigned int i, total_children;
struct timeval start, stop, diff;
diff --git a/tools/perf/bench/sched-pipe.c b/tools/perf/bench/sched-pipe.c
index 2243f0150d76..a152737370c5 100644
--- a/tools/perf/bench/sched-pipe.c
+++ b/tools/perf/bench/sched-pipe.c
@@ -76,7 +76,7 @@ static void *worker_thread(void *__tdata)
return NULL;
}

-int bench_sched_pipe(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_sched_pipe(int argc, const char **argv)
{
struct thread_data threads[2], *td;
int pipe_1[2], pipe_2[2];
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index e54b1f9fe1ee..56a7c8d210b9 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -383,7 +383,7 @@ static const char * const annotate_usage[] = {
NULL
};

-int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_annotate(int argc, const char **argv)
{
struct perf_annotate annotate = {
.tool = {
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index a1cddc6bbf0f..445e62881254 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -25,7 +25,7 @@
#include <string.h>
#include <sys/prctl.h>

-typedef int (*bench_fn_t)(int argc, const char **argv, const char *prefix);
+typedef int (*bench_fn_t)(int argc, const char **argv);

struct bench {
const char *name;
@@ -155,7 +155,7 @@ static int bench_str2int(const char *str)
* to something meaningful:
*/
static int run_bench(const char *coll_name, const char *bench_name, bench_fn_t fn,
- int argc, const char **argv, const char *prefix)
+ int argc, const char **argv)
{
int size;
char *name;
@@ -171,7 +171,7 @@ static int run_bench(const char *coll_name, const char *bench_name, bench_fn_t f
prctl(PR_SET_NAME, name);
argv[0] = name;

- ret = fn(argc, argv, prefix);
+ ret = fn(argc, argv);

free(name);

@@ -198,7 +198,7 @@ static void run_collection(struct collection *coll)
fflush(stdout);

argv[1] = bench->name;
- run_bench(coll->name, bench->name, bench->fn, 1, argv, NULL);
+ run_bench(coll->name, bench->name, bench->fn, 1, argv);
printf("\n");
}
}
@@ -211,7 +211,7 @@ static void run_all_collections(void)
run_collection(coll);
}

-int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_bench(int argc, const char **argv)
{
struct collection *coll;
int ret = 0;
@@ -270,7 +270,7 @@ int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
if (bench_format == BENCH_FORMAT_DEFAULT)
printf("# Running '%s/%s' benchmark:\n", coll->name, bench->name);
fflush(stdout);
- ret = run_bench(coll->name, bench->name, bench->fn, argc-1, argv+1, prefix);
+ ret = run_bench(coll->name, bench->name, bench->fn, argc-1, argv+1);
goto end;
}

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index 30e2b2cb2421..94b55eee0d9b 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -276,8 +276,7 @@ static int build_id_cache__update_file(const char *filename)
return err;
}

-int cmd_buildid_cache(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_buildid_cache(int argc, const char **argv)
{
struct strlist *list;
struct str_node *pos;
diff --git a/tools/perf/builtin-buildid-list.c b/tools/perf/builtin-buildid-list.c
index 5e914ee79eb3..26f4e608207f 100644
--- a/tools/perf/builtin-buildid-list.c
+++ b/tools/perf/builtin-buildid-list.c
@@ -87,8 +87,7 @@ static int perf_session__list_build_ids(bool force, bool with_hits)
return 0;
}

-int cmd_buildid_list(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_buildid_list(int argc, const char **argv)
{
bool show_kernel = false;
bool with_hits = false;
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5cd6d7a047b9..70c2c773a2b8 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2755,12 +2755,12 @@ static int perf_c2c__record(int argc, const char **argv)
pr_debug("\n");
}

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}

-int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_c2c(int argc, const char **argv)
{
argc = parse_options(argc, argv, c2c_options, c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 8c0d93b7c2f0..55f04f85b049 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -154,7 +154,7 @@ static int parse_config_arg(char *arg, char **var, char **value)
return 0;
}

-int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_config(int argc, const char **argv)
{
int i, ret = 0;
struct perf_config_set *set;
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
index 7ad6e17ac6b3..0adb5f82335a 100644
--- a/tools/perf/builtin-data.c
+++ b/tools/perf/builtin-data.c
@@ -6,7 +6,7 @@
#include "data-convert.h"
#include "data-convert-bt.h"

-typedef int (*data_cmd_fn_t)(int argc, const char **argv, const char *prefix);
+typedef int (*data_cmd_fn_t)(int argc, const char **argv);

struct data_cmd {
const char *name;
@@ -50,8 +50,7 @@ static const char * const data_convert_usage[] = {
NULL
};

-static int cmd_data_convert(int argc, const char **argv,
- const char *prefix __maybe_unused)
+static int cmd_data_convert(int argc, const char **argv)
{
const char *to_ctf = NULL;
struct perf_data_convert_opts opts = {
@@ -98,7 +97,7 @@ static struct data_cmd data_cmds[] = {
{ .name = NULL, },
};

-int cmd_data(int argc, const char **argv, const char *prefix)
+int cmd_data(int argc, const char **argv)
{
struct data_cmd *cmd;
const char *cmdstr;
@@ -118,7 +117,7 @@ int cmd_data(int argc, const char **argv, const char *prefix)
if (strcmp(cmd->name, cmdstr))
continue;

- return cmd->fn(argc, argv, prefix);
+ return cmd->fn(argc, argv);
}

pr_err("Unknown command: %s\n", cmdstr);
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5e4803158672..cd2605d86984 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1321,7 +1321,7 @@ static int diff__config(const char *var, const char *value,
return 0;
}

-int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_diff(int argc, const char **argv)
{
int ret = hists__init();

diff --git a/tools/perf/builtin-evlist.c b/tools/perf/builtin-evlist.c
index e09c4287fe87..6d210e40d611 100644
--- a/tools/perf/builtin-evlist.c
+++ b/tools/perf/builtin-evlist.c
@@ -46,7 +46,7 @@ static int __cmd_evlist(const char *file_name, struct perf_attr_details *details
return 0;
}

-int cmd_evlist(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_evlist(int argc, const char **argv)
{
struct perf_attr_details details = { .verbose = false, };
const struct option options[] = {
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 6087295f8827..f80fb60b00b0 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -304,7 +304,7 @@ static int perf_ftrace_config(const char *var, const char *value, void *cb)
return -1;
}

-int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_ftrace(int argc, const char **argv)
{
int ret;
struct perf_ftrace ftrace = {
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index aed0d844e8c2..7ae238929e95 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -418,7 +418,7 @@ static int show_html_page(const char *perf_cmd)
return 0;
}

-int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_help(int argc, const char **argv)
{
bool show_all = false;
enum help_format help_format = HELP_FORMAT_MAN;
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 8d1d13b9bab6..42dff0b1375a 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -738,7 +738,7 @@ static int __cmd_inject(struct perf_inject *inject)
return ret;
}

-int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_inject(int argc, const char **argv)
{
struct perf_inject inject = {
.tool = {
diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
index 224bfc454b4a..8ff38c4eb2c0 100644
--- a/tools/perf/builtin-kallsyms.c
+++ b/tools/perf/builtin-kallsyms.c
@@ -43,7 +43,7 @@ static int __cmd_kallsyms(int argc, const char **argv)
return 0;
}

-int cmd_kallsyms(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_kallsyms(int argc, const char **argv)
{
const struct option options[] = {
OPT_INCR('v', "verbose", &verbose, "be more verbose (show counter open errors, etc)"),
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index d509e74bc6e8..515587825af4 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1866,7 +1866,7 @@ static int __cmd_record(int argc, const char **argv)
for (j = 1; j < (unsigned int)argc; j++, i++)
rec_argv[i] = argv[j];

- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

static int kmem_config(const char *var, const char *value, void *cb __maybe_unused)
@@ -1885,7 +1885,7 @@ static int kmem_config(const char *var, const char *value, void *cb __maybe_unus
return 0;
}

-int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_kmem(int argc, const char **argv)
{
const char * const default_slab_sort = "frag,hit,bytes";
const char * const default_page_sort = "bytes,hit";
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 18e6c38864bc..38b409173693 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1209,7 +1209,7 @@ kvm_events_record(struct perf_kvm_stat *kvm, int argc, const char **argv)
set_option_flag(record_options, 0, "transaction", PARSE_OPT_DISABLED);

record_usage = kvm_stat_record_usage;
- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

static int
@@ -1477,7 +1477,7 @@ static int kvm_cmd_stat(const char *file_name, int argc, const char **argv)
#endif

perf_stat:
- return cmd_stat(argc, argv, NULL);
+ return cmd_stat(argc, argv);
}
#endif /* HAVE_KVM_STAT_SUPPORT */

@@ -1496,7 +1496,7 @@ static int __cmd_record(const char *file_name, int argc, const char **argv)

BUG_ON(i != rec_argc);

- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

static int __cmd_report(const char *file_name, int argc, const char **argv)
@@ -1514,7 +1514,7 @@ static int __cmd_report(const char *file_name, int argc, const char **argv)

BUG_ON(i != rec_argc);

- return cmd_report(i, rec_argv, NULL);
+ return cmd_report(i, rec_argv);
}

static int
@@ -1533,10 +1533,10 @@ __cmd_buildid_list(const char *file_name, int argc, const char **argv)

BUG_ON(i != rec_argc);

- return cmd_buildid_list(i, rec_argv, NULL);
+ return cmd_buildid_list(i, rec_argv);
}

-int cmd_kvm(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_kvm(int argc, const char **argv)
{
const char *file_name = NULL;
const struct option kvm_options[] = {
@@ -1591,9 +1591,9 @@ int cmd_kvm(int argc, const char **argv, const char *prefix __maybe_unused)
else if (!strncmp(argv[0], "rep", 3))
return __cmd_report(file_name, argc, argv);
else if (!strncmp(argv[0], "diff", 4))
- return cmd_diff(argc, argv, NULL);
+ return cmd_diff(argc, argv);
else if (!strncmp(argv[0], "top", 3))
- return cmd_top(argc, argv, NULL);
+ return cmd_top(argc, argv);
else if (!strncmp(argv[0], "buildid-list", 12))
return __cmd_buildid_list(file_name, argc, argv);
#ifdef HAVE_KVM_STAT_SUPPORT
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index be9195e95c78..4bf2cb4d25aa 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -20,7 +20,7 @@
static bool desc_flag = true;
static bool details_flag;

-int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_list(int argc, const char **argv)
{
int i;
bool raw_dump = false;
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index e992e7206993..b686fb6759da 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -941,12 +941,12 @@ static int __cmd_record(int argc, const char **argv)

BUG_ON(i != rec_argc);

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}

-int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_lock(int argc, const char **argv)
{
const struct option lock_options[] = {
OPT_STRING('i', "input", &input_name, "file", "input file name"),
@@ -1009,7 +1009,7 @@ int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
rc = __cmd_report(false);
} else if (!strcmp(argv[0], "script")) {
/* Aliased to 'perf script' */
- return cmd_script(argc, argv, prefix);
+ return cmd_script(argc, argv);
} else if (!strcmp(argv[0], "info")) {
if (argc) {
argc = parse_options(argc, argv,
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 030a6cfdda59..643f4faac0d0 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -129,7 +129,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
pr_debug("\n");
}

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}
@@ -256,7 +256,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
for (j = 1; j < argc; j++, i++)
rep_argv[i] = argv[j];

- ret = cmd_report(i, rep_argv, NULL);
+ ret = cmd_report(i, rep_argv);
free(rep_argv);
return ret;
}
@@ -330,7 +330,7 @@ parse_mem_ops(const struct option *opt, const char *str, int unset)
return ret;
}

-int cmd_mem(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_mem(int argc, const char **argv)
{
struct stat st;
struct perf_mem mem = {
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 51cdc230f6ca..d7360c2bda13 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -468,7 +468,7 @@ static int perf_del_probe_events(struct strfilter *filter)


static int
-__cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
+__cmd_probe(int argc, const char **argv)
{
const char * const probe_usage[] = {
"perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]",
@@ -687,13 +687,13 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
return 0;
}

-int cmd_probe(int argc, const char **argv, const char *prefix)
+int cmd_probe(int argc, const char **argv)
{
int ret;

ret = init_params();
if (!ret) {
- ret = __cmd_probe(argc, argv, prefix);
+ ret = __cmd_probe(argc, argv);
cleanup_params();
}

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 04faef79a548..3191ab063852 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1667,7 +1667,7 @@ static struct option __record_options[] = {

struct option *record_options = __record_options;

-int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_record(int argc, const char **argv)
{
int err;
struct record *rec = &record;
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 5ab8117c3bfd..3c8885a1c452 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -681,7 +681,7 @@ const char report_callchain_help[] = "Display call graph (stack chain/backtrace)
CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;

-int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_report(int argc, const char **argv)
{
struct perf_session *session;
struct itrace_synth_opts itrace_synth_opts = { .set = 0, };
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index b92c4d97192c..79833e226789 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3272,10 +3272,10 @@ static int __cmd_record(int argc, const char **argv)

BUG_ON(i != rec_argc);

- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

-int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_sched(int argc, const char **argv)
{
const char default_sort_order[] = "avg, max, switch, runtime";
struct perf_sched sched = {
@@ -3412,7 +3412,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
* Aliased to 'perf script' for now:
*/
if (!strcmp(argv[0], "script"))
- return cmd_script(argc, argv, prefix);
+ return cmd_script(argc, argv);

if (!strncmp(argv[0], "rec", 3)) {
return __cmd_record(argc, argv);
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c98e16689b57..46acc8ece41f 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2359,7 +2359,7 @@ int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
return set_maps(script);
}

-int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_script(int argc, const char **argv)
{
bool show_full_info = false;
bool header = false;
@@ -2504,7 +2504,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (argc > 1 && !strncmp(argv[0], "rec", strlen("rec"))) {
rec_script_path = get_script_path(argv[1], RECORD_SUFFIX);
if (!rec_script_path)
- return cmd_record(argc, argv, NULL);
+ return cmd_record(argc, argv);
}

if (argc > 1 && !strncmp(argv[0], "rep", strlen("rep"))) {
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 01b589e3c3a6..2158ea14da57 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2478,7 +2478,7 @@ static void setup_system_wide(int forks)
}
}

-int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_stat(int argc, const char **argv)
{
const char * const stat_usage[] = {
"perf stat [<options>] [<command>]",
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index fbd7c6c695b8..fafdb44b8bcb 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -1773,7 +1773,7 @@ static int timechart__io_record(int argc, const char **argv)
for (i = 0; i < (unsigned int)argc; i++)
*p++ = argv[i];

- return cmd_record(rec_argc, rec_argv, NULL);
+ return cmd_record(rec_argc, rec_argv);
}


@@ -1864,7 +1864,7 @@ static int timechart__record(struct timechart *tchart, int argc, const char **ar
for (j = 0; j < (unsigned int)argc; j++)
*p++ = argv[j];

- return cmd_record(rec_argc, rec_argv, NULL);
+ return cmd_record(rec_argc, rec_argv);
}

static int
@@ -1917,8 +1917,7 @@ parse_time(const struct option *opt, const char *arg, int __maybe_unused unset)
return 0;
}

-int cmd_timechart(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_timechart(int argc, const char **argv)
{
struct timechart tchart = {
.tool = {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ab9077915763..a0c97c70ec81 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1075,7 +1075,7 @@ parse_percent_limit(const struct option *opt, const char *arg,
const char top_callchain_help[] = CALLCHAIN_RECORD_HELP CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: fp,graph,0.5,caller,function";

-int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_top(int argc, const char **argv)
{
char errbuf[BUFSIZ];
struct perf_top top = {
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 60053d49539b..c88f9f215e6f 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1993,7 +1993,7 @@ static int trace__record(struct trace *trace, int argc, const char **argv)
for (i = 0; i < (unsigned int)argc; i++)
rec_argv[j++] = argv[i];

- return cmd_record(j, rec_argv, NULL);
+ return cmd_record(j, rec_argv);
}

static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp);
@@ -2791,7 +2791,7 @@ static int trace__parse_events_option(const struct option *opt, const char *str,
return err;
}

-int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_trace(int argc, const char **argv)
{
const char *trace_usage[] = {
"perf trace [<options>] [<command>]",
diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index 9b10cda6b6dc..b9a095b1db99 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -2,8 +2,7 @@
#include "builtin.h"
#include "perf.h"

-int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused,
- const char *prefix __maybe_unused)
+int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused)
{
printf("perf version %s\n", perf_version_string);
return 0;
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 036e1e35b1a8..26669bf9129c 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -13,35 +13,35 @@ void prune_packed_objects(int);
int read_line_with_nul(char *buf, int size, FILE *file);
int check_pager_config(const char *cmd);

-int cmd_annotate(int argc, const char **argv, const char *prefix);
-int cmd_bench(int argc, const char **argv, const char *prefix);
-int cmd_buildid_cache(int argc, const char **argv, const char *prefix);
-int cmd_buildid_list(int argc, const char **argv, const char *prefix);
-int cmd_config(int argc, const char **argv, const char *prefix);
-int cmd_c2c(int argc, const char **argv, const char *prefix);
-int cmd_diff(int argc, const char **argv, const char *prefix);
-int cmd_evlist(int argc, const char **argv, const char *prefix);
-int cmd_help(int argc, const char **argv, const char *prefix);
-int cmd_sched(int argc, const char **argv, const char *prefix);
-int cmd_kallsyms(int argc, const char **argv, const char *prefix);
-int cmd_list(int argc, const char **argv, const char *prefix);
-int cmd_record(int argc, const char **argv, const char *prefix);
-int cmd_report(int argc, const char **argv, const char *prefix);
-int cmd_stat(int argc, const char **argv, const char *prefix);
-int cmd_timechart(int argc, const char **argv, const char *prefix);
-int cmd_top(int argc, const char **argv, const char *prefix);
-int cmd_script(int argc, const char **argv, const char *prefix);
-int cmd_version(int argc, const char **argv, const char *prefix);
-int cmd_probe(int argc, const char **argv, const char *prefix);
-int cmd_kmem(int argc, const char **argv, const char *prefix);
-int cmd_lock(int argc, const char **argv, const char *prefix);
-int cmd_kvm(int argc, const char **argv, const char *prefix);
-int cmd_test(int argc, const char **argv, const char *prefix);
-int cmd_trace(int argc, const char **argv, const char *prefix);
-int cmd_inject(int argc, const char **argv, const char *prefix);
-int cmd_mem(int argc, const char **argv, const char *prefix);
-int cmd_data(int argc, const char **argv, const char *prefix);
-int cmd_ftrace(int argc, const char **argv, const char *prefix);
+int cmd_annotate(int argc, const char **argv);
+int cmd_bench(int argc, const char **argv);
+int cmd_buildid_cache(int argc, const char **argv);
+int cmd_buildid_list(int argc, const char **argv);
+int cmd_config(int argc, const char **argv);
+int cmd_c2c(int argc, const char **argv);
+int cmd_diff(int argc, const char **argv);
+int cmd_evlist(int argc, const char **argv);
+int cmd_help(int argc, const char **argv);
+int cmd_sched(int argc, const char **argv);
+int cmd_kallsyms(int argc, const char **argv);
+int cmd_list(int argc, const char **argv);
+int cmd_record(int argc, const char **argv);
+int cmd_report(int argc, const char **argv);
+int cmd_stat(int argc, const char **argv);
+int cmd_timechart(int argc, const char **argv);
+int cmd_top(int argc, const char **argv);
+int cmd_script(int argc, const char **argv);
+int cmd_version(int argc, const char **argv);
+int cmd_probe(int argc, const char **argv);
+int cmd_kmem(int argc, const char **argv);
+int cmd_lock(int argc, const char **argv);
+int cmd_kvm(int argc, const char **argv);
+int cmd_test(int argc, const char **argv);
+int cmd_trace(int argc, const char **argv);
+int cmd_inject(int argc, const char **argv);
+int cmd_mem(int argc, const char **argv);
+int cmd_data(int argc, const char **argv);
+int cmd_ftrace(int argc, const char **argv);

int find_scripts(char **scripts_array, char **scripts_path_array);
#endif
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 6d5479e03e0d..4b283d18e158 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -34,7 +34,7 @@ const char *input_name;

struct cmd_struct {
const char *cmd;
- int (*fn)(int, const char **, const char *);
+ int (*fn)(int, const char **);
int option;
};

@@ -339,13 +339,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
{
int status;
struct stat st;
- const char *prefix;
char sbuf[STRERR_BUFSIZE];

- prefix = NULL;
- if (p->option & RUN_SETUP)
- prefix = NULL; /* setup_perf_directory(); */
-
if (use_browser == -1)
use_browser = check_browser_config(p->cmd);

@@ -356,7 +351,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
commit_pager_choice();

perf_env__set_cmdline(&perf_env, argc, argv);
- status = p->fn(argc, argv, prefix);
+ status = p->fn(argc, argv);
perf_config__exit();
exit_browser(status);
perf_env__exit(&perf_env);
@@ -566,7 +561,7 @@ int main(int argc, const char **argv)
#ifdef HAVE_LIBAUDIT_SUPPORT
setup_path();
argv[0] = "trace";
- return cmd_trace(argc, argv, NULL);
+ return cmd_trace(argc, argv);
#else
fprintf(stderr,
"trace command not available: missing audit-libs devel package at build time.\n");
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 86822969e8a8..e6d7876c94c2 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -464,7 +464,7 @@ static int perf_test__list(int argc, const char **argv)
return 0;
}

-int cmd_test(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_test(int argc, const char **argv)
{
const char *test_usage[] = {
"perf test [<options>] [{list <test-name-fragment>|[<test-name-fragments>|<test-numbers>]}]",
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:06 PM3/27/17
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit e3a6a62400520452fe39740dca90a1d0b94b8f92:

Merge tag 'perf-core-for-mingo-4.12-20170324' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-24 19:37:40 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170327

for you to fetch changes up to 55f77128e7652e537d6c226d5b56821cdb5c22de:

perf utils: Readlink /proc/self/exe to find the perf binary (2017-03-27 15:37:54 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Handle inline functions in callchains (Jin Yao)

- Enable sorting by srcline as key (Milian Wolff)

Fixes:

- Fix no_size logic in addr_filter__resolve_kernel_syms() in the
auxtrace code (Adrian Hunter)

- Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)

- Fix divide by zero when calculating percent for an event in a group in
the annotate by source line code (Taeung Song)

- build-id files are no longer symlinks, their parent directories
are, so readlink the latter (Taeung Song)

- Assorted fixes for null termination problems, mostly related to
readlink, detected by valgrind (Tommi Rantala)

Infrastructure:

- Make vfs_getname probe point logic in 'perf trace' more robust
wrt length of pathname (Arnaldo Carvalho de Melo)

- Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)

- Show 'perf list sdt' option in man page (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Adrian Hunter (1):
perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms()

Arnaldo Carvalho de Melo (4):
perf trace: Check for vfs_getname.pathname length
perf trace: Fix up error path indentation
perf trace: Fixup thread refcounting
perf tools: Remove unused 'prefix' from builtin functions

Jin Yao (5):
perf report: Refactor common code in srcline.c
perf report: Find the inline stack for a given address
perf report: Introduce --inline option
perf report: Show inline stack for stdio mode
perf report: Show inline stack for browser mode

Milian Wolff (1):
perf report: Enable sorting by srcline as key

Ravi Bangoria (1):
perf list sdt: Show option in man page

Taeung Song (2):
perf annotate: Fix a bug following symbolic link of a build-id file
perf annotate: Fix a bug of division by zero when calculating percent

Tommi Rantala (6):
perf buildid: Do not update SDT cache with null filename
perf buildid: Do not assume that readlink() returns a null terminated string
perf tests: Do not assume that readlink() returns a null terminated string
perf utils: use sizeof(buf) - 1 in readlink() call
perf utils: Null terminate buf in read_ftrace_printk()
perf utils: Readlink /proc/self/exe to find the perf binary

tools/perf/Documentation/perf-list.txt | 4 +-
tools/perf/Documentation/perf-report.txt | 5 +
tools/perf/bench/bench.h | 20 +--
tools/perf/bench/futex-hash.c | 3 +-
tools/perf/bench/futex-lock-pi.c | 3 +-
tools/perf/bench/futex-requeue.c | 3 +-
tools/perf/bench/futex-wake-parallel.c | 3 +-
tools/perf/bench/futex-wake.c | 3 +-
tools/perf/bench/mem-functions.c | 4 +-
tools/perf/bench/numa.c | 2 +-
tools/perf/bench/sched-messaging.c | 3 +-
tools/perf/bench/sched-pipe.c | 2 +-
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-bench.c | 12 +-
tools/perf/builtin-buildid-cache.c | 3 +-
tools/perf/builtin-buildid-list.c | 3 +-
tools/perf/builtin-c2c.c | 4 +-
tools/perf/builtin-config.c | 2 +-
tools/perf/builtin-data.c | 9 +-
tools/perf/builtin-diff.c | 2 +-
tools/perf/builtin-evlist.c | 2 +-
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kallsyms.c | 2 +-
tools/perf/builtin-kmem.c | 4 +-
tools/perf/builtin-kvm.c | 16 +-
tools/perf/builtin-list.c | 2 +-
tools/perf/builtin-lock.c | 6 +-
tools/perf/builtin-mem.c | 6 +-
tools/perf/builtin-probe.c | 6 +-
tools/perf/builtin-record.c | 2 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-sched.c | 6 +-
tools/perf/builtin-script.c | 4 +-
tools/perf/builtin-stat.c | 2 +-
tools/perf/builtin-timechart.c | 7 +-
tools/perf/builtin-top.c | 2 +-
tools/perf/builtin-trace.c | 25 ++--
tools/perf/builtin-version.c | 3 +-
tools/perf/builtin.h | 58 ++++----
tools/perf/perf.c | 11 +-
tools/perf/tests/builtin-test.c | 2 +-
tools/perf/tests/sdt.c | 2 +-
tools/perf/ui/browsers/hists.c | 181 ++++++++++++++++++++++-
tools/perf/ui/stdio/hist.c | 86 ++++++++++-
tools/perf/util/annotate.c | 23 ++-
tools/perf/util/auxtrace.c | 4 +-
tools/perf/util/build-id.c | 8 +-
tools/perf/util/callchain.c | 52 ++++++-
tools/perf/util/callchain.h | 3 +-
tools/perf/util/header.c | 8 +-
tools/perf/util/hist.c | 5 +
tools/perf/util/map.c | 3 +-
tools/perf/util/sort.c | 16 +-
tools/perf/util/sort.h | 1 +
tools/perf/util/srcline.c | 246 +++++++++++++++++++++++++++----
tools/perf/util/symbol-elf.c | 5 +
tools/perf/util/symbol-minimal.c | 7 +
tools/perf/util/symbol.h | 5 +-
tools/perf/util/trace-event-read.c | 4 +-
tools/perf/util/util.h | 20 ++-
62 files changed, 739 insertions(+), 208 deletions(-)
11 debian:9: Ok
12 debian:experimental: Ok
13 debian:experimental-x-arm64: Ok
14 debian:experimental-x-mips: Ok
15 debian:experimental-x-mips64: Ok
16 debian:experimental-x-mipsel: Ok
17 fedora:20: Ok
18 fedora:21: Ok
19 fedora:22: Ok
20 fedora:23: Ok
21 fedora:24: Ok
22 fedora:24-x-ARC-uClibc: Ok
23 fedora:25: Ok
24 fedora:rawhide: Ok
25 mageia:5: Ok
26 opensuse:13.2: Ok
27 opensuse:42.1: Ok
28 opensuse:tumbleweed: Ok
29 ubuntu:12.04.5: Ok
30 ubuntu:14.04.4: Ok
31 ubuntu:14.04.4-x-linaro-arm64: Ok
32 ubuntu:15.10: Ok
33 ubuntu:16.04: Ok
34 ubuntu:16.04-x-arm: Ok
35 ubuntu:16.04-x-arm64: Ok
36 ubuntu:16.04-x-powerpc: Ok
37 ubuntu:16.04-x-powerpc64: Ok
38 ubuntu:16.04-x-s390: Ok
39 ubuntu:16.10: Ok
40 ubuntu:17.04: Ok
#
# uname -a
Linux jouet 4.11.0-rc2+ #5 SMP Mon Mar 20 18:12:29 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
# This kernel lacks the fix by peterz for 'perf test tsc'
56: Convert perf time to TSC : FAILED!
57: DWARF unwind : Ok
58: x86 instruction decoder - new instructions : Ok
59: Intel cqm nmi context read : Skip
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_clean_all_O: make clean all
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_O: make install
make_no_libaudit_O: make NO_LIBAUDIT=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_tags_O: make tags
make_no_slang_O: make NO_SLANG=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_pure_O: make
make_install_prefix_O: make install prefix=/tmp/krava
make_no_demangle_O: make NO_DEMANGLE=1
make_doc_O: make doc
make_no_gtk2_O: make NO_GTK2=1
make_help_O: make help
make_static_O: make LDFLAGS=-static
make_no_libunwind_O: make NO_LIBUNWIND=1
make_util_map_o_O: make util/map.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libelf_O: make NO_LIBELF=1
make_debug_O: make DEBUG=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_newt_O: make NO_NEWT=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_bin_O: make install-bin

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:07 PM3/27/17
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

The filename length shouldn't be zero, but if the 'perf probe' on
getname_flags() (or wherever we may need to probe in the future to
catch the pathname for syscalls like 'open' being copied from userspace
to the kernel) is misplaced somehow, we end up not allocating space and
then trying to copy the empty string to ttrace->filename.name, causing
a segfault. Fix it.
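
A minimal sketch of the failure mode and the guard, assuming a
simplified buffer type: struct name_buf and name_buf__set() are
hypothetical stand-ins for the ttrace->filename fields and the copy
logic in trace__vfs_getname(), not the actual perf code. With a zero
length the "buffer too small" test is false, so nothing is allocated,
and the copy would then write through a NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ttrace->filename fields used in the patch. */
struct name_buf {
	char   *name;     /* heap buffer, NULL until first use */
	size_t  namelen;  /* longest name the buffer can hold, excluding the NUL */
};

/*
 * Copy filename into nb, growing the buffer when needed.
 * Returns 0 on success (including the "nothing to copy" case) and
 * -1 on allocation failure.
 */
int name_buf__set(struct name_buf *nb, const char *filename)
{
	size_t len;

	assert(nb != NULL && filename != NULL);

	len = strlen(filename);
	if (len == 0)
		return 0;	/* the guard: never copy into a buffer that may be NULL */

	if (nb->namelen < len) {
		char *f = realloc(nb->name, len + 1);

		if (f == NULL)
			return -1;
		nb->namelen = len;
		nb->name = f;
	}

	strcpy(nb->name, filename);
	return 0;
}
```

Without the len == 0 check, a fresh name_buf (name == NULL,
namelen == 0) would skip the realloc() and hand NULL to strcpy().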

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-c4f1t6sx1n...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-trace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 912fedc5b42d..33c657c15d5e 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1656,6 +1656,8 @@ static int trace__vfs_getname(struct trace *trace, struct perf_evsel *evsel,
goto out;

filename_len = strlen(filename);
+ if (filename_len == 0)
+ goto out;

if (ttrace->filename.namelen < filename_len) {
char *f = realloc(ttrace->filename.name, filename_len + 1);
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:07 PM3/27/17
From: Taeung Song <treeze...@gmail.com>

Currently perf-annotate with --print-line can print
-nan(0x8000000000000) because of a division by zero when calculating
the percentage. The division by zero happens when the sum of samples
is zero in symbol__get_source_line(), so fix it.

For example:

After running 'perf record' like below,

$ perf record -e "{cycles,page-faults,branch-misses}" ./a.out

Before:

$ perf annotate --stdio -l

Sorted summary for file /home/taeung/workspace/a.out
----------------------------------------------

32.89 -nan 7.04 a.c:38
25.14 -nan 0.00 a.c:34
16.26 -nan 56.34 a.c:31
15.88 -nan 1.41 a.c:37
5.67 -nan 0.00 a.c:39
1.13 -nan 35.21 a.c:26
0.95 -nan 0.00 a.c:44
0.57 -nan 0.00 a.c:32
Percent | Source code & Disassembly of a.out for cycles (529 samples)
-----------------------------------------------------------------------------------------
:
...

a.c:26 0.57 -nan 4.23 : 40081a: mov %edi,-0x24(%rbp)
a.c:26 0.00 -nan 9.86 : 40081d: mov %rsi,-0x30(%rbp)

...

With this fix, if the sum of samples for an event is zero (e.g.
'page-faults'), the percent calculation is skipped.

After:

$ perf annotate --stdio -l

Sorted summary for file /home/taeung/workspace/a.out
----------------------------------------------

32.89 0.00 7.04 a.c:38
25.14 0.00 0.00 a.c:34
16.26 0.00 56.34 a.c:31
15.88 0.00 1.41 a.c:37
5.67 0.00 0.00 a.c:39
1.13 0.00 35.21 a.c:26
0.95 0.00 0.00 a.c:44
0.57 0.00 0.00 a.c:32
Percent | Source code & Disassembly of old for cycles (529 samples)
-----------------------------------------------------------------------------------------
:
...

a.c:26 0.57 0.00 4.23 : 40081a: mov %edi,-0x24(%rbp)
a.c:26 0.00 0.00 9.86 : 40081d: mov %rsi,-0x30(%rbp)

...
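
The guard can be sketched in isolation as below. sample_percent() is
a hypothetical helper written for illustration only; in the patch the
same check is done inline in symbol__get_source_line(). The point is
that the percentage defaults to 0.0 and is only computed when the
per-event sample sum is non-zero, so an event with no samples no
longer produces NaN.

```c
/*
 * Hypothetical helper, not part of perf: percentage of 'hits' out of
 * 'sum', defined as 0.0 when there are no samples at all.
 */
double sample_percent(unsigned long long hits, unsigned long long sum)
{
	double percent = 0.0;

	if (sum)	/* skip the division entirely when the sum is zero */
		percent = 100.0 * hits / sum;

	return percent;
}
```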

Signed-off-by: Taeung Song <treeze...@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-3-git-...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 6dc9148b9b84..11af5f0d56cc 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1671,11 +1671,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
src_line->nr_pcnt = nr_pcnt;

for (k = 0; k < nr_pcnt; k++) {
+ double percent = 0.0;
+
h = annotation__histogram(notes, evidx + k);
- src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum;
+ if (h->sum)
+ percent = 100.0 * h->addr[i] / h->sum;

- if (src_line->samples[k].percent > percent_max)
- percent_max = src_line->samples[k].percent;
+ if (percent > percent_max)
+ percent_max = percent;
+ src_line->samples[k].percent = percent;
}

if (percent_max <= 0.5)
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:07 PM3/27/17
From: Jin Yao <yao...@linux.intel.com>

Factor dso__name() and filename_split() out of existing code, because
they will be used in several places in the next patch.

filename_split() may also fix a potential memory leak in the existing
code. In the existing addr2line():

sep = strchr(filename, ':');
if (sep) {
*sep++ = '\0';
*file = filename;
*line_nr = strtoul(sep, NULL, 0);
ret = 1;
}

out:
pclose(fp);
return ret;

If sep is NULL, filename is neither freed nor returned via file.
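
The ownership rule behind the fix can be sketched as follows.
split_file_line() follows the filename_split() logic from this patch,
while parse_srcline() is a hypothetical caller invented for the
example (the real caller reads the string with getline() from an
addr2line pipe). The buffer is either handed back to the caller or
freed, on every path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "file:line\n" in place. Returns 1 and stores the line number
 * on success, 0 when the input is "??:0" or has no ':' separator.
 */
int split_file_line(char *s, unsigned int *line_nr)
{
	char *sep;

	assert(s != NULL && line_nr != NULL);

	sep = strchr(s, '\n');
	if (sep)
		*sep = '\0';

	if (!strcmp(s, "??:0"))
		return 0;

	sep = strchr(s, ':');
	if (sep) {
		*sep++ = '\0';
		*line_nr = (unsigned int)strtoul(sep, NULL, 0);
		return 1;
	}

	return 0;
}

/*
 * Hypothetical caller: works on a heap copy of 'raw' and returns the
 * file name (owned by the caller) or NULL. Every failure path frees
 * the copy, including the "no separator" case that used to leak.
 */
char *parse_srcline(const char *raw, unsigned int *line_nr)
{
	size_t n = strlen(raw) + 1;
	char *copy = malloc(n);

	if (copy == NULL)
		return NULL;
	memcpy(copy, raw, n);

	if (split_file_line(copy, line_nr) != 1) {
		free(copy);
		return NULL;
	}

	return copy;
}
```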

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Tested-by: Milian Wolff <milian...@kdab.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-2-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/srcline.c | 68 +++++++++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index b4db3f48e3b0..2953c9fecb30 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -12,6 +12,24 @@

bool srcline_full_filename;

+static const char *dso__name(struct dso *dso)
+{
+ const char *dso_name;
+
+ if (dso->symsrc_filename)
+ dso_name = dso->symsrc_filename;
+ else
+ dso_name = dso->long_name;
+
+ if (dso_name[0] == '[')
+ return NULL;
+
+ if (!strncmp(dso_name, "/tmp/perf-", 10))
+ return NULL;
+
+ return dso_name;
+}
+
#ifdef HAVE_LIBBFD_SUPPORT

/*
@@ -207,6 +225,27 @@ void dso__free_a2l(struct dso *dso)

#else /* HAVE_LIBBFD_SUPPORT */

+static int filename_split(char *filename, unsigned int *line_nr)
+{
+ char *sep;
+
+ sep = strchr(filename, '\n');
+ if (sep)
+ *sep = '\0';
+
+ if (!strcmp(filename, "??:0"))
+ return 0;
+
+ sep = strchr(filename, ':');
+ if (sep) {
+ *sep++ = '\0';
+ *line_nr = strtoul(sep, NULL, 0);
+ return 1;
+ }
+
+ return 0;
+}
+
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
@@ -216,7 +255,6 @@ static int addr2line(const char *dso_name, u64 addr,
char cmd[PATH_MAX];
char *filename = NULL;
size_t len;
- char *sep;
int ret = 0;

scnprintf(cmd, sizeof(cmd), "addr2line -e %s %016"PRIx64,
@@ -233,23 +271,14 @@ static int addr2line(const char *dso_name, u64 addr,
goto out;
}

- sep = strchr(filename, '\n');
- if (sep)
- *sep = '\0';
-
- if (!strcmp(filename, "??:0")) {
- pr_debug("no debugging info in %s\n", dso_name);
+ ret = filename_split(filename, line_nr);
+ if (ret != 1) {
free(filename);
goto out;
}

- sep = strchr(filename, ':');
- if (sep) {
- *sep++ = '\0';
- *file = filename;
- *line_nr = strtoul(sep, NULL, 0);
- ret = 1;
- }
+ *file = filename;
+
out:
pclose(fp);
return ret;
@@ -278,15 +307,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (!dso->has_srcline)
goto out;

- if (dso->symsrc_filename)
- dso_name = dso->symsrc_filename;
- else
- dso_name = dso->long_name;
-
- if (dso_name[0] == '[')
- goto out;
-
- if (!strncmp(dso_name, "/tmp/perf-", 10))
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
goto out;

if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:08 PM3/27/17
From: Tommi Rantala <tommi.t...@nokia.com>

Ensure that the string that we read from the data file is null terminated.

Valgrind was complaining:

==31357== Invalid read of size 1
==31357== at 0x4EC8C1: __strtok_r_1c (string2.h:200)
==31357== by 0x4EC8C1: parse_ftrace_printk (trace-event-parse.c:161)
==31357== by 0x4F82A8: read_ftrace_printk (trace-event-read.c:204)
==31357== by 0x4F82A8: trace_report (trace-event-read.c:468)
==31357== by 0x4CD552: process_tracing_data (header.c:1576)
==31357== by 0x4D3397: perf_file_section__process (header.c:2705)
==31357== by 0x4D3397: perf_header__process_sections (header.c:2488)
==31357== by 0x4D3397: perf_session__read_header (header.c:2925)
==31357== by 0x4E71E2: perf_session__open (session.c:32)
==31357== by 0x4E71E2: perf_session__new (session.c:139)
==31357== by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
==31357== by 0x497150: run_builtin (perf.c:359)
==31357== by 0x428CE0: handle_internal_command (perf.c:421)
==31357== by 0x428CE0: run_argv (perf.c:467)
==31357== by 0x428CE0: main (perf.c:614)
==31357== Address 0x8ac0efb is 0 bytes after a block of size 1,963 alloc'd
==31357== at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==31357== by 0x4F827B: read_ftrace_printk (trace-event-read.c:195)
==31357== by 0x4F827B: trace_report (trace-event-read.c:468)
==31357== by 0x4CD552: process_tracing_data (header.c:1576)
==31357== by 0x4D3397: perf_file_section__process (header.c:2705)
==31357== by 0x4D3397: perf_header__process_sections (header.c:2488)
==31357== by 0x4D3397: perf_session__read_header (header.c:2925)
==31357== by 0x4E71E2: perf_session__open (session.c:32)
==31357== by 0x4E71E2: perf_session__new (session.c:139)
==31357== by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
==31357== by 0x497150: run_builtin (perf.c:359)
==31357== by 0x428CE0: handle_internal_command (perf.c:421)
==31357== by 0x428CE0: run_argv (perf.c:467)
==31357== by 0x428CE0: main (perf.c:614)
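
The pattern of the fix, sketched outside of perf: allocate one byte
more than the payload and terminate it before passing the buffer to
string functions. dup_terminated() is a hypothetical helper, and the
memcpy() merely stands in for the read from the data file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy 'size' raw bytes that carry no terminator
 * into a fresh buffer of size + 1 bytes and null terminate it.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *dup_terminated(const char *data, size_t size)
{
	char *buf;

	assert(data != NULL);

	buf = malloc(size + 1);		/* room for the payload plus the NUL */
	if (buf == NULL)
		return NULL;

	memcpy(buf, data, size);	/* stands in for reading from the file */
	buf[size] = '\0';		/* string functions now stop inside the block */

	return buf;
}
```

With malloc(size) and no terminator, a tokenizer such as strtok_r()
keeps scanning past the end of the block, which is the invalid read
reported in the Valgrind trace above.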

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/trace-event-read.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c
index 27420159bf69..8a9a677f7576 100644
--- a/tools/perf/util/trace-event-read.c
+++ b/tools/perf/util/trace-event-read.c
@@ -192,7 +192,7 @@ static int read_ftrace_printk(struct pevent *pevent)
if (!size)
return 0;

- buf = malloc(size);
+ buf = malloc(size + 1);
if (buf == NULL)
return -1;

@@ -201,6 +201,8 @@ static int read_ftrace_printk(struct pevent *pevent)
return -1;
}

+ buf[size] = '\0';
+
parse_ftrace_printk(pevent, buf, size);

free(buf);
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:08 PM3/27/17
From: Jin Yao <yao...@linux.intel.com>

Looking up the inline stack for callgraph addresses takes some time, so
provide a new "--inline" option to let the user decide whether to
enable this feature.

--inline:

If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name or file/line.

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Tested-by: Milian Wolff <milian...@kdab.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-4-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 4 ++++
tools/perf/builtin-report.c | 2 ++
tools/perf/util/symbol.h | 3 ++-
3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index e9a61f5485eb..248bba434b53 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -430,6 +430,10 @@ include::itrace.txt[]
--hierarchy::
Enable hierarchical output.

+--inline::
+ If a callgraph address belongs to an inlined function, the inline stack
+ will be printed. Each entry is function name or file/line.
+
include::callchain-overhead-calculation.txt[]

SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3c8885a1c452..c18158b83eb1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -845,6 +845,8 @@ int cmd_report(int argc, const char **argv)
stdio__config_color, "always"),
OPT_STRING(0, "time", &report.time_str, "str",
"Time span of interest (start,stop)"),
+ OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
+ "Show inline function"),
OPT_END()
};
struct perf_data_file file = {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index e36213ccfcf7..5245d2fb1a0a 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -118,7 +118,8 @@ struct symbol_conf {
show_ref_callgraph,
hide_unresolved,
raw_trace,
- report_hierarchy;
+ report_hierarchy,
+ inline_name;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:08 PM3/27/17
From: Jin Yao <yao...@linux.intel.com>

It would be useful for perf to support a mode to query the inline stack
for a given callgraph address. This would simplify finding the right
code in code that does a lot of inlining.

srcline.c already contains the code that translates an address to
filename:line_nr. This patch extends that code so that it can also
retrieve the inline stack.

It introduces inline_list, which stores the result for each inlined
function (filename:line_nr and funcname).

If the BFD library is not supported, the result is only
filename:line_nr.
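
A rough sketch of the data flow, under stated assumptions: the patch
itself uses the kernel's list.h helpers and struct inline_list, while
the singly linked struct frame and the frame_list__*() functions below
are hypothetical simplifications. Entries are appended in the order
the lookup reports them, and the list is reversed once when the
requested callchain order is not callee-first.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct inline_list. */
struct frame {
	const char   *funcname;
	unsigned int  line_nr;
	struct frame *next;
};

struct frame_list {
	struct frame *head, *tail;
};

/* Append at the tail. Returns 0 on success, -1 on allocation failure. */
int frame_list__append(struct frame_list *l, const char *funcname,
		       unsigned int line_nr)
{
	struct frame *f;

	assert(l != NULL);

	f = calloc(1, sizeof(*f));
	if (f == NULL)
		return -1;

	f->funcname = funcname;
	f->line_nr = line_nr;

	if (l->tail)
		l->tail->next = f;
	else
		l->head = f;
	l->tail = f;

	return 0;
}

/* Reverse in place, e.g. to switch from callee-first to caller-first. */
void frame_list__reverse(struct frame_list *l)
{
	struct frame *prev = NULL, *cur = l->head;

	l->tail = l->head;
	while (cur) {
		struct frame *next = cur->next;

		cur->next = prev;
		prev = cur;
		cur = next;
	}
	l->head = prev;
}

/* Free every entry; the names are not owned by the list in this sketch. */
void frame_list__delete(struct frame_list *l)
{
	struct frame *cur = l->head;

	while (cur) {
		struct frame *next = cur->next;

		free(cur);
		cur = next;
	}
	l->head = l->tail = NULL;
}
```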

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Tested-by: Milian Wolff <milian...@kdab.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-3-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/srcline.c | 167 +++++++++++++++++++++++++++++++++++++--
tools/perf/util/symbol-elf.c | 5 ++
tools/perf/util/symbol-minimal.c | 7 ++
tools/perf/util/symbol.h | 2 +
tools/perf/util/util.h | 16 ++++
5 files changed, 192 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 2953c9fecb30..3ce28f702b36 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -7,6 +7,7 @@
#include "util/dso.h"
#include "util/util.h"
#include "util/debug.h"
+#include "util/callchain.h"

#include "symbol.h"

@@ -30,6 +31,34 @@ static const char *dso__name(struct dso *dso)
return dso_name;
}

+static int inline_list__append(char *filename, char *funcname, int line_nr,
+ struct inline_node *node, struct dso *dso)
+{
+ struct inline_list *ilist;
+ char *demangled;
+
+ ilist = zalloc(sizeof(*ilist));
+ if (ilist == NULL)
+ return -1;
+
+ ilist->filename = filename;
+ ilist->line_nr = line_nr;
+
+ if (dso != NULL) {
+ demangled = dso__demangle_sym(dso, 0, funcname);
+ if (demangled == NULL) {
+ ilist->funcname = funcname;
+ } else {
+ ilist->funcname = demangled;
+ free(funcname);
+ }
+ }
+
+ list_add_tail(&ilist->list, &node->val);
+
+ return 0;
+}
+
#ifdef HAVE_LIBBFD_SUPPORT

/*
@@ -169,9 +198,17 @@ static void addr2line_cleanup(struct a2l_data *a2l)

#define MAX_INLINE_NEST 1024

+static void inline_list__reverse(struct inline_node *node)
+{
+ struct inline_list *ilist, *n;
+
+ list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
+ list_move_tail(&ilist->list, &node->val);
+}
+
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line, struct dso *dso,
- bool unwind_inlines)
+ bool unwind_inlines, struct inline_node *node)
{
int ret = 0;
struct a2l_data *a2l = dso->a2l;
@@ -196,8 +233,21 @@ static int addr2line(const char *dso_name, u64 addr,

while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
&a2l->funcname, &a2l->line) &&
- cnt++ < MAX_INLINE_NEST)
- ;
+ cnt++ < MAX_INLINE_NEST) {
+
+ if (node != NULL) {
+ if (inline_list__append(strdup(a2l->filename),
+ strdup(a2l->funcname),
+ a2l->line, node,
+ dso) != 0)
+ return 0;
+ }
+ }
+
+ if ((node != NULL) &&
+ (callchain_param.order != ORDER_CALLEE)) {
+ inline_list__reverse(node);
+ }
}

if (a2l->found && a2l->filename) {
@@ -223,6 +273,35 @@ void dso__free_a2l(struct dso *dso)
dso->a2l = NULL;
}

+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso)
+{
+ char *file = NULL;
+ unsigned int line = 0;
+ struct inline_node *node;
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ return NULL;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+ goto out_free_inline_node;
+
+ if (list_empty(&node->val))
+ goto out_free_inline_node;
+
+ return node;
+
+out_free_inline_node:
+ inline_node__delete(node);
+ return NULL;
+}
+
#else /* HAVE_LIBBFD_SUPPORT */

static int filename_split(char *filename, unsigned int *line_nr)
@@ -249,7 +328,8 @@ static int filename_split(char *filename, unsigned int *line_nr)
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
- bool unwind_inlines __maybe_unused)
+ bool unwind_inlines __maybe_unused,
+ struct inline_node *node __maybe_unused)
{
FILE *fp;
char cmd[PATH_MAX];
@@ -288,6 +368,58 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
{
}

+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso __maybe_unused)
+{
+ FILE *fp;
+ char cmd[PATH_MAX];
+ struct inline_node *node;
+ char *filename = NULL;
+ size_t len;
+ unsigned int line_nr = 0;
+
+ scnprintf(cmd, sizeof(cmd), "addr2line -e %s -i %016"PRIx64,
+ dso_name, addr);
+
+ fp = popen(cmd, "r");
+ if (fp == NULL) {
+ pr_err("popen failed for %s\n", dso_name);
+ return NULL;
+ }
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ while (getline(&filename, &len, fp) != -1) {
+ if (filename_split(filename, &line_nr) != 1) {
+ free(filename);
+ goto out;
+ }
+
+ if (inline_list__append(filename, NULL, line_nr, node,
+ NULL) != 0)
+ goto out;
+
+ filename = NULL;
+ }
+
+out:
+ pclose(fp);
+
+ if (list_empty(&node->val)) {
+ inline_node__delete(node);
+ return NULL;
+ }
+
+ return node;
+}
+
#endif /* HAVE_LIBBFD_SUPPORT */

/*
@@ -311,7 +443,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (dso_name == NULL)
goto out;

- if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
+ if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
goto out;

if (asprintf(&srcline, "%s:%u",
@@ -351,3 +483,28 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
{
return __get_srcline(dso, addr, sym, show_sym, false);
}
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
+{
+ const char *dso_name;
+
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
+ return NULL;
+
+ return addr2inlines(dso_name, addr, dso);
+}
+
+void inline_node__delete(struct inline_node *node)
+{
+ struct inline_list *ilist, *tmp;
+
+ list_for_each_entry_safe(ilist, tmp, &node->val, list) {
+ list_del_init(&ilist->list);
+ zfree(&ilist->filename);
+ zfree(&ilist->funcname);
+ free(ilist);
+ }
+
+ free(node);
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 0e660dba58ad..d1a40bb642ff 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -390,6 +390,11 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss, struct map *
return 0;
}

+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
+{
+ return demangle_sym(dso, kmodule, elf_name);
+}
+
/*
* Align offset to 4 bytes as needed for note name and descriptor data.
*/
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index 11cdde980545..870ef0f0659c 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -373,3 +373,10 @@ int kcore_copy(const char *from_dir __maybe_unused,
void symbol__elf_init(void)
{
}
+
+char *dso__demangle_sym(struct dso *dso __maybe_unused,
+ int kmodule __maybe_unused,
+ char *elf_name __maybe_unused)
+{
+ return NULL;
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 9222c7e702f3..e36213ccfcf7 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -305,6 +305,8 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
struct map *map);

+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
+
void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
void symbols__insert(struct rb_root *symbols, struct symbol *sym);
void symbols__fixup_duplicate(struct rb_root *symbols);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index b2cfa47990dc..cc0700d6fef0 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -364,4 +364,20 @@ int is_printable_array(char *p, unsigned int len);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);

int unit_number__scnprintf(char *buf, size_t size, u64 n);
+
+struct inline_list {
+ char *filename;
+ char *funcname;
+ unsigned int line_nr;
+ struct list_head list;
+};
+
+struct inline_node {
+ u64 addr;
+ struct list_head val;
+};
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
+void inline_node__delete(struct inline_node *node);
+
#endif /* GIT_COMPAT_UTIL_H */
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:09 PM3/27/17
to
From: Jin Yao <yao...@linux.intel.com>

If the address belongs to an inlined function, print the source
information for each inlined frame, back to the first non-inlined function.

For example:

1. Show inlined function name
perf report -g function --inline

- 0.69% 0.00% inline ld-2.23.so [.] dl_main
- dl_main
0.56% _dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)

2. Show the file/line information
perf report -g address --inline

- 0.69% 0.00% inline ld-2.23.so [.] _dl_start
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
+ _dl_sysdep_start dl-sysdep.c:250

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Tested-by: Milian Wolff <milian...@kdab.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-6-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/browsers/hists.c | 180 +++++++++++++++++++++++++++++++++++++++--
tools/perf/util/hist.c | 5 ++
tools/perf/util/sort.h | 1 +
3 files changed, 178 insertions(+), 8 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 2dc82bec10c0..62ecaebf2520 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -144,9 +144,60 @@ static void callchain_list__set_folding(struct callchain_list *cl, bool unfold)
cl->unfolded = unfold ? cl->has_children : false;
}

+static struct inline_node *inline_node__create(struct map *map, u64 ip)
+{
+ struct dso *dso;
+ struct inline_node *node;
+
+ if (map == NULL)
+ return NULL;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return NULL;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return NULL;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+
+ return node;
+}
+
+static int inline__count_rows(struct inline_node *node)
+{
+ struct inline_list *ilist;
+ int i = 0;
+
+ if (node == NULL)
+ return 0;
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL))
+ i++;
+ }
+
+ return i;
+}
+
+static int callchain_list__inline_rows(struct callchain_list *chain)
+{
+ struct inline_node *node;
+ int rows;
+
+ node = inline_node__create(chain->ms.map, chain->ip);
+ if (node == NULL)
+ return 0;
+
+ rows = inline__count_rows(node);
+ inline_node__delete(node);
+ return rows;
+}
+
static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
{
- int n = 0;
+ int n = 0, inline_rows;
struct rb_node *nd;

for (nd = rb_first(&node->rb_root); nd; nd = rb_next(nd)) {
@@ -156,6 +207,13 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)

list_for_each_entry(chain, &child->val, list) {
++n;
+
+ if (symbol_conf.inline_name) {
+ inline_rows =
+ callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
/* We need this because we may not have children */
folded_sign = callchain_list__folded(chain);
if (folded_sign == '+')
@@ -207,7 +265,7 @@ static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
bool unfolded = false;
- int n = 0;
+ int n = 0, inline_rows;

if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
@@ -216,6 +274,11 @@ static int callchain_node__count_rows(struct callchain_node *node)

list_for_each_entry(chain, &node->val, list) {
++n;
+ if (symbol_conf.inline_name) {
+ inline_rows = callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
unfolded = chain->unfolded;
}

@@ -362,6 +425,19 @@ static void hist_entry__init_have_children(struct hist_entry *he)
he->init_have_children = true;
}

+static void hist_entry_init_inline_node(struct hist_entry *he)
+{
+ if (he->inline_node)
+ return;
+
+ he->inline_node = inline_node__create(he->ms.map, he->ip);
+
+ if (he->inline_node == NULL)
+ return;
+
+ he->has_children = true;
+}
+
static bool hist_browser__toggle_fold(struct hist_browser *browser)
{
struct hist_entry *he = browser->he_selection;
@@ -393,7 +469,12 @@ static bool hist_browser__toggle_fold(struct hist_browser *browser)

if (he->unfolded) {
if (he->leaf)
- he->nr_rows = callchain__count_rows(&he->sorted_chain);
+ if (he->inline_node)
+ he->nr_rows = inline__count_rows(
+ he->inline_node);
+ else
+ he->nr_rows = callchain__count_rows(
+ &he->sorted_chain);
else
he->nr_rows = hierarchy_count_rows(browser, he, false);

@@ -753,6 +834,70 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u

#define LEVEL_OFFSET_STEP 3

+static int hist_browser__show_inline(struct hist_browser *browser,
+ struct inline_node *node,
+ unsigned short row,
+ int offset)
+{
+ struct inline_list *ilist;
+ char buf[1024];
+ int color, width, first_row;
+
+ first_row = row;
+ width = browser->b.width - (LEVEL_OFFSET_STEP + 2);
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
+ color = HE_COLORSET_NORMAL;
+ if (ui_browser__is_current_entry(&browser->b, row))
+ color = HE_COLORSET_SELECTED;
+
+ if (callchain_param.key == CCKEY_ADDRESS) {
+ if (ilist->filename != NULL)
+ scnprintf(buf, sizeof(buf),
+ "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ else
+ scnprintf(buf, sizeof(buf), "??");
+ } else if (ilist->funcname != NULL)
+ scnprintf(buf, sizeof(buf), "%s (inline)",
+ ilist->funcname);
+ else if (ilist->filename != NULL)
+ scnprintf(buf, sizeof(buf),
+ "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ else
+ scnprintf(buf, sizeof(buf), "??");
+
+ ui_browser__set_color(&browser->b, color);
+ hist_browser__gotorc(browser, row, 0);
+ ui_browser__write_nstring(&browser->b, " ",
+ LEVEL_OFFSET_STEP + offset);
+ ui_browser__write_nstring(&browser->b, buf, width);
+ row++;
+ }
+ }
+
+ return row - first_row;
+}
+
+static size_t show_inline_list(struct hist_browser *browser, struct map *map,
+ u64 ip, int row, int offset)
+{
+ struct inline_node *node;
+ int ret;
+
+ node = inline_node__create(map, ip);
+ if (node == NULL)
+ return 0;
+
+ ret = hist_browser__show_inline(browser, node, row, offset);
+
+ inline_node__delete(node);
+ return ret;
+}
+
static int hist_browser__show_callchain_list(struct hist_browser *browser,
struct callchain_node *node,
struct callchain_list *chain,
@@ -764,6 +909,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
char bf[1024], *alloc_str;
char buf[64], *alloc_str2;
const char *str;
+ int inline_rows = 0, ret = 1;

if (arg->row_offset != 0) {
arg->row_offset--;
@@ -801,10 +947,15 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
}

print(browser, chain, str, offset, row, arg);
-
free(alloc_str);
free(alloc_str2);
- return 1;
+
+ if (symbol_conf.inline_name) {
+ inline_rows = show_inline_list(browser, chain->ms.map,
+ chain->ip, row + 1, offset);
+ }
+
+ return ret + inline_rows;
}

static bool check_percent_display(struct rb_node *node, u64 parent_total)
@@ -1228,6 +1379,12 @@ static int hist_browser__show_entry(struct hist_browser *browser,
folded_sign = hist_entry__folded(entry);
}

+ if (symbol_conf.inline_name &&
+ (!entry->has_children)) {
+ hist_entry_init_inline_node(entry);
+ folded_sign = hist_entry__folded(entry);
+ }
+
if (row_offset == 0) {
struct hpp_arg arg = {
.b = &browser->b,
@@ -1259,7 +1416,8 @@ static int hist_browser__show_entry(struct hist_browser *browser,
}

if (first) {
- if (symbol_conf.use_callchain) {
+ if (symbol_conf.use_callchain ||
+ symbol_conf.inline_name) {
ui_browser__printf(&browser->b, "%c ", folded_sign);
width -= 2;
}
@@ -1301,8 +1459,14 @@ static int hist_browser__show_entry(struct hist_browser *browser,
.is_current_entry = current_entry,
};

- printed += hist_browser__show_callchain(browser, entry, 1, row,
- hist_browser__show_callchain_entry, &arg,
+ if (entry->inline_node)
+ printed += hist_browser__show_inline(browser,
+ entry->inline_node, row, 0);
+ else
+ printed += hist_browser__show_callchain(browser,
+ entry, 1, row,
+ hist_browser__show_callchain_entry,
+ &arg,
hist_browser__check_output_full);
}

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index e3b38f629504..3c4d4d00cb2c 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1136,6 +1136,11 @@ void hist_entry__delete(struct hist_entry *he)
zfree(&he->mem_info);
}

+ if (he->inline_node) {
+ inline_node__delete(he->inline_node);
+ he->inline_node = NULL;
+ }
+
zfree(&he->stat_acc);
free_srcline(he->srcline);
if (he->srcfile && he->srcfile[0])
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index baf20a399f34..e35fb186d048 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -128,6 +128,7 @@ struct hist_entry {
};
char *srcline;
char *srcfile;
+ struct inline_node *inline_node;
struct symbol *parent;
struct branch_info *branch_info;
struct hists *hists;
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM3/27/17
to
From: Tommi Rantala <tommi.t...@nokia.com>

Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.
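
As a minimal sketch of the difference (the helper names self_exe_path and
pid_exe_path are hypothetical and not part of perf), both approaches below
resolve the running binary on a Linux system with /proc mounted. The
/proc/self/exe variant needs no scratch buffer and no getpid() call:

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of perf): resolve the path of the running
 * binary via /proc/self/exe. Returns the path length, or -1 on failure.
 */
ssize_t self_exe_path(char *buf, size_t size)
{
	ssize_t ret;

	if (size < 2)
		return -1;

	/* readlink() does not NUL-terminate, so reserve one byte for it. */
	ret = readlink("/proc/self/exe", buf, size - 1);
	if (ret <= 0)
		return -1;

	buf[ret] = '\0';
	return ret;
}

/* The older approach: format the pid into the link name first. */
ssize_t pid_exe_path(char *buf, size_t size)
{
	char proc[32];
	ssize_t ret;

	if (size < 2)
		return -1;

	snprintf(proc, sizeof(proc), "/proc/%d/exe", (int)getpid());
	ret = readlink(proc, buf, size - 1);
	if (ret <= 0)
		return -1;

	buf[ret] = '\0';
	return ret;
}
```

Both helpers should return the same path for the calling process; the
second one only adds the intermediate link-name buffer.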

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index cf22962ce725..ef09f26e67da 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -370,15 +370,11 @@ static int write_cmdline(int fd, struct perf_header *h __maybe_unused,
struct perf_evlist *evlist __maybe_unused)
{
char buf[MAXPATHLEN];
- char proc[32];
u32 n;
int i, ret;

- /*
- * actual atual path to perf binary
- */
- sprintf(proc, "/proc/%d/exe", getpid());
- ret = readlink(proc, buf, sizeof(buf) - 1);
+ /* actual path to perf binary */
+ ret = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
if (ret <= 0)
return -1;

--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM3/27/17
to
From: Tommi Rantala <tommi.t...@nokia.com>

Valgrind was complaining:

==2633== Syscall param open(filename) points to unaddressable byte(s)
==2633== at 0x5281CC0: __open_nocancel (syscall-template.S:84)
==2633== by 0x537D38: open (fcntl2.h:53)
==2633== by 0x537D38: get_sdt_note_list (symbol-elf.c:2017)
==2633== by 0x5396FD: probe_cache__scan_sdt (probe-file.c:700)
==2633== by 0x49EA2C: build_id_cache__add_sdt_cache (build-id.c:625)
==2633== by 0x49EA2C: build_id_cache__add_s (build-id.c:697)
==2633== by 0x49EE72: build_id_cache__add_b (build-id.c:717)
==2633== by 0x49EE72: dso__cache_build_id (build-id.c:782)
==2633== by 0x49F190: __dsos__cache_build_ids (build-id.c:793)
==2633== by 0x49F190: machine__cache_build_ids (build-id.c:801)
==2633== by 0x49F190: perf_session__cache_build_ids (build-id.c:815)
==2633== by 0x4CD4F2: write_build_id (header.c:165)
==2633== by 0x4D26F7: do_write_feat (header.c:2296)
==2633== by 0x4D26F7: perf_header__adds_write (header.c:2335)
==2633== by 0x4D26F7: perf_session__write_header (header.c:2414)
==2633== by 0x43B324: __cmd_record (builtin-record.c:1154)
==2633== by 0x43B324: cmd_record (builtin-record.c:1839)
==2633== by 0x455A07: __cmd_record (builtin-kmem.c:1868)
==2633== by 0x455A07: cmd_kmem (builtin-kmem.c:1944)
==2633== by 0x497150: run_builtin (perf.c:359)
==2633== by 0x428CE0: handle_internal_command (perf.c:421)
==2633== by 0x428CE0: run_argv (perf.c:467)
==2633== by 0x428CE0: main (perf.c:614)
==2633== Address 0x0 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Tommi Rantala <tommi.t...@nokia.com>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/build-id.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index e528c40739cc..234859f756c4 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -690,7 +690,7 @@ int build_id_cache__add_s(const char *sbuild_id, const char *name,
err = 0;

/* Update SDT cache : error is just warned */
- if (build_id_cache__add_sdt_cache(sbuild_id, realname) < 0)
+ if (realname && build_id_cache__add_sdt_cache(sbuild_id, realname) < 0)
pr_debug4("Failed to update/scan SDT cache for %s\n", realname);

out_free:
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM3/27/17
to
From: Milian Wolff <milian...@kdab.com>

Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.

Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ workloads.

The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:

~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g address
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--60.31%--hypot +20
| | |
| | |--8.52%--__hypot_finite +273
| | |
| | |--7.32%--__hypot_finite +411
...
--35.34%--_start +4194346
__libc_start_main +241
|
|--6.65%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--2.70%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--1.69%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~

With this patch and `-g srcline` we instead get the following output:

~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g srcline
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--64.02%--hypot
| | |
| | --59.81%--__hypot_finite
| |
| --0.53%--cabs
|
--35.34%--_start
__libc_start_main
|
|--12.48%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
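
The grouping behind `-g srcline` comes down to comparing call chain entries
by their "file:line" string instead of by address. The following is a
stand-alone sketch of that ordering, modelled on match_chain_srcline() in the
diff below; srcline_cmp is a hypothetical name, and it takes plain strings
and addresses rather than perf's callchain structures:

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-alone comparator (not a perf function): order two call
 * chain entries by their "file:line" string, falling back to the raw
 * address when neither entry has source information.
 * Returns <0, 0 or >0, like strcmp().
 */
int srcline_cmp(const char *left, uint64_t left_ip,
		const char *right, uint64_t right_ip)
{
	/* Both entries resolved: equal strings mean "same source line". */
	if (left && right)
		return strcmp(left, right);

	/* Only one side resolved: entries with a srcline sort first. */
	if (!left && right)
		return 1;
	if (left && !right)
		return -1;

	/* Neither side resolved: fall back to comparing addresses. */
	if (left_ip == right_ip)
		return 0;
	return left_ip < right_ip ? -1 : 1;
}
```

Two entries with different addresses but the same "file:line" compare equal
here, which is why the three separate `main random.tcc:3326` entries in the
address-sorted output collapse into a single entry.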

Signed-off-by: Milian Wolff <milian...@kdab.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Yao Jin <yao...@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.90...@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/ui/browsers/hists.c | 3 +-
tools/perf/ui/stdio/hist.c | 3 +-
tools/perf/util/annotate.c | 3 +-
tools/perf/util/callchain.c | 52 +++++++++++++++++++++++++++++---
tools/perf/util/callchain.h | 3 +-
tools/perf/util/map.c | 3 +-
tools/perf/util/sort.c | 16 ++++++----
tools/perf/util/srcline.c | 11 +++++--
tools/perf/util/util.h | 4 +--
10 files changed, 78 insertions(+), 21 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 248bba434b53..37a175914157 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -235,6 +235,7 @@ OPTIONS
sort_key can be:
- function: compare on functions (default)
- address: compare on individual code addresses
+ - srcline: compare on source filename and line number

branch can be:
- branch: include last branch information in callgraph when available.
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 62ecaebf2520..da24072bb76e 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -851,7 +851,8 @@ static int hist_browser__show_inline(struct hist_browser *browser,
if (ui_browser__is_current_entry(&browser->b, row))
color = HE_COLORSET_SELECTED;

- if (callchain_param.key == CCKEY_ADDRESS) {
+ if (callchain_param.key == CCKEY_ADDRESS ||
+ callchain_param.key == CCKEY_SRCLINE) {
if (ilist->filename != NULL)
scnprintf(buf, sizeof(buf),
"%s:%d (inline)",
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 6128f485a3c5..d52d5f64ea89 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -52,7 +52,8 @@ static size_t inline__fprintf(struct map *map, u64 ip, int left_margin,
ret += fprintf(fp, " ");
}

- if (callchain_param.key == CCKEY_ADDRESS) {
+ if (callchain_param.key == CCKEY_ADDRESS ||
+ callchain_param.key == CCKEY_SRCLINE) {
if (ilist->filename != NULL)
ret += fprintf(fp, "%s:%d (inline)",
ilist->filename,
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 22cd1dbe724b..3d0263e5d1db 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1674,7 +1674,8 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
goto next;

offset = start + i;
- src_line->path = get_srcline(map->dso, offset, NULL, false);
+ src_line->path = get_srcline(map->dso, offset, NULL,
+ false, true);
insert_source_line(&tmp_root, src_line);

next:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index aba953421a03..d78776a20e80 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -80,6 +80,10 @@ static int parse_callchain_sort_key(const char *value)
callchain_param.key = CCKEY_ADDRESS;
return 0;
}
+ if (!strncmp(value, "srcline", strlen(value))) {
+ callchain_param.key = CCKEY_SRCLINE;
+ return 0;
+ }
if (!strncmp(value, "branch", strlen(value))) {
callchain_param.branch_callstack = 1;
return 0;
@@ -510,14 +514,51 @@ enum match_result {
MATCH_GT,
};

+static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
+ struct callchain_list *cnode)
+{
+ char *left = get_srcline(cnode->ms.map->dso,
+ map__rip_2objdump(cnode->ms.map, cnode->ip),
+ cnode->ms.sym, true, false);
+ char *right = get_srcline(node->map->dso,
+ map__rip_2objdump(node->map, node->ip),
+ node->sym, true, false);
+ enum match_result ret = MATCH_EQ;
+ int cmp;
+
+ if (left && right)
+ cmp = strcmp(left, right);
+ else if (!left && right)
+ cmp = 1;
+ else if (left && !right)
+ cmp = -1;
+ else if (cnode->ip == node->ip)
+ cmp = 0;
+ else
+ cmp = (cnode->ip < node->ip) ? -1 : 1;
+
+ if (cmp != 0)
+ ret = cmp < 0 ? MATCH_LT : MATCH_GT;
+
+ free_srcline(left);
+ free_srcline(right);
+ return ret;
+}
+
static enum match_result match_chain(struct callchain_cursor_node *node,
struct callchain_list *cnode)
{
struct symbol *sym = node->sym;
u64 left, right;

- if (cnode->ms.sym && sym &&
- callchain_param.key == CCKEY_FUNCTION) {
+ if (callchain_param.key == CCKEY_SRCLINE) {
+ enum match_result match = match_chain_srcline(node, cnode);
+
+ if (match != MATCH_ERROR)
+ return match;
+ }
+
+ if (cnode->ms.sym && sym && callchain_param.key == CCKEY_FUNCTION) {
left = cnode->ms.sym->start;
right = sym->start;
} else {
@@ -911,15 +952,16 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso)
{
+ bool show_addr = callchain_param.key == CCKEY_ADDRESS;
+ bool show_srcline = show_addr || callchain_param.key == CCKEY_SRCLINE;
int printed;

if (cl->ms.sym) {
- if (callchain_param.key == CCKEY_ADDRESS &&
- cl->ms.map && !cl->srcline)
+ if (show_srcline && cl->ms.map && !cl->srcline)
cl->srcline = get_srcline(cl->ms.map->dso,
map__rip_2objdump(cl->ms.map,
cl->ip),
- cl->ms.sym, false);
+ cl->ms.sym, false, show_addr);
if (cl->srcline)
printed = scnprintf(bf, bfsize, "%s %s",
cl->ms.sym->name, cl->srcline);
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 4f4b60f1558a..c56c23dbbf72 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -77,7 +77,8 @@ typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_root *,

enum chain_key {
CCKEY_FUNCTION,
- CCKEY_ADDRESS
+ CCKEY_ADDRESS,
+ CCKEY_SRCLINE
};

enum chain_value {
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 1d9ebcf9e38e..c1870ac365a3 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -405,7 +405,8 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,

if (map && map->dso) {
srcline = get_srcline(map->dso,
- map__rip_2objdump(map, addr), NULL, true);
+ map__rip_2objdump(map, addr), NULL,
+ true, true);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 8b0d4e39f640..73f3ec1cf2a0 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -323,7 +323,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
return SRCLINE_UNKNOWN;

return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
- he->ms.sym, true);
+ he->ms.sym, true, true);
}

static int64_t
@@ -366,7 +366,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
left->branch_info->srcline_from = get_srcline(map->dso,
map__rip_2objdump(map,
left->branch_info->from.al_addr),
- left->branch_info->from.sym, true);
+ left->branch_info->from.sym,
+ true, true);
}
if (!right->branch_info->srcline_from) {
struct map *map = right->branch_info->from.map;
@@ -376,7 +377,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
right->branch_info->srcline_from = get_srcline(map->dso,
map__rip_2objdump(map,
right->branch_info->from.al_addr),
- right->branch_info->from.sym, true);
+ right->branch_info->from.sym,
+ true, true);
}
return strcmp(right->branch_info->srcline_from, left->branch_info->srcline_from);
}
@@ -407,7 +409,8 @@ sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
left->branch_info->srcline_to = get_srcline(map->dso,
map__rip_2objdump(map,
left->branch_info->to.al_addr),
- left->branch_info->from.sym, true);
+ left->branch_info->from.sym,
+ true, true);
}
if (!right->branch_info->srcline_to) {
struct map *map = right->branch_info->to.map;
@@ -417,7 +420,8 @@ sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
right->branch_info->srcline_to = get_srcline(map->dso,
map__rip_2objdump(map,
right->branch_info->to.al_addr),
- right->branch_info->to.sym, true);
+ right->branch_info->to.sym,
+ true, true);
}
return strcmp(right->branch_info->srcline_to, left->branch_info->srcline_to);
}
@@ -448,7 +452,7 @@ static char *hist_entry__get_srcfile(struct hist_entry *e)
return no_srcfile;

sf = __get_srcline(map->dso, map__rip_2objdump(map, e->ip),
- e->ms.sym, false, true);
+ e->ms.sym, false, true, true);
if (!strcmp(sf, SRCLINE_UNKNOWN))
return no_srcfile;
p = strchr(sf, ':');
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 3ce28f702b36..778ccb5d99d1 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -429,7 +429,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
#define A2L_FAIL_LIMIT 123

char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
- bool show_sym, bool unwind_inlines)
+ bool show_sym, bool show_addr, bool unwind_inlines)
{
char *file = NULL;
unsigned line = 0;
@@ -463,6 +463,11 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
dso->has_srcline = 0;
dso__free_a2l(dso);
}
+
+ if (!show_addr)
+ return (show_sym && sym) ?
+ strndup(sym->name, sym->namelen) : NULL;
+
if (sym) {
if (asprintf(&srcline, "%s+%" PRIu64, show_sym ? sym->name : "",
addr - sym->start) < 0)
@@ -479,9 +484,9 @@ void free_srcline(char *srcline)
}

char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
- bool show_sym)
+ bool show_sym, bool show_addr)
{
- return __get_srcline(dso, addr, sym, show_sym, false);
+ return __get_srcline(dso, addr, sym, show_sym, show_addr, false);
}

struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index cc0700d6fef0..7cf5752b38fd 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -287,9 +287,9 @@ struct symbol;

extern bool srcline_full_filename;
char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
- bool show_sym);
+ bool show_sym, bool show_addr);
char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
- bool show_sym, bool unwind_inlines);
+ bool show_sym, bool show_addr, bool unwind_inlines);
void free_srcline(char *srcline);

int perf_event_paranoid(void);
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM3/27/17
to
From: Adrian Hunter <adrian...@intel.com>

Address filtering with kernel symbols incorrectly resulted in the error
"Cannot determine size of symbol" because the no_size logic was the wrong
way around.
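
To illustrate the inversion in isolation (a reduced sketch with hypothetical
function names, not the perf code itself): `!!size` is true exactly when a
size was found, so the old assignment raised the "no size" flag in the
successful case and left it clear when the size was actually missing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old logic: flags "no size" precisely when a size IS known. */
bool no_size_buggy(uint64_t size)
{
	return !!size;
}

/* Fixed logic: flags "no size" only when the size is zero. */
bool no_size_fixed(uint64_t size)
{
	return !size;
}
```

With the fixed form, a symbol of size 64 yields false and only a zero size
yields true, which matches the intent of the error message.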

Signed-off-by: Adrian Hunter <adrian...@intel.com>
Tested-by: Andi Kleen <a...@linux.intel.com>
Cc: sta...@vger.kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1490357752-27942-1-git-...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/auxtrace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c5a6e0b12452..78bd632f144d 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1826,7 +1826,7 @@ static int addr_filter__resolve_kernel_syms(struct addr_filter *filt)
filt->addr = start;
if (filt->range && !filt->size && !filt->sym_to) {
filt->size = size;
- no_size = !!size;
+ no_size = !size;
}
}

@@ -1840,7 +1840,7 @@ static int addr_filter__resolve_kernel_syms(struct addr_filter *filt)
if (err)
return err;
filt->size = start + size - filt->addr;
- no_size = !!size;
+ no_size = !size;
}

/* The very last symbol in kallsyms does not imply a particular size */
--
2.9.3
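
To see why the polarity matters, here is a minimal stand-alone sketch. It is not the perf code itself, and both helper names are hypothetical. It assumes only what the commit message states: `no_size` should be true when the symbol lookup produced no usable size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Old logic: reports "no size" exactly when a size WAS found. */
static bool no_size_before_fix(uint64_t size)
{
	return !!size;
}

/* Fixed logic: reports "no size" only when the size is zero. */
static bool no_size_after_fix(uint64_t size)
{
	return !size;
}
```

With the old expression, any symbol that did have a size would set the flag, which matches the spurious "Cannot determine size of symbol" error described above.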

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM
From: Tommi Rantala <tommi.t...@nokia.com>

Valgrind was complaining:

$ valgrind ./perf list >/dev/null
==11643== Memcheck, a memory error detector
==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==11643== Command: ./perf list
==11643==
==11643== Conditional jump or move depends on uninitialised value(s)
==11643== at 0x4C30620: rindex (vg_replace_strmem.c:199)
==11643== by 0x49DAA9: build_id_cache__origname (build-id.c:198)
==11643== by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
==11643== by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
==11643== by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
==11643== by 0x4BB0B3: print_events (parse-events.c:2313)
==11643== by 0x439501: cmd_list (builtin-list.c:53)
==11643== by 0x497150: run_builtin (perf.c:359)
==11643== by 0x428CE0: handle_internal_command (perf.c:421)
==11643== by 0x428CE0: run_argv (perf.c:467)
==11643== by 0x428CE0: main (perf.c:614)
[...]

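readlink() does not null-terminate the buffer, so terminate it explicitly.
Additionally, a zero-length result from readlink() is not very interesting,
so treat it as a failure too.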

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/build-id.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 234859f756c4..33af67530d30 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -182,13 +182,17 @@ char *build_id_cache__origname(const char *sbuild_id)
char buf[PATH_MAX];
char *ret = NULL, *p;
size_t offs = 5; /* == strlen("../..") */
+ ssize_t len;

linkname = build_id_cache__linkname(sbuild_id, NULL, 0);
if (!linkname)
return NULL;

- if (readlink(linkname, buf, PATH_MAX) < 0)
+ len = readlink(linkname, buf, sizeof(buf) - 1);
+ if (len <= 0)
goto out;
+ buf[len] = '\0';
+
/* The link should be "../..<origpath>/<sbuild_id>" */
p = strrchr(buf, '/'); /* Cut off the "/<sbuild_id>" */
if (p && (p > buf + offs)) {
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 9:50:10 PM
From: Tommi Rantala <tommi.t...@nokia.com>

Ensure that the string in buf is null terminated.

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/sdt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/sdt.c b/tools/perf/tests/sdt.c
index f59d210e1baf..26e5b7a0b839 100644
--- a/tools/perf/tests/sdt.c
+++ b/tools/perf/tests/sdt.c
@@ -43,7 +43,7 @@ static char *get_self_path(void)
{
char *buf = calloc(PATH_MAX, sizeof(char));

- if (buf && readlink("/proc/self/exe", buf, PATH_MAX) < 0) {
+ if (buf && readlink("/proc/self/exe", buf, PATH_MAX - 1) < 0) {
pr_debug("Failed to get correct path of perf\n");
free(buf);
return NULL;
--
2.9.3

Arnaldo Carvalho de Melo

Mar 27, 2017, 10:10:05 PM
From: Jin Yao <yao...@linux.intel.com>

If the address belongs to an inlined function, the source information
back to the first non-inlined function will be printed.

For example:

1. Show inlined function name
perf report --stdio -g function --inline

0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)

2. Show the file/line information
perf report --stdio -g address --inline

0.69% 0.00% inline ld-2.23.so [.] _dl_start_user
|
---_dl_start_user .:0
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
_dl_sysdep_start dl-sysdep.c:250
|
--0.56%--dl_main rtld.c:2076

Committer tests:

# perf record --call-graph dwarf ~/bin/perf stat usleep 1

Performance counter stats for 'usleep 1':

0.443020 task-clock (msec) # 0.449 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
52 page-faults # 0.117 M/sec
1,049,423 cycles # 2.369 GHz
801,456 instructions # 0.76 insn per cycle
155,609 branches # 351.246 M/sec
7,026 branch-misses # 4.52% of all branches

0.000987570 seconds time elapsed

[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.553 MB perf.data (66 samples) ]
# perf report --stdio --inline fs__get_mountpoint
<SNIP>
1.73% 0.00% perf perf [.] fs__get_mountpoint
|
---fs__get_mountpoint
fs__get_mountpoint (inline)
fs__check_mounts (inline)
__statfs
entry_SYSCALL_64
sys_statfs
SYSC_statfs
user_statfs
user_path_at_empty
filename_lookup
path_lookupat
link_path_walk
inode_permission
__inode_permission
kernfs_iop_permission
kernfs_refresh_inode
security_inode_notifysecctx
selinux_inode_notifysecctx
selinux_inode_setsecurity
security_context_to_sid
security_context_to_sid_core
string_to_context_struct
symcmp

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Tested-by: Milian Wolff <milian...@kdab.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-5-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/ui/stdio/hist.c | 85 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 84 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 668f4aecf2e6..6128f485a3c5 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -17,6 +17,66 @@ static size_t callchain__fprintf_left_margin(FILE *fp, int left_margin)
return ret;
}

+static size_t inline__fprintf(struct map *map, u64 ip, int left_margin,
+ int depth, int depth_mask, FILE *fp)
+{
+ struct dso *dso;
+ struct inline_node *node;
+ struct inline_list *ilist;
+ int ret = 0, i;
+
+ if (map == NULL)
+ return 0;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return 0;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return 0;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+ if (node == NULL)
+ return 0;
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
+ ret += callchain__fprintf_left_margin(fp, left_margin);
+
+ for (i = 0; i < depth; i++) {
+ if (depth_mask & (1 << i))
+ ret += fprintf(fp, "|");
+ else
+ ret += fprintf(fp, " ");
+ ret += fprintf(fp, " ");
+ }
+
+ if (callchain_param.key == CCKEY_ADDRESS) {
+ if (ilist->filename != NULL)
+ ret += fprintf(fp, "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ else
+ ret += fprintf(fp, "??");
+ } else if (ilist->funcname != NULL)
+ ret += fprintf(fp, "%s (inline)",
+ ilist->funcname);
+ else if (ilist->filename != NULL)
+ ret += fprintf(fp, "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ else
+ ret += fprintf(fp, "??");
+
+ ret += fprintf(fp, "\n");
+ }
+ }
+
+ inline_node__delete(node);
+ return ret;
+}
+
static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
int left_margin)
{
@@ -78,6 +138,10 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
fputs(str, fp);
fputc('\n', fp);
free(alloc_str);
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map, chain->ip,
+ left_margin, depth, depth_mask, fp);
return ret;
}

@@ -229,6 +293,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (!i++ && field_order == NULL &&
sort_order && !prefixcmp(sort_order, "sym"))
continue;
+
if (!printed) {
ret += callchain__fprintf_left_margin(fp, left_margin);
ret += fprintf(fp, "|\n");
@@ -251,6 +316,13 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,

if (++entries_printed == callchain_param.print_limit)
break;
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map,
+ chain->ip,
+ left_margin,
+ 0, 0,
+ fp);
}
root = &cnode->rb_root;
}
@@ -529,6 +601,8 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
bool use_callchain)
{
int ret;
+ int callchain_ret = 0;
+ int inline_ret = 0;
struct perf_hpp hpp = {
.buf = bf,
.size = size,
@@ -547,7 +621,16 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
ret = fprintf(fp, "%s\n", bf);

if (use_callchain)
- ret += hist_entry_callchain__fprintf(he, total_period, 0, fp);
+ callchain_ret = hist_entry_callchain__fprintf(he, total_period,
+ 0, fp);
+
+ if (callchain_ret == 0 && symbol_conf.inline_name) {
+ inline_ret = inline__fprintf(he->ms.map, he->ip, 0, 0, 0, fp);
+ ret += inline_ret;
+ if (inline_ret > 0)
+ ret += fprintf(fp, "\n");
+ } else
+ ret += callchain_ret;

return ret;
}
--
2.9.3
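
The margin loop in inline__fprintf() decides, for each depth level, whether to draw a `|` connector or a blank. A minimal stand-alone sketch of that bitmask logic follows. `format_depth_prefix()` is a hypothetical helper that writes into a buffer instead of a FILE, and the one-space padding after each column is an assumption made for the example, not necessarily the width perf uses.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the margin loop: for each level i below
 * 'depth', emit '|' if bit i of 'depth_mask' is set, otherwise a space,
 * followed by one space of padding (assumed width). Output is truncated
 * to fit 'size'. Returns the number of characters written, excluding
 * the terminating NUL.
 */
static size_t format_depth_prefix(char *out, size_t size,
				  int depth, int depth_mask)
{
	size_t n = 0;
	int i;

	if (size == 0)
		return 0;

	/* Each level needs two characters, plus room for the NUL. */
	for (i = 0; i < depth && n + 2 < size; i++) {
		out[n++] = (depth_mask & (1 << i)) ? '|' : ' ';
		out[n++] = ' ';
	}
	out[n] = '\0';
	return n;
}
```

A set bit in the mask means an ancestor at that level still has siblings to print, so its vertical connector continues through the inlined-function lines.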

Ingo Molnar

Mar 28, 2017, 1:50:05 AM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Arnaldo Carvalho de Melo

Mar 28, 2017, 2:00:07 AM
Commit-ID: b0ad8ea66445d64a469df0c710947f4cdb8ef16b
Gitweb: http://git.kernel.org/tip/b0ad8ea66445d64a469df0c710947f4cdb8ef16b
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Mon, 27 Mar 2017 11:47:20 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 11:58:09 -0300

perf tools: Remove unused 'prefix' from builtin functions

We got it from the git sources but never used it for anything. The only
place where it would have been used is this leftover:

static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
{
prefix = NULL;
if (p->option & RUN_SETUP)
prefix = NULL; /* setup_perf_directory(); */

Ditch it.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-uw5swz05vo...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/bench/bench.h | 20 ++++++------
tools/perf/bench/futex-hash.c | 3 +-
tools/perf/bench/futex-lock-pi.c | 3 +-
tools/perf/bench/futex-requeue.c | 3 +-
tools/perf/bench/futex-wake-parallel.c | 3 +-
tools/perf/bench/futex-wake.c | 3 +-
tools/perf/bench/mem-functions.c | 4 +--
tools/perf/bench/numa.c | 2 +-
tools/perf/bench/sched-messaging.c | 3 +-
tools/perf/bench/sched-pipe.c | 2 +-
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-bench.c | 12 +++----
tools/perf/builtin-buildid-cache.c | 3 +-
tools/perf/builtin-buildid-list.c | 3 +-
tools/perf/builtin-c2c.c | 4 +--
tools/perf/builtin-config.c | 2 +-
tools/perf/builtin-data.c | 9 +++---
tools/perf/builtin-diff.c | 2 +-
tools/perf/builtin-evlist.c | 2 +-
tools/perf/builtin-ftrace.c | 2 +-
tools/perf/builtin-help.c | 2 +-
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kallsyms.c | 2 +-
tools/perf/builtin-kmem.c | 4 +--
tools/perf/builtin-kvm.c | 16 +++++-----
tools/perf/builtin-list.c | 2 +-
tools/perf/builtin-lock.c | 6 ++--
tools/perf/builtin-mem.c | 6 ++--
tools/perf/builtin-probe.c | 6 ++--
tools/perf/builtin-record.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 6 ++--
tools/perf/builtin-script.c | 4 +--
tools/perf/builtin-stat.c | 2 +-
tools/perf/builtin-timechart.c | 7 ++--
tools/perf/builtin-top.c | 2 +-
tools/perf/builtin-trace.c | 4 +--
tools/perf/builtin-version.c | 3 +-
tools/perf/builtin.h | 58 +++++++++++++++++-----------------
tools/perf/perf.c | 11 ++-----
tools/perf/tests/builtin-test.c | 2 +-
41 files changed, 110 insertions(+), 126 deletions(-)

diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 579a592..842ab27 100644
index 2499e1b..fe16b31 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -114,8 +114,7 @@ static void print_summary(void)
(int) runtime.tv_sec);
}

-int bench_futex_hash(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_hash(int argc, const char **argv)
{
int ret = 0;
cpu_set_t cpu;
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index a20814d..73a1c44 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -140,8 +140,7 @@ static void create_threads(struct worker *w, pthread_attr_t thread_attr)
}
}

-int bench_futex_lock_pi(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_lock_pi(int argc, const char **argv)
{
int ret = 0;
unsigned int i;
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index 9fad1e4..41786cb 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -109,8 +109,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_requeue(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_requeue(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 40f5fcf..4ab12c8 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -197,8 +197,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_wake_parallel(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_wake_parallel(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 7894902..2fa4922 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -115,8 +115,7 @@ static void toggle_done(int sig __maybe_unused,
done = true;
}

-int bench_futex_wake(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_futex_wake(int argc, const char **argv)
{
int ret = 0;
unsigned int i, j;
diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
index 52504a8..d1dea33 100644
--- a/tools/perf/bench/mem-functions.c
+++ b/tools/perf/bench/mem-functions.c
@@ -284,7 +284,7 @@ static const char * const bench_mem_memcpy_usage[] = {
NULL
};

-int bench_mem_memcpy(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_mem_memcpy(int argc, const char **argv)
{
struct bench_mem_info info = {
.functions = memcpy_functions,
@@ -358,7 +358,7 @@ static const struct function memset_functions[] = {
{ .name = NULL, }
};

-int bench_mem_memset(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_mem_memset(int argc, const char **argv)
{
struct bench_mem_info info = {
.functions = memset_functions,
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index 6bd0581..1fe43bd 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -1767,7 +1767,7 @@ static int bench_all(void)
return 0;
}

-int bench_numa(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_numa(int argc, const char **argv)
{
init_params(&p0, "main,", argc, argv);
argc = parse_options(argc, argv, options, bench_numa_usage, 0);
diff --git a/tools/perf/bench/sched-messaging.c b/tools/perf/bench/sched-messaging.c
index 6a111e7..4f961e7 100644
--- a/tools/perf/bench/sched-messaging.c
+++ b/tools/perf/bench/sched-messaging.c
@@ -260,8 +260,7 @@ static const char * const bench_sched_message_usage[] = {
NULL
};

-int bench_sched_messaging(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int bench_sched_messaging(int argc, const char **argv)
{
unsigned int i, total_children;
struct timeval start, stop, diff;
diff --git a/tools/perf/bench/sched-pipe.c b/tools/perf/bench/sched-pipe.c
index 2243f01..a152737 100644
--- a/tools/perf/bench/sched-pipe.c
+++ b/tools/perf/bench/sched-pipe.c
@@ -76,7 +76,7 @@ static void *worker_thread(void *__tdata)
return NULL;
}

-int bench_sched_pipe(int argc, const char **argv, const char *prefix __maybe_unused)
+int bench_sched_pipe(int argc, const char **argv)
{
struct thread_data threads[2], *td;
int pipe_1[2], pipe_2[2];
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index e54b1f9..56a7c8d 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -383,7 +383,7 @@ static const char * const annotate_usage[] = {
NULL
};

-int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_annotate(int argc, const char **argv)
{
struct perf_annotate annotate = {
.tool = {
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index a1cddc6..445e628 100644
index 30e2b2c..94b55ee 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -276,8 +276,7 @@ static int build_id_cache__update_file(const char *filename)
return err;
}

-int cmd_buildid_cache(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_buildid_cache(int argc, const char **argv)
{
struct strlist *list;
struct str_node *pos;
diff --git a/tools/perf/builtin-buildid-list.c b/tools/perf/builtin-buildid-list.c
index 5e914ee..26f4e60 100644
--- a/tools/perf/builtin-buildid-list.c
+++ b/tools/perf/builtin-buildid-list.c
@@ -87,8 +87,7 @@ out:
return 0;
}

-int cmd_buildid_list(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_buildid_list(int argc, const char **argv)
{
bool show_kernel = false;
bool with_hits = false;
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5cd6d7a..70c2c77 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2755,12 +2755,12 @@ static int perf_c2c__record(int argc, const char **argv)
pr_debug("\n");
}

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}

-int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_c2c(int argc, const char **argv)
{
argc = parse_options(argc, argv, c2c_options, c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 8c0d93b..55f04f8 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -154,7 +154,7 @@ static int parse_config_arg(char *arg, char **var, char **value)
return 0;
}

-int cmd_config(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_config(int argc, const char **argv)
{
int i, ret = 0;
struct perf_config_set *set;
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
index 7ad6e17..0adb5f8 100644
index 5e480315..cd2605d 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1321,7 +1321,7 @@ static int diff__config(const char *var, const char *value,
return 0;
}

-int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_diff(int argc, const char **argv)
{
int ret = hists__init();

diff --git a/tools/perf/builtin-evlist.c b/tools/perf/builtin-evlist.c
index e09c428..6d210e4 100644
--- a/tools/perf/builtin-evlist.c
+++ b/tools/perf/builtin-evlist.c
@@ -46,7 +46,7 @@ static int __cmd_evlist(const char *file_name, struct perf_attr_details *details
return 0;
}

-int cmd_evlist(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_evlist(int argc, const char **argv)
{
struct perf_attr_details details = { .verbose = false, };
const struct option options[] = {
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 6087295..f80fb60 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -304,7 +304,7 @@ static int perf_ftrace_config(const char *var, const char *value, void *cb)
return -1;
}

-int cmd_ftrace(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_ftrace(int argc, const char **argv)
{
int ret;
struct perf_ftrace ftrace = {
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index aed0d84..7ae2389 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -418,7 +418,7 @@ static int show_html_page(const char *perf_cmd)
return 0;
}

-int cmd_help(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_help(int argc, const char **argv)
{
bool show_all = false;
enum help_format help_format = HELP_FORMAT_MAN;
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 8d1d13b..42dff0b 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -738,7 +738,7 @@ static int __cmd_inject(struct perf_inject *inject)
return ret;
}

-int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_inject(int argc, const char **argv)
{
struct perf_inject inject = {
.tool = {
diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
index 224bfc4..8ff38c4 100644
--- a/tools/perf/builtin-kallsyms.c
+++ b/tools/perf/builtin-kallsyms.c
@@ -43,7 +43,7 @@ static int __cmd_kallsyms(int argc, const char **argv)
return 0;
}

-int cmd_kallsyms(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_kallsyms(int argc, const char **argv)
{
const struct option options[] = {
OPT_INCR('v', "verbose", &verbose, "be more verbose (show counter open errors, etc)"),
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index d509e74..5155878 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1866,7 +1866,7 @@ static int __cmd_record(int argc, const char **argv)
for (j = 1; j < (unsigned int)argc; j++, i++)
rec_argv[i] = argv[j];

- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

static int kmem_config(const char *var, const char *value, void *cb __maybe_unused)
@@ -1885,7 +1885,7 @@ static int kmem_config(const char *var, const char *value, void *cb __maybe_unus
return 0;
}

-int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_kmem(int argc, const char **argv)
{
const char * const default_slab_sort = "frag,hit,bytes";
const char * const default_page_sort = "bytes,hit";
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 18e6c38..38b4091 100644
index be9195e..4bf2cb4 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -20,7 +20,7 @@
static bool desc_flag = true;
static bool details_flag;

-int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_list(int argc, const char **argv)
{
int i;
bool raw_dump = false;
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index e992e72..b686fb6 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -941,12 +941,12 @@ static int __cmd_record(int argc, const char **argv)

BUG_ON(i != rec_argc);

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}

-int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_lock(int argc, const char **argv)
{
const struct option lock_options[] = {
OPT_STRING('i', "input", &input_name, "file", "input file name"),
@@ -1009,7 +1009,7 @@ int cmd_lock(int argc, const char **argv, const char *prefix __maybe_unused)
rc = __cmd_report(false);
} else if (!strcmp(argv[0], "script")) {
/* Aliased to 'perf script' */
- return cmd_script(argc, argv, prefix);
+ return cmd_script(argc, argv);
} else if (!strcmp(argv[0], "info")) {
if (argc) {
argc = parse_options(argc, argv,
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 030a6cf..643f4fa 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -129,7 +129,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
pr_debug("\n");
}

- ret = cmd_record(i, rec_argv, NULL);
+ ret = cmd_record(i, rec_argv);
free(rec_argv);
return ret;
}
@@ -256,7 +256,7 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
for (j = 1; j < argc; j++, i++)
rep_argv[i] = argv[j];

- ret = cmd_report(i, rep_argv, NULL);
+ ret = cmd_report(i, rep_argv);
free(rep_argv);
return ret;
}
@@ -330,7 +330,7 @@ error:
return ret;
}

-int cmd_mem(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_mem(int argc, const char **argv)
{
struct stat st;
struct perf_mem mem = {
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 51cdc23..d7360c2 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -468,7 +468,7 @@ out:


static int
-__cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
+__cmd_probe(int argc, const char **argv)
{
const char * const probe_usage[] = {
"perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]",
@@ -687,13 +687,13 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
return 0;
}

-int cmd_probe(int argc, const char **argv, const char *prefix)
+int cmd_probe(int argc, const char **argv)
{
int ret;

ret = init_params();
if (!ret) {
- ret = __cmd_probe(argc, argv, prefix);
+ ret = __cmd_probe(argc, argv);
cleanup_params();
}

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 04faef7..3191ab0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1667,7 +1667,7 @@ static struct option __record_options[] = {

struct option *record_options = __record_options;

-int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_record(int argc, const char **argv)
{
int err;
struct record *rec = &record;
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 5ab8117..3c8885a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -681,7 +681,7 @@ const char report_callchain_help[] = "Display call graph (stack chain/backtrace)
CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;

-int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_report(int argc, const char **argv)
{
struct perf_session *session;
struct itrace_synth_opts itrace_synth_opts = { .set = 0, };
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index b92c4d9..79833e2 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3272,10 +3272,10 @@ static int __cmd_record(int argc, const char **argv)

BUG_ON(i != rec_argc);

- return cmd_record(i, rec_argv, NULL);
+ return cmd_record(i, rec_argv);
}

-int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_sched(int argc, const char **argv)
{
const char default_sort_order[] = "avg, max, switch, runtime";
struct perf_sched sched = {
@@ -3412,7 +3412,7 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
* Aliased to 'perf script' for now:
*/
if (!strcmp(argv[0], "script"))
- return cmd_script(argc, argv, prefix);
+ return cmd_script(argc, argv);

if (!strncmp(argv[0], "rec", 3)) {
return __cmd_record(argc, argv);
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c98e166..46acc8e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2359,7 +2359,7 @@ int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
return set_maps(script);
}

-int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_script(int argc, const char **argv)
{
bool show_full_info = false;
bool header = false;
@@ -2504,7 +2504,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (argc > 1 && !strncmp(argv[0], "rec", strlen("rec"))) {
rec_script_path = get_script_path(argv[1], RECORD_SUFFIX);
if (!rec_script_path)
- return cmd_record(argc, argv, NULL);
+ return cmd_record(argc, argv);
}

if (argc > 1 && !strncmp(argv[0], "rep", strlen("rep"))) {
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 01b589e..2158ea1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2478,7 +2478,7 @@ static void setup_system_wide(int forks)
}
}

-int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_stat(int argc, const char **argv)
{
const char * const stat_usage[] = {
"perf stat [<options>] [<command>]",
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index fbd7c6c..fafdb44 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -1773,7 +1773,7 @@ static int timechart__io_record(int argc, const char **argv)
for (i = 0; i < (unsigned int)argc; i++)
*p++ = argv[i];

- return cmd_record(rec_argc, rec_argv, NULL);
+ return cmd_record(rec_argc, rec_argv);
}


@@ -1864,7 +1864,7 @@ static int timechart__record(struct timechart *tchart, int argc, const char **ar
for (j = 0; j < (unsigned int)argc; j++)
*p++ = argv[j];

- return cmd_record(rec_argc, rec_argv, NULL);
+ return cmd_record(rec_argc, rec_argv);
}

static int
@@ -1917,8 +1917,7 @@ parse_time(const struct option *opt, const char *arg, int __maybe_unused unset)
return 0;
}

-int cmd_timechart(int argc, const char **argv,
- const char *prefix __maybe_unused)
+int cmd_timechart(int argc, const char **argv)
{
struct timechart tchart = {
.tool = {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ab90779..a0c97c7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1075,7 +1075,7 @@ parse_percent_limit(const struct option *opt, const char *arg,
const char top_callchain_help[] = CALLCHAIN_RECORD_HELP CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: fp,graph,0.5,caller,function";

-int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_top(int argc, const char **argv)
{
char errbuf[BUFSIZ];
struct perf_top top = {
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 60053d4..c88f9f2 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1993,7 +1993,7 @@ static int trace__record(struct trace *trace, int argc, const char **argv)
for (i = 0; i < (unsigned int)argc; i++)
rec_argv[j++] = argv[i];

- return cmd_record(j, rec_argv, NULL);
+ return cmd_record(j, rec_argv);
}

static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp);
@@ -2791,7 +2791,7 @@ out:
return err;
}

-int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
+int cmd_trace(int argc, const char **argv)
{
const char *trace_usage[] = {
"perf trace [<options>] [<command>]",
diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index 9b10cda..b9a095b 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -2,8 +2,7 @@
#include "builtin.h"
#include "perf.h"

-int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused,
- const char *prefix __maybe_unused)
+int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused)
{
printf("perf version %s\n", perf_version_string);
return 0;
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 036e1e3..26669bf 100644
index 6d5479e..4b283d1 100644
index 8682296..e6d7876 100644

tip-bot for Arnaldo Carvalho de Melo

Mar 28, 2017, 2:00:08 AM
Commit-ID: 39f0e7a825cfc971dc9ad40b0770c22f6f4f89b8
Gitweb: http://git.kernel.org/tip/39f0e7a825cfc971dc9ad40b0770c22f6f4f89b8
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Fri, 24 Mar 2017 14:51:28 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 24 Mar 2017 16:05:31 -0300

perf trace: Check for vfs_getname.pathname length

The length shouldn't be zero, but if the 'perf probe' on getname_flags()
(or wherever else we may need to probe in the future to catch the
pathname for syscalls like 'open' being copied from userspace to the
kernel) is misplaced somehow, we end up not allocating space and then
trying to copy the empty string "" to ttrace->filename.name, causing a
segfault. Fix it.
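
A minimal sketch of the kind of guard described above, assuming a
hypothetical 'struct filename_buf' and helper name (neither is taken from
builtin-trace.c): an empty pathname is ignored before any allocation or
copy is attempted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-thread filename storage. */
struct filename_buf {
	char	*name;		/* heap buffer, NULL until first use */
	size_t	 namelen;	/* capacity of name, excluding the '\0' */
};

/*
 * Store pathname in fb. An empty pathname is skipped, so we never copy
 * into a buffer that was not allocated.
 * Returns 0 on success or skip, -1 if the allocation fails.
 */
static int filename_buf_set(struct filename_buf *fb, const char *pathname)
{
	size_t len = strlen(pathname);

	if (len == 0)
		return 0;	/* nothing to store, leave fb untouched */

	if (len > fb->namelen) {
		char *p = realloc(fb->name, len + 1);

		if (p == NULL)
			return -1;
		fb->name = p;
		fb->namelen = len;
	}

	strcpy(fb->name, pathname);	/* fits: capacity is at least len */
	return 0;
}
```

The caller is expected to free fb->name when it is done with it.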

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-c4f1t6sx1n...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-trace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 912fedc..33c657c 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c

tip-bot for Tommi Rantala

Mar 28, 2017, 2:10:05 AM
Commit-ID: 0e6ba11511aef91ba8e2528ddc681d88922d7b0b
Gitweb: http://git.kernel.org/tip/0e6ba11511aef91ba8e2528ddc681d88922d7b0b
Author: Tommi Rantala <tommi.t...@nokia.com>
AuthorDate: Wed, 22 Mar 2017 15:06:21 +0200
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 15:35:56 -0300

perf tests: Do not assume that readlink() returns a null terminated string

Ensure that the string in buf is null terminated.

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/tests/sdt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/sdt.c b/tools/perf/tests/sdt.c
index f59d210..26e5b7a 100644

tip-bot for Tommi Rantala

Mar 28, 2017, 2:10:05 AM
Commit-ID: 55f77128e7652e537d6c226d5b56821cdb5c22de
Gitweb: http://git.kernel.org/tip/55f77128e7652e537d6c226d5b56821cdb5c22de
Author: Tommi Rantala <tommi.t...@nokia.com>
AuthorDate: Wed, 22 Mar 2017 15:06:24 +0200
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 15:37:54 -0300

perf utils: Readlink /proc/self/exe to find the perf binary

Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/header.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index cf22962..ef09f26 100644

tip-bot for Tommi Rantala

Mar 28, 2017, 2:10:06 AM
Commit-ID: d4b364df5f6540e8d6a38008ce2693ba73a8508a
Gitweb: http://git.kernel.org/tip/d4b364df5f6540e8d6a38008ce2693ba73a8508a
Author: Tommi Rantala <tommi.t...@nokia.com>
AuthorDate: Wed, 22 Mar 2017 15:06:23 +0200
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 15:37:35 -0300

perf utils: Null terminate buf in read_ftrace_printk()

==31357== by 0x497150: run_builtin (perf.c:359)
==31357== by 0x428CE0: handle_internal_command (perf.c:421)
==31357== by 0x428CE0: run_argv (perf.c:467)
==31357== by 0x428CE0: main (perf.c:614)

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/trace-event-read.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c
index 2742015..8a9a677 100644

tip-bot for Tommi Rantala

Mar 28, 2017, 2:10:06 AM
Commit-ID: 5a2342111c68e623e27ee7ea3d0492d8dad6bda0
Gitweb: http://git.kernel.org/tip/5a2342111c68e623e27ee7ea3d0492d8dad6bda0
Author: Tommi Rantala <tommi.t...@nokia.com>
AuthorDate: Wed, 22 Mar 2017 15:06:20 +0200
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 15:35:06 -0300

perf buildid: Do not assume that readlink() returns a null terminated string

Valgrind was complaining:

$ valgrind ./perf list >/dev/null
==11643== Memcheck, a memory error detector
==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==11643== Command: ./perf list
==11643==
==11643== Conditional jump or move depends on uninitialised value(s)
==11643== at 0x4C30620: rindex (vg_replace_strmem.c:199)
==11643== by 0x49DAA9: build_id_cache__origname (build-id.c:198)
==11643== by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
==11643== by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
==11643== by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
==11643== by 0x4BB0B3: print_events (parse-events.c:2313)
==11643== by 0x439501: cmd_list (builtin-list.c:53)
==11643== by 0x497150: run_builtin (perf.c:359)
==11643== by 0x428CE0: handle_internal_command (perf.c:421)
==11643== by 0x428CE0: run_argv (perf.c:467)
==11643== by 0x428CE0: main (perf.c:614)
[...]

Additionally, a zero length result from readlink() is not very interesting.
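
The pattern behind this fix can be sketched as follows. readlink() does
not append a terminating null byte, so the caller has to reserve room for
one and write it. The helper name readlink_str() is hypothetical and only
illustrates the idea; it is not the code in build-id.c.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a symlink target into buf and always null
 * terminate it. Returns the length of the target, or -1 on error or on
 * an empty result.
 */
static ssize_t readlink_str(const char *path, char *buf, size_t bufsize)
{
	ssize_t len;

	if (bufsize < 2)
		return -1;

	/* Leave one byte free for the terminator readlink() does not write. */
	len = readlink(path, buf, bufsize - 1);
	if (len <= 0)
		return -1;

	buf[len] = '\0';
	return len;
}
```

Note that readlink() truncates silently when the buffer is too small, so
a caller that needs the full target should use a buffer of PATH_MAX bytes
or retry with a larger one.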

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/build-id.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 234859f..33af675 100644

tip-bot for Tommi Rantala

Mar 28, 2017, 2:10:09 AM
Commit-ID: 2ccc220238680642be87a2d010ce07f1c40edafb
Gitweb: http://git.kernel.org/tip/2ccc220238680642be87a2d010ce07f1c40edafb
Author: Tommi Rantala <tommi.t...@nokia.com>
AuthorDate: Wed, 22 Mar 2017 15:06:19 +0200
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Mon, 27 Mar 2017 15:33:36 -0300

perf buildid: Do not update SDT cache with null filename

Signed-off-by: Tommi Rantala <tommi.t...@nokia.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Tommi Rantala <tommi.t...@nokia.com>
Link: http://lkml.kernel.org/r/20170322130624.2188...@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/build-id.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index e528c40..234859f 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:06 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 3906a13a6b4e78fbc0def03a808f091f0dff1b44:

Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-28 07:44:43 +0200)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170331

for you to fetch changes up to fd5cead23f54697310bd565aa2a23ae5128080a0:

perf trace: Beautify statx syscall 'flag' and 'mask' arguments (2017-03-31 14:42:31 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)

e.g.:

System wide strace like session:

# trace -e statx
16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
^C#

User visible:

- Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
shouldn't try to calculate duration or print the timestamp for a missing
matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)

- Do not print "cycles: 0" in perf report LBR lines on platforms not
supporting 'cycles', such as Intel's Broadwell (Jin Yao)

- Handle missing $HOME env var (Jiri Olsa)

- Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
the next best thing (ax, bx, etc) supported (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (4):
perf tools: Remove support for command aliases
perf trace: Handle unpaired raw_syscalls:sys_exit event
tools include uapi: Grab copies of stat.h and fcntl.h
perf trace: Beautify statx syscall 'flag' and 'mask' arguments

Colin Ian King (1):
perf utils: Fix spelling mistake: "Invalud" -> "Invalid"

Jin Yao (1):
perf report: Drop cycles 0 for LBR print

Jiri Olsa (1):
perf tools: Do not fail in case of empty HOME env variable

Ravi Bangoria (2):
perf/sdt/x86: Add renaming logic for (missing) 8 bit registers
perf/sdt/x86: Move OP parser to tools/perf/arch/x86/

tools/include/linux/types.h | 1 +
tools/include/uapi/linux/fcntl.h | 72 +++++++++
tools/include/uapi/linux/stat.h | 176 ++++++++++++++++++++
tools/perf/Build | 1 +
tools/perf/MANIFEST | 2 +
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 1 +
tools/perf/arch/x86/util/perf_regs.c | 187 ++++++++++++++++++----
tools/perf/builtin-help.c | 13 --
tools/perf/builtin-trace.c | 57 ++++---
tools/perf/check-headers.sh | 2 +
tools/perf/perf.c | 97 +----------
tools/perf/trace/beauty/Build | 1 +
tools/perf/trace/beauty/beauty.h | 24 +++
tools/perf/trace/beauty/statx.c | 72 +++++++++
tools/perf/util/Build | 1 -
tools/perf/util/alias.c | 78 ---------
tools/perf/util/cache.h | 1 -
tools/perf/util/callchain.c | 111 ++++++++-----
tools/perf/util/config.c | 54 ++++---
tools/perf/util/help-unknown-cmd.c | 8 +-
tools/perf/util/hist.c | 2 +-
tools/perf/util/perf_regs.c | 6 +-
tools/perf/util/perf_regs.h | 11 +-
tools/perf/util/probe-file.c | 132 +++++----------
24 files changed, 707 insertions(+), 403 deletions(-)
create mode 100644 tools/include/uapi/linux/fcntl.h
create mode 100644 tools/include/uapi/linux/stat.h
create mode 100644 tools/perf/trace/beauty/Build
create mode 100644 tools/perf/trace/beauty/beauty.h
create mode 100644 tools/perf/trace/beauty/statx.c
delete mode 100644 tools/perf/util/alias.c

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.

For this specific pull request the samples/bpf/ build was disabled, as 'make headers_install'
is failing with the following error, in this case in fedora:rawhide:

INSTALL usr/include/uapi/ (0 file)
/git/linux/scripts/Makefile.headersinst:62: *** Missing generated UAPI file ./arch/x86/include/generated/uapi/asm/unistd_32.h. Stop.
make[1]: *** [/git/linux/Makefile:1151: headers_install] Error 2
make[1]: Leaving directory '/tmp/build/linux'
make: *** [Makefile:152: sub-make] Error 2
make: Leaving directory '/git/linux'

I'll investigate later; perf and objtool build just fine, with clang and gcc.
# 'perf test tsc' already fixed by peterz in tip
$ make -C tools/perf build-test
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_gtk2_O: make NO_GTK2=1
make_no_newt_O: make NO_NEWT=1
make_debug_O: make DEBUG=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_tags_O: make tags
make_no_demangle_O: make NO_DEMANGLE=1
make_install_bin_O: make install-bin
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_slang_O: make NO_SLANG=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libbpf_O: make NO_LIBBPF=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_map_o_O: make util/map.o
make_static_O: make LDFLAGS=-static
make_help_O: make help
make_pure_O: make
make_perf_o_O: make perf.o
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_doc_O: make doc
make_no_libelf_O: make NO_LIBELF=1
make_clean_all_O: make clean all
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libaudit_O: make NO_LIBAUDIT=1
make_install_O: make install
make_install_prefix_slash_O: make install prefix=/tmp/krava/

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:06 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

This came from 'git' but isn't documented anywhere in
tools/perf/Documentation/. It looks like baggage we can do without, so
ditch it.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-e7uwkn60t4...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-help.c | 13 -----
tools/perf/perf.c | 97 +++-----------------------------------
tools/perf/util/Build | 1 -
tools/perf/util/alias.c | 78 ------------------------------
tools/perf/util/cache.h | 1 -
tools/perf/util/help-unknown-cmd.c | 8 +---
6 files changed, 8 insertions(+), 190 deletions(-)
delete mode 100644 tools/perf/util/alias.c

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 7ae238929e95..1eec96a0fa67 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -301,12 +301,6 @@ void list_common_cmds_help(void)
}
}

-static int is_perf_command(const char *s)
-{
- return is_in_cmdlist(&main_cmds, s) ||
- is_in_cmdlist(&other_cmds, s);
-}
-
static const char *cmd_to_page(const char *perf_cmd)
{
char *s;
@@ -446,7 +440,6 @@ int cmd_help(int argc, const char **argv)
"perf help [--all] [--man|--web|--info] [command]",
NULL
};
- const char *alias;
int rc;

load_command_list("perf-", &main_cmds, &other_cmds);
@@ -472,12 +465,6 @@ int cmd_help(int argc, const char **argv)
return 0;
}

- alias = alias_lookup(argv[0]);
- if (alias && !is_perf_command(argv[0])) {
- printf("`perf %s' is aliased to `%s'\n", argv[0], alias);
- return 0;
- }
-
switch (help_format) {
case HELP_FORMAT_MAN:
rc = show_man_page(argv[0]);
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 4b283d18e158..9217f2227f3d 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -267,71 +267,6 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
return handled;
}

-static int handle_alias(int *argcp, const char ***argv)
-{
- int envchanged = 0, ret = 0, saved_errno = errno;
- int count, option_count;
- const char **new_argv;
- const char *alias_command;
- char *alias_string;
-
- alias_command = (*argv)[0];
- alias_string = alias_lookup(alias_command);
- if (alias_string) {
- if (alias_string[0] == '!') {
- if (*argcp > 1) {
- struct strbuf buf;
-
- if (strbuf_init(&buf, PATH_MAX) < 0 ||
- strbuf_addstr(&buf, alias_string) < 0 ||
- sq_quote_argv(&buf, (*argv) + 1,
- PATH_MAX) < 0)
- die("Failed to allocate memory.");
- free(alias_string);
- alias_string = buf.buf;
- }
- ret = system(alias_string + 1);
- if (ret >= 0 && WIFEXITED(ret) &&
- WEXITSTATUS(ret) != 127)
- exit(WEXITSTATUS(ret));
- die("Failed to run '%s' when expanding alias '%s'",
- alias_string + 1, alias_command);
- }
- count = split_cmdline(alias_string, &new_argv);
- if (count < 0)
- die("Bad alias.%s string", alias_command);
- option_count = handle_options(&new_argv, &count, &envchanged);
- if (envchanged)
- die("alias '%s' changes environment variables\n"
- "You can use '!perf' in the alias to do this.",
- alias_command);
- memmove(new_argv - option_count, new_argv,
- count * sizeof(char *));
- new_argv -= option_count;
-
- if (count < 1)
- die("empty alias for %s", alias_command);
-
- if (!strcmp(alias_command, new_argv[0]))
- die("recursive alias: %s", alias_command);
-
- new_argv = realloc(new_argv, sizeof(char *) *
- (count + *argcp + 1));
- /* insert after command name */
- memcpy(new_argv + count, *argv + 1, sizeof(char *) * *argcp);
- new_argv[count + *argcp] = NULL;
-
- *argv = new_argv;
- *argcp += count - 1;
-
- ret = 1;
- }
-
- errno = saved_errno;
-
- return ret;
-}
-
#define RUN_SETUP (1<<0)
#define USE_PAGER (1<<1)

@@ -455,25 +390,12 @@ static void execv_dashed_external(const char **argv)

static int run_argv(int *argcp, const char ***argv)
{
- int done_alias = 0;
-
- while (1) {
- /* See if it's an internal command */
- handle_internal_command(*argcp, *argv);
-
- /* .. then try the external ones */
- execv_dashed_external(*argv);
+ /* See if it's an internal command */
+ handle_internal_command(*argcp, *argv);

- /* It could be an alias -- this works around the insanity
- * of overriding "perf log" with "perf show" by having
- * alias.log = show
- */
- if (done_alias || !handle_alias(argcp, argv))
- break;
- done_alias = 1;
- }
-
- return done_alias;
+ /* .. then try the external ones */
+ execv_dashed_external(*argv);
+ return 0;
}

static void pthread__block_sigwinch(void)
@@ -606,17 +528,12 @@ int main(int argc, const char **argv)

while (1) {
static int done_help;
- int was_alias = run_argv(&argc, &argv);
+
+ run_argv(&argc, &argv);

if (errno != ENOENT)
break;

- if (was_alias) {
- fprintf(stderr, "Expansion of alias '%s' failed; "
- "'%s' is not a perf-command\n",
- cmd, argv[0]);
- goto out;
- }
if (!done_help) {
cmd = argv[0] = help_unknown_cmd(cmd);
done_help = 1;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 2ae92da613dd..5c0ea11a8f0a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,4 +1,3 @@
-libperf-y += alias.o
libperf-y += annotate.o
libperf-y += block-range.o
libperf-y += build-id.o
diff --git a/tools/perf/util/alias.c b/tools/perf/util/alias.c
deleted file mode 100644
index 6455471d9cd1..000000000000
--- a/tools/perf/util/alias.c
+++ /dev/null
@@ -1,78 +0,0 @@
-#include "cache.h"
-#include "util.h"
-#include "config.h"
-
-static const char *alias_key;
-static char *alias_val;
-
-static int alias_lookup_cb(const char *k, const char *v,
- void *cb __maybe_unused)
-{
- if (!prefixcmp(k, "alias.") && !strcmp(k+6, alias_key)) {
- if (!v)
- return config_error_nonbool(k);
- alias_val = strdup(v);
- return 0;
- }
- return 0;
-}
-
-char *alias_lookup(const char *alias)
-{
- alias_key = alias;
- alias_val = NULL;
- perf_config(alias_lookup_cb, NULL);
- return alias_val;
-}
-
-int split_cmdline(char *cmdline, const char ***argv)
-{
- int src, dst, count = 0, size = 16;
- char quoted = 0;
-
- *argv = malloc(sizeof(char*) * size);
-
- /* split alias_string */
- (*argv)[count++] = cmdline;
- for (src = dst = 0; cmdline[src];) {
- char c = cmdline[src];
- if (!quoted && isspace(c)) {
- cmdline[dst++] = 0;
- while (cmdline[++src]
- && isspace(cmdline[src]))
- ; /* skip */
- if (count >= size) {
- size += 16;
- *argv = realloc(*argv, sizeof(char*) * size);
- }
- (*argv)[count++] = cmdline + dst;
- } else if (!quoted && (c == '\'' || c == '"')) {
- quoted = c;
- src++;
- } else if (c == quoted) {
- quoted = 0;
- src++;
- } else {
- if (c == '\\' && quoted != '\'') {
- src++;
- c = cmdline[src];
- if (!c) {
- zfree(argv);
- return error("cmdline ends with \\");
- }
- }
- cmdline[dst++] = c;
- src++;
- }
- }
-
- cmdline[dst] = 0;
-
- if (quoted) {
- zfree(argv);
- return error("unclosed quote");
- }
-
- return count;
-}
-
diff --git a/tools/perf/util/cache.h b/tools/perf/util/cache.h
index 512c0c83fbc6..0328f297a748 100644
--- a/tools/perf/util/cache.h
+++ b/tools/perf/util/cache.h
@@ -15,7 +15,6 @@
#define PERF_TRACEFS_ENVIRONMENT "PERF_TRACEFS_DIR"
#define PERF_PAGER_ENVIRONMENT "PERF_PAGER"

-char *alias_lookup(const char *alias);
int split_cmdline(char *cmdline, const char ***argv);

#define alloc_nr(x) (((x)+16)*3/2)
diff --git a/tools/perf/util/help-unknown-cmd.c b/tools/perf/util/help-unknown-cmd.c
index 2821f8d77e52..34201440ac03 100644
--- a/tools/perf/util/help-unknown-cmd.c
+++ b/tools/perf/util/help-unknown-cmd.c
@@ -6,16 +6,12 @@
#include "levenshtein.h"

static int autocorrect;
-static struct cmdnames aliases;

static int perf_unknown_cmd_config(const char *var, const char *value,
void *cb __maybe_unused)
{
if (!strcmp(var, "help.autocorrect"))
autocorrect = perf_config_int(var,value);
- /* Also use aliases for command lookup */
- if (!prefixcmp(var, "alias."))
- add_cmdname(&aliases, var + 6, strlen(var + 6));

return 0;
}
@@ -59,14 +55,12 @@ const char *help_unknown_cmd(const char *cmd)

memset(&main_cmds, 0, sizeof(main_cmds));
memset(&other_cmds, 0, sizeof(main_cmds));
- memset(&aliases, 0, sizeof(aliases));

perf_config(perf_unknown_cmd_config, NULL);

load_command_list("perf-", &main_cmds, &other_cmds);

- if (add_cmd_list(&main_cmds, &aliases) < 0 ||
- add_cmd_list(&main_cmds, &other_cmds) < 0) {
+ if (add_cmd_list(&main_cmds, &other_cmds) < 0) {
fprintf(stderr, "ERROR: Failed to allocate command list for unknown command.\n");
goto end;
}
--
2.9.3

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:06 PM
From: Jiri Olsa <jo...@redhat.com>

Currently we fail in the following case:

$ unset HOME
$ ./perf record ls
$ echo $?
255

It's because the config code init fails due to a missing HOME variable
value. Fix this by skipping the user config init if there's no HOME
variable value.
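
The check can be illustrated with a small standalone sketch. The helper
user_config_path() and its return convention are assumptions made for
this example, not the code in util/config.c: an unset or empty HOME means
"skip the user config", which is distinct from a real error.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build the path of the per-user config file.
 * Returns  0 and fills path when HOME is usable,
 *          1 when HOME is unset or empty (skip the user config, don't fail),
 *         -1 when the resulting path does not fit in path.
 */
static int user_config_path(char *path, size_t size)
{
	const char *home = getenv("HOME");
	int n;

	if (home == NULL || *home == '\0')
		return 1;

	n = snprintf(path, size, "%s/.perfconfig", home);
	if (n < 0 || (size_t)n >= size)
		return -1;

	return 0;
}
```

A caller would treat a return value of 1 the same way as a missing
~/.perfconfig file and carry on with the system-wide settings.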

Reported-by: Jan Stancek <jsta...@redhat.com>
Signed-off-by: Jiri Olsa <jo...@kernel.org>
Cc: David Ahern <dsa...@gmail.com>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <a.p.zi...@chello.nl>
Link: http://lkml.kernel.org/r/2017033014463...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/config.c | 54 +++++++++++++++++++++++++++---------------------
1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 0c7d5a4975cd..7b01d59076d3 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -627,6 +627,8 @@ static int perf_config_set__init(struct perf_config_set *set)
{
int ret = -1;
const char *home = NULL;
+ char *user_config;
+ struct stat st;

/* Setting $PERF_CONFIG makes perf read _only_ the given config file. */
if (config_exclusive_filename)
@@ -637,35 +639,41 @@ static int perf_config_set__init(struct perf_config_set *set)
}

home = getenv("HOME");
- if (perf_config_global() && home) {
- char *user_config = strdup(mkpath("%s/.perfconfig", home));
- struct stat st;

- if (user_config == NULL) {
- warning("Not enough memory to process %s/.perfconfig, "
- "ignoring it.", home);
- goto out;
- }
+ /*
+ * Skip reading user config if:
+ * - there is no place to read it from (HOME)
+ * - we are asked not to (PERF_CONFIG_NOGLOBAL=1)
+ */
+ if (!home || !*home || !perf_config_global())
+ return 0;

- if (stat(user_config, &st) < 0) {
- if (errno == ENOENT)
- ret = 0;
- goto out_free;
- }
+ user_config = strdup(mkpath("%s/.perfconfig", home));
+ if (user_config == NULL) {
+ warning("Not enough memory to process %s/.perfconfig, "
+ "ignoring it.", home);
+ goto out;
+ }
+
+ if (stat(user_config, &st) < 0) {
+ if (errno == ENOENT)
+ ret = 0;
+ goto out_free;
+ }

- ret = 0;
+ ret = 0;

- if (st.st_uid && (st.st_uid != geteuid())) {
- warning("File %s not owned by current user or root, "
- "ignoring it.", user_config);
- goto out_free;
- }
+ if (st.st_uid && (st.st_uid != geteuid())) {
+ warning("File %s not owned by current user or root, "
+ "ignoring it.", user_config);
+ goto out_free;
+ }
+
+ if (st.st_size)
+ ret = perf_config_from_file(collect_config, user_config, set);

- if (st.st_size)
- ret = perf_config_from_file(collect_config, user_config, set);
out_free:
- free(user_config);
- }
+ free(user_config);
out:
return ret;
}
--
2.9.3

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:07 PM
From: Colin Ian King <colin...@canonical.com>

Trivial fix to spelling mistake in pr_debug message.

Signed-off-by: Colin King <colin...@canonical.com>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Krister Johansen <kj...@templeofstupid.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: kernel-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170330095440.1...@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/hist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 3c4d4d00cb2c..61bf304206fd 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2459,7 +2459,7 @@ int parse_filter_percentage(const struct option *opt __maybe_unused,
else if (!strcmp(arg, "absolute"))
symbol_conf.filter_relative = false;
else {
- pr_debug("Invalud percentage: %s\n", arg);
+ pr_debug("Invalid percentage: %s\n", arg);
return -1;
}

--
2.9.3

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:07 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

I found a couple of events using the al, bl, cl and dl registers for
arguments. These are not directly accepted by uprobe_events and thus need
to be mapped to ax, bx, cx and dx respectively.

A few examples:

/usr/bin/qemu-system-s390x
css_adapter_interrupt: 1@%bl
css_chpid_add: 1@%cl 1@%sil 1@%dl
dma_bdrv_io: 8@%rbx 8@%rbp -8@%r14 1@%al

/usr/bin/postgres
buffer__read__done: ... -1@-bash -1@%al
buffer__read__start: ... -1@%al

I didn't find any SDT events using the ah, bh, ... registers, but I also
don't see any reason why they couldn't be used, so there may be rare
events using them, and if so, perf should have renaming logic for them
too.
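
The renaming itself is a plain table lookup. The sketch below is a
standalone illustration under assumed names (struct reg_rename and
rename_reg() are hypothetical, and the '%' prefix is taken from the
argument strings shown above); the patch achieves the same effect by
adding SDT_NAME_REG() entries to sdt_reg_renamings[].

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: SDT register name -> name uprobe_events accepts. */
struct reg_rename {
	const char *sdt_name;
	const char *uprobe_name;
};

static const struct reg_rename reg_renames[] = {
	{ "%al", "%ax" }, { "%ah", "%ax" },
	{ "%bl", "%bx" }, { "%bh", "%bx" },
	{ "%cl", "%cx" }, { "%ch", "%cx" },
	{ "%dl", "%dx" }, { "%dh", "%dx" },
};

/* Return the replacement name, or the original one if no renaming applies. */
static const char *rename_reg(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(reg_renames) / sizeof(reg_renames[0]); i++) {
		if (strcmp(name, reg_renames[i].sdt_name) == 0)
			return reg_renames[i].uprobe_name;
	}

	return name;
}
```

Mapping both the low and the high byte register to the same 16-bit name
is an approximation: the wider register is the closest thing
uprobe_events accepts, as the pull request summary puts it.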

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexis Berlemont <alexis.b...@gmail.com>
Cc: Hemant Kumar <hem...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.31...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/x86/util/perf_regs.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index d8a8dcf761f7..fa1fd196837d 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -40,12 +40,20 @@ struct sdt_name_reg {
static const struct sdt_name_reg sdt_reg_renamings[] = {
SDT_NAME_REG(eax, ax),
SDT_NAME_REG(rax, ax),
+ SDT_NAME_REG(al, ax),
+ SDT_NAME_REG(ah, ax),
SDT_NAME_REG(ebx, bx),
SDT_NAME_REG(rbx, bx),
+ SDT_NAME_REG(bl, bx),
+ SDT_NAME_REG(bh, bx),
SDT_NAME_REG(ecx, cx),
SDT_NAME_REG(rcx, cx),
+ SDT_NAME_REG(cl, cx),
+ SDT_NAME_REG(ch, cx),
SDT_NAME_REG(edx, dx),
SDT_NAME_REG(rdx, dx),
+ SDT_NAME_REG(dl, dx),
+ SDT_NAME_REG(dh, dx),
SDT_NAME_REG(esi, si),
SDT_NAME_REG(rsi, si),
SDT_NAME_REG(sil, si),
--
2.9.3

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:07 PM
From: Jin Yao <yao...@linux.intel.com>

Some platforms, for example Broadwell, don't support cycles for LBR, but
perf always prints cycles:0, which is unnecessary.

The patch refactors the LBR info print code and drops the cycles:0.

For example: perf report --branch-history --no-children --stdio

On Broadwell:
--0.91%--__random_r random_r.c:394 (iterations:2)
__random_r random_r.c:360 (predicted:0.0%)
__random_r random_r.c:380 (predicted:0.0%)
__random_r random_r.c:357

On Skylake:
--1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
main div.c:44 (predicted:52.4% cycles:1)
main div.c:42 (cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
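
The output format shown above can be approximated by the simplified
sketch below. It is not the code in util/callchain.c: the function name
lbr_info_str() is hypothetical, the abort count is left out, and the
prediction rate is passed in as a percentage. It only shows the idea of
emitting space separated fields and skipping the ones that are zero.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format " (predicted:P% cycles:C iterations:I)",
 * leaving out cycles and iterations when they are zero and the prediction
 * rate when it is 100%. Writes an empty string if no field applies.
 * Returns the string length, or -1 if bf is too small.
 */
static int lbr_info_str(char *bf, size_t size, double predicted_percent,
			uint64_t cycles, uint64_t iterations)
{
	char pred[32] = "", cyc[32] = "", iter[32] = "";
	const char *fields[] = { pred, cyc, iter };
	const char *sep = " (";
	size_t i, len = 0;
	int n;

	if (size == 0)
		return -1;
	bf[0] = '\0';

	if (predicted_percent >= 0.0 && predicted_percent < 100.0)
		snprintf(pred, sizeof(pred), "predicted:%.1f%%",
			 predicted_percent);
	if (cycles > 0)
		snprintf(cyc, sizeof(cyc), "cycles:%" PRIu64, cycles);
	if (iterations > 0)
		snprintf(iter, sizeof(iter), "iterations:%" PRIu64, iterations);

	/* Join the non-empty fields: " (" before the first, " " afterwards. */
	for (i = 0; i < sizeof(fields) / sizeof(fields[0]); i++) {
		if (fields[i][0] == '\0')
			continue;
		n = snprintf(bf + len, size - len, "%s%s", sep, fields[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
		sep = " ";
	}

	/* Close the parenthesis only if at least one field was printed. */
	if (len > 0) {
		n = snprintf(bf + len, size - len, ")");
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	return (int)len;
}
```

With cycles == 0, as on Broadwell, a line carries only the fields that
have something to report, or nothing at all.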

Signed-off-by: Yao Jin <yao...@linux.intel.com>
Reviewed-by: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan....@intel.com>
Link: http://lkml.kernel.org/r/1489046786-10061-1-g...@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

---
tools/perf/util/callchain.c | 111 +++++++++++++++++++++++++++++---------------
1 file changed, 74 insertions(+), 37 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index d78776a20e80..3cea1fb5404b 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1105,63 +1105,100 @@ int callchain_branch_counts(struct callchain_root *root,
cycles_count);
}

-static int callchain_counts_printf(FILE *fp, char *bf, int bfsize,
- u64 branch_count, u64 predicted_count,
- u64 abort_count, u64 cycles_count,
- u64 iter_count, u64 samples_count)
+static int counts_str_build(char *bf, int bfsize,
+ u64 branch_count, u64 predicted_count,
+ u64 abort_count, u64 cycles_count,
+ u64 iter_count, u64 samples_count)
{
double predicted_percent = 0.0;
const char *null_str = "";
char iter_str[32];
- char *str;
- u64 cycles = 0;
-
- if (branch_count == 0) {
- if (fp)
- return fprintf(fp, " (calltrace)");
+ char cycle_str[32];
+ char *istr, *cstr;
+ u64 cycles;

+ if (branch_count == 0)
return scnprintf(bf, bfsize, " (calltrace)");
- }
+
+ cycles = cycles_count / branch_count;

if (iter_count && samples_count) {
- scnprintf(iter_str, sizeof(iter_str),
- ", iterations:%" PRId64 "",
- iter_count / samples_count);
- str = iter_str;
+ if (cycles > 0)
+ scnprintf(iter_str, sizeof(iter_str),
+ " iterations:%" PRId64 "",
+ iter_count / samples_count);
+ else
+ scnprintf(iter_str, sizeof(iter_str),
+ "iterations:%" PRId64 "",
+ iter_count / samples_count);
+ istr = iter_str;
+ } else
+ istr = (char *)null_str;
+
+ if (cycles > 0) {
+ scnprintf(cycle_str, sizeof(cycle_str),
+ "cycles:%" PRId64 "", cycles);
+ cstr = cycle_str;
} else
- str = (char *)null_str;
+ cstr = (char *)null_str;

predicted_percent = predicted_count * 100.0 / branch_count;
- cycles = cycles_count / branch_count;

- if ((predicted_percent >= 100.0) && (abort_count == 0)) {
- if (fp)
- return fprintf(fp, " (cycles:%" PRId64 "%s)",
- cycles, str);
+ if ((predicted_count == branch_count) && (abort_count == 0)) {
+ if ((cycles > 0) || (istr != (char *)null_str))
+ return scnprintf(bf, bfsize, " (%s%s)", cstr, istr);
+ else
+ return scnprintf(bf, bfsize, "%s", (char *)null_str);
+ }

- return scnprintf(bf, bfsize, " (cycles:%" PRId64 "%s)",
- cycles, str);
+ if ((predicted_count < branch_count) && (abort_count == 0)) {
+ if ((cycles > 0) || (istr != (char *)null_str))
+ return scnprintf(bf, bfsize,
+ " (predicted:%.1f%% %s%s)",
+ predicted_percent, cstr, istr);
+ else {
+ return scnprintf(bf, bfsize,
+ " (predicted:%.1f%%)",
+ predicted_percent);
+ }
}

- if ((predicted_percent < 100.0) && (abort_count == 0)) {
- if (fp)
- return fprintf(fp,
- " (predicted:%.1f%%, cycles:%" PRId64 "%s)",
- predicted_percent, cycles, str);
+ if ((predicted_count == branch_count) && (abort_count > 0)) {
+ if ((cycles > 0) || (istr != (char *)null_str))
+ return scnprintf(bf, bfsize,
+ " (abort:%" PRId64 " %s%s)",
+ abort_count, cstr, istr);
+ else
+ return scnprintf(bf, bfsize,
+ " (abort:%" PRId64 ")",
+ abort_count);
+ }

+ if ((cycles > 0) || (istr != (char *)null_str))
return scnprintf(bf, bfsize,
- " (predicted:%.1f%%, cycles:%" PRId64 "%s)",
- predicted_percent, cycles, str);
- }
+ " (predicted:%.1f%% abort:%" PRId64 " %s%s)",
+ predicted_percent, abort_count, cstr, istr);
+
+ return scnprintf(bf, bfsize,
+ " (predicted:%.1f%% abort:%" PRId64 ")",
+ predicted_percent, abort_count);
+}
+
+static int callchain_counts_printf(FILE *fp, char *bf, int bfsize,
+ u64 branch_count, u64 predicted_count,
+ u64 abort_count, u64 cycles_count,
+ u64 iter_count, u64 samples_count)
+{
+ char str[128];
+
+ counts_str_build(str, sizeof(str), branch_count,
+ predicted_count, abort_count, cycles_count,
+ iter_count, samples_count);

if (fp)
- return fprintf(fp,
- " (predicted:%.1f%%, abort:%" PRId64 ", cycles:%" PRId64 "%s)",
- predicted_percent, abort_count, cycles, str);
+ return fprintf(fp, "%s", str);

- return scnprintf(bf, bfsize,
- " (predicted:%.1f%%, abort:%" PRId64 ", cycles:%" PRId64 "%s)",
- predicted_percent, abort_count, cycles, str);
+ return scnprintf(bf, bfsize, "%s", str);
}

int callchain_list_counts__printf_value(struct callchain_node *node,
--
2.9.3

Arnaldo Carvalho de Melo

Mar 31, 2017, 10:20:07 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

To test it, build samples/statx/test-statx.c, which I did as:

$ make headers_install
$ cc -I ~/git/linux/usr/include samples/statx/test-statx.c -o /tmp/statx

And then use perf trace on it:

# perf trace -e statx /tmp/statx /etc/passwd
statx(/etc/passwd) = 0
results=7ff
Size: 3496 Blocks: 8 IO Block: 4096 regular file
Device: fd:00 Inode: 280156 Links: 1
Access: (0644/-rw-r--r--) Uid: 0 Gid: 0
Access: 2017-03-29 16:01:01.650073438-0300
Modify: 2017-03-10 16:25:14.156479354-0300
Change: 2017-03-10 16:25:14.171479328-0300
0.000 ( 0.007 ms): statx/30648 statx(dfd: CWD, filename: 0x7ef503f4, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff7ef4eb10) = 0
#

Using the test-statx.c options to change the mask:

# perf trace -e statx /tmp/statx -O /etc/passwd > /dev/null
0.000 ( 0.008 ms): statx/30745 statx(dfd: CWD, filename: 0x3a0753f4, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffd3a0735c0) = 0
#
# perf trace -e statx /tmp/statx -A /etc/passwd > /dev/null
0.000 ( 0.010 ms): statx/30757 statx(dfd: CWD, filename: 0xa94e63f4, flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffea94e49d0) = 0
#
# trace --no-inherit -e statx /tmp/statx -F /etc/passwd > /dev/null
0.000 ( 0.011 ms): statx(dfd: CWD, filename: 0x3b02d3f3, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffd3b02c850) = 0
#
# trace --no-inherit -e statx /tmp/statx -F -L /etc/passwd > /dev/null
0.000 ( 0.008 ms): statx(dfd: CWD, filename: 0x15cff3f3, flags: STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff15cfdda0) = 0
#
# trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
0.000 ( 0.009 ms): statx(dfd: CWD, filename: 0xfa37f3f3, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffffa37da20) = 0
#

Adding a probe to get the filename collected as well:

# perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_getname -aR sleep 1

# trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
0.169 ( 0.007 ms): statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
#

The same technique could be used to collect and beautify the result put in
the 'buffer' argument.

Finally do a system wide 'perf trace' session looking for any use of statx,
then run the test proggie with various flags:

# trace -e statx
16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
33064.447 ( 0.011 ms): statx/4569 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffc5484c790) = 0
36050.891 ( 0.023 ms): statx/4576 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffeb18b66e0) = 0
38039.889 ( 0.023 ms): statx/4584 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff1db0ea90) = 0
^C#

This one also starts moving the beautifiers from files directly included
in builtin-trace.c to separate objects + a beauty.h header with
prototypes, so that we can add test cases in tools/perf/tests/ to fire
syscalls with various arguments and then get them intercepted as
syscalls:sys_enter_foo or raw_syscalls:sys_enter + sys_exit to then
format and check that the formatted output is the one we expect.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Al Viro <vi...@zeniv.linux.org.uk>
Cc: David Ahern <dsa...@gmail.com>
Cc: David Howells <dhow...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-xvzw8eynff...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 1 +
tools/perf/builtin-trace.c | 14 ++---
tools/perf/trace/beauty/Build | 1 +
tools/perf/trace/beauty/beauty.h | 24 ++++++++
tools/perf/trace/beauty/statx.c | 72 +++++++++++++++++++++++
6 files changed, 104 insertions(+), 9 deletions(-)
create mode 100644 tools/perf/trace/beauty/Build
create mode 100644 tools/perf/trace/beauty/beauty.h
create mode 100644 tools/perf/trace/beauty/statx.c

diff --git a/tools/perf/Build b/tools/perf/Build
index 9b79f8d7db50..bd8eeb60533c 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -50,5 +50,6 @@ libperf-y += util/
libperf-y += arch/
libperf-y += ui/
libperf-y += scripts/
+libperf-y += trace/beauty/

gtk-y += ui/gtk/
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index e93ef0b38db8..5aef183e2f85 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -338,6 +338,7 @@
329 common pkey_mprotect sys_pkey_mprotect
330 common pkey_alloc sys_pkey_alloc
331 common pkey_free sys_pkey_free
+332 common statx sys_statx

#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 7379792a6504..fce278d5fada 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -31,6 +31,7 @@
#include "util/intlist.h"
#include "util/thread_map.h"
#include "util/stat.h"
+#include "trace/beauty/beauty.h"
#include "trace-event.h"
#include "util/parse-events.h"
#include "util/bpf-loader.h"
@@ -267,15 +268,6 @@ static struct perf_evsel *perf_evsel__syscall_newtp(const char *direction, void
({ struct syscall_tp *fields = evsel->priv; \
fields->name.pointer(&fields->name, sample); })

-struct syscall_arg {
- unsigned long val;
- struct thread *thread;
- struct trace *trace;
- void *parm;
- u8 idx;
- u8 mask;
-};
-
struct strarray {
int offset;
int nr_entries;
@@ -771,6 +763,10 @@ static struct syscall_fmt {
.arg_parm = { [0] = &strarray__socket_families, /* family */ }, },
{ .name = "stat", .errmsg = true, .alias = "newstat", },
{ .name = "statfs", .errmsg = true, },
+ { .name = "statx", .errmsg = true,
+ .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */
+ [2] = SCA_STATX_FLAGS, /* flags */
+ [3] = SCA_STATX_MASK, /* mask */ }, },
{ .name = "swapoff", .errmsg = true,
.arg_scnprintf = { [0] = SCA_FILENAME, /* specialfile */ }, },
{ .name = "swapon", .errmsg = true,
diff --git a/tools/perf/trace/beauty/Build b/tools/perf/trace/beauty/Build
new file mode 100644
index 000000000000..be95ac6ce845
--- /dev/null
+++ b/tools/perf/trace/beauty/Build
@@ -0,0 +1 @@
+libperf-y += statx.o
diff --git a/tools/perf/trace/beauty/beauty.h b/tools/perf/trace/beauty/beauty.h
new file mode 100644
index 000000000000..cf50be3f17a4
--- /dev/null
+++ b/tools/perf/trace/beauty/beauty.h
@@ -0,0 +1,24 @@
+#ifndef _PERF_TRACE_BEAUTY_H
+#define _PERF_TRACE_BEAUTY_H
+
+#include <linux/types.h>
+
+struct trace;
+struct thread;
+
+struct syscall_arg {
+ unsigned long val;
+ struct thread *thread;
+ struct trace *trace;
+ void *parm;
+ u8 idx;
+ u8 mask;
+};
+
+size_t syscall_arg__scnprintf_statx_flags(char *bf, size_t size, struct syscall_arg *arg);
+#define SCA_STATX_FLAGS syscall_arg__scnprintf_statx_flags
+
+size_t syscall_arg__scnprintf_statx_mask(char *bf, size_t size, struct syscall_arg *arg);
+#define SCA_STATX_MASK syscall_arg__scnprintf_statx_mask
+
+#endif /* _PERF_TRACE_BEAUTY_H */
diff --git a/tools/perf/trace/beauty/statx.c b/tools/perf/trace/beauty/statx.c
new file mode 100644
index 000000000000..5643b692af4c
--- /dev/null
+++ b/tools/perf/trace/beauty/statx.c
@@ -0,0 +1,72 @@
+/*
+ * trace/beauty/statx.c
+ *
+ * Copyright (C) 2017, Red Hat Inc, Arnaldo Carvalho de Melo <ac...@redhat.com>
+ *
+ * Released under the GPL v2. (and only v2, not any later version)
+ */
+
+#include "trace/beauty/beauty.h"
+#include <linux/kernel.h>
+#include <sys/types.h>
+#include <uapi/linux/fcntl.h>
+#include <uapi/linux/stat.h>
+
+size_t syscall_arg__scnprintf_statx_flags(char *bf, size_t size, struct syscall_arg *arg)
+{
+ int printed = 0, flags = arg->val;
+
+ if (flags == 0)
+ return scnprintf(bf, size, "SYNC_AS_STAT");
+#define P_FLAG(n) \
+ if (flags & AT_##n) { \
+ printed += scnprintf(bf + printed, size - printed, "%s%s", printed ? "|" : "", #n); \
+ flags &= ~AT_##n; \
+ }
+
+ P_FLAG(SYMLINK_NOFOLLOW);
+ P_FLAG(REMOVEDIR);
+ P_FLAG(SYMLINK_FOLLOW);
+ P_FLAG(NO_AUTOMOUNT);
+ P_FLAG(EMPTY_PATH);
+ P_FLAG(STATX_FORCE_SYNC);
+ P_FLAG(STATX_DONT_SYNC);
+
+#undef P_FLAG
+
+ if (flags)
+ printed += scnprintf(bf + printed, size - printed, "%s%#x", printed ? "|" : "", flags);
+
+ return printed;
+}
+
+size_t syscall_arg__scnprintf_statx_mask(char *bf, size_t size, struct syscall_arg *arg)
+{
+ int printed = 0, flags = arg->val;
+
+#define P_FLAG(n) \
+ if (flags & STATX_##n) { \
+ printed += scnprintf(bf + printed, size - printed, "%s%s", printed ? "|" : "", #n); \
+ flags &= ~STATX_##n; \
+ }
+
+ P_FLAG(TYPE);
+ P_FLAG(MODE);
+ P_FLAG(NLINK);
+ P_FLAG(UID);
+ P_FLAG(GID);
+ P_FLAG(ATIME);
+ P_FLAG(MTIME);
+ P_FLAG(CTIME);
+ P_FLAG(INO);
+ P_FLAG(SIZE);
+ P_FLAG(BLOCKS);
+ P_FLAG(BTIME);
+
+#undef P_FLAG
+
+ if (flags)
+ printed += scnprintf(bf + printed, size - printed, "%s%#x", printed ? "|" : "", flags);
+
+ return printed;
+}
--
2.9.3
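
For readers who want to try the P_FLAG pattern from statx.c outside the
perf tree, here is a minimal standalone sketch. It is not the code in the
patch: it assumes a table-driven variant of the macro, and the names
sketch_scnprintf, struct flag_name, demo_flags and beautify_flags, as well
as the bit values, are made up for illustration. sketch_scnprintf stands in
for perf's scnprintf, which returns the number of characters actually
stored rather than the length that would have been needed.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Stand-in for scnprintf(): like snprintf(), but returns the number of
 * characters actually stored in buf (not counting the trailing NUL).
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

struct flag_name {
	unsigned int bit;
	const char *name;
};

/* Hypothetical bit values, chosen only for this demo. */
static const struct flag_name demo_flags[] = {
	{ 0x1, "TYPE"  },
	{ 0x2, "MODE"  },
	{ 0x4, "NLINK" },
};

/* Format 'flags' as NAME|NAME|..., with unknown bits appended in hex. */
static size_t beautify_flags(char *bf, size_t size, unsigned int flags)
{
	size_t printed = 0;
	size_t i;

	if (size)
		bf[0] = '\0';

	for (i = 0; i < sizeof(demo_flags) / sizeof(demo_flags[0]); i++) {
		if (!(flags & demo_flags[i].bit))
			continue;
		printed += sketch_scnprintf(bf + printed, size - printed, "%s%s",
					    printed ? "|" : "", demo_flags[i].name);
		/* Clear the bit so that only unknown bits remain at the end. */
		flags &= ~demo_flags[i].bit;
	}

	if (flags)
		printed += sketch_scnprintf(bf + printed, size - printed, "%s%#x",
					    printed ? "|" : "", flags);

	return printed;
}
```

Clearing each known bit as it is printed is what lets the final check
report leftover, unrecognized bits in hex, the same way both beautifiers
in the patch do.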

Ingo Molnar

Apr 1, 2017, 6:50:08 AM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Arnaldo Carvalho de Melo

Apr 2, 2017, 3:20:06 PM
Commit-ID: fd5cead23f54697310bd565aa2a23ae5128080a0
Gitweb: http://git.kernel.org/tip/fd5cead23f54697310bd565aa2a23ae5128080a0
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 14 Mar 2017 16:19:30 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Fri, 31 Mar 2017 14:42:31 -0300

perf trace: Beautify statx syscall 'flag' and 'mask' arguments

# trace -e statx
16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
33064.447 ( 0.011 ms): statx/4569 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffc5484c790) = 0
36050.891 ( 0.023 ms): statx/4576 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffeb18b66e0) = 0
38039.889 ( 0.023 ms): statx/4584 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff1db0ea90) = 0
^C#

This one also starts moving the beautifiers from files directly included
in builtin-trace.c to separate objects + a beauty.h header with
prototypes, so that we can add test cases in tools/perf/tests/ to fire
syscalls with various arguments and then get them intercepted as
syscalls:sys_enter_foo or raw_syscalls:sys_enter + sys_exit to then
format and check that the formatted output is the one we expect.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: Al Viro <vi...@zeniv.linux.org.uk>
Cc: David Ahern <dsa...@gmail.com>
Cc: David Howells <dhow...@redhat.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-xvzw8eynff...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/Build | 1 +
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 1 +
tools/perf/builtin-trace.c | 14 ++---
tools/perf/trace/beauty/Build | 1 +
tools/perf/trace/beauty/beauty.h | 24 ++++++++
tools/perf/trace/beauty/statx.c | 72 +++++++++++++++++++++++
6 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Build b/tools/perf/Build
index 9b79f8d..bd8eeb6 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -50,5 +50,6 @@ libperf-y += util/
libperf-y += arch/
libperf-y += ui/
libperf-y += scripts/
+libperf-y += trace/beauty/

gtk-y += ui/gtk/
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index e93ef0b..5aef183 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -338,6 +338,7 @@
329 common pkey_mprotect sys_pkey_mprotect
330 common pkey_alloc sys_pkey_alloc
331 common pkey_free sys_pkey_free
+332 common statx sys_statx

#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 7379792..fce278d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -31,6 +31,7 @@
#include "util/intlist.h"
#include "util/thread_map.h"
#include "util/stat.h"
+#include "trace/beauty/beauty.h"
#include "trace-event.h"
#include "util/parse-events.h"
#include "util/bpf-loader.h"
@@ -267,15 +268,6 @@ out_delete:
index 0000000..be95ac6
--- /dev/null
+++ b/tools/perf/trace/beauty/Build
@@ -0,0 +1 @@
+libperf-y += statx.o
diff --git a/tools/perf/trace/beauty/beauty.h b/tools/perf/trace/beauty/beauty.h
new file mode 100644
index 0000000..cf50be3
index 0000000..5643b69

tip-bot for Arnaldo Carvalho de Melo

Apr 2, 2017, 3:20:06 PM
Commit-ID: c68677014bace6a4b6ad20f0818e1470d049618f
Gitweb: http://git.kernel.org/tip/c68677014bace6a4b6ad20f0818e1470d049618f
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 28 Mar 2017 11:19:59 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Tue, 28 Mar 2017 11:19:59 -0300

perf tools: Remove support for command aliases

This came from 'git', but isn't documented anywhere in
tools/perf/Documentation/, looks like baggage we can do without, ditch
it.

Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/n/tip-e7uwkn60t4...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/builtin-help.c | 13 -----
tools/perf/perf.c | 97 +++-----------------------------------
tools/perf/util/Build | 1 -
tools/perf/util/alias.c | 78 ------------------------------
tools/perf/util/cache.h | 1 -
tools/perf/util/help-unknown-cmd.c | 8 +---
6 files changed, 8 insertions(+), 190 deletions(-)

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 7ae2389..1eec96a 100644
index 4b283d1..9217f22 100644
@@ -455,25 +390,12 @@ do_die:
index 2ae92da..5c0ea11 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,4 +1,3 @@
-libperf-y += alias.o
libperf-y += annotate.o
libperf-y += block-range.o
libperf-y += build-id.o
diff --git a/tools/perf/util/alias.c b/tools/perf/util/alias.c
deleted file mode 100644
index 6455471..0000000
- cmdline[dst] = 0;
-
- if (quoted) {
- zfree(argv);
- return error("unclosed quote");
- }
-
- return count;
-}
-
diff --git a/tools/perf/util/cache.h b/tools/perf/util/cache.h
index 512c0c8..0328f29 100644
--- a/tools/perf/util/cache.h
+++ b/tools/perf/util/cache.h
@@ -15,7 +15,6 @@
#define PERF_TRACEFS_ENVIRONMENT "PERF_TRACEFS_DIR"
#define PERF_PAGER_ENVIRONMENT "PERF_PAGER"

-char *alias_lookup(const char *alias);
int split_cmdline(char *cmdline, const char ***argv);

#define alloc_nr(x) (((x)+16)*3/2)
diff --git a/tools/perf/util/help-unknown-cmd.c b/tools/perf/util/help-unknown-cmd.c
index 2821f8d..3420144 100644

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:05 PM
From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

An SDT marker argument is in N@OP format, where OP is an arch-dependent
component. Add powerpc logic to parse OP and convert it to a
uprobe-compatible format.

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhir...@kernel.org>
Cc: Alexander Shishkin <alexander...@linux.intel.com>
Cc: Alexis Berlemont <alexis.b...@gmail.com>
Cc: Hemant Kumar <hem...@linux.vnet.ibm.com>
Cc: Michael Ellerman <m...@ellerman.id.au>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.31...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/arch/powerpc/util/perf_regs.c | 111 +++++++++++++++++++++++++++++++
1 file changed, 111 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index a3c3e1ce6807..4268f7762e25 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -1,5 +1,10 @@
+#include <string.h>
+#include <regex.h>
+
#include "../../perf.h"
+#include "../../util/util.h"
#include "../../util/perf_regs.h"
+#include "../../util/debug.h"

const struct sample_reg sample_reg_masks[] = {
SMPL_REG(r0, PERF_REG_POWERPC_R0),
@@ -47,3 +52,109 @@ const struct sample_reg sample_reg_masks[] = {
SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
SMPL_REG_END
};
+
+/* REG or %rREG */
+#define SDT_OP_REGEX1 "^(%r)?([1-2]?[0-9]|3[0-1])$"
+
+/* -NUM(REG) or NUM(REG) or -NUM(%rREG) or NUM(%rREG) */
+#define SDT_OP_REGEX2 "^(\\-)?([0-9]+)\\((%r)?([1-2]?[0-9]|3[0-1])\\)$"
+
+static regex_t sdt_op_regex1, sdt_op_regex2;
+
+static int sdt_init_op_regex(void)
+{
+ static int initialized;
+ int ret = 0;
+
+ if (initialized)
+ return 0;
+
+ ret = regcomp(&sdt_op_regex1, SDT_OP_REGEX1, REG_EXTENDED);
+ if (ret)
+ goto error;
+
+ ret = regcomp(&sdt_op_regex2, SDT_OP_REGEX2, REG_EXTENDED);
+ if (ret)
+ goto free_regex1;
+
+ initialized = 1;
+ return 0;
+
+free_regex1:
+ regfree(&sdt_op_regex1);
+error:
+ pr_debug4("Regex compilation error.\n");
+ return ret;
+}
+
+/*
+ * Parse OP and convert it into uprobe format, which is, +/-NUM(%gprREG).
+ * Possible variants of OP are:
+ * Format Example
+ * -------------------------
+ * NUM(REG) 48(18)
+ * -NUM(REG) -48(18)
+ * NUM(%rREG) 48(%r18)
+ * -NUM(%rREG) -48(%r18)
+ * REG 18
+ * %rREG %r18
+ * iNUM i0
+ * i-NUM i-1
+ *
+ * SDT marker arguments on powerpc use the %rREG form with the -mregnames flag
+ * and the REG form with -mno-regnames. Here REG is a general purpose register
+ * number, in the range 0 to 31.
+ */
+int arch_sdt_arg_parse_op(char *old_op, char **new_op)
+{
+ int ret, new_len;
+ regmatch_t rm[5];
+ char prefix;
+
+ /* Constant argument. Uprobe does not support it */
+ if (old_op[0] == 'i') {
+ pr_debug4("Skipping unsupported SDT argument: %s\n", old_op);
+ return SDT_ARG_SKIP;
+ }
+
+ ret = sdt_init_op_regex();
+ if (ret < 0)
+ return ret;
+
+ if (!regexec(&sdt_op_regex1, old_op, 3, rm, 0)) {
+ /* REG or %rREG --> %gprREG */
+
+ new_len = 5; /* % g p r NULL */
+ new_len += (int)(rm[2].rm_eo - rm[2].rm_so);
+
+ *new_op = zalloc(new_len);
+ if (!*new_op)
+ return -ENOMEM;
+
+ scnprintf(*new_op, new_len, "%%gpr%.*s",
+ (int)(rm[2].rm_eo - rm[2].rm_so), old_op + rm[2].rm_so);
+ } else if (!regexec(&sdt_op_regex2, old_op, 5, rm, 0)) {
+ /*
+ * -NUM(REG) or NUM(REG) or -NUM(%rREG) or NUM(%rREG) -->
+ * +/-NUM(%gprREG)
+ */
+ prefix = (rm[1].rm_so == -1) ? '+' : '-';
+
+ new_len = 8; /* +/- ( % g p r ) NULL */
+ new_len += (int)(rm[2].rm_eo - rm[2].rm_so);
+ new_len += (int)(rm[4].rm_eo - rm[4].rm_so);
+
+ *new_op = zalloc(new_len);
+ if (!*new_op)
+ return -ENOMEM;
+
+ scnprintf(*new_op, new_len, "%c%.*s(%%gpr%.*s)", prefix,
+ (int)(rm[2].rm_eo - rm[2].rm_so), old_op + rm[2].rm_so,
+ (int)(rm[4].rm_eo - rm[4].rm_so), old_op + rm[4].rm_so);
+ } else {
+ pr_debug4("Skipping unsupported SDT argument: %s\n", old_op);
+ return SDT_ARG_SKIP;
+ }
+
+ return SDT_ARG_VALID;
+}
--
2.9.3
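
The OP conversion above can be tried outside the perf tree with a small
standalone sketch. This is an illustration under stated assumptions, not
the patch itself: the function name sdt_op_to_uprobe and the macro names
are made up, the regexes are compiled on every call instead of being
cached, the result goes into a caller-supplied buffer rather than a
zalloc()ed string, and the minus sign is matched with (-)? instead of the
escaped form used in the patch.

```c
#include <regex.h>
#include <stdio.h>

/* REG or %rREG, with REG in 0..31 */
#define OP_REGEX_REG "^(%r)?([1-2]?[0-9]|3[0-1])$"
/* NUM(REG), -NUM(REG), NUM(%rREG) or -NUM(%rREG) */
#define OP_REGEX_MEM "^(-)?([0-9]+)\\((%r)?([1-2]?[0-9]|3[0-1])\\)$"

/*
 * Convert a powerpc SDT operand into the uprobe form described in the
 * patch: %gprREG or +/-NUM(%gprREG).
 * Returns 0 on success, -1 if the operand is not in a supported form.
 */
static int sdt_op_to_uprobe(const char *old_op, char *out, size_t outsz)
{
	regex_t re_reg, re_mem;
	regmatch_t rm[5];
	int ret = -1;

	/* regcomp() returns 0 on success and a nonzero error code on failure. */
	if (regcomp(&re_reg, OP_REGEX_REG, REG_EXTENDED))
		return -1;
	if (regcomp(&re_mem, OP_REGEX_MEM, REG_EXTENDED)) {
		regfree(&re_reg);
		return -1;
	}

	if (!regexec(&re_reg, old_op, 3, rm, 0)) {
		/* Group 2 holds the register number. */
		snprintf(out, outsz, "%%gpr%.*s",
			 (int)(rm[2].rm_eo - rm[2].rm_so), old_op + rm[2].rm_so);
		ret = 0;
	} else if (!regexec(&re_mem, old_op, 5, rm, 0)) {
		/* Group 1 did not take part in the match: no leading minus. */
		char sign = (rm[1].rm_so == -1) ? '+' : '-';

		snprintf(out, outsz, "%c%.*s(%%gpr%.*s)", sign,
			 (int)(rm[2].rm_eo - rm[2].rm_so), old_op + rm[2].rm_so,
			 (int)(rm[4].rm_eo - rm[4].rm_so), old_op + rm[4].rm_so);
		ret = 0;
	}

	regfree(&re_reg);
	regfree(&re_mem);
	return ret;
}
```

With this sketch, "18" and "%r18" both become "%gpr18", "48(18)" becomes
"+48(%gpr18)", and constant operands such as "i0" are rejected, matching
the table in the comment above arch_sdt_arg_parse_op().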

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:05 PM
From: Andi Kleen <a...@linux.intel.com>

Add a missing space in the JSON description after the uncore unit.

Before:

perf list
...
unc_arb_coh_trk_requests.all
[Unit: uncore_arbNumber of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]
...

After:

unc_arb_coh_trk_requests.all
[Unit: uncore_arb Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-p989c7x9ka...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/pmu-events/jevents.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 3a151c35852d..baa073f38334 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -469,6 +469,7 @@ int json_events(const char *fn,
}
addfield(map, &desc, ". ", "Unit: ", NULL);
addfield(map, &desc, "", pmu, NULL);
+ addfield(map, &desc, "", " ", NULL);
} else if (json_streq(map, field, "Filter")) {
addfield(map, &filter, "", "", val);
} else if (json_streq(map, field, "ScaleUnit")) {
--
2.9.3

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:05 PM
Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit fcc309e618c9e9ac4ede010d87522b0689549658:

Merge tag 'perf-core-for-mingo-4.12-20170331' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-04-01 12:43:40 +0200)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170404

for you to fetch changes up to 99094a5e941fe88d95cbd594e6a41bee24003ecb:

perf annotate: Fix missing number of samples for source_line_samples (2017-04-04 21:08:00 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Add missing number of samples in 'perf annotate --stdio -l --show-total-period'
(Taeung Song)

Vendor events updates:

- Add uncore_arb Intel vendor events in JSON format (Andi Kleen)

- Add uncore vendor events for Intel's Sandy Bridge, Ivy Bridge,
Haswell, Broadwell and Skylake architectures (Andi Kleen)

- Add missing UNC_M_DCLOCKTICKS Intel Broadwell DE uncore vendor event (Andi Kleen)

Infrastructure:

- Remove some more die() calls, avoiding sudden death in library code
(Arnaldo Carvalho de Melo)

- Add argument support for SDT events in powerpc (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>

----------------------------------------------------------------
Andi Kleen (8):
perf vendor events intel: Add missing UNC_M_DCLOCKTICKS for Broadwell DE uncore
perf vendor events intel: Add uncore events for Sandy Bridge client
perf vendor events intel: Add uncore events for Ivy Bridge client
perf vendor events intel: Add uncore events for Haswell client
perf vendor events intel: Add uncore events for Broadwell client
perf vendor events intel: Add uncore events for Skylake client
perf vendor events intel: Add uncore_arb JSON support
perf vendor events intel: Add missing space in json descriptions

Arnaldo Carvalho de Melo (4):
Merge branch 'perf/uncore-json-updates-1' of git://git.kernel.org/.../ak/linux-misc into perf/core
perf tools: Remove die() call
perf tools: Handle allocation failures gracefully
perf tools: Don't die on a print function

Ravi Bangoria (1):
perf sdt powerpc: Add argument support

Taeung Song (1):
perf annotate: Fix missing number of samples for source_line_samples

tools/perf/arch/powerpc/util/perf_regs.c | 111 ++++++
tools/perf/perf.c | 3 +-
.../perf/pmu-events/arch/x86/broadwell/uncore.json | 278 +++++++++++++++
.../arch/x86/broadwellde/uncore-memory.json | 13 +-
tools/perf/pmu-events/arch/x86/haswell/uncore.json | 374 +++++++++++++++++++++
.../perf/pmu-events/arch/x86/ivybridge/uncore.json | 314 +++++++++++++++++
.../pmu-events/arch/x86/sandybridge/uncore.json | 314 +++++++++++++++++
tools/perf/pmu-events/arch/x86/skylake/uncore.json | 254 ++++++++++++++
tools/perf/pmu-events/jevents.c | 2 +
tools/perf/util/annotate.c | 6 +-
tools/perf/util/annotate.h | 2 +-
tools/perf/util/values.c | 63 +++-
12 files changed, 1710 insertions(+), 24 deletions(-)
create mode 100644 tools/perf/pmu-events/arch/x86/broadwell/uncore.json
create mode 100644 tools/perf/pmu-events/arch/x86/haswell/uncore.json
create mode 100644 tools/perf/pmu-events/arch/x86/ivybridge/uncore.json
create mode 100644 tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
create mode 100644 tools/perf/pmu-events/arch/x86/skylake/uncore.json
# 'perf test tsc' already fixed by peterz in tip, need to update this kernel :-\
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_debug_O: make DEBUG=1
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_help_O: make help
make_tags_O: make tags
make_no_newt_O: make NO_NEWT=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_perf_o_O: make perf.o
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_clean_all_O: make clean all
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libperl_O: make NO_LIBPERL=1
make_pure_O: make
make_no_gtk2_O: make NO_GTK2=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_install_O: make install
make_no_libaudit_O: make NO_LIBAUDIT=1
make_install_bin_O: make install-bin
make_no_libpython_O: make NO_LIBPYTHON=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_util_map_o_O: make util/map.o
make_no_demangle_O: make NO_DEMANGLE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_doc_O: make doc
make_no_libbpf_O: make NO_LIBBPF=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_static_O: make LDFLAGS=-static

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:05 PM
From: Andi Kleen <a...@linux.intel.com>

An earlier update removed the UNC_M_CLOCKTICKS event for Broadwell DE,
but the metric events were still referring to it.
This adds it back to the event list under a different name,
UNC_M_DCLOCKTICKS, and also fixes up the metric events to use the new name.

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-zxxzg4g5nr...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../perf/pmu-events/arch/x86/broadwellde/uncore-memory.json | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
index fa09e12018ce..f4b0745cdbbf 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
@@ -20,11 +20,18 @@
"Unit": "iMC"
},
{
+ "BriefDescription": "Memory controller clock ticks",
+ "Counter": "0,1,2,3",
+ "EventName": "UNC_M_DCLOCKTICKS",
+ "PerPkg": "1",
+ "Unit": "iMC"
+ },
+ {
"BriefDescription": "Cycles where DRAM ranks are in power down (CKE) mode",
"Counter": "0,1,2,3",
"EventCode": "0x85",
"EventName": "UNC_M_POWER_CHANNEL_PPD",
- "MetricExpr": "(UNC_M_POWER_CHANNEL_PPD / UNC_M_CLOCKTICKS) * 100.",
+ "MetricExpr": "(UNC_M_POWER_CHANNEL_PPD / UNC_M_DCLOCKTICKS) * 100.",
"MetricName": "power_channel_ppd %",
"PerPkg": "1",
"Unit": "iMC"
@@ -34,7 +41,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x86",
"EventName": "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES",
- "MetricExpr": "(UNC_M_POWER_CRITICAL_THROTTLE_CYCLES / UNC_M_CLOCKTICKS) * 100.",
+ "MetricExpr": "(UNC_M_POWER_CRITICAL_THROTTLE_CYCLES / UNC_M_DCLOCKTICKS) * 100.",
"MetricName": "power_critical_throttle_cycles %",
"PerPkg": "1",
"Unit": "iMC"
@@ -44,7 +51,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x43",
"EventName": "UNC_M_POWER_SELF_REFRESH",
- "MetricExpr": "(UNC_M_POWER_SELF_REFRESH / UNC_M_CLOCKTICKS) * 100.",
+ "MetricExpr": "(UNC_M_POWER_SELF_REFRESH / UNC_M_DCLOCKTICKS) * 100.",
"MetricName": "power_self_refresh %",
"PerPkg": "1",
"Unit": "iMC"
--
2.9.3
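
As a worked example of what the fixed MetricExpr strings compute, the
sketch below evaluates "(EVENT / UNC_M_DCLOCKTICKS) * 100." for a pair of
raw counts. The function name metric_percent is made up for illustration,
and returning 0 when no clock ticks were counted is an assumption of this
sketch, not something the patch or the JSON specifies.

```c
#include <stdint.h>

/*
 * Percentage of memory controller clock ticks during which an event was
 * counted, mirroring "(EVENT / UNC_M_DCLOCKTICKS) * 100." from the JSON.
 * Assumption: report 0 when dclockticks is 0 to avoid dividing by zero.
 */
static double metric_percent(uint64_t event_count, uint64_t dclockticks)
{
	if (dclockticks == 0)
		return 0.0;

	return ((double)event_count / (double)dclockticks) * 100.0;
}
```

For instance, 250 UNC_M_POWER_SELF_REFRESH cycles over 1000
UNC_M_DCLOCKTICKS gives a power_self_refresh value of 25%.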

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:06 PM
From: Andi Kleen <a...@linux.intel.com>

Add V18 of Broadwell uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-xlbguqdzho...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../perf/pmu-events/arch/x86/broadwell/uncore.json | 278 +++++++++++++++++++++
1 file changed, 278 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/broadwell/uncore.json

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/uncore.json b/tools/perf/pmu-events/arch/x86/broadwell/uncore.json
new file mode 100644
index 000000000000..28e1e159a3cb
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/broadwell/uncore.json
@@ -0,0 +1,278 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x44",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x11",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_M",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in M-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in M-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_M",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in M-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x18",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_I",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in I-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_I",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in I-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x1f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_MESI",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x2f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_MESI",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x8f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_MESI",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x86",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_ES",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x16",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_ES",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x26",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_ES",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0,",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x02",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.DRD_DIRECT",
+ "BriefDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
+ "PublicDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
+ "Counter": "0,",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x02",
+ "EventName": "UNC_ARB_TRK_REQUESTS.DRD_DIRECT",
+ "BriefDescription": "Number of Core coherent Data Read entries allocated in DirectData mode",
+ "PublicDescription": "Number of Core coherent Data Read entries allocated in DirectData mode.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "PublicDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.;",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "NCU",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "FIXED",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ }
+]
\ No newline at end of file
--
2.9.3
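
For readers wondering how the JSON fields in a table like the one above map onto a raw event selector, here is a minimal sketch. It assumes the client uncore layout (event in bits 0-7, umask in bits 8-15, edge in bit 18, inv in bit 23, cmask in bits 24-28); check the format directory of the uncore PMU in sysfs on your machine before relying on it. The struct and function names are made up for illustration and are not part of perf.

```c
#include <stdint.h>

/* Hypothetical mirror of one JSON entry; not a perf data structure. */
struct uncore_event_desc {
	uint8_t event_code;   /* "EventCode"   */
	uint8_t umask;        /* "UMask"       */
	uint8_t counter_mask; /* "CounterMask" */
	uint8_t invert;       /* "Invert"      */
	uint8_t edge_detect;  /* "EdgeDetect"  */
};

/* Assumed layout: event 0-7, umask 8-15, edge 18, inv 23, cmask 24-28. */
uint64_t uncore_raw_config(const struct uncore_event_desc *d)
{
	uint64_t config = 0;

	config |= (uint64_t)d->event_code;
	config |= (uint64_t)d->umask << 8;
	config |= (uint64_t)(d->edge_detect & 1) << 18;
	config |= (uint64_t)(d->invert & 1) << 23;
	config |= (uint64_t)(d->counter_mask & 0x1f) << 24;
	return config;
}
```

Under that assumed layout, UNC_CBO_XSNP_RESPONSE.MISS_XCORE (EventCode 0x22, UMask 0x41) packs to 0x4122, and UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST (EventCode 0x80, UMask 0x01, CounterMask 1) packs to 0x01000180.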

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:11 PM
From: Andi Kleen <a...@linux.intel.com>

Add V18 of Ivy Bridge uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-299k76asec...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../perf/pmu-events/arch/x86/ivybridge/uncore.json | 314 +++++++++++++++++++++
1 file changed, 314 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/ivybridge/uncore.json

diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/uncore.json b/tools/perf/pmu-events/arch/x86/ivybridge/uncore.json
new file mode 100644
index 000000000000..42c70eed05a2
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/uncore.json
@@ -0,0 +1,314 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS",
+ "BriefDescription": "A snoop misses in some processor core.",
+ "PublicDescription": "A snoop misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL",
+ "BriefDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT",
+ "BriefDescription": "A snoop hits a non-modified line in some processor core.",
+ "PublicDescription": "A snoop hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM",
+ "BriefDescription": "A snoop hits a modified line in some processor core.",
+ "PublicDescription": "A snoop hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL_M",
+ "BriefDescription": "A snoop invalidates a modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EXTERNAL_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.XCORE_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EVICTION_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.M",
+ "BriefDescription": "LLC lookup request that access cache and found line in M-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.E",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.S",
+ "BriefDescription": "LLC lookup request that access cache and found line in S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.I",
+ "BriefDescription": "LLC lookup request that access cache and found line in I-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable read requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable read requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable write requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable write requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_FILTER",
+ "BriefDescription": "Filter on external snoop requests.",
+ "PublicDescription": "Filter on external snoop requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_REQUEST_FILTER",
+ "BriefDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "PublicDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Counts the number of coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts the number of coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "PublicDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x80",
+ "EventName": "UNC_ARB_TRK_REQUESTS.EVICTIONS",
+ "BriefDescription": "Counts the number of LLC evictions allocated.",
+ "PublicDescription": "Counts the number of LLC evictions allocated.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x83",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "PublicDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of requests allocated in Coherency Tracker.",
+ "PublicDescription": "Number of requests allocated in Coherency Tracker.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_OVER_HALF_FULL",
+ "BriefDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "10",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "Fixed",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x06",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ES",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ }
+]
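
Three of the ARB entries in this file (UNC_ARB_TRK_OCCUPANCY.ALL, .CYCLES_WITH_ANY_REQUEST and .CYCLES_OVER_HALF_FULL) share EventCode 0x80 and UMask 0x01 and differ only in CounterMask (0, 1 and 10). The usual reading of a counter mask is that a nonzero value turns the counter into a threshold: it adds one for every cycle in which the per-cycle event count reaches the mask, instead of adding the count itself. The toy model below illustrates that reading on a made-up occupancy trace; it is a sketch of the concept, not a description of the hardware, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model of one counter, not hardware-accurate: per_cycle[i] is the
 * event count the unit reports in cycle i (here: tracker occupancy).
 * cmask == 0 accumulates the raw count; cmask > 0 counts the cycles
 * whose per-cycle count reaches the threshold.
 */
uint64_t simulate_counter(const unsigned int *per_cycle, size_t ncycles,
			  unsigned int cmask)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < ncycles; i++) {
		if (cmask == 0)
			total += per_cycle[i];
		else if (per_cycle[i] >= cmask)
			total += 1;
	}
	return total;
}
```

For an occupancy trace of 0, 3, 12, 10, 0, 1 over six cycles, the model yields 26 with mask 0 (occupancy summed), 4 with mask 1 (cycles with any request) and 2 with mask 10 (cycles with at least ten requests).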

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:20:11 PM
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

We can just use the exit() right after the branch calling die().

Link: http://lkml.kernel.org/n/tip-90athn06d7...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/perf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 9217f2227f3d..9dc346f2b255 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -378,7 +378,8 @@ static void execv_dashed_external(const char **argv)
if (status != -ERR_RUN_COMMAND_EXEC) {
if (IS_RUN_COMMAND_ERR(status)) {
do_die:
- die("unable to run '%s'", argv[0]);
+ pr_err("FATAL: unable to run '%s'", argv[0]);
+ status = -128;
}
exit(-status);
}
--
2.9.3
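
To make the resulting control flow easier to follow, here is a standalone sketch of the decision the patched block makes. It assumes that die() exits with status 128, which is why the patch sets status to -128 before the shared exit(-status). The enum values, the macro and the function name are stand-ins invented for this example; they only imitate perf's run-command definitions.

```c
#include <stdio.h>

/* Stand-ins for perf's run-command definitions; values are assumptions. */
enum {
	DEMO_ERR_RUN_COMMAND_FORK = 10000,
	DEMO_ERR_RUN_COMMAND_EXEC,
};
#define DEMO_IS_RUN_COMMAND_ERR(x) (-(x) >= DEMO_ERR_RUN_COMMAND_FORK)

/*
 * Returns the value the patched code would hand to exit(), or -1 when
 * the exec itself failed and the original code falls through instead
 * of exiting.
 */
int dashed_external_exit_code(int status, const char *cmd)
{
	if (status == -DEMO_ERR_RUN_COMMAND_EXEC)
		return -1;

	if (DEMO_IS_RUN_COMMAND_ERR(status)) {
		fprintf(stderr, "FATAL: unable to run '%s'\n", cmd);
		status = -128;
	}
	return -status;
}
```

A child that ran and returned 3 (status -3) still yields exit code 3, while an internal run-command failure such as a failed fork now prints the message and goes through the same exit() call with 128.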

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:30:05 PM
From: Andi Kleen <a...@linux.intel.com>

Add V25 of Skylake uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-00qmcrmq18...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/pmu-events/arch/x86/skylake/uncore.json | 254 +++++++++++++++++++++
1 file changed, 254 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/skylake/uncore.json

diff --git a/tools/perf/pmu-events/arch/x86/skylake/uncore.json b/tools/perf/pmu-events/arch/x86/skylake/uncore.json
new file mode 100644
index 000000000000..dbc193252fb3
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/skylake/uncore.json
@@ -0,0 +1,254 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x44",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in M-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_M",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in M-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x18",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_I",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in I-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_I",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in I-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x1f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_MESI",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x2f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_MESI",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x8f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_MESI",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x86",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_ES",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x16",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_ES",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x26",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_ES",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x02",
+ "EventName": "UNC_ARB_TRK_REQUESTS.DRD_DIRECT",
+ "BriefDescription": "Number of Core coherent Data Read entries allocated in DirectData mode",
+ "PublicDescription": "Number of Core coherent Data Read entries allocated in DirectData mode.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "PublicDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.;",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "NCU",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "FIXED",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ }
+]
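
The combined UMask values used by the UNC_CBO_CACHE_LOOKUP entries in this file look like plain ORs of the single-bit values that the Ivy Bridge file earlier in this thread lists one by one (line states M, E, S, I in the low nibble, request filters in the high nibble). The sketch below spells that out; the enum and helper names are illustrative only, and the reading of the bits is inferred from the two tables rather than taken from Intel documentation.

```c
#include <stdint.h>

/* Single-bit values as listed in the Ivy Bridge UNC_CBO_CACHE_LOOKUP entries. */
enum cache_lookup_bits {
	LOOKUP_M = 0x01,
	LOOKUP_E = 0x02,
	LOOKUP_S = 0x04,
	LOOKUP_I = 0x08,
	LOOKUP_READ_FILTER = 0x10,
	LOOKUP_WRITE_FILTER = 0x20,
	LOOKUP_EXTSNP_FILTER = 0x40,
	LOOKUP_ANY_REQUEST_FILTER = 0x80,
};

/* Combine one request filter with a set of line states into a umask. */
uint8_t cache_lookup_umask(uint8_t filter, uint8_t states)
{
	return (uint8_t)(filter | states);
}
```

For example, the read filter combined with all four states gives 0x1f (READ_MESI), the write filter with E and S gives 0x26 (WRITE_ES), and the any-request filter with I gives 0x88 (ANY_I), matching the values in the table.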

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:30:05 PM
From: Taeung Song <treeze...@gmail.com>

The option 'show-total-period' works fine without the '-l' option. But when
running 'perf annotate --stdio -l --show-total-period', the number of
samples is shown as '0' on every line.

Before:
$ perf annotate --stdio -l --show-total-period
...
0 : 400816: push %rbp
0 : 400817: mov %rsp,%rbp
0 : 40081a: mov %edi,-0x24(%rbp)
0 : 40081d: mov %rsi,-0x30(%rbp)
0 : 400821: mov -0x24(%rbp),%eax
0 : 400824: mov -0x30(%rbp),%rdx
0 : 400828: mov (%rdx),%esi
0 : 40082a: mov $0x0,%edx
...

The reason is that the number of samples in source_line_samples was never
set, so set it properly.

After:
$ perf annotate --stdio -l --show-total-period
...
3 : 400816: push %rbp
4 : 400817: mov %rsp,%rbp
0 : 40081a: mov %edi,-0x24(%rbp)
0 : 40081d: mov %rsi,-0x30(%rbp)
1 : 400821: mov -0x24(%rbp),%eax
2 : 400824: mov -0x30(%rbp),%rdx
0 : 400828: mov (%rdx),%esi
1 : 40082a: mov $0x0,%edx
...
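
A minimal standalone sketch of the idea behind the fix, not the perf code itself: the struct and function names below are hypothetical stand-ins, with `hits[i]` playing the role of `h->addr[i]` and `sum` the role of `h->sum`. It only illustrates that the raw sample count has to be stored alongside the percentage, otherwise the count stays at zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's struct source_line_samples. */
struct line_samples {
	double percent;   /* share of all samples, in percent */
	uint64_t nr;      /* raw sample count for this address */
};

/*
 * Fill one entry per address from a histogram of hit counts.
 * hits[i] corresponds to h->addr[i], sum to h->sum (assumed mapping).
 */
static void fill_line_samples(const uint64_t *hits, size_t len, uint64_t sum,
			      struct line_samples *out)
{
	for (size_t i = 0; i < len; i++) {
		uint64_t nr_samples = hits[i];

		/* Avoid dividing by zero when no samples were recorded. */
		out[i].percent = sum ? 100.0 * nr_samples / sum : 0.0;
		/* Without this store the count would never be reported. */
		out[i].nr = nr_samples;
	}
}
```

Calling `fill_line_samples()` with the eight counts from the "After" output (3, 4, 0, 0, 1, 2, 0, 1) and a sum of 11 would leave those same values in the `nr` fields.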

Signed-off-by: Taeung Song <treeze...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Martin Liska <mli...@suse.cz>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Wang Nan <wang...@huawei.com>
Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period")
Link: http://lkml.kernel.org/r/1490703125-13643-1-git-...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/util/annotate.c | 6 ++++--
tools/perf/util/annotate.h | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 11af5f0d56cc..a37032bd137d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1665,7 +1665,7 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
start = map__rip_2objdump(map, sym->start);

for (i = 0; i < len; i++) {
- u64 offset;
+ u64 offset, nr_samples;
double percent_max = 0.0;

src_line->nr_pcnt = nr_pcnt;
@@ -1674,12 +1674,14 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
double percent = 0.0;

h = annotation__histogram(notes, evidx + k);
+ nr_samples = h->addr[i];
if (h->sum)
- percent = 100.0 * h->addr[i] / h->sum;
+ percent = 100.0 * nr_samples / h->sum;

if (percent > percent_max)
percent_max = percent;
src_line->samples[k].percent = percent;
+ src_line->samples[k].nr = nr_samples;
}

if (percent_max <= 0.5)
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 09776b5af991..948aa8e6fd39 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -98,7 +98,7 @@ struct cyc_hist {
struct source_line_samples {
double percent;
double percent_sum;
- double nr;
+ u64 nr;
};

struct source_line {
--
2.9.3

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:30:05 PM
From: Andi Kleen <a...@linux.intel.com>

Add V25 of Haswell uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-133r1do7vv...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/pmu-events/arch/x86/haswell/uncore.json | 374 +++++++++++++++++++++
1 file changed, 374 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/haswell/uncore.json

diff --git a/tools/perf/pmu-events/arch/x86/haswell/uncore.json b/tools/perf/pmu-events/arch/x86/haswell/uncore.json
new file mode 100644
index 000000000000..3ef5c21fef56
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswell/uncore.json
@@ -0,0 +1,374 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EXTERNAL",
+ "BriefDescription": "An external snoop misses in some processor core.",
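+ "PublicDescription": "An external snoop misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x24",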
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EXTERNAL",
+ "BriefDescription": "An external snoop hits a non-modified line in some processor core.",
+ "PublicDescription": "An external snoop hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x44",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x84",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x28",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EXTERNAL",
+ "BriefDescription": "An external snoop hits a modified line in some processor core.",
+ "PublicDescription": "An external snoop hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x11",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_M",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_M",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_M",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x18",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_I",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x28",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_I",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_I",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_I",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x1f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_MESI",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x2f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_MESI",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x4f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_MESI",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x8f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_MESI",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x86",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_ES",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x46",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_ES",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x16",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_ES",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x26",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_ES",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x83",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.All",
+ "BriefDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory)",
+ "PublicDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory).",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "PublicDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "NCU",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",

Arnaldo Carvalho de Melo

Apr 4, 2017, 8:30:05 PM
From: Andi Kleen <a...@linux.intel.com>

Add V15 of Sandy Bridge uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-2qkwutpwlj...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
.../pmu-events/arch/x86/sandybridge/uncore.json | 314 +++++++++++++++++++++
1 file changed, 314 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/sandybridge/uncore.json

diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json b/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
new file mode 100644
index 000000000000..42c70eed05a2
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
@@ -0,0 +1,314 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS",
+ "BriefDescription": "A snoop misses in some processor core.",
+ "PublicDescription": "A snoop misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL",
+ "BriefDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT",
+ "BriefDescription": "A snoop hits a non-modified line in some processor core.",
+ "PublicDescription": "A snoop hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM",
+ "BriefDescription": "A snoop hits a modified line in some processor core.",
+ "PublicDescription": "A snoop hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL_M",
+ "BriefDescription": "A snoop invalidates a modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EXTERNAL_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.XCORE_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EVICTION_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.M",
+ "BriefDescription": "LLC lookup request that access cache and found line in M-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.E",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.S",
+ "BriefDescription": "LLC lookup request that access cache and found line in S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.I",
+ "BriefDescription": "LLC lookup request that access cache and found line in I-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable read requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable read requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable write requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable write requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_FILTER",
+ "BriefDescription": "Filter on external snoop requests.",
+ "PublicDescription": "Filter on external snoop requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_REQUEST_FILTER",
+ "BriefDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "PublicDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "PublicDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x80",
+ "EventName": "UNC_ARB_TRK_REQUESTS.EVICTIONS",
+ "BriefDescription": "Counts the number of LLC evictions allocated.",
+ "PublicDescription": "Counts the number of LLC evictions allocated.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x83",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "PublicDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of requests allocated in Coherency Tracker.",
+ "PublicDescription": "Number of requests allocated in Coherency Tracker.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_OVER_HALF_FULL",
+ "BriefDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "10",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "Fixed",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x06",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ES",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ }
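
In the Sandy Bridge ARB entries above, UNC_ARB_TRK_OCCUPANCY.ALL, .CYCLES_WITH_ANY_REQUEST and .CYCLES_OVER_HALF_FULL share EventCode 0x80 and UMask 0x01 and differ only in CounterMask (0, 1 and 10). The sketch below shows how such fields could be packed into a single raw config word. The struct, the function name and, in particular, the bit layout (event 0-7, umask 8-15, edge 18, inv 23, cmask 24-28) are assumptions made for illustration; the layout a given PMU actually uses is exported under /sys/bus/event_source/devices/<pmu>/format/ and should be checked there.

```c
#include <stdint.h>

/* Hypothetical mirror of the JSON fields used in the entries above. */
struct uncore_event {
	uint8_t event_code;   /* "EventCode"   */
	uint8_t umask;        /* "UMask"       */
	uint8_t cmask;        /* "CounterMask" */
	uint8_t invert;       /* "Invert"      */
	uint8_t edge;         /* "EdgeDetect"  */
};

/*
 * Pack the fields into one raw config word.
 * ASSUMED layout: event 0-7, umask 8-15, edge 18, inv 23, cmask 24-28.
 */
static uint64_t uncore_config(const struct uncore_event *ev)
{
	return (uint64_t)ev->event_code |
	       ((uint64_t)ev->umask << 8) |
	       ((uint64_t)(ev->edge & 0x1) << 18) |
	       ((uint64_t)(ev->invert & 0x1) << 23) |
	       ((uint64_t)(ev->cmask & 0x1f) << 24);
}
```

Under that assumed layout the three occupancy events differ only in bits 24-28 of the result, which is why they can reuse the same event code and unit mask.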

Ingo Molnar

Apr 5, 2017, 1:50:05 AM

* Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

tip-bot for Andi Kleen

Apr 5, 2017, 1:50:05 AM
Commit-ID: 0585c6265e66f952bcb6280cf078e5e120bd367a
Gitweb: http://git.kernel.org/tip/0585c6265e66f952bcb6280cf078e5e120bd367a
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:17:02 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:35:15 -0700

perf vendor events intel: Add uncore events for Haswell client

Add V25 of Haswell uncore events

Cc: jo...@kernel.org
---
tools/perf/pmu-events/arch/x86/haswell/uncore.json | 374 +++++++++++++++++++++
1 file changed, 374 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/haswell/uncore.json b/tools/perf/pmu-events/arch/x86/haswell/uncore.json
new file mode 100644
index 0000000..3ef5c21
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/haswell/uncore.json
@@ -0,0 +1,374 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EXTERNAL",
+ "BriefDescription": "An external snoop misses in some processor core.",
+ "PublicDescription": "An external snoop misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x24",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EXTERNAL",
+ "BriefDescription": "An external snoop hits a non-modified line in some processor core.",
+ "PublicDescription": "An external snoop hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x44",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x84",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x28",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EXTERNAL",
+ "BriefDescription": "An external snoop hits a modified line in some processor core.",
+ "PublicDescription": "An external snoop hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_XCORE",
+ "BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EVICTION",
+ "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
+ "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x11",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_M",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x21",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x41",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_M",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x81",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_M",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x18",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_I",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x28",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_I",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x48",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_I",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x88",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_I",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x1f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_MESI",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x2f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_MESI",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x4f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_MESI",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x8f",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_MESI",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x86",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_ES",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x46",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_ES",
+ "BriefDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x16",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_ES",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x26",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_ES",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "PublicDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Total number of Core outgoing entries allocated. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x83",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.All",
+ "BriefDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory)",
+ "PublicDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory).",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "PublicDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "NCU",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "FIXED",

tip-bot for Andi Kleen

Apr 5, 2017, 1:50:06 AM
Commit-ID: 80432c7311dbcf0c814d4923480b055a725b0be2
Gitweb: http://git.kernel.org/tip/80432c7311dbcf0c814d4923480b055a725b0be2
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:12:44 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:34:15 -0700

perf vendor events intel: Add uncore events for Sandy Bridge client

Add V15 of Sandy Bridge uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-2qkwutpwlj...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
---
.../pmu-events/arch/x86/sandybridge/uncore.json | 314 +++++++++++++++++++++
1 file changed, 314 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json b/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
new file mode 100644
index 0000000..42c70ee
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
@@ -0,0 +1,314 @@
+[
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.MISS",
+ "BriefDescription": "A snoop misses in some processor core.",
+ "PublicDescription": "A snoop misses in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL",
+ "BriefDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HIT",
+ "BriefDescription": "A snoop hits a non-modified line in some processor core.",
+ "PublicDescription": "A snoop hits a non-modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.HITM",
+ "BriefDescription": "A snoop hits a modified line in some processor core.",
+ "PublicDescription": "A snoop hits a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.INVAL_M",
+ "BriefDescription": "A snoop invalidates a modified line in some processor core.",
+ "PublicDescription": "A snoop invalidates a modified line in some processor core.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EXTERNAL_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to external snoop request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.XCORE_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to processor core memory request.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x22",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_XSNP_RESPONSE.EVICTION_FILTER",
+ "BriefDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "PublicDescription": "Filter on cross-core snoops initiated by this Cbox due to LLC eviction.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x01",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.M",
+ "BriefDescription": "LLC lookup request that access cache and found line in M-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in M-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x02",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.E",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x04",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.S",
+ "BriefDescription": "LLC lookup request that access cache and found line in S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x08",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.I",
+ "BriefDescription": "LLC lookup request that access cache and found line in I-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in I-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x10",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.READ_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable read requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable read requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x20",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_FILTER",
+ "BriefDescription": "Filter on processor core initiated cacheable write requests.",
+ "PublicDescription": "Filter on processor core initiated cacheable write requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x40",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_FILTER",
+ "BriefDescription": "Filter on external snoop requests.",
+ "PublicDescription": "Filter on external snoop requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x80",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ANY_REQUEST_FILTER",
+ "BriefDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "PublicDescription": "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_REQUESTS.ALL",
+ "BriefDescription": "Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "PublicDescription": "Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "PublicDescription": "Counts the number of allocated write entries, include full, partial, and LLC evictions.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x81",
+ "UMask": "0x80",
+ "EventName": "UNC_ARB_TRK_REQUESTS.EVICTIONS",
+ "BriefDescription": "Counts the number of LLC evictions allocated.",
+ "PublicDescription": "Counts the number of LLC evictions allocated.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x83",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.ALL",
+ "BriefDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "PublicDescription": "Cycles weighted by number of requests pending in Coherency Tracker.",
+ "Counter": "0",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x84",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_COH_TRK_REQUESTS.ALL",
+ "BriefDescription": "Number of requests allocated in Coherency Tracker.",
+ "PublicDescription": "Number of requests allocated in Coherency Tracker.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_OVER_HALF_FULL",
+ "BriefDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "PublicDescription": "Cycles with at least half of the requests outstanding are waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,1",
+ "CounterMask": "10",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "ARB",
+ "EventCode": "0x0",
+ "UMask": "0x01",
+ "EventName": "UNC_CLOCK.SOCKET",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "Counter": "Fixed",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "CBO",
+ "EventCode": "0x34",
+ "UMask": "0x06",
+ "EventName": "UNC_CBO_CACHE_LOOKUP.ES",
+ "BriefDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "PublicDescription": "LLC lookup request that access cache and found line in E-state or S-state.",
+ "Counter": "0,1",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ }
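
The three UNC_ARB_TRK_OCCUPANCY entries above share EventCode 0x80 and UMask 0x01 and differ only in CounterMask (0, 1 and 10). A minimal sketch of what that difference means, assuming the usual Intel counter-mask behaviour with Invert=0 (the counter advances by one in each cycle whose per-cycle event count is at least the mask, and accumulates the raw count when the mask is 0). The function name and the sample occupancy values are made up for illustration and are not part of the patch:

```python
def count_with_cmask(occupancy_per_cycle, counter_mask=0):
    """Model one counter over a list of per-cycle tracker occupancies.

    Assumption: CounterMask == 0 accumulates the raw occupancy, while a
    non-zero CounterMask counts cycles whose occupancy is >= the mask.
    """
    if counter_mask == 0:
        return sum(occupancy_per_cycle)
    return sum(1 for occ in occupancy_per_cycle if occ >= counter_mask)


# Hypothetical occupancy trace, one value per uncore clock cycle.
samples = [0, 3, 12, 10, 0, 1]

occupancy_all = count_with_cmask(samples, 0)      # like ...OCCUPANCY.ALL
cycles_any = count_with_cmask(samples, 1)         # like ...CYCLES_WITH_ANY_REQUEST
cycles_over_half = count_with_cmask(samples, 10)  # like ...CYCLES_OVER_HALF_FULL

print(occupancy_all, cycles_any, cycles_over_half)
```

Under that assumption, dividing the ALL count by the CYCLES_WITH_ANY_REQUEST count gives the average occupancy over busy cycles.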

tip-bot for Andi Kleen

Apr 5, 2017, 1:50:06 AM
Commit-ID: 9c4e2e2589c99ed01db6245847b4bd44bc053330
Gitweb: http://git.kernel.org/tip/9c4e2e2589c99ed01db6245847b4bd44bc053330
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:07:53 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:32:25 -0700

perf vendor events intel: Add missing UNC_M_DCLOCKTICKS for Broadwell DE uncore

An earlier update removed the UNC_M_CLOCKTICKS event for Broadwell DE,
but the Metric events still referred to it.
This adds it back from the event list under a different name
(UNC_M_DCLOCKTICKS) and fixes up the Metric events to use the new name.

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-zxxzg4g5nr...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
---
.../perf/pmu-events/arch/x86/broadwellde/uncore-memory.json | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json b/tools/perf/pmu-events/arch/x86/broadwellde/uncore-memory.json
index fa09e12..f4b0745 100644
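
The breakage described in this commit (a metric that still names a removed event) could be caught by a small consistency check. The following is a rough sketch, not part of the patch: it assumes the metric expression is stored in a "MetricExpr" field next to "EventName", and the helper name and the two sample entries are hypothetical.

```python
import json
import re

# Event names are letters, digits, underscores and dots, e.g. UNC_M_CAS_COUNT.RD
EVENT_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def missing_metric_events(events):
    """Return {event name: [referenced names not defined in the list]}."""
    known = {e["EventName"] for e in events if "EventName" in e}
    missing = {}
    for e in events:
        expr = e.get("MetricExpr")
        if not expr:
            continue
        unknown = sorted(set(EVENT_TOKEN.findall(expr)) - known)
        if unknown:
            missing[e.get("EventName", "<unnamed>")] = unknown
    return missing


# Hypothetical entries: the metric still uses the old clock event name.
sample = json.loads("""[
  {"EventName": "UNC_M_DCLOCKTICKS"},
  {"EventName": "UNC_M_CAS_COUNT.RD",
   "MetricExpr": "UNC_M_CAS_COUNT.RD / UNC_M_CLOCKTICKS"}
]""")

print(missing_metric_events(sample))
```

The tokenizer is deliberately naive: it ignores plain numbers but would misread hexadecimal constants, so a real check would need a proper expression parser.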

tip-bot for Andi Kleen

Apr 5, 2017, 1:50:07 AM
Commit-ID: bccdcb2a77ba0bef17baf152179e30ca35459a0c
Gitweb: http://git.kernel.org/tip/bccdcb2a77ba0bef17baf152179e30ca35459a0c
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:14:02 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:35:01 -0700

perf vendor events intel: Add uncore events for Ivy Bridge client

Add V18 of Ivy Bridge uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-299k76asec...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
---
tools/perf/pmu-events/arch/x86/{sandybridge => ivybridge}/uncore.json | 0
1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/uncore.json b/tools/perf/pmu-events/arch/x86/ivybridge/uncore.json
similarity index 100%
copy from tools/perf/pmu-events/arch/x86/sandybridge/uncore.json
copy to tools/perf/pmu-events/arch/x86/ivybridge/uncore.json
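
Since the Ivy Bridge file is an unchanged copy, each entry encodes the same raw event on both platforms. As an illustration of how the JSON fields relate to a raw perf event specification, here is a sketch under stated assumptions: the Unit-to-PMU table, the per-box "_0" suffix and the term names (event, umask, cmask, inv, edge) are assumptions modelled on the client uncore PMUs, and the helper itself is hypothetical rather than perf's own conversion code.

```python
# Assumed mapping from the JSON "Unit" field to a kernel PMU name.
UNIT_TO_PMU = {
    "CBO": "uncore_cbox",
    "ARB": "uncore_arb",
    "iMPH-U": "uncore_arb",
}


def perf_event_string(entry, box=0):
    """Build a 'pmu/term=value,.../' string from one JSON event entry."""
    pmu = UNIT_TO_PMU[entry["Unit"]]
    if pmu == "uncore_cbox":
        # Assumption: C-box PMUs are instantiated per box (uncore_cbox_0, ...).
        pmu = f"{pmu}_{box}"
    terms = [
        f"event={int(entry['EventCode'], 16):#x}",
        f"umask={int(entry['UMask'], 16):#x}",
    ]
    cmask = int(entry.get("CounterMask", "0"))
    if cmask:
        terms.append(f"cmask={cmask}")
    if entry.get("Invert", "0") != "0":
        terms.append("inv=1")
    if entry.get("EdgeDetect", "0") != "0":
        terms.append("edge=1")
    return f"{pmu}/{','.join(terms)}/"


# Field values copied from two entries of the Sandy Bridge file.
snoop_miss = {"Unit": "CBO", "EventCode": "0x22", "UMask": "0x01",
              "CounterMask": "0", "Invert": "0", "EdgeDetect": "0"}
any_request = {"Unit": "ARB", "EventCode": "0x80", "UMask": "0x01",
               "CounterMask": "1", "Invert": "0", "EdgeDetect": "0"}

print(perf_event_string(snoop_miss))
print(perf_event_string(any_request))
```

Whether a given PMU accepts these term names depends on the format files it exports under sysfs, so the output should be treated as a starting point rather than a guaranteed-valid event.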

tip-bot for Andi Kleen

Apr 5, 2017, 2:00:04 AM
Commit-ID: 092a95d41655bdd31d7d28f1788818724505feb2
Gitweb: http://git.kernel.org/tip/092a95d41655bdd31d7d28f1788818724505feb2
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:17:42 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:35:23 -0700

perf vendor events intel: Add uncore events for Broadwell client

Add V18 of Broadwell uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-xlbguqdzho...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
---
.../arch/x86/{haswell => broadwell}/uncore.json | 190 +++++----------------
1 file changed, 47 insertions(+), 143 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/haswell/uncore.json b/tools/perf/pmu-events/arch/x86/broadwell/uncore.json
similarity index 63%
copy from tools/perf/pmu-events/arch/x86/haswell/uncore.json
copy to tools/perf/pmu-events/arch/x86/broadwell/uncore.json
index 3ef5c21..28e1e15 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/uncore.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/uncore.json
@@ -2,18 +2,6 @@
{
"Unit": "CBO",
"EventCode": "0x22",
- "UMask": "0x21",
- "EventName": "UNC_CBO_XSNP_RESPONSE.MISS_EXTERNAL",
- "BriefDescription": "An external snoop misses in some processor core.",
- "PublicDescription": "An external snoop misses in some processor core.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x22",
"UMask": "0x41",
"EventName": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE",
"BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which misses in some processor core.",
@@ -38,18 +26,6 @@
{
"Unit": "CBO",
"EventCode": "0x22",
- "UMask": "0x24",
- "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EXTERNAL",
- "BriefDescription": "An external snoop hits a non-modified line in some processor core.",
- "PublicDescription": "An external snoop hits a non-modified line in some processor core.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x22",
"UMask": "0x44",
"EventName": "UNC_CBO_XSNP_RESPONSE.HIT_XCORE",
"BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a non-modified line in some processor core.",
@@ -62,30 +38,6 @@
{
"Unit": "CBO",
"EventCode": "0x22",
- "UMask": "0x84",
- "EventName": "UNC_CBO_XSNP_RESPONSE.HIT_EVICTION",
- "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
- "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a non-modified line in some processor core.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x22",
- "UMask": "0x28",
- "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EXTERNAL",
- "BriefDescription": "An external snoop hits a modified line in some processor core.",
- "PublicDescription": "An external snoop hits a modified line in some processor core.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x22",
"UMask": "0x48",
"EventName": "UNC_CBO_XSNP_RESPONSE.HITM_XCORE",
"BriefDescription": "A cross-core snoop initiated by this Cbox due to processor core memory request which hits a modified line in some processor core.",
@@ -97,22 +49,10 @@
},
{
"Unit": "CBO",
- "EventCode": "0x22",
- "UMask": "0x88",
- "EventName": "UNC_CBO_XSNP_RESPONSE.HITM_EVICTION",
- "BriefDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
- "PublicDescription": "A cross-core snoop resulted from L3 Eviction which hits a modified line in some processor core.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
"EventCode": "0x34",
"UMask": "0x11",
"EventName": "UNC_CBO_CACHE_LOOKUP.READ_M",
- "BriefDescription": "L3 Lookup read request that access cache and found line in M-state.",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in M-state",
"PublicDescription": "L3 Lookup read request that access cache and found line in M-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -124,7 +64,7 @@
"EventCode": "0x34",
"UMask": "0x21",
"EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
- "BriefDescription": "L3 Lookup write request that access cache and found line in M-state.",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in M-state",
"PublicDescription": "L3 Lookup write request that access cache and found line in M-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -134,21 +74,9 @@
{
"Unit": "CBO",
"EventCode": "0x34",
- "UMask": "0x41",
- "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_M",
- "BriefDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
- "PublicDescription": "L3 Lookup external snoop request that access cache and found line in M-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
"UMask": "0x81",
"EventName": "UNC_CBO_CACHE_LOOKUP.ANY_M",
- "BriefDescription": "L3 Lookup any request that access cache and found line in M-state.",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in M-state",
"PublicDescription": "L3 Lookup any request that access cache and found line in M-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -160,7 +88,7 @@
"EventCode": "0x34",
"UMask": "0x18",
"EventName": "UNC_CBO_CACHE_LOOKUP.READ_I",
- "BriefDescription": "L3 Lookup read request that access cache and found line in I-state.",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in I-state",
"PublicDescription": "L3 Lookup read request that access cache and found line in I-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -170,33 +98,9 @@
{
"Unit": "CBO",
"EventCode": "0x34",
- "UMask": "0x28",
- "EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_I",
- "BriefDescription": "L3 Lookup write request that access cache and found line in I-state.",
- "PublicDescription": "L3 Lookup write request that access cache and found line in I-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
- "UMask": "0x48",
- "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_I",
- "BriefDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
- "PublicDescription": "L3 Lookup external snoop request that access cache and found line in I-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
"UMask": "0x88",
"EventName": "UNC_CBO_CACHE_LOOKUP.ANY_I",
- "BriefDescription": "L3 Lookup any request that access cache and found line in I-state.",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in I-state",
"PublicDescription": "L3 Lookup any request that access cache and found line in I-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -208,7 +112,7 @@
"EventCode": "0x34",
"UMask": "0x1f",
"EventName": "UNC_CBO_CACHE_LOOKUP.READ_MESI",
- "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in any MESI-state",
"PublicDescription": "L3 Lookup read request that access cache and found line in any MESI-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -220,7 +124,7 @@
"EventCode": "0x34",
"UMask": "0x2f",
"EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_MESI",
- "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in MESI-state",
"PublicDescription": "L3 Lookup write request that access cache and found line in MESI-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -230,21 +134,9 @@
{
"Unit": "CBO",
"EventCode": "0x34",
- "UMask": "0x4f",
- "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_MESI",
- "BriefDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
- "PublicDescription": "L3 Lookup external snoop request that access cache and found line in MESI-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
"UMask": "0x8f",
"EventName": "UNC_CBO_CACHE_LOOKUP.ANY_MESI",
- "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in MESI-state",
"PublicDescription": "L3 Lookup any request that access cache and found line in MESI-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -256,7 +148,7 @@
"EventCode": "0x34",
"UMask": "0x86",
"EventName": "UNC_CBO_CACHE_LOOKUP.ANY_ES",
- "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
+ "BriefDescription": "L3 Lookup any request that access cache and found line in E or S-state",
"PublicDescription": "L3 Lookup any request that access cache and found line in E or S-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -266,21 +158,9 @@
{
"Unit": "CBO",
"EventCode": "0x34",
- "UMask": "0x46",
- "EventName": "UNC_CBO_CACHE_LOOKUP.EXTSNP_ES",
- "BriefDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
- "PublicDescription": "L3 Lookup external snoop request that access cache and found line in E or S-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
"UMask": "0x16",
"EventName": "UNC_CBO_CACHE_LOOKUP.READ_ES",
- "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
+ "BriefDescription": "L3 Lookup read request that access cache and found line in E or S-state",
"PublicDescription": "L3 Lookup read request that access cache and found line in E or S-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -292,7 +172,7 @@
"EventCode": "0x34",
"UMask": "0x26",
"EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_ES",
- "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
+ "BriefDescription": "L3 Lookup write request that access cache and found line in E or S-state",
"PublicDescription": "L3 Lookup write request that access cache and found line in E or S-state.",
"Counter": "0,1",
"CounterMask": "0",
@@ -306,7 +186,19 @@
"EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
"BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
"PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
- "Counter": "0",
+ "Counter": "0,",
+ "CounterMask": "0",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x02",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.DRD_DIRECT",
+ "BriefDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
+ "PublicDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
+ "Counter": "0,",
"CounterMask": "0",
"Invert": "0",
"EdgeDetect": "0"
@@ -326,10 +218,10 @@
{
"Unit": "iMPH-U",
"EventCode": "0x81",
- "UMask": "0x20",
- "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
- "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
- "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "UMask": "0x02",
+ "EventName": "UNC_ARB_TRK_REQUESTS.DRD_DIRECT",
+ "BriefDescription": "Number of Core coherent Data Read entries allocated in DirectData mode",
+ "PublicDescription": "Number of Core coherent Data Read entries allocated in DirectData mode.",
"Counter": "0,1",
"CounterMask": "0",
"Invert": "0",
@@ -337,12 +229,12 @@
},
{
"Unit": "iMPH-U",
- "EventCode": "0x83",
- "UMask": "0x01",
- "EventName": "UNC_ARB_COH_TRK_OCCUPANCY.All",
- "BriefDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory)",
- "PublicDescription": "Each cycle count number of valid entries in Coherency Tracker queue from allocation till deallocation. Aperture requests (snoops) appear as NC decoded internally and become coherent (snoop L3, access memory).",
- "Counter": "0",
+ "EventCode": "0x81",
+ "UMask": "0x20",
+ "EventName": "UNC_ARB_TRK_REQUESTS.WRITES",
+ "BriefDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "PublicDescription": "Number of Writes allocated - any write transactions: full/partials writes and evictions.",
+ "Counter": "0,1",
"CounterMask": "0",
"Invert": "0",
"EdgeDetect": "0"
@@ -360,11 +252,23 @@
"EdgeDetect": "0"
},
{
+ "Unit": "iMPH-U",
+ "EventCode": "0x80",
+ "UMask": "0x01",
+ "EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
+ "BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.;",
+ "PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
+ "Counter": "0,",
+ "CounterMask": "1",
+ "Invert": "0",
+ "EdgeDetect": "0"
+ },
+ {
"Unit": "NCU",
"EventCode": "0x0",
"UMask": "0x01",
"EventName": "UNC_CLOCK.SOCKET",
- "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles.",
+ "BriefDescription": "This 48-bit fixed counter counts the UCLK cycles",
"PublicDescription": "This 48-bit fixed counter counts the UCLK cycles.",
"Counter": "FIXED",
"CounterMask": "0",

tip-bot for Andi Kleen

Apr 5, 2017, 2:00:05 AM
Commit-ID: 92c6de0f10a80e4936fac04148bd3783a7c2b9f8
Gitweb: http://git.kernel.org/tip/92c6de0f10a80e4936fac04148bd3783a7c2b9f8
Author: Andi Kleen <a...@linux.intel.com>
AuthorDate: Wed, 29 Mar 2017 17:18:15 -0700
Committer: Andi Kleen <a...@linux.intel.com>
CommitDate: Thu, 30 Mar 2017 13:35:32 -0700

perf vendor events intel: Add uncore events for Skylake client

Add V25 of Skylake uncore events

Cc: jo...@kernel.org
Link: http://lkml.kernel.org/n/tip-00qmcrmq18...@git.kernel.org
Signed-off-by: Andi Kleen <a...@linux.intel.com>
---
.../arch/x86/{broadwell => skylake}/uncore.json | 32 +++-------------------
1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/uncore.json b/tools/perf/pmu-events/arch/x86/skylake/uncore.json
similarity index 86%
copy from tools/perf/pmu-events/arch/x86/broadwell/uncore.json
copy to tools/perf/pmu-events/arch/x86/skylake/uncore.json
index 28e1e15..dbc1932 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/uncore.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/uncore.json
@@ -50,18 +50,6 @@
{
"Unit": "CBO",
"EventCode": "0x34",
- "UMask": "0x11",
- "EventName": "UNC_CBO_CACHE_LOOKUP.READ_M",
- "BriefDescription": "L3 Lookup read request that access cache and found line in M-state",
- "PublicDescription": "L3 Lookup read request that access cache and found line in M-state.",
- "Counter": "0,1",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "CBO",
- "EventCode": "0x34",
"UMask": "0x21",
"EventName": "UNC_CBO_CACHE_LOOKUP.WRITE_M",
"BriefDescription": "L3 Lookup write request that access cache and found line in M-state",
@@ -184,21 +172,9 @@
"EventCode": "0x80",
"UMask": "0x01",
"EventName": "UNC_ARB_TRK_OCCUPANCY.ALL",
- "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
- "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from it's allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
- "Counter": "0,",
- "CounterMask": "0",
- "Invert": "0",
- "EdgeDetect": "0"
- },
- {
- "Unit": "iMPH-U",
- "EventCode": "0x80",
- "UMask": "0x02",
- "EventName": "UNC_ARB_TRK_OCCUPANCY.DRD_DIRECT",
- "BriefDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
- "PublicDescription": "Each cycle count number of 'valid' coherent Data Read entries that are in DirectData mode. Such entry is defined as valid when it is allocated till data sent to Core (first chunk, IDI0). Applicable for IA Cores' requests in normal case.",
- "Counter": "0,",
+ "BriefDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "PublicDescription": "Each cycle count number of all Core outgoing valid entries. Such entry is defined as valid from its allocation till first of IDI0 or DRS0 messages is sent out. Accounts for Coherent and non-coherent traffic.",
+ "Counter": "0",
"CounterMask": "0",
"Invert": "0",
"EdgeDetect": "0"
@@ -258,7 +234,7 @@
"EventName": "UNC_ARB_TRK_OCCUPANCY.CYCLES_WITH_ANY_REQUEST",
"BriefDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.;",
"PublicDescription": "Cycles with at least one request outstanding is waiting for data return from memory controller. Account for coherent and non-coherent requests initiated by IA Cores, Processor Graphics Unit, or LLC.",
- "Counter": "0,",
+ "Counter": "0",
"CounterMask": "1",

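The hunks above change "Counter": "0," to "Counter": "0", that is, they drop a trailing comma that leaves an empty item in the counter list. A check for that kind of slip could look roughly like the sketch below. It is only an illustration: the helper name malformed_counters and the SAMPLE data are made up here (the sample mirrors the field names quoted in the diff), and it is not part of the perf tree or of the patch.

```python
import json

# Hypothetical sample shaped like the uncore.json entries quoted in the diff.
# The first entry carries the trailing-comma form ("0,") that the patch fixes.
SAMPLE = """
[
  {"Unit": "iMPH-U", "EventCode": "0x80", "UMask": "0x01",
   "EventName": "UNC_ARB_TRK_OCCUPANCY.ALL", "Counter": "0,"},
  {"Unit": "iMPH-U", "EventCode": "0x81", "UMask": "0x20",
   "EventName": "UNC_ARB_TRK_REQUESTS.WRITES", "Counter": "0,1"},
  {"Unit": "NCU", "EventCode": "0x0", "UMask": "0x01",
   "EventName": "UNC_CLOCK.SOCKET", "Counter": "FIXED"}
]
"""


def malformed_counters(events):
    """Return the names of events whose Counter list has a non-numeric item.

    Assumption for this sketch: a valid Counter is either the literal
    "FIXED" or a comma-separated list of decimal counter indices.
    """
    bad = []
    for ev in events:
        counter = ev.get("Counter")
        # Entries without a Counter field, or with a fixed counter, are skipped.
        if counter is None or counter == "FIXED":
            continue
        # "0,".split(",") yields ["0", ""], so the empty item is caught here.
        if any(not part.strip().isdigit() for part in counter.split(",")):
            bad.append(ev["EventName"])
    return bad


bad_events = malformed_counters(json.loads(SAMPLE))
print(bad_events)
```

To run it against a real file, the SAMPLE string would be replaced by the contents of the JSON file of interest, assuming that file is a top-level array of event objects like the ones shown in the diff.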
tip-bot for Arnaldo Carvalho de Melo

Apr 5, 2017, 2:00:05 AM
Commit-ID: 427748068a973627b406bf7312342b6fe4742d07
Gitweb: http://git.kernel.org/tip/427748068a973627b406bf7312342b6fe4742d07
Author: Arnaldo Carvalho de Melo <ac...@redhat.com>
AuthorDate: Tue, 4 Apr 2017 11:36:22 -0300
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Tue, 4 Apr 2017 11:36:22 -0300

perf tools: Remove die() call

We can just use the exit() right after the branch calling die().

Link: http://lkml.kernel.org/n/tip-90athn06d7...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
---
tools/perf/perf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 9217f22..9dc346f 100644