summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation/perf-report.txt
diff options
context:
space:
mode:
authorDon Zickus <dzickus@redhat.com>2014-06-01 15:38:29 +0200
committerJiri Olsa <jolsa@kernel.org>2014-06-09 13:34:49 +0200
commit9b32ba71ba905b90610fc2aad77cb98a373c5624 (patch)
tree452c7fd4bd5d494d8b170fd5e840a5ae004e65ba /tools/perf/Documentation/perf-report.txt
parent2b1b71003ea809e619bd73e74dfc2a73069de66f (diff)
perf tools: Add dcacheline sort
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a particular sample instruction. A bunch of those details relate to the data address. One interesting thing you can do with data addresses is to convert them into a unique cacheline they belong too. Organizing these data cachelines into similar groups and sorting them can reveal cache contention. This patch creates an alogorithm based on various sample details that can help group entries together into data cachelines and allows 'perf report' to sort on it. The algorithm relies on having proper mmap2 support in the kernel to help determine if the memory map the data address belongs to is private to a pid or globally shared. The alogortithm is as follows: o group cpumodes together o group entries with discovered maps together o sort on major, minor, inode and inode generation numbers o if userspace anon, then sort on pid o sort on cachelines based on data addresses The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'. Sample output: # # Samples: 206 of event 'cpu/mem-loads/pp' # Total weight : 2534 # Sort order : dcacheline,pid # # Overhead Samples Data Cacheline Command: Pid # ........ ............ ...................................................................... .................. # 13.22% 1 [k] 0xffff88042f08ebc0 swapper: 0 9.27% 1 [k] 0xffff88082e8cea80 swapper: 0 3.59% 2 [k] 0xffffffff819ba180 swapper: 0 0.32% 1 [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0 swapper: 0 0.32% 1 [k] timekeeper_seq+0xfffffffffffffff8 swapper: 0 Note: Added a '+1' to symlen size in hists__calc_col_len to prevent the next column from prematurely tabbing over and mis-aligning. Not sure what the problem is. Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Diffstat (limited to 'tools/perf/Documentation/perf-report.txt')
-rw-r--r--tools/perf/Documentation/perf-report.txt3
1 files changed, 2 insertions, 1 deletions
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 00fbfb6d7f14..d2b59af62bc0 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -119,7 +119,7 @@ OPTIONS
If --mem-mode option is used, following sort keys are also available
(incompatible with --branch-stack):
- symbol_daddr, dso_daddr, locked, tlb, mem, snoop.
+ symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
- symbol_daddr: name of data symbol being executed on at the time of sample
- dso_daddr: name of library or module containing the data being executed
@@ -128,6 +128,7 @@ OPTIONS
- tlb: type of tlb access for the data at the time of sample
- mem: type of memory access for the data at the time of sample
- snoop: type of snoop (if any) for the data at the time of sample
+ - dcacheline: the cacheline the data address is on at the time of sample
And default sort keys are changed to local_weight, mem, sym, dso,
symbol_daddr, dso_daddr, snoop, tlb, locked, see '--mem-mode'.