linux-toradex.git/tools/perf/bench, branch v3.10.78

perf tools: Fix LIBNUMA build with glibc 2.12 and older.

2013-03-14T11:06:21+00:00

The tokens MADV_HUGEPAGE and MADV_NOHUGEPAGE are not available with
glibc 2.12 and older. Define these tokens if they are not already
defined.

This patch fixes these build errors with older versions of glibc.

    CC bench/numa.o
bench/numa.c: In function ‘alloc_data’:
bench/numa.c:334: error: ‘MADV_HUGEPAGE’ undeclared (first use in this function)
bench/numa.c:334: error: (Each undeclared identifier is reported only once
bench/numa.c:334: error: for each function it appears in.)
bench/numa.c:341: error: ‘MADV_NOHUGEPAGE’ undeclared (first use in this function)
make: *** [bench/numa.o] Error 1
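
The guard described above can be sketched roughly as follows. The values 14 and 15 are the ones the generic Linux headers use on most architectures; they, and the helper hint_thp(), are assumptions for illustration and not necessarily the exact contents of the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* madvise() and MADV_* may be hidden under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Older glibc headers lack these tokens, so define them only if missing.
 * Assumption: 14 and 15 are the generic Linux values.
 */
#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14
#endif
#ifndef MADV_NOHUGEPAGE
#define MADV_NOHUGEPAGE 15
#endif

/*
 * Hypothetical helper: thp > 0 requests huge pages, thp < 0 forbids them,
 * thp == 0 leaves the kernel default alone. Returns madvise()'s result.
 */
static int hint_thp(void *buf, size_t len, int thp)
{
	if (thp > 0)
		return madvise(buf, len, MADV_HUGEPAGE);
	if (thp < 0)
		return madvise(buf, len, MADV_NOHUGEPAGE);
	return 0;
}
```

With the #ifndef guards in place the same source builds against both old and new headers, because a newer glibc that already provides the tokens simply skips the fallback definitions.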

Signed-off-by: Vinson Lee 
Acked-by: Ingo Molnar 
Cc: Ingo Molnar 
Cc: Irina Tirdea 
Cc: Paul Mackerras 
Cc: Pekka Enberg 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1363214064-4671-2-git-send-email-vlee@twitter.com
Signed-off-by: Arnaldo Carvalho de Melo

perf: Add 'perf bench numa mem' NUMA performance measurement suite

2013-01-30T13:35:36+00:00

Add a suite of NUMA performance benchmarks.

The goal was to simulate the behavior and access patterns of real NUMA
workloads, via a wide range of parameters, so this tool goes well
beyond simple bzero() measurements that most NUMA micro-benchmarks use:

 - It processes the data and creates a chain of data dependencies,
   like a real workload would. Neither the compiler, nor the
   kernel (via KSM and other optimizations) nor the CPU can
   eliminate parts of the workload.

 - It randomizes the initial state and also randomizes the target
   addresses of the processing - it's not a simple forward scan
   of addresses.

 - It provides flexible options to set process, thread and memory
   relationship information: -G sets "global" memory shared between
   all test processes, -P sets "process" memory shared by all
   threads of a process and -T sets "thread" private memory.

 - There's a NUMA convergence monitoring and convergence latency
   measurement option via -c and -m.

 - Micro-sleeps and synchronization can be injected to provoke lock
   contention and scheduling, via the -u and -S options. This simulates
   IO and contention.

 - The -x option instructs the workload to 'perturb' itself artificially
   every N seconds, by moving to the first and last CPU of the system
   periodically. This way the stability of convergence equilibrium and
   the number of steps taken for the scheduler to reach equilibrium again
   can be measured.

 - The amount of work can be specified via the -l loop count, and/or
   via a -s seconds-timeout value.

 - CPU and node memory binding options, to test hard binding scenarios.
   THP can be turned on and off via madvise() calls.

 - Live reporting of convergence progress in an 'at a glance' output format.
   Printing of convergence and deconvergence events.
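
The idea of a data-dependency chain with randomized target addresses can be illustrated with a minimal sketch. This is not the benchmark's code: the function chase_words() and the mixing constants are hypothetical, chosen only to show why the work cannot be skipped or reordered.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: each step derives the next index from the value
 * produced by the previous step, so every load depends on the one before
 * it, and each visited word is rewritten so the data keeps changing.
 */
static uint64_t chase_words(uint64_t *buf, size_t nwords, size_t steps,
			    uint64_t seed)
{
	uint64_t val = seed;
	size_t i;

	for (i = 0; i < steps; i++) {
		size_t idx = (size_t)(val % nwords); /* pseudo-random target */

		/* Mix the loaded word into the running value (wraps mod 2^64). */
		val = buf[idx] + val * 6364136223846793005ULL +
		      1442695040888963407ULL;
		buf[idx] = val;
	}
	return val; /* returning the result keeps the chain observable */
}
```

Because the final value depends on every intermediate load and store, a compiler cannot drop the loop as long as the caller uses the return value.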

The 'perf bench numa mem -a' option will start an array of about 30
individual tests that will each output such measurements:

 # Running  5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp  1"
  5x5-bw-thread,                         20.276, secs,           runtime-max/thread
  5x5-bw-thread,                         20.004, secs,           runtime-min/thread
  5x5-bw-thread,                         20.155, secs,           runtime-avg/thread
  5x5-bw-thread,                          0.671, %,              spread-runtime/thread
  5x5-bw-thread,                         21.153, GB,             data/thread
  5x5-bw-thread,                        528.818, GB,             data-total
  5x5-bw-thread,                          0.959, nsecs,          runtime/byte/thread
  5x5-bw-thread,                          1.043, GB/sec,         thread-speed
  5x5-bw-thread,                         26.081, GB/sec,         total-speed
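
The figures in this sample are mutually consistent, which the following sketch makes explicit. The formulas are inferred from the numbers shown here, not read from the benchmark source; the halved spread and GB meaning 1e9 bytes are assumptions that happen to match the sample. All function names are hypothetical.

```c
/* Spread of per-thread runtimes, as a percentage of the slowest thread. */
static double spread_pct(double max_secs, double min_secs)
{
	return (max_secs - min_secs) / 2.0 / max_secs * 100.0;
}

/* Data processed per thread, given the total and the thread count. */
static double gb_per_thread(double total_gb, int nr_threads)
{
	return total_gb / nr_threads;
}

/* Throughput in GB/sec over the slowest thread's runtime. */
static double speed_gb_per_sec(double gb, double max_secs)
{
	return gb / max_secs;
}

/* secs/GB equals nsecs/byte when 1 GB is taken as 1e9 bytes. */
static double nsecs_per_byte(double max_secs, double gb)
{
	return max_secs / gb;
}
```

For the 5x5 run (25 threads), 528.818 GB in total gives about 21.153 GB per thread, and dividing by the 20.276 second maximum runtime gives about 1.043 GB/sec per thread and 26.081 GB/sec in total.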

See the help text and the code for more details.

Cc: Peter Zijlstra 
Cc: Arnaldo Carvalho de Melo 
Cc: Frederic Weisbecker 
Cc: Mike Galbraith 
Cc: Steven Rostedt 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Peter Zijlstra 
Cc: Andrea Arcangeli 
Cc: Rik van Riel 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Signed-off-by: Ingo Molnar

perf tools: Use __maybe_unused for unused variables

2012-09-11T15:19:15+00:00

perf defines both __used and __unused for marking unused variables.
__used is defined as __attribute__((__unused__)), which contradicts the
kernel definition of __used as __attribute__((__used__)) for new gcc
versions. On Android, __used is also defined in system headers and this
leads to warnings like: warning: '__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is used everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.
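
As a minimal sketch of the resulting usage, assuming a gcc-compatible compiler: the macro definition below mirrors the attribute named above, while demo_callback() is a hypothetical function used only to show the annotation.

```c
/* Single definition for "may be unused", guarded in case a header has it. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical callback: the signature is fixed by the caller, but this
 * implementation ignores 'sig', so the parameter is annotated to avoid
 * an unused-parameter warning.
 */
static int demo_callback(int sig __maybe_unused, int value)
{
	return value * 2;
}
```

Unlike __used, this name does not suggest that the symbol must be kept, and unlike __unused it does not collide with identifiers in glibc headers.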

Signed-off-by: Irina Tirdea 
Acked-by: Pekka Enberg 
Cc: David Ahern 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Steven Rostedt 
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo

perf bench: fix assert when NDEBUG is defined

2012-09-08T16:18:54+00:00

When NDEBUG is defined, the assert macro will be expanded to nothing.
Some assert calls used in perf are also including some functionality
(e.g. system calls), not only validity checks. Therefore, if NDEBUG is
defined, this functionality will be removed along with the assert.  Perf
also defines BUG_ON based on assert, so it has the same problem.

Define BUG_ON so that the condition will be executed when NDEBUG is
defined.  Replace the assert statements that have these side effects
with BUG_ON.

For defining BUG_ON, use "if (cond) {}" instead of "if (cond) ;" because
in the latter case the build fails with "error: suggest braces around empty
body in an ‘if’ statement [-Werror=empty-body]"
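
The difference can be sketched without actually defining NDEBUG. In the hypothetical macros below, ASSERT_NDEBUG stands in for what assert() expands to under NDEBUG, and BUG_ON_NDEBUG follows the "if (cond) {}" form described above; the helper functions are illustrative only.

```c
static int calls; /* counts how often the side effect really ran */

/* Stand-in for a call with a side effect, e.g. a system call. */
static int do_work(void)
{
	calls++;
	return 0;
}

/* What assert(expr) becomes under NDEBUG: the expression vanishes. */
#define ASSERT_NDEBUG(expr) ((void)0)

/* BUG_ON under NDEBUG: the condition is still evaluated, result ignored. */
#define BUG_ON_NDEBUG(cond) do { if (cond) {} } while (0)

static int run_with_assert(void)
{
	calls = 0;
	ASSERT_NDEBUG(do_work() == 0); /* do_work() is never called */
	return calls;
}

static int run_with_bug_on(void)
{
	calls = 0;
	BUG_ON_NDEBUG(do_work() != 0); /* do_work() is called once */
	return calls;
}
```

Here run_with_assert() returns 0 because the call was compiled away, while run_with_bug_on() returns 1 because the condition was executed.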

Suggested-by: Peter Zijlstra 
Signed-off-by: Irina Tirdea 
Reviewed-by: Namhyung Kim 
Reviewed-by: Pekka Enberg 
Cc: David Ahern 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Steven Rostedt 
Link: http://lkml.kernel.org/r/1347082551-2394-1-git-send-email-irina.tirdea@intel.com
Signed-off-by: Arnaldo Carvalho de Melo

perf bench: Fix confused variable namings and descriptions in mem subsystem

2012-07-02T17:35:45+00:00

As Namhyung Kim pointed out, the words "cycle" and "clock" are used
confusingly in names and descriptions in mem-memset.c and mem-memcpy.c.

With the option "-c" (or "--clock", now renamed to "--cycle"), the mem
subsystem measures the cost of memset() and memcpy() with the cpu-cycles event.

But the current mem subsystem source code contains many confusing variable
names and descriptions that use "clock" (e.g. the variable use_clock). This is
bad style because there is a separate software event named "cpu-clock". This
patch replaces the wrong usages of "clock" with "cycle".

v2: modified Documentation/perf-bench.txt for the descriptions of
--cycle option

Signed-off-by: Hitoshi Mitake 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/1341236777-18457-1-git-send-email-h.mitake@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

perf bench: Documentation update

2012-06-27T16:17:48+00:00

The current perf-bench documentation has a couple of typos and lacks
any description of the mem subsystem. Fix it.

Reported-by: Ingo Molnar 
Signed-off-by: Namhyung Kim 
Acked-by: Hitoshi Mitake 
Cc: Hitoshi Mitake 
Cc: Ingo Molnar 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1340172486-17805-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf tool: Fix perf stack to non executable on x86_64

2012-02-06T21:14:17+00:00

By adding following objects:
  bench/mem-memset-x86-64-asm.o
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.

The reason was that the above objects are assembler sourced and are missing
the GNU-stack note section. In such a case the linker assumes that the final
binary should not be restricted at all and marks the stack as RWX.

Add a section ".note.GNU-stack" definition to the mentioned objects, with all
flags disabled, thus omitting those objects from the linker's stack flags
decision.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams 
Acked-by: Eric Dumazet 
Cc: Corey Ashford 
Cc: Ingo Molnar 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa 
[ committer note: Remaining bits after what was already added to perf/urgent ]
Signed-off-by: Arnaldo Carvalho de Melo

Merge branch 'perf/urgent' into perf/core

2012-02-06T21:11:02+00:00

So that we can get the perf bench exec stack fixes and then apply the
remaining fix for the files added after what is in perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo

perf tools: Fix perf stack to non executable on x86_64

2012-02-06T20:54:06+00:00

By adding following objects:
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.

The reason was that the above object is assembler sourced and is missing the
GNU-stack note section. In such a case the linker assumes that the final
binary should not be restricted at all and marks the stack as RWX.

Add a section ".note.GNU-stack" definition to the mentioned object, with all
flags disabled, thus omitting this object from the linker's stack flags
decision.

Problem introduced in:

  $ git describe ea7872b
  v2.6.37-rc2-19-gea7872b

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams 
Acked-by: Eric Dumazet 
Cc: Corey Ashford 
Cc: Ingo Molnar 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa 
[ committer note: Backported fix to perf/urgent (3.3-rc2+) ]
Signed-off-by: Arnaldo Carvalho de Melo

perf tools: Remove unnecessary ctype.h inclusion

2012-01-30T20:37:35+00:00

There are unnecessary #include <ctype.h> out there, and they might cause
a nasty build failure in some environments. As we already have most of
the ctype macros in util.h, just get rid of them.

The few exceptions are util/symbol.c, which needs the isupper() macro that
util.h doesn't provide, and the perl scripting support code, which includes
ctype.h internally.

Suggested-by: Ingo Molnar 
Cc: Ingo Molnar 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1327827356-8786-4-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim 
Signed-off-by: Arnaldo Carvalho de Melo