<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/events/ring_buffer.c, branch v3.10.78</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>perf: Fix perf ring buffer memory ordering</title>
<updated>2013-11-20T20:27:47+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-28T12:55:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0fb024c4377701319ae29af1b23b647885d7c808'/>
<id>0fb024c4377701319ae29af1b23b647885d7c808</id>
<content type='text'>
commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix vmalloc ring buffer pages handling</title>
<updated>2013-05-01T10:34:46+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2013-03-19T14:35:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5919b30933d7c4fac1f214c59f26c5e990044f09'/>
<id>5919b30933d7c4fac1f214c59f26c5e990044f09</id>
<content type='text'>
If we allocate perf ring buffer with the size of single (user)
page, we will get memory corruption when releasing itin
rb_free_work function (for CONFIG_PERF_USE_VMALLOC option).

For single page sized ring buffer the page_order is -1 (because
nr_pages is 0). This needs to be recognized in the rb_free_work
function to release proper amount of pages.

Adding data_page_nr function that returns number of allocated
data pages. Customizing the rest of the code to use it.

Reported-by: Jan Stancek &lt;jstancek@redhat.com&gt;
Original-patch-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lkml.kernel.org/r/20130319143509.GA1128@krava.brq.redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we allocate perf ring buffer with the size of single (user)
page, we will get memory corruption when releasing itin
rb_free_work function (for CONFIG_PERF_USE_VMALLOC option).

For single page sized ring buffer the page_order is -1 (because
nr_pages is 0). This needs to be recognized in the rb_free_work
function to release proper amount of pages.

Adding data_page_nr function that returns number of allocated
data pages. Customizing the rest of the code to use it.

Reported-by: Jan Stancek &lt;jstancek@redhat.com&gt;
Original-patch-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lkml.kernel.org/r/20130319143509.GA1128@krava.brq.redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix ring_buffer perf_output_space() boundary calculation</title>
<updated>2013-03-21T11:04:35+00:00</updated>
<author>
<name>Stephane Eranian</name>
<email>eranian@google.com</email>
</author>
<published>2013-03-18T13:33:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd9c086d9f507d02d5ba4d7c5eef4bb9518088b8'/>
<id>dd9c086d9f507d02d5ba4d7c5eef4bb9518088b8</id>
<content type='text'>
This patch fixes a flaw in perf_output_space(). In case the size
of the space needed is bigger than the actual buffer size, there
may be situations where the function would return true (i.e.,
there is space) when it should not. head &gt; offset due to
rounding of the masking logic.

The problem can be tested by activating BTS on Intel processors.
A BTS record can be as big as 16 pages. The following command
fails:

  $ perf record -m 4 -c 1 -e branches:u my_test_program

You will get a buffer corruption with this. Perf report won't be
able to parse the perf.data.

The fix is to first check that the requested space is smaller
than the buffer size. If so, then the masking logic will work
fine. If not, then there is no chance the record can be saved
and it will be gracefully handled by upper code layers.

[ In v2, we also make the logic for the writable more explicit by
  renaming it to rb-&gt;overwrite because it tells whether or not the
  buffer can overwrite its tail (suggested by PeterZ). ]

Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20130318133327.GA3056@quad
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fixes a flaw in perf_output_space(). In case the size
of the space needed is bigger than the actual buffer size, there
may be situations where the function would return true (i.e.,
there is space) when it should not. head &gt; offset due to
rounding of the masking logic.

The problem can be tested by activating BTS on Intel processors.
A BTS record can be as big as 16 pages. The following command
fails:

  $ perf record -m 4 -c 1 -e branches:u my_test_program

You will get a buffer corruption with this. Perf report won't be
able to parse the perf.data.

The fix is to first check that the requested space is smaller
than the buffer size. If so, then the masking logic will work
fine. If not, then there is no chance the record can be saved
and it will be gracefully handled by upper code layers.

[ In v2, we also make the logic for the writable more explicit by
  renaming it to rb-&gt;overwrite because it tells whether or not the
  buffer can overwrite its tail (suggested by PeterZ). ]

Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20130318133327.GA3056@quad
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Add perf_output_skip function to skip bytes in sample</title>
<updated>2012-08-10T15:16:22+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2012-08-07T13:20:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5685e0ff45f5df67e79e9b052b6ffd501ff38c11'/>
<id>5685e0ff45f5df67e79e9b052b6ffd501ff38c11</id>
<content type='text'>
Introducing perf_output_skip function to be able to skip data within the
perf ring buffer.

When writing data into perf ring buffer we first reserve needed place in
ring buffer and then copy the actual data.

There's a possibility we won't be able to fill all the reserved size
with data, so we need a way to skip the remaining bytes.

This is going to be useful when storing the user stack dump, where we
might end up with less data than we originally requested.

Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Benjamin Redelings &lt;benjamin.redelings@nescent.org&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Tom Zanussi &lt;tzanussi@gmail.com&gt;
Cc: Ulrich Drepper &lt;drepper@gmail.com&gt;
Link: http://lkml.kernel.org/r/1344345647-11536-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introducing perf_output_skip function to be able to skip data within the
perf ring buffer.

When writing data into perf ring buffer we first reserve needed place in
ring buffer and then copy the actual data.

There's a possibility we won't be able to fill all the reserved size
with data, so we need a way to skip the remaining bytes.

This is going to be useful when storing the user stack dump, where we
might end up with less data than we originally requested.

Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Benjamin Redelings &lt;benjamin.redelings@nescent.org&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Tom Zanussi &lt;tzanussi@gmail.com&gt;
Cc: Ulrich Drepper &lt;drepper@gmail.com&gt;
Link: http://lkml.kernel.org/r/1344345647-11536-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Factor __output_copy to be usable with specific copy function</title>
<updated>2012-08-10T14:44:06+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-08-07T13:20:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=91d7753a45f8525dc75b6be01e427dc1c378dc16'/>
<id>91d7753a45f8525dc75b6be01e427dc1c378dc16</id>
<content type='text'>
Adding a generic way to use __output_copy function with specific copy
function via DEFINE_PERF_OUTPUT_COPY macro.

Using this to add new __output_copy_user function, that provides output
copy from user pointers. For x86 the copy_from_user_nmi function is used
and __copy_from_user_inatomic for the rest of the architectures.

This new function will be used in user stack dump on sample, coming in
next patches.

Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Benjamin Redelings &lt;benjamin.redelings@nescent.org&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Tom Zanussi &lt;tzanussi@gmail.com&gt;
Cc: Ulrich Drepper &lt;drepper@gmail.com&gt;
Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adding a generic way to use __output_copy function with specific copy
function via DEFINE_PERF_OUTPUT_COPY macro.

Using this to add new __output_copy_user function, that provides output
copy from user pointers. For x86 the copy_from_user_nmi function is used
and __copy_from_user_inatomic for the rest of the architectures.

This new function will be used in user stack dump on sample, coming in
next patches.

Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Benjamin Redelings &lt;benjamin.redelings@nescent.org&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Tom Zanussi &lt;tzanussi@gmail.com&gt;
Cc: Ulrich Drepper &lt;drepper@gmail.com&gt;
Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2012-01-08T21:21:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-01-08T21:21:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=98793265b429a3f0b3f1750e74d67cd4d740d162'/>
<id>98793265b429a3f0b3f1750e74d67cd4d740d162</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
  Kconfig: acpi: Fix typo in comment.
  misc latin1 to utf8 conversions
  devres: Fix a typo in devm_kfree comment
  btrfs: free-space-cache.c: remove extra semicolon.
  fat: Spelling s/obsolate/obsolete/g
  SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
  tools/power turbostat: update fields in manpage
  mac80211: drop spelling fix
  types.h: fix comment spelling for 'architectures'
  typo fixes: aera -&gt; area, exntension -&gt; extension
  devices.txt: Fix typo of 'VMware'.
  sis900: Fix enum typo 'sis900_rx_bufer_status'
  decompress_bunzip2: remove invalid vi modeline
  treewide: Fix comment and string typo 'bufer'
  hyper-v: Update MAINTAINERS
  treewide: Fix typos in various parts of the kernel, and fix some comments.
  clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
  gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
  leds: Kconfig: Fix typo 'D2NET_V2'
  sound: Kconfig: drop unknown symbol ARCH_CLPS7500
  ...

Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
kconfig additions, close to removed commented-out old ones)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
  Kconfig: acpi: Fix typo in comment.
  misc latin1 to utf8 conversions
  devres: Fix a typo in devm_kfree comment
  btrfs: free-space-cache.c: remove extra semicolon.
  fat: Spelling s/obsolate/obsolete/g
  SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
  tools/power turbostat: update fields in manpage
  mac80211: drop spelling fix
  types.h: fix comment spelling for 'architectures'
  typo fixes: aera -&gt; area, exntension -&gt; extension
  devices.txt: Fix typo of 'VMware'.
  sis900: Fix enum typo 'sis900_rx_bufer_status'
  decompress_bunzip2: remove invalid vi modeline
  treewide: Fix comment and string typo 'bufer'
  hyper-v: Update MAINTAINERS
  treewide: Fix typos in various parts of the kernel, and fix some comments.
  clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
  gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
  leds: Kconfig: Fix typo 'D2NET_V2'
  sound: Kconfig: drop unknown symbol ARCH_CLPS7500
  ...

Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
kconfig additions, close to removed commented-out old ones)
</pre>
</div>
</content>
</entry>
<entry>
<title>misc latin1 to utf8 conversions</title>
<updated>2012-01-02T12:04:55+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-12-29T22:09:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d36b691077dc59c74efec0d54ed21b86f7a2a21a'/>
<id>d36b691077dc59c74efec0d54ed21b86f7a2a21a</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix loss of notification with multi-event</title>
<updated>2011-12-05T08:33:03+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-11-26T01:47:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10c6db110d0eb4466b59812c49088ab56218fc2e'/>
<id>10c6db110d0eb4466b59812c49088ab56218fc2e</id>
<content type='text'>
When you do:
        $ perf record -e cycles,cycles,cycles noploop 10

You expect about 10,000 samples for each event, i.e., 10s at
1000samples/sec. However, this is not what's happening. You
get much fewer samples, maybe 3700 samples/event:

$ perf report -D | tail -15
Aggregated stats:
           TOTAL events:      10998
            MMAP events:         66
            COMM events:          2
          SAMPLE events:      10930
cycles stats:
           TOTAL events:       3644
          SAMPLE events:       3644
cycles stats:
           TOTAL events:       3642
          SAMPLE events:       3642
cycles stats:
           TOTAL events:       3644
          SAMPLE events:       3644

On a Intel Nehalem or even AMD64, there are 4 counters capable
of measuring cycles, so there is plenty of space to measure those
events without multiplexing (even with the NMI watchdog active).
And even with multiplexing, we'd expect roughly the same number
of samples per event.

The root of the problem was that when the event that caused the buffer
to become full was not the first event passed on the cmdline, the user
notification would get lost. The notification was sent to the file
descriptor of the overflowed event but the perf tool was not polling
on it.  The perf tool aggregates all samples into a single buffer,
i.e., the buffer of the first event. Consequently, it assumes
notifications for any event will come via that descriptor.

The seemingly straight forward solution of moving the waitq into the
ringbuffer object doesn't work because of life-time issues. One could
perf_event_set_output() on a fd that you're also blocking on and cause
the old rb object to be freed while its waitq would still be
referenced by the blocked thread -&gt; FAIL.

Therefore link all events to the ringbuffer and broadcast the wakeup
from the ringbuffer object to all possible events that could be waited
upon. This is rather ugly, and we're open to better solutions but it
works for now.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Finished-by: Stephane Eranian &lt;eranian@google.com&gt;
Reviewed-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When you do:
        $ perf record -e cycles,cycles,cycles noploop 10

You expect about 10,000 samples for each event, i.e., 10s at
1000samples/sec. However, this is not what's happening. You
get much fewer samples, maybe 3700 samples/event:

$ perf report -D | tail -15
Aggregated stats:
           TOTAL events:      10998
            MMAP events:         66
            COMM events:          2
          SAMPLE events:      10930
cycles stats:
           TOTAL events:       3644
          SAMPLE events:       3644
cycles stats:
           TOTAL events:       3642
          SAMPLE events:       3642
cycles stats:
           TOTAL events:       3644
          SAMPLE events:       3644

On a Intel Nehalem or even AMD64, there are 4 counters capable
of measuring cycles, so there is plenty of space to measure those
events without multiplexing (even with the NMI watchdog active).
And even with multiplexing, we'd expect roughly the same number
of samples per event.

The root of the problem was that when the event that caused the buffer
to become full was not the first event passed on the cmdline, the user
notification would get lost. The notification was sent to the file
descriptor of the overflowed event but the perf tool was not polling
on it.  The perf tool aggregates all samples into a single buffer,
i.e., the buffer of the first event. Consequently, it assumes
notifications for any event will come via that descriptor.

The seemingly straight forward solution of moving the waitq into the
ringbuffer object doesn't work because of life-time issues. One could
perf_event_set_output() on a fd that you're also blocking on and cause
the old rb object to be freed while its waitq would still be
referenced by the blocked thread -&gt; FAIL.

Therefore link all events to the ringbuffer and broadcast the wakeup
from the ringbuffer object to all possible events that could be waited
upon. This is rather ugly, and we're open to better solutions but it
works for now.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Finished-by: Stephane Eranian &lt;eranian@google.com&gt;
Reviewed-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Remove the perf_output_begin(.sample) argument</title>
<updated>2011-07-01T09:06:35+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-06-27T14:47:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a7ac67ea021b4603095d2aa458bc41641238f22c'/>
<id>a7ac67ea021b4603095d2aa458bc41641238f22c</id>
<content type='text'>
Since only samples call perf_output_sample() its much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since only samples call perf_output_sample() its much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Remove the nmi parameter from the swevent and overflow interface</title>
<updated>2011-07-01T09:06:35+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-06-27T12:41:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a8b0ca17b80e92faab46ee7179ba9e99ccb61233'/>
<id>a8b0ca17b80e92faab46ee7179ba9e99ccb61233</id>
<content type='text'>
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Michael Cree &lt;mcree@orcon.net.nz&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Deng-Cheng Zhu &lt;dengcheng.zhu@gmail.com&gt;
Cc: Anton Blanchard &lt;anton@samba.org&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Jason Wessel &lt;jason.wessel@windriver.com&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Michael Cree &lt;mcree@orcon.net.nz&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Deng-Cheng Zhu &lt;dengcheng.zhu@gmail.com&gt;
Cc: Anton Blanchard &lt;anton@samba.org&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Jason Wessel &lt;jason.wessel@windriver.com&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
