<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/events, branch v3.12.29</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>perf: Fix race in removing an event</title>
<updated>2014-06-20T15:33:52+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-02T14:56:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c8db9ba0ae861889aaed1cad87483a966fdda818'/>
<id>c8db9ba0ae861889aaed1cad87483a966fdda818</id>
<content type='text'>
commit 46ce0fe97a6be7532ce6126bb26ce89fed81528c upstream.

When removing a (sibling) event we do:

	raw_spin_lock_irq(&amp;ctx-&gt;lock);
	perf_group_detach(event);
	raw_spin_unlock_irq(&amp;ctx-&gt;lock);

	&lt;hole&gt;

	perf_remove_from_context(event);
		raw_spin_lock_irq(&amp;ctx-&gt;lock);
		...
		raw_spin_unlock_irq(&amp;ctx-&gt;lock);

Now, assuming the event is a sibling, it will be 'unreachable' for
things like ctx_sched_out() because that iterates the
groups-&gt;siblings, and we just unhooked the sibling.

So, if during &lt;hole&gt; we get ctx_sched_out(), it will miss the event
and not call event_sched_out() on it, leaving it programmed on the
PMU.

The subsequent perf_remove_from_context() call will find the ctx is
inactive and only call list_del_event() to remove the event from all
other lists.

Hereafter we can proceed to free the event; while still programmed!

Close this hole by moving perf_group_detach() inside the same
ctx-&gt;lock region(s) perf_remove_from_context() has.

The condition on inherited events only in __perf_event_exit_task() is
likely complete crap because non-inherited events are part of groups
too and we're tearing down just the same. But leave that for another
patch.

Most-likely-Fixes: e03a9a55b4e ("perf: Change close() semantics for group events")
Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Tested-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20140505093124.GN17778@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 46ce0fe97a6be7532ce6126bb26ce89fed81528c upstream.

When removing a (sibling) event we do:

	raw_spin_lock_irq(&amp;ctx-&gt;lock);
	perf_group_detach(event);
	raw_spin_unlock_irq(&amp;ctx-&gt;lock);

	&lt;hole&gt;

	perf_remove_from_context(event);
		raw_spin_lock_irq(&amp;ctx-&gt;lock);
		...
		raw_spin_unlock_irq(&amp;ctx-&gt;lock);

Now, assuming the event is a sibling, it will be 'unreachable' for
things like ctx_sched_out() because that iterates the
groups-&gt;siblings, and we just unhooked the sibling.

So, if during &lt;hole&gt; we get ctx_sched_out(), it will miss the event
and not call event_sched_out() on it, leaving it programmed on the
PMU.

The subsequent perf_remove_from_context() call will find the ctx is
inactive and only call list_del_event() to remove the event from all
other lists.

Hereafter we can proceed to free the event; while still programmed!

Close this hole by moving perf_group_detach() inside the same
ctx-&gt;lock region(s) perf_remove_from_context() has.

The condition on inherited events only in __perf_event_exit_task() is
likely complete crap because non-inherited events are part of groups
too and we're tearing down just the same. But leave that for another
patch.

Most-likely-Fixes: e03a9a55b4e ("perf: Change close() semantics for group events")
Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Tested-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20140505093124.GN17778@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Limit perf_event_attr::sample_period to 63 bits</title>
<updated>2014-06-20T15:33:52+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-15T18:23:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ed8acba3f62f4cb479628215582b487872c398d7'/>
<id>ed8acba3f62f4cb479628215582b487872c398d7</id>
<content type='text'>
commit 0819b2e30ccb93edf04876237b6205eef84ec8d2 upstream.

Vince reported that using a large sample_period (one with bit 63 set)
results in wreckage since while the sample_period is fundamentally
unsigned (negative periods don't make sense) the way we implement
things very much rely on signed logic.

So limit sample_period to 63 bits to avoid tripping over this.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/n/tip-p25fhunibl4y3qi0zuqmyf4b@git.kernel.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0819b2e30ccb93edf04876237b6205eef84ec8d2 upstream.

Vince reported that using a large sample_period (one with bit 63 set)
results in wreckage since while the sample_period is fundamentally
unsigned (negative periods don't make sense) the way we implement
things very much rely on signed logic.

So limit sample_period to 63 bits to avoid tripping over this.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/n/tip-p25fhunibl4y3qi0zuqmyf4b@git.kernel.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Prevent false warning in perf_swevent_add</title>
<updated>2014-06-20T15:33:51+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2014-04-07T09:04:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e36e8c8f72bc4f8309bf0dea92d58aa661c74c84'/>
<id>e36e8c8f72bc4f8309bf0dea92d58aa661c74c84</id>
<content type='text'>
commit 39af6b1678afa5880dda7e375cf3f9d395087f6d upstream.

The perf cpu offline callback takes down all cpu context
events and releases swhash-&gt;swevent_hlist.

This could race with task context software event being just
scheduled on this cpu via perf_swevent_add while cpu hotplug
code already cleaned up event's data.

The race happens in the gap between the cpu notifier code
and the cpu being actually taken down. Note that only cpu
ctx events are terminated in the perf cpu hotplug code.

It's easily reproduced with:
  $ perf record -e faults perf bench sched pipe

while putting one of the cpus offline:
  # echo 0 &gt; /sys/devices/system/cpu/cpu1/online

Console emits following warning:
  WARNING: CPU: 1 PID: 2845 at kernel/events/core.c:5672 perf_swevent_add+0x18d/0x1a0()
  Modules linked in:
  CPU: 1 PID: 2845 Comm: sched-pipe Tainted: G        W    3.14.0+ #256
  Hardware name: Intel Corporation Montevina platform/To be filled by O.E.M., BIOS AMVACRB1.86C.0066.B00.0805070703 05/07/2008
   0000000000000009 ffff880077233ab8 ffffffff81665a23 0000000000200005
   0000000000000000 ffff880077233af8 ffffffff8104732c 0000000000000046
   ffff88007467c800 0000000000000002 ffff88007a9cf2a0 0000000000000001
  Call Trace:
   [&lt;ffffffff81665a23&gt;] dump_stack+0x4f/0x7c
   [&lt;ffffffff8104732c&gt;] warn_slowpath_common+0x8c/0xc0
   [&lt;ffffffff8104737a&gt;] warn_slowpath_null+0x1a/0x20
   [&lt;ffffffff8110fb3d&gt;] perf_swevent_add+0x18d/0x1a0
   [&lt;ffffffff811162ae&gt;] event_sched_in.isra.75+0x9e/0x1f0
   [&lt;ffffffff8111646a&gt;] group_sched_in+0x6a/0x1f0
   [&lt;ffffffff81083dd5&gt;] ? sched_clock_local+0x25/0xa0
   [&lt;ffffffff811167e6&gt;] ctx_sched_in+0x1f6/0x450
   [&lt;ffffffff8111757b&gt;] perf_event_sched_in+0x6b/0xa0
   [&lt;ffffffff81117a4b&gt;] perf_event_context_sched_in+0x7b/0xc0
   [&lt;ffffffff81117ece&gt;] __perf_event_task_sched_in+0x43e/0x460
   [&lt;ffffffff81096f1e&gt;] ? put_lock_stats.isra.18+0xe/0x30
   [&lt;ffffffff8107b3c8&gt;] finish_task_switch+0xb8/0x100
   [&lt;ffffffff8166a7de&gt;] __schedule+0x30e/0xad0
   [&lt;ffffffff81172dd2&gt;] ? pipe_read+0x3e2/0x560
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b464&gt;] preempt_schedule_irq+0x44/0x70
   [&lt;ffffffff816707f0&gt;] retint_kernel+0x20/0x30
   [&lt;ffffffff8109e60a&gt;] ? lockdep_sys_exit+0x1a/0x90
   [&lt;ffffffff812a4234&gt;] lockdep_sys_exit_thunk+0x35/0x67
   [&lt;ffffffff81679321&gt;] ? sysret_check+0x5/0x56

Fixing this by tracking the cpu hotplug state and displaying
the WARN only if current cpu is initialized properly.

Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1396861448-10097-1-git-send-email-jolsa@redhat.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 39af6b1678afa5880dda7e375cf3f9d395087f6d upstream.

The perf cpu offline callback takes down all cpu context
events and releases swhash-&gt;swevent_hlist.

This could race with task context software event being just
scheduled on this cpu via perf_swevent_add while cpu hotplug
code already cleaned up event's data.

The race happens in the gap between the cpu notifier code
and the cpu being actually taken down. Note that only cpu
ctx events are terminated in the perf cpu hotplug code.

It's easily reproduced with:
  $ perf record -e faults perf bench sched pipe

while putting one of the cpus offline:
  # echo 0 &gt; /sys/devices/system/cpu/cpu1/online

Console emits following warning:
  WARNING: CPU: 1 PID: 2845 at kernel/events/core.c:5672 perf_swevent_add+0x18d/0x1a0()
  Modules linked in:
  CPU: 1 PID: 2845 Comm: sched-pipe Tainted: G        W    3.14.0+ #256
  Hardware name: Intel Corporation Montevina platform/To be filled by O.E.M., BIOS AMVACRB1.86C.0066.B00.0805070703 05/07/2008
   0000000000000009 ffff880077233ab8 ffffffff81665a23 0000000000200005
   0000000000000000 ffff880077233af8 ffffffff8104732c 0000000000000046
   ffff88007467c800 0000000000000002 ffff88007a9cf2a0 0000000000000001
  Call Trace:
   [&lt;ffffffff81665a23&gt;] dump_stack+0x4f/0x7c
   [&lt;ffffffff8104732c&gt;] warn_slowpath_common+0x8c/0xc0
   [&lt;ffffffff8104737a&gt;] warn_slowpath_null+0x1a/0x20
   [&lt;ffffffff8110fb3d&gt;] perf_swevent_add+0x18d/0x1a0
   [&lt;ffffffff811162ae&gt;] event_sched_in.isra.75+0x9e/0x1f0
   [&lt;ffffffff8111646a&gt;] group_sched_in+0x6a/0x1f0
   [&lt;ffffffff81083dd5&gt;] ? sched_clock_local+0x25/0xa0
   [&lt;ffffffff811167e6&gt;] ctx_sched_in+0x1f6/0x450
   [&lt;ffffffff8111757b&gt;] perf_event_sched_in+0x6b/0xa0
   [&lt;ffffffff81117a4b&gt;] perf_event_context_sched_in+0x7b/0xc0
   [&lt;ffffffff81117ece&gt;] __perf_event_task_sched_in+0x43e/0x460
   [&lt;ffffffff81096f1e&gt;] ? put_lock_stats.isra.18+0xe/0x30
   [&lt;ffffffff8107b3c8&gt;] finish_task_switch+0xb8/0x100
   [&lt;ffffffff8166a7de&gt;] __schedule+0x30e/0xad0
   [&lt;ffffffff81172dd2&gt;] ? pipe_read+0x3e2/0x560
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b464&gt;] preempt_schedule_irq+0x44/0x70
   [&lt;ffffffff816707f0&gt;] retint_kernel+0x20/0x30
   [&lt;ffffffff8109e60a&gt;] ? lockdep_sys_exit+0x1a/0x90
   [&lt;ffffffff812a4234&gt;] lockdep_sys_exit_thunk+0x35/0x67
   [&lt;ffffffff81679321&gt;] ? sysret_check+0x5/0x56

Fixing this by tracking the cpu hotplug state and displaying
the WARN only if current cpu is initialized properly.

Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1396861448-10097-1-git-send-email-jolsa@redhat.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>list: introduce list_next_entry() and list_prev_entry()</title>
<updated>2014-05-29T09:37:44+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2013-11-12T23:10:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=99565b6c9eb1ae28ab633d7104e654e6dc6a6679'/>
<id>99565b6c9eb1ae28ab633d7104e654e6dc6a6679</id>
<content type='text'>
commit 008208c6b26f21c2648c250a09c55e737c02c5f8 upstream.

Add two trivial helpers list_next_entry() and list_prev_entry(), they
can have a lot of users including list.h itself.  In fact the 1st one is
already defined in events/core.c and bnx2x_sp.c, so the patch simply
moves the definition to list.h.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eilon Greenstein &lt;eilong@broadcom.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 008208c6b26f21c2648c250a09c55e737c02c5f8 upstream.

Add two trivial helpers list_next_entry() and list_prev_entry(), they
can have a lot of users including list.h itself.  In fact the 1st one is
already defined in events/core.c and bnx2x_sp.c, so the patch simply
moves the definition to list.h.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eilon Greenstein &lt;eilong@broadcom.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix hotplug splat</title>
<updated>2014-03-05T16:13:51+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-02-24T11:06:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=290de678159b95d6f45f40214938c5a444eacad2'/>
<id>290de678159b95d6f45f40214938c5a444eacad2</id>
<content type='text'>
commit e3703f8cdfcf39c25c4338c3ad8e68891cca3731 upstream.

Drew Richardson reported that he could make the kernel go *boom* when hotplugging
while having perf events active.

It turned out that when you have a group event, the code in
__perf_event_exit_context() fails to remove the group siblings from
the context.

We then proceed with destroying and freeing the event, and when you
re-plug the CPU and try and add another event to that CPU, things go
*boom* because you've still got dead entries there.

Reported-by: Drew Richardson &lt;drew.richardson@arm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/n/tip-k6v5wundvusvcseqj1si0oz0@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e3703f8cdfcf39c25c4338c3ad8e68891cca3731 upstream.

Drew Richardson reported that he could make the kernel go *boom* when hotplugging
while having perf events active.

It turned out that when you have a group event, the code in
__perf_event_exit_context() fails to remove the group siblings from
the context.

We then proceed with destroying and freeing the event, and when you
re-plug the CPU and try and add another event to that CPU, things go
*boom* because you've still got dead entries there.

Reported-by: Drew Richardson &lt;drew.richardson@arm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/n/tip-k6v5wundvusvcseqj1si0oz0@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Enforce 1 as lower limit for perf_event_max_sample_rate</title>
<updated>2014-02-26T09:24:22+00:00</updated>
<author>
<name>Knut Petersen</name>
<email>Knut_Petersen@t-online.de</email>
</author>
<published>2013-09-25T12:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2b183ff24e726b97e2eeef5a54f88d4a8ac4f7d8'/>
<id>2b183ff24e726b97e2eeef5a54f88d4a8ac4f7d8</id>
<content type='text'>
commit 723478c8a471403c53cf144999701f6e0c4bbd11 upstream.

/proc/sys/kernel/perf_event_max_sample_rate will accept
negative values as well as 0.

Negative values are unreasonable, and 0 causes a
divide by zero exception in perf_proc_update_handler.

This patch enforces a lower limit of 1.

Signed-off-by: Knut Petersen &lt;Knut_Petersen@t-online.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 723478c8a471403c53cf144999701f6e0c4bbd11 upstream.

/proc/sys/kernel/perf_event_max_sample_rate will accept
negative values as well as 0.

Negative values are unreasonable, and 0 causes a
divide by zero exception in perf_proc_update_handler.

This patch enforces a lower limit of 1.

Signed-off-by: Knut Petersen &lt;Knut_Petersen@t-online.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix perf ring buffer memory ordering</title>
<updated>2013-10-29T11:01:19+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-28T12:55:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf378d341e4873ed928dc3c636252e6895a21f50'/>
<id>bf378d341e4873ed928dc3c636252e6895a21f50</id>
<content type='text'>
The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Tested-by: Victor Kaplansky &lt;victork@il.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: michael@ellerman.id.au
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Disable PERF_RECORD_MMAP2 support</title>
<updated>2013-10-17T19:27:14+00:00</updated>
<author>
<name>Stephane Eranian</name>
<email>eranian@google.com</email>
</author>
<published>2013-10-17T17:32:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3090ffb5a2515990182f3f55b0688a7817325488'/>
<id>3090ffb5a2515990182f3f55b0688a7817325488</id>
<content type='text'>
For now, we disable the extended MMAP record support (MMAP2).

We have identified cases where it would not report the correct mapping
information, clone(VM_CLONE) but with separate pids.  We will revisit
the support once we find a solution for this case.

The patch changes the kernel to return EINVAL if attr-&gt;mmap2 is set. The
patch also modifies the perf tool to use regular PERF_RECORD_MMAP for
synthetic events and it also prevents the tool from requesting
attr-&gt;mmap2 mode because the kernel would reject it.

The support will be revisited once the kenrel interface is updated.

In V2, we reduce the patch to the strict minimum.

In V3, we avoid calling perf_event_open() with mmap2 set because we know
it will fail and require fallback retry.

Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20131017173215.GA8820@quad
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For now, we disable the extended MMAP record support (MMAP2).

We have identified cases where it would not report the correct mapping
information, clone(VM_CLONE) but with separate pids.  We will revisit
the support once we find a solution for this case.

The patch changes the kernel to return EINVAL if attr-&gt;mmap2 is set. The
patch also modifies the perf tool to use regular PERF_RECORD_MMAP for
synthetic events and it also prevents the tool from requesting
attr-&gt;mmap2 mode because the kernel would reject it.

The support will be revisited once the kenrel interface is updated.

In V2, we reduce the patch to the strict minimum.

In V3, we avoid calling perf_event_open() with mmap2 set because we know
it will fail and require fallback retry.

Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20131017173215.GA8820@quad
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix perf_pmu_migrate_context</title>
<updated>2013-10-04T07:58:53+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-03T14:02:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9886167d20c0720dcfb01e62cdff4d906b226f43'/>
<id>9886167d20c0720dcfb01e62cdff4d906b226f43</id>
<content type='text'>
While auditing the list_entry usage due to a trinity bug I found that
perf_pmu_migrate_context violates the rules for
perf_event::event_entry.

The problem is that perf_event::event_entry is a RCU list element, and
hence we must wait for a full RCU grace period before re-using the
element after deletion.

Therefore the usage in perf_pmu_migrate_context() which re-uses the
entry immediately is broken. For now introduce another list_head into
perf_event for this specific usage.

This doesn't actually fix the trinity report because that never goes
through this code.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While auditing the list_entry usage due to a trinity bug I found that
perf_pmu_migrate_context violates the rules for
perf_event::event_entry.

The problem is that perf_event::event_entry is a RCU list element, and
hence we must wait for a full RCU grace period before re-using the
element after deletion.

Therefore the usage in perf_pmu_migrate_context() which re-uses the
entry immediately is broken. For now introduce another list_head into
perf_event for this specific usage.

This doesn't actually fix the trinity report because that never goes
through this code.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'</title>
<updated>2013-09-20T07:45:11+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-09-19T08:16:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fa7315871046b9a4c48627905691dbde57e51033'/>
<id>fa7315871046b9a4c48627905691dbde57e51033</id>
<content type='text'>
Solve the problems around the broken definition of perf_event_mmap_page::
cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
fixed by:

  860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'")

The problem with the fix (merged in v3.12-rc1 and not yet released
officially), noticed by Vince Weaver is that the new behavior is
not detectable by new user-space, and that due to the reuse of the
field names it's easy to mis-compile a binary if old headers are used
on a new kernel or new headers are used on an old kernel.

To solve all that make this change explicit, detectable and self-contained,
by iterating the ABI the following way:

 - Always clear bit 0, and rename it to usrpage-&gt;cap_bit0, to at least not
   confuse old user-space binaries. RDPMC will be marked as unavailable
   to old binaries but that's within the ABI, this is a capability bit.

 - Rename bit 1 to -&gt;cap_bit0_is_deprecated and always set it to 1, so new
   libraries can reliably detect that bit 0 is deprecated and perma-zero
   without having to check the kernel version.

 - Use bits 2, 3, 4 for the newly defined, correct functionality:

	cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
	cap_user_time		: 1, /* The time_* fields are used */
	cap_user_time_zero	: 1, /* The time_zero field is used */

 - Rename all the bitfield names in perf_event.h to be different from the
   old names, to make sure it's not possible to mis-compile it
   accidentally with old assumptions.

The 'size' field can then be used in the future to add new fields and it
will act as a natural ABI version indicator as well.

Also adjust tools/perf/ userspace for the new definitions, noticed by
Adrian Hunter.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Also-Fixed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Solve the problems around the broken definition of perf_event_mmap_page::
cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
fixed by:

  860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'")

The problem with the fix (merged in v3.12-rc1 and not yet released
officially), noticed by Vince Weaver is that the new behavior is
not detectable by new user-space, and that due to the reuse of the
field names it's easy to mis-compile a binary if old headers are used
on a new kernel or new headers are used on an old kernel.

To solve all that make this change explicit, detectable and self-contained,
by iterating the ABI the following way:

 - Always clear bit 0, and rename it to usrpage-&gt;cap_bit0, to at least not
   confuse old user-space binaries. RDPMC will be marked as unavailable
   to old binaries but that's within the ABI, this is a capability bit.

 - Rename bit 1 to -&gt;cap_bit0_is_deprecated and always set it to 1, so new
   libraries can reliably detect that bit 0 is deprecated and perma-zero
   without having to check the kernel version.

 - Use bits 2, 3, 4 for the newly defined, correct functionality:

	cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
	cap_user_time		: 1, /* The time_* fields are used */
	cap_user_time_zero	: 1, /* The time_zero field is used */

 - Rename all the bitfield names in perf_event.h to be different from the
   old names, to make sure it's not possible to mis-compile it
   accidentally with old assumptions.

The 'size' field can then be used in the future to add new fields and it
will act as a natural ABI version indicator as well.

Also adjust tools/perf/ userspace for the new definitions, noticed by
Adrian Hunter.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Also-Fixed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
