<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/events/core.c, branch v4.6-rc6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>perf/core: Fix perf_event_open() vs. execve() race</title>
<updated>2016-04-28T08:32:41+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-04-26T09:36:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=79c9ce57eb2d5f1497546a3946b4ae21b6fdc438'/>
<id>79c9ce57eb2d5f1497546a3946b4ae21b6fdc438</id>
<content type='text'>
Jann reported that the ptrace_may_access() check in
find_lively_task_by_vpid() is racy against exec().

Specifically:

  perf_event_open()		execve()

  ptrace_may_access()
				commit_creds()
  ...				if (get_dumpable() != SUID_DUMP_USER)
				  perf_event_exit_task();
  perf_install_in_context()

would result in installing a counter across the creds boundary.

Fix this by wrapping lots of perf_event_open() in cred_guard_mutex.
This should be fine as perf_event_exit_task() is already called with
cred_guard_mutex held, so all perf locks already nest inside it.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Jann reported that the ptrace_may_access() check in
find_lively_task_by_vpid() is racy against exec().

Specifically:

  perf_event_open()		execve()

  ptrace_may_access()
				commit_creds()
  ...				if (get_dumpable() != SUID_DUMP_USER)
				  perf_event_exit_task();
  perf_install_in_context()

would result in installing a counter across the creds boundary.

Fix this by wrapping lots of perf_event_open() in cred_guard_mutex.
This should be fine as perf_event_exit_task() is already called with
cred_guard_mutex held, so all perf locks already nest inside it.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Make sysctl_perf_cpu_time_max_percent conform to documentation</title>
<updated>2016-04-23T11:47:50+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-04-04T07:57:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b303e7c15d53cd8ef6b349b702e07eee3f102792'/>
<id>b303e7c15d53cd8ef6b349b702e07eee3f102792</id>
<content type='text'>
Markus reported that 0 should also disable the throttling we per
Documentation/sysctl/kernel.txt.

Reported-by: Markus Trippelsdorf &lt;markus@trippelsdorf.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: 91a612eea9a3 ("perf/core: Fix dynamic interrupt throttle")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Markus reported that 0 should also disable the throttling we per
Documentation/sysctl/kernel.txt.

Reported-by: Markus Trippelsdorf &lt;markus@trippelsdorf.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: 91a612eea9a3 ("perf/core: Fix dynamic interrupt throttle")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Don't leak event in the syscall error path</title>
<updated>2016-03-31T07:54:07+00:00</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2016-03-21T08:02:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=201c2f85bd0bc13b712d9c0b3d11251b182e06ae'/>
<id>201c2f85bd0bc13b712d9c0b3d11251b182e06ae</id>
<content type='text'>
In the error path, event_file not being NULL is used to determine
whether the event itself still needs to be free'd, so fix it up to
avoid leaking.

Reported-by: Leon Yu &lt;chianglungyu@gmail.com&gt;
Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: 130056275ade ("perf: Do not double free")
Link: http://lkml.kernel.org/r/87twk06yxp.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the error path, event_file not being NULL is used to determine
whether the event itself still needs to be free'd, so fix it up to
avoid leaking.

Reported-by: Leon Yu &lt;chianglungyu@gmail.com&gt;
Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: 130056275ade ("perf: Do not double free")
Link: http://lkml.kernel.org/r/87twk06yxp.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Fix time tracking bug with multiplexing</title>
<updated>2016-03-31T07:54:06+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-03-29T07:26:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8fdc65391c6ad16461526a685f03262b3b01bfde'/>
<id>8fdc65391c6ad16461526a685f03262b3b01bfde</id>
<content type='text'>
Stephane reported that commit:

  3cbaa5906967 ("perf: Fix ctx time tracking by introducing EVENT_TIME")

introduced a regression wrt. time tracking, as easily observed by:

&gt; This patch introduce a bug in the time tracking of events when
&gt; multiplexing is used.
&gt;
&gt; The issue is easily reproducible with the following perf run:
&gt;
&gt;  $ perf stat -a -C 0 -e branches,branches,branches,branches,branches,branches -I 1000
&gt;      1.000730239            652,394      branches   (66.41%)
&gt;      1.000730239            597,809      branches   (66.41%)
&gt;      1.000730239            593,870      branches   (66.63%)
&gt;      1.000730239            651,440      branches   (67.03%)
&gt;      1.000730239            656,725      branches   (66.96%)
&gt;      1.000730239      &lt;not counted&gt;      branches
&gt;
&gt; One branches event is shown as not having run. Yet, with
&gt; multiplexing, all events should run especially with a 1s (-I 1000)
&gt; interval. The delta for time_running comes out to 0. Yet, the event
&gt; has run because the kernel is actually multiplexing the events. The
&gt; problem is that the time tracking is the kernel and especially in
&gt; ctx_sched_out() is wrong now.
&gt;
&gt; The problem is that in case that the kernel enters ctx_sched_out() with the
&gt; following state:
&gt;    ctx-&gt;is_active=0x7 event_type=0x1
&gt;    Call Trace:
&gt;     [&lt;ffffffff813ddd41&gt;] dump_stack+0x63/0x82
&gt;     [&lt;ffffffff81182bdc&gt;] ctx_sched_out+0x2bc/0x2d0
&gt;     [&lt;ffffffff81183896&gt;] perf_mux_hrtimer_handler+0xf6/0x2c0
&gt;     [&lt;ffffffff811837a0&gt;] ? __perf_install_in_context+0x130/0x130
&gt;     [&lt;ffffffff810f5818&gt;] __hrtimer_run_queues+0xf8/0x2f0
&gt;     [&lt;ffffffff810f6097&gt;] hrtimer_interrupt+0xb7/0x1d0
&gt;     [&lt;ffffffff810509a8&gt;] local_apic_timer_interrupt+0x38/0x60
&gt;     [&lt;ffffffff8175ca9d&gt;] smp_apic_timer_interrupt+0x3d/0x50
&gt;     [&lt;ffffffff8175ac7c&gt;] apic_timer_interrupt+0x8c/0xa0
&gt;
&gt; In that case, the test:
&gt;       if (is_active &amp; EVENT_TIME)
&gt;
&gt; will be false and the time will not be updated. Time must always be updated on
&gt; sched out.

Fix this by always updating time if EVENT_TIME was set, as opposed to
only updating time when EVENT_TIME changed.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Tested-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Fixes: 3cbaa5906967 ("perf: Fix ctx time tracking by introducing EVENT_TIME")
Link: http://lkml.kernel.org/r/20160329072644.GB3408@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Stephane reported that commit:

  3cbaa5906967 ("perf: Fix ctx time tracking by introducing EVENT_TIME")

introduced a regression wrt. time tracking, as easily observed by:

&gt; This patch introduce a bug in the time tracking of events when
&gt; multiplexing is used.
&gt;
&gt; The issue is easily reproducible with the following perf run:
&gt;
&gt;  $ perf stat -a -C 0 -e branches,branches,branches,branches,branches,branches -I 1000
&gt;      1.000730239            652,394      branches   (66.41%)
&gt;      1.000730239            597,809      branches   (66.41%)
&gt;      1.000730239            593,870      branches   (66.63%)
&gt;      1.000730239            651,440      branches   (67.03%)
&gt;      1.000730239            656,725      branches   (66.96%)
&gt;      1.000730239      &lt;not counted&gt;      branches
&gt;
&gt; One branches event is shown as not having run. Yet, with
&gt; multiplexing, all events should run especially with a 1s (-I 1000)
&gt; interval. The delta for time_running comes out to 0. Yet, the event
&gt; has run because the kernel is actually multiplexing the events. The
&gt; problem is that the time tracking is the kernel and especially in
&gt; ctx_sched_out() is wrong now.
&gt;
&gt; The problem is that in case that the kernel enters ctx_sched_out() with the
&gt; following state:
&gt;    ctx-&gt;is_active=0x7 event_type=0x1
&gt;    Call Trace:
&gt;     [&lt;ffffffff813ddd41&gt;] dump_stack+0x63/0x82
&gt;     [&lt;ffffffff81182bdc&gt;] ctx_sched_out+0x2bc/0x2d0
&gt;     [&lt;ffffffff81183896&gt;] perf_mux_hrtimer_handler+0xf6/0x2c0
&gt;     [&lt;ffffffff811837a0&gt;] ? __perf_install_in_context+0x130/0x130
&gt;     [&lt;ffffffff810f5818&gt;] __hrtimer_run_queues+0xf8/0x2f0
&gt;     [&lt;ffffffff810f6097&gt;] hrtimer_interrupt+0xb7/0x1d0
&gt;     [&lt;ffffffff810509a8&gt;] local_apic_timer_interrupt+0x38/0x60
&gt;     [&lt;ffffffff8175ca9d&gt;] smp_apic_timer_interrupt+0x3d/0x50
&gt;     [&lt;ffffffff8175ac7c&gt;] apic_timer_interrupt+0x8c/0xa0
&gt;
&gt; In that case, the test:
&gt;       if (is_active &amp; EVENT_TIME)
&gt;
&gt; will be false and the time will not be updated. Time must always be updated on
&gt; sched out.

Fix this by always updating time if EVENT_TIME was set, as opposed to
only updating time when EVENT_TIME changed.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Tested-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Fixes: 3cbaa5906967 ("perf: Fix ctx time tracking by introducing EVENT_TIME")
Link: http://lkml.kernel.org/r/20160329072644.GB3408@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Document some hotplug bits</title>
<updated>2016-03-21T08:35:29+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-03-08T16:56:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1dcaac1ce07db672e2cb5b4ef9642990689bb2a1'/>
<id>1dcaac1ce07db672e2cb5b4ef9642990689bb2a1</id>
<content type='text'>
Document some of the hotplug notifier usage.

Requested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Document some of the hotplug notifier usage.

Requested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Fix dynamic interrupt throttle</title>
<updated>2016-03-21T08:08:17+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-03-17T14:17:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=91a612eea9a316c464cc170ff8492ec09e7d1c69'/>
<id>91a612eea9a316c464cc170ff8492ec09e7d1c69</id>
<content type='text'>
There were two problems with the dynamic interrupt throttle mechanism,
both triggered by the same action.

When you (or perf_fuzzer) write a huge value into
/proc/sys/kernel/perf_event_max_sample_rate the computed
perf_sample_allowed_ns becomes 0. This effectively disables the whole
dynamic throttle.

This is fixed by ensuring update_perf_cpu_limits() never sets the
value to 0. However, we allow disabling of the dynamic throttle by
writing 100 to /proc/sys/kernel/perf_cpu_time_max_percent. This will
generate a warning in dmesg.

The second problem is that by setting the max_sample_rate to a huge
number, the adaptive process can take a few tries, since it halfs the
limit each time. Change that to directly compute a new value based on
the observed duration.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There were two problems with the dynamic interrupt throttle mechanism,
both triggered by the same action.

When you (or perf_fuzzer) write a huge value into
/proc/sys/kernel/perf_event_max_sample_rate the computed
perf_sample_allowed_ns becomes 0. This effectively disables the whole
dynamic throttle.

This is fixed by ensuring update_perf_cpu_limits() never sets the
value to 0. However, we allow disabling of the dynamic throttle by
writing 100 to /proc/sys/kernel/perf_cpu_time_max_percent. This will
generate a warning in dmesg.

The second problem is that by setting the max_sample_rate to a huge
number, the adaptive process can take a few tries, since it halfs the
limit each time. Change that to directly compute a new value based on
the observed duration.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Fix the unthrottle logic</title>
<updated>2016-03-21T08:08:14+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-03-10T14:39:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1e02cd40f15190b78fcc6b3f50c952fb4028e9a5'/>
<id>1e02cd40f15190b78fcc6b3f50c952fb4028e9a5</id>
<content type='text'>
Its possible to IOC_PERIOD while the event is throttled, this would
re-start the event and the next tick would then try to unthrottle it,
and find the event wasn't actually stopped anymore.

This would tickle a WARN in the x86-pmu code which isn't expecting to
start a !stopped event.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: dvyukov@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160310143924.GR6356@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Its possible to IOC_PERIOD while the event is throttled, this would
re-start the event and the next tick would then try to unthrottle it,
and find the event wasn't actually stopped anymore.

This would tickle a WARN in the x86-pmu code which isn't expecting to
start a !stopped event.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: dvyukov@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160310143924.GR6356@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2016-03-15T02:44:38+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-03-15T02:44:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e23604edac2a7be6a8808a5d13fac6b9df4eb9a8'/>
<id>e23604edac2a7be6a8808a5d13fac6b9df4eb9a8</id>
<content type='text'>
Pull NOHZ updates from Ingo Molnar:
 "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
  the NOHZ 'can the tick be stopped?' infrastructure and related code to
  be data driven, and harmonizes the naming and handling of all the
  various properties"

[ This makes the ugly "fetch_or()" macro that the scheduler used
  internally a new generic helper, and does a bad job at it.

  I'm pulling it, but I've asked Ingo and Frederic to get this
  fixed up ]

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched-clock: Migrate to use new tick dependency mask model
  posix-cpu-timers: Migrate to use new tick dependency mask model
  sched: Migrate sched to use new tick dependency mask model
  sched: Account rr tasks
  perf: Migrate perf to use new tick dependency mask model
  nohz: Use enum code for tick stop failure tracing message
  nohz: New tick dependency mask
  nohz: Implement wide kick on top of irq work
  atomic: Export fetch_or()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull NOHZ updates from Ingo Molnar:
 "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
  the NOHZ 'can the tick be stopped?' infrastructure and related code to
  be data driven, and harmonizes the naming and handling of all the
  various properties"

[ This makes the ugly "fetch_or()" macro that the scheduler used
  internally a new generic helper, and does a bad job at it.

  I'm pulling it, but I've asked Ingo and Frederic to get this
  fixed up ]

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched-clock: Migrate to use new tick dependency mask model
  posix-cpu-timers: Migrate to use new tick dependency mask model
  sched: Migrate sched to use new tick dependency mask model
  sched: Account rr tasks
  perf: Migrate perf to use new tick dependency mask model
  nohz: Use enum code for tick stop failure tracing message
  nohz: New tick dependency mask
  nohz: Implement wide kick on top of irq work
  atomic: Export fetch_or()
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'timers/core-v9' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz</title>
<updated>2016-03-08T12:17:54+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2016-03-08T12:17:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1f25184656a00a59e3a953189070d42a749f6aee'/>
<id>1f25184656a00a59e3a953189070d42a749f6aee</id>
<content type='text'>
Pull nohz enhancements from Frederic Weisbecker:

"Currently in nohz full configs, the tick dependency is checked
 asynchronously by nohz code from interrupt and context switch for each
 concerned subsystem with a set of function provided by these. Such
 functions are made of many conditions and details that can be heavyweight
 as they are called on fastpath: sched_can_stop_tick(),
 posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...

 Thomas suggested a few months ago to make that tick dependency check
 synchronous. Instead of checking subsystems details from each interrupt
 to guess if the tick can be stopped, every subsystem that may have a tick
 dependency should set itself a flag specifying the state of that
 dependency. This way we can verify if we can stop the tick with a single
 lightweight mask check on fast path.

 This conversion from a pull to a push model to implement tick dependency
 is the core feature of this patchset that is split into:

  * Nohz wide kick simplification
  * Improve nohz tracing
  * Introduce tick dependency mask
  * Migrate scheduler, posix timers, perf events and sched clock tick
    dependencies to the tick dependency mask."

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull nohz enhancements from Frederic Weisbecker:

"Currently in nohz full configs, the tick dependency is checked
 asynchronously by nohz code from interrupt and context switch for each
 concerned subsystem with a set of function provided by these. Such
 functions are made of many conditions and details that can be heavyweight
 as they are called on fastpath: sched_can_stop_tick(),
 posix_cpu_timer_can_stop_tick(), perf_event_can_stop_tick()...

 Thomas suggested a few months ago to make that tick dependency check
 synchronous. Instead of checking subsystems details from each interrupt
 to guess if the tick can be stopped, every subsystem that may have a tick
 dependency should set itself a flag specifying the state of that
 dependency. This way we can verify if we can stop the tick with a single
 lightweight mask check on fast path.

 This conversion from a pull to a push model to implement tick dependency
 is the core feature of this patchset that is split into:

  * Nohz wide kick simplification
  * Improve nohz tracing
  * Introduce tick dependency mask
  * Migrate scheduler, posix timers, perf events and sched clock tick
    dependencies to the tick dependency mask."

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/core: Fix perf_sched_count derailment</title>
<updated>2016-03-08T11:18:31+00:00</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2016-03-02T11:24:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=927a5570855836e5d5859a80ce7e91e963545e8f'/>
<id>927a5570855836e5d5859a80ce7e91e963545e8f</id>
<content type='text'>
The error path in perf_event_open() is such that asking for a sampling
event on a PMU that doesn't generate interrupts will end up in dropping
the perf_sched_count even though it hasn't been incremented for this
event yet.

Given a sufficient amount of these calls, we'll end up disabling
scheduler's jump label even though we'd still have active events in the
system, thereby facilitating the arrival of the infernal regions upon us.

I'm fixing this by moving account_event() inside perf_event_alloc().

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The error path in perf_event_open() is such that asking for a sampling
event on a PMU that doesn't generate interrupts will end up in dropping
the perf_sched_count even though it hasn't been incremented for this
event yet.

Given a sufficient amount of these calls, we'll end up disabling
scheduler's jump label even though we'd still have active events in the
system, thereby facilitating the arrival of the infernal regions upon us.

I'm fixing this by moving account_event() inside perf_event_alloc().

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
