<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/time/timer_list.c, branch v4.16-rc4</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>timer/debug: Change /proc/timer_list from 0444 to 0400</title>
<updated>2017-11-13T15:04:06+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-11-13T06:15:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e7df2b5b7f245c9bd11064712db5cb69044a362'/>
<id>8e7df2b5b7f245c9bd11064712db5cb69044a362</id>
<content type='text'>
While it uses %pK, there's still few reasons to read this file
as non-root.

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While it uses %pK, there's still few reasons to read this file
as non-root.

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sysrq: Reset the watchdog timers while displaying high-resolution timers</title>
<updated>2017-03-23T19:46:53+00:00</updated>
<author>
<name>Tom Hromatka</name>
<email>tom.hromatka@oracle.com</email>
</author>
<published>2017-01-04T22:28:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0107042768658fea9f5f5a9c00b1c90f5dab6a06'/>
<id>0107042768658fea9f5f5a9c00b1c90f5dab6a06</id>
<content type='text'>
On systems with a large number of CPUs, running sysrq-&lt;q&gt; can cause
watchdog timeouts.  There are two slow sections of code in the sysrq-&lt;q&gt;
path in timer_list.c.

1. print_active_timers() - This function is called by print_cpu() and
   contains a slow goto loop.  On a machine with hundreds of CPUs, this
   loop took approximately 100ms for the first CPU in a NUMA node.
   (Subsequent CPUs in the same node ran much quicker.)  The total time
   to print all of the CPUs is ultimately long enough to trigger the
   soft lockup watchdog.

2. print_tickdevice() - This function outputs a large amount of textual
   information.  This function also took approximately 100ms per CPU.

Since sysrq-&lt;q&gt; is not a performance critical path, there should be no
harm in touching the nmi watchdog in both slow sections above.  Touching
it in just one location was insufficient on systems with hundreds of
CPUs as occasional timeouts were still observed during testing.

This issue was observed on an Oracle T7 machine with 128 CPUs, but I
anticipate it may affect other systems with similarly large numbers of
CPUs.

Signed-off-by: Tom Hromatka &lt;tom.hromatka@oracle.com&gt;
Reviewed-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On systems with a large number of CPUs, running sysrq-&lt;q&gt; can cause
watchdog timeouts.  There are two slow sections of code in the sysrq-&lt;q&gt;
path in timer_list.c.

1. print_active_timers() - This function is called by print_cpu() and
   contains a slow goto loop.  On a machine with hundreds of CPUs, this
   loop took approximately 100ms for the first CPU in a NUMA node.
   (Subsequent CPUs in the same node ran much quicker.)  The total time
   to print all of the CPUs is ultimately long enough to trigger the
   soft lockup watchdog.

2. print_tickdevice() - This function outputs a large amount of textual
   information.  This function also took approximately 100ms per CPU.

Since sysrq-&lt;q&gt; is not a performance critical path, there should be no
harm in touching the nmi watchdog in both slow sections above.  Touching
it in just one location was insufficient on systems with hundreds of
CPUs as occasional timeouts were still observed during testing.

This issue was observed on an Oracle T7 machine with 128 CPUs, but I
anticipate it may affect other systems with similarly large numbers of
CPUs.

Signed-off-by: Tom Hromatka &lt;tom.hromatka@oracle.com&gt;
Reviewed-by: Rob Gardner &lt;rob.gardner@oracle.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timer_list: Remove useless cast when printing</title>
<updated>2017-02-10T10:15:08+00:00</updated>
<author>
<name>Mars Cheng</name>
<email>mars.cheng@mediatek.com</email>
</author>
<published>2017-02-09T07:50:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7551b02b94ad1daee3a79d667dc3c46d08328f87'/>
<id>7551b02b94ad1daee3a79d667dc3c46d08328f87</id>
<content type='text'>
hrtimer_resolution is already unsigned int, not necessary to cast
it when printing.

Signed-off-by: Mars Cheng &lt;mars.cheng@mediatek.com&gt;
Cc: CC Hwang &lt;cc.hwang@mediatek.com&gt;
Cc: wsd_upstream@mediatek.com
Cc: Loda Chou &lt;loda.chou@mediatek.com&gt;
Cc: Jades Shih &lt;jades.shih@mediatek.com&gt;
Cc: Miles Chen &lt;miles.chen@mediatek.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: My Chuang &lt;my.chuang@mediatek.com&gt;
Cc: Matthias Brugger &lt;matthias.bgg@gmail.com&gt;
Cc: Yingjoe Chen &lt;yingjoe.chen@mediatek.com&gt;
Link: http://lkml.kernel.org/r/1486626615-5879-1-git-send-email-mars.cheng@mediatek.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
hrtimer_resolution is already unsigned int, not necessary to cast
it when printing.

Signed-off-by: Mars Cheng &lt;mars.cheng@mediatek.com&gt;
Cc: CC Hwang &lt;cc.hwang@mediatek.com&gt;
Cc: wsd_upstream@mediatek.com
Cc: Loda Chou &lt;loda.chou@mediatek.com&gt;
Cc: Jades Shih &lt;jades.shih@mediatek.com&gt;
Cc: Miles Chen &lt;miles.chen@mediatek.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: My Chuang &lt;my.chuang@mediatek.com&gt;
Cc: Matthias Brugger &lt;matthias.bgg@gmail.com&gt;
Cc: Yingjoe Chen &lt;yingjoe.chen@mediatek.com&gt;
Link: http://lkml.kernel.org/r/1486626615-5879-1-git-send-email-mars.cheng@mediatek.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>time: Remove CONFIG_TIMER_STATS</title>
<updated>2017-02-10T10:15:08+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2017-02-08T19:26:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dfb4357da6ddbdf57d583ba64361c9d792b0e0b1'/>
<id>dfb4357da6ddbdf57d583ba64361c9d792b0e0b1</id>
<content type='text'>
Currently CONFIG_TIMER_STATS exposes process information across namespaces:

kernel/time/timer_list.c print_timer():

        SEQ_printf(m, ", %s/%d", tmp, timer-&gt;start_pid);

/proc/timer_list:

 #11: &lt;0000000000000000&gt;, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.

Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Xing Gao &lt;xgao01@email.wm.edu&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Jessica Frazelle &lt;me@jessfraz.com&gt;
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss &lt;nicolas.iooss_linux@m4x.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Michal Marek &lt;mmarek@suse.com&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Olof Johansson &lt;olof@lixom.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently CONFIG_TIMER_STATS exposes process information across namespaces:

kernel/time/timer_list.c print_timer():

        SEQ_printf(m, ", %s/%d", tmp, timer-&gt;start_pid);

/proc/timer_list:

 #11: &lt;0000000000000000&gt;, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.

Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Xing Gao &lt;xgao01@email.wm.edu&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Jessica Frazelle &lt;me@jessfraz.com&gt;
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss &lt;nicolas.iooss_linux@m4x.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Michal Marek &lt;mmarek@suse.com&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Olof Johansson &lt;olof@lixom.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Replace &lt;asm/uaccess.h&gt; with &lt;linux/uaccess.h&gt; globally</title>
<updated>2016-12-24T19:46:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-24T19:46:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba'/>
<id>7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba</id>
<content type='text'>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hrtimer: Handle remaining time proper for TIME_LOW_RES</title>
<updated>2016-01-17T10:13:55+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-01-14T16:54:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=203cbf77de59fc8f13502dcfd11350c6d4a5c95f'/>
<id>203cbf77de59fc8f13502dcfd11350c6d4a5c95f</id>
<content type='text'>
If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
prevent short sleeps, but we do not account for that in interfaces which
retrieve the remaining time.

Helge observed that timerfd can return a remaining time larger than the
relative timeout. That's not expected and breaks userland test programs.

Store the information that the timer was armed relative and provide functions
to adjust the remaining time. To avoid bloating the hrtimer struct make state
a u8, which as a bonus results in better code on x86 at least.

Reported-and-tested-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
prevent short sleeps, but we do not account for that in interfaces which
retrieve the remaining time.

Helge observed that timerfd can return a remaining time larger than the
relative timeout. That's not expected and breaks userland test programs.

Store the information that the timer was armed relative and provide functions
to adjust the remaining time. To avoid bloating the hrtimer struct make state
a u8, which as a bonus results in better code on x86 at least.

Reported-and-tested-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clockevents: Remove unused set_mode() callback</title>
<updated>2015-09-14T09:00:55+00:00</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2015-09-11T04:04:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eef7635a22f6b144206b5ca2f1398f637acffc4d'/>
<id>eef7635a22f6b144206b5ca2f1398f637acffc4d</id>
<content type='text'>
All users are migrated to the per-state callbacks, get rid of the
unused interface and the core support code.

Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linaro-kernel@lists.linaro.org
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All users are migrated to the per-state callbacks, get rid of the
unused interface and the core support code.

Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linaro-kernel@lists.linaro.org
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers</title>
<updated>2015-08-17T18:23:31+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2015-05-27T23:44:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=38bf985b05625df3fbbc1dbf543bdd2da447c2af'/>
<id>38bf985b05625df3fbbc1dbf543bdd2da447c2af</id>
<content type='text'>
I noticed for non-monotonic timers in timer_list, some of the
output looked a little confusing.

For example:
 #1: &lt;0000000000000000&gt;, posix_timer_fn, S:01, hrtimer_start_range_ns, leap-a-day/2360
 # expires at 1434412800000000000-1434412800000000000 nsecs [in 1434410725062375469 to 1434410725062375469 nsecs]

You'll note the relative time till the expiration "[in xxx to
yyy nsecs]" is incorrect. This is because its printing the delta
between CLOCK_MONOTONIC time to the CLOCK_REALTIME expiration.

This patch fixes this issue by adding the clock offset to the
"now" time which we use to calculate the delta.

Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jiri Bohac &lt;jbohac@suse.cz&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Shuah Khan &lt;shuahkh@osg.samsung.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I noticed for non-monotonic timers in timer_list, some of the
output looked a little confusing.

For example:
 #1: &lt;0000000000000000&gt;, posix_timer_fn, S:01, hrtimer_start_range_ns, leap-a-day/2360
 # expires at 1434412800000000000-1434412800000000000 nsecs [in 1434410725062375469 to 1434410725062375469 nsecs]

You'll note the relative time till the expiration "[in xxx to
yyy nsecs]" is incorrect. This is because its printing the delta
between CLOCK_MONOTONIC time to the CLOCK_REALTIME expiration.

This patch fixes this issue by adding the clock offset to the
"now" time which we use to calculate the delta.

Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jiri Bohac &lt;jbohac@suse.cz&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Shuah Khan &lt;shuahkh@osg.samsung.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timer: Reduce timer migration overhead if disabled</title>
<updated>2015-06-19T13:18:28+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50'/>
<id>bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50</id>
<content type='text'>
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.

We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.

The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu


Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.

We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.

The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu


Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state</title>
<updated>2015-05-19T14:18:02+00:00</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2015-04-03T03:34:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8fff52fd50934580c5108afed12043a774edf728'/>
<id>8fff52fd50934580c5108afed12043a774edf728</id>
<content type='text'>
When no timers/hrtimers are pending, the expiry time is set to a
special value: 'KTIME_MAX'. This normally happens with
NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.

When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer
(NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device
(NOHZ_MODE_LOWRES).  But, the clockevent device is already
reprogrammed from the tick-handler for next tick.

As the clock event device is programmed in ONESHOT mode it will at
least fire one more time (unnecessarily). Timers on few
implementations (like arm_arch_timer, etc.) only support PERIODIC mode
and their drivers emulate ONESHOT over that. Which means that on these
platforms we will get spurious interrupts periodically (at last
programmed interval rate, normally tick rate).

In order to avoid spurious interrupts, the clockevent device should be
stopped or its interrupts should be masked.

A simple (yet hacky) solution to get this fixed could be: update
hrtimer_force_reprogram() to always reprogram clockevent device and
update clockevent drivers to STOP generating events (or delay it to
max time) when 'expires' is set to KTIME_MAX. But the drawback here is
that every clockevent driver has to be hacked for this particular case
and its very easy for new ones to miss this.

However, Thomas suggested to add an optional state ONESHOT_STOPPED to
solve this problem: lkml.org/lkml/2014/5/9/508.

This patch adds support for ONESHOT_STOPPED state in clockevents
core. It will only be available to drivers that implement the
state-specific callbacks instead of the legacy -&gt;set_mode() callback.

Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Reviewed-by: Preeti U. Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Cc: linaro-kernel@lists.linaro.org
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Kevin Hilman &lt;khilman@linaro.org&gt;
Cc: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/b8b383a03ac07b13312c16850b5106b82e4245b5.1428031396.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When no timers/hrtimers are pending, the expiry time is set to a
special value: 'KTIME_MAX'. This normally happens with
NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.

When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer
(NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device
(NOHZ_MODE_LOWRES).  But, the clockevent device is already
reprogrammed from the tick-handler for next tick.

As the clock event device is programmed in ONESHOT mode it will at
least fire one more time (unnecessarily). Timers on few
implementations (like arm_arch_timer, etc.) only support PERIODIC mode
and their drivers emulate ONESHOT over that. Which means that on these
platforms we will get spurious interrupts periodically (at last
programmed interval rate, normally tick rate).

In order to avoid spurious interrupts, the clockevent device should be
stopped or its interrupts should be masked.

A simple (yet hacky) solution to get this fixed could be: update
hrtimer_force_reprogram() to always reprogram clockevent device and
update clockevent drivers to STOP generating events (or delay it to
max time) when 'expires' is set to KTIME_MAX. But the drawback here is
that every clockevent driver has to be hacked for this particular case
and its very easy for new ones to miss this.

However, Thomas suggested to add an optional state ONESHOT_STOPPED to
solve this problem: lkml.org/lkml/2014/5/9/508.

This patch adds support for ONESHOT_STOPPED state in clockevents
core. It will only be available to drivers that implement the
state-specific callbacks instead of the legacy -&gt;set_mode() callback.

Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Reviewed-by: Preeti U. Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Cc: linaro-kernel@lists.linaro.org
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Kevin Hilman &lt;khilman@linaro.org&gt;
Cc: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/b8b383a03ac07b13312c16850b5106b82e4245b5.1428031396.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
