<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/time/timekeeping.c, branch v3.10.54</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>timekeeping: Avoid possible deadlock from clock_was_set_delayed</title>
<updated>2014-02-13T21:48:04+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-12-11T01:18:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d9e8fada0c0161f6fe2499a1b7dc9ce18e20fec2'/>
<id>d9e8fada0c0161f6fe2499a1b7dc9ce18e20fec2</id>
<content type='text'>
commit 6fdda9a9c5db367130cf32df5d6618d08b89f46a upstream.

As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -&gt; timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [&lt;ffffffff81160e96&gt;] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [&lt;ffffffff81160e7c&gt;] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-&gt; #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-&gt; #4 (&amp;rt_b-&gt;rt_runtime_lock){-.-...}:
[snipped]
-&gt; #3 (&amp;rq-&gt;lock){-.-.-.}:
[snipped]
-&gt; #2 (&amp;p-&gt;pi_lock){-.-.-.}:
[snipped]
-&gt; #1 (&amp;(&amp;pool-&gt;lock)-&gt;rlock){-.-...}:
[  251.101967]        [&lt;ffffffff81194803&gt;] validate_chain+0x6c3/0x7b0
[  251.101967]        [&lt;ffffffff81194d9d&gt;] __lock_acquire+0x4ad/0x580
[  251.101967]        [&lt;ffffffff81194ff2&gt;] lock_acquire+0x182/0x1d0
[  251.101967]        [&lt;ffffffff84398500&gt;] _raw_spin_lock+0x40/0x80
[  251.101967]        [&lt;ffffffff81153e69&gt;] __queue_work+0x1a9/0x3f0
[  251.101967]        [&lt;ffffffff81154168&gt;] queue_work_on+0x98/0x120
[  251.101967]        [&lt;ffffffff81161351&gt;] clock_was_set_delayed+0x21/0x30
[  251.101967]        [&lt;ffffffff811c4bd1&gt;] do_adjtimex+0x111/0x160
[  251.101967]        [&lt;ffffffff811e2711&gt;] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [&lt;ffffffff843a4b49&gt;] ia32_sysret+0x0/0x5
[  251.101967]
-&gt; #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --&gt; &amp;rt_b-&gt;rt_runtime_lock --&gt; hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&amp;rt_b-&gt;rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [&lt;ffffffff81154960&gt;] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [&lt;ffffffff81154960&gt;] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [&lt;ffffffff81160e7c&gt;] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6fdda9a9c5db367130cf32df5d6618d08b89f46a upstream.

As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -&gt; timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [&lt;ffffffff81160e96&gt;] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [&lt;ffffffff81160e7c&gt;] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-&gt; #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-&gt; #4 (&amp;rt_b-&gt;rt_runtime_lock){-.-...}:
[snipped]
-&gt; #3 (&amp;rq-&gt;lock){-.-.-.}:
[snipped]
-&gt; #2 (&amp;p-&gt;pi_lock){-.-.-.}:
[snipped]
-&gt; #1 (&amp;(&amp;pool-&gt;lock)-&gt;rlock){-.-...}:
[  251.101967]        [&lt;ffffffff81194803&gt;] validate_chain+0x6c3/0x7b0
[  251.101967]        [&lt;ffffffff81194d9d&gt;] __lock_acquire+0x4ad/0x580
[  251.101967]        [&lt;ffffffff81194ff2&gt;] lock_acquire+0x182/0x1d0
[  251.101967]        [&lt;ffffffff84398500&gt;] _raw_spin_lock+0x40/0x80
[  251.101967]        [&lt;ffffffff81153e69&gt;] __queue_work+0x1a9/0x3f0
[  251.101967]        [&lt;ffffffff81154168&gt;] queue_work_on+0x98/0x120
[  251.101967]        [&lt;ffffffff81161351&gt;] clock_was_set_delayed+0x21/0x30
[  251.101967]        [&lt;ffffffff811c4bd1&gt;] do_adjtimex+0x111/0x160
[  251.101967]        [&lt;ffffffff811e2711&gt;] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [&lt;ffffffff843a4b49&gt;] ia32_sysret+0x0/0x5
[  251.101967]
-&gt; #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --&gt; &amp;rt_b-&gt;rt_runtime_lock --&gt; hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&amp;rt_b-&gt;rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [&lt;ffffffff81154960&gt;] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [&lt;ffffffff81154960&gt;] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [&lt;ffffffff81160e7c&gt;] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Fix missing timekeeping_update in suspend path</title>
<updated>2014-02-13T21:48:03+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-12-12T03:10:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=226e0f713f585c549f4200bb8a69b6753dff28d0'/>
<id>226e0f713f585c549f4200bb8a69b6753dff28d0</id>
<content type='text'>
commit 330a1617b0a6268d427aa5922c94d082b1d3e96d upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

In the timekeeping suspend path, we udpate the timekeeper
structure, so we should be sure to update the shadow-timekeeper
before releasing the timekeeping locks. Currently this isn't done.

In most cases, the next time related code to run would be
timekeeping_resume, which does update the shadow-timekeeper, but
in an abundence of caution, this patch adds the call to
timekeeping_update() in the suspend path.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 330a1617b0a6268d427aa5922c94d082b1d3e96d upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

In the timekeeping suspend path, we udpate the timekeeper
structure, so we should be sure to update the shadow-timekeeper
before releasing the timekeeping locks. Currently this isn't done.

In most cases, the next time related code to run would be
timekeeping_resume, which does update the shadow-timekeeper, but
in an abundence of caution, this patch adds the call to
timekeeping_update() in the suspend path.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Fix CLOCK_TAI timer/nanosleep delays</title>
<updated>2014-02-13T21:48:03+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-12-11T01:13:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a8ad6b67721e81c2e181ae3e0f3aea79da779cd7'/>
<id>a8ad6b67721e81c2e181ae3e0f3aea79da779cd7</id>
<content type='text'>
commit 04005f6011e3b504cd4d791d9769f7cb9a3b2eae upstream.

A think-o in the calculation of the monotonic -&gt; tai time offset
results in CLOCK_TAI timers and nanosleeps to expire late (the
latency is ~2x the tai offset).

Fix this by adding the tai offset from the realtime offset instead
of subtracting.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 04005f6011e3b504cd4d791d9769f7cb9a3b2eae upstream.

A think-o in the calculation of the monotonic -&gt; tai time offset
results in CLOCK_TAI timers and nanosleeps to expire late (the
latency is ~2x the tai offset).

Fix this by adding the tai offset from the realtime offset instead
of subtracting.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Fix lost updates to tai adjustment</title>
<updated>2014-02-13T21:48:03+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-12-12T02:50:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=77535a0a168b2bf7c4aa35a20792a6e895844424'/>
<id>77535a0a168b2bf7c4aa35a20792a6e895844424</id>
<content type='text'>
commit f55c07607a38f84b5c7e6066ee1cfe433fa5643c upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

Unfortunately, the updates to the tai offset via adjtimex do not
trigger this update, causing adjustments to the tai offset to be
made and then over-written by the previous value at the next
update_wall_time() call.

This patch resovles the issue by calling timekeeping_update()
right after setting the tai offset.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f55c07607a38f84b5c7e6066ee1cfe433fa5643c upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

Unfortunately, the updates to the tai offset via adjtimex do not
trigger this update, causing adjustments to the tai offset to be
made and then over-written by the previous value at the next
update_wall_time() call.

This patch resovles the issue by calling timekeeping_update()
right after setting the tai offset.

Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>time: Fix 1ns/tick drift w/ GENERIC_TIME_VSYSCALL_OLD</title>
<updated>2013-12-12T06:36:27+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2013-11-22T19:44:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=78f8d9b5647283bdea224d9bb7fb99f8f37a7614'/>
<id>78f8d9b5647283bdea224d9bb7fb99f8f37a7614</id>
<content type='text'>
commit 4be77398ac9d948773116b6be4a3c91b3d6ea18c upstream.

Since commit 1e75fa8be9f (time: Condense timekeeper.xtime
into xtime_sec - merged in v3.6), there has been an problem
with the error accounting in the timekeeping code, such that
when truncating to nanoseconds, we round up to the next nsec,
but the balancing adjustment to the ntp_error value was dropped.

This causes 1ns per tick drift forward of the clock.

In 3.7, this logic was isolated to only GENERIC_TIME_VSYSCALL_OLD
architectures (s390, ia64, powerpc).

The fix is simply to balance the accounting and to subtract the
added nanosecond from ntp_error. This allows the internal long-term
clock steering to keep the clock accurate.

While this fix removes the regression added in 1e75fa8be9f, the
ideal solution is to move away from GENERIC_TIME_VSYSCALL_OLD
and use the new VSYSCALL method, which avoids entirely the
nanosecond granular rounding, and the resulting short-term clock
adjustment oscillation needed to keep long term accurate time.

[ jstultz: Many thanks to Martin for his efforts identifying this
  	   subtle bug, and providing the fix. ]

Originally-from: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1385149491-20307-1-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4be77398ac9d948773116b6be4a3c91b3d6ea18c upstream.

Since commit 1e75fa8be9f (time: Condense timekeeper.xtime
into xtime_sec - merged in v3.6), there has been an problem
with the error accounting in the timekeeping code, such that
when truncating to nanoseconds, we round up to the next nsec,
but the balancing adjustment to the ntp_error value was dropped.

This causes 1ns per tick drift forward of the clock.

In 3.7, this logic was isolated to only GENERIC_TIME_VSYSCALL_OLD
architectures (s390, ia64, powerpc).

The fix is simply to balance the accounting and to subtract the
added nanosecond from ntp_error. This allows the internal long-term
clock steering to keep the clock accurate.

While this fix removes the regression added in 1e75fa8be9f, the
ideal solution is to move away from GENERIC_TIME_VSYSCALL_OLD
and use the new VSYSCALL method, which avoids entirely the
nanosecond granular rounding, and the resulting short-term clock
adjustment oscillation needed to keep long term accurate time.

[ jstultz: Many thanks to Martin for his efforts identifying this
  	   subtle bug, and providing the fix. ]

Originally-from: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1385149491-20307-1-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Fix HRTICK related deadlock from ntp lock changes</title>
<updated>2013-10-01T16:17:45+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-09-11T23:50:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=63946e8616205dafe39b4d88f9fc3dc7c4fd79aa'/>
<id>63946e8616205dafe39b4d88f9fc3dc7c4fd79aa</id>
<content type='text'>
commit 7bd36014460f793c19e7d6c94dab67b0afcfcb7f upstream.

Gerlando Falauto reported that when HRTICK is enabled, it is
possible to trigger system deadlocks. These were hard to
reproduce, as HRTICK has been broken in the past, but seemed
to be connected to the timekeeping_seq lock.

Since seqlock/seqcount's aren't supported w/ lockdep, I added
some extra spinlock based locking and triggered the following
lockdep output:

[   15.849182] ntpd/4062 is trying to acquire lock:
[   15.849765]  (&amp;(&amp;pool-&gt;lock)-&gt;rlock){..-...}, at: [&lt;ffffffff810aa9b5&gt;] __queue_work+0x145/0x480
[   15.850051]
[   15.850051] but task is already holding lock:
[   15.850051]  (timekeeper_lock){-.-.-.}, at: [&lt;ffffffff810df6df&gt;] do_adjtimex+0x7f/0x100

&lt;snip&gt;

[   15.850051] Chain exists of: &amp;(&amp;pool-&gt;lock)-&gt;rlock --&gt; &amp;p-&gt;pi_lock --&gt; timekeeper_lock
[   15.850051]  Possible unsafe locking scenario:
[   15.850051]
[   15.850051]        CPU0                    CPU1
[   15.850051]        ----                    ----
[   15.850051]   lock(timekeeper_lock);
[   15.850051]                                lock(&amp;p-&gt;pi_lock);
[   15.850051] lock(timekeeper_lock);
[   15.850051] lock(&amp;(&amp;pool-&gt;lock)-&gt;rlock);
[   15.850051]
[   15.850051]  *** DEADLOCK ***

The deadlock was introduced by 06c017fdd4dc48451a ("timekeeping:
Hold timekeepering locks in do_adjtimex and hardpps") in 3.10

This patch avoids this deadlock, by moving the call to
schedule_delayed_work() outside of the timekeeper lock
critical section.

Reported-by: Gerlando Falauto &lt;gerlando.falauto@keymile.com&gt;
Tested-by: Lin Ming &lt;minggr@gmail.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7bd36014460f793c19e7d6c94dab67b0afcfcb7f upstream.

Gerlando Falauto reported that when HRTICK is enabled, it is
possible to trigger system deadlocks. These were hard to
reproduce, as HRTICK has been broken in the past, but seemed
to be connected to the timekeeping_seq lock.

Since seqlock/seqcount's aren't supported w/ lockdep, I added
some extra spinlock based locking and triggered the following
lockdep output:

[   15.849182] ntpd/4062 is trying to acquire lock:
[   15.849765]  (&amp;(&amp;pool-&gt;lock)-&gt;rlock){..-...}, at: [&lt;ffffffff810aa9b5&gt;] __queue_work+0x145/0x480
[   15.850051]
[   15.850051] but task is already holding lock:
[   15.850051]  (timekeeper_lock){-.-.-.}, at: [&lt;ffffffff810df6df&gt;] do_adjtimex+0x7f/0x100

&lt;snip&gt;

[   15.850051] Chain exists of: &amp;(&amp;pool-&gt;lock)-&gt;rlock --&gt; &amp;p-&gt;pi_lock --&gt; timekeeper_lock
[   15.850051]  Possible unsafe locking scenario:
[   15.850051]
[   15.850051]        CPU0                    CPU1
[   15.850051]        ----                    ----
[   15.850051]   lock(timekeeper_lock);
[   15.850051]                                lock(&amp;p-&gt;pi_lock);
[   15.850051] lock(timekeeper_lock);
[   15.850051] lock(&amp;(&amp;pool-&gt;lock)-&gt;rlock);
[   15.850051]
[   15.850051]  *** DEADLOCK ***

The deadlock was introduced by 06c017fdd4dc48451a ("timekeeping:
Hold timekeepering locks in do_adjtimex and hardpps") in 3.10

This patch avoids this deadlock, by moving the call to
schedule_delayed_work() outside of the timekeeper lock
critical section.

Reported-by: Gerlando Falauto &lt;gerlando.falauto@keymile.com&gt;
Tested-by: Lin Ming &lt;minggr@gmail.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Correct run-time detection of persistent_clock.</title>
<updated>2013-05-28T20:45:19+00:00</updated>
<author>
<name>Zoran Markovic</name>
<email>zoran.markovic@linaro.org</email>
</author>
<published>2013-05-17T18:24:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0d6bd9953f739dad96d9a0de65383e479ab4e10d'/>
<id>0d6bd9953f739dad96d9a0de65383e479ab4e10d</id>
<content type='text'>
Since commit 31ade30692dc9680bfc95700d794818fa3f754ac, timekeeping_init()
checks for presence of persistent clock by attempting to read a non-zero
time value. This is an issue on platforms where persistent_clock (instead
is implemented as a free-running counter (instead of an RTC) starting
from zero on each boot and running during suspend. Examples are some ARM
platforms (e.g. PandaBoard).

An attempt to read such a clock during timekeeping_init() may return zero
value and falsely declare persistent clock as missing. Additionally, in
the above case suspend times may be accounted twice (once from
timekeeping_resume() and once from rtc_resume()), resulting in a gradual
drift of system time.

This patch does a run-time correction of the issue by doing the same check
during timekeeping_suspend().

A better long-term solution would have to return error when trying to read
non-existing clock and zero when trying to read an uninitialized clock, but
that would require changing all persistent_clock implementations.

This patch addresses the immediate breakage, for now.

Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Zoran Markovic &lt;zoran.markovic@linaro.org&gt;
[jstultz: Tweaked commit message and subject]
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit 31ade30692dc9680bfc95700d794818fa3f754ac, timekeeping_init()
checks for presence of persistent clock by attempting to read a non-zero
time value. This is an issue on platforms where persistent_clock (instead
is implemented as a free-running counter (instead of an RTC) starting
from zero on each boot and running during suspend. Examples are some ARM
platforms (e.g. PandaBoard).

An attempt to read such a clock during timekeeping_init() may return zero
value and falsely declare persistent clock as missing. Additionally, in
the above case suspend times may be accounted twice (once from
timekeeping_resume() and once from rtc_resume()), resulting in a gradual
drift of system time.

This patch does a run-time correction of the issue by doing the same check
during timekeeping_suspend().

A better long-term solution would have to return error when trying to read
non-existing clock and zero when trying to read an uninitialized clock, but
that would require changing all persistent_clock implementations.

This patch addresses the immediate breakage, for now.

Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Zoran Markovic &lt;zoran.markovic@linaro.org&gt;
[jstultz: Tweaked commit message and subject]
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Update tk-&gt;cycle_last in resume</title>
<updated>2013-04-22T18:17:51+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2013-04-22T07:37:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=77c675ba18836802f6b73d2d773481d06ebc0f04'/>
<id>77c675ba18836802f6b73d2d773481d06ebc0f04</id>
<content type='text'>
commit 7ec98e15aa (timekeeping: Delay update of clock-&gt;cycle_last)
forgot to update tk-&gt;cycle_last in the resume path. This results in a
stale value versus clock-&gt;cycle_last and prevents resume in the worst
case.

Reported-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Reported-and-tested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Linux-pm mailing list &lt;linux-pm@lists.linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304211648150.21884@ionos
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7ec98e15aa (timekeeping: Delay update of clock-&gt;cycle_last)
forgot to update tk-&gt;cycle_last in the resume path. This results in a
stale value versus clock-&gt;cycle_last and prevents resume in the worst
case.

Reported-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Reported-and-tested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Linux-pm mailing list &lt;linux-pm@lists.linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304211648150.21884@ionos
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Make sure to notify hrtimers when TAI offset changes</title>
<updated>2013-04-11T08:19:44+00:00</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-04-10T19:41:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4e8f8b34b92b6514cc070aeb94d317cadd5071d7'/>
<id>4e8f8b34b92b6514cc070aeb94d317cadd5071d7</id>
<content type='text'>
Now that we have CLOCK_TAI timers, make sure we notify hrtimer
code when TAI offset is changed.

Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Link: http://lkml.kernel.org/r/1365622909-953-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that we have CLOCK_TAI timers, make sure we notify hrtimer
code when TAI offset is changed.

Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Link: http://lkml.kernel.org/r/1365622909-953-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Shorten seq_count region</title>
<updated>2013-04-04T20:18:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2013-02-21T22:51:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca4523cda429712fc135c5db50920d90eb776a6c'/>
<id>ca4523cda429712fc135c5db50920d90eb776a6c</id>
<content type='text'>
Shorten the seqcount write hold region to the actual update of the
timekeeper and the related data (e.g vsyscall).

On a contemporary x86 system this reduces the maximum latencies on
Preempt-RT from 8us to 4us on the non-timekeeping cores.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Shorten the seqcount write hold region to the actual update of the
timekeeper and the related data (e.g vsyscall).

On a contemporary x86 system this reduces the maximum latencies on
Preempt-RT from 8us to 4us on the non-timekeeping cores.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
