diff options
author | John Stultz <john.stultz@linaro.org> | 2012-07-17 03:05:14 -0400 |
---|---|---|
committer | Ben Hutchings <ben@decadent.org.uk> | 2012-07-25 04:11:32 +0100 |
commit | a57ccabee60519dd90051266c00d038055b93878 (patch) | |
tree | b773b7d58cab58a53a6edd143b4aee976e018268 /include/linux | |
parent | 9f1e3e0f9fae973747e113848f9a9d0a2e1867f9 (diff) |
ntp: Fix leap-second hrtimer livelock
This is a backport of 6b43ae8a619d17c4935c3320d2ef9e92bdeed05d
This should have been backported when it was commited, but I
mistook the problem as requiring the ntp_lock changes
that landed in 3.4 in order for it to occur.
Unfortunately the same issue can happen (with only one cpu)
as follows:
do_adjtimex()
write_seqlock_irq(&xtime_lock);
process_adjtimex_modes()
process_adj_status()
ntp_start_leap_timer()
hrtimer_start()
hrtimer_reprogram()
tick_program_event()
clockevents_program_event()
ktime_get()
seq = req_seqbegin(xtime_lock); [DEADLOCK]
This deadlock will no always occur, as it requires the
leap_timer to force a hrtimer_reprogram which only happens
if its set and there's no sooner timer to expire.
NOTE: This patch, being faithful to the original commit,
introduces a bug (we don't update wall_to_monotonic),
which will be resovled by backporting a following fix.
Original commit message below:
Since commit 7dffa3c673fbcf835cd7be80bb4aec8ad3f51168 the ntp
subsystem has used an hrtimer for triggering the leapsecond
adjustment. However, this can cause a potential livelock.
Thomas diagnosed this as the following pattern:
CPU 0 CPU 1
do_adjtimex()
spin_lock_irq(&ntp_lock);
process_adjtimex_modes(); timer_interrupt()
process_adj_status(); do_timer()
ntp_start_leap_timer(); write_lock(&xtime_lock);
hrtimer_start(); update_wall_time();
hrtimer_reprogram(); ntp_tick_length()
tick_program_event() spin_lock(&ntp_lock);
clockevents_program_event()
ktime_get()
seq = req_seqbegin(xtime_lock);
This patch tries to avoid the problem by reverting back to not using
an hrtimer to inject leapseconds, and instead we handle the leapsecond
processing in the second_overflow() function.
The downside to this change is that on systems that support highres
timers, the leap second processing will occur on a HZ tick boundary,
(ie: ~1-10ms, depending on HZ) after the leap second instead of
possibly sooner (~34us in my tests w/ x86_64 lapic).
This patch applies on top of tip/timers/core.
CC: Sasha Levin <levinsasha928@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Diagnoised-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Diffstat (limited to 'include/linux')
-rw-r--r-- | include/linux/timex.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/include/linux/timex.h b/include/linux/timex.h index aa60fe7b6ed6..08e90fb81acc 100644 --- a/include/linux/timex.h +++ b/include/linux/timex.h @@ -266,7 +266,7 @@ static inline int ntp_synced(void) /* Returns how long ticks are at present, in ns / 2^NTP_SCALE_SHIFT. */ extern u64 tick_length; -extern void second_overflow(void); +extern int second_overflow(unsigned long secs); extern void update_ntp_one_tick(void); extern int do_adjtimex(struct timex *); extern void hardpps(const struct timespec *, const struct timespec *); |