sched: Cleanup bandwidth timers

Roman reported a 3 cpu lockup scenario involving __start_cfs_bandwidth(). The more I look at that code the more I'm convinced its crack, that entire __start_cfs_bandwidth() thing is brain melting, we don't need to cancel a timer before starting it, *hrtimer_start*() will happily remove the timer for you if its still enqueued. Removing that, removes a big part of the problem, no more ugly cancel loop to get stuck in. So now, if I understand things right, the entire reason you have this cfs_b->lock guarded ->timer_active nonsense is to make sure we don't accidentally lose the timer. It appears to me that it should be possible to guarantee that same by unconditionally (re)starting the timer when !queued. Because regardless what hrtimer::function will return, if we beat it to (re)enqueue the timer, it doesn't matter. Now, because hrtimers don't come with any serialization guarantees we must ensure both handler and (re)start loop serialize their access to the hrtimer to avoid both trying to forward the timer at the same time. Update the rt bandwidth timer to match. This effectively reverts: 09dc4ab03936 ("sched/fair: Fix tg_set_cfs_bandwidth() deadlock on rq->lock"). Reported-by: Roman Gushchin <klamm@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20150415095011.804589208@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
author: Peter Zijlstra <peterz@infradead.org> 2015-04-15 11:41:57 +0200
committer: Thomas Gleixner <tglx@linutronix.de> 2015-04-22 17:06:53 +0200
commit: 77a4d1a1b9a122ca1fa3507bd30aec1520d7a8a4 (patch)
tree: d4912f0a6379207bf160b5666d5ecc20d70412ab /kernel/sched/rt.c
parent: 5de2755c8c8b3a6b8414870e2c284914a2b42e4d (diff)
1 files changed, 6 insertions, 8 deletions
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 575da76a3874..b0febf25d8f1 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -18,19 +18,20 @@ static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
 {
 	struct rt_bandwidth *rt_b =
 		container_of(timer, struct rt_bandwidth, rt_period_timer);
-	ktime_t now;
-	int overrun;
 	int idle = 0;
+	int overrun;
 
+	raw_spin_lock(&rt_b->rt_runtime_lock);
 	for (;;) {
-		now = hrtimer_cb_get_time(timer);
-		overrun = hrtimer_forward(timer, now, rt_b->rt_period);
-
+		overrun = hrtimer_forward_now(timer, rt_b->rt_period);
 		if (!overrun)
 			break;
 
+		raw_spin_unlock(&rt_b->rt_runtime_lock);
 		idle = do_sched_rt_period_timer(rt_b, overrun);
+		raw_spin_lock(&rt_b->rt_runtime_lock);
 	}
+	raw_spin_unlock(&rt_b->rt_runtime_lock);
 
 	return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
 }
@@ -52,9 +53,6 @@ static void start_rt_bandwidth(struct rt_bandwidth *rt_b)
 	if (!rt_bandwidth_enabled() || rt_b->rt_runtime == RUNTIME_INF)
 		return;
 
-	if (hrtimer_active(&rt_b->rt_period_timer))
-		return;
-
 	raw_spin_lock(&rt_b->rt_runtime_lock);
 	start_bandwidth_timer(&rt_b->rt_period_timer, rt_b->rt_period);
 	raw_spin_unlock(&rt_b->rt_runtime_lock);
author	Peter Zijlstra <peterz@infradead.org>	2015-04-15 11:41:57 +0200
committer	Thomas Gleixner <tglx@linutronix.de>	2015-04-22 17:06:53 +0200
commit	77a4d1a1b9a122ca1fa3507bd30aec1520d7a8a4 (patch)
tree	d4912f0a6379207bf160b5666d5ecc20d70412ab /kernel/sched/rt.c
parent	5de2755c8c8b3a6b8414870e2c284914a2b42e4d (diff)