From 1717f2096b543cede7a380c858c765c41936bc35 Mon Sep 17 00:00:00 2001 From: Hidehiro Kawai Date: Mon, 14 Dec 2015 11:19:09 +0100 Subject: panic, x86: Fix re-entrance problem due to panic on NMI If panic on NMI happens just after panic() on the same CPU, panic() is recursively called. Kernel stalls, as a result, after failing to acquire panic_lock. To avoid this problem, don't call panic() in NMI context if we've already entered panic(). For that, introduce nmi_panic() macro to reduce code duplication. In the case of panic on NMI, don't return from NMI handlers if another CPU already panicked. Signed-off-by: Hidehiro Kawai Acked-by: Michal Hocko Cc: Aaron Tomlin Cc: Andrew Morton Cc: Andy Lutomirski Cc: Baoquan He Cc: Chris Metcalf Cc: David Hildenbrand Cc: Don Zickus Cc: "Eric W. Biederman" Cc: Frederic Weisbecker Cc: Gobinda Charan Maji Cc: HATAYAMA Daisuke Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Javi Merino Cc: Jonathan Corbet Cc: kexec@lists.infradead.org Cc: linux-doc@vger.kernel.org Cc: lkml Cc: Masami Hiramatsu Cc: Michal Nazarewicz Cc: Nicolas Iooss Cc: Peter Zijlstra Cc: Prarit Bhargava Cc: Rasmus Villemoes Cc: Rusty Russell Cc: Seth Jennings Cc: Steven Rostedt Cc: Thomas Gleixner Cc: Ulrich Obergfell Cc: Vitaly Kuznetsov Cc: Vivek Goyal Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs [ Cleanup comments, fixup formatting. ] Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner --- kernel/panic.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 4b150bc0c6c1..3344524cf6ff 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -61,6 +61,8 @@ void __weak panic_smp_self_stop(void) cpu_relax(); } +atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID); + /** * panic - halt the system * @fmt: The text string to print @@ -71,17 +73,17 @@ void __weak panic_smp_self_stop(void) */ void panic(const char *fmt, ...) { - static DEFINE_SPINLOCK(panic_lock); static char buf[1024]; va_list args; long i, i_next = 0; int state = 0; + int old_cpu, this_cpu; /* * Disable local interrupts. This will prevent panic_smp_self_stop * from deadlocking the first cpu that invokes the panic, since * there is nothing to prevent an interrupt handler (that runs - * after the panic_lock is acquired) from invoking panic again. + * after setting panic_cpu) from invoking panic() again. */ local_irq_disable(); @@ -94,8 +96,16 @@ void panic(const char *fmt, ...) * multiple parallel invocations of panic, all other CPUs either * stop themself or will wait until they are stopped by the 1st CPU * with smp_send_stop(). + * + * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which + * comes here, so go ahead. + * `old_cpu == this_cpu' means we came from nmi_panic() which sets + * panic_cpu to this CPU. In this case, this is also the 1st CPU. */ - if (!spin_trylock(&panic_lock)) + this_cpu = raw_smp_processor_id(); + old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu); + + if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) panic_smp_self_stop(); console_verbose(); -- cgit v1.2.3 From 58c5661f2144c089bbc2e5d87c9ec1dc1d2964fe Mon Sep 17 00:00:00 2001 From: Hidehiro Kawai Date: Mon, 14 Dec 2015 11:19:10 +0100 Subject: panic, x86: Allow CPUs to save registers even if looping in NMI context Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(), sends an NMI IPI to CPUs which haven't called panic() to stop them, save their register information and do some cleanups for crash dumping. However, if such a CPU is infinitely looping in NMI context, we fail to save its register information into the crash dump. For example, this can happen when unknown NMIs are broadcast to all CPUs as follows: CPU 0 CPU 1 =========================== ========================== receive an unknown NMI unknown_nmi_error() panic() receive an unknown NMI spin_trylock(&panic_lock) unknown_nmi_error() crash_kexec() panic() spin_trylock(&panic_lock) panic_smp_self_stop() infinite loop kdump_nmi_shootdown_cpus() issue NMI IPI -----------> blocked until IRET infinite loop... Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET, so the NMI is not handled and the callback function to save registers is never called. In practice, this can happen on some servers which broadcast NMIs to all CPUs when the NMI button is pushed. To save registers in this case, we need to: a) Return from NMI handler instead of looping infinitely or b) Call the callback function directly from the infinite loop Inherently, a) is risky because NMI is also used to prevent corrupted data from being propagated to devices. So, we chose b). This patch does the following: 1. Move the infinite looping of CPUs which haven't called panic() in NMI context (actually done by panic_smp_self_stop()) outside of panic() to enable us to refer pt_regs. Please note that panic_smp_self_stop() is still used for normal context. 2. Call a callback of kdump_nmi_shootdown_cpus() directly to save registers and do some cleanups after setting waiting_for_crash_ipi which is used for counting down the number of CPUs which handled the callback Signed-off-by: Hidehiro Kawai Acked-by: Michal Hocko Cc: Aaron Tomlin Cc: Andrew Morton Cc: Andy Lutomirski Cc: Baoquan He Cc: Chris Metcalf Cc: Dave Young Cc: David Hildenbrand Cc: Don Zickus Cc: Eric Biederman Cc: Frederic Weisbecker Cc: Gobinda Charan Maji Cc: HATAYAMA Daisuke Cc: Hidehiro Kawai Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Javi Merino Cc: Jiang Liu Cc: Jonathan Corbet Cc: kexec@lists.infradead.org Cc: linux-doc@vger.kernel.org Cc: lkml Cc: Masami Hiramatsu Cc: Michal Nazarewicz Cc: Nicolas Iooss Cc: Oleg Nesterov Cc: Peter Zijlstra Cc: Prarit Bhargava Cc: Rasmus Villemoes Cc: Seth Jennings Cc: Stefan Lippers-Hollmann Cc: Steven Rostedt Cc: Thomas Gleixner Cc: Ulrich Obergfell Cc: Vitaly Kuznetsov Cc: Vivek Goyal Cc: Yasuaki Ishimatsu Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs [ Cleanup comments, fixup formatting. ] Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner --- kernel/panic.c | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 3344524cf6ff..06f31b49b3b4 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -61,6 +61,15 @@ void __weak panic_smp_self_stop(void) cpu_relax(); } +/* + * Stop ourselves in NMI context if another CPU has already panicked. Arch code + * may override this to prepare for crash dumping, e.g. save regs info. + */ +void __weak nmi_panic_self_stop(struct pt_regs *regs) +{ + panic_smp_self_stop(); +} + atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID); /** -- cgit v1.2.3 From 7bbee5ca3896f69f09c68be549cb8997abe6bca6 Mon Sep 17 00:00:00 2001 From: Hidehiro Kawai Date: Mon, 14 Dec 2015 11:19:11 +0100 Subject: kexec: Fix race between panic() and crash_kexec() Currently, panic() and crash_kexec() can be called at the same time. For example (x86 case): CPU 0: oops_end() crash_kexec() mutex_trylock() // acquired nmi_shootdown_cpus() // stop other CPUs CPU 1: panic() crash_kexec() mutex_trylock() // failed to acquire smp_send_stop() // stop other CPUs infinite loop If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump fails. In another case: CPU 0: oops_end() crash_kexec() mutex_trylock() // acquired io_check_error() panic() crash_kexec() mutex_trylock() // failed to acquire infinite loop Clearly, this is an undesirable result. To fix this problem, this patch changes crash_kexec() to exclude others by using the panic_cpu atomic. Signed-off-by: Hidehiro Kawai Acked-by: Michal Hocko Cc: Andrew Morton Cc: Baoquan He Cc: Dave Young Cc: "Eric W. Biederman" Cc: HATAYAMA Daisuke Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jonathan Corbet Cc: kexec@lists.infradead.org Cc: linux-doc@vger.kernel.org Cc: Martin Schwidefsky Cc: Masami Hiramatsu Cc: Minfei Huang Cc: Peter Zijlstra Cc: Seth Jennings Cc: Steven Rostedt Cc: Thomas Gleixner Cc: Vitaly Kuznetsov Cc: Vivek Goyal Cc: x86-ml Link: http://lkml.kernel.org/r/20151210014630.25437.94161.stgit@softrs Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner --- kernel/panic.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 06f31b49b3b4..b333380c6bb2 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -136,9 +136,11 @@ void panic(const char *fmt, ...) * everything else. * If we want to run this after calling panic_notifiers, pass * the "crash_kexec_post_notifiers" option to the kernel. + * + * Bypass the panic_cpu check and call __crash_kexec directly. */ if (!crash_kexec_post_notifiers) - crash_kexec(NULL); + __crash_kexec(NULL); /* * Note smp_send_stop is the usual smp shutdown function, which @@ -161,9 +163,11 @@ void panic(const char *fmt, ...) * panic_notifiers and dumping kmsg before kdump. * Note: since some panic_notifiers can make crashed kernel * more unstable, it can increase risks of the kdump failure too. + * + * Bypass the panic_cpu check and call __crash_kexec directly. */ if (crash_kexec_post_notifiers) - crash_kexec(NULL); + __crash_kexec(NULL); bust_spinlocks(0); -- cgit v1.2.3