summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2012-12-10x86, fpu: Avoid FPU lazy restore after suspendVincent Palatin
commit 644c154186386bb1fa6446bc5e037b9ed098db46 upstream. When a cpu enters S3 state, the FPU state is lost. After resuming for S3, if we try to lazy restore the FPU for a process running on the same CPU, this will result in a corrupted FPU context. Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU, so nobody will try to lazy restore a state which doesn't exist in the hardware. Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off, by doing thousands of suspend/resume cycles with 4 processes doing FPU operations running. Without the patch, a process is killed after a few hundreds cycles by a SIGFPE. Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Cc: Duncan Laurie <dlaurie@chromium.org> Cc: Olof Johansson <olofj@chromium.org> Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-05x86-32: Export kernel_stack_pointer() for modulesH. Peter Anvin
commit cb57a2b4cff7edf2a4e32c0163200e9434807e0a upstream. Modules, in particular oprofile (and possibly other similar tools) need kernel_stack_pointer(), so export it using EXPORT_SYMBOL_GPL(). Cc: Yang Wei <wei.yang@windriver.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Jun Zhang <jun.zhang@intel.com> Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Robert Richter <rric@kernel.org> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Philip Müller <philm@manjaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-03x86, microcode, AMD: Add support for family 16h processorsBoris Ostrovsky
commit 36c46ca4f322a7bf89aad5462a3a1f61713edce7 upstream. Add valid patch size for family 16h processors. [ hpa: promoting to urgent/stable since it is hw enabling and trivial ] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com> Acked-by: Andreas Herrmann <herrmann.der.user@googlemail.com> Link: http://lkml.kernel.org/r/1353004910-2204-1-git-send-email-boris.ostrovsky@amd.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-03x86-32: Fix invalid stack address while in softirqRobert Richter
commit 1022623842cb72ee4d0dbf02f6937f38c92c3f41 upstream. In 32 bit the stack address provided by kernel_stack_pointer() may point to an invalid range causing NULL pointer access or page faults while in NMI (see trace below). This happens if called in softirq context and if the stack is empty. The address at &regs->sp is then out of range. Fixing this by checking if regs and &regs->sp are in the same stack context. Otherwise return the previous stack pointer stored in struct thread_info. If that address is invalid too, return address of regs. BUG: unable to handle kernel NULL pointer dereference at 0000000a IP: [<c1004237>] print_context_stack+0x6e/0x8d *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0 EIP is at print_context_stack+0x6e/0x8d EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000) Stack: 000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000 f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c Call Trace: [<c1003723>] dump_trace+0x7b/0xa1 [<c12dce1c>] x86_backtrace+0x40/0x88 [<c12db712>] ? oprofile_add_sample+0x56/0x84 [<c12db731>] oprofile_add_sample+0x75/0x84 [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260 [<c12dd40d>] profile_exceptions_notify+0x23/0x4c [<c1395034>] nmi_handle+0x31/0x4a [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45 [<c13950ed>] do_nmi+0xa0/0x2ff [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45 [<c13949e5>] nmi_stack_correct+0x28/0x2d [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45 [<c1003603>] ? do_softirq+0x4b/0x7f <IRQ> [<c102a06f>] irq_exit+0x35/0x5b [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a [<c1394746>] apic_timer_interrupt+0x2a/0x30 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0 CR2: 000000000000000a ---[ end trace 62afee3481b00012 ]--- Kernel panic - not syncing: Fatal exception in interrupt V2: * add comments to kernel_stack_pointer() * always return a valid stack address by falling back to the address of regs Reported-by: Yang Wei <wei.yang@windriver.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jun Zhang <jun.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-31x86, mm: Use memblock memory loop instead of e820_RAMYinghai Lu
commit 1f2ff682ac951ed82cc043cf140d2851084512df upstream. We need to handle E820_RAM and E820_RESERVED_KERNEL at the same time. Also memblock has page aligned range for ram, so we could avoid mapping partial pages. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com Acked-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-31x86: efi: Turn off efi_enabled after setup on mixed fw/kernelOlof Johansson
commit 5189c2a7c7769ee9d037d76c1a7b8550ccf3481c upstream. When 32-bit EFI is used with 64-bit kernel (or vice versa), turn off efi_enabled once setup is done. Beyond setup, it is normally used to determine if runtime services are available and we will have none. This will resolve issues stemming from efivars modprobe panicking on a 32/64-bit setup, as well as some reboot issues on similar setups. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=45991 Reported-by: Marko Kohtala <marko.kohtala@gmail.com> Reported-by: Maxim Kammerer <mk@dee.su> Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-31x86, mm: Trim memory in memblock to be page alignedYinghai Lu
commit 6ede1fd3cb404c0016de6ac529df46d561bd558b upstream. We will not map partial pages, so need to make sure memblock allocation will not allocate those bytes out. Also we will use for_each_mem_pfn_range() to loop to map memory range to keep them consistent. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com Acked-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-28xen/x86: don't corrupt %eip when returning from a signal handlerDavid Vrabel
commit a349e23d1cf746f8bdc603dcc61fae9ee4a695f6 upstream. In 32 bit guests, if a userspace process has %eax == -ERESTARTSYS (-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event /and/ the process has a pending signal then %eip (and %eax) are corrupted when returning to the main process after handling the signal. The application may then crash with SIGSEGV or a SIGILL or it may have subtly incorrect behaviour (depending on what instruction it returned to). The occurs because handle_signal() is incorrectly thinking that there is a system call that needs to restarted so it adjusts %eip and %eax to re-execute the system call instruction (even though user space had not done a system call). If %eax == -514 (-ERESTARTNOHAND (-514) or -ERESTART_RESTARTBLOCK (-516) then handle_signal() only corrupted %eax (by setting it to -EINTR). This may cause the application to crash or have incorrect behaviour. handle_signal() assumes that regs->orig_ax >= 0 means a system call so any kernel entry point that is not for a system call must push a negative value for orig_ax. For example, for physical interrupts on bare metal the inverse of the vector is pushed and page_fault() sets regs->orig_ax to -1, overwriting the hardware provided error code. xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax instead of -1. Classic Xen kernels pushed %eax which works as %eax cannot be both non-negative and -RESTARTSYS (etc.), but using -1 is consistent with other non-system call entry points and avoids some of the tests in handle_signal(). There were similar bugs in xen_failsafe_callback() of both 32 and 64-bit guests. If the fault was corrected and the normal return path was used then 0 was incorrectly pushed as the value for orig_ax. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-28x86: Exclude E820_RESERVED regions and memory holes above 4 GB from direct ↵Jacob Shin
mapping. commit 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a upstream. On systems with very large memory (1 TB in our case), BIOS may report a reserved region or a hole in the E820 map, even above the 4 GB range. Exclude these from the direct mapping. [ hpa: this should be done not just for > 4 GB but for everything above the legacy region (1 MB), at the very least. That, however, turns out to require significant restructuring. That work is well underway, but is not suitable for rc/stable. ] Signed-off-by: Jacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/1319145326-13902-1-git-send-email-jacob.shin@amd.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-07x86/alternatives: Fix p6 nops on non-modular kernelsAvi Kivity
commit cb09cad44f07044d9810f18f6f9a6a6f3771f979 upstream. Probably a leftover from the early days of self-patching, p6nops are marked __initconst_or_module, which causes them to be discarded in a non-modular kernel. If something later triggers patching, it will overwrite kernel code with garbage. Reported-by: Tomas Racek <tracek@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Cc: Michael Tokarev <mjt@tls.msk.ru> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: qemu-devel@nongnu.org Cc: Anthony Liguori <anthony@codemonkey.ws> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/5034AE84.90708@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ben Jencks <ben@bjencks.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14x86, microcode, AMD: Fix broken ucode patch size checkAndreas Herrmann
commit 36bf50d7697be18c6bfd0401e037df10bff1e573 upstream. This issue was recently observed on an AMD C-50 CPU where a patch of maximum size was applied. Commit be62adb49294 ("x86, microcode, AMD: Simplify ucode verification") added current_size in get_matching_microcode(). This is calculated as size of the ucode patch + 8 (ie. size of the header). Later this is compared against the maximum possible ucode patch size for a CPU family. And of course this fails if the patch has already maximum size. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344361461-10076-1-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, microcode: Sanitize per-cpu microcode reloading interfaceBorislav Petkov
commit c9fc3f778a6a215ace14ee556067c73982b6d40f upstream. Microcode reloading in a per-core manner is a very bad idea for both major x86 vendors. And the thing is, we have such interface with which we can end up with different microcode versions applied on different cores of an otherwise homogeneous wrt (family,model,stepping) system. So turn off the possibility of doing that per core and allow it only system-wide. This is a minimal fix which we'd like to see in stable too thus the more-or-less arbitrary decision to allow system-wide reloading only on the BSP: $ echo 1 > /sys/devices/system/cpu/cpu0/microcode/reload ... and disable the interface on the other cores: $ echo 1 > /sys/devices/system/cpu/cpu23/microcode/reload -bash: echo: write error: Invalid argument Also, allowing the reload only from one CPU (the BSP in that case) doesn't allow the reload procedure to degenerate into an O(n^2) deal when triggering reloads from all /sys/devices/system/cpu/cpuX/microcode/reload sysfs nodes simultaneously. A more generic fix will follow. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1340280437-7718-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, microcode: microcode_core.c simple_strtoul cleanupShuah Khan
commit e826abd523913f63eb03b59746ffb16153c53dc4 upstream. Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, nops: Missing break resulting in incorrect selection on IntelAlan Cox
commit d6250a3f12edb3a86db9598ffeca3de8b4a219e9 upstream. The Intel case falls through into the generic case which then changes the values. For cases like the P6 it doesn't do the right thing so this seems to be a screwup. Signed-off-by: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/n/tip-lww2uirad4skzjlmrm0vru8o@git.kernel.org Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faultsTony Luck
commit 6751ed65dc6642af64f7b8a440a75563c8aab7ae upstream. In commit dad1743e5993f1 ("x86/mce: Only restart instruction after machine check recovery if it is safe") we fixed mce_notify_process() to force a signal to the current process if it was not restartable (RIPV bit not set in MCG_STATUS). But doing it here means that the process doesn't get told the virtual address of the fault via siginfo_t->si_addr. This would prevent application level recovery from the fault. Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so that we will provide the right information with the signal. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERMH. Peter Anvin
commit 4ad33411308596f2f918603509729922a1ec4411 upstream. It makes sense to label "Digital Thermal Sensor" as "DTS", but unfortunately the string "dts" was already used for "Debug Store", and /proc/cpuinfo is a user space ABI. Therefore, rename this to "dtherm". This conflict went into mainline via the hwmon tree without any x86 maintainer ack, and without any kind of hint in the subject. a4659053 x86/hwmon: fix initialization of coretemp Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16ACPI, x86: fix Dell M6600 ACPI reboot regression via DMIZhang Rui
commit 76eb9a30db4bc8fd172f9155247264b5f2686d7b upstream. Dell Precision M6600 is known to require PCI reboot, so add it to the reboot blacklist in pci_reboot_dmi_table[]. https://bugzilla.kernel.org/show_bug.cgi?id=42749 cc: x86@kernel.org Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overridingFeng Tang
commit f6b54f083cc66cf9b11d2120d8df3c2ad4e0836d upstream. This is the 2nd part of fix for kernel bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 The root cause is the buggy FW, whose ACPI tables assign the GSI 16 to 2 irqs 0 and 16(VGA), and the VGA is the right owner of GSI 16. So add a quirk to ignore the irq0 overriding GSI 16 for the FUJITSU SIEMENS AMILO PRO V2030 platform will solve this issue. Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16ACPI: Remove one board specific WARN when ignoring timer overridingFeng Tang
commit 7f68b4c2e158019c2ec494b5cfbd9c83b4e5b253 upstream. Current WARN msg is only for the ati_ixp4x0 board, while this function is used by mulitple platforms. So this one board specific warning is not appropriate any more. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16ACPI: Make acpi_skip_timer_override cover all source_irq==0 casesFeng Tang
commit ae10ccdc3093486f8c2369d227583f9d79f628e5 upstream. Currently when acpi_skip_timer_override is set, it only cover the (source_irq == 0 && global_irq == 2) cases. While there is also platform which need use this option and its global_irq is not 2. This patch will extend acpi_skip_timer_override to cover all timer overriding cases as long as the source irq is 0. This is the first part of a fix to kernel bug bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-17x86, MCE, AMD: Make APIC LVT thresholding interrupt optionalBorislav Petkov
commit f227d4306cf30e1d5b6f231e8ef9006c34f3d186 upstream. Currently, the APIC LVT interrupt for error thresholding is implicitly enabled. However, there are models in the F15h range which do not enable it. Make the code machinery which sets up the APIC interrupt support an optional setting and add an ->interrupt_capable member to the bank representation mirroring that capability and enable the interrupt offset programming only if it is true. Simplify code and fixup comment style while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-10x86: Reset the debug_stack update counterSteven Rostedt
commit c0525a6972d3f1fb83058ef503e183475d6e4e26 upstream. When an NMI goes off and it sees that it preempted the debug stack, to keep the debug stack safe, it changes the IDT to point to one that does not modify the stack on breakpoint (to allow breakpoints in NMIs). But the variable that gets set to know to undo it on exit never gets cleared on exit. Thus every NMI will reset it on exit the first time it is done even if it does not need to be reset. [ Added H. Peter Anvin's suggestion to use this_cpu_read/write ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-10x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32H.J. Lu
commit bad1a753d4d4deb09d4bc0bac1dd4fc3298502e9 upstream. When I added x32 ptrace to 3.4 kernel, I also include PTRACE_ARCH_PRCTL support for x32 GDB For ARCH_GET_FS/GS, it takes a pointer to int64. But at user level, ARCH_GET_FS/GS takes a pointer to int32. So I have to add x32 ptrace to glibc to handle it with a temporary int64 passed to kernel and copy it back to GDB as int32. Roland suggested that PTRACE_ARCH_PRCTL is obsolete and x32 GDB should use fs_base and gs_base fields of user_regs_struct instead. Accordingly, remove PTRACE_ARCH_PRCTL completely from the x32 code to avoid possible memory overrun when pointer to int32 is passed to kernel. Link: http://lkml.kernel.org/r/CAMe9rOpDzHfS7NH7m1vmD9QRw8SSj4Sc%2BaNOgcWm_WJME2eRsQ@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-01MCE: Fix vm86 handling for 32bit mce handlerAndi Kleen
commit a129a7c84582629741e5fa6f40026efcd7a65bd4 upstream. When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-01x86/mce: Fix check for processor context when machine check was taken.Tony Luck
commit 875e26648cf9b6db9d8dc07b7959d7c61fb3f49c upstream. Linus pointed out that there was no value is checking whether m->ip was zero - because zero is a legimate value. If we have a reliable (or faked in the VM86 case) "m->cs" we can use it to tell whether we were in user mode or kernelwhen the machine check hit. Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-01perf/x86: Update event scheduling constraints for AMD family 15h modelsRobert Richter
commit 5bcdf5e4fee3c45e1281c25e4941f2163cb28c65 upstream. This update is for newer family 15h cpu models from 0x02 to 0x1f. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1337337642-1621-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-18Merge tag 'linus-mce-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull a machine check recovery fix from Tony Luck. I really don't like how the MCE code does some of the things it does, but this does seem to be an improvement. * tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Only restart instruction after machine check recovery if it is safe
2012-05-17Merge branches 'perf-urgent-for-linus', 'x86-urgent-for-linus' and ↵Linus Torvalds
'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf, x86 and scheduler updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing: Do not enable function event with enable perf stat: handle ENXIO error for perf_event_open perf: Turn off compiler warnings for flex and bison generated files perf stat: Fix case where guest/host monitoring is not supported by kernel perf build-id: Fix filename size calculation * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable x86: Fix section annotation of acpi_map_cpu2node() x86/microcode: Ensure that module is only loaded on supported Intel CPUs * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption
2012-05-14x86/mce: Only restart instruction after machine check recovery if it is safeTony Luck
Section 15.3.1.2 of the software developer manual has this to say about the RIPV bit in the IA32_MCG_STATUS register: RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program execution can be restarted reliably at the instruction pointed to by the instruction pointer pushed on the stack when the machine-check exception is generated. When clear, the program cannot be reliably restarted at the pushed instruction pointer. We need to save the state of this bit in do_machine_check() and use it in mce_notify_process() to force a signal; even if memory_failure() says it made a complete recovery ... e.g. replaced a clean LRU page. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-09Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Avi Kivity: "Two asynchronous page fault fixes (one guest, one host), a powerpc page refcount fix, and an ia64 build fix." * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: ia64: fix build due to typo KVM: PPC: Book3S HV: Fix refcounting of hugepages KVM: Do not take reference to mm during async #PF KVM: ensure async PF event wakes up vcpu from halt
2012-05-08Merge branch 'for-3.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull two percpu fixes from Tejun Heo: "One adds missing KERN_CONT on split printk()s and the other makes the percpu allocator avoid using PMD_SIZE as atom_size on x86_32. Using PMD_SIZE led to vmalloc area exhaustion on certain configurations (x86_32 android) and the only cost of using PAGE_SIZE instead is static percpu area not being aligned to large page mapping." * 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit percpu: use KERN_CONT in pcpu_dump_alloc_info()
2012-05-08percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bitTejun Heo
With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE or PMD_SIZE for atom_size. PMD_SIZE is used when CPU supports PSE so that percpu areas are aligned to PMD mappings and possibly allow using PMD mappings in vmalloc areas in the future. Using larger atom_size doesn't waste actual memory; however, it does require larger vmalloc space allocation later on for !first chunks. With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem but x86_32 at this point is anything but reasonable in terms of address space and using larger atom_size reportedly leads to frequent percpu allocation failures on certain setups. As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space is aplenty and most x86_64 configurations support PSE, fix the issue by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32. v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and x86_32 PAGE_SIZE as suggested by hpa. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Reported-by: ShuoX Liu <shuox.liu@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4F97BA98.6010001@intel.com> Cc: stable@vger.kernel.org
2012-05-08x86: Fix section annotation of acpi_map_cpu2node()Jan Beulich
Commit 943bc7e110f2 ("x86: Fix section warnings") added __cpuinitdata here, while for functions __cpuinit should be used. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <sp@numascale.com> Link: http://lkml.kernel.org/r/4FA947910200007800082470@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Steffen Persvold <sp@numascale.com>
2012-05-07x86/microcode: Ensure that module is only loaded on supported Intel CPUsSrivatsa S. Bhat
Exit early when there's no support for a particular CPU family. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Jones <davej@redhat.com> Cc: tigran@aivazian.fsnet.co.uk Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/4F8BDB58.6070007@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-06IA32 emulation: Fix build problem for modular ia32 a.out supportLarry Finger
Commit ce7e5d2d19bc ("x86: fix broken TASK_SIZE for ia32_aout") breaks kernel builds when "CONFIG_IA32_AOUT=m" with ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined! make[1]: *** [__modpost] Error 1 The entry point needs to be exported. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Al Viro <viro@zeniv.linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-06KVM: Do not take reference to mm during async #PFGleb Natapov
It turned to be totally unneeded. The reason the code was introduced is so that KVM can prefault swapped in page, but prefault can fail even if mm is pinned since page table can change anyway. KVM handles this situation correctly though and does not inject spurious page faults. Fixes: "INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected" warning while running LTP inside a KVM guest using the recent -next kernel. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-27x86/amd: Re-enable CPU topology extensions in case BIOS has disabled itAndreas Herrmann
BIOS will switch off the corresponding feature flag on family 15h models 10h-1fh non-desktop CPUs. The topology extension CPUID leafs are required to detect which cores belong to the same compute unit. (thread siblings mask is set accordingly and also correct information about L1i and L2 cache sharing depends on this). W/o this patch we wouldn't see which cores belong to the same compute unit and also cache sharing information for L1i and L2 would be incorrect on such systems. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25x86/apic: Use x2apic physical mode based on FADT settingGreg Pearson
Provide systems that do not support x2apic cluster mode a mechanism to select x2apic physical mode using the FADT FORCE_APIC_PHYSICAL_DESTINATION_MODE bit. Changes from v1: (based on Suresh's comments) - removed #ifdef CONFIG_ACPI - removed #include <linux/acpi.h> Signed-off-by: Greg Pearson <greg.pearson@hp.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1335313436-32020-1-git-send-email-greg.pearson@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25Merge tag 'l3-fix-for-3.5' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. ( Pulling it into v3.4 despite the v3.5 tag - the fix is small and we better keep the same code across kernel versions for such user facing interfaces. ) Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-23x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from ↵Konrad Rzeszutek Wilk
assembler With commit a2ef5c4fd44ce3922435139393b89f2cce47f576 "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state. The assembler code in wakeup_*.S did not do that. One solution is to call it from assembler and stick the wake_sleep_flags on the stack (for 32-bit) or in %esi (for 64-bit). hpa and rafael both suggested however to create a wrapper function to call acpi_enter_sleep_state and call said wrapper function ("acpi_enter_s3") from assembler. For 32-bit, the acpi_enter_s3 ends up looking as so: push %ebp mov %esp,%ebp sub $0x8,%esp movzbl 0xc1809314,%eax [wake_sleep_flags] movl $0x3,(%esp) mov %eax,0x4(%esp) call 0xc12d1fa0 <acpi_enter_sleep_state> leave ret And 64-bit: movzbl 0x9afde1(%rip),%esi [wake_sleep_flags] push %rbp mov $0x3,%edi mov %rsp,%rbp callq 0xffffffff812e9800 <acpi_enter_sleep_state> leaveq retq Reviewed-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: H. Peter Anvin <hpa@zytor.com> [v2: Remove extra assembler operations, per hpa review] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-3-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-23ACPI: Convert wake_sleep_flags to a value instead of functionKonrad Rzeszutek Wilk
With commit a2ef5c4fd44ce3922435139393b89f2cce47f576 "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state, which means that if there are functions outside the sleep.c code they can't get the wake_sleep_flags values. This converts the function in to a exported value and converts the module config operands to a function. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Lin Ming <ming.m.lin@intel.com> [v2: Parameters can be turned on/off dynamically] [v3: unsigned char -> u8] [v4: val -> kp->arg] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-2-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-19x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot()Srivatsa S. Bhat
If the L3 disable slot is already in use, return -EEXIST instead of -EINVAL. The caller, store_cache_disable(), checks this return value to print an appropriate warning. Also, we want to signal with -EEXIST that the current index we're disabling has actually been already disabled on the node: $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_1 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu5/cache/index3/cache_disable_1 -bash: echo: write error: File exists The old code would say -bash: echo: write error: Invalid argument for disable slot 1 when playing the example above with no output in dmesg, which is clearly misleading. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120419070053.GB16645@elgon.mountain [Boris: add testing for the other index too] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-18x86, apic: APIC code touches invalid MSR on P5 class machinesBryan O'Donoghue
Current APIC code assumes MSR_IA32_APICBASE is present for all systems. Pentium Classic P5 and friends didn't have this MSR. MSR_IA32_APICBASE was introduced as an architectural MSR by Intel @ P6. Code paths that can touch this MSR invalidly are when vendor == Intel && cpu-family == 5 and APIC bit is set in CPUID - or when you simply pass lapic on the kernel command line, on a P5. The below patch stops Linux incorrectly interfering with the MSR_IA32_APICBASE for P5 class machines. Other code paths exist that touch the MSR - however those paths are not currently reachable for a conformant P5. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linux.intel.com> Link: http://lkml.kernel.org/r/4F8EEDD3.1080404@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>
2012-04-16i387: ptrace breaks the lazy-fpu-restore logicOleg Nesterov
Starting from 7e16838d "i387: support lazy restore of FPU state" we assume that fpu_owner_task doesn't need restore_fpu_checking() on the context switch, its FPU state should match what we already have in the FPU on this CPU. However, debugger can change the tracee's FPU state, in this case we should reset fpu.last_cpu to ensure fpu_lazy_restore() can't return true. Change init_fpu() to do this, it is called by user_regset->set() methods. Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20120416204815.GB24884@redhat.com Cc: <stable@vger.kernel.org> v3.3 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-16x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()Andreas Herrmann
It's only called from amd.c:srat_detect_node(). The introduced condition for calling the fixup code is true for all AMD multi-node processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA nodes on one socket. Thus there are cores having different numa-node-id but with equal phys_proc_id. There is no point to print error messages in such a situation. The confusing/misleading error message was introduced with commit 64be4c1c2428e148de6081af235e2418e6a66dda ("x86: Add x86_init platform override to fix up NUMA core numbering"). Remove the default fixup function (especially the error message) and replace it by a NULL pointer check, move the Numascale-specific condition for calling the fixup into the fixup-function itself and slightly adapt the comment. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: <stable@kernel.org> Cc: <sp@numascale.com> Cc: <bp@amd64.org> Cc: <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/20120402160648.GR27684@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-16x86/amd: Remove broken links from comment and kernel messageAndreas Herrmann
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20120411151238.GA4794@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-14Merge tag 'microcode-fix-for-3.4' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent Pull from Borislav Petkov a two-patch fix from Andreas taking care of a sysfs warning when the microcode driver is loaded on unsupported platforms. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-13x86, microcode: Ensure that module is only loaded on supported AMD CPUsAndreas Herrmann
Exit early when there's no support for a particular CPU family. Also, fixup the "no support for this CPU vendor" to be issued only when the driver is attempted to be loaded on an unsupported vendor. Cc: stable@vger.kernel.org Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com [Boris: add a commit msg because Andreas is lazy] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-13x86, microcode: Fix sysfs warning during module unload on unsupported CPUsAndreas Herrmann
Loading the microcode driver on an unsupported CPU and subsequently unloading the driver causes WARNING: at fs/sysfs/group.c:138 mc_device_remove+0x5f/0x70 [microcode]() Hardware name: 01972NG sysfs group ffffffffa00013d0 not found for kobject 'cpu0' Modules linked in: snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel btusb snd_hda_codec bluetooth thinkpad_acpi rfkill microcode(-) [last unloaded: cfg80211] Pid: 4560, comm: modprobe Not tainted 3.4.0-rc2-00002-g258f742 #5 Call Trace: [<ffffffff8103113b>] ? warn_slowpath_common+0x7b/0xc0 [<ffffffff81031235>] ? warn_slowpath_fmt+0x45/0x50 [<ffffffff81120e74>] ? sysfs_remove_group+0x34/0x120 [<ffffffffa00000ef>] ? mc_device_remove+0x5f/0x70 [microcode] [<ffffffff81331eb9>] ? subsys_interface_unregister+0x69/0xa0 [<ffffffff81563526>] ? mutex_lock+0x16/0x40 [<ffffffffa0000c3e>] ? microcode_exit+0x50/0x92 [microcode] [<ffffffff8107051d>] ? sys_delete_module+0x16d/0x260 [<ffffffff810a0065>] ? wait_iff_congested+0x45/0x110 [<ffffffff815656af>] ? page_fault+0x1f/0x30 [<ffffffff81565ba2>] ? system_call_fastpath+0x16/0x1b on recent kernels. This is due to commit 8a25a2fd126c ("cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem") which renders commit 6c53cbfced04 ("x86, microcode: Correct sysdev_add error path") useless. See http://marc.info/?l=linux-kernel&m=133416246406478 Avoid above warning by restoring the old driver behaviour before 6c53cbfced04 ("x86, microcode: Correct sysdev_add error path"). Cc: stable@vger.kernel.org Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Use correct byte-sized register constraint in __add() x86: Use correct byte-sized register constraint in __xchg_op() x86: vsyscall: Use NULL instead 0 for a pointer argument