summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/apic
AgeCommit message (Collapse)Author
2014-03-11x86/apic: Plug racy xAPIC access of CPU hotplug codeJan Kiszka
apic_icr_write() and its users in smpboot.c were apparently written under the assumption that this code would only run during early boot. But nowadays we also execute it when onlining a CPU later on while the system is fully running. That will make wakeup_cpu_via_init_nmi and, thus, also native_apic_icr_write run in plain process context. If we migrate the caller to a different CPU at the wrong time or interrupt it and write to ICR/ICR2 to send unrelated IPIs, we can end up sending INIT, SIPI or NMIs to wrong CPUs. Fix this by disabling interrupts during the write to the ICR halves and disable preemption around waiting for ICR availability and using it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Tested-By: Igor Mammedov <imammedo@redhat.com> Link: http://lkml.kernel.org/r/52E6AFFE.3030004@siemens.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09x86/apic: Always define nox2apic and define it as initdataDavid Rientjes
The "nox2apic" variable can be defined as __initdata since it is only used for bootstrap. It can now unconditionally be defined since it will later be freed. At the same time, it is also better off as a bool. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042354380.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09x86/apic: Switch wait_for_init_deassert() to a bool flagDavid Rientjes
Now that there is only a single wait_for_init_deassert() function, just convert the member of struct apic to a bool to determine whether we need to wait for init_deassert to become non-zero. There are no more callers of default_wait_for_init_deassert(), so fold it into the caller. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042354010.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09x86/apic: Only use default_wait_for_init_deassert()David Rientjes
es7000_wait_for_init_deassert() is functionally equivalent to default_wait_for_init_deassert(), so remove the duplicate code and use only a single function. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402042353030.7839@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-31Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core debug changes from Ingo Molnar: "This contains mostly kernel debugging related updates: - make hung_task detection more configurable to distros - add final bits for x86 UV NMI debugging, with related KGDB changes - update the mailing-list of MAINTAINERS entries I'm involved with" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hung_task: Display every hung task warning sysctl: Add neg_one as a standard constraint x86/uv/nmi, kgdb/kdb: Fix UV NMI handler when KDB not configured x86/uv/nmi: Fix Sparse warnings kgdb/kdb: Fix no KDB config problem MAINTAINERS: Restore "L: linux-kernel@vger.kernel.org" entries
2014-01-25x86/uv/nmi: Fix Sparse warningsMike Travis
Make uv_register_nmi_notifier() and uv_handle_nmi_ping() static to address sparse warnings. Fix problem where uv_nmi_kexec_failed is unused when CONFIG_KEXEC is not defined. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20140114162551.480872353@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-24Merge tag 'pm+acpi-3.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...
2014-01-20Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpu, amd: Fix a shadowed variable situation um, x86: Fix vDSO build x86: Delete non-required instances of include <linux/init.h> x86, realmode: Pointer walk cleanups, pull out invariant use of __pa() x86/traps: Clean up error exception handler definitions
2014-01-20Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic changes from Ingo Molnar: "Two main changes: - improve local APIC Error Status Register reporting robustness - add the 'disable_cpu_apicid=x' boot parameter for kexec booting" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos x86, apic, kexec: Add disable_cpu_apicid kernel parameter x86/apic: Read Error Status Register correctly
2014-01-15x86, apic: Make disabled_cpu_apicid static read_mostly, fix typosH. Peter Anvin
Make disabled_cpu_apicid static and read_mostly, and fix a couple of typos. Reported-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/20140115182511.GA22737@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
2014-01-15x86, apic, kexec: Add disable_cpu_apicid kernel parameterHATAYAMA Daisuke
Add disable_cpu_apicid kernel parameter. To use this kernel parameter, specify an initial APIC ID of the corresponding CPU you want to disable. This is mostly used for the kdump 2nd kernel to disable BSP to wake up multiple CPUs without causing system reset or hang due to sending INIT from AP to BSP. Kdump users first figure out initial APIC ID of the BSP, CPU0 in the 1st kernel, for example from /proc/cpuinfo and then set up this kernel parameter for the 2nd kernel using the obtained APIC ID. However, doing this procedure at each boot time manually is awkward, which should be automatically done by user-land service scripts, for example, kexec-tools on fedora/RHEL distributions. This design is more flexible than disabling BSP in kernel boot time automatically in that in kernel boot time we have no choice but referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning that the method is not applicable to the systems without such BIOS tables. One assumption behind this design is that users get initial APIC ID of the BSP in still healthy state and so BSP is uniquely kept in CPU0. Thus, through the kernel parameter, only one initial APIC ID can be specified. In a comparison with disabled_cpu_apicid, we use read_apic_id(), not boot_cpu_physical_apicid, because on some platforms, the variable is modified to the apicid reported as BSP through MP table and this function is executed with the temporarily modified boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel parameter doesn't work well for apicids of APs. Fixing the wrong handling of boot_cpu_physical_apicid requires some reviews and tests beyond some platforms and it could take some time. The fix here is a kind of workaround to focus on the main topic of this patch. Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6 Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-01-14x86/apic: Read Error Status Register correctlyRichard Weinberger
Currently we do a read, a dummy write and a final read to fetch the error code. The value from the final read is taken. This is not the recommended way and leads to corrupted/lost ESR values. Intel(c) 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 1, 2ABC, 3ABC, Section 10.5.3 states: Before attempt to read from the ESR, software should first write to it. (The value written does not affect the values read subsequently; only zero may be written in x2APIC mode.) This write clears any previously logged errors and updates the ESR with any errors detected since the last write to the ESR. This write also rearms the APIC error interrupt triggering mechanism. This patch removes the first read such that we are conform with the manual. On my (very old) Pentium MMX SMP system this patch fixes the issue that APIC errors: a) are not always reported and b) are reported with false error numbers. Signed-off-by: Richard Weinberger <richard@nod.at> Cc: seiji.aguchi@hds.com Cc: rientjes@google.com Cc: konrad.wilk@oracle.com Cc: bp@alien8.de Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1389685487-20872-1-git-send-email-richard@nod.at Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()Prarit Bhargava
Fengguang Wu's 0day kernel build service reported the following build warning: arch/x86/kernel/apic/io_apic.c:2211 smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))' because irq is defined as an unsigned int instead of an int. Fix this trivial error by redefining irq as a signed int. The remaining consumers of the int are okay. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Joerg Roedel <joro@8bytes.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-prarit@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqsPrarit Bhargava
During heavy CPU-hotplug operations the following spurious kernel warnings can trigger: do_IRQ: No ... irq handler for vector (irq -1) [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ] When downing a cpu it is possible that there are unhandled irqs left in the APIC IRR register. The following code path shows how the problem can occur: 1. CPU 5 is to go down. 2. cpu_disable() on CPU 5 executes with interrupt flag cleared by local_irq_save() via stop_machine(). 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because interrupt flag is cleared (CPU unabled to handle the irq) 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set to -1. 5. stop_machine() finishes cpu_disable() 6. cpu_die() for CPU 5 executes in normal context. 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for IRQ 12. The code attempts to find the vector's IRQ and cannot because it has been set to -1. 8. do_IRQ() warning displays warning about CPU 5 IRQ 12. I added a debug printk to output which CPU & vector was retriggered and discovered that that we are getting bogus events. I see a 100% correlation between this debug printk in fixup_irqs() and the do_IRQ() warning. This patchset resolves this by adding definitions for VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying the code to use them. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831 Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Rui Wang <rui.y.wang@intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: janet.morgan@Intel.com Cc: tony.luck@Intel.com Cc: ruiv.wang@gmail.com Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-prarit@redhat.com [ Cleaned up the code a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-06x86: Delete non-required instances of include <linux/init.h>Paul Gortmaker
None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. [ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-12-07ACPI: Clean up inclusions of ACPI header filesLv Zheng
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h> inclusions and remove some inclusions of those files that aren't necessary. First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h> should not be included directly from any files that are built for CONFIG_ACPI unset, because that generally leads to build warnings about undefined symbols in !CONFIG_ACPI builds. For CONFIG_ACPI set, <linux/acpi.h> includes those files and for CONFIG_ACPI unset it provides stub ACPI symbols to be used in that case. Second, there are ordering dependencies between those files that always have to be met. Namely, it is required that <acpi/acpi_bus.h> be included prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the latter depends on are always there. And <acpi/acpi.h> which provides basic ACPICA type declarations should always be included prior to any other ACPI headers in CONFIG_ACPI builds. That also is taken care of including <linux/acpi.h> as appropriate. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Tony Luck <tony.luck@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff) Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-19Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "A modular build fix for certain .config's" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Export 'boot_cpu_physical_apicid' to modules
2013-11-15x86: Export 'boot_cpu_physical_apicid' to modulesDavid Rientjes
Commit 9ebddac7ea2a "ACPI, x86: Fix extended error log driver to depend on CONFIG_X86_LOCAL_APIC" fixed a build error when CONFIG_X86_LOCAL_APIC was not selected and !CONFIG_SMP. However, since CONFIG_ACPI_EXTLOG is tristate, there is a second build error: ERROR: "boot_cpu_physical_apicid" [drivers/acpi/acpi_extlog.ko] undefined! The symbol needs to be exported for it to be available. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311141504080.30112@chino.kir.corp.google.com [ Changed it to a _GPL() export. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-14Merge tag 'pm+acpi-3.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael J Wysocki: - New power capping framework and the the Intel Running Average Power Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan. - Addition of the in-kernel switching feature to the arm_big_little cpufreq driver from Viresh Kumar and Nicolas Pitre. - cpufreq support for iMac G5 from Aaro Koskinen. - Baytrail processors support for intel_pstate from Dirk Brandewie. - cpufreq support for Midway/ECX-2000 from Mark Langsdorf. - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha. - ACPI power management support for the I2C and SPI bus types from Mika Westerberg and Lv Zheng. - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat, Stratos Karafotis, Xiaoguang Chen, Lan Tianyu. - cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev. - intel_pstate updates from Dirk Brandewie and Adrian Huang. - ACPICA update to version 20130927 includig fixes and cleanups and some reduction of divergences between the ACPICA code in the kernel and ACPICA upstream in order to improve the automatic ACPICA patch generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh Bhat, Bjorn Helgaas, David E Box. - ACPI IPMI driver fixes and cleanups from Lv Zheng. - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang Yanfei, Rafael J Wysocki. - Conversion of the ACPI AC driver to the platform bus type and multiple driver fixes and cleanups related to ACPI from Zhang Rui. - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu, Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki. - Fixes and cleanups and new blacklist entries related to the ACPI video support from Aaron Lu, Felipe Contreras, Lennart Poettering, Kirill Tkhai. - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi. - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han, Bartlomiej Zolnierkiewicz, Prarit Bhargava. - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe. - Operation Performance Points (OPP) core updates from Nishanth Menon. - Runtime power management core fix from Rafael J Wysocki and update from Ulf Hansson. - Hibernation fixes from Aaron Lu and Rafael J Wysocki. - Device suspend/resume lockup detection mechanism from Benoit Goby. - Removal of unused proc directories created for various ACPI drivers from Lan Tianyu. - ACPI LPSS driver fix and new device IDs for the ACPI platform scan handler from Heikki Krogerus and Jarkko Nikula. - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa. - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter, Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause, Liu Chuansheng. - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding, Jean-Christophe Plagniol-Villard. * tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits) cpufreq: conservative: fix requested_freq reduction issue ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines PM / runtime: Use pm_runtime_put_sync() in __device_release_driver() ACPI / event: remove unneeded NULL pointer check Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1" ACPI / video: Quirk initial backlight level 0 ACPI / video: Fix initial level validity test intel_pstate: skip the driver if ACPI has power mgmt option PM / hibernate: Avoid overflow in hibernate_preallocate_memory() ACPI / hotplug: Do not execute "insert in progress" _OST ACPI / hotplug: Carry out PCI root eject directly ACPI / hotplug: Merge device hot-removal routines ACPI / hotplug: Make acpi_bus_hot_remove_device() internal ACPI / hotplug: Simplify device ejection routines ACPI / hotplug: Fix handle_root_bridge_removal() ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug ACPI / scan: Start matching drivers after trying scan handlers ACPI: Remove acpi_pci_slot_init() headers from internal.h ACPI / blacklist: fix name of ThinkPad Edge E530 PowerCap: Fix build error with option -Werror=format-security ... Conflicts: arch/arm/mach-omap2/opp.c drivers/Kconfig drivers/spi/spi.c
2013-11-12Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV debug changes from Ingo Molnar: "Various SGI UV debuggability improvements, amongst them KDB support, with related core KDB enabling patches changing kernel/debug/kdb/" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86/UV: Add uvtrace support" x86/UV: Add call to KGDB/KDB from NMI handler kdb: Add support for external NMI handler to call KGDB/KDB x86/UV: Check for alloc_cpumask_var() failures properly in uv_nmi_setup() x86/UV: Add uvtrace support x86/UV: Add kdump to UV NMI handler x86/UV: Add summary of cpu activity to UV NMI handler x86/UV: Update UV support for external NMI signals x86/UV: Move NMI support
2013-10-28Merge branch 'acpi-processor'Rafael J. Wysocki
* acpi-processor: ACPI / processor: fixed a brace coding style issue ACPI / processor: Remove outdated comments ACPI / processor: remove unnecessary if (!pr) check ACPI / processor: remove some dead code in acpi_processor_get_info() x86 / ACPI: simplify _acpi_map_lsapic() ACPI / processor: use apic_id and remove duplicated _MAT evaluation ACPI / processor: Introduce apic_id in struct processor to save parsed APIC id
2013-10-15x86: Update UV3 hub revision IDRuss Anderson
The UV3 hub revision ID is different than expected. The first revision was supposed to start at 1 but instead will start at 0. Signed-off-by: Russ Anderson <rja@sgi.com> Cc: <stable@kernel.org> # v3.9, v3.10, v3.11 Link: http://lkml.kernel.org/r/20131014161733.GA6274@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Update UV support for external NMI signalsMike Travis
The current UV NMI handler has not been updated for the changes in the system NMI handler and the perf operations. The UV NMI handler reads an MMR in the UV Hub to check to see if the NMI event was caused by the external 'system NMI' that the operator can initiate on the System Mgmt Controller. The problem arises when the perf tools are running, causing millions of perf events per second on very large CPU count systems. Previously this was okay because the perf NMI handler ran at a higher priority on the NMI call chain and if the NMI was a perf event, it would stop calling other NMI handlers remaining on the NMI call chain. Now the system NMI handler calls all the handlers on the NMI call chain including the UV NMI handler. This causes the UV NMI handler to read the MMRs at the same millions per second rate. This can lead to significant performance loss and possible system failures. It also can cause thousands of 'Dazed and Confused' messages being sent to the system console. This effectively makes perf tools unusable on UV systems. To avoid this excessive overhead when perf tools are running, this code has been optimized to minimize reading of the MMRs as much as possible, by moving to the NMI_UNKNOWN notifier chain. This chain is called only when all the users on the standard NMI_LOCAL call chain have been called and none of them have claimed this NMI. There is an exception where the NMI_LOCAL notifier chain is used. When the perf tools are in use, it's possible that the UV NMI was captured by some other NMI handler and then either ignored or mistakenly processed as a perf event. We set a per_cpu ('ping') flag for those CPUs that ignored the initial NMI, and then send them an IPI NMI signal. The NMI_LOCAL handler on each cpu does not need to read the MMR, but instead checks the in memory flag indicating it was pinged. There are two module variables, 'ping_count' indicating how many requested NMI events occurred, and 'ping_misses' indicating how many stray NMI events. These most likely are perf events so it shows the overhead of the perf NMI interrupts and how many MMR reads were avoided. This patch also minimizes the reads of the MMRs by having the first cpu entering the NMI handler on each node set a per HUB in-memory atomic value. (Having a per HUB value avoids sending lock traffic over NumaLink.) Both types of UV NMIs from the SMI layer are supported. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.353547733@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86/UV: Move NMI supportMike Travis
This patch moves the UV NMI support from the x2apic file to a new separate uv_nmi.c file in preparation for the next sequence of patches. It prevents upcoming bloat of the x2apic file, and has the added benefit of putting the upcoming /sys/module parameters under the name 'uv_nmi' instead of 'x2apic_uv_x', which was obscure. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20130923212500.183295611@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-24x86 / ACPI: simplify _acpi_map_lsapic()Jiang Liu
In acpi_register_lapic(), it will generates a new logical cpu number and maps to the local APIC id, this logical cpu number can be returned to simplify _acpi_map_lsapic() implementation. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-04Merge branch 'x86-asmlinkage-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/asmlinkage changes from Ingo Molnar: "As a preparation for Andi Kleen's LTO patchset (link time optimizations using GCC's -flto which build time optimization has steadily increased in quality over the past few years and might eventually be usable for the kernel too) this tree includes a handful of preparatory patches that make function calling convention annotations consistent again: - Mark every function without arguments (or 64bit only) that is used by assembly code with asmlinkage() - Mark every function with parameters or variables that is used by assembly code as __visible. For the vanilla kernel this has documentation, consistency and debuggability advantages, for the time being" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asmlinkage: Fix warning in xen asmlinkage change x86, asmlinkage, vdso: Mark vdso variables __visible x86, asmlinkage, power: Make various symbols used by the suspend asm code visible x86, asmlinkage: Make dump_stack visible x86, asmlinkage: Make 64bit checksum functions visible x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops x86, asmlinkage, apm: Make APM data structure used from assembler visible x86, asmlinkage: Make syscall tables visible x86, asmlinkage: Make several variables used from assembler/linker script visible x86, asmlinkage: Make kprobes code visible and fix assembler code x86, asmlinkage: Make various syscalls asmlinkage x86, asmlinkage: Make 32bit/64bit __switch_to visible x86, asmlinkage: Make _*_start_kernel visible x86, asmlinkage: Make all interrupt handlers asmlinkage / __visible x86, asmlinkage: Change dotraplinkage into __visible on 32bit x86: Fix sys_call_table type in asm/syscall.h
2013-08-26x86/ioapic: Check attr against the previous setting when programmed more ↵Liu Ping Fan
than once When programming ioapic pinX more than once, current code does not check whether the later attr (trigger & polarity) is the same as the former or not. This causes broken semantics which can be observed in a qemu q35 machine, where ioapic's ioredtbl[x] can never be set as low-active, even if the hpet driver registered it. And hpet driver may share a high-level active IRQ line with other devices. So in qemu, when hpet-dev asserts low-level as kernel expects, the kernel has no response. With this patch, we can observe an ioredtbl[x] set as low-active for hpet. Fix it by reporting -EBUSY to the caller, when attr is different. Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Cc: Kevin Hao <haokexin@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1377248327-19633-1-git-send-email-pingfank@linux.vnet.ibm.com [ Made small readability edits to both the changelog and the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-20x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lockYoshihiro YUNOMAE
Prevent crash_kexec() from deadlocking on ioapic_lock. When crash_kexec() is executed on a CPU, the CPU will take ioapic_lock in disable_IO_APIC(). So if the cpu gets an NMI while locking ioapic_lock, a deadlock will happen. In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC(). You can reproduce this deadlock the following way: 1. Add mdelay(1000) after raw_spin_lock_irqsave() in native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c Although the deadlock can occur without this modification, it will increase the potential of the deadlock problem. 2. Build and install the kernel 3. Set up the OS which will run panic() and kexec when NMI is injected # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf # vim /etc/default/grub add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line # grub2-mkconfig 4. Reboot the OS 5. Run following command for each vcpu on the guest # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinitity; done; By running this command, cpus will get ioapic_lock for setting affinity. 6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you use VM). After injecting NMI, panic() is called in an nmi-handler context. Then, kexec will normally run in panic(), but the operation will be stopped by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()-> native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()-> clear_IO_APIC_pin()->ioapic_read_entry(). Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevel Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-06x86, asmlinkage: Make all interrupt handlers asmlinkage / __visibleAndi Kleen
These handlers are all referenced from assembler stubs, so need to be visible. The handlers without arguments become asmlinkage, the others __visible to not force regparms(0) on x86-32. I put it all into a single patch, please let me know if you want it it split up. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1375740170-7446-4-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-07-14x86: delete __cpuinit usage from all x86 filesPaul Gortmaker
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) are flagged as __cpuinit -- so if we remove the __cpuinit from arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the arch/x86 uses of the __cpuinit macros from all C files. x86 only had the one __CPUINIT used in assembly files, and it wasn't paired off with a .previous or a __FINIT, so we can delete it directly w/o any corresponding additional change there. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-09reboot: move arch/x86 reboot= handling to generic kernelRobin Holt
Merge together the unicore32, arm, and x86 reboot= command line parameter handling. Signed-off-by: Robin Holt <holt@sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Russ Anderson <rja@sgi.com> Cc: Robin Holt <holt@sgi.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-02Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV update from Ingo Molnar: "There's a single commit in this tree, which adds support for a new SGI UV GRU (Global Reference Unit - fast NUMA messaging ASIC) hardware feature to scale up and beyond: an optional distributed mode that will allow per-node address mapping of local GRU space, as opposed to mapping all GRU hardware to the same contiguous high space" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/UV: Add GRU distributed mode mappings
2013-07-02Merge branch 'x86-tracing-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tracing updates from Ingo Molnar: "This tree adds IRQ vector tracepoints that are named after the handler and which output the vector #, based on a zero-overhead approach that relies on changing the IDT entries, by Seiji Aguchi. The new tracepoints look like this: # perf list | grep -i irq_vector irq_vectors:local_timer_entry [Tracepoint event] irq_vectors:local_timer_exit [Tracepoint event] irq_vectors:reschedule_entry [Tracepoint event] irq_vectors:reschedule_exit [Tracepoint event] irq_vectors:spurious_apic_entry [Tracepoint event] irq_vectors:spurious_apic_exit [Tracepoint event] irq_vectors:error_apic_entry [Tracepoint event] irq_vectors:error_apic_exit [Tracepoint event] [...]" * 'x86-tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tracing: Add config option checking to the definitions of mce handlers trace,x86: Do not call local_irq_save() in load_current_idt() trace,x86: Move creation of irq tracepoints from apic.c to irq.c x86, trace: Add irq vector tracepoints x86: Rename variables for debugging x86, trace: Introduce entering/exiting_irq() tracing: Add DEFINE_EVENT_FN() macro
2013-07-02Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc x86 cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, reloc: Use xorl instead of xorq in relocate_kernel_64.S x86, cleanups: Remove extra tab in __flush_tlb_one() x86/mce: Remove check for CONFIG_X86_MCE_P4THERMAL
2013-06-21trace,x86: Move creation of irq tracepoints from apic.c to irq.cSteven Rostedt (Red Hat)
Compiling without CONFIG_X86_LOCAL_APIC set, apic.c will not be compiled, and the irq tracepoints will not be created via the CREATE_TRACE_POINTS macro. When CONFIG_X86_LOCAL_APIC is not set, we get the following build error: LD init/built-in.o arch/x86/built-in.o: In function `trace_x86_platform_ipi_entry': linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_entry' arch/x86/built-in.o: In function `trace_x86_platform_ipi_exit': linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_exit' arch/x86/built-in.o: In function `trace_irq_work_entry': linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_entry' arch/x86/built-in.o: In function `trace_irq_work_exit': linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_exit' arch/x86/built-in.o:(__jump_table+0x8): undefined reference to `__tracepoint_x86_platform_ipi_entry' arch/x86/built-in.o:(__jump_table+0x14): undefined reference to `__tracepoint_x86_platform_ipi_exit' arch/x86/built-in.o:(__jump_table+0x20): undefined reference to `__tracepoint_irq_work_entry' arch/x86/built-in.o:(__jump_table+0x2c): undefined reference to `__tracepoint_irq_work_exit' make[1]: *** [vmlinux] Error 1 make: *** [sub-make] Error 2 As irq.c is always compiled for x86, it is a more appropriate location to create the irq tracepoints. Cc: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-20x86, trace: Add irq vector tracepointsSeiji Aguchi
[Purpose of this patch] As Vaibhav explained in the thread below, tracepoints for irq vectors are useful. http://www.spinics.net/lists/mm-commits/msg85707.html <snip> The current interrupt traces from irq_handler_entry and irq_handler_exit provide when an interrupt is handled. They provide good data about when the system has switched to kernel space and how it affects the currently running processes. There are some IRQ vectors which trigger the system into kernel space, which are not handled in generic IRQ handlers. Tracing such events gives us the information about IRQ interaction with other system events. The trace also tells where the system is spending its time. We want to know which cores are handling interrupts and how they are affecting other processes in the system. Also, the trace provides information about when the cores are idle and which interrupts are changing that state. <snip> On the other hand, my usecase is tracing just local timer event and getting a value of instruction pointer. I suggested to add an argument local timer event to get instruction pointer before. But there is another way to get it with external module like systemtap. So, I don't need to add any argument to irq vector tracepoints now. [Patch Description] Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events. But there is an above use case to trace specific irq_vector rather than tracing all events. In this case, we are concerned about overhead due to unwanted events. So, add following tracepoints instead of introducing irq_vector_entry/exit. so that we can enable them independently. - local_timer_vector - reschedule_vector - call_function_vector - call_function_single_vector - irq_work_entry_vector - error_apic_vector - thermal_apic_vector - threshold_apic_vector - spurious_apic_vector - x86_platform_ipi_vector Also, introduce a logic switching IDT at enabling/disabling time so that a time penalty makes a zero when tracepoints are disabled. Detailed explanations are as follows. - Create trace irq handlers with entering_irq()/exiting_irq(). - Create a new IDT, trace_idt_table, at boot time by adding a logic to _set_gate(). It is just a copy of original idt table. - Register the new handlers for tracpoints to the new IDT by introducing macros to alloc_intr_gate() called at registering time of irq_vector handlers. - Add checking, whether irq vector tracing is on/off, into load_current_idt(). This has to be done below debug checking for these reasons. - Switching to debug IDT may be kicked while tracing is enabled. - On the other hands, switching to trace IDT is kicked only when debugging is disabled. In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being used for other purposes. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2013-06-20x86, trace: Introduce entering/exiting_irq()Seiji Aguchi
When implementing tracepoints in interrupt handers, if the tracepoints are simply added in the performance sensitive path of interrupt handers, it may cause potential performance problem due to the time penalty. To solve the problem, an idea is to prepare non-trace/trace irq handers and switch their IDTs at the enabling/disabling time. So, let's introduce entering_irq()/exiting_irq() for pre/post- processing of each irq handler. A way to use them is as follows. Non-trace irq handler: smp_irq_handler() { entering_irq(); /* pre-processing of this handler */ __smp_irq_handler(); /* * common logic between non-trace and trace handlers * in a vector. */ exiting_irq(); /* post-processing of this handler */ } Trace irq_handler: smp_trace_irq_handler() { entering_irq(); /* pre-processing of this handler */ trace_irq_entry(); /* tracepoint for irq entry */ __smp_irq_handler(); /* * common logic between non-trace and trace handlers * in a vector. */ trace_irq_exit(); /* tracepoint for irq exit */ exiting_irq(); /* post-processing of this handler */ } If tracepoints can place outside entering_irq()/exiting_irq() as follows, it looks cleaner. smp_trace_irq_handler() { trace_irq_entry(); smp_irq_handler(); trace_irq_exit(); } But it doesn't work. The problem is with irq_enter/exit() being called. They must be called before trace_irq_enter/exit(), because of the rcu_irq_enter() must be called before any tracepoints are used, as tracepoints use rcu to synchronize. As a possible alternative, we may be able to call irq_enter() first as follows if irq_enter() can nest. smp_trace_irq_hander() { irq_entry(); trace_irq_entry(); smp_irq_handler(); trace_irq_exit(); irq_exit(); } But it doesn't work, either. If irq_enter() is nested, it may have a time penalty because it has to check if it was already called or not. The time penalty is not desired in performance sensitive paths even if it is tiny. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2013-06-20x86: Fix trigger_all_cpu_backtrace() implementationMichel Lespinasse
The following change fixes the x86 implementation of trigger_all_cpu_backtrace(), which was previously (accidentally, as far as I can tell) disabled to always return false as on architectures that do not implement this function. trigger_all_cpu_backtrace(), as defined in include/linux/nmi.h, should call arch_trigger_all_cpu_backtrace() if available, or return false if the underlying arch doesn't implement this function. x86 did provide a suitable arch_trigger_all_cpu_backtrace() implementation, but it wasn't actually being used because it was declared in asm/nmi.h, which linux/nmi.h doesn't include. Also, linux/nmi.h couldn't easily be fixed by including asm/nmi.h, because that file is not available on all architectures. I am proposing to fix this by moving the x86 definition of arch_trigger_all_cpu_backtrace() to asm/irq.h. Tested via: echo l > /proc/sysrq-trigger Before the change, this uses a fallback implementation which shows backtraces on active CPUs (using smp_call_function_interrupt() ) After the change, this shows NMI backtraces on all CPUs Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1370518875-1346-1-git-send-email-walken@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-31x86/mce: Remove check for CONFIG_X86_MCE_P4THERMALPaul Bolle
The Kconfig symbol X86_MCE_P4THERMAL was removed in v2.6.32. Remove a useless check for its macro, as it will now always evaluate to false. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Link: http://lkml.kernel.org/r/1369853850.23034.28.camel@x61.thuisdomein Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-30x86/UV: Add GRU distributed mode mappingsDimitri Sivanich
GRU hardware will support an optional distributed mode that will allow per-node address mapping of local GRU space, as opposed to mapping all GRU hardware to the same contiguous high space. If GRU distributed mode is selected, setup per-node page table mappings. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Russ Anderson <rja@sgi.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20130529155609.GB22917@sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-26Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge Revert "x86, mm: Make spurious_fault check explicitly check explicitly check the PRESENT bit" x86/mm/numa: Don't check if node is NUMA_NO_NODE x86, efi: Make "noefi" really disable EFI runtime serivces x86/apic: Fix parsing of the 'lapic' cmdline option
2013-02-21Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Peter Anvin: "This is a huge set of several partly interrelated (and concurrently developed) changes, which is why the branch history is messier than one would like. The *really* big items are two humonguous patchsets mostly developed by Yinghai Lu at my request, which completely revamps the way we create initial page tables. In particular, rather than estimating how much memory we will need for page tables and then build them into that memory -- a calculation that has shown to be incredibly fragile -- we now build them (on 64 bits) with the aid of a "pseudo-linear mode" -- a #PF handler which creates temporary page tables on demand. This has several advantages: 1. It makes it much easier to support things that need access to data very early (a followon patchset uses this to load microcode way early in the kernel startup). 2. It allows the kernel and all the kernel data objects to be invoked from above the 4 GB limit. This allows kdump to work on very large systems. 3. It greatly reduces the difference between Xen and native (Xen's equivalent of the #PF handler are the temporary page tables created by the domain builder), eliminating a bunch of fragile hooks. The patch series also gets us a bit closer to W^X. Additional work in this pull is the 64-bit get_user() work which you were also involved with, and a bunch of cleanups/speedups to __phys_addr()/__pa()." * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits) x86, mm: Move reserving low memory later in initialization x86, doc: Clarify the use of asm("%edx") in uaccess.h x86, mm: Redesign get_user with a __builtin_choose_expr hack x86: Be consistent with data size in getuser.S x86, mm: Use a bitfield to mask nuisance get_user() warnings x86/kvm: Fix compile warning in kvm_register_steal_time() x86-32: Add support for 64bit get_user() x86-32, mm: Remove reference to alloc_remap() x86-32, mm: Remove reference to resume_map_numa_kva() x86-32, mm: Rip out x86_32 NUMA remapping code x86/numa: Use __pa_nodebug() instead x86: Don't panic if can not alloc buffer for swiotlb mm: Add alloc_bootmem_low_pages_nopanic() x86, 64bit, mm: hibernate use generic mapping_init x86, 64bit, mm: Mark data/bss/brk to nx x86: Merge early kernel reserve for 32bit and 64bit x86: Add Crash kernel low reservation x86, kdump: Remove crashkernel range find limit for 64bit memblock: Add memblock_mem_size() x86, boot: Not need to check setup_header version for setup_data ...
2013-02-20x86/apic: Fix parsing of the 'lapic' cmdline optionMathias Krause
Including " lapic " in the kernel cmdline on an x86-64 kernel makes it panic while parsing early params -- e.g. with no user visible output. Fix this bug by ensuring arg is non-NULL before passing it to strncmp(). Reported-by: PaX Team <pageexec@freemail.hu> Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1361303227-13174-1-git-send-email-minipli@googlemail.com Cc: stable@vger.kernel.org # v3.8 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV3 support update from Ingo Molnar: "Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming third major iteration and upscaling of the SGI UV supercomputing platform." * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3 x86, uv, uv3: Check current gru hub support for SGI UV3 x86, uv, uv3: Update Time Support for SGI UV3 x86, uv, uv3: Update x2apic Support for SGI UV3 x86, uv, uv3: Update Hub Info for SGI UV3 x86, uv, uv3: Update ACPI Check to include SGI UV3 x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)
2013-02-19Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic changes from Ingo Molnar: "Main changes: - Multiple MSI support added to the APIC, PCI and AHCI code - acked by all relevant maintainers, by Alexander Gordeev. The advantage is that multiple AHCI ports can have multiple MSI irqs assigned, and can thus spread to multiple CPUs. [ Drivers can make use of this new facility via the pci_enable_msi_block_auto() method ] - x86 IOAPIC code from interrupt remapping cleanups from Joerg Roedel: These patches move all interrupt remapping specific checks out of the x86 core code and replaces the respective call-sites with function pointers. As a result the interrupt remapping code is better abstraced from x86 core interrupt handling code. - Various smaller improvements, fixes and cleanups." * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess x86, kvm: Fix intialization warnings in kvm.c x86, irq: Move irq_remapped out of x86 core code x86, io_apic: Introduce eoi_ioapic_pin call-back x86, msi: Introduce x86_msi.compose_msi_msg call-back x86, irq: Introduce setup_remapped_irq() x86, irq: Move irq_remapped() check into free_remapped_irq x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq() x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core x86, irq: Add data structure to keep AMD specific irq remapping information x86, irq: Move irq_remapping_enabled declaration to iommu code x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin x86, io_apic: Move irq_remapping_enabled checks out of check_timer() x86, io_apic: Convert setup_ioapic_entry to function pointer x86, io_apic: Introduce set_affinity function pointer x86, msi: Use IRQ remapping specific setup_msi_irqs routine x86, hpet: Introduce x86_msi_ops.setup_hpet_msi x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging x86, io_apic: Introduce x86_io_apic_ops.disable() x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume ...
2013-02-11x86, uv, uv3: Update x2apic Support for SGI UV3Mike Travis
This patch adds support for the SGI UV3 hub to the common x2apic functions. The primary changes are to account for the similarities between UV2 and UV3 which are encompassed within the "UVX" nomenclature. One significant difference within UV3 is the handling of the MMIOH regions which are redirected to the target blade (with the device) in a different manner. It also now has two MMIOH regions for both small and large BARs. This aids in limiting the amount of physical address space removed from real memory that's used for I/O in the max config of 64TB. Signed-off-by: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20130211194508.752924185@gulag1.americas.sgi.com Acked-by: Russ Anderson <rja@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Steffen Persvold <sp@numascale.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-11x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systemsStoney Wang
When a HP ProLiant DL980 G7 Server boots a regular kernel, there will be intermittent lost interrupts which could result in a hang or (in extreme cases) data loss. The reason is that this system only supports x2apic physical mode, while the kernel boots with a logical-cluster default setting. This bug can be worked around by specifying the "x2apic_phys" or "nox2apic" boot option, but we want to handle this system without requiring manual workarounds. The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table. As all apicids are smaller than 255, BIOS need to pass the control to the OS with xapic mode, according to x2apic-spec, chapter 2.9. Current code handle x2apic when BIOS pass with xapic mode enabled: When user specifies x2apic_phys, or FADT indicates PHYSICAL: 1. During madt oem check, apic driver is set with xapic logical or xapic phys driver at first. 2. enable_IR_x2apic() will enable x2apic_mode. 3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe() will install the correct x2apic phys driver and use x2apic phys mode. Otherwise it will skip the driver will let x2apic_cluster_probe to take over to install x2apic cluster driver (wrong one) even though FADT indicates PHYSICAL, because x2apic_phys_probe does not check FADT PHYSICAL. Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the problem. Signed-off-by: Stoney Wang <song-bo.wang@hp.com> [ updated the changelog and simplified the code ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-28x86, io_apic: Introduce eoi_ioapic_pin call-backJoerg Roedel
This callback replaces the old __eoi_ioapic_pin function which needs a special path for interrupt remapping. Signed-off-by: Joerg Roedel <joro@8bytes.org> Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-28x86, msi: Introduce x86_msi.compose_msi_msg call-backJoerg Roedel
This call-back points to the right function for initializing the msi_msg structure. The old code for msi_msg generation was split up into the irq-remapped and the default case. The irq-remapped case just calls into the specific Intel or AMD implementation when the device is behind an IOMMU. Otherwise the default function is called. Signed-off-by: Joerg Roedel <joro@8bytes.org> Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-28x86, irq: Introduce setup_remapped_irq()Joerg Roedel
This function does irq-remapping specific interrupt setup like modifying the chip defaults. Signed-off-by: Joerg Roedel <joro@8bytes.org> Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>