summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-10-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "This is a pretty large update. I think it is roughly as big as what I usually had for the _whole_ rc period. There are a few bad bugs where the guest can OOPS or crash the host. We have also started looking at attack models for nested virtualization; bugs that usually result in the guest ring 0 crashing itself become more worrisome if you have nested virtualization, because the nested guest might bring down the non-nested guest as well. For current uses of nested virtualization these do not really have a security impact, but you never know and bugs are bugs nevertheless. A lot of these bugs are in 3.17 too, resulting in a large number of stable@ Ccs. I checked that all the patches apply there with no conflicts" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: vfio: fix unregister kvm_device_ops of vfio KVM: x86: Wrong assertion on paging_tmpl.h kvm: fix excessive pages un-pinning in kvm_iommu_map error path. KVM: x86: PREFETCH and HINT_NOP should have SrcMem flag KVM: x86: Emulator does not decode clflush well KVM: emulate: avoid accessing NULL ctxt->memopp KVM: x86: Decoding guest instructions which cross page boundary may fail kvm: x86: don't kill guest on unknown exit reason kvm: vmx: handle invvpid vm exit gracefully KVM: x86: Handle errors when RIP is set during far jumps KVM: x86: Emulator fixes for eip canonical checks on near branches KVM: x86: Fix wrong masking on relative jump/call KVM: x86: Improve thread safety in pit KVM: x86: Prevent host from panicking on shared MSR writes. KVM: x86: Check non-canonical addresses upon WRMSR
2014-10-24Merge tag 'stable/for-linus-3.18-b-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Fix regression in xen_clocksource_read() which caused all Xen guests to crash early in boot. - Several fixes for super rare race conditions in the p2m. - Assorted other minor fixes. * tag 'stable/for-linus-3.18-b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pci: Allocate memory for physdev_pci_device_add's optarr x86/xen: panic on bad Xen-provided memory map x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read() x86/xen: avoid race in p2m handling x86/xen: delay construction of mfn_list_list x86/xen: avoid writing to freed memory after race in p2m handling xen/balloon: Don't continue ballooning when BP_ECANCELED is encountered
2014-10-24Merge tag 'sound-3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Here are a chunk of small fixes since rc1: two PCM core fixes, one is a long-standing annoyance about lockdep and another is an ARM64 mmap fix. The rest are a HD-audio HDMI hotplug notification fix, a fix for missing NULL termination in Realtek codec quirks and a few new device/codec-specific quirks as usual" * tag 'sound-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macro ALSA: pcm: Fix false lockdep warnings ALSA: hda - Fix inverted LED gpio setup for Lenovo Ideapad ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug ALSA: usb-audio: Add support for Steinberg UR22 USB interface ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume ALSA: pcm: use the same dma mmap codepath both for arm and arm64
2014-10-24Merge tag 'random_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random updates from Ted Ts'o: "This adds a memzero_explicit() call which is guaranteed not to be optimized away by GCC. This is important when we are wiping cryptographically sensitive material" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: crypto: memzero_explicit - make sure to clear out sensitive data random: add and use memzero_explicit() for clearing data
2014-10-24Merge tag 'pm+acpi-3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "This is material that didn't make it to my 3.18-rc1 pull request for various reasons, mostly related to timing and travel (LinuxCon EU / LPC) plus a couple of fixes for recent bugs. The only really new thing here is the PM QoS class for memory bandwidth, but it is simple enough and users of it will be added in the next cycle. One major change in behavior is that platform devices enumerated by ACPI will use 32-bit DMA mask by default. Also included is an ACPICA update to a new upstream release, but that's mostly cleanups, changes in tools and similar. The rest is fixes and cleanups mostly. Specifics: - Fix for a recent PCI power management change that overlooked the fact that some IRQ chips might not be able to configure PCIe PME for system wakeup from Lucas Stach. - Fix for a bug introduced in 3.17 where acpi_device_wakeup() is called with a wrong ordering of arguments from Zhang Rui. - A bunch of intel_pstate driver fixes (all -stable candidates) from Dirk Brandewie, Gabriele Mazzotta and Pali Rohár. - Fixes for a rather long-standing problem with the OOM killer and the freezer that frozen processes killed by the OOM do not actually release any memory until they are thawed, so OOM-killing them is rather pointless, with a couple of cleanups on top (Michal Hocko, Cong Wang, Rafael J Wysocki). - ACPICA update to upstream release 20140926, inlcuding mostly cleanups reducing differences between the upstream ACPICA and the kernel code, tools changes (acpidump, acpiexec) and support for the _DDN object (Bob Moore, Lv Zheng). - New PM QoS class for memory bandwidth from Tomeu Vizoso. - Default 32-bit DMA mask for platform devices enumerated by ACPI (this change is mostly needed for some drivers development in progress targeted at 3.19) from Heikki Krogerus. - ACPI EC driver cleanups, mostly related to debugging, from Lv Zheng. - cpufreq-dt driver updates from Thomas Petazzoni. - powernv cpuidle driver update from Preeti U Murthy" * tag 'pm+acpi-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (34 commits) intel_pstate: Correct BYT VID values. intel_pstate: Fix BYT frequency reporting intel_pstate: Don't lose sysfs settings during cpu offline cpufreq: intel_pstate: Reflect current no_turbo state correctly cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy PCI / PM: handle failure to enable wakeup on PCIe PME ACPI: invoke acpi_device_wakeup() with correct parameters PM / freezer: Clean up code after recent fixes PM: convert do_each_thread to for_each_process_thread OOM, PM: OOM killed task shouldn't escape PM suspend freezer: remove obsolete comments in __thaw_task() freezer: Do not freeze tasks killed by OOM killer ACPI / platform: provide default DMA mask cpuidle: powernv: Populate cpuidle state details by querying the device-tree cpufreq: cpufreq-dt: adjust message related to regulators cpufreq: cpufreq-dt: extend with platform_data cpufreq: allow driver-specific data ACPI / EC: Cleanup coding style. ACPI / EC: Refine event/query debugging messages. ...
2014-10-24Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: "Sorry that I missed the merge window as there is a bug found in the last minute, and I have to fix it and wait for the code to be tested in linux-next tree for a few days. Now the buggy patch has been dropped entirely from my next branch. Thus I hope those changes can still be merged in 3.18-rc2 as most of them are platform thermal driver changes. Specifics: - introduce ACPI INT340X thermal drivers. Newer laptops and tablets may have thermal sensors and other devices with thermal control capabilities that are exposed for the OS to use via the ACPI INT340x device objects. Several drivers are introduced to expose the temperature information and cooling ability from these objects to user-space via the normal thermal framework. From: Lu Aaron, Lan Tianyu, Jacob Pan and Zhang Rui. - introduce a new thermal governor, which just uses a hysteresis to switch abruptly on/off a cooling device. This governor can be used to control certain fan devices that can not be throttled but just switched on or off. From: Peter Feuerer. - introduce support for some new thermal interrupt functions on i.MX6SX, in IMX thermal driver. From: Anson, Huang. - introduce tracing support on thermal framework. From: Punit Agrawal. - small fixes in OF thermal and thermal step_wise governor" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits) Thermal: int340x thermal: select ACPI fan driver Thermal: int3400_thermal: use acpi_thermal_rel parsing APIs Thermal: int340x_thermal: expose acpi thermal relationship tables Thermal: introduce int3403 thermal driver Thermal: introduce INT3402 thermal driver Thermal: move the KELVIN_TO_MILLICELSIUS macro to thermal.h ACPI / Fan: support INT3404 thermal device ACPI / Fan: add ACPI 4.0 style fan support ACPI / fan: convert to platform driver ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant ACPI / fan: remove no need check for device pointer ACPI / fan: remove unused macro Thermal: int3400 thermal: register to thermal framework Thermal: int3400 thermal: add capability to detect supporting UUIDs Thermal: introduce int3400 thermal driver ACPI: add ACPI_TYPE_LOCAL_REFERENCE support to acpi_extract_package() ACPI: make acpi_create_platform_device() an external API thermal: step_wise: fix: Prevent from binary overflow when trend is dropping ACPI: introduce ACPI int340x thermal scan handler thermal: Added Bang-bang thermal governor ...
2014-10-24kvm: vfio: fix unregister kvm_device_ops of vfioWanpeng Li
After commit 80ce163 (KVM: VFIO: register kvm_device_ops dynamically), kvm_device_ops of vfio can be registered dynamically. Commit 3c3c29fd (kvm-vfio: do not use module_init) move the dynamic register invoked by kvm_init in order to fix broke unloading of the kvm module. However, kvm_device_ops of vfio is unregistered after rmmod kvm-intel module which lead to device type collision detection warning after kvm-intel module reinsmod. WARNING: CPU: 1 PID: 10358 at /root/cathy/kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:3289 kvm_init+0x234/0x282 [kvm]() Modules linked in: kvm_intel(O+) kvm(O) nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd sunrpc pci_stub bridge stp llc autofs4 8021q cpufreq_ondemand ipv6 joydev microcode pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e i2c_i801 ixgbe ptp pps_core hwmon mdio tpm_tis tpm ipmi_si ipmi_msghandler acpi_cpufreq isci libsas scsi_transport_sas button dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kvm_intel] CPU: 1 PID: 10358 Comm: insmod Tainted: G W O 3.17.0-rc1 #2 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013 0000000000000cd9 ffff880ff08cfd18 ffffffff814a61d9 0000000000000cd9 0000000000000000 ffff880ff08cfd58 ffffffff810417b7 ffff880ff08cfd48 ffffffffa045bcac ffffffffa049c420 0000000000000040 00000000000000ff Call Trace: [<ffffffff814a61d9>] dump_stack+0x49/0x60 [<ffffffff810417b7>] warn_slowpath_common+0x7c/0x96 [<ffffffffa045bcac>] ? kvm_init+0x234/0x282 [kvm] [<ffffffff810417e6>] warn_slowpath_null+0x15/0x17 [<ffffffffa045bcac>] kvm_init+0x234/0x282 [kvm] [<ffffffffa016e995>] vmx_init+0x1bf/0x42a [kvm_intel] [<ffffffffa016e7d6>] ? vmx_check_processor_compat+0x64/0x64 [kvm_intel] [<ffffffff810002ab>] do_one_initcall+0xe3/0x170 [<ffffffff811168a9>] ? __vunmap+0xad/0xb8 [<ffffffff8109c58f>] do_init_module+0x2b/0x174 [<ffffffff8109d414>] load_module+0x43e/0x569 [<ffffffff8109c6d8>] ? do_init_module+0x174/0x174 [<ffffffff8109c75a>] ? copy_module_from_user+0x39/0x82 [<ffffffff8109b7dd>] ? module_sect_show+0x20/0x20 [<ffffffff8109d65f>] SyS_init_module+0x54/0x81 [<ffffffff814a9a12>] system_call_fastpath+0x16/0x1b ---[ end trace 0626f4a3ddea56f3 ]--- The bug can be reproduced by: rmmod kvm_intel.ko insmod kvm_intel.ko without rmmod/insmod kvm.ko This patch fixes the bug by unregistering kvm_device_ops of vfio when the kvm-intel module is removed. Reported-by: Liu Rongrong <rongrongx.liu@intel.com> Fixes: 3c3c29fd0d7cddc32862c350d0700ce69953e3bd Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Wrong assertion on paging_tmpl.hNadav Amit
Even after the recent fix, the assertion on paging_tmpl.h is triggered. Apparently, the assertion wants to check that the PAE is always set on long-mode, but does it in incorrect way. Note that the assertion is not enabled unless the code is debugged by defining MMU_DEBUG. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24kvm: fix excessive pages un-pinning in kvm_iommu_map error path.Quentin Casasnovas
The third parameter of kvm_unpin_pages() when called from kvm_iommu_map_pages() is wrong, it should be the number of pages to un-pin and not the page size. This error was facilitated with an inconsistent API: kvm_pin_pages() takes a size, but kvn_unpin_pages() takes a number of pages, so fix the problem by matching the two. This was introduced by commit 350b8bd ("kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)"), which fixes the lack of un-pinning for pages intended to be un-pinned (i.e. memory leak) but unfortunately potentially aggravated the number of pages we un-pin that should have stayed pinned. As far as I understand though, the same practical mitigations apply. This issue was found during review of Red Hat 6.6 patches to prepare Ksplice rebootless updates. Thanks to Vegard for his time on a late Friday evening to help me in understanding this code. Fixes: 350b8bd ("kvm: iommu: fix the third parameter of... (CVE-2014-3601)") Cc: stable@vger.kernel.org Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jamie Iles <jamie.iles@oracle.com> Reviewed-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: PREFETCH and HINT_NOP should have SrcMem flagNadav Amit
The decode phase of the x86 emulator assumes that every instruction with the ModRM flag, and which can be used with RIP-relative addressing, has either SrcMem or DstMem. This is not the case for several instructions - prefetch, hint-nop and clflush. Adding SrcMem|NoAccess for prefetch and hint-nop and SrcMem for clflush. This fixes CVE-2014-8480. Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5 Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Emulator does not decode clflush wellNadav Amit
Currently, all group15 instructions are decoded as clflush (e.g., mfence, xsave). In addition, the clflush instruction requires no prefix (66/f2/f3) would exist. If prefix exists it may encode a different instruction (e.g., clflushopt). Creating a group for clflush, and different group for each prefix. This has been the case forever, but the next patch needs the cflush group in order to fix a bug introduced in 3.17. Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5 Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: emulate: avoid accessing NULL ctxt->memoppPaolo Bonzini
A failure to decode the instruction can cause a NULL pointer access. This is fixed simply by moving the "done" label as close as possible to the return. This fixes CVE-2014-8481. Reported-by: Andy Lutomirski <luto@amacapital.net> Cc: stable@vger.kernel.org Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Decoding guest instructions which cross page boundary may failNadav Amit
Once an instruction crosses a page boundary, the size read from the second page disregards the common case that part of the operand resides on the first page. As a result, fetch of long insturctions may fail, and thereby cause the decoding to fail as well. Cc: stable@vger.kernel.org Fixes: 5cfc7e0f5e5e1adf998df94f8e36edaf5d30d38e Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24kvm: x86: don't kill guest on unknown exit reasonMichael S. Tsirkin
KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was triggered by a priveledged application. Let's not kill the guest: WARN and inject #UD instead. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24kvm: vmx: handle invvpid vm exit gracefullyPetr Matousek
On systems with invvpid instruction support (corresponding bit in IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid causes vm exit, which is currently not handled and results in propagation of unknown exit to userspace. Fix this by installing an invvpid vm exit handler. This is CVE-2014-3646. Cc: stable@vger.kernel.org Signed-off-by: Petr Matousek <pmatouse@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Handle errors when RIP is set during far jumpsNadav Amit
Far jmp/call/ret may fault while loading a new RIP. Currently KVM does not handle this case, and may result in failed vm-entry once the assignment is done. The tricky part of doing so is that loading the new CS affects the VMCS/VMCB state, so if we fail during loading the new RIP, we are left in unconsistent state. Therefore, this patch saves on 64-bit the old CS descriptor and restores it if loading RIP failed. This fixes CVE-2014-3647. Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Emulator fixes for eip canonical checks on near branchesNadav Amit
Before changing rip (during jmp, call, ret, etc.) the target should be asserted to be canonical one, as real CPUs do. During sysret, both target rsp and rip should be canonical. If any of these values is noncanonical, a #GP exception should occur. The exception to this rule are syscall and sysenter instructions in which the assigned rip is checked during the assignment to the relevant MSRs. This patch fixes the emulator to behave as real CPUs do for near branches. Far branches are handled by the next patch. This fixes CVE-2014-3647. Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Fix wrong masking on relative jump/callNadav Amit
Relative jumps and calls do the masking according to the operand size, and not according to the address size as the KVM emulator does today. This patch fixes KVM behavior. Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Improve thread safety in pitAndy Honig
There's a race condition in the PIT emulation code in KVM. In __kvm_migrate_pit_timer the pit_timer object is accessed without synchronization. If the race condition occurs at the wrong time this can crash the host kernel. This fixes CVE-2014-3611. Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Prevent host from panicking on shared MSR writes.Andy Honig
The previous patch blocked invalid writes directly when the MSR is written. As a precaution, prevent future similar mistakes by gracefulling handle GPs caused by writes to shared MSRs. Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig <ahonig@google.com> [Remove parts obsoleted by Nadav's patch. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24KVM: x86: Check non-canonical addresses upon WRMSRNadav Amit
Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is written to certain MSRs. The behavior is "almost" identical for AMD and Intel (ignoring MSRs that are not implemented in either architecture since they would anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if non-canonical address is written on Intel but not on AMD (which ignores the top 32-bits). Accordingly, this patch injects a #GP on the MSRs which behave identically on Intel and AMD. To eliminate the differences between the architecutres, the value which is written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is turned to canonical value before writing instead of injecting a #GP. Some references from Intel and AMD manuals: According to Intel SDM description of WRMSR instruction #GP is expected on WRMSR "If the source register contains a non-canonical address and ECX specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP." According to AMD manual instruction manual: LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the LSTAR and CSTAR registers. If an RIP written by WRMSR is not in canonical form, a general-protection exception (#GP) occurs." IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the base field must be in canonical form or a #GP fault will occur." IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must be in canonical form." This patch fixes CVE-2014-3610. Cc: stable@vger.kernel.org Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macroDavid Henningsson
Without this terminating entry, the pin matching would continue across random memory until a zero or a non-matching entry was found. The result being that in some cases, the pin quirk would not be applied correctly. Cc: stable@vger.kernel.org Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-10-23Merge tag 'remove-weak-declarations' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull weak function declaration removal from Bjorn Helgaas: "The "weak" attribute is commonly used for the default version of a function, where an architecture can override it by providing a strong version. Some header file declarations included the "weak" attribute. That's error-prone because it causes every implementation to be weak, with no strong version at all, and the linker chooses one based on link order. What we want is the "weak" attribute only on the *definition* of the default implementation. These changes remove "weak" from the declarations, leaving it on the default definitions" * tag 'remove-weak-declarations' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: uprobes: Remove "weak" from function declarations memory-hotplug: Remove "weak" from memory_block_size_bytes() declaration kgdb: Remove "weak" from kgdb_arch_pc() declaration ARC: kgdb: generic kgdb_arch_pc() suffices vmcore: Remove "weak" from function declarations clocksource: Remove "weak" from clocksource_default_clock() declaration x86, intel-mid: Remove "weak" from function declarations audit: Remove "weak" from audit_classify_compat_syscall() declaration
2014-10-23Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI updates from Peter Anvin: "This patchset falls under the "maintainers that grovel" clause in the v3.18-rc1 announcement. We had intended to push it late in the merge window since we got it into the -tip tree relatively late. Many of these are relatively simple things, but there are a couple of key bits, especially Ard's and Matt's patches" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) rtc: Disable EFI rtc for x86 efi: rtc-efi: Export platform:rtc-efi as module alias efi: Delete the in_nmi() conditional runtime locking efi: Provide a non-blocking SetVariable() operation x86/efi: Adding efi_printks on memory allocationa and pci.reads x86/efi: Mark initialization code as such x86/efi: Update comment regarding required phys mapped EFI services x86/efi: Unexport add_efi_memmap variable x86/efi: Remove unused efi_call* macros efi: Resolve some shadow warnings arm64: efi: Format EFI memory type & attrs with efi_md_typeattr_format() ia64: efi: Format EFI memory type & attrs with efi_md_typeattr_format() x86: efi: Format EFI memory type & attrs with efi_md_typeattr_format() efi: Introduce efi_md_typeattr_format() efi: Add macro for EFI_MEMORY_UCE memory attribute x86/efi: Clear EFI_RUNTIME_SERVICES if failing to enter virtual mode arm64/efi: Do not enter virtual mode if booting with efi=noruntime or noefi arm64/efi: uefi_init error handling fix efi: Add kernel param efi=noruntime lib: Add a generic cmdline parse function parse_option_str ...
2014-10-23Merge branches 'pm-cpuidle' and 'pm-cpufreq'Rafael J. Wysocki
* pm-cpuidle: cpuidle: powernv: Populate cpuidle state details by querying the device-tree * pm-cpufreq: intel_pstate: Correct BYT VID values. intel_pstate: Fix BYT frequency reporting intel_pstate: Don't lose sysfs settings during cpu offline cpufreq: intel_pstate: Reflect current no_turbo state correctly cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy cpufreq: cpufreq-dt: adjust message related to regulators cpufreq: cpufreq-dt: extend with platform_data cpufreq: allow driver-specific data
2014-10-23Merge branches 'acpi-pm' and 'pm-genirq'Rafael J. Wysocki
* acpi-pm: ACPI: invoke acpi_device_wakeup() with correct parameters * pm-genirq: PCI / PM: handle failure to enable wakeup on PCIe PME
2014-10-23Merge branch 'freezer'Rafael J. Wysocki
* freezer: PM / freezer: Clean up code after recent fixes PM: convert do_each_thread to for_each_process_thread OOM, PM: OOM killed task shouldn't escape PM suspend freezer: remove obsolete comments in __thaw_task() freezer: Do not freeze tasks killed by OOM killer
2014-10-23Merge branch 'pm-qos'Rafael J. Wysocki
* pm-qos: PM / QoS: Add PM_QOS_MEMORY_BANDWIDTH class
2014-10-23Merge branches 'acpi-ec' and 'acpi-platform'Rafael J. Wysocki
* acpi-ec: ACPI / EC: Cleanup coding style. ACPI / EC: Refine event/query debugging messages. ACPI / EC: Add detailed command/query debugging information. ACPI / EC: Enhance the logs to apply to QR_EC transactions. ACPI / EC: Add CPU ID to debugging messages. * acpi-platform: ACPI / platform: provide default DMA mask
2014-10-23intel_pstate: Correct BYT VID values.Dirk Brandewie
Using a VID value that is not high enough for the requested P state can cause machine checks. Add a ceiling function to ensure calulated VIDs with fractional values are set to the next highest integer VID value. The algorythm for calculating the non-trubo VID from the BIOS writers guide is: vid_ratio = (vid_max - vid_min) / (max_pstate - min_pstate) vid = ceiling(vid_min + (req_pstate - min_pstate) * vid_ratio) Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23intel_pstate: Fix BYT frequency reportingDirk Brandewie
BYT has a different conversion from P state to frequency than the core processors. This causes the min/max and current frequency to be misreported on some BYT SKUs. Tested on BYT N2820, Ivybridge and Haswell processors. Link: https://bugzilla.yoctoproject.org/show_bug.cgi?id=6663 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23intel_pstate: Don't lose sysfs settings during cpu offlineDirk Brandewie
The user may have custom settings don't destroy them during suspend. Link: https://bugzilla.kernel.org/show_bug.cgi?id=80651 Reported-by: Tobias Jakobi <liquid.acid@gmx.net> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23cpufreq: intel_pstate: Reflect current no_turbo state correctlyGabriele Mazzotta
Some BIOSes modify the state of MSR_IA32_MISC_ENABLE_TURBO_DISABLE based on the current power source for the system battery AC vs battery. Reflect the correct current state and ability to modify the no_turbo sysfs file based on current state of MSR_IA32_MISC_ENABLE_TURBO_DISABLE. Link: https://bugzilla.kernel.org/show_bug.cgi?id=83151 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23cpufreq: expose scaling_cur_freq sysfs file for set_policy() driversDirk Brandewie
Currently the core does not expose scaling_cur_freq for set_policy() drivers this breaks some userspace monitoring tools. Change the core to expose this file for all drivers and if the set_policy() driver supports the get() callback use it to retrieve the current frequency. Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23cpufreq: intel_pstate: Fix setting max_perf_pct in performance policyPali Rohár
Code which changes policy to powersave changes also max_policy_pct based on max_freq. Code which change max_perf_pct has upper limit base on value max_policy_pct. When policy is changing from powersave back to performance then max_policy_pct is not changed. Which means that changing max_perf_pct is not possible to high values if max_freq was too low in powersave policy. Test case: $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq 800000 $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 3300000 $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 100 $ echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor $ echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq $ echo 20 > /sys/devices/system/cpu/intel_pstate/max_perf_pct $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor powersave $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 800000 $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 20 $ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor $ echo 3300000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq $ echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 3300000 $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 24 And now intel_pstate driver allows to set maximal value for max_perf_pct based on max_policy_pct which is 24 for previous powersave max_freq 800000. This patch will set default value for max_policy_pct when setting policy to performance so it will allow to set also max value for max_perf_pct. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Cc: All applicable <stable@vger.kernel.org> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23Merge tag 'hwmon-for-linus-v3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull a hwmon fix from Guenter Roeck: "Fix potential compile problem for menf21bmc hwmon driver" * tag 'hwmon-for-linus-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (menf21bmc) Include linux/err.h
2014-10-23PCI / PM: handle failure to enable wakeup on PCIe PMELucas Stach
If the irqchip handling the PCIe PME interrupt is not able to enable interrupt wakeup we should properly reflect this in the PME suspend status. This fixes a kernel warning on resume, where it would try to disable the irq wakeup that failed to be activated while suspending, for example: WARNING: CPU: 0 PID: 609 at kernel/irq/manage.c:536 irq_set_irq_wake+0xc0/0xf8() Unbalanced IRQ 384 wake disable Fixes: 76cde7e49590 (PCI / PM: Make PCIe PME interrupts wake up from suspend-to-idle) Reported-and-tested-by: Richard Zhu <richard.zhu@freescale.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23ACPI: invoke acpi_device_wakeup() with correct parametersZhang Rui
Fix a bug that invokes acpi_device_wakeup() with wrong parameters. Fixes: f35cec255557 (ACPI / PM: Always enable wakeup GPEs when enabling device wakeup) Signed-off-by: Zhang Rui <rui.zhang@intel.com> Cc: 3.17+ <stable@vger.kernel.org> # 3.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Intel, nouveau, radeon and qxl. Mostly for bugs introduced in the merge window, nothing too shocking" [ And one cirrus fix added later and not mentioned in the pull request.. - Linus ] * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/cirrus: bind also to qemu-xen-traditional qxl: don't create too large primary surface drm/nouveau: fix regression on agp boards drm/gt215/gr: fix initialisation on gddr5 boards drm/radeon: reduce sparse false positive warnings drm/radeon: fix vm page table block size calculation drm/ttm: Don't evict BOs outside of the requested placement range drm/ttm: Don't skip fpfn check if lpfn is 0 in ttm_bo_mem_compat drm/radeon: use gart memory for DMA ring tests drm/radeon: fix speaker allocation setup drm/radeon: initialize sadb to NULL in the audio code drm/i915: fix short vs. long hpd detection drm/i915: Don't trust the DP_DETECT bit for eDP ports on CHV Revert "drm/radeon/dpm: drop clk/voltage dependency filters for SI" Revert "drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table" drm/i915: properly reenable gen8 pipe IRQs drm/i915: Move DIV_ROUND_CLOSEST_ULL macro to header drm/i915: intel_backlight scale() math WA
2014-10-23xen/pci: Allocate memory for physdev_pci_device_add's optarrBoris Ostrovsky
physdev_pci_device_add's optarr[] is a zero-sized array and therefore reference to add.optarr[0] is accessing memory that does not belong to the 'add' variable. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23x86/xen: panic on bad Xen-provided memory mapMartin Kelly
Panic if Xen provides a memory map with 0 entries. Although this is unlikely, it is better to catch the error at the point of seeing the map than later on as a symptom of some other crash. Signed-off-by: Martin Kelly <martkell@amazon.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read()Boris Ostrovsky
Commit 89cbc76768c2 ("x86: Replace __get_cpu_var uses") replaced __get_cpu_var() with this_cpu_ptr() in xen_clocksource_read() in such a way that instead of accessing a structure pointed to by a per-cpu pointer we are trying to get to a per-cpu structure. __this_cpu_read() of the pointer is the more appropriate accessor. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23x86/xen: avoid race in p2m handlingJuergen Gross
When a new p2m leaf is allocated this leaf is linked into the p2m tree via cmpxchg. Unfortunately the compare value for checking the success of the update is read after checking for the need of a new leaf. It is possible that a new leaf has been linked into the tree concurrently in between. This could lead to a leaked memory page and to the loss of some p2m entries. Avoid the race by using the read compare value for checking the need of a new p2m leaf and use ACCESS_ONCE() to get it. There are other places which seem to need ACCESS_ONCE() to ensure proper operation. Change them accordingly. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23x86/xen: delay construction of mfn_list_listJuergen Gross
The 3 level p2m tree for the Xen tools is constructed very early at boot by calling xen_build_mfn_list_list(). Memory needed for this tree is allocated via extend_brk(). As this tree (other than the kernel internal p2m tree) is only needed for domain save/restore, live migration and crash dump analysis it doesn't matter whether it is constructed very early or just some milliseconds later when memory allocation is possible by other means. This patch moves the call of xen_build_mfn_list_list() just after calling xen_pagetable_p2m_copy() simplifying this function, too, as it doesn't have to bother with two parallel trees now. The same applies for some other internal functions. While simplifying code, make early_can_reuse_p2m_middle() static and drop the unused second parameter. p2m_mid_identity_mfn can be removed as well, it isn't used either. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23x86/xen: avoid writing to freed memory after race in p2m handlingJuergen Gross
In case a race was detected during allocation of a new p2m tree element in alloc_p2m() the new allocated mid_mfn page is freed without updating the pointer to the found value in the tree. This will result in overwriting the just freed page with the mfn of the p2m leaf. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23xen/balloon: Don't continue ballooning when BP_ECANCELED is encounteredBoris Ostrovsky
Commit 3dcf63677d4e ("xen/balloon: cancel ballooning if adding new memory failed") makes reserve_additional_memory() return BP_ECANCELED when an error is encountered. This error, however, is ignored by the caller (balloon_process()) since it is overwritten by subsequent call to update_schedule(). This results in continuous attempts to add more memory, all of which are likely to fail again. We should stop trying to schedule next iteration of ballooning when the current one has failed. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23drm/cirrus: bind also to qemu-xen-traditionalOlaf Hering
qemu as used by xend/xm toolstack uses a different subvendor id. Bind the drm driver also to this emulated card. Signed-off-by: Olaf Hering <olaf@aepfle.de> cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-10-22uprobes: Remove "weak" from function declarationsBjorn Helgaas
For the following interfaces: set_swbp() set_orig_insn() is_swbp_insn() is_trap_insn() uprobe_get_swbp_addr() arch_uprobe_ignore() arch_uprobe_copy_ixol() kernel/events/uprobes.c provides default definitions explicitly marked "weak". Some architectures provide their own definitions intended to override the defaults, but the "weak" attribute on the declarations applied to the arch definitions as well, so the linker chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak annotation from pcibios_get_phb_of_node decl")). Remove the "weak" attribute from the declarations so we always prefer a non-weak definition over the weak one, independent of link order. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> CC: Victor Kamensky <victor.kamensky@linaro.org> CC: Oleg Nesterov <oleg@redhat.com> CC: David A. Long <dave.long@linaro.org> CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
2014-10-22memory-hotplug: Remove "weak" from memory_block_size_bytes() declarationBjorn Helgaas
drivers/base/memory.c provides a default memory_block_size_bytes() definition explicitly marked "weak". Several architectures provide their own definitions intended to override the default, but the "weak" attribute on the declaration applied to the arch definitions as well, so the linker chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak annotation from pcibios_get_phb_of_node decl")). Remove the "weak" attribute from the declaration so we always prefer a non-weak definition over the weak one, independent of link order. Fixes: 41f107266b19 ("drivers: base: Add prototype declaration to the header file") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> CC: Rashika Kheria <rashika.kheria@gmail.com> CC: Nathan Fontenot <nfont@austin.ibm.com> CC: Anton Blanchard <anton@au1.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: Yinghai Lu <yinghai@kernel.org>
2014-10-22kgdb: Remove "weak" from kgdb_arch_pc() declarationBjorn Helgaas
kernel/debug/debug_core.c provides a default kgdb_arch_pc() definition explicitly marked "weak". Several architectures provide their own definitions intended to override the default, but the "weak" attribute on the declaration applied to the arch definitions as well, so the linker chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak annotation from pcibios_get_phb_of_node decl")). Remove the "weak" attribute from the declaration so we always prefer a non-weak definition over the weak one, independent of link order. Fixes: 688b744d8bc8 ("kgdb: fix signedness mixmatches, add statics, add declaration to header") Tested-by: Vineet Gupta <vgupta@synopsys.com> # for ARC build Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Harvey Harrison <harvey.harrison@gmail.com>