summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-05-02Linux 2.6.28.10v2.6.28.10Greg Kroah-Hartman
2009-05-02unreached code in selinux_ip_postroute_iptables_compat() (CVE-2009-1184)Eugene Teo
Not upstream in 2.6.30, as the function was removed there, making this a non-issue. Node and port send checks can skip in the compat_net=1 case. This bug was introduced in commit effad8d. Signed-off-by: Eugene Teo <eugeneteo@kernel.sg> Reported-by: Dan Carpenter <error27@gmail.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02thinkpad-acpi: fix LED blinking through timer triggerHenrique de Moraes Holschuh
commit 75bd3bf2ade9d548be0d2bde60b5ee0fdce0b127 upstream. The set_blink hook code in the LED subdriver would never manage to get a LED to blink, and instead it would just turn it on. The consequence of this is that the "timer" trigger would not cause the LED to blink if given default parameters. This problem exists since 2.6.26-rc1. To fix it, switch the deferred LED work handling to use the thinkpad-acpi-specific LED status (off/on/blink) directly. This also makes the code easier to read, and to extend later. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: stable@kernel.org Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02b44: Use kernel DMA addresses for the kernel DMA APIMichael Buesch
commit 37efa239901493694a48f1d6f59f8de17c2c4509 upstream. We must not use the device DMA addresses for the kernel DMA API, because device DMA addresses have an additional offset added for the SSB translation. Use the original dma_addr_t for the sync operation. Cc: stable@kernel.org Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02exit_notify: kill the wrong capable(CAP_KILL) check (CVE-2009-1337)Oleg Nesterov
CVE-2009-1337 commit 432870dab85a2f69dc417022646cb9a70acf7f94 upstream. The CAP_KILL check in exit_notify() looks just wrong, kill it. Whatever logic we have to reset ->exit_signal, the malicious user can bypass it if it execs the setuid application before exiting. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02PCI: fix incorrect mask of PM No_Soft_Reset bitYu Zhao
commit 998dd7c719f62dcfa91d7bf7f4eb9c160e03d817 upstream. Reviewed-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02crypto: ixp4xx - Fix handling of chained sg buffersChristian Hohnstaedt
commit 0d44dc59b2b434b29aafeae581d06f81efac7c83 upstream. - keep dma functions away from chained scatterlists. Use the existing scatterlist iteration inside the driver to call dma_map_single() for each chunk and avoid dma_map_sg(). Signed-off-by: Christian Hohnstaedt <chohnstaedt@innominate.com> Tested-By: Karl Hiramoto <karl@hiramoto.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02fix ptrace slownessMiklos Szeredi
commit 53da1d9456fe7f87a920a78fdbdcf1225d197cb7 upstream. This patch fixes bug #12208: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208 Subject : uml is very slow on 2.6.28 host This turned out to be not a scheduler regression, but an already existing problem in ptrace being triggered by subtle scheduler changes. The problem is this: - task A is ptracing task B - task B stops on a trace event - task A is woken up and preempts task B - task A calls ptrace on task B, which does ptrace_check_attach() - this calls wait_task_inactive(), which sees that task B is still on the runq - task A goes to sleep for a jiffy - ... Since UML does lots of the above sequences, those jiffies quickly add up to make it slow as hell. This patch solves this by not rescheduling in read_unlock() after ptrace_stop() has woken up the tracer. Thanks to Oleg Nesterov and Ingo Molnar for the feedback. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02fs core fixesHugh Dickins
Please add the following 4 commits to 2.6.27-stable and 2.6.28-stable. However, there has been a lot of change here between 2.6.28 and 2.6.29: in particular, fs/exec.c's unsafe_exec() grew into the more complicated check_unsafe_exec(). So applying the original patches gives too many rejects: at the bottom is the diffstat and the combined patch required. 1 Commit: 53e9309e01277ec99c38e84e0ca16921287cf470 Author: Hugh Dickins <hugh@veritas.com> Date: Sat, 28 Mar 2009 23:16:03 +0000 (+0000) Subject: compat_do_execve should unshare_files 2 Commit: e426b64c412aaa3e9eb3e4b261dc5be0d5a83e78 Author: Hugh Dickins <hugh@veritas.com> Date: Sat, 28 Mar 2009 23:20:19 +0000 (+0000) Subject: fix setuid sometimes doesn't 3 Commit: 7c2c7d993044cddc5010f6f429b100c63bc7dffb Author: Hugh Dickins <hugh@veritas.com> Date: Sat, 28 Mar 2009 23:21:27 +0000 (+0000) Subject: fix setuid sometimes wouldn't 4 Commit: f1191b50ec11c8e2ca766d6d99eb5bb9d2c084a3 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Mon, 30 Mar 2009 11:35:18 +0000 (-0400) Subject: check_unsafe_exec() doesn't care about signal handlers sharing Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02powerpc: Sanitize stack pointer in signal handling codeJosh Boyer
This has been backported to 2.6.28.x from commit efbda86098 in Linus' tree On powerpc64 machines running 32-bit userspace, we can get garbage bits in the stack pointer passed into the kernel. Most places handle this correctly, but the signal handling code uses the passed value directly for allocating signal stack frames. This fixes the issue by introducing a get_clean_sp function that returns a sanitized stack pointer. For 32-bit tasks on a 64-bit kernel, the stack pointer is masked correctly. In all other cases, the stack pointer is simply returned. Additionally, we pass an 'is_32' parameter to get_sigframe now in order to get the properly sanitized stack. The callers are know to be 32 or 64-bit statically. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02block: include empty disks in /proc/diskstatsTejun Heo
commit 71982a409f12c50d011325a4471aa20666bb908d upstream. /proc/diskstats used to show stats for all disks whether they're zero-sized or not and their non-zero partitions. Commit 074a7aca7afa6f230104e8e65eba3420263714a5 accidentally changed the behavior such that it doesn't print out zero sized disks. This patch implements DISK_PITER_INCL_EMPTY_PART0 flag to partition iterator and uses it in diskstats_show() such that empty part0 is shown in /proc/diskstats. Reported and bisectd by Dianel Collins. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Collins <solemnwarning@solemnwarning.no-ip.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02md: fix deadlock when stopping arraysDan Williams
[backport of 5fd3a17ed456637a224cf4ca82b9ad9d005bc8d4] Resolve a deadlock when stopping redundant arrays, i.e. ones that require a call to sysfs_remove_group when shutdown. The deadlock is summarized below: Thread1 Thread2 ------- ------- read sysfs attribute stop array take mddev lock sysfs_remove_group sysfs_get_active wait for mddev lock wait for active Sysrq-w: -------- mdmon S 00000017 2212 4163 1 f1982ea8 00000046 2dcf6b85 00000017 c0b23100 f2f83ed0 c0b23100 f2f8413c c0b23100 c0b23100 c0b1fb98 f2f8413c 00000000 f2f8413c c0b23100 f2291ecc 00000002 c0b23100 00000000 00000017 f2f83ed0 f1982eac 00000046 c044d9dd Call Trace: [<c044d9dd>] ? debug_mutex_add_waiter+0x1d/0x58 [<c06ef451>] __mutex_lock_common+0x1d9/0x338 [<c06ef451>] ? __mutex_lock_common+0x1d9/0x338 [<c06ef5e3>] mutex_lock_interruptible_nested+0x33/0x3a [<c0634553>] ? mddev_lock+0x14/0x16 [<c0634553>] mddev_lock+0x14/0x16 [<c0634eda>] md_attr_show+0x2a/0x49 [<c04e9997>] sysfs_read_file+0x93/0xf9 mdadm D 00000017 2812 4177 1 f0401d78 00000046 430456f8 00000017 f0401d58 f0401d20 c0b23100 f2da2c4c c0b23100 c0b23100 c0b1fb98 f2da2c4c 0a10fc36 00000000 c0b23100 f0401d70 00000003 c0b23100 00000000 00000017 f2da29e0 00000001 00000002 00000000 Call Trace: [<c06eed1b>] schedule_timeout+0x1b/0x95 [<c06eed1b>] ? schedule_timeout+0x1b/0x95 [<c06eeb97>] ? wait_for_common+0x34/0xdc [<c044fa8a>] ? trace_hardirqs_on_caller+0x18/0x145 [<c044fbc2>] ? trace_hardirqs_on+0xb/0xd [<c06eec03>] wait_for_common+0xa0/0xdc [<c0428c7c>] ? default_wake_function+0x0/0x12 [<c06eeccc>] wait_for_completion+0x17/0x19 [<c04ea620>] sysfs_addrm_finish+0x19f/0x1d1 [<c04e920e>] sysfs_hash_and_remove+0x42/0x55 [<c04eb4db>] sysfs_remove_group+0x57/0x86 [<c0638086>] do_md_stop+0x13a/0x499 This has been there for a while, but is easier to trigger now that mdmon is closely watching sysfs. Cc: Neil Brown <neilb@suse.de> Reported-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02ath9k: AR9280 PCI devices must serialize IO as wellLuis R. Rodriguez
This is a port of: commit SHA1 5ec905a8df3fa877566ba98298433fbfb3d688cc for 2.6.28 Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
2009-05-02ath9k: implement IO serializationLuis R. Rodriguez
This is a port of: commit SHA1 6158425be398936af1fd04451f78ffad01529cb0 for 2.6.28. All 802.11n PCI devices (Cardbus, PCI, mini-PCI) require serialization of IO when on non-uniprocessor systems. PCI express devices not not require this. This should fix our only last standing open ath9k kernel.org bugzilla bug report: http://bugzilla.kernel.org/show_bug.cgi?id=12110 Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: VMX: Flush volatile msrs before emulating rdmsrAvi Kivity
(cherry picked from 516a1a7e9dc80358030fe01aabb3bedf882db9e2) Some msrs (notable MSR_KERNEL_GS_BASE) are held in the processor registers and need to be flushed to the vcpu struture before they can be read. This fixes cygwin longjmp() failure on Windows x64. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: x86: fix LAPIC pending count calculationMarcelo Tosatti
(cherry picked from b682b814e3cc340f905c14dff87ce8bdba7c5eba) Simplify LAPIC TMCCT calculation by using hrtimer provided function to query remaining time until expiration. Fixes host hang with nested ESX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: x86: disable kvmclock on non constant TSC hostsMarcelo Tosatti
(cherry picked from abe6655dd699069b53bcccbc65b2717f60203b12) This is better. Currently, this code path is posing us big troubles, and we won't have a decent patch in time. So, temporarily disable it. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: PIT: fix i8254 pending count readMarcelo Tosatti
(cherry picked from d2a8284e8fca9e2a938bee6cd074064d23864886) count_load_time assignment is bogus: its supposed to contain what it means, not the expiration time. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: mmu_notifiers release methodMarcelo Tosatti
(cherry picked from 85db06e514422ae429b5f85742d8111b70bd56f3) The destructor for huge pages uses the backing inode for adjusting hugetlbfs accounting. Hugepage mappings are destroyed by exit_mmap, after mmu_notifier_release, so there are no notifications through unmap_hugepage_range at this point. The hugetlbfs inode can be freed with pages backed by it referenced by the shadow. When the shadow releases its reference, the huge page destructor will access a now freed inode. Implement the release operation for kvm mmu notifiers to release page refs before the hugetlbfs inode is gone. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: MMU: handle large host sptes on invlpg/resyncMarcelo Tosatti
(cherry picked from 87917239204d67a316cb89751750f86c9ed3640b) The invlpg and sync walkers lack knowledge of large host sptes, descending to non-existant pagetable level. Stop at directory level in such case. Fixes SMP Windows XP with hugepages. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: MMU: check for present pdptr shadow page in walk_shadowMarcelo Tosatti
(cherry picked from eb64f1e8cd5c3cae912db30a77d062367f7a11a6) walk_shadow assumes the caller verified validity of the pdptr pointer in question, which is not the case for the invlpg handler. Fixes oops during Solaris 10 install. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: Advertise the bug in memory region destruction as fixedAvi Kivity
(cherry picked from 1a811b6167089bcdb84284f2dc9fd0b4d0f1899d) Userspace might need to act differently. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: set owner of cpu and vm file operationsChristian Borntraeger
(cherry picked from 3d3aab1b973b01bd2a1aa46307e94a1380b1d802) There is a race between a "close of the file descriptors" and module unload in the kvm module. You can easily trigger this problem by applying this debug patch: >--- kvm.orig/virt/kvm/kvm_main.c >+++ kvm/virt/kvm/kvm_main.c >@@ -648,10 +648,14 @@ void kvm_free_physmem(struct kvm *kvm) > kvm_free_physmem_slot(&kvm->memslots[i], NULL); > } > >+#include <linux/delay.h> > static void kvm_destroy_vm(struct kvm *kvm) > { > struct mm_struct *mm = kvm->mm; > >+ printk("off1\n"); >+ msleep(5000); >+ printk("off2\n"); > spin_lock(&kvm_lock); > list_del(&kvm->vm_list); > spin_unlock(&kvm_lock); and killing the userspace, followed by an rmmod. The problem is that kvm_destroy_vm can run while the module count is 0. That means, you can remove the module while kvm_destroy_vm is running. But kvm_destroy_vm is part of the module text. This causes a kerneloops. The race exists without the msleep but is much harder to trigger. This patch requires the fix for anon_inodes (anon_inodes: use fops->owner for module refcount). With this patch, we can set the owner of all anonymous KVM inodes file operations. The VFS will then control the KVM module refcount as long as there is an open file. kvm_destroy_vm will be called by the release function of the last closed file - before the VFS drops the module refcount. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: x86 emulator: Fix handling of VMMCALL instructionAmit Shah
(cherry picked from fbce554e940a983d005e29849636d0ef54b3eb18) The VMMCALL instruction doesn't get recognised and isn't processed by the emulator. This is seen on an Intel host that tries to execute the VMMCALL instruction after a guest live migrates from an AMD host. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: Really remove a slot when a user ask us soGlauber Costa
(cherry picked from 6f89724829cfd4ad6771a92fd4b8d59c90c7220c) Right now, KVM does not remove a slot when we do a register ioctl for size 0 (would be the expected behaviour). Instead, we only mark it as empty, but keep all bitmaps and allocated data structures present. It completely nullifies our chances of reusing that same slot again for mapping a different piece of memory. In this patch, we destroy rmaps, and vfree() the pointers that used to hold the dirty bitmap, rmap and lpage_info structures. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: Prevent trace call into unloaded module textWu Fengguang
(cherry picked from b82091824ee4970adf92d5cd6d57b12273171625) Add marker_synchronize_unregister() before module unloading. This prevents possible trace calls into unloaded module text. Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: Fix cpuid iteration on multiple leaves per eacNitin A Kamble
(cherry picked from 0fdf8e59faa5c60e9d77c8e14abe3a0f8bfcf586) The code to traverse the cpuid data array list for counting type of leaves is currently broken. This patches fixes the 2 things in it. 1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without it the code will never find a valid entry. 2. Also the stop condition in the for loop while looking for the next unflaged entry is broken. It needs to stop when it find one matching entry; and in the case of count of 1, it will be the same entry found in this iteration. Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: Fix cpuid leaf 0xb loop terminationNitin A Kamble
(cherry picked from 0853d2c1d849ef69884d2447d90d04007590b72b) For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting leaf. The previous code was using bits 0-7 for this purpose, which is a bug. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: MMU: Fix aliased gfns treated as unaliasedIzik Eidus
(cherry picked from 2843099fee32a6020e1caa95c6026f28b5d43bff) Some areas of kvm x86 mmu are using gfn offset inside a slot without unaliasing the gfn first. This patch makes sure that the gfn will be unaliased and add gfn_to_memslot_unaliased() to save the calculating of the gfn unaliasing in case we have it unaliased already. Signed-off-by: Izik Eidus <ieidus@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: SVM: Set the 'busy' flag of the TR selectorAmit Shah
(cherry picked from c0d09828c870f90c6bc72070ada281568f89c63b) The busy flag of the TR selector is not set by the hardware. This breaks migration from amd hosts to intel hosts. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: SVM: Set the 'g' bit of the cs selector for cross-vendor migrationAmit Shah
(cherry picked from 25022acc3dd5f0b54071c7ba7c371860f2971b52) The hardware does not set the 'g' bit of the cs selector and this breaks migration from amd hosts to intel hosts. Set this bit if the segment limit is beyond 1 MB. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: VMX: Move private memory slot positionSheng Yang
(cherry picked from 6fe639792c7b8e462baeaac39ecc33541fd5da6e) PCI device assignment would map guest MMIO spaces as separate slot, so it is possible that the device has more than 2 MMIO spaces and overwrite current private memslot. The patch move private memory slot to the top of userspace visible memory slots. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: MMU: Extend kvm_mmu_page->slot_bitmap sizeSheng Yang
(cherry picked from 291f26bc0f89518ad7ee3207c09eb8a743ac8fcc) Otherwise set_bit() for private memory slot(above KVM_MEMORY_SLOTS) would corrupted memory in 32bit host. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: call kvm_arch_vcpu_reset() instead of the kvm_x86_ops callbackGleb Natapov
(cherry picked from 5f179287fa02723215eecf681d812b303c243973) Call kvm_arch_vcpu_reset() instead of directly using arch callback. The function does additional things. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02KVM: x86: Reset pending/inject NMI state on CPU resetJan Kiszka
(cherry picked from 448fa4a9c5dbc6941dd19ed09692c588d815bb06) CPU reset invalidates pending or already injected NMIs, therefore reset the related state variables. Based on original patch by Gleb Natapov. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02anon_inodes: use fops->owner for module refcountChristian Borntraeger
There is an imbalance for anonymous inodes. If the fops->owner field is set, the module reference count of owner is decreases on release. ("filp_close" --> "__fput" ---> "fops_put") On the other hand, anon_inode_getfd does not increase the module reference count of owner. This causes two problems: - if owner is set, the module refcount goes negative - if owner is not set, the module can be unloaded while code is running This patch changes anon_inode_getfd to be symmetric regarding fops->owner handling. I have checked all existing users of anon_inode_getfd. Noone sets fops->owner, thats why nobody has seen the module refcount negative. The refcounting was tested with a patched and unpatched KVM module.(see patch 2/2) I also did an epoll_open/close test. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@redhat.com> (cherry picked from commit e3a2a0d4e5ace731e60e2eff4fb7056ecb34adc1) Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fbJens Axboe
commit 78f707bfc723552e8309b7c38a8d0cc51012e813 upstream. The above commit added WRITE_SYNC and switched various places to using that for committing writes that will be waited upon immediately after submission. However, this causes a performance regression with AS and CFQ for ext3 at least, since sync_dirty_buffer() will submit some writes with WRITE_SYNC while ext3 has sumitted others dependent writes without the sync flag set. This causes excessive anticipation/idling in the IO scheduler because sync and async writes get interleaved, causing a big performance regression for the below test case (which is meant to simulate sqlite like behaviour). ---- test case ---- int main(int argc, char **argv) { int fdes, i; FILE *fp; struct timeval start; struct timeval end; struct timeval res; gettimeofday(&start, NULL); for (i=0; i<ROWS; i++) { fp = fopen("test_file", "a"); fprintf(fp, "Some Text Data\n"); fdes = fileno(fp); fsync(fdes); fclose(fp); } gettimeofday(&end, NULL); timersub(&end, &start, &res); fprintf(stdout, "time to write %d lines is %ld(msec)\n", ROWS, (res.tv_sec*1000000 + res.tv_usec)/1000); return 0; } ------------------- Thanks to Sean.White@APCC.com for tracking down this performance regression and providing a test case. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02hugetlbfs: return negative error code for bad mount optionAkinobu Mita
upstream commit: c12ddba09394c60e1120e6997794fa6ed52da884 This fixes the following BUG: # mount -o size=MM -t hugetlbfs none /huge hugetlbfs: Bad value 'MM' for mount option 'size=MM' ------------[ cut here ]------------ kernel BUG at fs/super.c:996! Due to BUG_ON(!mnt->mnt_sb); in vfs_kern_mount(). Also, remove unused #include <linux/quotaops.h> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02agp: zero pages before sending to userspaceShaohua Li
upstream commit: 59de2bebabc5027f93df999d59cc65df591c3e6e CVE-2009-1192 AGP pages might be mapped into userspace finally, so the pages should be set to zero before userspace can use it. Otherwise there is potential information leakage. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02r8169: Reset IntrStatus after chip resetFrancois Romieu
upstream commit: d78ad8cbfe73ad568de38814a75e9c92ad0a907c Original comment (Karsten): On a MSI MS-6702E mainboard, when in rtl8169_init_one() for the first time after BIOS has run, IntrStatus reads 5 after chip has been reset. IntrStatus should equal 0 there, so patch changes IntrStatus reset to happen after chip reset instead of before. Remark (Francois): Assuming that the loglevel of the driver is increased above NETIF_MSG_INTR, the bug reveals itself with a typical "interrupt 0025 in poll" message at startup. In retrospect, the message should had been read as an hint of an unexpected hardware state several months ago :o( Fixes (at least part of) https://bugzilla.redhat.com/show_bug.cgi?id=460747 Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Josep <josep.puigdemont@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02Input: gameport - fix attach driver codeDmitry Torokhov
upstream commit: 4ced8e7cb990a2c3bbf0ac7f27b35c890e7ce895 The commit 6902c0bead4ce266226fc0c5b3828b850bdc884a that moved driver registration out of kgameportd thread was incomplete and did not add the code necessary to actually attach driver to already registered devices, rectify that. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02USB: usb-storage: augment unusual_devs entry for Simple Tech/DatafabAlan Stern
upstream commit: e4813eec8d47c8299d968bd5349dc881fa481c26 This patch (as1227) adds the MAX_SECTORS_64 flag to the unusual_devs entry for the Simple Tech/Datafab controller. This fixes Bugzilla #12882. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: binbin <binbinsh@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-05-02USB: fix oops in cdc-wdm in case of malformed descriptorsOliver Neukum
upstream commit: e13c594f3a1fc2c78e7a20d1a07974f71e4b448f cdc-wdm needs to ignore extremely malformed descriptors. Signed-off-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-05-02USB: ftdi_sio: add vendor/project id for JETI specbos 1201 spectrometerPeter Korsgaard
upstream commit: ae27d84351f1f3568118318a8c40ff3a154bd629 Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-05-02usb gadget: fix ethernet link reports to ethtoolJonathan McDowell
upstream commit: 237e75bf1e558f7330f8deb167fa3116405bef2c The g_ether USB gadget driver currently decides whether or not there's a link to report back for eth_get_link based on if the USB link speed is set. The USB gadget speed is however often set even before the device is enumerated. It seems more sensible to only report a "link" if we're actually connected to a host that wants to talk to us. The patch below does this for me - tested with the PXA27x UDC driver. Signed-off-by: Jonathan McDowell <noodles@earth.li> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2009-05-02SCSI: sg: avoid blk_put_request/blk_rq_unmap_user in interruptFUJITA Tomonori
upstream commit: c96952ed7031e7c576ecf90cf95b8ec099d5295a This fixes the following oops: http://marc.info/?l=linux-kernel&m=123316111415677&w=2 You can reproduce this bug by interrupting a program before a sg response completes. This leads to the special sg state (the orphan state), then sg calls blk_put_request in interrupt (rq->end_io). The above bug report shows the recursive lock problem because sg calls blk_put_request in interrupt. We could call __blk_put_request here instead however we also need to handle blk_rq_unmap_user here, which can't be called in interrupt too. In the orphan state, we don't need to care about the data transfer (the program revoked the command) so adding 'just free the resource' mode to blk_rq_unmap_user is a possible option. I prefer to avoid complicating the blk mapping API when possible. I change the orphan state to call sg_finish_rem_req via execute_in_process_context. We hold sg_fd->kref so sg_fd doesn't go away until keventd_wq finishes our work. copy_from_user/to_user fails so blk_rq_unmap_user just frees the resource without the data transfer. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02SCSI: sg: fix races with ioctl(SG_IO)Tony Battersby
upstream commit: a2dd3b4cea335713b58996bb07b3abcde1175f47 sg_io_owned needs to be set before the command is sent to the midlevel; otherwise, a quickly-completing command may cause a different CPU to see "srp->done == 1 && !srp->sg_io_owned", which would lead to incorrect behavior. Check srp->done and set srp->orphan while holding rq_list_lock to prevent races with sg_rq_end_io(). There is no need to check sfp->closed from read/write/ioctl/poll/etc. since the kernel guarantees that this won't happen. The usefulness of sg_srp_done() was questionable before; now it is definitely not needed. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02SCSI: sg: fix races during device removalTony Battersby
upstream commit: c6517b7942fad663cc1cf3235cbe4207cf769332 sg has the following problems related to device removal: * opening a sg fd races with removing a device * closing a sg fd races with removing a device * /proc/scsi/sg/* access races with removing a device * command completion races with removing a device * command completion races with closing a sg fd * can rmmod sg with active commands These problems can cause kernel oopses, memory-use-after-free, or double-free errors. This patch fixes these problems by using krefs to manage the lifetime of sg_device and sg_fd. Each command submitted to the midlevel holds a reference to sg_fd until the completion callback. This ensures that sg_fd doesn't go away if the fd is closed with commands still outstanding. sg_fd gets the reference of sg_device (with scsi_device) and also makes sure that the sg module doesn't go away. /proc/scsi/sg/* functions don't play nicely with krefs because they give information about sg_fds which have been closed but not yet freed due to still having outstanding commands and sg_devices which have been removed but not yet freed due to still being referenced by one or more sg_fds. To deal with this safely without removing functionality, /proc functions now access sg_device and sg_fd while holding a lock instead of using kref_get()/kref_put(). Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> [chrisw: big for -stable, helps fix real bug, and made it through rc2 upstream] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02mm: pass correct mm when growing stackHugh Dickins
upstream commit: 05fa199d45c54a9bda7aa3ae6537253d6f097aa9 Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in security_vm_enough_memory(), when do_execve() is touching the target mm's stack, to set up its args and environment. Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns an mm-less kernel thread to do the exec. And in any case, that vm_enough_memory check when growing stack ought to be done on the target mm, not on the execer's mm (though apart from the warning, it only makes a slight tweak to OVERCOMMIT_NEVER behaviour). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-02pata_hpt37x: fix HPT370 DMA timeoutsSergei Shtylyov
upstream commit: 265b7215aed36941620b65ecfff516200fb190c1 The libata driver has copied the code from the IDE driver which caused a post 2.4.18 regression on many HPT370[A] chips -- DMA stopped to work completely, only causing timeouts. Now remove hpt370_bmdma_start() for good... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>