summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2009-07-27mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()Benjamin Herrenschmidt
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() Upcoming paches to support the new 64-bit "BookE" powerpc architecture will need to have the virtual address corresponding to PTE page when freeing it, due to the way the HW table walker works. Basically, the TLB can be loaded with "large" pages that cover the whole virtual space (well, sort-of, half of it actually) represented by a PTE page, and which contain an "indirect" bit indicating that this TLB entry RPN points to an array of PTEs from which the TLB can then create direct entries. Thus, in order to invalidate those when PTE pages are deleted, we need the virtual address to pass to tlbilx or tlbivax instructions. The old trick of sticking it somewhere in the PTE page struct page sucks too much, the address is almost readily available in all call sites and almost everybody implemets these as macros, so we may as well add the argument everywhere. I added it to the pmd and pud variants for consistency. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV] Acked-by: Nick Piggin <npiggin@suse.de> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-22Merge branch 'perf-counters-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-perf * 'perf-counters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-perf: (31 commits) perf_counter tools: Give perf top inherit option perf_counter tools: Fix vmlinux symbol generation breakage perf_counter: Detect debugfs location perf_counter: Add tracepoint support to perf list, perf stat perf symbol: C++ demangling perf: avoid structure size confusion by using a fixed size perf_counter: Fix throttle/unthrottle event logging perf_counter: Improve perf stat and perf record option parsing perf_counter: PERF_SAMPLE_ID and inherited counters perf_counter: Plug more stack leaks perf: Fix stack data leak perf_counter: Remove unused variables perf_counter: Make call graph option consistent perf_counter: Add perf record option to log addresses perf_counter: Log vfork as a fork event perf_counter: Synthesize VDSO mmap event perf_counter: Make sure we dont leak kernel memory to userspace perf_counter tools: Fix index boundary check perf_counter: Fix the tracepoint channel to perfcounters perf_counter, x86: Extend perf_counter Pentium M support ...
2009-07-22Merge branch 'core-fixes-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softirq: introduce tasklet_hrtimer infrastructure
2009-07-22Merge branch 'irq-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Delegate irq affinity setting to the irq thread
2009-07-22Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix nr_uninterruptible accounting of frozen tasks really sched: fix load average accounting vs. cpu hotplug sched: Account for vruntime wrapping
2009-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) sky2: Avoid races in sky2_down drivers/net/mlx4: Adjust constant drivers/net: Move a dereference below a NULL test drivers/net: Move a dereference below a NULL test connector: maintainer/mail update. USB host CDC Phonet network interface driver macsonic, jazzsonic: fix oops on module unload macsonic: move probe function to .devinit.text can: switch carrier on if device was stopped while in bus-off state can: restart device even if dev_alloc_skb() fails can: sja1000: remove duplicated includes New device ID for sc92031 [1088:2031] 3c589_cs: re-initialize the multicast in the tc589_reset Fix error return for setsockopt(SO_TIMESTAMPING) netxen: fix thermal check and shutdown netxen: fix deadlock on dev close netxen: fix context deletion sequence net: Micrel KS8851 SPI network driver tcp: Use correct peer adr when copying MD5 keys tcp: Fix MD5 signature checking on IPv4 mapped sockets ...
2009-07-22perf_counter: PERF_SAMPLE_ID and inherited countersPeter Zijlstra
Anton noted that for inherited counters the counter-id as provided by PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID because each inherited counter gets its own id. His suggestion was to always return the parent counter id, since that is the primary counter id as exposed. However, these inherited counters have a unique identifier so that events like PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which counter gets modified, which is important when trying to normalize the sample streams. This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD, which is more useful anyway, since changing periods became a lot more common than initially thought -- rendering PERF_EVENT_PERIOD the less useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate value, since it reports the value used to trigger the overflow, whereas PERF_EVENT_PERIOD simply reports the requested period changed, which might only take effect on the next cycle). This still leaves us PERF_EVENT_THROTTLE to consider, but since that _should_ be a rare occurrence, and linking it to a primary id is the most useful bit to diagnose the problem, we introduce a PERF_SAMPLE_STREAM_ID, for those few cases where the full reconstruction is important. [Does change the ABI a little, but I see no other way out] Suggested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1248095846.15751.8781.camel@twins>
2009-07-22softirq: introduce tasklet_hrtimer infrastructurePeter Zijlstra
commit ca109491f (hrtimer: removing all ur callback modes) moved all hrtimer callbacks into hard interrupt context when high resolution timers are active. That breaks code which relied on the assumption that the callback happens in softirq context. Provide a generic infrastructure which combines tasklets and hrtimers together to provide an in-softirq hrtimer experience. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: kaber@trash.net Cc: David Miller <davem@davemloft.net> LKML-Reference: <1248265724.27058.1366.camel@twins> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-21genirq: Delegate irq affinity setting to the irq threadThomas Gleixner
irq_set_thread_affinity() calls set_cpus_allowed_ptr() which might sleep, but irq_set_thread_affinity() is called with desc->lock held and can be called from hard interrupt context as well. The code has another bug as it does not hold a ref on the task struct as required by set_cpus_allowed_ptr(). Just set the IRQTF_AFFINITY bit in action->thread_flags. The next time the thread runs it migrates itself. Solves all of the above problems nicely. Add kerneldoc to irq_set_thread_affinity() while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission>
2009-07-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixesLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: vmlinux.lds.h: restructure BSS linker script macros kconfig: initialize the screen before using curses(3) functions kconfig: variable argument lists needs `stdarg.h' kbuild, deb-pkg: fix install scripts for posix sh
2009-07-20tcp: Fix MD5 signature checking on IPv4 mapped socketsJohn Dykstra
Fix MD5 signature checking so that an IPv4 active open to an IPv6 socket can succeed. In particular, use the correct address family's signature generation function for the SYN/ACK. Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-18sched: fix nr_uninterruptible accounting of frozen tasks reallyThomas Gleixner
commit e3c8ca8336 (sched: do not count frozen tasks toward load) broke the nr_uninterruptible accounting on freeze/thaw. On freeze the task is excluded from accounting with a check for (task->flags & PF_FROZEN), but that flag is cleared before the task is thawed. So while we prevent that the task with state TASK_UNINTERRUPTIBLE is accounted to nr_uninterruptible on freeze we decrement nr_uninterruptible on thaw. Use a separate flag which is handled by the freezing task itself. Set it before calling the scheduler with TASK_UNINTERRUPTIBLE state and clear it after we return from frozen state. Cc: <stable@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-17Merge branch 'drm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm: Move a dereference below a NULL test fb/intelfb: conflict with DRM_I915 and hide by default drm/ttm: fix misplaced parentheses drm/via: Fix vblank IRQ on VIA hardware. drm: drm_gem, check kzalloc retval drm: drm_debugfs, check kmalloc retval drm/radeon: add some missing pci ids
2009-07-18vmlinux.lds.h: restructure BSS linker script macrosTim Abbott
The BSS section macros in vmlinux.lds.h currently place the .sbss input section outside the bounds of [__bss_start, __bss_end]. On all architectures except for microblaze that handle both .sbss and __bss_start/__bss_end, this is wrong: the .sbss input section is within the range [__bss_start, __bss_end]. Relatedly, the example code at the top of the file actually has __bss_start/__bss_end defined twice; I believe the right fix here is to define them in the BSS_SECTION macro but not in the BSS macro. Another problem with the current macros is that several architectures have an ALIGN(4) or some other small number just before __bss_stop in their linker scripts. The BSS_SECTION macro currently hardcodes this to 4; while it should really be an argument. It also ignores its sbss_align argument; fix that. mn10300 is the only user at present of any of the macros touched by this patch. It looks like mn10300 actually was incorrectly converted to use the new BSS() macro (the alignment of 4 prior to conversion was a __bss_stop alignment, but the argument to the BSS macro is a start alignment). So fix this as well. I'd like acks from Sam and David on this one. Also CCing Paul, since he has a patch from me which will need to be updated to use BSS_SECTION(0, PAGE_SIZE, 4) once this gets merged. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-07-17virtio_net: Sync header with qemuAlex Williamson
Qemu added support for a few extra RX modes that Linux doesn't currently make use of. Sync the headers to maintain consistency. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-07-17lguest: fix journeyMatias Zabaljauregui
fix: "make Guest" was complaining about duplicated G:032 Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-07-16net: sock_copy() fixesEric Dumazet
Commit e912b1142be8f1e2c71c71001dc992c6e5eb2ec1 (net: sk_prot_alloc() should not blindly overwrite memory) took care of not zeroing whole new socket at allocation time. sock_copy() is another spot where we should be very careful. We should not set refcnt to a non null value, until we are sure other fields are correctly setup, or a lockless reader could catch this socket by mistake, while not fully (re)initialized. This patch puts sk_node & sk_refcnt to the very beginning of struct sock to ease sock_copy() & sk_prot_alloc() job. We add appropriate smp_wmb() before sk_refcnt initializations to match our RCU requirements (changes to sock keys should be committed to memory before sk_refcnt setting) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-16Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timer stats: fix quick check optimization
2009-07-16vt: drop bootmem/slab memory distinctionJohannes Weiner
Bootmem is not used for the vt screen buffer anymore as slab is now available at the time the console is initialized. Get rid of the now superfluous distinction between slab and bootmem, it's always slab. This also fixes a kmalloc leak which Catalin described thusly: Commit a5f4f52e ("vt: use kzalloc() instead of the bootmem allocator") replaced the alloc_bootmem() with kzalloc() but didn't set vc_kmalloced to 1 and the memory block is later leaked. The corresponding kmemleak trace: unreferenced object 0xdf828000 (size 8192): comm "swapper", pid 0, jiffies 4294937296 backtrace: [<c006d473>] __save_stack_trace+0x17/0x1c [<c000d869>] log_early+0x55/0x84 [<c01cfa4b>] kmemleak_alloc+0x33/0x3c [<c006c013>] __kmalloc+0xd7/0xe4 [<c00108c7>] con_init+0xbf/0x1b8 [<c0010149>] console_init+0x11/0x20 [<c0008797>] start_kernel+0x137/0x1e4 Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-15Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ahci: add device ID for 82801JI sata controller drivers/ata: Move a dereference below a NULL test libata: implement and use HORKAGE_NOSETXFER, take#2 libata: fix follow-up SRST failure path
2009-07-15drm/radeon: add some missing pci idsAlex Deucher
Also, fix ordering for a couple others Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@linux.ie>
2009-07-14libata: implement and use HORKAGE_NOSETXFER, take#2Tejun Heo
PIONEER DVD-RW DVRTD08 times out SETXFER if no media is present. The device is SATA and simply skipping SETXFER works around the problem. Implement ATA_HORKAGE_NOSETXFER and apply it to the device. Reported by Moritz Rigler in the following thread. http://thread.gmane.org/gmane.linux.ide/36790 and by Lars in bko#9540. Updated to whine and ignore NOSETXFER if PATA component is detected as suggested by Alan Cox. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Moritz Rigler <linux-ide@momail.e4ward.com> Reported-by: Lars <lars21ce@gmx.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-07-14Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: Fix migration expiry check hrtimer: migration: do not check expiry time on current CPU
2009-07-14Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing/function-profiler: do not free per cpu variable stat tracing/events: Move TRACE_SYSTEM outside of include guard
2009-07-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: Revert "NET: Fix locking issues in PPP, 6pack, mkiss and strip line disciplines." skbuff.h: Fix comment for NET_IP_ALIGN drivers/net: using spin_lock_irqsave() in net_send_packet() NET: phy_device, fix lock imbalance gre: fix ToS/DiffServ inherit bug igb: gcc-3.4.6 fix atlx: duplicate testing of MCAST flag NET: Fix locking issues in PPP, 6pack, mkiss and strip line disciplines. netdev: restore MTU change operation netdev: restore MAC address set and validate operations sit: fix regression: do not release skb->dst before xmit net: ip_push_pending_frames() fix net: sk_prot_alloc() should not blindly overwrite memory
2009-07-14skbuff.h: Fix comment for NET_IP_ALIGNTobias Klauser
Use the correct function call for skb_reserve in the comment for NET_IP_ALIGN. Signed-off-by: Tobias Klauser <klto@zhaw.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-13Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix race between write_metadata_buffer and get_write_access ext4: Fix ext4_mb_initialize_context() to initialize all fields ext4: fix null handler of ioctls in no journal mode ext4: Fix buffer head reference leak in no-journal mode ext4: Move __ext4_journalled_writepage() to avoid forward declaration ext4: Fix mmap/truncate race when blocksize < pagesize && !nodellaoc ext4: Fix mmap/truncate race when blocksize < pagesize && delayed allocation ext4: Don't look at buffer_heads outside i_size. ext4: Fix goal inum check in the inode allocator ext4: fix no journal corruption with locale-gen ext4: Calculate required journal credits for inserting an extent properly ext4: Fix truncation of symlinks after failed write jbd2: Fix a race between checkpointing code and journal_get_write_access() ext4: Use rcu_barrier() on module unload. ext4: naturally align struct ext4_allocation_request ext4: mark several more functions in mballoc.c as noinline ext4: Fix potential reclaim deadlock when truncating partial block jbd2: Remove GFP_ATOMIC kmalloc from inside spinlock critical region ext4: Fix type warning on 64-bit platforms in tracing events header
2009-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: wm97xx_batery: replace driver_data with dev_get_drvdata() omap: video: remove direct access of driver_data Sound: remove direct access of driver_data driver model: fix show/store prototypes in doc. Firmware: firmware_class, fix lock imbalance Driver Core: remove BUS_ID_SIZE sparc: remove driver-core BUS_ID_SIZE partitions: fix broken uevent_suppress conversion devres: WARN() and return, don't crash on device_del() of uninitialized device
2009-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (48 commits) USB: otg: fix module reinsert issue USB: handle zero-length usbfs submissions correctly USB: EHCI: report actual_length for iso transfers USB: option: remove unnecessary and erroneous code USB: cypress_m8: remove invalid Clear-Halt USB: musb_host: undo incorrect change in musb_advance_schedule() USB: fix LANGID=0 regression USB: serial: sierra driver id_table additions USB serial: Add ID for Turtelizer, an FT2232L-based JTAG/RS-232 adapter. USB: fix race leading to a write after kfree in usbfs USB: Sierra: fix oops upon device close USB: option.c: add A-Link 3GU device id USB: Serial: Add support for Arkham Technology adapters USB: Fix option_ms regression in 2.6.31-rc2 USB: gadget audio: select SND_PCM USB: ftdi: support NDI devices Revert USB: usbfs: deprecate and hide option for !embedded USB: usb.h: fix kernel-doc notation USB: RNDIS gadget, fix issues talking from PXA USB: serial: FTDI with product code FB80 and vendor id 0403 ...
2009-07-13tracing/events: Move TRACE_SYSTEM outside of include guardLi Zefan
If TRACE_INCLDUE_FILE is defined, <trace/events/TRACE_INCLUDE_FILE.h> will be included and compiled, otherwise it will be <trace/events/TRACE_SYSTEM.h> So TRACE_SYSTEM should be defined outside of #if proctection, just like TRACE_INCLUDE_FILE. Imaging this scenario: #include <trace/events/foo.h> -> TRACE_SYSTEM == foo ... #include <trace/events/bar.h> -> TRACE_SYSTEM == bar ... #define CREATE_TRACE_POINTS #include <trace/events/foo.h> -> TRACE_SYSTEM == bar !!! and then bar.h will be included and compiled. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A5A9CF1.2010007@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-12USB: usb.h: fix kernel-doc notationRandy Dunlap
Fix usb.h kernel-doc warnings: Warning(include/linux/usb.h:918): Excess struct/union/enum/typedef member 'nodename' description in 'usb_device_driver' Warning(include/linux/usb.h:939): No description found for parameter 'nodename' Warning(include/linux/usb.h:1219): No description found for parameter 'sg' Warning(include/linux/usb.h:1219): No description found for parameter 'num_sgs' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-12Revert "USB: Add Intel Langwell USB OTG Transceiver Drive"Greg Kroah-Hartman
This reverts commit 453f77558810ffa669ed5a510a7173ec49def396. The driver should not have been accepted as the MSRT code is not in the main kernel yet, which this depends on. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Hao Wu <hao.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-12Driver Core: remove BUS_ID_SIZEKay Sievers
The name size limit is gone from the driver-core, this is the removal of the last left-over. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-12Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6Linus Torvalds
* 'kmemleak' of git://linux-arm.org/linux-2.6: kmemleak: Remove alloc_bootmem annotations introduced in the past kmemleak: Add callbacks to the bootmem allocator kmemleak: Allow partial freeing of memory blocks kmemleak: Trace the kmalloc_large* functions in slub kmemleak: Scan objects allocated during a scanning episode kmemleak: Do not acquire scan_mutex in kmemleak_open() kmemleak: Remove the reported leaks number limitation kmemleak: Add more cond_resched() calls in the scanning thread kmemleak: Renice the scanning thread to +10
2009-07-12headers: smp_lock.h reduxAlexey Dobriyan
* Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12personality: fix PER_CLEAR_ON_SETIDJulien Tinnes
We have found that the current PER_CLEAR_ON_SETID mask on Linux doesn't include neither ADDR_COMPAT_LAYOUT, nor MMAP_PAGE_ZERO. The current mask is READ_IMPLIES_EXEC|ADDR_NO_RANDOMIZE. We believe it is important to add MMAP_PAGE_ZERO, because by using this personality it is possible to have the first page mapped inside a process running as setuid root. This could be used in those scenarios: - Exploiting a NULL pointer dereference issue in a setuid root binary - Bypassing the mmap_min_addr restrictions of the Linux kernel: by running a setuid binary that would drop privileges before giving us control back (for instance by loading a user-supplied library), we could get the first page mapped in a process we control. By further using mremap and mprotect on this mapping, we can then completely bypass the mmap_min_addr restrictions. Less importantly, we believe ADDR_COMPAT_LAYOUT should also be added since on x86 32bits it will in practice disable most of the address space layout randomization (only the stack will remain randomized). Signed-off-by: Julien Tinnes <jt@cr0.org> Signed-off-by: Tavis Ormandy <taviso@sdf.lonestar.org> Cc: stable@kernel.org Acked-by: Christoph Hellwig <hch@infradead.org> Acked-by: Kees Cook <kees@ubuntu.com> Acked-by: Eugene Teo <eugene@redhat.com> [ Shortened lines and fixed whitespace as per Christophs' suggestion ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-11Fix compile error due to congestion_wait() changesTrond Myklebust
Move the definition of BLK_RW_ASYNC/BLK_RW_SYNC into linux/backing-dev.h so that it is available to all callers of set/clear_bdi_congested(). This replaces commit 097041e576ee3a50d92dd643ee8ca65bf6a62e21 ("fuse: Fix build error"), which will be reverted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10tty: Fix USB kref leakAlan Cox
The sysrq code acquired a kref leak. Fix it by passing the tty separately from the caller (thus effectively using the callers kref which all the callers hold anyway) Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cfq-iosched: reset oom_cfqq in cfq_set_request() block: fix sg SG_DXFER_TO_FROM_DEV regression block: call blk_scsi_ioctl_init() Fix congestion_wait() sync/async vs read/write confusion
2009-07-10Merge branch 'core-fixes-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: dma-debug: Fix the overlap() function to be correct and readable oprofile: reset bt_lost_no_mapping with other stats x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon signals: declare sys_rt_tgsigqueueinfo in syscalls.h rcu: Mark Hierarchical RCU no longer experimental dma-debug: Put all hash-chain locks into the same lock class dma-debug: fix off-by-one error in overlap function
2009-07-10sched: optimize cond_resched()Peter Zijlstra
Optimize cond_resched() by removing one conditional. Currently cond_resched() checks system_state == SYSTEM_RUNNING in order to avoid scheduling before the scheduler is running. We can however, as per suggestion of Matt, use PREEMPT_ACTIVE to accomplish that very same. Suggested-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10sched: INIT_PREEMPT_COUNTPeter Zijlstra
Pull the initial preempt_count value into a single definition site. Maintainers for: alpha, ia64 and m68k, please have a look, your arch code is funny. The header magic is a bit odd, but similar to the KERNEL_DS one, CPP waits with expanding these macros until the INIT_THREAD_INFO macro itself is expanded, which is in arch/*/kernel/init_task.c where we've already included sched.h so we're good. Cc: tony.luck@intel.com Cc: rth@twiddle.net Cc: geert@linux-m68k.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10block: fix sg SG_DXFER_TO_FROM_DEV regressionFUJITA Tomonori
I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use the block layer mapping API (2.6.28). Douglas Gilbert explained SG_DXFER_TO_FROM_DEV: http://www.spinics.net/lists/linux-scsi/msg37135.html = The semantics of SG_DXFER_TO_FROM_DEV were: - copy user space buffer to kernel (LLD) buffer - do SCSI command which is assumed to be of the DATA_IN (data from device) variety. This would overwrite some or all of the kernel buffer - copy kernel (LLD) buffer back to the user space. The idea was to detect short reads by filling the original user space buffer with some marker bytes ("0xec" it would seem in this report). The "resid" value is a better way of detecting short reads but that was only added this century and requires co-operation from the LLD. = This patch changes the block layer mapping API to support this semantics. This simply adds another field to struct rq_map_data and enables __bio_copy_iov() to copy data from user space even with READ requests. It's better to add the flags field and kills null_mapped and the new from_user fields in struct rq_map_data but that approach makes it difficult to send this patch to stable trees because st and osst drivers use struct rq_map_data (they were converted to use the block layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer mapping API. zhou sf reported this regiression and tested this patch: http://www.spinics.net/lists/linux-scsi/msg37128.html http://www.spinics.net/lists/linux-scsi/msg37168.html Reported-by: zhou sf <sxzzsf@gmail.com> Tested-by: zhou sf <sxzzsf@gmail.com> Cc: stable@kernel.org Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-10Fix congestion_wait() sync/async vs read/write confusionJens Axboe
Commit 1faa16d22877f4839bd433547d770c676d1d964c accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-10timer stats: fix quick check optimizationHeiko Carstens
git commit 507e1231 "timer stats: Optimize by adding quick check to avoid function calls" added one wrong check so that one unnecessary function call isn't elimated. time_stats_account_hrtimer() checks if timer->start_pid isn't initialized in order to find out if timer_stats_update_stats() should be called. However start_pid is initialized with -1 instead of 0, so that the function call always happens. Check timer->start_site like in timer_stats_account_timer() to fix this. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-10hrtimer: Fix migration expiry checkThomas Gleixner
The timer migration expiry check should prevent the migration of a timer to another CPU when the timer expires before the next event is scheduled on the other CPU. Migrating the timer might delay it because we can not reprogram the clock event device on the other CPU. But the code implementing that check has two flaws: - for !HIGHRES the check compares the expiry value with the clock events device expiry value which is wrong for CLOCK_REALTIME based timers. - the check is racy. It holds the hrtimer base lock of the target CPU, but the clock event device expiry value can be modified nevertheless, e.g. by an timer interrupt firing. The !HIGHRES case is easy to fix as we can enqueue the timer on the cpu which was selected by the load balancer. It runs the idle balancing code once per jiffy anyway. So the maximum delay for the timer is the same as when we keep the tick on the current cpu going. In the HIGHRES case we can get the next expiry value from the hrtimer cpu_base of the target CPU and serialize the update with the cpu_base lock. This moves the lock section in hrtimer_interrupt() so we can set next_event to KTIME_MAX while we are handling the expired timers and set it to the next expiry value after we handled the timers under the base lock. While the expired timers are processed timer migration is blocked because the expiry time of the timer is always <= KTIME_MAX. Also remove the now useless clockevents_get_next_event() function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits) cxgb3: Fix crash caused by stashing wrong netdev_queue ixgbe: Fix coexistence of FCoE and Flow Director in 82599 memory barrier: adding smp_mb__after_lock net: adding memory barrier to the poll and receive callbacks netpoll: Fix carrier detection for drivers that are using phylib includecheck fix: include/linux, rfkill.h p54: tx refused but queue active Atheros Kconfig needs to be dependent on WLAN_80211 mac80211: fix docbook mac80211_hwsim: avoid NULL access ssb: Add support for 4318E b43: Add support for 4318E zd1211rw: adding SONY IFU-WLM2 (054c:0257) as a zd1211b device zd1211rw: 07b8:6001 is a ZD1211B r6040: bump driver version to 0.24 and date to 08 July 2009 r6040: restore MIER register correctly when IRQ line is shared ipv4: Fix fib_trie rebalancing, part 4 (root thresholds) davinci_emac: fix kernel oops when changing MAC address while interface is down igb: set lan id prior to configuring phy mac80211: minstrel: avoid accessing negative indices in rix_to_ndx() ...
2009-07-09memory barrier: adding smp_mb__after_lockJiri Olsa
Adding smp_mb__after_lock define to be used as a smp_mb call after a lock. Making it nop for x86, since {read|write|spin}_lock() on x86 are full memory barriers. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-09net: adding memory barrier to the poll and receive callbacksJiri Olsa
Adding memory barrier after the poll_wait function, paired with receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper to wrap the memory barrier. Without the memory barrier, following race can happen. The race fires, when following code paths meet, and the tp->rcv_nxt and __add_wait_queue updates stay in CPU caches. CPU1 CPU2 sys_select receive packet ... ... __add_wait_queue update tp->rcv_nxt ... ... tp->rcv_nxt check sock_def_readable ... { schedule ... if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) wake_up_interruptible(sk->sk_sleep) ... } If there was no cache the code would work ok, since the wait_queue and rcv_nxt are opposit to each other. Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already passed the tp->rcv_nxt check and sleeps, or will get the new value for tp->rcv_nxt and will return with new data mask. In both cases the process (CPU1) is being added to the wait queue, so the waitqueue_active (CPU2) call cannot miss and will wake up CPU1. The bad case is when the __add_wait_queue changes done by CPU1 stay in its cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then endup calling schedule and sleep forever if there are no more data on the socket. Calls to poll_wait in following modules were ommited: net/bluetooth/af_bluetooth.c net/irda/af_irda.c net/irda/irnet/irnet_ppp.c net/mac80211/rc80211_pid_debugfs.c net/phonet/socket.c net/rds/af_rds.c net/rfkill/core.c net/sunrpc/cache.c net/sunrpc/rpc_pipe.c net/tipc/socket.c Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08includecheck fix: include/linux, rfkill.hJaswinder Singh Rajput
fix the following 'make includecheck' warning: include/linux/rfkill.h: linux/types.h is included more than once. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>