summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2014-07-09usb-storage/SCSI: Add broken_fua blacklist flagAlan Stern
commit b14bf2d0c0358140041d1c1805a674376964d0e0 upstream. Some buggy JMicron USB-ATA bridges don't know how to translate the FUA bit in READs or WRITEs. This patch adds an entry in unusual_devs.h and a blacklist flag to tell the sd driver not to use FUA. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Michael Büsch <m@bues.ch> Tested-by: Michael Büsch <m@bues.ch> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-06ptrace,x86: force IRET path after a ptrace_stop()Tejun Heo
commit b9cd18de4db3c9ffa7e17b0dc0ca99ed5aa4d43a upstream. The 'sysret' fastpath does not correctly restore even all regular registers, much less any segment registers or reflags values. That is very much part of why it's faster than 'iret'. Normally that isn't a problem, because the normal ptrace() interface catches the process using the signal handler infrastructure, which always returns with an iret. However, some paths can get caught using ptrace_event() instead of the signal path, and for those we need to make sure that we aren't going to return to user space using 'sysret'. Otherwise the modifications that may have been done to the register set by the tracer wouldn't necessarily take effect. Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from arch_ptrace_stop_needed() which is invoked from ptrace_stop(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30genirq: Sanitize spurious interrupt detection of threaded irqsThomas Gleixner
commit 1e77d0a1ed7417d2a5a52a7b8d32aea1833faa6c upstream. Till reported that the spurious interrupt detection of threaded interrupts is broken in two ways: - note_interrupt() is called for each action thread of a shared interrupt line. That's wrong as we are only interested whether none of the device drivers felt responsible for the interrupt, but by calling multiple times for a single interrupt line we account IRQ_NONE even if one of the drivers felt responsible. - note_interrupt() when called from the thread handler is not serialized. That leaves the members of irq_desc which are used for the spurious detection unprotected. To solve this we need to defer the spurious detection of a threaded interrupt to the next hardware interrupt context where we have implicit serialization. If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we check whether the previous interrupt requested a deferred check. If not, we request a deferred check for the next hardware interrupt and return. If set, we check whether one of the interrupt threads signaled success. Depending on this information we feed the result into the spurious detector. If one primary handler of a shared interrupt returns IRQ_HANDLED we disable the deferred check of irq threads on the same line, as we have found at least one device driver who cared. Reported-by: Till Straumann <strauman@slac.stanford.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Austin Schuh <austin@peloton-tech.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: linux-can@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30ACPI: add dynamic_debug supportBjørn Mork
commit 45fef5b88d1f2f47ecdefae6354372d440ca5c84 upstream. Commit 1a699476e258 ("ACPI / hotplug / PCI: Hotplug notifications from acpi_bus_notify()") added debug messages for a few common events. These debug messages are unconditionally enabled if CONFIG_DYNAMIC_DEBUG is defined, contrary to the documented meaning, making the ACPI system spew lots of unwanted noise on any kernel with dynamic debugging. The bug was introduced by commit fbfddae69657 ("ACPI: Add acpi_handle_<level>() interfaces"), which added the CONFIG_DYNAMIC_DEBUG dependency without respecting its meaning. Fix by adding real support for dynamic_debug. Fixes: fbfddae69657 ("ACPI: Add acpi_handle_<level>() interfaces") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30ext4: fix data integrity sync in ordered modeNamjae Jeon
commit 1c8349a17137b93f0a83f276c764a6df1b9a116e upstream. When we perform a data integrity sync we tag all the dirty pages with PAGECACHE_TAG_TOWRITE at start of ext4_da_writepages. Later we check for this tag in write_cache_pages_da and creates a struct mpage_da_data containing contiguously indexed pages tagged with this tag and sync these pages with a call to mpage_da_map_and_submit. This process is done in while loop until all the PAGECACHE_TAG_TOWRITE pages are synced. We also do journal start and stop in each iteration. journal_stop could initiate journal commit which would call ext4_writepage which in turn will call ext4_bio_write_page even for delayed OR unwritten buffers. When ext4_bio_write_page is called for such buffers, even though it does not sync them but it clears the PAGECACHE_TAG_TOWRITE of the corresponding page and hence these pages are also not synced by the currently running data integrity sync. We will end up with dirty pages although sync is completed. This could cause a potential data loss when the sync call is followed by a truncate_pagecache call, which is exactly the case in collapse_range. (It will cause generic/127 failure in xfstests) To avoid this issue, we can use set_page_writeback_keepwrite instead of set_page_writeback, which doesn't clear TOWRITE tag. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30ptrace: fix fork event messages across pid namespacesMatthew Dempsky
commit 4e52365f279564cef0ddd41db5237f0471381093 upstream. When tracing a process in another pid namespace, it's important for fork event messages to contain the child's pid as seen from the tracer's pid namespace, not the parent's. Otherwise, the tracer won't be able to correlate the fork event with later SIGTRAP signals it receives from the child. We still risk a race condition if a ptracer from a different pid namespace attaches after we compute the pid_t value. However, sending a bogus fork event message in this unlikely scenario is still a vast improvement over the status quo where we always send bogus fork event messages to debuggers in a different pid namespace than the forking process. Signed-off-by: Matthew Dempsky <mdempsky@chromium.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Julien Tinnes <jln@chromium.org> Cc: Roland McGrath <mcgrathr@chromium.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30mm: page_alloc: use word-based accesses for get/set pageblock bitmapsMel Gorman
commit e58469bafd0524e848c3733bc3918d854595e20f upstream. The test_bit operations in get/set pageblock flags are expensive. This patch reads the bitmap on a word basis and use shifts and masks to isolate the bits of interest. Similarly masks are used to set a local copy of the bitmap and then use cmpxchg to update the bitmap if there have been no other changes made in parallel. In a test running dd onto tmpfs the overhead of the pageblock-related functions went from 1.27% in profiles to 0.5%. In addition to the performance benefits, this patch closes races that are possible between: a) get_ and set_pageblock_migratetype(), where get_pageblock_migratetype() reads part of the bits before and other part of the bits after set_pageblock_migratetype() has updated them. b) set_pageblock_migratetype() and set_pageblock_skip(), where the non-atomic read-modify-update set bit operation in set_pageblock_skip() will cause lost updates to some bits changed in the set_pageblock_migratetype(). Joonsoo Kim first reported the case a) via code inspection. Vlastimil Babka's testing with a debug patch showed that either a) or b) occurs roughly once per mmtests' stress-highalloc benchmark (although not necessarily in the same pageblock). Furthermore during development of unrelated compaction patches, it was observed that frequent calls to {start,undo}_isolate_page_range() the race occurs several thousands of times and has resulted in NULL pointer dereferences in move_freepages() and free_one_page() in places where free_list[migratetype] is manipulated by e.g. list_move(). Further debugging confirmed that migratetype had invalid value of 6, causing out of bounds access to the free_list array. That confirmed that the race exist, although it may be extremely rare, and currently only fatal where page isolation is performed due to memory hot remove. Races on pageblocks being updated by set_pageblock_migratetype(), where both old and new migratetype are lower MIGRATE_RESERVE, currently cannot result in an invalid value being observed, although theoretically they may still lead to unexpected creation or destruction of MIGRATE_RESERVE pageblocks. Furthermore, things could get suddenly worse when memory isolation is used more, or when new migratetypes are added. After this patch, the race has no longer been observed in testing. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-30hugetlb: restrict hugepage_migration_support() to x86_64Naoya Horiguchi
commit c177c81e09e517bbf75b67762cdab1b83aba6976 upstream. Currently hugepage migration is available for all archs which support pmd-level hugepage, but testing is done only for x86_64 and there're bugs for other archs. So to avoid breaking such archs, this patch limits the availability strictly to x86_64 until developers of other archs get interested in enabling this feature. Simply disabling hugepage migration on non-x86_64 archs is not enough to fix the reported problem where sys_move_pages() hits the BUG_ON() in follow_page(FOLL_GET), so let's fix this by checking if hugepage migration is supported in vma_migratable(). Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Hugh Dickins <hughd@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-16fs,userns: Change inode_capable to capable_wrt_inode_uidgidAndy Lutomirski
commit 23adbe12ef7d3d4195e80800ab36b37bee28cd03 upstream. The kernel has no concept of capabilities with respect to inodes; inodes exist independently of namespaces. For example, inode_capable(inode, CAP_LINUX_IMMUTABLE) would be nonsense. This patch changes inode_capable to check for uid and gid mappings and renames it to capable_wrt_inode_uidgid, which should make it more obvious what it does. Fixes CVE-2014-4014. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-04Merge branch 'for-3.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fix from Tejun Heo: "It is very late but this is an important percpu-refcount fix from Sebastian Ott. The problem is that percpu_ref_*() used __this_cpu_*() instead of this_cpu_*(). The difference between the two is that the latter is atomic on the local cpu while the former is not. this_cpu_inc() is guaranteed to increment the percpu counter on the cpu that the operation is executed on without any synchronization; however, __this_cpu_inc() doesn't and if the local cpu invokes the function from different contexts (e.g. process and irq) of the same CPU, it's not guaranteed to actually increment as it may be implemented as rmw. This bug existed from the get-go but it hasn't been noticed earlier probably because on x86 __this_cpu_inc() is equivalent to this_cpu_inc() as both get translated into single instruction; however, s390 uses the generic rmw implementation and gets affected by the bug. Kudos to Sebastian and Heiko for diagnosing it. The change is very low risk and fixes a critical issue on the affected architectures, so I think it's a good candidate for inclusion although it's very late in the devel cycle. On the other hand, this has been broken since v3.11, so backporting it through -stable post -rc1 won't be the end of the world. I'll ping Christoph whether __this_cpu_*() ops can be better annotated so that it can trigger lockdep warning when used from multiple contexts" * 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu-refcount: fix usage of this_cpu_ops
2014-06-04percpu-refcount: fix usage of this_cpu_opsSebastian Ott
The percpu-refcount infrastructure uses the underscore variants of this_cpu_ops in order to modify percpu reference counters. (e.g. __this_cpu_inc()). However the underscore variants do not atomically update the percpu variable, instead they may be implemented using read-modify-write semantics (more than one instruction). Therefore it is only safe to use the underscore variant if the context is always the same (process, softirq, or hardirq). Otherwise it is possible to lose updates. This problem is something that Sebastian has seen within the aio subsystem which uses percpu refcounters both in process and softirq context leading to reference counts that never dropped to zeroes; even though the number of "get" and "put" calls matched. Fix this by using the non-underscore this_cpu_ops variant which provides correct per cpu atomic semantics and fixes the corrupted reference counts. Cc: Kent Overstreet <kmo@daterainc.com> Cc: <stable@vger.kernel.org> # v3.11+ Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org> References: http://lkml.kernel.org/g/alpine.LFD.2.11.1406041540520.21183@denkbrett
2014-06-03kernfs: move the last knowledge of sysfs out from kernfsJianyu Zhan
There is still one residue of sysfs remaining: the sb_magic SYSFS_MAGIC. However this should be kernfs user specific, so this patch moves it out. Kerrnfs user should specify their magic number while mouting. Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Unbreak zebra and other netlink apps, from Eric W Biederman. 2) Some new qmi_wwan device IDs, from Aleksander Morgado. 3) Fix info leak in DCB netlink handler of qlcnic driver, from Dan Carpenter. 4) inet_getid() and ipv6_select_ident() do not generate monotonically increasing ID numbers, fix from Eric Dumazet. 5) Fix memory leak in __sk_prepare_filter(), from Leon Yu. 6) Netlink leftover bytes warning message is user triggerable, rate limit it. From Michal Schmidt. 7) Fix non-linear SKB panic in ipvs, from Peter Christensen. 8) Congestion window undo needs to be performed even if only never retransmitted data is SACK'd, fix from Yuching Cheng. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits) net: filter: fix possible memory leak in __sk_prepare_filter() net: ec_bhf: Add runtime dependencies tcp: fix cwnd undo on DSACK in F-RTO netlink: Only check file credentials for implicit destinations ipheth: Add support for iPad 2 and iPad 3 team: fix mtu setting net: fix inet_getid() and ipv6_select_ident() bugs net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI net: qmi_wwan: add additional Sierra Wireless QMI devices bridge: Prevent insertion of FDB entry with disallowed vlan netlink: rate-limit leftover bytes warning and print process name bridge: notify user space after fdb update net: qmi_wwan: add Netgear AirCard 341U net: fix wrong mac_len calculation for vlans batman-adv: fix NULL pointer dereferences net/mlx4_core: Reset RoCE VF gids when guest driver goes down emac: aggregation of v1-2 PLB errors for IER register emac: add missing support of 10mbit in emac/rgmii can: only rename enabled led triggers when changing the netdev name ipvs: Fix panic due to non-linear skb ...
2014-06-02netlink: Only check file credentials for implicit destinationsEric W. Biederman
It was possible to get a setuid root or setcap executable to write to it's stdout or stderr (which has been set made a netlink socket) and inadvertently reconfigure the networking stack. To prevent this we check that both the creator of the socket and the currentl applications has permission to reconfigure the network stack. Unfortunately this breaks Zebra which always uses sendto/sendmsg and creates it's socket without any privileges. To keep Zebra working don't bother checking if the creator of the socket has privilege when a destination address is specified. Instead rely exclusively on the privileges of the sender of the socket. Note from Andy: This is exactly Eric's code except for some comment clarifications and formatting fixes. Neither I nor, I think, anyone else is thrilled with this approach, but I'm hesitant to wait on a better fix since 3.15 is almost here. Note to stable maintainers: This is a mess. An earlier series of patches in 3.15 fix a rather serious security issue (CVE-2014-0181), but they did so in a way that breaks Zebra. The offending series includes: commit aa4cf9452f469f16cea8c96283b641b4576d4a7b Author: Eric W. Biederman <ebiederm@xmission.com> Date: Wed Apr 23 14:28:03 2014 -0700 net: Add variants of capable for use on netlink messages If a given kernel version is missing that series of fixes, it's probably worth backporting it and this patch. if that series is present, then this fix is critical if you care about Zebra. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02team: fix mtu settingJiri Pirko
Now it is not possible to set mtu to team device which has a port enslaved to it. The reason is that when team_change_mtu() calls dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU event is called and team_device_event() returns NOTIFY_BAD forbidding the change. So fix this by returning NOTIFY_DONE here in case team is changing mtu in team_change_mtu(). Introduced-by: 3d249d4c "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-29Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "The usual random collection of relatively small ARM fixes" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs ARM: 8064/1: fix v7-M signal return ARM: 8057/1: amba: Add Qualcomm vendor ID. ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode ARM: 8051/1: put_user: fix possible data corruption in put_user ARM: 8048/1: fix v7-M setup stack location
2014-05-27Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull slave-dmaengine fixes from Vinod Koul: "We have three small fixes. First one from Andy reverts the devm_request irq as we need to ensure the tasklet is killed after irq is freed, so we need to do free irq in our code. Other two from Arnd are fixing the compilation issue in omap and sa11x0 drivers with ARM randconfigs" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: sa11x0: remove broken #ifdef dmaengine: omap: hide filter_fn for built-in drivers dmaengine: dw: went back to plain {request,free}_irq() calls
2014-05-25ARM: 8057/1: amba: Add Qualcomm vendor ID.srinik
This patch adds Qualcomm amba vendor Id to the list. This ID is used in mmci driver. The ID selected in same lines like 0x41 is "A" for ARM, 0x51 is "Q" for Qualcomm. As there are no physical register on Qcom SOC for amba vendor id, this is a fake ID assigned based on "Q" prefix from Qualcomm. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-05-23Merge tag 'dmaengine-fixes-3.15-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: "Two fixes for -stable: - async_mult() sometimes maps less buffers than initially requested. We end up freeing dmaengine_unmap_data on an invalid pool. - mv_xor: register write ordering fix" * tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: dmaengine: fix dmaengine_unmap failure dma: mv_xor: Flush descriptors before activating a channel
2014-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "It looks like a sizeble collection but this is nearly 3 weeks of bug fixing while you were away. 1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute the packet through a non-IPSEC protected path and the code has to be able to handle SKBs attached to routes lacking an attached xfrm state. From Steffen Klassert. 2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported sub-protocols, also from Steffen Klassert. 3) Set local_df on fragmented netfilter skbs otherwise we won't be able to forward successfully, from Florian Westphal. 4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without holding RCU lock, from Bjorn Mork. 5) local_df test in ip_may_fragment is inverted, from Florian Westphal. 6) jme driver doesn't check for DMA mapping failures, from Neil Horman. 7) qlogic driver doesn't calculate number of TX queues properly, from Shahed Shaikh. 8) fib_info_cnt can drift irreversibly positive if we fail to allocate the fi->fib_metrics array, from Sergey Popovich. 9) Fix use after free in ip6_route_me_harder(), also from Sergey Popovich. 10) When SYSCTL is disabled, we don't handle local_port_range and ping_group_range defaults properly at all, from Cong Wang. 11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim driver, fix from Bjorn Mork. 12) cassini driver needs nested lock annotations for TX locking, from Emil Goode. 13) On init error ipv6 VTI driver can unregister pernet ops twice, oops. Fix from Mahtias Krause. 14) If macvlan device is down, don't propagate IFF_ALLMULTI changes, from Peter Christensen. 15) Missing NULL pointer check while parsing netlink config options in ip6_tnl_validate(). From Susant Sahani. 16) Fix handling of neighbour entries during ipv6 router reachability probing, from Duan Jiong. 17) x86 and s390 JIT address randomization has some address calculation bugs leading to crashes, from Alexei Starovoitov and Heiko Carstens. 18) Clear up those uglies with nop patching and net_get_random_once(), from Hannes Frederic Sowa. 19) Option length miscalculated in ip6_append_data(), fix also from Hannes Frederic Sowa. 20) A while ago we fixed a race during device unregistry when a namespace went down, turns out there is a second place that needs similar protection. From Cong Wang. 21) In the new Altera TSE driver multicast filtering isn't working, disable it and just use promisc mode until the cause is found. From Vince Bridgers. 22) When we disable router enabling in ipv6 we have to flush the cached routes explicitly, from Duan Jiong. 23) NBMA tunnels should not cache routes on the tunnel object because the key is variable, from Timo Teräs. 24) With stacked devices GRO information in skb->cb[] can be not setup properly, make sure it is in all code paths. From Eric Dumazet. 25) Really fix stacked vlan locking, multiple levels of nesting with intervening non-vlan devices are possible. From Vlad Yasevich. 26) Fallback ipip tunnel device's mtu is not setup properly, from Steffen Klassert. 27) The packet scheduler's tcindex filter can crash because we structure copy objects with list_head's inside, oops. From Cong Wang. 28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric Dumazet. 29) In some configurations 'itag' in __mkroute_input() can end up being used uninitialized because of how fib_validate_source() works. Fix it by explitly initializing itag to zero like all the other fib_validate_source() callers do, from Li RongQing" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) batman: fix a bogus warning from batadv_is_on_batman_iface() ipv4: initialise the itag variable in __mkroute_input bonding: Send ALB learning packets using the right source bonding: Don't assume 802.1Q when sending alb learning packets. net: doc: Update references to skb->rxhash stmmac: Remove unbalanced clk_disable call ipv6: gro: fix CHECKSUM_COMPLETE support net_sched: fix an oops in tcindex filter can: peak_pci: prevent use after free at netdev removal ip_tunnel: Initialize the fallback device properly vlan: Fix build error wth vlan_get_encap_level() can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. bnx2x: Convert return 0 to return rc bonding: Fix alb mode to only use first level vlans. bonding: Fix stacked device detection in arp monitoring macvlan: Fix lockdep warnings with stacked macvlan devices vlan: Fix lockdep warning with stacked vlan devices. net: Allow for more then a single subclass for netif_addr_lock net: Find the nesting level of a given device by type. ...
2014-05-23Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "The biggest commit is an irqtime accounting loop latency fix, the rest are misc fixes all over the place: deadline scheduling, docs, numa, balancer and a bad to-idle latency fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/numa: Initialize newidle balance stats in sd_numa_init() sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance() sched: Skip double execution of pick_next_task_fair() sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check sched/deadline: Fix memory leak sched/deadline: Fix sched_yield() behavior sched: Sanitize irq accounting madness sched/docbook: Fix 'make htmldocs' warnings caused by missing description
2014-05-23Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "The biggest changes are fixes for races that kept triggering Trinity crashes, plus liblockdep build fixes and smaller misc fixes. The liblockdep bits in perf/urgent are a pull mistake - they should have been in locking/urgent - but by the time I noticed other commits were added and testing was done :-/ Sorry about that" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix a race between ring_buffer_detach() and ring_buffer_attach() perf: Prevent false warning in perf_swevent_add perf: Limit perf_event_attr::sample_period to 63 bits tools/liblockdep: Remove all build files when doing make clean tools/liblockdep: Build liblockdep from tools/Makefile perf/x86/intel: Fix Silvermont's event constraints perf: Fix perf_event_init_context() perf: Fix race in removing an event
2014-05-23wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STRMasatake YAMATO
In commit ad86622b478e ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide EXIT_TRACE from user-space") the order of task state definitions were changed: EXIT_DEAD and EXIT_ZOMBIE were swapped. Though the charterers for the states in TASK_STATE_TO_CHAR_STR string were not updated. This patch synchronizes the string to the order of definitions. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-21dmaengine: fix dmaengine_unmap failureXuelin Shi
The count which is used to get_unmap_data maybe not the same as the count computed in dmaengine_unmap which causes to free data in a wrong pool. This patch fixes this issue by keeping the map count with unmap_data structure and use this count to get the pool. Cc: <stable@vger.kernel.org> Signed-off-by: Xuelin Shi <xuelin.shi@freescale.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2014-05-21Merge tag 'driver-core-3.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve some reported issues and a regression from 3.13" * tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: make sure read buffer is zeroed kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs
2014-05-21Merge branch 'for-3.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull more cgroup fixes from Tejun Heo: "Three more patches to fix cgroup_freezer breakage due to the recent cgroup internal locking changes - an operation cgroup_freezer was using now requires sleepable context and cgroup_freezer was invoking that while holding a spin lock. cgroup_freezer was using an overly elaborate hierarchical locking scheme. While it's possible to convert the hierarchical spinlocks directly to mutexes, this patch simplifies the overall locking so that it uses a global mutex. This has the added benefit of avoiding iterating potentially huge number of tasks under a spinlock. While the patch is on the larger side in the devel cycle, the changes made are mostly straight-forward and the locking logic is a lot simpler afterwards" * 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix rcu_read_lock() leak in update_if_frozen() cgroup_freezer: replace freezer->lock with freezer_mutex cgroup: introduce task_css_is_root()
2014-05-21Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull device tree fixes from Grant Likely: "Drivercore bugfixes for v3.15 This branch contains bug fixes important to get into v3.15. There is a fix for modifying properties seen during early boot, a fix for an incorrect prototype when CONFIG_OF=n, and a couple of corrections to device tree memory nodes on a few platforms" * tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: mips: dts: Fix missing device_type="memory" property in memory nodes arm: dts: Fix missing device_type="memory" for ste-ccu8540 of: fix CONFIG_OF=n prototype of of_node_full_name() of: make of_update_property() usable earlier in the boot process
2014-05-21dmaengine: omap: hide filter_fn for built-in driversArnd Bergmann
It is not possible to reference the omap_dma_filter_fn filter function from a built-in driver if the dmaengine driver itself is a loadable module, which is a valid configuration otherwise. This provides only the dummy alternative if the function is referenced by a built-in driver to allow a successful build. The filter function is only required by ATAGS based platforms, which will continue to be broken after this change for the bogus configuration. When booting from DT, with the dma channels correctly listed there, it will work fine. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Lindgren <tony@atomide.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Vinod Koul <vinod.koul@intel.com> Cc: dmaengine@vger.kernel.org Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-05-20vlan: Fix build error wth vlan_get_encap_level()Vlad Yasevich
The new function vlan_get_encap_level() uses vlan_dev_priv() which is only conditionally avaialble when VLAN support is enabled. Make vlan_get_encap_level() conditionally available as well. Fixes: 44a4085538c8 ("bonding: Fix stacked device detection in arp monitoring") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> CC: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-20Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two small updates from the irq departement: - Provide missing inline stub for a SMP only function - Add sub-maintainer for the drivers/irqchip/ part of the irq subsystem. YAY!" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add co-maintainer for drivers/irqchip genirq: Provide irq_force_affinity fallback for non-SMP
2014-05-19perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()Peter Zijlstra
Alexander noticed that we use RCU iteration on rb->event_list but do not use list_{add,del}_rcu() to add,remove entries to that list, nor do we observe proper grace periods when re-using the entries. Merge ring_buffer_detach() into ring_buffer_attach() such that attaching to the NULL buffer is detaching. Furthermore, ensure that between any 'detach' and 'attach' of the same event we observe the required grace period, but only when strictly required. In effect this means that only ioctl(.request = PERF_EVENT_IOC_SET_OUTPUT) will wait for a grace period, while the normal initial attach and final detach will not be delayed. This patch should, I think, do the right thing under all circumstances, the 'normal' cases all should never see the extra grace period, but the two cases: 1) PERF_EVENT_IOC_SET_OUTPUT on an event which already has a ring_buffer set, will now observe the required grace period between removing itself from the old and attaching itself to the new buffer. This case is 'simple' in that both buffers are present in perf_event_set_output() one could think an unconditional synchronize_rcu() would be sufficient; however... 2) an event that has a buffer attached, the buffer is destroyed (munmap) and then the event is attached to a new/different buffer using PERF_EVENT_IOC_SET_OUTPUT. This case is more complex because the buffer destruction does: ring_buffer_attach(.rb = NULL) followed by the ioctl() doing: ring_buffer_attach(.rb = foo); and we still need to observe the grace period between these two calls due to us reusing the event->rb_entry list_head. In order to make 2 happen we use Paul's latest cond_synchronize_rcu() call. Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140507123526.GD13658@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-16bonding: Fix stacked device detection in arp monitoringVlad Yasevich
Prior to commit fbd929f2dce460456807a51e18d623db3db9f077 bonding: support QinQ for bond arp interval the arp monitoring code allowed for proper detection of devices stacked on top of vlans. Since the above commit, the code can still detect a device stacked on top of single vlan, but not a device stacked on top of Q-in-Q configuration. The search will only set the inner vlan tag if the route device is the vlan device. However, this is not always the case, as it is possible to extend the stacked configuration. With this patch it is possible to provision devices on top Q-in-Q vlan configuration that should be used as a source of ARP monitoring information. For example: ip link add link bond0 vlan10 type vlan proto 802.1q id 10 ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 ip link add link vlan100 type macvlan Note: This patch limites the number of stacked VLANs to 2, just like before. The original, however had another issue in that if we had more then 2 levels of VLANs, we would end up generating incorrectly tagged traffic. This is no longer possible. Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval) CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Patric McHardy <kaber@trash.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16macvlan: Fix lockdep warnings with stacked macvlan devicesVlad Yasevich
Macvlan devices try to avoid stacking, but that's not always successfull or even desired. As an example, the following configuration is perefectly legal and valid: eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1 However, this configuration produces the following lockdep trace: [ 115.620418] ====================================================== [ 115.620477] [ INFO: possible circular locking dependency detected ] [ 115.620516] 3.15.0-rc1+ #24 Not tainted [ 115.620540] ------------------------------------------------------- [ 115.620577] ip/1704 is trying to acquire lock: [ 115.620604] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.620686] but task is already holding lock: [ 115.620723] (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40 [ 115.620795] which lock already depends on the new lock. [ 115.620853] the existing dependency chain (in reverse order) is: [ 115.620894] -> #1 (&macvlan_netdev_addr_lock_key){+.....}: [ 115.620935] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.620974] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621019] [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q] [ 115.621066] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621105] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621143] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b [ 115.621174] -> #0 (&vlan_netdev_addr_lock_key/1){+.....}: [ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60 [ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan] [ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b [ 115.621174] other info that might help us debug this: [ 115.621174] Possible unsafe locking scenario: [ 115.621174] CPU0 CPU1 [ 115.621174] ---- ---- [ 115.621174] lock(&macvlan_netdev_addr_lock_key); [ 115.621174] lock(&vlan_netdev_addr_lock_key/1); [ 115.621174] lock(&macvlan_netdev_addr_lock_key); [ 115.621174] lock(&vlan_netdev_addr_lock_key/1); [ 115.621174] *** DEADLOCK *** [ 115.621174] 2 locks held by ip/1704: [ 115.621174] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40 [ 115.621174] #1: (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40 [ 115.621174] stack backtrace: [ 115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24 [ 115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010 [ 115.621174] ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0 [ 115.621174] ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8 [ 115.621174] 0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230 [ 115.621174] Call Trace: [ 115.621174] [<ffffffff816ee20c>] dump_stack+0x4d/0x66 [ 115.621174] [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e [ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60 [ 115.621174] [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0 [ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130 [ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50 [ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80 [ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan] [ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0 [ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40 [ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140 [ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170 [ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60 [ 115.621174] [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30 [ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0 [ 115.621174] [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60 [ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730 [ 115.621174] [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730 [ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250 [ 115.621174] [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10 [ 115.621174] [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40 [ 115.621174] [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40 [ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0 [ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40 [ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0 [ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740 [ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0 [ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0 [ 115.621174] [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0 [ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0 [ 115.621174] [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0 [ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380 [ 115.621174] [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570 [ 115.621174] [<ffffffff810cfe9f>] ? up_read+0x1f/0x40 [ 115.621174] [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570 [ 115.621174] [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0 [ 115.621174] [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0 [ 115.621174] [<ffffffff8120a284>] ? mntput+0x24/0x40 [ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80 [ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20 [ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b Fix this by correctly providing macvlan lockdep class. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16vlan: Fix lockdep warning with stacked vlan devices.Vlad Yasevich
This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f. vlan: Fix lockdep warning when vlan dev handle notification Instead we use the new new API to find the lock subclass of our vlan device. This way we can support configurations where vlans are interspersed with other devices: bond -> vlan -> macvlan -> vlan Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16net: Allow for more then a single subclass for netif_addr_lockVlad Yasevich
Currently netif_addr_lock_nested assumes that there can be only a single nesting level between 2 devices. However, if we have multiple devices of the same type stacked, this fails. For example: eth0 <-- vlan0.10 <-- vlan0.10.20 A more complicated configuration may stack more then one type of device in different order. Ex: eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1 This patch adds an ndo_* function that allows each stackable device to report its nesting level. If the device doesn't provide this function default subclass of 1 is used. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16net: Find the nesting level of a given device by type.Vlad Yasevich
Multiple devices in the kernel can be stacked/nested and they need to know their nesting level for the purposes of lockdep. This patch provides a generic function that determines a nesting level of a particular device by its type (ex: vlan, macvlan, etc). We only care about nesting of the same type of devices. For example: eth0 <- vlan0.10 <- macvlan0 <- vlan1.20 The nesting level of vlan1.20 would be 1, since there is another vlan in the stack under it. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16net/mlx4_core: Add UPDATE_QP SRIOV wrapper supportMatan Barak
This patch adds UPDATE_QP SRIOV wrapper support. The mechanism is a general one, but currently only source MAC index changes are allowed for VFs. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15rtnetlink: wait for unregistering devices in rtnl_link_unregister()Cong Wang
From: Cong Wang <cwang@twopensource.com> commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no devices are unregistering) introduced rtnl_lock_unregistering() for default_device_exit_batch(). Same race could happen we when rmmod a driver which calls rtnl_link_unregister() as we call dev->destructor without rtnl lock. For long term, I think we should clean up the mess of netdev_run_todo() and net namespce exit code. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15of: fix CONFIG_OF=n prototype of of_node_full_name()Stephen Rothwell
Make the CONFIG_OF=n prototpe of of_node_full_name() mateh the CONFIG_OF=y version. Fixes compile warnings like this: sound/soc/soc-core.c: In function 'soc_check_aux_dev': sound/soc/soc-core.c:1667:3: warning: passing argument 1 of 'of_node_full_name' discards 'const' qualifier from pointer target type [enabled by default] codecname = of_node_full_name(aux_dev->codec_of_node); when CONFIG_OF is not defined. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Grant Likely <grant.likely@linaro.org>
2014-05-14net: avoid dependency of net_get_random_once on nop patchingHannes Frederic Sowa
net_get_random_once depends on the static keys infrastructure to patch up the branch to the slow path during boot. This was realized by abusing the static keys api and defining a new initializer to not enable the call site while still indicating that the branch point should get patched up. This was needed to have the fast path considered likely by gcc. The static key initialization during boot up normally walks through all the registered keys and either patches in ideal nops or enables the jump site but omitted that step on x86 if ideal nops where already placed at static_key branch points. Thus net_get_random_once branches not always became active. This patch switches net_get_random_once to the ordinary static_key api and thus places the kernel fast path in the - by gcc considered - unlikely path. Microbenchmarks on Intel and AMD x86-64 showed that the unlikely path actually beats the likely path in terms of cycle cost and that different nop patterns did not make much difference, thus this switch should not be noticeable. Fixes: a48e42920ff38b ("net: introduce new macro net_get_random_once") Reported-by: Tuomas Räsänen <tuomasjjrasanen@tjjr.fi> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13cgroup: introduce task_css_is_root()Tejun Heo
Determining the css of a task usually requires RCU read lock as that's the only thing which keeps the returned css accessible till its reference is acquired; however, testing whether a task belongs to the root can be performed without dereferencing the returned css by comparing the returned pointer against the root one in init_css_set[] which never changes. Implement task_css_is_root() which can be invoked in any context. This will be used by the scheduled cgroup_freezer change. v2: cgroup no longer supports modular controllers. No need to export init_css_set. Pointed out by Li. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-13kernfs, sysfs, cgroup: restrict extra perm check on open to sysfsTejun Heo
The kernfs open method - kernfs_fop_open() - inherited extra permission checks from sysfs. While the vfs layer allows ignoring the read/write permissions checks if the issuer has CAP_DAC_OVERRIDE, sysfs explicitly denied open regardless of the cap if the file doesn't have any of the UGO perms of the requested access or doesn't implement the requested operation. It can be debated whether this was a good idea or not but the behavior is too subtle and dangerous to change at this point. After cgroup got converted to kernfs, this extra perm check also got applied to cgroup breaking libcgroup which opens write-only files with O_RDWR as root. This patch gates the extra open permission check with a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs. For sysfs, nothing changes. For cgroup, root now can perform any operation regardless of the permissions as it was before kernfs conversion. Note that kernfs still fails unimplemented operations with -EINVAL. While at it, add comments explaining KERNFS_ROOT flags. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Andrey Wagin <avagin@gmail.com> Tested-by: Andrey Wagin <avagin@gmail.com> Cc: Li Zefan <lizefan@huawei.com> References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com Fixes: 2bd59d48ebfb ("cgroup: convert to kernfs") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-09Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "A somewhat unpleasantly large collection of small fixes. The big ones are the __visible tree sweep and a fix for 'earlyprintk=efi,keep'. It was using __init functions with predictably suboptimal results. Another key fix is a build fix which would produce output that simply would not decompress correctly in some configuration, due to the existing Makefiles picking up an unfortunate local label and mistaking it for the global symbol _end. Additional fixes include the handling of 64-bit numbers when setting the vdso data page (a latent bug which became manifest when i386 started exporting a vdso with time functions), a fix to the new MSR manipulation accessors which would cause features to not get properly unblocked, a build fix for 32-bit userland, and a few new platform quirks" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall() x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro x86: Fix typo preventing msr_set/clear_bit from having an effect x86/intel: Add quirk to disable HPET for the Baytrail platform x86/hpet: Make boot_hpet_disable extern x86-64, build: Fix stack protector Makefile breakage with 32-bit userland x86/reboot: Add reboot quirk for Certec BPC600 asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/* asmlinkage, x86: Add explicit __visible to arch/x86/* asmlinkage: Revert "lto: Make asmlinkage __visible" x86, build: Don't get confused by local symbols x86/efi: earlyprintk=efi,keep fix
2014-05-08Merge tag 'mfd-mmc-fixes-3.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull mmc/rtsx revert from Lee Jones. * tag 'mfd-mmc-fixes-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"
2014-05-08mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"Micky Ching
This reverts commit c42deffd5b53c9e583d83c7964854ede2f12410d. commit <mmc: rtsx: add support for pre_req and post_req> did use mutex_unlock() in tasklet, but mutex_unlock() can't be used in tasklet(atomic context). The driver needs to use mutex to avoid concurrency, so we can't use tasklet here, the patch need to be removed. The spinlock host->lock and pcr->lock may deadlock, one way to solve the deadlock is remove host->lock in sd_isr_done_transfer(), but if using workqueue the we can avoid using the spinlock and also avoid the problem. Signed-off-by: Micky Ching <micky_ching@realsil.com.cn> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-05-07net: mdio: of_mdiobus_register(): fall back to mdiobus_register() for !CONFIG_OFDaniel Mack
If CONFIG_OF is not set, make of_mdiobus_register() call mdiobus_register() instead of returning -ENOSYS. This way, we can just call of_mdiobus_register() from all DT-enabled drivers to handle the compat cases. Signed-off-by: Daniel Mack <zonque@gmail.com> Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-07Revert "net: core: introduce netif_skb_dev_features"Florian Westphal
This reverts commit d206940319c41df4299db75ed56142177bb2e5f6, there are no more callers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-07sched/deadline: Fix sched_yield() behaviorJuri Lelli
yield_task_dl() is broken: o it forces current to be throttled setting its runtime to zero; o it sets current's dl_se->dl_new to one, expecting that dl_task_timer() will queue it back with proper parameters at replenish time. Unfortunately, dl_task_timer() has this check at the very beginning: if (!dl_task(p) || dl_se->dl_new) goto unlock; So, it just bails out and the task is never replenished. It actually yielded forever. To fix this, introduce a new flag indicating that the task properly yielded the CPU before its current runtime expired. While this is a little overdoing at the moment, the flag would be useful in the future to discriminate between "good" jobs (of which remaining runtime could be reclaimed, i.e. recycled) and "bad" jobs (for which dl_throttled task has been set) that needed to be stopped. Reported-by: yjay.kim <yjay.kim@lge.com> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140429103953.e68eba1b2ac3309214e3dc5a@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07genirq: Provide irq_force_affinity fallback for non-SMPArnd Bergmann
Patch 01f8fa4f01d "genirq: Allow forcing cpu affinity of interrupts" added an irq_force_affinity() function, and 30ccf03b4a6 "clocksource: Exynos_mct: Use irq_force_affinity() in cpu bringup" subsequently uses it. However, the driver can be used with CONFIG_SMP disabled, but the function declaration is only available for CONFIG_SMP, leading to this build error: drivers/clocksource/exynos_mct.c:431:3: error: implicit declaration of function 'irq_force_affinity' [-Werror=implicit-function-declaration] irq_force_affinity(mct_irqs[MCT_L0_IRQ + cpu], cpumask_of(cpu)); This patch introduces a dummy helper function for the non-SMP case that always returns success, to get rid of the build error. Since the patches causing the problem are marked for stable backports, this one should be as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/5619084.0zmrrIUZLV@wuerfel Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-06Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: agp: info leak in agpioc_info_wrap() fs/affs/super.c: bugfix / double free fanotify: fix -EOVERFLOW with large files on 64-bit slub: use sysfs'es release mechanism for kmem_cache revert "mm: vmscan: do not swap anon pages just because free+file is low" autofs: fix lockref lookup mm: filemap: update find_get_pages_tag() to deal with shadow entries mm/compaction: make isolate_freepages start at pageblock boundary MAINTAINERS: zswap/zbud: change maintainer email address mm/page-writeback.c: fix divide by zero in pos_ratio_polynom hugetlb: ensure hugepage access is denied if hugepages are not supported slub: fix memcg_propagate_slab_attrs drivers/rtc/rtc-pcf8523.c: fix month definition