summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-12-12mm/hugetlb.c: fix resv map memory leak for placeholder entriesMike Kravetz
Dmitry Vyukov reported the following memory leak unreferenced object 0xffff88002eaafd88 (size 32): comm "a.out", pid 5063, jiffies 4295774645 (age 15.810s) hex dump (first 32 bytes): 28 e9 4e 63 00 88 ff ff 28 e9 4e 63 00 88 ff ff (.Nc....(.Nc.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: kmalloc include/linux/slab.h:458 region_chg+0x2d4/0x6b0 mm/hugetlb.c:398 __vma_reservation_common+0x2c3/0x390 mm/hugetlb.c:1791 vma_needs_reservation mm/hugetlb.c:1813 alloc_huge_page+0x19e/0xc70 mm/hugetlb.c:1845 hugetlb_no_page mm/hugetlb.c:3543 hugetlb_fault+0x7a1/0x1250 mm/hugetlb.c:3717 follow_hugetlb_page+0x339/0xc70 mm/hugetlb.c:3880 __get_user_pages+0x542/0xf30 mm/gup.c:497 populate_vma_page_range+0xde/0x110 mm/gup.c:919 __mm_populate+0x1c7/0x310 mm/gup.c:969 do_mlock+0x291/0x360 mm/mlock.c:637 SYSC_mlock2 mm/mlock.c:658 SyS_mlock2+0x4b/0x70 mm/mlock.c:648 Dmitry identified a potential memory leak in the routine region_chg, where a region descriptor is not free'ed on an error path. However, the root cause for the above memory leak resides in region_del. In this specific case, a "placeholder" entry is created in region_chg. The associated page allocation fails, and the placeholder entry is left in the reserve map. This is "by design" as the entry should be deleted when the map is released. The bug is in the region_del routine which is used to delete entries within a specific range (and when the map is released). region_del did not handle the case where a placeholder entry exactly matched the start of the range range to be deleted. In this case, the entry would not be deleted and leaked. The fix is to take these special placeholder entries into account in region_del. The region_chg error path leak is also fixed. Fixes: feba16e25a57 ("mm/hugetlb: add region_del() to delete a specific range of entries") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: <stable@vger.kernel.org> [4.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: hugetlb: call huge_pte_alloc() only if ptep is nullNaoya Horiguchi
Currently at the beginning of hugetlb_fault(), we call huge_pte_offset() and check whether the obtained *ptep is a migration/hwpoison entry or not. And if not, then we get to call huge_pte_alloc(). This is racy because the *ptep could turn into migration/hwpoison entry after the huge_pte_offset() check. This race results in BUG_ON in huge_pte_alloc(). We don't have to call huge_pte_alloc() when the huge_pte_offset() returns non-NULL, so let's fix this bug with moving the code into else block. Note that the *ptep could turn into a migration/hwpoison entry after this block, but that's not a problem because we have another !pte_present check later (we never go into hugetlb_no_page() in that case.) Fixes: 290408d4a250 ("hugetlb: hugepage migration core") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> [2.6.36+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12kernel: remove stop_machine() Kconfig dependencyChris Wilson
Currently the full stop_machine() routine is only enabled on SMP if module unloading is enabled, or if the CPUs are hotpluggable. This leads to configurations where stop_machine() is broken as it will then only run the callback on the local CPU with irqs disabled, and not stop the other CPUs or run the callback on them. For example, this breaks MTRR setup on x86 in certain configs since ea8596bb2d8d379 ("kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch() functions") as the MTRR is only established on the boot CPU. This patch removes the Kconfig option for STOP_MACHINE and uses the SMP and HOTPLUG_CPU config options to compile the correct stop_machine() for the architecture, removing the false dependency on MODULE_UNLOAD in the process. Link: https://lkml.org/lkml/2014/10/8/124 References: https://bugs.freedesktop.org/show_bug.cgi?id=84794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: kmemleak: mark kmemleak_init prototype as __initNicolas Iooss
The kmemleak_init() definition in mm/kmemleak.c is marked __init but its prototype in include/linux/kmemleak.h is marked __ref since commit a6186d89c913 ("kmemleak: Mark the early log buffer as __initdata"). This causes a section mismatch which is reported as a warning when building with clang -Wsection, because kmemleak_init() is declared in section .ref.text but defined in .init.text. Fix this by marking kmemleak_init() prototype __init. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: fix kerneldoc on mem_cgroup_replace_pageHugh Dickins
Whoops, I missed removing the kerneldoc comment of the lrucare arg removed from mem_cgroup_replace_page; but it's a good comment, keep it. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12osd fs: __r4w_get_page rely on PageUptodate for uptodateHugh Dickins
Commit 42cb14b110a5 ("mm: migrate dirty page without clear_page_dirty_for_io etc") simplified the migration of a PageDirty pagecache page: one stat needs moving from zone to zone and that's about all. It's convenient and safest for it to shift the PageDirty bit from old page to new, just before updating the zone stats: before copying data and marking the new PageUptodate. This is all done while both pages are isolated and locked, just as before; and just as before, there's a moment when the new page is visible in the radix_tree, but not yet PageUptodate. What's new is that it may now be briefly visible as PageDirty before it is PageUptodate. When I scoured the tree to see if this could cause a problem anywhere, the only places I found were in two similar functions __r4w_get_page(): which look up a page with find_get_page() (not using page lock), then claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate. I'm not sure whether that was right before, but now it might be wrong (on rare occasions): only claim the page is uptodate if PageUptodate. Or perhaps the page in question could never be migratable anyway? Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Boaz Harrosh <ooo@electrozaur.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12MAINTAINERS: make Vladimir co-maintainer of the memory controllerJohannes Weiner
Vladimir architected and authored much of the current state of the memcg's slab memory accounting and tracking. Make sure he gets CC'd on bug reports ;-) Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any ↵Michal Hocko
progress Tetsuo Handa has reported that the system might basically livelock in OOM condition without triggering the OOM killer. The issue is caused by internal dependency of the direct reclaim on vmstat counter updates (via zone_reclaimable) which are performed from the workqueue context. If all the current workers get assigned to an allocation request, though, they will be looping inside the allocator trying to reclaim memory but zone_reclaimable can see stalled numbers so it will consider a zone reclaimable even though it has been scanned way too much. WQ concurrency logic will not consider this situation as a congested workqueue because it relies that worker would have to sleep in such a situation. This also means that it doesn't try to spawn new workers or invoke the rescuer thread if the one is assigned to the queue. In order to fix this issue we need to do two things. First we have to let wq concurrency code know that we are in trouble so we have to do a short sleep. In order to prevent from issues handled by 0e093d99763e ("writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone") we limit the sleep only to worker threads which are the ones of the interest anyway. The second thing to do is to create a dedicated workqueue for vmstat and mark it WQ_MEM_RECLAIM to note it participates in the reclaim and to have a spare worker thread for it. Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Tejun Heo <tj@kernel.org> Cc: Cristopher Lameter <clameter@sgi.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Arkadiusz Miskiewicz <arekm@maven.pl> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: fix swapped Movable and Reclaimable in /proc/pagetypeinfoVlastimil Babka
Commit 016c13daa5c9 ("mm, page_alloc: use masks and shifts when converting GFP flags to migrate types") has swapped MIGRATE_MOVABLE and MIGRATE_RECLAIMABLE in the enum definition. However, migratetype_names wasn't updated to reflect that. As a result, the file /proc/pagetypeinfo shows the counts for Movable as Reclaimable and vice versa. Additionally, commit 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand") introduced MIGRATE_HIGHATOMIC, but did not add a letter to distinguish it into show_migration_types(), so it doesn't appear in the listing of free areas during page alloc failures or oom kills. This patch fixes both problems. The atomic reserves will show with a letter 'H' in the free areas listings. Fixes: 016c13daa5c9 ("mm, page_alloc: use masks and shifts when converting GFP flags to migrate types") Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12memcg: fix memory.high targetVladimir Davydov
When the memory.high threshold is exceeded, try_charge() schedules a task_work to reclaim the excess. The reclaim target is set to the number of pages requested by try_charge(). This is wrong, because try_charge() usually charges more pages than requested (batch > nr_pages) in order to refill per cpu stocks. As a result, a process in a cgroup can easily exceed memory.high significantly when doing a lot of charges w/o returning to userspace (e.g. reading a file in big chunks). Fix this issue by assuring that when exceeding memory.high a process reclaims as many pages as were actually charged (i.e. batch). Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: hugetlb: fix hugepage memory leak caused by wrong reserve countNaoya Horiguchi
When dequeue_huge_page_vma() in alloc_huge_page() fails, we fall back on alloc_buddy_huge_page() to directly create a hugepage from the buddy allocator. In that case, however, if alloc_buddy_huge_page() succeeds we don't decrement h->resv_huge_pages, which means that successful hugetlb_fault() returns without releasing the reserve count. As a result, subsequent hugetlb_fault() might fail despite that there are still free hugepages. This patch simply adds decrementing code on that code path. I reproduced this problem when testing v4.3 kernel in the following situation: - the test machine/VM is a NUMA system, - hugepage overcommiting is enabled, - most of hugepages are allocated and there's only one free hugepage which is on node 0 (for example), - another program, which calls set_mempolicy(MPOL_BIND) to bind itself to node 1, tries to allocate a hugepage, - the allocation should fail but the reserve count is still hold. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> [3.16+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-11Merge tag 'dm-4.4-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Five stable fixes: - Two DM btree bufio buffer leak fixes that resolve reported BUG_ONs during DM thinp metadata close's dm_bufio_client_destroy(). - A DM thinp range discard fix to handle discarding a partially mapped range. - A DM thinp metadata snapshot fix to make sure the btree roots saved in the metadata snapshot are the most current. - A DM space map metadata refcounting fix that improves both DM thinp and DM cache metadata" * tag 'dm-4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm btree: fix bufio buffer leaks in dm_btree_del() error path dm space map metadata: fix ref counting bug when bootstrapping a new space map dm thin metadata: fix bug when taking a metadata snapshot dm thin metadata: fix bug in dm_thin_remove_range() dm btree: fix leak of bufio-backed block in btree_split_sibling error path
2015-12-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Two bugfixes, both bound for -stable" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: break infinite loop in fuse_fill_write_pages() cuse: fix memory leak
2015-12-11Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Not too much this time. - One nouveau workaround extended to a few more GPUs - Some amdgpu big endian fixes, and a regression fixer - Some vmwgfx fixes - One ttm locking fix - One vgaarb fix" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: vgaarb: fix signal handling in vga_get() radeon: Fix VCE IB test on Big-Endian systems radeon: Fix VCE ring test for Big-Endian systems radeon/cik: Fix GFX IB test on Big-Endian drm/amdgpu: fix the lost duplicates checking drm/nouveau/pmu: remove whitelist for PGOB-exit WAR, enable by default drm/vmwgfx: Implement the cursor_set2 callback v2 drm/vmwgfx: fix a warning message drm/ttm: Fixed a read/write lock imbalance
2015-12-11vgaarb: fix signal handling in vga_get()Kirill A. Shutemov
There are few defects in vga_get() related to signal hadning: - we shouldn't check for pending signals for TASK_UNINTERRUPTIBLE case; - if we found pending signal we must remove ourself from wait queue and change task state back to running; - -ERESTARTSYS is more appropriate, I guess. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: stable@vger.kernel.org Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-12-11Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes some big endian fixes and one regression fix. * 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux: radeon: Fix VCE IB test on Big-Endian systems radeon: Fix VCE ring test for Big-Endian systems radeon/cik: Fix GFX IB test on Big-Endian drm/amdgpu: fix the lost duplicates checking
2015-12-10Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Most are minor to important fixes. There is one performance enhancement that I took on the grounds that failing to check if other processes can run before running what's intended to be a background, idle-time task is a bug, even though the primary effect of the fix is to improve performance (and it was a very simple patch)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/mlx5: Postpone remove_keys under knowledge of coming preemption IB/mlx4: Use vmalloc for WR buffers when needed IB/mlx4: Use correct order of variables in log message iser-target: Remove explicit mlx4 work-around mlx4: Expose correct max_sge_rd limit IB/mad: Require CM send method for everything except ClassPortInfo IB/cma: Add a missing rcu_read_unlock() IB core: Fix ib_sg_to_pages() IB/srp: Fix srp_map_sg_fr() IB/srp: Fix indirect data buffer rkey endianness IB/srp: Initialize dma_length in srp_map_idb IB/srp: Fix possible send queue overflow IB/srp: Fix a memory leak IB/sa: Put netlink request into the request list before sending IB/iser: use sector_div instead of do_div IB/core: use RCU for uverbs id lookup IB/qib: Minor fixes to qib per SFF 8636 IB/core: Fix user mode post wr corruption IB/qib: Fix qib_mr structure
2015-12-10Merge tag 'sound-4.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Again less intensive changes in this rc: you can find only a few HD-audio fixes (noise fixes for Intel Broxton chip and a few Thinkpad models, quirks for Alienware 17 and Packard Bell DOTS) in addition to a long-standing rme96 bug fix" * tag 'sound-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/ca0132 - quirk for Alienware 17 2015 ALSA: hda - Fix noise problems on Thinkpad T440s ALSA: hda - Fixing speaker noise on the two latest thinkpad models ALSA: hda - Add inverted dmic for Packard Bell DOTS ALSA: hda - Fix playback noise with 24/32 bit sample size on BXT ALSA: rme96: Fix unexpected volume reset after rate changes
2015-12-10dm btree: fix bufio buffer leaks in dm_btree_del() error pathJoe Thornber
If dm_btree_del()'s call to push_frame() fails, e.g. due to btree_node_validator finding invalid metadata, the dm_btree_del() error path must unlock all frames (which have active dm-bufio buffers) that were pushed onto the del_stack. Otherwise, dm_bufio_client_destroy() will BUG_ON() because dm-bufio buffers have leaked, e.g.: device-mapper: bufio: leaked buffer 3, hold count 1, list 0 Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2015-12-09Merge tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fixes from Alex Williamson: - Various fixes for removing redundancy, const'ifying structs, avoiding stack usage, fixing WARN usage (Krzysztof Kozlowski, Julia Lawall, Kees Cook, Dan Carpenter) - Revert No-IOMMU mode as the intended user has not emerged (Alex Williamson) * tag 'vfio-v4.4-rc5' of git://github.com/awilliam/linux-vfio: Revert: "vfio: Include No-IOMMU mode" vfio: fix a warning message vfio: platform: remove needless stack usage vfio-pci: constify pci_error_handlers structures vfio: Drop owner assignment from platform_driver
2015-12-09Merge tag 'devicetree-fixes-for-4.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DT fixes from Rob Herring: "I think this should be all for 4.4: - Fix incorrect warning about overlapping memory regions - Export of_irq_find_parent again which was made static in 4.4, but has users pending for 4.5. - Fix of_msi_map_rid declaration location - Fix re-entrancy for of_fdt_unflatten_tree - Clean-up of phys_addr_t printks" * tag 'devicetree-fixes-for-4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of/irq: move of_msi_map_rid declaration to the correct ifdef section of/irq: Export of_irq_find_parent again of/fdt: Add mutex protection for calls to __unflatten_device_tree() of/address: fix typo in comment block of of_translate_one() of: do not use 0x in front of %pa of: Fix comparison of reserved memory regions
2015-12-09Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "One small build fix, a couple do_div() fixes, and a fix for the gpio basic clock type are the major changes here. There's also a couple fixes for the TI, sunxi, and scpi clock drivers" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sunxi: pll2: Fix clock running too fast clk: scpi: add missing of_node_put clk: qoriq: fix memory leak imx/clk-pllv2: fix wrong do_div() usage imx/clk-pllv1: fix wrong do_div() usage clk: mmp: add linux/clk.h includes clk: ti: drop locking code from mux/divider drivers clk: ti816x: Add missing dmtimer clkdev entries clk: ti: fapll: fix wrong do_div() usage clk: ti: clkt_dpll: fix wrong do_div() usage clk: gpio: Get parent clk names in of_gpio_clk_setup()
2015-12-09Merge tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmiLinus Torvalds
Pull IPMI fix from Corey Minyard: "Fix an Oops if an interrupt occurs at startup. This can happen on some hardware" * tag 'for-linus-4.4-1' of git://git.code.sf.net/p/openipmi/linux-ipmi: ipmi: move timer init to before irq is setup
2015-12-09ipmi: move timer init to before irq is setupJan Stancek
We encountered a panic on boot in ipmi_si on a dell per320 due to an uninitialized timer as follows. static int smi_start_processing(void *send_info, ipmi_smi_t intf) { /* Try to claim any interrupts. */ if (new_smi->irq_setup) new_smi->irq_setup(new_smi); --> IRQ arrives here and irq handler tries to modify uninitialized timer which triggers BUG_ON(!timer->function) in __mod_timer(). Call Trace: <IRQ> [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si] [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si] [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si] [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350 [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si] [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170 [<ffffffff810f245e>] handle_edge_irq+0xde/0x180 [<ffffffff8100fc59>] handle_irq+0x49/0xa0 [<ffffffff8154643c>] do_IRQ+0x6c/0xf0 [<ffffffff8100ba53>] ret_from_intr+0x0/0x11 /* Set up the timer that drives the interface. */ setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi); The following patch fixes the problem. To: Openipmi-developer@lists.sourceforge.net To: Corey Minyard <minyard@acm.org> CC: linux-kernel@vger.kernel.org Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org # Applies cleanly to 3.10-, needs small rework before
2015-12-09bitops.h: correctly handle rol32 with 0 byte shiftSasha Levin
ROL on a 32 bit integer with a shift of 32 or more is undefined and the result is arch-dependent. Avoid this by handling the trivial case of roling by 0 correctly. The trivial solution of checking if shift is 0 breaks gcc's detection of this code as a ROL instruction, which is unacceptable. This bug was reported and fixed in GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157): The standard rotate idiom, (x << n) | (x >> (32 - n)) is recognized by gcc (for concreteness, I discuss only the case that x is an uint32_t here). However, this is portable C only for n in the range 0 < n < 32. For n == 0, we get x >> 32 which gives undefined behaviour according to the C standard (6.5.7, Bitwise shift operators). To portably support n == 0, one has to write the rotate as something like (x << n) | (x >> ((-n) & 31)) And this is apparently not recognized by gcc. Note that this is broken on older GCCs and will result in slower ROL. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-09dm space map metadata: fix ref counting bug when bootstrapping a new space mapJoe Thornber
When applying block operations (BOPs) do not remove them from the uncommitted BOP ring-buffer until after they've been applied -- in case we recurse. Also, perform BOP_INC operation, in dm_sm_metadata_create() and sm_metadata_extend(), in terms of the uncommitted BOP ring-buffer rather than using direct calls to sm_ll_inc(). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2015-12-09dm thin metadata: fix bug when taking a metadata snapshotJoe Thornber
When you take a metadata snapshot the btree roots for the mapping and details tree need to have their reference counts incremented so they persist for the lifetime of the metadata snap. The roots being incremented were those currently written in the superblock, which could possibly be out of date if concurrent IO is triggering new mappings, breaking of sharing, etc. Fix this by performing a commit with the metadata lock held while taking a metadata snapshot. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2015-12-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A couple of fixes, both -stable fodder (9p one all way back to 2.6.32, dio - to all branches where "Fix negative return from dio read beyond eof" will end up it; it's a fixup to commit marked for -stable)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix the regression from "direct-io: Fix negative return from dio read beyond eof" 9p: ->evict_inode() should kick out ->i_data, not ->i_mapping
2015-12-09Merge tag 'pci-v4.4-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These are more fixes I'd like to have in v4.4. Several for the Altera driver added for v4.4, and one for an MSI domain problem that affects several arm64 platforms: MSI: - Only use the generic MSI layer when domain is hierarchical (Marc Zyngier) Altera host bridge driver: - Fix loop in tlp_read_packet() (Dan Carpenter) - Fix Requester ID for config accesses (Ley Foon Tan) - Check TLP completion status (Ley Foon Tan) - Fix error when INTx is 4 (Ley Foon Tan)" * tag 'pci-v4.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: altera: Fix error when INTx is 4 PCI: altera: Check TLP completion status PCI: altera: Fix Requester ID for config accesses PCI: altera: Fix loop in tlp_read_packet() PCI/MSI: Only use the generic MSI layer when domain is hierarchical
2015-12-09ALSA: hda/ca0132 - quirk for Alienware 17 2015Gabriele Martino
The Alienware 17 (2015) has the same card and pin configuration of the Alienware 15, so the same quirks must be applied. Signed-off-by: Gabriele Martino <g.martino@gmx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-12-09of/irq: move of_msi_map_rid declaration to the correct ifdef sectionRob Herring
In checking fixes for of_irq_find_parent declaration location, I found that of_msi_map_rid is also wrong. of_msi_map_rid is not implemented for Sparc, so it should not be in the Sparc specific section of the header. Move it to just depend on OF_IRQ. Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
2015-12-09of/irq: Export of_irq_find_parent againCarlo Caione
of_irq_find_parent was made static since it had no users outside of of_irq.c. Export it again since we are going to use it again. Signed-off-by: Carlo Caione <carlo@endlessm.com> [robh: move of_irq_find_parent to correct ifdef section] Signed-off-by: Rob Herring <robh@kernel.org>
2015-12-09ALSA: hda - Fix noise problems on Thinkpad T440sTakashi Iwai
Lenovo Thinkpad T440s suffers from constant background noises, and it seems to be a generic hardware issue on this model: https://forums.lenovo.com/t5/ThinkPad-T400-T500-and-newer-T/T440s-speaker-noise/td-p/1339883 As the noise comes from the analog loopback path, disabling the path is the easy workaround. Also, the machine gives significant cracking noises at PM suspend. A workaround found by trial-and-error is to disable the shutup callback currently used for ALC269-variant. This patch addresses these noise issues by introducing a new fixup chain. Although the same workaround might be applicable to other Thinkpad models, it's applied only to T440s (17aa:220c) in this patch, so far, just to be safe (you chicken!). As a compromise, a new model option string "tp440" is provided now, though, so that owners of other Thinkpad models can test it more easily. Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=958504 Reported-and-tested-by: Tim Hardeck <thardeck@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-12-09radeon: Fix VCE IB test on Big-Endian systemsOded Gabbay
This patch makes the VCE IB test pass on Big-Endian systems. It converts to little-endian the contents of the VCE message. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-09radeon: Fix VCE ring test for Big-Endian systemsOded Gabbay
This patch fixes the VCE ring test when running on Big-Endian machines. Every write to the ring needs to be translated to little-endian. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-09radeon/cik: Fix GFX IB test on Big-EndianOded Gabbay
This patch makes the IB test on the GFX ring pass for CI-based cards installed in Big-Endian machines. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-09drm/amdgpu: fix the lost duplicates checkingChunming Zhou
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Cc: stable@vger.kernel.org
2015-12-09Merge tag 'vmwgfx-fixes-4.4-151208' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-fixes Pull request of 2015-12-08 A couple of fixes for vmwgfx. A WARN() fix by Dan Carpenter, a TTM read/write lock imbalance causing occasional hangs with Wayland and an implementation of cursor_set2 to fix incorrectly offset Wayland cursors. * tag 'vmwgfx-fixes-4.4-151208' of git://people.freedesktop.org/~thomash/linux: drm/vmwgfx: Implement the cursor_set2 callback v2 drm/vmwgfx: fix a warning message drm/ttm: Fixed a read/write lock imbalance
2015-12-09Merge branch 'linux-4.4' of https://github.com/skeggsb/linux into drm-fixesDave Airlie
Just the one commit I mentioned earlier, making the PGOB workaround the default. * 'linux-4.4' of https://github.com/skeggsb/linux: drm/nouveau/pmu: remove whitelist for PGOB-exit WAR, enable by default
2015-12-08Merge branch 'for-linus-4.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull uml fixes from Richard Weinberger: "This contains various bug fixes, most of them are fall out from the merge window" * 'for-linus-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: fix returns without va_end um: Fix fpstate handling arch: um: fix error when linking vmlinux. um: Fix get_signal() usage
2015-12-09drm/nouveau/pmu: remove whitelist for PGOB-exit WAR, enable by defaultBen Skeggs
NVIDIA have indicated that the workaround is required on all GK10[467] boards that have the PGOB fuse set. I've left the commandline option in place for now, as paranoia. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-12-08IB/mlx5: Postpone remove_keys under knowledge of coming preemptionLeon Romanovsky
The remove_keys() logic is performed as garbage collection task. Such task is intended to be run when no other active processes are running. The need_resched() will return TRUE if there are user tasks to be activated in near future. In such case, we don't execute remove_keys() and postpone the garbage collection work to try to run in next cycle, in order to free CPU resources to other tasks. The possible pseudo-code to trigger such scenario: 1. Allocate a lot of MR to fill the cache above the limit. 2. Wait a small amount of time "to calm" the system. 3. Start CPU extensive operations on multi-node cluster. 4. Expect performance degradation during MR cache shrink operation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mlx4: Use vmalloc for WR buffers when neededWengang Wang
There are several hits that WR buffer allocation(kmalloc) failed. It failed at order 3 and/or 4 contigous pages allocation. At the same time there are actually 100MB+ free memory but well fragmented. So try vmalloc when kmalloc failed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mlx4: Use correct order of variables in log messageWengang Wang
There is a mis-order in mlx4 log. Fix it. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08Merge branch 'for-4.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "More change than I'd have liked at this stage. The pids controller and the changes made to cgroup core to support it introduced and revealed several important issues. - Assigning membership to a newly created task and migrating it can race leading to incorrect accounting. Oleg fixed it by widening threadgroup synchronization. It looks like we'll be able to merge it with a different percpu rwsem which is used in fork path making things simpler and cheaper. - The recent change to extend cgroup membership to zombies (so that pid accounting can extend till the pid is actually released) missed pinning the underlying data structures leading to use-after-free. Fixed. - v2 hierarchy was calling subsystem callbacks with the wrong target cgroup_subsys_state based on the incorrect assumption that they share the same target. pids is the first controller affected by this. Subsys callbacks updated so that they can deal with multi-target migrations" * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup_pids: don't account for the root cgroup cgroup: fix handling of multi-destination migration from subtree_control enabling cgroup_freezer: simplify propagation of CGROUP_FROZEN clearing in freezer_attach() cgroup: pids: kill pids_fork(), simplify pids_can_fork() and pids_cancel_fork() cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate() cgroup: make css_set pin its css's to avoid use-afer-free cgroup: fix cftype->file_offset handling
2015-12-08Merge branch 'for-4.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Nothing too interesting. All are device specific additions and workarounds" * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads libata-eh.c: Introduce new ata port flag for controller which lockup on read log page sata_sil: disable trim AHCI: Fix softreset failed issue of Port Multiplier sata/mvebu: use #ifdef around suspend/resume code ahci: Order SATA device IDs for codename Lewisburg ahci: Add Device ID for Intel Sunrise Point PCH
2015-12-08um: fix returns without va_endGeyslan G. Bem
When using va_list ensure that va_start will be followed by va_end. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08um: Fix fpstate handlingRichard Weinberger
The x86 FPU cleanup changed fpstate to a plain integer. UML on x86 has to deal with that too. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08arch: um: fix error when linking vmlinux.Lorenzo Colitti
On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with: arch/um/os-Linux/built-in.o: In function `os_timer_create': /android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create' arch/um/os-Linux/built-in.o: In function `os_timer_set_interval': /android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime' arch/um/os-Linux/built-in.o: In function `os_timer_remain': /android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime' arch/um/os-Linux/built-in.o: In function `os_timer_one_shot': /android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime' arch/um/os-Linux/built-in.o: In function `os_timer_disable': /android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime' This is because -lrt appears in the generated link commandline after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from arch/um/Makefile and adding it to the UM-specific section of scripts/link-vmlinux.sh. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08um: Fix get_signal() usageRichard Weinberger
If get_signal() returns us a signal to post we must not call it again, otherwise the already posted signal will be overridden. Before commit a610d6e672d this was the case as we stopped the while after a successful handle_signal(). Cc: <stable@vger.kernel.org> # 3.10- Fixes: a610d6e672d ("pull clearing RESTORE_SIGMASK into block_sigmask()") Signed-off-by: Richard Weinberger <richard@nod.at>