summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-09-26Linux 3.12.29v3.12.29Jiri Slaby
2014-09-26arm64: flush TLS registers during execWill Deacon
commit eb35bdd7bca29a13c8ecd44e6fd747a84ce675db upstream. Nathan reports that we leak TLS information from the parent context during an exec, as we don't clear the TLS registers when flushing the thread state. This patch updates the flushing code so that we: (1) Unconditionally zero the tpidr_el0 register (since this is fully context switched for native tasks and zeroed for compat tasks) (2) Zero the tp_value state in thread_info before clearing the tpidrr0_el0 register for compat tasks (since this is only writable by the set_tls compat syscall and therefore not fully switched). A missing compiler barrier is also added to the compat set_tls syscall. Acked-by: Nathan Lynch <Nathan_Lynch@mentor.com> Reported-by: Nathan Lynch <Nathan_Lynch@mentor.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26aio: add missing smp_rmb() in read_events_ringJeff Moyer
commit 2ff396be602f10b5eab8e73b24f20348fa2de159 upstream. We ran into a case on ppc64 running mariadb where io_getevents would return zeroed out I/O events. After adding instrumentation, it became clear that there was some missing synchronization between reading the tail pointer and the events themselves. This small patch fixes the problem in testing. Thanks to Zach for helping to look into this, and suggesting the fix. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26ibmveth: Fix endian issues with rx_no_buffer statisticAnton Blanchard
commit cbd5228199d8be45d895d9d0cc2b8ce53835fc21 upstream. Hidden away in the last 8 bytes of the buffer_list page is a solitary statistic. It needs to be byte swapped or else ethtool -S will produce numbers that terrify the user. Since we do this in multiple places, create a helper function with a comment explaining what is going on. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26ahci: add pcid for Marvel 0x9182 controllerMurali Karicheri
commit c5edfff9db6f4d2c35c802acb4abe0df178becee upstream. Keystone K2E EVM uses Marvel 0x9182 controller. This requires support for the ID in the ahci driver. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26ahci: Add Device IDs for Intel 9 Series PCHJames Ralston
commit 1b071a0947dbce5c184c12262e02540fbc493457 upstream. This patch adds the AHCI mode SATA Device IDs for the Intel 9 Series PCH. Signed-off-by: James Ralston <james.d.ralston@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26pata_scc: propagate return value of scc_wait_after_resetArjun Sreedharan
commit 4dc7c76cd500fa78c64adfda4b070b870a2b993c upstream. scc_bus_softreset not necessarily should return zero. Propagate the error code. Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-26libata: widen Crucial M550 blacklist matchingTejun Heo
commit 2a13772a144d2956a7fedd18685921d0a9b8b783 upstream. Crucial M550 may cause data corruption on queued trims and is blacklisted. The pattern used for it fails to match 1TB one as the capacity section will be four chars instead of three. Widen the pattern. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Charles Reiss <woggling@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81071 Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/radeon/TN: only enable bapm on MSI systemsAlex Deucher
commit 730a336c33a3398d65896e8ee3ef9f5679fe30a9 upstream. There still seem to be stability problems with other systems. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=72921 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/radeon: enable bapm by default on desktop TN/RL boardsAlex Deucher
commit 0c78a44964db3d483b0c09a8236e0fe123aa9cfc upstream. bapm enabled the GPU and CPU to share TDP headroom. It was disabled by default since some laptops hung when it was enabled in conjunction with dpm. It seems to be stable on desktop boards and fixes hangs on boot with dpm enabled on certain boards, so enable it by default on desktop boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=72921 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/i915: read HEAD register back in init_ring_common() to enforce orderingJiri Kosina
commit ece4a17d237a79f63fbfaf3f724a12b6d500555c upstream. Withtout this, ring initialization fails reliabily during resume with [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head ffffff8804 tail 00000000 start 000e4000 This is not a complete fix, but it is verified to make the ring initialization failures during resume much less likely. We were not able to root-cause this bug (likely HW-specific to Gen4 chips) yet. This is therefore used as a ducttape before problem is fully understood and proper fix created, so that people don't suffer from completely unusable systems in the meantime. The discussion and debugging is happening at https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/radeon: set VM base addr using the PFP v2Christian König
commit f1d2a26b506e9dc7bbe94fae40da0a0d8dcfacd0 upstream. Seems to make VM flushes more stable on SI and CIK. v2: only use the PFP on the GFX ring on CIK Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/radeon: load the lm63 driver for an lm64 thermal chip.Alex Deucher
commit 5dc355325b648dc9b4cf3bea4d968de46fd59215 upstream. Looks like the lm63 driver supports the lm64 as well. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/ttm: Pass GFP flags in order to avoid deadlock.Tetsuo Handa
commit a91576d7916f6cce76d30303e60e1ac47cf4a76d upstream. Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free() are currently doing GFP_KERNEL allocation. But these functions did not get updated to receive gfp_t argument. This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions, and removes the deadlock warning. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/ttm: Fix possible stack overflow by recursive shrinker calls.Tetsuo Handa
commit 71336e011d1d2312bcbcaa8fcec7365024f3a95d upstream. While ttm_dma_pool_shrink_scan() tries to take mutex before doing GFP_KERNEL allocation, ttm_pool_shrink_scan() does not do it. This can result in stack overflow if kmalloc() in ttm_page_pool_free() triggered recursion due to memory pressure. shrink_slab() => ttm_pool_shrink_scan() => ttm_page_pool_free() => kmalloc(GFP_KERNEL) => shrink_slab() => ttm_pool_shrink_scan() => ttm_page_pool_free() => kmalloc(GFP_KERNEL) Change ttm_pool_shrink_scan() to do like ttm_dma_pool_shrink_scan() does. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions.Tetsuo Handa
commit 22e71691fd54c637800d10816bbeba9cf132d218 upstream. I can observe that RHEL7 environment stalls with 100% CPU usage when a certain type of memory pressure is given. While the shrinker functions are called by shrink_slab() before the OOM killer is triggered, the stall lasts for many minutes. One of reasons of this stall is that ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with _manager->lock held causes someone (including kswapd) to deadlock when these functions are called due to memory pressure. This patch changes "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to avoid deadlock. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/ttm: Choose a pool to shrink correctly in ttm_dma_pool_shrink_scan().Tetsuo Handa
commit 46c2df68f03a236b30808bba361f10900c88d95e upstream. We can use "unsigned int" instead of "atomic_t" by updating start_pool variable under _manager->lock. This patch will make it possible to avoid skipping when choosing a pool to shrink in round-robin style, after next patch changes mutex_lock(_manager->lock) to !mutex_trylock(_manager->lork). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan().Tetsuo Handa
commit 11e504cc705e8ccb06ac93a276e11b5e8fee4d40 upstream. list_empty(&_manager->pools) being false before taking _manager->lock does not guarantee that _manager->npools != 0 after taking _manager->lock because _manager->npools is updated under _manager->lock. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: fix double kfreeGuido Martínez
commit c9a3ad25eddfdb898114a9d73cdb4c3472d9dfca upstream. display_timings_release calls kfree on the display_timings object passed to it. Calling kfree after it is wrong. SLUB debug showed the following warning: ============================================================================= BUG kmalloc-64 (Tainted: G W ): Object already free ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in of_get_display_timings+0x2c/0x214 age=601 cpu=0 pid=884 __slab_alloc.constprop.79+0x2e0/0x33c kmem_cache_alloc+0xac/0xdc of_get_display_timings+0x2c/0x214 panel_probe+0x7c/0x314 [tilcdc] platform_drv_probe+0x18/0x48 [..snip..] INFO: Freed in panel_destroy+0x18/0x3c [tilcdc] age=0 cpu=0 pid=907 __slab_free+0x34/0x330 panel_destroy+0x18/0x3c [tilcdc] tilcdc_unload+0xd0/0x118 [tilcdc] drm_dev_unregister+0x24/0x98 [..snip..] Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: fix release order on exitGuido Martínez
commit eb565a2bbadc6a5030a6dbe58db1aa52453e7edf upstream. Unregister resources in the correct order on tilcdc_drm_fini, which is the reverse order they were registered during tilcdc_drm_init. This also means unregistering the driver before releasing its resources. Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: panel: fix leak when unloading the moduleGuido Martínez
commit 3a49012224ca9016658a831a327ff6a7fe5bb4f9 upstream. The driver did not unregister the allocated framebuffer, which caused memory leaks (and memory manager WARNs) when unloading. Also, the framebuffer device under /dev still existed after unloading. Add a call to drm_fbdev_cma_fini when unloading the module to prevent both issues. Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: tfp410: fix dangling sysfs connector nodeGuido Martínez
commit 16dcbdef404f4e87dab985494381939fe0a2d456 upstream. Add a drm_sysfs_connector_remove call when we destroy the panel to make sure the connector node in sysfs gets deleted. This is required for proper unload and re-load of this driver, otherwise we will get a warning about a duplicate filename in sysfs. Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: slave: fix dangling sysfs connector nodeGuido Martínez
commit daa15b4cd1eee58eb1322062a3320b1dbe5dc96e upstream. Add a drm_sysfs_connector_remove call when we destroy the panel to make sure the connector node in sysfs gets deleted. This is required for proper unload and re-load of this driver as a module. Without this, we would get a warning at re-load time like so: tda998x 0-0070: found TDA19988 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 825 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x54/0x74() sysfs: cannot create duplicate filename '/class/drm/card0-HDMI-A-1' Modules linked in: [..] CPU: 0 PID: 825 Comm: modprobe Not tainted 3.15.0-rc4-00027-g9dcdef4 #82 [<c0013bb8>] (unwind_backtrace) from [<c0011824>] (show_stack+0x10/0x14) [<c0011824>] (show_stack) from [<c0034e8c>] (warn_slowpath_common+0x68/0x88) [<c0034e8c>] (warn_slowpath_common) from [<c0034edc>] (warn_slowpath_fmt+0x30/0x40) [<c0034edc>] (warn_slowpath_fmt) from [<c01243f4>] (sysfs_warn_dup+0x54/0x74) [<c01243f4>] (sysfs_warn_dup) from [<c0124708>] (sysfs_do_create_link_sd.isra.2+0xb0/0xb8) [<c0124708>] (sysfs_do_create_link_sd.isra.2) from [<c02ae37c>] (device_add+0x338/0x520) [<c02ae37c>] (device_add) from [<c02ae6e8>] (device_create_groups_vargs+0xa0/0xc4) [<c02ae6e8>] (device_create_groups_vargs) from [<c02ae758>] (device_create+0x24/0x2c) [<c02ae758>] (device_create) from [<c029b4ec>] (drm_sysfs_connector_add+0x64/0x204) [<c029b4ec>] (drm_sysfs_connector_add) from [<bf0b1b40>] (slave_modeset_init+0x120/0x1bc [tilcdc]) [<bf0b1b40>] (slave_modeset_init [tilcdc]) from [<bf0b2be8>] (tilcdc_load+0x214/0x4c0 [tilcdc]) [<bf0b2be8>] (tilcdc_load [tilcdc]) from [<c029955c>] (drm_dev_register+0xa4/0x104) [..snip..] ---[ end trace 4df8d614936ebdee ]--- [drm:drm_sysfs_connector_add] *ERROR* failed to register connector device: -17 Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18drm/tilcdc: panel: fix dangling sysfs connector nodeGuido Martínez
commit e396900e649b0af31161634d87fe37076f46c12b upstream. Add a drm_sysfs_connector_remove call when we destroy the panel to make sure the connector node in sysfs gets deleted. This is required for proper unload and re-load of this driver as a module. Without this, we would get a warning at re-load time like so: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 824 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x54/0x74() sysfs: cannot create duplicate filename '/class/drm/card0-LVDS-1' Modules linked in: [...] CPU: 0 PID: 824 Comm: modprobe Not tainted 3.15.0-rc4-00027-g6484f96-dirty #81 [<c0013bb8>] (unwind_backtrace) from [<c0011824>] (show_stack+0x10/0x14) [<c0011824>] (show_stack) from [<c0034e8c>] (warn_slowpath_common+0x68/0x88) [<c0034e8c>] (warn_slowpath_common) from [<c0034edc>] (warn_slowpath_fmt+0x30/0x40) [<c0034edc>] (warn_slowpath_fmt) from [<c01243f4>] (sysfs_warn_dup+0x54/0x74) [<c01243f4>] (sysfs_warn_dup) from [<c0124708>] (sysfs_do_create_link_sd.isra.2+0xb0/0xb8) [<c0124708>] (sysfs_do_create_link_sd.isra.2) from [<c02ae37c>] (device_add+0x338/0x520) [<c02ae37c>] (device_add) from [<c02ae6e8>] (device_create_groups_vargs+0xa0/0xc4) [<c02ae6e8>] (device_create_groups_vargs) from [<c02ae758>] (device_create+0x24/0x2c) [<c02ae758>] (device_create) from [<c029b4ec>] (drm_sysfs_connector_add+0x64/0x204) [<c029b4ec>] (drm_sysfs_connector_add) from [<bf0b1fec>] (panel_modeset_init+0xb8/0x134 [tilcdc]) [<bf0b1fec>] (panel_modeset_init [tilcdc]) from [<bf0b2bf0>] (tilcdc_load+0x214/0x4c0 [tilcdc]) [<bf0b2bf0>] (tilcdc_load [tilcdc]) from [<c029955c>] (drm_dev_register+0xa4/0x104) [ .. snip .. ] ---[ end trace b2d09cd9578b0497 ]--- [drm:drm_sysfs_connector_add] *ERROR* failed to register connector device: -17 Signed-off-by: Guido Martínez <guido@vanguardiasur.com.ar> Tested-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18carl9170: fix sending URBs with wrong type when using full-speedRonald Wahl
commit 671796dd96b6cd85b75fba9d3007bcf7e5f7c309 upstream. The driver assumes that endpoint 4 is always an interrupt endpoint. Unfortunately the type differs between high-speed and full-speed configurations while in the former case it is indeed an interrupt endpoint this is not true for the latter case - here it is a bulk endpoint. When sending URBs with the wrong type the kernel will generate a warning message including backtrace. In this specific case there will be a huge amount of warnings which can bring the system to freeze. To fix this we are now sending URBs to endpoint 4 using the type found in the endpoint descriptor. A side note: The carl9170 firmware currently specifies endpoint 4 as interrupt endpoint even in the full-speed configuration but this has no relevance because before this firmware is loaded the endpoint type is as described above and after the firmware is running the stick is not reenumerated and so the old descriptor is used. Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-18CIFS: Fix directory rename errorPavel Shilovsky
Commit a07d322059db66b84c9eb4f98959df468e88b34b upstream. CIFS servers process nlink counts differently for files and directories. In cifs_rename() if we the request fails on the existing target, we try to remove it through cifs_unlink() but this is not what we want to do for directories. As the result the following sequence of commands mkdir {1,2}; mv -T 1 2; rmdir {1,2}; mkdir {1,2}; echo foo > 2/bar and XFS test generic/023 fail with -ENOENT error. That's why the second mkdir reuses the existing inode (target inode of the mv -T command) with S_DEAD flag. Fix this by checking whether the target is directory or not and calling cifs_rmdir() rather than cifs_unlink() for directories. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17libceph: gracefully handle large reply messages from the monSage Weil
commit 73c3d4812b4c755efeca0140f606f83772a39ce4 upstream. We preallocate a few of the message types we get back from the mon. If we get a larger message than we are expecting, fall back to trying to allocate a new one instead of blindly using the one we have. Signed-off-by: Sage Weil <sage@redhat.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17IB/srp: Fix deadlock between host removal and multipathdBart Van Assche
commit bcc05910359183b431da92713e98eed478edf83a upstream. If scsi_remove_host() is invoked after a SCSI device has been blocked, if the fast_io_fail_tmo or dev_loss_tmo work gets scheduled on the workqueue executing srp_remove_work() and if an I/O request is scheduled after the SCSI device had been blocked by e.g. multipathd then the following deadlock can occur: kworker/6:1 D ffff880831f3c460 0 195 2 0x00000000 Call Trace: [<ffffffff814aafd9>] schedule+0x29/0x70 [<ffffffff814aa0ef>] schedule_timeout+0x10f/0x2a0 [<ffffffff8105af6f>] msleep+0x2f/0x40 [<ffffffff8123b0ae>] __blk_drain_queue+0x4e/0x180 [<ffffffff8123d2d5>] blk_cleanup_queue+0x225/0x230 [<ffffffffa0010732>] __scsi_remove_device+0x62/0xe0 [scsi_mod] [<ffffffffa000ed2f>] scsi_forget_host+0x6f/0x80 [scsi_mod] [<ffffffffa0002eba>] scsi_remove_host+0x7a/0x130 [scsi_mod] [<ffffffffa07cf5c5>] srp_remove_work+0x95/0x180 [ib_srp] [<ffffffff8106d7aa>] process_one_work+0x1ea/0x6c0 [<ffffffff8106dd9b>] worker_thread+0x11b/0x3a0 [<ffffffff810758bd>] kthread+0xed/0x110 [<ffffffff814b972c>] ret_from_fork+0x7c/0xb0 multipathd D ffff880096acc460 0 5340 1 0x00000000 Call Trace: [<ffffffff814aafd9>] schedule+0x29/0x70 [<ffffffff814aa0ef>] schedule_timeout+0x10f/0x2a0 [<ffffffff814ab79b>] io_schedule_timeout+0x9b/0xf0 [<ffffffff814abe1c>] wait_for_completion_io_timeout+0xdc/0x110 [<ffffffff81244b9b>] blk_execute_rq+0x9b/0x100 [<ffffffff8124f665>] sg_io+0x1a5/0x450 [<ffffffff8124fd21>] scsi_cmd_ioctl+0x2a1/0x430 [<ffffffff8124fef2>] scsi_cmd_blk_ioctl+0x42/0x50 [<ffffffffa00ec97e>] sd_ioctl+0xbe/0x140 [sd_mod] [<ffffffff8124bd04>] blkdev_ioctl+0x234/0x840 [<ffffffff811cb491>] block_ioctl+0x41/0x50 [<ffffffff811a0df0>] do_vfs_ioctl+0x300/0x520 [<ffffffff811a1051>] SyS_ioctl+0x41/0x80 [<ffffffff814b9962>] tracesys+0xd0/0xd5 Fix this by scheduling removal work on another workqueue than the transport layer timers. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: David Dillow <dave@thedillows.org> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17blkcg: don't call into policy draining if root_blkg is already goneTejun Heo
commit 2a1b4cf2331d92bc009bf94fa02a24604cdaf24c upstream. While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687bce42b ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Shirish Pargaonkar <spargaonkar@suse.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17mtd: nand: omap: Fix 1-bit Hamming code scheme, omap_calculate_ecc()Roger Quadros
commit 40ddbf5069bd4e11447c0088fc75318e0aac53f0 upstream. commit 65b97cf6b8de introduced in v3.7 caused a regression by using a reversed CS_MASK thus causing omap_calculate_ecc to always fail. As the NAND base driver never checks for .calculate()'s return value, the zeroed ECC values are used as is without showing any error to the user. However, this won't work and the NAND device won't be guarded by any error code. Fix the issue by using the correct mask. Code was tested on omap3beagle using the following procedure - flash the primary bootloader (MLO) from the kernel to the first NAND partition using nandwrite. - boot the board from NAND. This utilizes OMAP ROM loader that relies on 1-bit Hamming code ECC. Fixes: 65b97cf6b8de (mtd: nand: omap2: handle nand on gpmc) Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17mtd/ftl: fix the double free of the buffers allocated in build_maps()Kevin Hao
commit a152056c912db82860a8b4c23d0bd3a5aa89e363 upstream. I got the following panic on my fsl p5020ds board. Unable to handle kernel paging request for data at address 0x7375627379737465 Faulting instruction address: 0xc000000000100778 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=24 CoreNet Generic Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-next-20140613 #145 task: c0000000fe080000 ti: c0000000fe088000 task.ti: c0000000fe088000 NIP: c000000000100778 LR: c00000000010073c CTR: 0000000000000000 REGS: c0000000fe08aa00 TRAP: 0300 Not tainted (3.15.0-next-20140613) MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e24 XER: 00000000 DEAR: 7375627379737465 ESR: 0000000000000000 SOFTE: 1 GPR00: c0000000000c99b0 c0000000fe08ac80 c0000000009598e0 c0000000fe001d80 GPR04: 00000000000000d0 0000000000000913 c000000007902b20 0000000000000000 GPR08: c0000000feaae888 0000000000000000 0000000007091000 0000000000200200 GPR12: 0000000028ad2e28 c00000000fff4000 c0000000007abe08 0000000000000000 GPR16: c0000000007ab160 c0000000007aaf98 c00000000060ba68 c0000000007abda8 GPR20: c0000000007abde8 c0000000feaea6f8 c0000000feaea708 c0000000007abd10 GPR24: c000000000989370 c0000000008c6228 00000000000041ed c0000000fe00a400 GPR28: c00000000017c1cc 00000000000000d0 7375627379737465 c0000000fe001d80 NIP [c000000000100778] .__kmalloc_track_caller+0x70/0x168 LR [c00000000010073c] .__kmalloc_track_caller+0x34/0x168 Call Trace: [c0000000fe08ac80] [c00000000087e6b8] uevent_sock_list+0x0/0x10 (unreliable) [c0000000fe08ad20] [c0000000000c99b0] .kstrdup+0x44/0x90 [c0000000fe08adc0] [c00000000017c1cc] .__kernfs_new_node+0x4c/0x130 [c0000000fe08ae70] [c00000000017d7e4] .kernfs_new_node+0x2c/0x64 [c0000000fe08aef0] [c00000000017db00] .kernfs_create_dir_ns+0x34/0xc8 [c0000000fe08af80] [c00000000018067c] .sysfs_create_dir_ns+0x58/0xcc [c0000000fe08b010] [c0000000002c711c] .kobject_add_internal+0xc8/0x384 [c0000000fe08b0b0] [c0000000002c7644] .kobject_add+0x64/0xc8 [c0000000fe08b140] [c000000000355ebc] .device_add+0x11c/0x654 [c0000000fe08b200] [c0000000002b5988] .add_disk+0x20c/0x4b4 [c0000000fe08b2c0] [c0000000003a21d4] .add_mtd_blktrans_dev+0x340/0x514 [c0000000fe08b350] [c0000000003a3410] .mtdblock_add_mtd+0x74/0xb4 [c0000000fe08b3e0] [c0000000003a32cc] .blktrans_notify_add+0x64/0x94 [c0000000fe08b470] [c00000000039b5b4] .add_mtd_device+0x1d4/0x368 [c0000000fe08b520] [c00000000039b830] .mtd_device_parse_register+0xe8/0x104 [c0000000fe08b5c0] [c0000000003b8408] .of_flash_probe+0x72c/0x734 [c0000000fe08b750] [c00000000035ba40] .platform_drv_probe+0x38/0x84 [c0000000fe08b7d0] [c0000000003599a4] .really_probe+0xa4/0x29c [c0000000fe08b870] [c000000000359d3c] .__driver_attach+0x100/0x104 [c0000000fe08b900] [c00000000035746c] .bus_for_each_dev+0x84/0xe4 [c0000000fe08b9a0] [c0000000003593c0] .driver_attach+0x24/0x38 [c0000000fe08ba10] [c000000000358f24] .bus_add_driver+0x1c8/0x2ac [c0000000fe08bab0] [c00000000035a3a4] .driver_register+0x8c/0x158 [c0000000fe08bb30] [c00000000035b9f4] .__platform_driver_register+0x6c/0x80 [c0000000fe08bba0] [c00000000084e080] .of_flash_driver_init+0x1c/0x30 [c0000000fe08bc10] [c000000000001864] .do_one_initcall+0xbc/0x238 [c0000000fe08bd00] [c00000000082cdc0] .kernel_init_freeable+0x188/0x268 [c0000000fe08bdb0] [c0000000000020a0] .kernel_init+0x1c/0xf7c [c0000000fe08be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction dump: 41bd0010 480000c8 4bf04eb5 60000000 e94d0028 e93f0000 7cc95214 e8a60008 7fc9502a 2fbe0000 419e00c8 e93f0022 <7f7e482a> 39200000 88ed06b2 992d06b2 ---[ end trace b4c9a94804a42d40 ]--- It seems that the corrupted partition header on my mtd device triggers a bug in the ftl. In function build_maps() it will allocate the buffers needed by the mtd partition, but if something goes wrong such as kmalloc failure, mtd read error or invalid partition header parameter, it will free all allocated buffers and then return non-zero. In my case, it seems that partition header parameter 'NumTransferUnits' is invalid. And the ftl_freepart() is a function which free all the partition buffers allocated by build_maps(). Given the build_maps() is a self cleaning function, so there is no need to invoke this function even if build_maps() return with error. Otherwise it will causes the buffers to be freed twice and then weird things would happen. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Fix wrong restart readdir for SMB1Pavel Shilovsky
commit f736906a7669a77cf8cabdcbcf1dc8cb694e12ef upstream. The existing code calls server->ops->close() that is not right. This causes XFS test generic/310 to fail. Fix this by using server->ops->closedir() function. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Fix wrong filename length for SMB2Pavel Shilovsky
commit 1bbe4997b13de903c421c1cc78440e544b5f9064 upstream. The existing code uses the old MAX_NAME constant. This causes XFS test generic/013 to fail. Fix it by replacing MAX_NAME with PATH_MAX that SMB1 uses. Also remove an unused MAX_NAME constant definition. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Fix wrong directory attributes after renamePavel Shilovsky
commit b46799a8f28c43c5264ac8d8ffa28b311b557e03 upstream. When we requests rename we also need to update attributes of both source and target parent directories. Not doing it causes generic/309 xfstest to fail on SMB2 mounts. Fix this by marking these directories for force revalidating. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Possible null ptr deref in SMB2_tconSteve French
commit 18f39e7be0121317550d03e267e3ebd4dbfbb3ce upstream. As Raphael Geissert pointed out, tcon_error_exit can dereference tcon and there is one path in which tcon can be null. Signed-off-by: Steve French <smfrench@gmail.com> Reported-by: Raphael Geissert <geissert@debian.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Fix async reading on reconnectsPavel Shilovsky
commit 038bc961c31b070269ecd07349a7ee2e839d4fec upstream. If we get into read_into_pages() from cifs_readv_receive() and then loose a network, we issue cifs_reconnect that moves all mids to a private list and issue their callbacks. The callback of the async read request sets a mid to retry, frees it and wakes up a process that waits on the rdata completion. After the connection is established we return from read_into_pages() with a short read, use the mid that was freed before and try to read the remaining data from the a newly created socket. Both actions are not what we want to do. In reconnect cases (-EAGAIN) we should not mask off the error with a short read but should return the error code instead. Acked-by: Jeff Layton <jlayton@samba.org> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17CIFS: Fix STATUS_CANNOT_DELETE error mapping for SMB2Pavel Shilovsky
commit 21496687a79424572f46a84c690d331055f4866f upstream. The existing mapping causes unlink() call to return error after delete operation. Changing the mapping to -EACCES makes the client process the call like CIFS protocol does - reset dos attributes with ATTR_READONLY flag masked off and retry the operation. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17libceph: do not hard code max auth ticket lenIlya Dryomov
commit c27a3e4d667fdcad3db7b104f75659478e0c68d8 upstream. We hard code cephx auth ticket buffer size to 256 bytes. This isn't enough for any moderate setups and, in case tickets themselves are not encrypted, leads to buffer overflows (ceph_x_decrypt() errors out, but ceph_decode_copy() doesn't - it's just a memcpy() wrapper). Since the buffer is allocated dynamically anyway, allocated it a bit later, at the point where we know how much is going to be needed. Fixes: http://tracker.ceph.com/issues/8979 Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17libceph: add process_one_ticket() helperIlya Dryomov
commit 597cda357716a3cf8d994cb11927af917c8d71fa upstream. Add a helper for processing individual cephx auth tickets. Needed for the next commit, which deals with allocating ticket buffers. (Most of the diff here is whitespace - view with git diff -b). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctlyIlya Dryomov
commit 5f740d7e1531099b888410e6bab13f68da9b1a4d upstream. Determining ->last_piece based on the value of ->page_offset + length is incorrect because length here is the length of the entire message. ->last_piece set to false even if page array data item length is <= PAGE_SIZE, which results in invalid length passed to ceph_tcp_{send,recv}page() and causes various asserts to fire. # cat pages-cursor-init.sh #!/bin/bash rbd create --size 10 --image-format 2 foo FOO_DEV=$(rbd map foo) dd if=/dev/urandom of=$FOO_DEV bs=1M &>/dev/null rbd snap create foo@snap rbd snap protect foo@snap rbd clone foo@snap bar # rbd_resize calls librbd rbd_resize(), size is in bytes ./rbd_resize bar $(((4 << 20) + 512)) rbd resize --size 10 bar BAR_DEV=$(rbd map bar) # trigger a 512-byte copyup -- 512-byte page array data item dd if=/dev/urandom of=$BAR_DEV bs=1M count=1 seek=5 The problem exists only in ceph_msg_data_pages_cursor_init(), ceph_msg_data_pages_advance() does the right thing. The size_t cast is unnecessary. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17xfs: don't zero partial page cache pages during O_DIRECT writesChris Mason
commit 85e584da3212140ee80fd047f9058bbee0bc00d5 upstream. xfs is using truncate_pagecache_range to invalidate the page cache during DIO reads. This is different from the other filesystems who only invalidate pages during DIO writes. truncate_pagecache_range is meant to be used when we are freeing the underlying data structs from disk, so it will zero any partial ranges in the page. This means a DIO read can zero out part of the page cache page, and it is possible the page will stay in cache. buffered reads will find an up to date page with zeros instead of the data actually on disk. This patch fixes things by using invalidate_inode_pages2_range instead. It preserves the page cache invalidation, but won't zero any pages. [dchinner: catch error and warn if it fails. Comment.] Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17xfs: don't zero partial page cache pages during O_DIRECT writesDave Chinner
commit 834ffca6f7e345a79f6f2e2d131b0dfba8a4b67a upstream. Similar to direct IO reads, direct IO writes are using truncate_pagecache_range to invalidate the page cache. This is incorrect due to the sub-block zeroing in the page cache that truncate_pagecache_range() triggers. This patch fixes things by using invalidate_inode_pages2_range instead. It preserves the page cache invalidation, but won't zero any pages. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17xfs: don't dirty buffers beyond EOFDave Chinner
commit 22e757a49cf010703fcb9c9b4ef793248c39b0c2 upstream. generic/263 is failing fsx at this point with a page spanning EOF that cannot be invalidated. The operations are: 1190 mapwrite 0x52c00 thru 0x5e569 (0xb96a bytes) 1191 mapread 0x5c000 thru 0x5d636 (0x1637 bytes) 1192 write 0x5b600 thru 0x771ff (0x1bc00 bytes) where 1190 extents EOF from 0x54000 to 0x5e569. When the direct IO write attempts to invalidate the cached page over this range, it fails with -EBUSY and so any attempt to do page invalidation fails. The real question is this: Why can't that page be invalidated after it has been written to disk and cleaned? Well, there's data on the first two buffers in the page (1k block size, 4k page), but the third buffer on the page (i.e. beyond EOF) is failing drop_buffers because it's bh->b_state == 0x3, which is BH_Uptodate | BH_Dirty. IOWs, there's dirty buffers beyond EOF. Say what? OK, set_buffer_dirty() is called on all buffers from __set_page_buffers_dirty(), regardless of whether the buffer is beyond EOF or not, which means that when we get to ->writepage, we have buffers marked dirty beyond EOF that we need to clean. So, we need to implement our own .set_page_dirty method that doesn't dirty buffers beyond EOF. This is messy because the buffer code is not meant to be shared and it has interesting locking issues on the buffer dirty bits. So just copy and paste it and then modify it to suit what we need. Note: the solutions the other filesystems and generic block code use of marking the buffers clean in ->writepage does not work for XFS. It still leaves dirty buffers beyond EOF and invalidations still fail. Hence rather than play whack-a-mole, this patch simply prevents those buffers from being dirtied in the first place. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17xfs: quotacheck leaves dquot buffers without verifiersDave Chinner
commit 5fd364fee81a7888af806e42ed8a91c845894f2d upstream. When running xfs/305, I noticed that quotacheck was flushing dquot buffers that did not have the xfs_dquot_buf_ops verifiers attached: XFS (vdb): _xfs_buf_ioapply: no ops on block 0x1dc8/0x1dc8 ffff880052489000: 44 51 01 04 00 00 65 b8 00 00 00 00 00 00 00 00 DQ....e......... ffff880052489010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ffff880052489020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ffff880052489030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ CPU: 1 PID: 2376 Comm: mount Not tainted 3.16.0-rc2-dgc+ #306 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88006fe38000 ffff88004a0ffae8 ffffffff81cf1cca 0000000000000001 ffff88004a0ffb88 ffffffff814d50ca 000010004a0ffc70 0000000000000000 ffff88006be56dc4 0000000000000021 0000000000001dc8 ffff88007c773d80 Call Trace: [<ffffffff81cf1cca>] dump_stack+0x45/0x56 [<ffffffff814d50ca>] _xfs_buf_ioapply+0x3ca/0x3d0 [<ffffffff810db520>] ? wake_up_state+0x20/0x20 [<ffffffff814d51f5>] ? xfs_bdstrat_cb+0x55/0xb0 [<ffffffff814d513b>] xfs_buf_iorequest+0x6b/0xd0 [<ffffffff814d51f5>] xfs_bdstrat_cb+0x55/0xb0 [<ffffffff814d53ab>] __xfs_buf_delwri_submit+0x15b/0x220 [<ffffffff814d6040>] ? xfs_buf_delwri_submit+0x30/0x90 [<ffffffff814d6040>] xfs_buf_delwri_submit+0x30/0x90 [<ffffffff8150f89d>] xfs_qm_quotacheck+0x17d/0x3c0 [<ffffffff81510591>] xfs_qm_mount_quotas+0x151/0x1e0 [<ffffffff814ed01c>] xfs_mountfs+0x56c/0x7d0 [<ffffffff814f0f12>] xfs_fs_fill_super+0x2c2/0x340 [<ffffffff811c9fe4>] mount_bdev+0x194/0x1d0 [<ffffffff814f0c50>] ? xfs_finish_flags+0x170/0x170 [<ffffffff814ef0f5>] xfs_fs_mount+0x15/0x20 [<ffffffff811ca8c9>] mount_fs+0x39/0x1b0 [<ffffffff811e4d67>] vfs_kern_mount+0x67/0x120 [<ffffffff811e757e>] do_mount+0x23e/0xad0 [<ffffffff8117abde>] ? __get_free_pages+0xe/0x50 [<ffffffff811e71e6>] ? copy_mount_options+0x36/0x150 [<ffffffff811e8103>] SyS_mount+0x83/0xc0 [<ffffffff81cfd40b>] tracesys+0xdd/0xe2 This was caused by dquot buffer readahead not attaching a verifier structure to the buffer when readahead was issued, resulting in the followup read of the buffer finding a valid buffer and so not attaching new verifiers to the buffer as part of the read. Also, when a verifier failure occurs, we then read the buffer without verifiers. Attach the verifiers manually after this read so that if the buffer is then written it will be verified that the corruption has been repaired. Further, when flushing a dquot we don't ask for a verifier when reading in the dquot buffer the dquot belongs to. Most of the time this isn't an issue because the buffer is still cached, but when it is not cached it will result in writing the dquot buffer without having the verfier attached. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17xfs: ensure verifiers are attached to recovered buffersDave Chinner
commit 67dc288c21064b31a98a53dc64f6b9714b819fd6 upstream. Crash testing of CRC enabled filesystems has resulted in a number of reports of bad CRCs being detected after the filesystem was mounted. Errors such as the following were being seen: XFS (sdb3): Mounting V5 Filesystem XFS (sdb3): Starting recovery (logdev: internal) XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1 XFS (sdb3): Unmount and run xfs_repair XFS (sdb3): First 64 bytes of corrupted metadata buffer: ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40 XAGF...........@ ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01 ..mS..w......... ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03 ................ ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00 ................ XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1 The errors were typically being seen in AGF, AGI and their related btree block buffers some time after log recovery had run. Often it wasn't until later subsequent mounts that the problem was discovered. The common symptom was a buffer with the correct contents, but a CRC and an LSN that matched an older version of the contents. Some debug added to _xfs_buf_ioapply() indicated that buffers were being written without verifiers attached to them from log recovery, and Jan Kara isolated the cause to log recovery readahead an dit's interactions with buffers that had a more recent LSN on disk than the transaction being recovered. In this case, the buffer did not get a verifier attached, and os when the second phase of log recovery ran and recovered EFIs and unlinked inodes, the buffers were modified and written without the verifier running. Hence they had up to date contents, but stale LSNs and CRCs. Fix it by attaching verifiers to buffers we skip due to future LSN values so they don't escape into the buffer cache without the correct verifier attached. This patch is based on analysis and a patch from Jan Kara. Reported-by: Jan Kara <jack@suse.cz> Reported-by: Fanael Linithien <fanael4@gmail.com> Reported-by: Grozdan <neutrino8@gmail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17RDMA/uapi: Include socket.h in rdma_user_cm.hDoug Ledford
commit db1044d458a287c18c4d413adc4ad12e92e253b5 upstream. added struct sockaddr_storage to rdma_user_cm.h without also adding an include for linux/socket.h to make sure it is defined. Systemtap needs the header files to build standalone and cannot rely on other files to pre-include other headers, so add linux/socket.h to the list of includes in this file. Fixes: ee7aed4528f ("RDMA/ucma: Support querying for AF_IB addresses") Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17RDMA/iwcm: Use a default listen backlog if neededSteve Wise
commit 2f0304d21867476394cd51a54e97f7273d112261 upstream. If the user creates a listening cm_id with backlog of 0 the IWCM ends up not allowing any connection requests at all. The correct behavior is for the IWCM to pick a default value if the user backlog parameter is zero. Lustre from version 1.8.8 onward uses a backlog of 0, which breaks iwarp support without this fix. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17md/raid10: Fix memory leak when raid10 reshape completes.NeilBrown
commit b39685526f46976bcd13aa08c82480092befa46c upstream. When a raid10 commences a resync/recovery/reshape it allocates some buffer space. When a resync/recovery completes the buffer space is freed. But not when the reshape completes. This can result in a small memory leak. There is a subtle side-effect of this bug. When a RAID10 is reshaped to a larger array (more devices), the reshape is immediately followed by a "resync" of the new space. This "resync" will use the buffer space which was allocated for "reshape". This can cause problems including a "BUG" in the SCSI layer. So this is suitable for -stable. Fixes: 3ea7daa5d7fde47cd41f4d56c2deb949114da9d6 Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17md/raid10: fix memory leak when reshaping a RAID10.NeilBrown
commit ce0b0a46955d1bb389684a2605dbcaa990ba0154 upstream. raid10 reshape clears unwanted bits from a bio->bi_flags using a method which, while clumsy, worked until 3.10 when BIO_OWNS_VEC was added. Since then it clears that bit but shouldn't. This results in a memory leak. So change to used the approved method of clearing unwanted bits. As this causes a memory leak which can consume all of memory the fix is suitable for -stable. Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd Reported-by: mdraid.pkoch@dfgh.net (Peter Koch) Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2014-09-17md/raid6: avoid data corruption during recovery of double-degraded RAID6NeilBrown
commit 9c4bdf697c39805078392d5ddbbba5ae5680e0dd upstream. During recovery of a double-degraded RAID6 it is possible for some blocks not to be recovered properly, leading to corruption. If a write happens to one block in a stripe that would be written to a missing device, and at the same time that stripe is recovering data to the other missing device, then that recovered data may not be written. This patch skips, in the double-degraded case, an optimisation that is only safe for single-degraded arrays. Bug was introduced in 2.6.32 and fix is suitable for any kernel since then. In an older kernel with separate handle_stripe5() and handle_stripe6() functions the patch must change handle_stripe6(). Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8 Cc: Yuri Tikhonov <yur@emcraft.com> Cc: Dan Williams <dan.j.williams@intel.com> Reported-by: "Manibalan P" <pmanibalan@amiindia.co.in> Tested-by: "Manibalan P" <pmanibalan@amiindia.co.in> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423 Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>