summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-02-24media: sti: bdisp: fix a possible sleep-in-atomic-context bug in ↵Jia-Ju Bai
bdisp_device_run() [ Upstream commit bb6d42061a05d71dd73f620582d9e09c8fbf7f5b ] The driver may sleep while holding a spinlock. The function call path (from bottom to top) in Linux 4.19 is: drivers/media/platform/sti/bdisp/bdisp-hw.c, 385: msleep in bdisp_hw_reset drivers/media/platform/sti/bdisp/bdisp-v4l2.c, 341: bdisp_hw_reset in bdisp_device_run drivers/media/platform/sti/bdisp/bdisp-v4l2.c, 317: _raw_spin_lock_irqsave in bdisp_device_run To fix this bug, msleep() is replaced with udelay(). This bug is found by a static analysis tool STCheck written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Fabien Dessenne <fabien.dessenne@st.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24char/random: silence a lockdep splat with printk()Sergey Senozhatsky
[ Upstream commit 1b710b1b10eff9d46666064ea25f079f70bc67a8 ] Sergey didn't like the locking order, uart_port->lock -> tty_port->lock uart_write (uart_port->lock) __uart_start pl011_start_tx pl011_tx_chars uart_write_wakeup tty_port_tty_wakeup tty_port_default tty_port_tty_get (tty_port->lock) but those code is so old, and I have no clue how to de-couple it after checking other locks in the splat. There is an onging effort to make all printk() as deferred, so until that happens, workaround it for now as a short-term fix. LTP: starting iogen01 (export LTPROOT; rwtest -N iogen01 -i 120s -s read,write -Da -Dv -n 2 500b:$TMPDIR/doio.f1.$$ 1000b:$TMPDIR/doio.f2.$$) WARNING: possible circular locking dependency detected ------------------------------------------------------ doio/49441 is trying to acquire lock: ffff008b7cff7290 (&(&zone->lock)->rlock){..-.}, at: rmqueue+0x138/0x2050 but task is already holding lock: 60ff000822352818 (&pool->lock/1){-.-.}, at: start_flush_work+0xd8/0x3f0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&pool->lock/1){-.-.}: lock_acquire+0x320/0x360 _raw_spin_lock+0x64/0x80 __queue_work+0x4b4/0xa10 queue_work_on+0xac/0x11c tty_schedule_flip+0x84/0xbc tty_flip_buffer_push+0x1c/0x28 pty_write+0x98/0xd0 n_tty_write+0x450/0x60c tty_write+0x338/0x474 __vfs_write+0x88/0x214 vfs_write+0x12c/0x1a4 redirected_tty_write+0x90/0xdc do_loop_readv_writev+0x140/0x180 do_iter_write+0xe0/0x10c vfs_writev+0x134/0x1cc do_writev+0xbc/0x130 __arm64_sys_writev+0x58/0x8c el0_svc_handler+0x170/0x240 el0_sync_handler+0x150/0x250 el0_sync+0x164/0x180 -> #3 (&(&port->lock)->rlock){-.-.}: lock_acquire+0x320/0x360 _raw_spin_lock_irqsave+0x7c/0x9c tty_port_tty_get+0x24/0x60 tty_port_default_wakeup+0x1c/0x3c tty_port_tty_wakeup+0x34/0x40 uart_write_wakeup+0x28/0x44 pl011_tx_chars+0x1b8/0x270 pl011_start_tx+0x24/0x70 __uart_start+0x5c/0x68 uart_write+0x164/0x1c8 do_output_char+0x33c/0x348 n_tty_write+0x4bc/0x60c tty_write+0x338/0x474 redirected_tty_write+0xc0/0xdc do_loop_readv_writev+0x140/0x180 do_iter_write+0xe0/0x10c vfs_writev+0x134/0x1cc do_writev+0xbc/0x130 __arm64_sys_writev+0x58/0x8c el0_svc_handler+0x170/0x240 el0_sync_handler+0x150/0x250 el0_sync+0x164/0x180 -> #2 (&port_lock_key){-.-.}: lock_acquire+0x320/0x360 _raw_spin_lock+0x64/0x80 pl011_console_write+0xec/0x2cc console_unlock+0x794/0x96c vprintk_emit+0x260/0x31c vprintk_default+0x54/0x7c vprintk_func+0x218/0x254 printk+0x7c/0xa4 register_console+0x734/0x7b0 uart_add_one_port+0x734/0x834 pl011_register_port+0x6c/0xac sbsa_uart_probe+0x234/0x2ec platform_drv_probe+0xd4/0x124 really_probe+0x250/0x71c driver_probe_device+0xb4/0x200 __device_attach_driver+0xd8/0x188 bus_for_each_drv+0xbc/0x110 __device_attach+0x120/0x220 device_initial_probe+0x20/0x2c bus_probe_device+0x54/0x100 device_add+0xae8/0xc2c platform_device_add+0x278/0x3b8 platform_device_register_full+0x238/0x2ac acpi_create_platform_device+0x2dc/0x3a8 acpi_bus_attach+0x390/0x3cc acpi_bus_attach+0x108/0x3cc acpi_bus_attach+0x108/0x3cc acpi_bus_attach+0x108/0x3cc acpi_bus_scan+0x7c/0xb0 acpi_scan_init+0xe4/0x304 acpi_init+0x100/0x114 do_one_initcall+0x348/0x6a0 do_initcall_level+0x190/0x1fc do_basic_setup+0x34/0x4c kernel_init_freeable+0x19c/0x260 kernel_init+0x18/0x338 ret_from_fork+0x10/0x18 -> #1 (console_owner){-...}: lock_acquire+0x320/0x360 console_lock_spinning_enable+0x6c/0x7c console_unlock+0x4f8/0x96c vprintk_emit+0x260/0x31c vprintk_default+0x54/0x7c vprintk_func+0x218/0x254 printk+0x7c/0xa4 get_random_u64+0x1c4/0x1dc shuffle_pick_tail+0x40/0xac __free_one_page+0x424/0x710 free_one_page+0x70/0x120 __free_pages_ok+0x61c/0xa94 __free_pages_core+0x1bc/0x294 memblock_free_pages+0x38/0x48 __free_pages_memory+0xcc/0xfc __free_memory_core+0x70/0x78 free_low_memory_core_early+0x148/0x18c memblock_free_all+0x18/0x54 mem_init+0xb4/0x17c mm_init+0x14/0x38 start_kernel+0x19c/0x530 -> #0 (&(&zone->lock)->rlock){..-.}: validate_chain+0xf6c/0x2e2c __lock_acquire+0x868/0xc2c lock_acquire+0x320/0x360 _raw_spin_lock+0x64/0x80 rmqueue+0x138/0x2050 get_page_from_freelist+0x474/0x688 __alloc_pages_nodemask+0x3b4/0x18dc alloc_pages_current+0xd0/0xe0 alloc_slab_page+0x2b4/0x5e0 new_slab+0xc8/0x6bc ___slab_alloc+0x3b8/0x640 kmem_cache_alloc+0x4b4/0x588 __debug_object_init+0x778/0x8b4 debug_object_init_on_stack+0x40/0x50 start_flush_work+0x16c/0x3f0 __flush_work+0xb8/0x124 flush_work+0x20/0x30 xlog_cil_force_lsn+0x88/0x204 [xfs] xfs_log_force_lsn+0x128/0x1b8 [xfs] xfs_file_fsync+0x3c4/0x488 [xfs] vfs_fsync_range+0xb0/0xd0 generic_write_sync+0x80/0xa0 [xfs] xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs] xfs_file_write_iter+0x1a0/0x218 [xfs] __vfs_write+0x1cc/0x214 vfs_write+0x12c/0x1a4 ksys_write+0xb0/0x120 __arm64_sys_write+0x54/0x88 el0_svc_handler+0x170/0x240 el0_sync_handler+0x150/0x250 el0_sync+0x164/0x180 other info that might help us debug this: Chain exists of: &(&zone->lock)->rlock --> &(&port->lock)->rlock --> &pool->lock/1 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&pool->lock/1); lock(&(&port->lock)->rlock); lock(&pool->lock/1); lock(&(&zone->lock)->rlock); *** DEADLOCK *** 4 locks held by doio/49441: #0: a0ff00886fc27408 (sb_writers#8){.+.+}, at: vfs_write+0x118/0x1a4 #1: 8fff00080810dfe0 (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0x2a8/0x300 [xfs] #2: ffff9000129f2390 (rcu_read_lock){....}, at: rcu_lock_acquire+0x8/0x38 #3: 60ff000822352818 (&pool->lock/1){-.-.}, at: start_flush_work+0xd8/0x3f0 stack backtrace: CPU: 48 PID: 49441 Comm: doio Tainted: G W Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019 Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xe8/0x150 print_circular_bug+0x368/0x380 check_noncircular+0x28c/0x294 validate_chain+0xf6c/0x2e2c __lock_acquire+0x868/0xc2c lock_acquire+0x320/0x360 _raw_spin_lock+0x64/0x80 rmqueue+0x138/0x2050 get_page_from_freelist+0x474/0x688 __alloc_pages_nodemask+0x3b4/0x18dc alloc_pages_current+0xd0/0xe0 alloc_slab_page+0x2b4/0x5e0 new_slab+0xc8/0x6bc ___slab_alloc+0x3b8/0x640 kmem_cache_alloc+0x4b4/0x588 __debug_object_init+0x778/0x8b4 debug_object_init_on_stack+0x40/0x50 start_flush_work+0x16c/0x3f0 __flush_work+0xb8/0x124 flush_work+0x20/0x30 xlog_cil_force_lsn+0x88/0x204 [xfs] xfs_log_force_lsn+0x128/0x1b8 [xfs] xfs_file_fsync+0x3c4/0x488 [xfs] vfs_fsync_range+0xb0/0xd0 generic_write_sync+0x80/0xa0 [xfs] xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs] xfs_file_write_iter+0x1a0/0x218 [xfs] __vfs_write+0x1cc/0x214 vfs_write+0x12c/0x1a4 ksys_write+0xb0/0x120 __arm64_sys_write+0x54/0x88 el0_svc_handler+0x170/0x240 el0_sync_handler+0x150/0x250 el0_sync+0x164/0x180 Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/1573679785-21068-1-git-send-email-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24x86/fpu: Deactivate FPU state after failure during state loadSebastian Andrzej Siewior
[ Upstream commit bbc55341b9c67645d1a5471506370caf7dd4a203 ] In __fpu__restore_sig(), fpu_fpregs_owner_ctx needs to be reset if the FPU state was not fully restored. Otherwise the following may happen (on the same CPU): Task A Task B fpu_fpregs_owner_ctx *active* A.fpu __fpu__restore_sig() ctx switch load B.fpu *active* B.fpu fpregs_lock() copy_user_to_fpregs_zeroing() copy_kernel_to_xregs() *modify* copy_user_to_xregs() *fails* fpregs_unlock() ctx switch skip loading B.fpu, *active* B.fpu In the success case, fpu_fpregs_owner_ctx is set to the current task. In the failure case, the FPU state might have been modified by loading the init state. In this case, fpu_fpregs_owner_ctx needs to be reset in order to ensure that the FPU state of the following task is loaded from saved state (and not skipped because it was the previous state). Reset fpu_fpregs_owner_ctx after a failure during restore occurred, to ensure that the FPU state for the next task is always loaded. The problem was debugged-by Yu-cheng Yu <yu-cheng.yu@intel.com>. [ bp: Massage commit message. ] Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace") Reported-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191220195906.plk6kpmsrikvbcfn@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24iommu/vt-d: Fix off-by-one in PASID allocationJacob Pan
[ Upstream commit 39d630e332144028f56abba83d94291978e72df1 ] PASID allocator uses IDR which is exclusive for the end of the allocation range. There is no need to decrement pasid_max. Fixes: af39507305fb ("iommu/vt-d: Apply global PASID in SVA") Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in ↵Jia-Ju Bai
grgpio_irq_map/unmap() [ Upstream commit e36eaf94be8f7bc4e686246eed3cf92d845e2ef8 ] The driver may sleep while holding a spinlock. The function call path (from bottom to top) in Linux 4.19 is: drivers/gpio/gpio-grgpio.c, 261: request_irq in grgpio_irq_map drivers/gpio/gpio-grgpio.c, 255: _raw_spin_lock_irqsave in grgpio_irq_map drivers/gpio/gpio-grgpio.c, 318: free_irq in grgpio_irq_unmap drivers/gpio/gpio-grgpio.c, 299: _raw_spin_lock_irqsave in grgpio_irq_unmap request_irq() and free_irq() can sleep at runtime. To fix these bugs, request_irq() and free_irq() are called without holding the spinlock. These bugs are found by a static analysis tool STCheck written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Link: https://lore.kernel.org/r/20191218132605.10594-1-baijiaju1990@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24clk: meson: meson8b: make the CCF use the glitch-free mali muxMartin Blumenstingl
[ Upstream commit 8daeaea99caabe24a0929fac17977ebfb882fa86 ] The "mali_0" or "mali_1" clock trees should not be updated while the clock is running. Enforce this by setting CLK_SET_RATE_GATE on the "mali_0" and "mali_1" gates. This makes the CCF switch to the "mali_1" tree when "mali_0" is currently active and vice versa, which is exactly what the vendor driver does when updating the frequency of the mali clock. This fixes a potential hang when changing the GPU frequency at runtime. Fixes: 74e1f2521f16ff ("clk: meson: meson8b: add the GPU clock tree") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE numberOliver O'Halloran
[ Upstream commit 3b5b9997b331e77ce967eba2c4bc80dc3134a7fe ] On pseries there is a bug with adding hotplugged devices to an IOMMU group. For a number of dumb reasons fixing that bug first requires re-working how VFs are configured on PowerNV. For background, on PowerNV we use the pcibios_sriov_enable() hook to do two things: 1. Create a pci_dn structure for each of the VFs, and 2. Configure the PHB's internal BARs so the MMIO range for each VF maps to a unique PE. Roughly speaking a PE is the hardware counterpart to a Linux IOMMU group since all the devices in a PE share the same IOMMU table. A PE also defines the set of devices that should be isolated in response to a PCI error (i.e. bad DMA, UR/CA, AER events, etc). When isolated all MMIO and DMA traffic to and from devicein the PE is blocked by the root complex until the PE is recovered by the OS. The requirement to block MMIO causes a giant headache because the P8 PHB generally uses a fixed mapping between MMIO addresses and PEs. As a result we need to delay configuring the IOMMU groups for device until after MMIO resources are assigned. For physical devices (i.e. non-VFs) the PE assignment is done in pcibios_setup_bridge() which is called immediately after the MMIO resources for downstream devices (and the bridge's windows) are assigned. For VFs the setup is more complicated because: a) pcibios_setup_bridge() is not called again when VFs are activated, and b) The pci_dev for VFs are created by generic code which runs after pcibios_sriov_enable() is called. The work around for this is a two step process: 1. A fixup in pcibios_add_device() is used to initialised the cached pe_number in pci_dn, then 2. A bus notifier then adds the device to the IOMMU group for the PE specified in pci_dn->pe_number. A side effect fixing the pseries bug mentioned in the first paragraph is moving the fixup out of pcibios_add_device() and into pcibios_bus_add_device(), which is called much later. This results in step 2. failing because pci_dn->pe_number won't be initialised when the bus notifier is run. We can fix this by removing the need for the fixup. The PE for a VF is known before the VF is even scanned so we can initialise pci_dn->pe_number pcibios_sriov_enable() instead. Unfortunately, moving the initialisation causes two problems: 1. We trip the WARN_ON() in the current fixup code, and 2. The EEH core clears pdn->pe_number when recovering a VF and relies on the fixup to correctly re-set it. The only justification for either of these is a comment in eeh_rmv_device() suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order for the VF to be scanned. However, this comment appears to have no basis in reality. Both bugs can be fixed by just deleting the code. Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191028085424.12006-1-oohall@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24clk: at91: sam9x60: fix programmable clock prescalerEugen Hristev
[ Upstream commit 66d9f5214c9ba1c151478f99520b6817302d50dc ] The prescaler works as parent rate divided by (PRES + 1) (is_pres_direct == 1) It does not work in the way of parent rate shifted to the right by (PRES + 1), which means division by 2^(PRES + 1) (is_pres_direct == 0) Thus is_pres_direct must be enabled for this SoC, to make the right computation. This field was added in commit 45b06682113b ("clk: at91: fix programmable clock for sama5d2") SAM9X60 has the same field as SAMA5D2 in the PCK Fixes: 01e2113de9a5 ("clk: at91: add sam9x60 pmc driver") Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com> Link: https://lkml.kernel.org/r/1575977088-16781-1-git-send-email-eugen.hristev@microchip.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: sun4i-csi: Fix [HV]sync polarity handlingChen-Yu Tsai
[ Upstream commit 1948dcf0f928b8bcdca57ca3fba8545ba380fc29 ] The Allwinner camera sensor interface has a different definition of [HV]sync. While the timing diagram uses the names HSYNC and VSYNC, the note following the diagram and register names use HREF and VREF. Combined they imply the hardware uses either [HV]REF or inverted [HV]SYNC. There are also registers to set horizontal skip lengths in pixels and vertical skip lengths in lines, also known as back porches. Fix the polarity handling by using the opposite polarity flag for the checks. Also rename `[hv]sync_pol` to `[hv]ref_pol` to better match the hardware register description. Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: sun4i-csi: Fix data sampling polarity handlingChen-Yu Tsai
[ Upstream commit cf9e6d5dbdd56ef2aa72f28c806711c4293c8848 ] The CLK_POL field specifies whether data is sampled on the falling or rising edge of PCLK, not whether the data lines are active high or low. Evidence of this can be found in the timing diagram labeled "horizontal size setting and pixel clock timing". Fix the setting by checking the correct flag, V4L2_MBUS_PCLK_SAMPLE_RISING. While at it, reorder the three polarity flag checks so HSYNC and VSYNC are grouped together. Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: sun4i-csi: Deal with DRAM offsetChen-Yu Tsai
[ Upstream commit 249b286171fa9c358e8d5c825b48c4ebea97c498 ] On Allwinner SoCs, some high memory bandwidth devices do DMA directly over the memory bus (called MBUS), instead of the system bus. These devices include the CSI camera sensor interface, video (codec) engine, display subsystem, etc.. The memory bus has a different addressing scheme without the DRAM starting offset. Deal with this using the "interconnects" property from the device tree, or if that is not available, set dev->dma_pfn_offset to PHYS_PFN_OFFSET. Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: i2c: mt9v032: fix enum mbus codes and frame sizesEugen Hristev
[ Upstream commit 1451d5ae351d938a0ab1677498c893f17b9ee21d ] This driver supports both the mt9v032 (color) and the mt9v022 (mono) sensors. Depending on which sensor is used, the format from the sensor is different. The format.code inside the dev struct holds this information. The enum mbus and enum frame sizes need to take into account both type of sensors, not just the color one. To solve this, use the format.code in these functions instead of the hardcoded bayer color format (which is only used for mt9v032). [Sakari Ailus: rewrapped commit message] Suggested-by: Wenyou Yang <wenyou.yang@microchip.com> Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: ov5640: Fix check for PLL1 exceeding max allowed rateAdam Ford
[ Upstream commit 2e3df204f9af42a47823ee955c08950373417420 ] The variable _rate is by ov5640_compute_sys_clk() which returns zero if the PLL exceeds 1GHz. Unfortunately, the check to see if the max PLL1 output is checking 'rate' and not '_rate' and 'rate' does not ever appear to be 0. This patch changes the check against the returned value of '_rate' to determine if the PLL1 output exceeds 1GHz. Fixes: aa2882481cad ("media: ov5640: Adjust the clock based on the expected rate") Signed-off-by: Adam Ford <aford173@gmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24pxa168fb: Fix the function used to release some memory in an error handling pathChristophe JAILLET
[ Upstream commit 3c911fe799d1c338d94b78e7182ad452c37af897 ] In the probe function, some resources are allocated using 'dma_alloc_wc()', they should be released with 'dma_free_wc()', not 'dma_free_coherent()'. We already use 'dma_free_wc()' in the remove function, but not in the error handling path of the probe function. Also, remove a useless 'PAGE_ALIGN()'. 'info->fix.smem_len' is already PAGE_ALIGNed. Fixes: 638772c7553f ("fb: add support of LCD display controller on pxa168/910 (base layer)") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Lubomir Rintel <lkundrak@v3.sk> CC: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190831100024.3248-1-christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24drm/msm/adreno: fix zap vs no-zap handlingRob Clark
[ Upstream commit 15ab987c423df561e0949d77fb5043921ae59956 ] We can have two cases, when it comes to "zap" fw. Either the fw requires zap fw to take the GPU out of secure mode at boot, or it does not and we can write RBBM_SECVID_TRUST_CNTL directly. Previously we decided based on whether zap fw load succeeded, but this is not a great plan because: 1) we could have zap fw in the filesystem on a device where it is not required 2) we could have the inverse case Instead, shift to deciding based on whether we have a 'zap-shader' node in dt. In practice, there is only one device (currently) with upstream dt that does not use zap (cheza), and it already has a /delete-node/ for the zap-shader node. Fixes: abccb9fe3267 ("drm/msm/a6xx: Add zap shader load") Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24drm/mipi_dbi: Fix off-by-one bugs in mipi_dbi_blank()Geert Uytterhoeven
[ Upstream commit 2ce18249af5a28031b3f909cfafccc88ea966c9d ] When configuring the frame memory window, the last column and row numbers are written to the column resp. page address registers. These numbers are thus one less than the actual window width resp. height. While this is handled correctly in mipi_dbi_fb_dirty() since commit 03ceb1c8dfd1e293 ("drm/tinydrm: Fix setting of the column/page end addresses."), it is not in mipi_dbi_blank(). The latter still forgets to subtract one when calculating the most significant bytes of the column and row numbers, thus programming wrong values when the display width or height is a multiple of 256. Fixes: 02dd95fe31693626 ("drm/tinydrm: Add MIPI DBI support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191230130604.31006-1-geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24printk: fix exclusive_console replayingJohn Ogness
[ Upstream commit def97da136515cb289a14729292c193e0a93bc64 ] Commit f92b070f2dc8 ("printk: Do not miss new messages when replaying the log") introduced a new variable @exclusive_console_stop_seq to store when an exclusive console should stop printing. It should be set to the @console_seq value at registration. However, @console_seq is previously set to @syslog_seq so that the exclusive console knows where to begin. This results in the exclusive console immediately reactivating all the other consoles and thus repeating the messages for those consoles. Set @console_seq after @exclusive_console_stop_seq has stored the current @console_seq value. Fixes: f92b070f2dc8 ("printk: Do not miss new messages when replaying the log") Link: http://lkml.kernel.org/r/20191219115322.31160-1-john.ogness@linutronix.de Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24pinctrl: sh-pfc: sh7264: Fix CAN function GPIOsGeert Uytterhoeven
[ Upstream commit 55b1cb1f03ad5eea39897d0c74035e02deddcff2 ] pinmux_func_gpios[] contains a hole due to the missing function GPIO definition for the "CTX0&CTX1" signal, which is the logical "AND" of the two CAN outputs. Fix this by: - Renaming CRX0_CRX1_MARK to CTX0_CTX1_MARK, as PJ2MD[2:0]=010 configures the combined "CTX0&CTX1" output signal, - Renaming CRX0X1_MARK to CRX0_CRX1_MARK, as PJ3MD[1:0]=10 configures the shared "CRX0/CRX1" input signal, which is fed to both CAN inputs, - Adding the missing function GPIO definition for "CTX0&CTX1" to pinmux_func_gpios[], - Moving all CAN enums next to each other. See SH7262 Group, SH7264 Group User's Manual: Hardware, Rev. 4.00: [1] Figure 1.2 (3) (Pin Assignment for the SH7264 Group (1-Mbyte Version), [2] Figure 1.2 (4) Pin Assignment for the SH7264 Group (640-Kbyte Version, [3] Table 1.4 List of Pins, [4] Figure 20.29 Connection Example when Using This Module as 1-Channel Module (64 Mailboxes x 1 Channel), [5] Table 32.10 Multiplexed Pins (Port J), [6] Section 32.2.30 (3) Port J Control Register 0 (PJCR0). Note that the last 2 disagree about PJ2MD[2:0], which is probably the root cause of this bug. But considering [4], "CTx0&CTx1" in [5] must be correct, and "CRx0&CRx1" in [6] must be wrong. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20191218194812.12741-4-geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24gianfar: Fix TX timestamping with a stacked DSA driverVladimir Oltean
[ Upstream commit c26a2c2ddc0115eb088873f5c309cf46b982f522 ] The driver wrongly assumes that it is the only entity that can set the SKBTX_IN_PROGRESS bit of the current skb. Therefore, in the gfar_clean_tx_ring function, where the TX timestamp is collected if necessary, the aforementioned bit is used to discriminate whether or not the TX timestamp should be delivered to the socket's error queue. But a stacked driver such as a DSA switch can also set the SKBTX_IN_PROGRESS bit, which is actually exactly what it should do in order to denote that the hardware timestamping process is undergoing. Therefore, gianfar would misinterpret the "in progress" bit as being its own, and deliver a second skb clone in the socket's error queue, completely throwing off a PTP process which is not expecting to receive it, _even though_ TX timestamping is not enabled for gianfar. There have been discussions [0] as to whether non-MAC drivers need or not to set SKBTX_IN_PROGRESS at all (whose purpose is to avoid sending 2 timestamps, a sw and a hw one, to applications which only expect one). But as of this patch, there are at least 2 PTP drivers that would break in conjunction with gianfar: the sja1105 DSA switch and the felix switch, by way of its ocelot core driver. So regardless of that conclusion, fix the gianfar driver to not do stuff based on flags set by others and not intended for it. [0]: https://www.spinics.net/lists/netdev/msg619699.html Fixes: f0ee7acfcdd4 ("gianfar: Add hardware TX timestamping support") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24ALSA: ctl: allow TLV read operation for callback type of element in locked caseTakashi Sakamoto
[ Upstream commit d61fe22c2ae42d9fd76c34ef4224064cca4b04b0 ] A design of ALSA control core allows applications to execute three operations for TLV feature; read, write and command. Furthermore, it allows driver developers to process the operations by two ways; allocated array or callback function. In the former, read operation is just allowed, thus developers uses the latter when device driver supports variety of models or the target model is expected to dynamically change information stored in TLV container. The core also allows applications to lock any element so that the other applications can't perform write operation to the element for element value and TLV information. When the element is locked, write and command operation for TLV information are prohibited as well as element value. Any read operation should be allowed in the case. At present, when an element has callback function for TLV information, TLV read operation returns EPERM if the element is locked. On the other hand, the read operation is success when an element has allocated array for TLV information. In both cases, read operation is success for element value expectedly. This commit fixes the bug. This change can be backported to v4.14 kernel or later. Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Reviewed-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20191223093347.15279-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAITRitesh Harjani
[ Upstream commit f629afe3369e9885fd6e9cc7a4f514b6a65cf9e9 ] Apparently our current rwsem code doesn't like doing the trylock, then lock for real scheme. So change our dax read/write methods to just do the trylock for the RWF_NOWAIT case. This seems to fix AIM7 regression in some scalable filesystems upto ~25% in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression") Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Matthew Bobrowski <mbobrowski@mbobrowski.org> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/20191212055557.11151-2-riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24leds: pca963x: Fix open-drain initializationZahari Petkov
[ Upstream commit 697529091ac7a0a90ca349b914bb30641c13c753 ] Before commit bb29b9cccd95 ("leds: pca963x: Add bindings to invert polarity") Mode register 2 was initialized directly with either 0x01 or 0x05 for open-drain or totem pole (push-pull) configuration. Afterwards, MODE2 initialization started using bitwise operations on top of the default MODE2 register value (0x05). Using bitwise OR for setting OUTDRV with 0x01 and 0x05 does not produce correct results. When open-drain is used, instead of setting OUTDRV to 0, the driver keeps it as 1: Open-drain: 0x05 | 0x01 -> 0x05 (0b101 - incorrect) Totem pole: 0x05 | 0x05 -> 0x05 (0b101 - correct but still wrong) Now OUTDRV setting uses correct bitwise operations for initialization: Open-drain: 0x05 & ~0x04 -> 0x01 (0b001 - correct) Totem pole: 0x05 | 0x04 -> 0x05 (0b101 - correct) Additional MODE2 register definitions are introduced now as well. Fixes: bb29b9cccd95 ("leds: pca963x: Add bindings to invert polarity") Signed-off-by: Zahari Petkov <zahari@balena.io> Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24drm/amd/display: Map ODM memory correctly when doing ODM combineNikola Cornij
[ Upstream commit ec5b356c58941bb8930858155d9ce14ceb3d30a0 ] [why] Up to 4 ODM memory pieces are required per ODM combine and cannot overlap, i.e. each ODM "session" has to use its own memory pieces. The ODM-memory mapping is currently broken for generic case. The maximum number of memory pieces is ASIC-dependent, but it's always big enough to satisfy maximum number of ODM combines. Memory pieces are mapped as a bit-map, i.e. one memory piece corresponds to one bit. The OPTC doing ODM needs to select memory pieces by setting the corresponding bits, making sure there's no overlap with other OPTC instances that might be doing ODM. The current mapping works only for OPTC instance indexes smaller than 3. For instance indexes 3 and up it practically maps no ODM memory, causing black, gray or white screen in display configs that include ODM on OPTC instance 3 or up. [how] Statically map two unique ODM memory pieces for each OPTC instance and piece them together when programming ODM combine mode. Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24PCI: Fix pci_add_dma_alias() bitmask sizeJames Sewart
[ Upstream commit f8bf2aeb651b3460a4b36fd7ba1ba1d31777d35c ] The number of possible devfns is 256, but pci_add_dma_alias() allocated a bitmap of size 255. Fix this off-by-one error. This fixes commits 338c3149a221 ("PCI: Add support for multiple DMA aliases") and c6635792737b ("PCI: Allocate dma_alias_mask with bitmap_zalloc()"), but I doubt it was possible to see a problem because it takes 4 64-bit longs (or 8 32-bit longs) to hold 255 bits, and bitmap_zalloc() doesn't save the 255-bit size anywhere. [bhelgaas: commit log, move #define to drivers/pci/pci.h, include loop limit fix from Qian Cai: https://lore.kernel.org/r/20191218170004.5297-1-cai@lca.pw] Signed-off-by: James Sewart <jamessewart@arista.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24brcmfmac: Fix use after free in brcmf_sdio_readframes()Dan Carpenter
[ Upstream commit 216b44000ada87a63891a8214c347e05a4aea8fe ] The brcmu_pkt_buf_free_skb() function frees "pkt" so it leads to a static checker warning: drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:1974 brcmf_sdio_readframes() error: dereferencing freed memory 'pkt' It looks like there was supposed to be a continue after we free "pkt". Fixes: 4754fceeb9a6 ("brcmfmac: streamline SDIO read frame routine") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24brcmfmac: Fix memory leak in brcmf_p2p_create_p2pdev()Navid Emamdoost
[ Upstream commit 5cc509aa83c6acd2c5cd94f99065c39d2bd0a490 ] In the implementation of brcmf_p2p_create_p2pdev() the allocated memory for p2p_vif is leaked when the mac address is the same as primary interface. To fix this, go to error path to release p2p_vif via brcmf_free_vif(). Fixes: cb746e47837a ("brcmfmac: check p2pdev mac address uniqueness") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24cpu/hotplug, stop_machine: Fix stop_machine vs hotplug orderPeter Zijlstra
[ Upstream commit 45178ac0cea853fe0e405bf11e101bdebea57b15 ] Paul reported a very sporadic, rcutorture induced, workqueue failure. When the planets align, the workqueue rescuer's self-migrate fails and then triggers a WARN for running a work on the wrong CPU. Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call could be ignored! When stopper->enabled is false, stop_machine will insta complete the work, without actually doing the work. Worse, it will not WARN about this (we really should fix this). It turns out there is a small window where a freshly online'ed CPU is marked 'online' but doesn't yet have the stopper task running: BP AP bringup_cpu() __cpu_up(cpu, idle) --> start_secondary() ... cpu_startup_entry() bringup_wait_for_ap() wait_for_ap_thread() <-- cpuhp_online_idle() while (1) do_idle() ... available to run kthreads ... stop_machine_unpark() stopper->enable = true; Close this by moving the stop_machine_unpark() into cpuhp_online_idle(), such that the stopper thread is ready before we start the idle loop and schedule. Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Debugged-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24clk: meson: pll: Fix by 0 division in __pll_params_to_rate()Remi Pommarel
[ Upstream commit d8488a41800d9f5c80bc0d17b9cc2c91b4841464 ] Some meson pll registers can be initialized with 0 as N value, introducing the following division by 0 when computing rate : UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9 division by zero CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty #400 Call trace: dump_backtrace+0x0/0x1c0 show_stack+0x14/0x20 dump_stack+0xc4/0x100 ubsan_epilogue+0x14/0x68 __ubsan_handle_divrem_overflow+0x98/0xb8 __pll_params_to_rate+0xdc/0x140 meson_clk_pll_recalc_rate+0x278/0x3a0 __clk_register+0x7c8/0xbb0 devm_clk_hw_register+0x54/0xc0 meson_eeclkc_probe+0xf4/0x1a0 platform_drv_probe+0x54/0xd8 really_probe+0x16c/0x438 driver_probe_device+0xb0/0xf0 device_driver_attach+0x94/0xa0 __driver_attach+0x70/0x108 bus_for_each_dev+0xd8/0x128 driver_attach+0x30/0x40 bus_add_driver+0x1b0/0x2d8 driver_register+0xbc/0x1d0 __platform_driver_register+0x78/0x88 axg_driver_init+0x18/0x20 do_one_initcall+0xc8/0x24c kernel_init_freeable+0x2b0/0x344 kernel_init+0x10/0x128 ret_from_fork+0x10/0x18 This checks if N is null before doing the division. Fixes: 7a29a869434e ("clk: meson: Add support for Meson clock controller") Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Remi Pommarel <repk@triplefau.lt> [jbrunet@baylibre.com: update the comment in above the fix] Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24media: meson: add missing allocation failure check on new_bufColin Ian King
[ Upstream commit 11e0e167d071a28288a7a0a211d48c571d19b56f ] Currently if the allocation of new_buf fails then a null pointer dereference occurs when assiging new_buf->vb. Avoid this by returning early on a memory allocation failure as there is not much more can be done at this point. Addresses-Coverity: ("Dereference null return") Fixes: 3e7f51bd9607 ("media: meson: add v4l2 m2m video decoder driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24f2fs: call f2fs_balance_fs outside of locked pageJaegeuk Kim
[ Upstream commit bdf03299248916640a835a05d32841bb3d31912d ] Otherwise, we can hit deadlock by waiting for the locked page in move_data_block in GC. Thread A Thread B - do_page_mkwrite - f2fs_vm_page_mkwrite - lock_page - f2fs_balance_fs - mutex_lock(gc_mutex) - f2fs_gc - do_garbage_collect - ra_data_block - grab_cache_page - f2fs_balance_fs - mutex_lock(gc_mutex) Fixes: 39a8695824510 ("f2fs: refactor ->page_mkwrite() flow") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24f2fs: preallocate DIO blocks when forcing buffered_ioJaegeuk Kim
[ Upstream commit 47501f87c61ad2aa234add63e1ae231521dbc3f5 ] The previous preallocation and DIO decision like below. allow_outplace_dio !allow_outplace_dio f2fs_force_buffered_io (*) No_Prealloc / Buffered_IO Prealloc / Buffered_IO !f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIO But, Javier reported Case (*) where zoned device bypassed preallocation but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data being read. In order to fix the issue, actually we need to preallocate blocks whenever we fall back to buffered IO like this. No change is made in the other cases. allow_outplace_dio !allow_outplace_dio f2fs_force_buffered_io (*) Prealloc / Buffered_IO Prealloc / Buffered_IO !f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIO Reported-and-tested-by: Javier Gonzalez <javier@javigon.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Javier González <javier@javigon.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24rcu: Fix data-race due to atomic_t copy-by-valueMarco Elver
[ Upstream commit 6cf539a87a61a4fbc43f625267dbcbcf283872ed ] This fixes a data-race where `atomic_t dynticks` is copied by value. The copy is performed non-atomically, resulting in a data-race if `dynticks` is updated concurrently. This data-race was found with KCSAN: ================================================================== BUG: KCSAN: data-race in dyntick_save_progress_counter / rcu_irq_enter write to 0xffff989dbdbe98e0 of 4 bytes by task 10 on cpu 3: atomic_add_return include/asm-generic/atomic-instrumented.h:78 [inline] rcu_dynticks_snap kernel/rcu/tree.c:310 [inline] dyntick_save_progress_counter+0x43/0x1b0 kernel/rcu/tree.c:984 force_qs_rnp+0x183/0x200 kernel/rcu/tree.c:2286 rcu_gp_fqs kernel/rcu/tree.c:1601 [inline] rcu_gp_fqs_loop+0x71/0x880 kernel/rcu/tree.c:1653 rcu_gp_kthread+0x22c/0x3b0 kernel/rcu/tree.c:1799 kthread+0x1b5/0x200 kernel/kthread.c:255 <snip> read to 0xffff989dbdbe98e0 of 4 bytes by task 154 on cpu 7: rcu_nmi_enter_common kernel/rcu/tree.c:828 [inline] rcu_irq_enter+0xda/0x240 kernel/rcu/tree.c:870 irq_enter+0x5/0x50 kernel/softirq.c:347 <snip> Reported by Kernel Concurrency Sanitizer on: CPU: 7 PID: 154 Comm: kworker/7:1H Not tainted 5.3.0+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Workqueue: kblockd blk_mq_run_work_fn ================================================================== Signed-off-by: Marco Elver <elver@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24rcu: Fix missed wakeup of exp_wq waitersNeeraj Upadhyay
[ Upstream commit fd6bc19d7676a060a171d1cf3dcbf6fd797eb05f ] Tasks waiting within exp_funnel_lock() for an expedited grace period to elapse can be starved due to the following sequence of events: 1. Tasks A and B both attempt to start an expedited grace period at about the same time. This grace period will have completed when the lower four bits of the rcu_state structure's ->expedited_sequence field are 0b'0100', for example, when the initial value of this counter is zero. Task A wins, and thus does the actual work of starting the grace period, including acquiring the rcu_state structure's .exp_mutex and sets the counter to 0b'0001'. 2. Because task B lost the race to start the grace period, it waits on ->expedited_sequence to reach 0b'0100' inside of exp_funnel_lock(). This task therefore blocks on the rcu_node structure's ->exp_wq[1] field, keeping in mind that the end-of-grace-period value of ->expedited_sequence (0b'0100') is shifted down two bits before indexing the ->exp_wq[] field. 3. Task C attempts to start another expedited grace period, but blocks on ->exp_mutex, which is still held by Task A. 4. The aforementioned expedited grace period completes, so that ->expedited_sequence now has the value 0b'0100'. A kworker task therefore acquires the rcu_state structure's ->exp_wake_mutex and starts awakening any tasks waiting for this grace period. 5. One of the first tasks awakened happens to be Task A. Task A therefore releases the rcu_state structure's ->exp_mutex, which allows Task C to start the next expedited grace period, which causes the lower four bits of the rcu_state structure's ->expedited_sequence field to become 0b'0101'. 6. Task C's expedited grace period completes, so that the lower four bits of the rcu_state structure's ->expedited_sequence field now become 0b'1000'. 7. The kworker task from step 4 above continues its wakeups. Unfortunately, the wake_up_all() refetches the rcu_state structure's .expedited_sequence field: wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rcu_state.expedited_sequence) & 0x3]); This results in the wakeup being applied to the rcu_node structure's ->exp_wq[2] field, which is unfortunate given that Task B is instead waiting on ->exp_wq[1]. On a busy system, no harm is done (or at least no permanent harm is done). Some later expedited grace period will redo the wakeup. But on a quiet system, such as many embedded systems, it might be a good long time before there was another expedited grace period. On such embedded systems, this situation could therefore result in a system hang. This issue manifested as DPM device timeout during suspend (which usually qualifies as a quiet time) due to a SCSI device being stuck in _synchronize_rcu_expedited(), with the following stack trace: schedule() synchronize_rcu_expedited() synchronize_rcu() scsi_device_quiesce() scsi_bus_suspend() dpm_run_callback() __device_suspend() This commit therefore prevents such delays, timeouts, and hangs by making rcu_exp_wait_wake() use its "s" argument consistently instead of refetching from rcu_state.expedited_sequence. Fixes: 3b5f668e715b ("rcu: Overlap wakeups with next expedited grace period") Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24rcu/nocb: Fix dump_tree hierarchy print always activeStefan Reiter
[ Upstream commit 610dea36d3083a977e4f156206cbe1eaa2a532f0 ] Commit 18cd8c93e69e ("rcu/nocb: Print gp/cb kthread hierarchy if dump_tree") added print statements to rcu_organize_nocb_kthreads for debugging, but incorrectly guarded them, causing the function to always spew out its message. This patch fixes it by guarding both pr_alert statements with dump_tree, while also changing the second pr_alert to a pr_cont, to print the hierarchy in a single line (assuming that's how it was supposed to work). Fixes: 18cd8c93e69e ("rcu/nocb: Print gp/cb kthread hierarchy if dump_tree") Signed-off-by: Stefan Reiter <stefan@pimaker.at> [ paulmck: Make single-nocbs-CPU GP kthreads look less erroneous. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24drm/qxl: Complete exception handling in qxl_device_init()Markus Elfring
[ Upstream commit dbe3ad61dcebc49fe3efca70a0f752a95b4600f2 ] A coccicheck run provided information like the following. drivers/gpu/drm/qxl/qxl_kms.c:295:1-7: ERROR: missing iounmap; ioremap on line 178 and execution via conditional on line 185 Generated by: scripts/coccinelle/free/iounmap.cocci A jump target was specified in an if branch. The corresponding function call did not release the desired system resource then. Thus use the label “rom_unmap” instead to fix the exception handling for this function implementation. Fixes: 5043348a4969ae1661c008efe929abd0d76e3792 ("drm: qxl: Fix error handling at qxl_device_init") Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: http://patchwork.freedesktop.org/patch/msgid/5e5ef9c4-4d85-3c93-cf28-42cfcb5b0649@web.de Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24wil6210: fix break that is never reached because of zero'ing of a retry counterColin Ian King
[ Upstream commit 5b1413f00b5beb9f5fed94e43ea0c497d5db9633 ] There is a check on the retry counter invalid_buf_id_retry that is always false because invalid_buf_id_retry is initialized to zero on each iteration of a while-loop. Fix this by initializing the retry counter before the while-loop starts. Addresses-Coverity: ("Logically dead code") Fixes: b4a967b7d0f5 ("wil6210: reset buff id in status message after completion") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24ath10k: Fix qmi init error handlingJeffrey Hugo
[ Upstream commit f8a595a87e93a33a10879f4b856be818d2f53c84 ] When ath10k_qmi_init() fails, the error handling does not free the irq resources, which causes an issue if we EPROBE_DEFER as we'll attempt to (re-)register irqs which are already registered. Fix this by doing a power off since we just powered on the hardware, and freeing the irqs as error handling. Fixes: ba94c753ccb4 ("ath10k: add QMI message handshake for wcn3990 client") Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24drm/gma500: Fixup fbdev stolen size usage evaluationPaul Kocialkowski
[ Upstream commit fd1a5e521c3c083bb43ea731aae0f8b95f12b9bd ] psbfb_probe performs an evaluation of the required size from the stolen GTT memory, but gets it wrong in two distinct ways: - The resulting size must be page-size-aligned; - The size to allocate is derived from the surface dimensions, not the fb dimensions. When two connectors are connected with different modes, the smallest will be stored in the fb dimensions, but the size that needs to be allocated must match the largest (surface) dimensions. This is what is used in the actual allocation code. Fix this by correcting the evaluation to conform to the two points above. It allows correctly switching to 16bpp when one connector is e.g. 1920x1080 and the other is 1024x768. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191107153048.843881-1-paul.kocialkowski@bootlin.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24net/sched: flower: add missing validation of TCA_FLOWER_FLAGSDavide Caratti
[ Upstream commit e2debf0852c4d66ba1a8bde12869b196094c70a7 ] unlike other classifiers that can be offloaded (i.e. users can set flags like 'skip_hw' and 'skip_sw'), 'cls_flower' doesn't validate the size of netlink attribute 'TCA_FLOWER_FLAGS' provided by user: add a proper entry to fl_policy. Fixes: 5b33f48842fa ("net/flower: Introduce hardware offload support") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGSDavide Caratti
[ Upstream commit 1afa3cc90f8fb745c777884d79eaa1001d6927a6 ] unlike other classifiers that can be offloaded (i.e. users can set flags like 'skip_hw' and 'skip_sw'), 'cls_matchall' doesn't validate the size of netlink attribute 'TCA_MATCHALL_FLAGS' provided by user: add a proper entry to mall_policy. Fixes: b87f7936a932 ("net/sched: Add match-all classifier hw offloading.") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24net: dsa: tag_qca: Make sure there is headroom for tagPer Forlin
[ Upstream commit 04fb91243a853dbde216d829c79d9632e52aa8d9 ] Passing tag size to skb_cow_head will make sure there is enough headroom for the tag data. This change does not introduce any overhead in case there is already available headroom for tag. Signed-off-by: Per Forlin <perfn@axis.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24net/smc: fix leak of kernel memory to user spaceEric Dumazet
[ Upstream commit 457fed775c97ac2c0cd1672aaf2ff2c8a6235e87 ] As nlmsg_put() does not clear the memory that is reserved, it this the caller responsability to make sure all of this memory will be written, in order to not reveal prior content. While we are at it, we can provide the socket cookie even if clsock is not set. syzbot reported : BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline] BUG: KMSAN: uninit-value in __swab32p include/uapi/linux/swab.h:179 [inline] BUG: KMSAN: uninit-value in __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline] BUG: KMSAN: uninit-value in get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline] BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline] BUG: KMSAN: uninit-value in ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline] BUG: KMSAN: uninit-value in bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252 CPU: 1 PID: 5262 Comm: syz-executor.5 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] __fswab32 include/uapi/linux/swab.h:59 [inline] __swab32p include/uapi/linux/swab.h:179 [inline] __be32_to_cpup include/uapi/linux/byteorder/little_endian.h:82 [inline] get_unaligned_be32 include/linux/unaligned/access_ok.h:30 [inline] ____bpf_skb_load_helper_32 net/core/filter.c:240 [inline] ____bpf_skb_load_helper_32_no_cache net/core/filter.c:255 [inline] bpf_skb_load_helper_32_no_cache+0x14a/0x390 net/core/filter.c:252 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_kmalloc_large+0x73/0xc0 mm/kmsan/kmsan_hooks.c:128 kmalloc_large_node_hook mm/slub.c:1406 [inline] kmalloc_large_node+0x282/0x2c0 mm/slub.c:3841 __kmalloc_node_track_caller+0x44b/0x1200 mm/slub.c:4368 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] netlink_dump+0x44b/0x1ab0 net/netlink/af_netlink.c:2224 __netlink_dump_start+0xbb2/0xcf0 net/netlink/af_netlink.c:2352 netlink_dump_start include/linux/netlink.h:233 [inline] smc_diag_handler_dump+0x2ba/0x300 net/smc/smc_diag.c:242 sock_diag_rcv_msg+0x211/0x610 net/core/sock_diag.c:256 netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] kernel_sendmsg+0x433/0x440 net/socket.c:679 sock_no_sendpage+0x235/0x300 net/core/sock.c:2740 kernel_sendpage net/socket.c:3776 [inline] sock_sendpage+0x1e1/0x2c0 net/socket.c:937 pipe_to_sendpage+0x38c/0x4c0 fs/splice.c:458 splice_from_pipe_feed fs/splice.c:512 [inline] __splice_from_pipe+0x539/0xed0 fs/splice.c:636 splice_from_pipe fs/splice.c:671 [inline] generic_splice_sendpage+0x1d5/0x2d0 fs/splice.c:844 do_splice_from fs/splice.c:863 [inline] do_splice fs/splice.c:1170 [inline] __do_sys_splice fs/splice.c:1447 [inline] __se_sys_splice+0x2380/0x3350 fs/splice.c:1427 __x64_sys_splice+0x6e/0x90 fs/splice.c:1427 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24enic: prevent waking up stopped tx queues over watchdog resetFiro Yang
[ Upstream commit 0f90522591fd09dd201065c53ebefdfe3c6b55cb ] Recent months, our customer reported several kernel crashes all preceding with following message: NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out Error message of one of those crashes: BUG: unable to handle kernel paging request at ffffffffa007e090 After analyzing severl vmcores, I found that most of crashes are caused by memory corruption. And all the corrupted memory areas are overwritten by data of network packets. Moreover, I also found that the tx queues were enabled over watchdog reset. After going through the source code, I found that in enic_stop(), the tx queues stopped by netif_tx_disable() could be woken up over a small time window between netif_tx_disable() and the napi_disable() by the following code path: napi_poll-> enic_poll_msix_wq-> vnic_cq_service-> enic_wq_service-> netif_wake_subqueue(enic->netdev, q_number)-> test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state) In turn, upper netowrk stack could queue skb to ENIC NIC though enic_hard_start_xmit(). And this might introduce some race condition. Our customer comfirmed that this kind of kernel crash doesn't occur over 90 days since they applied this patch. Signed-off-by: Firo Yang <firo.yang@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24core: Don't skip generic XDP program execution for cloned SKBsToke Høiland-Jørgensen
[ Upstream commit ad1e03b2b3d4430baaa109b77bc308dc73050de3 ] The current generic XDP handler skips execution of XDP programs entirely if an SKB is marked as cloned. This leads to some surprising behaviour, as packets can end up being cloned in various ways, which will make an XDP program not see all the traffic on an interface. This was discovered by a simple test case where an XDP program that always returns XDP_DROP is installed on a veth device. When combining this with the Scapy packet sniffer (which uses an AF_PACKET) socket on the sending side, SKBs reliably end up in the cloned state, causing them to be passed through to the receiving interface instead of being dropped. A minimal reproducer script for this is included below. This patch fixed the issue by simply triggering the existing linearisation code for cloned SKBs instead of skipping the XDP program execution. This behaviour is in line with the behaviour of the native XDP implementation for the veth driver, which will reallocate and copy the SKB data if the SKB is marked as shared. Reproducer Python script (requires BCC and Scapy): from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP from bcc import BPF import time, sys, subprocess, shlex SKB_MODE = (1 << 1) DRV_MODE = (1 << 2) PYTHON=sys.executable def client(): time.sleep(2) # Sniffing on the sender causes skb_cloned() to be set s = AsyncSniffer() s.start() for p in range(10): sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"), verbose=False) time.sleep(0.1) s.stop() return 0 def server(mode): prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}") func = prog.load_func("dummy_drop", BPF.XDP) prog.attach_xdp("a_to_b", func, mode) time.sleep(1) s = sniff(iface="a_to_b", count=10, timeout=15) if len(s): print(f"Got {len(s)} packets - should have gotten 0") return 1 else: print("Got no packets - as expected") return 0 if len(sys.argv) < 2: print(f"Usage: {sys.argv[0]} <skb|drv>") sys.exit(1) if sys.argv[1] == "client": sys.exit(client()) elif sys.argv[1] == "server": mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE sys.exit(server(mode)) else: try: mode = sys.argv[1] if mode not in ('skb', 'drv'): print(f"Usage: {sys.argv[0]} <skb|drv>") sys.exit(1) print(f"Running in {mode} mode") for cmd in [ 'ip netns add netns_a', 'ip netns add netns_b', 'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b', # Disable ipv6 to make sure there's no address autoconf traffic 'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1', 'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1', 'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa', 'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc', 'ip -n netns_a link set dev a_to_b up', 'ip -n netns_b link set dev b_to_a up']: subprocess.check_call(shlex.split(cmd)) server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}")) client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client")) client.wait() server.wait() sys.exit(server.returncode) finally: subprocess.run(shlex.split("ip netns delete netns_a")) subprocess.run(shlex.split("ip netns delete netns_b")) Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices") Reported-by: Stepan Horacek <shoracek@redhat.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-19Linux 5.4.21Greg Kroah-Hartman
2020-02-19mmc: core: Rework wp-gpio handlingMichał Mirosław
[ Upstream commit 9073d10b098973519044f5fcdc25586810b435da ] Use MMC_CAP2_RO_ACTIVE_HIGH flag as indicator if GPIO line is to be inverted compared to DT/platform-specified polarity. The flag is not used after init in GPIO mode anyway. No functional changes intended. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/a60f563f11bbff821da2fa2949ca82922b144860.1576031637.git.mirq-linux@rere.qmqm.pl Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-19gpio: add gpiod_toggle_active_low()Michał Mirosław
[ Upstream commit d3a5bcb4a17f1ad072484bb92c42519ff3aba6e1 ] Add possibility to toggle active-low flag of a gpio descriptor. This is useful for compatibility code, where defaults are inverted vs DT gpio flags or the active-low flag is taken from elsewhere. Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/7ce0338e01ad17fa5a227176813941b41a7c35c1.1576031637.git.mirq-linux@rere.qmqm.pl Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-19KVM: x86/mmu: Fix struct guest_walker arrays for 5-level pagingSean Christopherson
[ Upstream commit f6ab0107a4942dbf9a5cf0cca3f37e184870a360 ] Define PT_MAX_FULL_LEVELS as PT64_ROOT_MAX_LEVEL, i.e. 5, to fix shadow paging for 5-level guest page tables. PT_MAX_FULL_LEVELS is used to size the arrays that track guest pages table information, i.e. using a "max levels" of 4 causes KVM to access garbage beyond the end of an array when querying state for level 5 entries. E.g. FNAME(gpte_changed) will read garbage and most likely return %true for a level 5 entry, soft-hanging the guest because FNAME(fetch) will restart the guest instead of creating SPTEs because it thinks the guest PTE has changed. Note, KVM doesn't yet support 5-level nested EPT, so PT_MAX_FULL_LEVELS gets to stay "4" for the PTTYPE_EPT case. Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-19ext4: choose hardlimit when softlimit is larger than hardlimit in ↵Chengguang Xu
ext4_statfs_project() [ Upstream commit 57c32ea42f8e802bda47010418e25043e0c9337f ] Setting softlimit larger than hardlimit seems meaningless for disk quota but currently it is allowed. In this case, there may be a bit of comfusion for users when they run df comamnd to directory which has project quota. For example, we set 20M softlimit and 10M hardlimit of block usage limit for project quota of test_dir(project id 123). [root@hades mnt_ext4]# repquota -P -a *** Report for project quotas on device /dev/loop0 Block grace time: 7days; Inode grace time: 7days Block limits File limits Project used soft hard grace used soft hard grace ---------------------------------------------------------------------- 0 -- 13 0 0 2 0 0 123 -- 10237 20480 10240 5 200 100 The result of df command as below: [root@hades mnt_ext4]# df -h test_dir Filesystem Size Used Avail Use% Mounted on /dev/loop0 20M 10M 10M 50% /home/cgxu/test/mnt_ext4 Even though it looks like there is another 10M free space to use, if we write new data to diretory test_dir(inherit project id), the write will fail with errno(-EDQUOT). After this patch, the df result looks like below. [root@hades mnt_ext4]# df -h test_dir Filesystem Size Used Avail Use% Mounted on /dev/loop0 10M 10M 3.0K 100% /home/cgxu/test/mnt_ext4 Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191016022501.760-1-cgxu519@mykernel.net Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-19jbd2: do not clear the BH_Mapped flag when forgetting a metadata bufferzhangyi (F)
[ Upstream commit c96dceeabf765d0b1b1f29c3bf50a5c01315b820 ] Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction") set the BH_Freed flag when forgetting a metadata buffer which belongs to the committing transaction, it indicate the committing process clear dirty bits when it is done with the buffer. But it also clear the BH_Mapped flag at the same time, which may trigger below NULL pointer oops when block_size < PAGE_SIZE. rmdir 1 kjournald2 mkdir 2 jbd2_journal_commit_transaction commit transaction N jbd2_journal_forget set_buffer_freed(bh1) jbd2_journal_commit_transaction commit transaction N+1 ... clear_buffer_mapped(bh1) ext4_getblk(bh2 ummapped) ... grow_dev_page init_page_buffers bh1->b_private=NULL bh2->b_private=NULL jbd2_journal_put_journal_head(jh1) __journal_remove_journal_head(hb1) jh1 is NULL and trigger oops *) Dir entry block bh1 and bh2 belongs to one page, and the bh2 has already been unmapped. For the metadata buffer we forgetting, we should always keep the mapped flag and clear the dirty flags is enough, so this patch pick out the these buffers and keep their BH_Mapped flag. Link: https://lore.kernel.org/r/20200213063821.30455-3-yi.zhang@huawei.com Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>