summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2026-03-06net: dsa: sja1105: reorder sja1105_reload_cbs() and phylink_replay_link_end()Vladimir Oltean
Move phylink_replay_link_end() as the last locked operation under sja1105_static_config_reload(). The purpose is to be able to goto this step from the error path of intermediate steps (we must call phylink_replay_link_end()). sja1105_reload_cbs() notably does not depend on port states or link speeds. See commit 954ad9bf13c4 ("net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload") which has discussed this issue specifically. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260304220900.3865120-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5e: RX, Fix XDP multi-buf frag counting for legacy RQDragos Tatulea
XDP multi-buf programs can modify the layout of the XDP buffer when the program calls bpf_xdp_pull_data() or bpf_xdp_adjust_tail(). The referenced commit in the fixes tag corrected the assumption in the mlx5 driver that the XDP buffer layout doesn't change during a program execution. However, this fix introduced another issue: the dropped fragments still need to be counted on the driver side to avoid page fragment reference counting issues. Such issue can be observed with the test_xdp_native_adjst_tail_shrnk_data selftest when using a payload of 3600 and shrinking by 256 bytes (an upcoming selftest patch): the last fragment gets released by the XDP code but doesn't get tracked by the driver. This results in a negative pp_ref_count during page release and the following splat: WARNING: include/net/page_pool/helpers.h:297 at mlx5e_page_release_fragmented.isra.0+0x4a/0x50 [mlx5_core], CPU#12: ip/3137 Modules linked in: [...] CPU: 12 UID: 0 PID: 3137 Comm: ip Not tainted 6.19.0-rc3+ #12 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5e_page_release_fragmented.isra.0+0x4a/0x50 [mlx5_core] [...] Call Trace: <TASK> mlx5e_dealloc_rx_wqe+0xcb/0x1a0 [mlx5_core] mlx5e_free_rx_descs+0x7f/0x110 [mlx5_core] mlx5e_close_rq+0x50/0x60 [mlx5_core] mlx5e_close_queues+0x36/0x2c0 [mlx5_core] mlx5e_close_channel+0x1c/0x50 [mlx5_core] mlx5e_close_channels+0x45/0x80 [mlx5_core] mlx5e_safe_switch_params+0x1a5/0x230 [mlx5_core] mlx5e_change_mtu+0xf3/0x2f0 [mlx5_core] netif_set_mtu_ext+0xf1/0x230 do_setlink.isra.0+0x219/0x1180 rtnl_newlink+0x79f/0xb60 rtnetlink_rcv_msg+0x213/0x3a0 netlink_rcv_skb+0x48/0xf0 netlink_unicast+0x24a/0x350 netlink_sendmsg+0x1ee/0x410 __sock_sendmsg+0x38/0x60 ____sys_sendmsg+0x232/0x280 ___sys_sendmsg+0x78/0xb0 __sys_sendmsg+0x5f/0xb0 [...] do_syscall_64+0x57/0xc50 This patch fixes the issue by doing page frag counting on all the original XDP buffer fragments for all relevant XDP actions (XDP_TX , XDP_REDIRECT and XDP_PASS). This is basically reverting to the original counting before the commit in the fixes tag. As frag_page is still pointing to the original tail, the nr_frags parameter to xdp_update_skb_frags_info() needs to be calculated in a different way to reflect the new nr_frags. Fixes: afd5ba577c10 ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Link: https://patch.msgid.link/20260305142634.1813208-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQDragos Tatulea
XDP multi-buf programs can modify the layout of the XDP buffer when the program calls bpf_xdp_pull_data() or bpf_xdp_adjust_tail(). The referenced commit in the fixes tag corrected the assumption in the mlx5 driver that the XDP buffer layout doesn't change during a program execution. However, this fix introduced another issue: the dropped fragments still need to be counted on the driver side to avoid page fragment reference counting issues. The issue was discovered by the drivers/net/xdp.py selftest, more specifically the test_xdp_native_tx_mb: - The mlx5 driver allocates a page_pool page and initializes it with a frag counter of 64 (pp_ref_count=64) and the internal frag counter to 0. - The test sends one packet with no payload. - On RX (mlx5e_skb_from_cqe_mpwrq_nonlinear()), mlx5 configures the XDP buffer with the packet data starting in the first fragment which is the page mentioned above. - The XDP program runs and calls bpf_xdp_pull_data() which moves the header into the linear part of the XDP buffer. As the packet doesn't contain more data, the program drops the tail fragment since it no longer contains any payload (pp_ref_count=63). - mlx5 device skips counting this fragment. Internal frag counter remains 0. - mlx5 releases all 64 fragments of the page but page pp_ref_count is 63 => negative reference counting error. Resulting splat during the test: WARNING: CPU: 0 PID: 188225 at ./include/net/page_pool/helpers.h:297 mlx5e_page_release_fragmented.isra.0+0xbd/0xe0 [mlx5_core] Modules linked in: [...] CPU: 0 UID: 0 PID: 188225 Comm: ip Not tainted 6.18.0-rc7_for_upstream_min_debug_2025_12_08_11_44 #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5e_page_release_fragmented.isra.0+0xbd/0xe0 [mlx5_core] [...] Call Trace: <TASK> mlx5e_free_rx_mpwqe+0x20a/0x250 [mlx5_core] mlx5e_dealloc_rx_mpwqe+0x37/0xb0 [mlx5_core] mlx5e_free_rx_descs+0x11a/0x170 [mlx5_core] mlx5e_close_rq+0x78/0xa0 [mlx5_core] mlx5e_close_queues+0x46/0x2a0 [mlx5_core] mlx5e_close_channel+0x24/0x90 [mlx5_core] mlx5e_close_channels+0x5d/0xf0 [mlx5_core] mlx5e_safe_switch_params+0x2ec/0x380 [mlx5_core] mlx5e_change_mtu+0x11d/0x490 [mlx5_core] mlx5e_change_nic_mtu+0x19/0x30 [mlx5_core] netif_set_mtu_ext+0xfc/0x240 do_setlink.isra.0+0x226/0x1100 rtnl_newlink+0x7a9/0xba0 rtnetlink_rcv_msg+0x220/0x3c0 netlink_rcv_skb+0x4b/0xf0 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 ____sys_sendmsg+0x1e8/0x240 ___sys_sendmsg+0x7c/0xb0 [...] __sys_sendmsg+0x5f/0xb0 do_syscall_64+0x55/0xc70 The problem applies for XDP_PASS as well which is handled in a different code path in the driver. This patch fixes the issue by doing page frag counting on all the original XDP buffer fragments for all relevant XDP actions (XDP_TX , XDP_REDIRECT and XDP_PASS). This is basically reverting to the original counting before the commit in the fixes tag. As frag_page is still pointing to the original tail, the nr_frags parameter to xdp_update_skb_frags_info() needs to be calculated in a different way to reflect the new nr_frags. Fixes: 87bcef158ac1 ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Cc: Amery Hung <ameryhung@gmail.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305142634.1813208-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5e: Fix DMA FIFO desync on error CQE SQ recoveryGal Pressman
In case of a TX error CQE, a recovery flow is triggered, mlx5e_reset_txqsq_cc_pc() resets dma_fifo_cc to 0 but not dma_fifo_pc, desyncing the DMA FIFO producer and consumer. After recovery, the producer pushes new DMA entries at the old dma_fifo_pc, while the consumer reads from position 0. This causes us to unmap stale DMA addresses from before the recovery. The DMA FIFO is a purely software construct with no HW counterpart. At the point of reset, all WQEs have been flushed so dma_fifo_cc is already equal to dma_fifo_pc. There is no need to reset either counter, similar to how skb_fifo pc/cc are untouched. Remove the 'dma_fifo_cc = 0' reset. This fixes the following WARNING: WARNING: CPU: 0 PID: 0 at drivers/iommu/dma-iommu.c:1240 iommu_dma_unmap_page+0x79/0x90 Modules linked in: mlx5_vdpa vringh vdpa bonding mlx5_ib mlx5_vfio_pci ipip mlx5_fwctl tunnel4 mlx5_core ib_ipoib geneve ip6_gre ip_gre gre nf_tables ip6_tunnel rdma_ucm ib_uverbs ib_umad vfio_pci vfio_pci_core act_mirred act_skbedit act_vlan vhost_net vhost tap ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress vhost_iotlb iptable_raw tunnel6 vfio_iommu_type1 vfio openvswitch nsh rpcsec_gss_krb5 auth_rpcgss oid_registry xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter overlay zram zsmalloc rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core fuse [last unloaded: nf_tables] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.13.0-rc5_for_upstream_min_debug_2024_12_30_21_33 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:iommu_dma_unmap_page+0x79/0x90 Code: 2b 4d 3b 21 72 26 4d 3b 61 08 73 20 49 89 d8 44 89 f9 5b 4c 89 f2 4c 89 e6 48 89 ef 5d 41 5c 41 5d 41 5e 41 5f e9 c7 ae 9e ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 Call Trace: <IRQ> ? __warn+0x7d/0x110 ? iommu_dma_unmap_page+0x79/0x90 ? report_bug+0x16d/0x180 ? handle_bug+0x4f/0x90 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? iommu_dma_unmap_page+0x79/0x90 ? iommu_dma_unmap_page+0x2e/0x90 dma_unmap_page_attrs+0x10d/0x1b0 mlx5e_tx_wi_dma_unmap+0xbe/0x120 [mlx5_core] mlx5e_poll_tx_cq+0x16d/0x690 [mlx5_core] mlx5e_napi_poll+0x8b/0xac0 [mlx5_core] __napi_poll+0x24/0x190 net_rx_action+0x32a/0x3b0 ? mlx5_eq_comp_int+0x7e/0x270 [mlx5_core] ? notifier_call_chain+0x35/0xa0 handle_softirqs+0xc9/0x270 irq_exit_rcu+0x71/0xd0 common_interrupt+0x7f/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40 Fixes: db75373c91b0 ("net/mlx5e: Recover Send Queue (SQ) from error state") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305142634.1813208-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5: Fix peer miss rules host disabled checksCarolina Jubran
The check on mlx5_esw_host_functions_enabled(esw->dev) for adding VF peer miss rules is incorrect. These rules match traffic from peer's VFs, so the local device's host function status is irrelevant. Remove this check to ensure peer VF traffic is properly handled regardless of local host configuration. Also fix the PF peer miss rule deletion to be symmetric with the add path, so only attempt to delete the rule if it was actually created. Fixes: 520369ef43a8 ("net/mlx5: Support disabling host PFs") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305142634.1813208-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5: Fix crash when moving to switchdev modePatrisious Haddad
When moving to switchdev mode when the device doesn't support IPsec, we try to clean up the IPsec resources anyway which causes the crash below, fix that by correctly checking for IPsec support before trying to clean up its resources. [27642.515799] WARNING: arch/x86/mm/fault.c:1276 at do_user_addr_fault+0x18a/0x680, CPU#4: devlink/6490 [27642.517159] Modules linked in: xt_conntrack xt_MASQUERADE ip6table_nat ip6table_filter ip6_tables iptable_nat nf_nat xt_addrtype rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_fwctl nfnetlink zram zsmalloc mlx5_ib fuse rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_core ib_core [27642.521358] CPU: 4 UID: 0 PID: 6490 Comm: devlink Not tainted 6.19.0-rc5_for_upstream_min_debug_2026_01_14_16_47 #1 NONE [27642.522923] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [27642.524528] RIP: 0010:do_user_addr_fault+0x18a/0x680 [27642.525362] Code: ff 0f 84 75 03 00 00 48 89 ee 4c 89 e7 e8 5e b9 22 00 49 89 c0 48 85 c0 0f 84 a8 02 00 00 f7 c3 60 80 00 00 74 22 31 c9 eb ae <0f> 0b 48 83 c4 10 48 89 ea 48 89 de 4c 89 f7 5b 5d 41 5c 41 5d 41 [27642.528166] RSP: 0018:ffff88810770f6b8 EFLAGS: 00010046 [27642.529038] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff88810b980f00 [27642.530158] RDX: 00000000000000a0 RSI: 0000000000000002 RDI: ffff88810770f728 [27642.531270] RBP: 00000000000000a0 R08: 0000000000000000 R09: 0000000000000000 [27642.532383] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888103f3c4c0 [27642.533499] R13: 0000000000000000 R14: ffff88810770f728 R15: 0000000000000000 [27642.534614] FS: 00007f197c741740(0000) GS:ffff88856a94c000(0000) knlGS:0000000000000000 [27642.535915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [27642.536858] CR2: 00000000000000a0 CR3: 000000011334c003 CR4: 0000000000172eb0 [27642.537982] Call Trace: [27642.538466] <TASK> [27642.538907] exc_page_fault+0x76/0x140 [27642.539583] asm_exc_page_fault+0x22/0x30 [27642.540282] RIP: 0010:_raw_spin_lock_irqsave+0x10/0x30 [27642.541134] Code: 07 85 c0 75 11 ba ff 00 00 00 f0 0f b1 17 75 06 b8 01 00 00 00 c3 31 c0 c3 90 0f 1f 44 00 00 53 9c 5b fa 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 48 89 d8 5b c3 89 c6 e8 7e 02 00 00 48 89 d8 5b [27642.543936] RSP: 0018:ffff88810770f7d8 EFLAGS: 00010046 [27642.544803] RAX: 0000000000000000 RBX: 0000000000000202 RCX: ffff888113ad96d8 [27642.545916] RDX: 0000000000000001 RSI: ffff88810770f818 RDI: 00000000000000a0 [27642.547027] RBP: 0000000000000098 R08: 0000000000000400 R09: ffff88810b980f00 [27642.548140] R10: 0000000000000001 R11: ffff888101845a80 R12: 00000000000000a8 [27642.549263] R13: ffffffffa02a9060 R14: 00000000000000a0 R15: ffff8881130d8a40 [27642.550379] complete_all+0x20/0x90 [27642.551010] mlx5e_ipsec_disable_events+0xb6/0xf0 [mlx5_core] [27642.552022] mlx5e_nic_disable+0x12d/0x220 [mlx5_core] [27642.552929] mlx5e_detach_netdev+0x66/0xf0 [mlx5_core] [27642.553822] mlx5e_netdev_change_profile+0x5b/0x120 [mlx5_core] [27642.554821] mlx5e_vport_rep_load+0x419/0x590 [mlx5_core] [27642.555757] ? xa_load+0x53/0x90 [27642.556361] __esw_offloads_load_rep+0x54/0x70 [mlx5_core] [27642.557328] mlx5_esw_offloads_rep_load+0x45/0xd0 [mlx5_core] [27642.558320] esw_offloads_enable+0xb4b/0xc90 [mlx5_core] [27642.559247] mlx5_eswitch_enable_locked+0x34e/0x4f0 [mlx5_core] [27642.560257] ? mlx5_rescan_drivers_locked+0x222/0x2d0 [mlx5_core] [27642.561284] mlx5_devlink_eswitch_mode_set+0x5ac/0x9c0 [mlx5_core] [27642.562334] ? devlink_rate_set_ops_supported+0x21/0x3a0 [27642.563220] devlink_nl_eswitch_set_doit+0x67/0xe0 [27642.564026] genl_family_rcv_msg_doit+0xe0/0x130 [27642.564816] genl_rcv_msg+0x183/0x290 [27642.565466] ? __devlink_nl_pre_doit.isra.0+0x160/0x160 [27642.566329] ? devlink_nl_eswitch_get_doit+0x290/0x290 [27642.567181] ? devlink_nl_pre_doit_parent_dev_optional+0x20/0x20 [27642.568147] ? genl_family_rcv_msg_dumpit+0xf0/0xf0 [27642.568966] netlink_rcv_skb+0x4b/0xf0 [27642.569629] genl_rcv+0x24/0x40 [27642.570215] netlink_unicast+0x255/0x380 [27642.570901] ? __alloc_skb+0xfa/0x1e0 [27642.571560] netlink_sendmsg+0x1f3/0x420 [27642.572249] __sock_sendmsg+0x38/0x60 [27642.572911] __sys_sendto+0x119/0x180 [27642.573561] ? __sys_recvmsg+0x5c/0xb0 [27642.574227] __x64_sys_sendto+0x20/0x30 [27642.574904] do_syscall_64+0x55/0xc10 [27642.575554] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [27642.576391] RIP: 0033:0x7f197c85e807 [27642.577050] Code: c7 c0 ff ff ff ff eb be 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 45 08 0d 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 69 c3 55 48 89 e5 53 48 83 ec 38 44 89 4d d0 [27642.579846] RSP: 002b:00007ffebd4e2248 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [27642.581082] RAX: ffffffffffffffda RBX: 000055cfcd9cd2a0 RCX: 00007f197c85e807 [27642.582200] RDX: 0000000000000038 RSI: 000055cfcd9cd490 RDI: 0000000000000003 [27642.583320] RBP: 00007ffebd4e2290 R08: 00007f197c942200 R09: 000000000000000c [27642.584437] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [27642.585555] R13: 000055cfcd9cd490 R14: 00007ffebd4e45d1 R15: 000055cfcd9cd2a0 [27642.586671] </TASK> [27642.587121] ---[ end trace 0000000000000000 ]--- [27642.587910] BUG: kernel NULL pointer dereference, address: 00000000000000a0 Fixes: 664f76be38a1 ("net/mlx5: Fix IPsec cleanup over MPV device") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305142634.1813208-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net/mlx5: Fix deadlock between devlink lock and esw->wqCosmin Ratiu
esw->work_queue executes esw_functions_changed_event_handler -> esw_vfs_changed_event_handler and acquires the devlink lock. .eswitch_mode_set (acquires devlink lock in devlink_nl_pre_doit) -> mlx5_devlink_eswitch_mode_set -> mlx5_eswitch_disable_locked -> mlx5_eswitch_event_handler_unregister -> flush_workqueue deadlocks when esw_vfs_changed_event_handler executes. Fix that by no longer flushing the work to avoid the deadlock, and using a generation counter to keep track of work relevance. This avoids an old handler manipulating an esw that has undergone one or more mode changes: - the counter is incremented in mlx5_eswitch_event_handler_unregister. - the counter is read and passed to the ephemeral mlx5_host_work struct. - the work handler takes the devlink lock and bails out if the current generation is different than the one it was scheduled to operate on. - mlx5_eswitch_cleanup does the final draining before destroying the wq. No longer flushing the workqueue has the side effect of maybe no longer cancelling pending vport_change_handler work items, but that's ok since those are disabled elsewhere: - mlx5_eswitch_disable_locked disables the vport eq notifier. - mlx5_esw_vport_disable disarms the HW EQ notification and marks vport->enabled under state_lock to false to prevent pending vport handler from doing anything. - mlx5_eswitch_cleanup destroys the workqueue and makes sure all events are disabled/finished. Fixes: f1bc646c9a06 ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305081019.1811100-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06qmi_wwan: allow max_mtu above hard_mtu to control rx_urb_sizeLaurent Vivier
Commit c7159e960f14 ("usbnet: limit max_mtu based on device's hard_mtu") capped net->max_mtu to the device's hard_mtu in usbnet_probe(). While this correctly prevents oversized packets on standard USB network devices, it breaks the qmi_wwan driver. qmi_wwan relies on userspace (e.g. ModemManager) setting a large MTU on the wwan0 interface to configure rx_urb_size via usbnet_change_mtu(). QMI modems negotiate USB transfer sizes of 16,383 or 32,767 bytes, and the USB receive buffers must be sized accordingly. With max_mtu capped to hard_mtu (~1500 bytes), userspace can no longer raise the MTU, the receive buffers remain small, and download speeds drop from >300 Mbps to ~0.8 Mbps. Introduce a FLAG_NOMAXMTU driver flag that allows individual usbnet drivers to opt out of the max_mtu cap. Set this flag in qmi_wwan's driver_info structures to restore the previous behavior for QMI devices, while keeping the safety fix in place for all other usbnet drivers. Fixes: c7159e960f14 ("usbnet: limit max_mtu based on device's hard_mtu") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/CAPh3n803k8JcBPV5qEzUB-oKzWkAs-D5CU7z=Vd_nLRCr5ZqQg@mail.gmail.com/ Reported-by: Koen Vandeputte <koen.vandeputte@citymesh.com> Tested-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://patch.msgid.link/20260304134338.1785002-1-lvivier@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06bonding: handle BOND_LINK_FAIL, BOND_LINK_BACK as valid link statesHangbin Liu
Before the fixed commit, we check slave->new_link during commit state, which values are only BOND_LINK_{NOCHANGE, UP, DOWN}. After the commit, we start using slave->link_new_state, which state also could be BOND_LINK_{FAIL, BACK}. For example, when we set updelay/downdelay, after a failover, the slave->link_new_state could be set to BOND_LINK_{FAIL, BACK} in bond_miimon_inspect(). And later in bond_miimon_commit(), it will treat it as invalid and print an error, which would cause confusion for users. [ 106.440254] bond0: (slave veth2): link status down for interface, disabling it in 200 ms [ 106.440265] bond0: (slave veth2): invalid new link 1 on slave [ 106.648276] bond0: (slave veth2): link status definitely down, disabling slave [ 107.480271] bond0: (slave veth2): link status up, enabling it in 200 ms [ 107.480288] bond0: (slave veth2): invalid new link 3 on slave [ 107.688302] bond0: (slave veth2): link status definitely up, 10000 Mbps full duplex Let's handle BOND_LINK_{FAIL, BACK} as valid link states. Fixes: 1899bb325149 ("bonding: fix state transition issue in link monitoring") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260304-b4-bond_updelay-v1-2-f72eb2e454d0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06bonding: do not set usable_slaves for broadcast modeHangbin Liu
After commit e0caeb24f538 ("net: bonding: update the slave array for broadcast mode"), broadcast mode will also set all_slaves and usable_slaves during bond_enslave(). But if we also set updelay, during enslave, the slave init state will be BOND_LINK_BACK. And later bond_update_slave_arr() will alloc usable_slaves but add nothing. This will cause bond_miimon_inspect() to have ignore_updelay always true. So the updelay will be always ignored. e.g. [ 6.498368] bond0: (slave veth2): link status definitely down, disabling slave [ 7.536371] bond0: (slave veth2): link status up, enabling it in 0 ms [ 7.536402] bond0: (slave veth2): link status definitely up, 10000 Mbps full duplex To fix it, we can either always call bond_update_slave_arr() on every place when link changes. Or, let's just not set usable_slaves for broadcast mode. Fixes: e0caeb24f538 ("net: bonding: update the slave array for broadcast mode") Reported-by: Liang Li <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260304-b4-bond_updelay-v1-1-f72eb2e454d0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06net: mctp: fix device leak on probe failureJohan Hovold
Driver core holds a reference to the USB interface and its parent USB device while the interface is bound to a driver and there is no need to take additional references unless the structures are needed after disconnect. This driver takes a reference to the USB device during probe but does not to release it on probe failures. Drop the redundant device reference to fix the leak, reduce cargo culting, make it easier to spot drivers where an extra reference is needed, and reduce the risk of further memory leaks. Fixes: 0791c0327a6e ("net: mctp: Add MCTP USB transport driver") Cc: stable@vger.kernel.org # 6.15 Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20260305104549.16110-1-johan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06drm/amd: Fix a few more NULL pointer dereference in device cleanupMario Limonciello
I found a few more paths that cleanup fails due to a NULL version pointer on unsupported hardware. Add NULL checks as applicable. Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5a05f8414fc10f307eb965f303580c7778f8dd2) Cc: stable@vger.kernel.org
2026-03-06drm/amdgpu: fix gpu idle power consumption issue for gfx v12Yang Wang
Older versions of the MES firmware may cause abnormal GPU power consumption. When performing inference tasks on the GPU (e.g., with Ollama using ROCm), the GPU may show abnormal power consumption in idle state and incorrect GPU load information. This issue has been fixed in firmware version 0x8b and newer. Closes: https://github.com/ROCm/ROCm/issues/5706 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4e22a5fe6ea6e0b057e7f246df4ac3ff8bfbc46a)
2026-03-06drm/amdgpu: Fix kernel-doc comments for some LUT propertiesCristian Ciocaltea
The following members of struct amdgpu_mode_info do not have valid references in the related kernel-doc sections: - plane_shaper_lut_property - plane_shaper_lut_size_property, - plane_lut3d_size_property Correct all affected comment blocks. Fixes: f545d82479b4 ("drm/amd/display: add plane shaper LUT and TF driver-specific properties") Fixes: 671994e3bf33 ("drm/amd/display: add plane 3D LUT driver-specific properties") Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ec5708d6e547f7efe2f009073bfa98dbc4c5c2ac)
2026-03-06drm/amd: Fix NULL pointer dereference in device cleanupMario Limonciello
When GPU initialization fails due to an unsupported HW block IP blocks may have a NULL version pointer. During cleanup in amdgpu_device_fini_hw, the code calls amdgpu_device_set_pg_state and amdgpu_device_set_cg_state which iterate over all IP blocks and access adev->ip_blocks[i].version without NULL checks, leading to a kernel NULL pointer dereference. Add NULL checks for adev->ip_blocks[i].version in both amdgpu_device_set_cg_state and amdgpu_device_set_pg_state to prevent dereferencing NULL pointers during GPU teardown when initialization has failed. Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b7ac77468cda92eecae560b05f62f997a12fe2f2) Cc: stable@vger.kernel.org
2026-03-06drm/amd/pm: add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v14Yang Wang
add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v14.0.2/14.0.3 Fixes: 9710b84e2a6a ("drm/amd/pm: add overdrive support on smu v14.0.2/3") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5018 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1b5cf07d80bb16d1593579ccdb23f08ea4262c14)
2026-03-06drm/amd/pm: add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v13Yang Wang
add missing od setting PP_OD_FEATURE_ZERO_FAN_BIT for smu v13.0.0/13.0.7 Fixes: cfffd980bf21 ("drm/amd/pm: add zero RPM OD setting support for SMU13") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5018 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 576a10797b607ee9e4068218daf367b481564120)
2026-03-06Merge tag 'drm-fixes-2026-03-07' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly fixes pull. There is one mm fix in here for a HMM livelock triggered by the xe driver tests. Otherwise it's a pretty wide range of fixes across the board, ttm UAF regression fix, amdgpu fixes, nouveau doesn't crash my laptop anymore fix, and a fair bit of misc. Seems about right for rc3. mm: - mm: Fix a hmm_range_fault() livelock / starvation problem pagemap: - Revert "drm/pagemap: Disable device-to-device migration" ttm: - fix function return breaking reclaim - fix build failure on PREEMPT_RT - fix bo->resource UAF dma-buf: - include ioctl.h in uapi header sched: - fix kernel doc warning amdgpu: - LUT fixes - VCN5 fix - Dispclk fix - SMU 13.x fix - Fix race in VM acquire - PSP 15.x fix - UserQ fix amdxdna: - fix invalid payload for failed command - fix NULL ptr dereference - fix major fw version check - avoid inconsistent fw state on error i915/display: - Fix for Lenovo T14 G7 display not refreshing xe: - Do not preempt fence signaling CS instructions - Some leak and finalization fixes - Workaround fix nouveau: - avoid runtime suspend oops when using dp aux panthor: - fix gem_sync argument ordering solomon: - fix incorrect display output renesas: - fix DSI divider programming ethosu: - fix job submit error clean-up refcount - fix NPU_OP_ELEMENTWISE validation - handle possible underflows in IFM size calcs" * tag 'drm-fixes-2026-03-07' of https://gitlab.freedesktop.org/drm/kernel: (38 commits) accel: ethosu: Handle possible underflow in IFM size calculations accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar accel: ethosu: Fix job submit error clean-up refcount underflows accel/amdxdna: Split mailbox channel create function drm/panthor: Correct the order of arguments passed to gem_sync Revert "drm/syncobj: Fix handle <-> fd ioctls with dirty stack" drm/ttm: Fix bo resource use-after-free nouveau/dpcd: return EBUSY for aux xfer if the device is asleep accel/amdxdna: Fix major version check on NPU1 platform drm/amdgpu/userq: refcount userqueues to avoid any race conditions drm/amdgpu/userq: Consolidate wait ioctl exit path drm/amdgpu/psp: Use Indirect access address for GFX to PSP mailbox drm/amdgpu: Fix use-after-free race in VM acquire drm/amd/pm: remove invalid gpu_metrics.energy_accumulator on smu v13.0.x drm/xe: Fix memory leak in xe_vm_madvise_ioctl drm/xe/reg_sr: Fix leak on xa_store failure drm/xe/xe2_hpg: Correct implementation of Wa_16025250150 drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure Revert "drm/pagemap: Disable device-to-device migration" drm/i915/psr: Fix for Panel Replay X granularity DPCD register handling ...
2026-03-06drm/msm/dsi: fix pclk rate calculation for bonded dsiPengyu Luo
Recently, we round up new_hdisplay once at most, for bonded dsi, we may need twice, since they are independent links, we should round up each half separately. This also aligns with the hdisplay we program later in dsi_timing_setup() Example: full_hdisplay = 1904, dsc_bpp = 8, bpc = 8 new_full_hdisplay = DIV_ROUND_UP(1904 * 8, 8 * 3) = 635 if we use half display new_half_hdisplay = DIV_ROUND_UP(952 * 8, 8 * 3) = 318 new_full_display = 636 Fixes: 7c9e4a554d4a ("drm/msm/dsi: Reduce pclk rate for compression") Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/709716/ Link: https://lore.kernel.org/r/20260306163255.215456-1-mitltlatltl@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-03-06Merge tag 'spi-fix-v7.0-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "One device specific fix here, it was possible we might end up trying to dereference an invalid pointer while reporting a transfer timeout on DesignWare controllers" * tag 'spi-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-dw-dma: fix print error log when wait finish transaction
2026-03-06Merge tag 'regulator-fix-v7.0-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of small, driver specific fixes which might not even have much impact if you have the affected devices depending on your setup" * tag 'regulator-fix-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: pf9453: Respect IRQ trigger settings from firmware regulator: mt6363: Fix incorrect and redundant IRQ disposal in probe
2026-03-06Merge tag 'hid-for-linus-2026030601' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix a few memory leaks (Günther Noack) - fix potential kernel crashes in cmedia, creative-sb0540 and zydacron (Greg Kroah-Hartman) - fix NULL pointer dereference in pidff (Tomasz Pakuła) - fix battery reporting for Apple Magic Trackpad 2 (Julius Lehmann) - mcp2221 proper handling of failed read operation (Romain Sioen) - various device quirks / device ID additions * tag 'hid-for-linus-2026030601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: mcp2221: cancel last I2C command on read error HID: asus: add xg mobile 2023 external hardware support HID: multitouch: Keep latency normal on deactivate for reactivation gesture HID: apple: Add EPOMAKER TH87 to the non-apple keyboards list HID: intel-ish-hid: ipc: Add Nova Lake-H/S PCI device IDs selftests: hid: tests: test_wacom_generic: add tests for display devices and opaque devices HID: multitouch: new class MT_CLS_EGALAX_P80H84 HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2 HID: pidff: Fix condition effect bit clearing HID: Add HID_CLAIMED_INPUT guards in raw_event callbacks missing them HID: asus: avoid memory leak in asus_report_fixup() HID: magicmouse: avoid memory leak in magicmouse_report_fixup() HID: apple: avoid memory leak in apple_report_fixup() HID: Document memory allocation properties of report_fixup()
2026-03-06Merge tag 'platform-drivers-x86-v7.0-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - alienware-wmi-wmax: Add G-Mode support to m18 laptops - asus-armoury: Add support for FA401UM, G733QS, GX650RX - dell-wmi-sysman: Don't hex dump plaintext password data - hp-bioscfg: Support large number of enumeration attributes - hp-wmi: Add support for Omen 14-fb1xxx, 16-xd0xxx, 16-wf0xxx, and Victus-d0xxx - int3472: Handle GPIO type 0x10 (DOVDD) - intel-hid: - Add Dell 14 & 16 Plus 2-in-1 to dmi_vgbs_allow_list - Enable 5-button array on ThinkPad X1 Fold 16 Gen 1 - mellanox: mlxreg: Fix kernel-doc warnings - oxpec: Add support for OneXPlayer X1 Air, X1z, APEX, and Aokzoe A2 Pro - redmi-wmi: Add more Fn hotkey mappings - thinkpad_acpi: Fix errors reading battery thresholds - touchscreen_dmi: Add quirk for y-inverted Goodix touchscreen on SUPI S10 - uniwill-laptop: - FN lock/super key lock attributes rename - Fix crash on unexpected battery event - A special key combination can alter FN lock status so mark it volatile - Handle FN lock event * tag 'platform-drivers-x86-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (27 commits) platform/x86: dell-wmi-sysman: Don't hex dump plaintext password data platform_data/mlxreg: mlxreg.h: fix all kernel-doc warnings platform/x86: asus-armoury: add support for FA401UM platform/x86: asus-armoury: add support for GX650RX platform/x86: hp-bioscfg: Support allocations of larger data platform/x86: oxpec: Add support for Aokzoe A2 Pro platform/x86: oxpec: Add support for OneXPlayer X1 Air platform/x86: oxpec: Add support for OneXPlayer X1z platform/x86: oxpec: Add support for OneXPlayer APEX platform/x86: uniwill-laptop: Handle FN lock event platform/x86: uniwill-laptop: Mark FN lock status as being volatile platform/x86: uniwill-laptop: Fix crash on unexpected battery event platform/x86: uniwill-laptop: Rename FN lock and super key lock attrs platform/x86: redmi-wmi: Add more hotkey mappings platform/x86: alienware-wmi-wmax: Add G-Mode support to m18 laptops platform/x86: hp-wmi: add Omen 14-fb1xxx (board 8E41) support platform/x86: dell-wmi: Add audio/mic mute key codes platform/x86: hp-wmi: Add Victus 16-d0xxx support platform/x86: intel-hid: Enable 5-button array on ThinkPad X1 Fold 16 Gen 1 platform/x86: int3472: Handle GPIO type 0x10 (DOVDD) ...
2026-03-06Merge tag 'pmdomain-v7.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: - rockchip: Fix PD_VCODEC for RK3588 - bcm: Fix broken reset status read for bcm2835 * tag 'pmdomain-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: rockchip: Fix PD_VCODEC for RK3588 pmdomain: bcm: bcm2835-power: Fix broken reset status read
2026-03-06Merge tag 'v7.0-p2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Fix use-after-free in ccp - Fix bug when SEV is disabled in ccp - Fix tfm_count leak in atmel-sha204a * tag 'v7.0-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: atmel-sha204a - Fix OOM ->tfm_count leak crypto: ccp - Fix use-after-free on error path crypto: ccp - allow callers to use HV-Fixed page API when SEV is disabled
2026-03-06Merge tag 'ata-7.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Niklas Cassel: - Fix a problem where the deferred non-NCQ command would incorrectly get completed as a failed command, if there was another command that timed out. Found by Gemini (Guenter) - The deferred non-NCQ command work is only supposed to run after the last NCQ command finishes. However, because the work was never canceled on error (e.g. a timeout), the work could incorrectly run when commands were still in flight. Found by syzbot (me) - Add a quirk to make sure that QEMU harddrives can potentially use up to 32 MiB I/Os (Pedro) - Add a quirk to disable LPM on Seagate ST1000DM010-2EP102 (Maximilian) * tag 'ata-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-eh: Fix detection of deferred qc timeouts ata: libata-core: Add BRIDGE_OK quirk for QEMU drives ata: libata: cancel pending work after clearing deferred_qc ata: libata-core: Disable LPM on ST1000DM010-2EP102
2026-03-06Merge tag 'block-7.0-20260305' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Improve quirk visibility and configurability (Maurizio) - Fix runtime user modification to queue setup (Keith) - Fix multipath leak on try_module_get failure (Keith) - Ignore ambiguous spec definitions for better atomics support (John) - Fix admin queue leak on controller reset (Ming) - Fix large allocation in persistent reservation read keys (Sungwoo Kim) - Fix fcloop callback handling (Justin) - Securely free DHCHAP secrets (Daniel) - Various cleanups and typo fixes (John, Wilfred) - Avoid a circular lock dependency issue in the sysfs nr_requests or scheduler store handling - Fix a circular lock dependency with the pcpu mutex and the queue freeze lock - Cleanup for bio_copy_kern(), using __bio_add_page() rather than the bio_add_page(), as adding a page here cannot fail. The exiting code had broken cleanup for the error condition, so make it clear that the error condition cannot happen - Fix for a __this_cpu_read() in preemptible context splat * tag 'block-7.0-20260305' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: block: use trylock to avoid lockdep circular dependency in sysfs nvme: fix memory allocation in nvme_pr_read_keys() block: use __bio_add_page in bio_copy_kern block: break pcpu_alloc_mutex dependency on freeze_lock blktrace: fix __this_cpu_read/write in preemptible context nvme-multipath: fix leak on try_module_get failure nvmet-fcloop: Check remoteport port_state before calling done callback nvme-pci: do not try to add queue maps at runtime nvme-pci: cap queue creation to used queues nvme-pci: ensure we're polling a polled queue nvme: fix memory leak in quirks_param_set() nvme: correct comment about nvme_ns_remove() nvme: stop setting namespace gendisk device driver data nvme: add support for dynamic quirk configuration via module parameter nvme: fix admin queue leak on controller reset nvme-fabrics: use kfree_sensitive() for DHCHAP secrets nvme: stop using AWUPF nvme: expose active quirks in sysfs nvme/host: fixup some typos
2026-03-06firmware: arm_ffa: Remove vm_id argument in ffa_rxtx_unmap()Yeoreum Yun
According to the FF-A specification (DEN0077, v1.1, §13.7), when FFA_RXTX_UNMAP is invoked from any instance other than non-secure physical, the w1 register must be zero (MBZ). If a non-zero value is supplied in this context, the SPMC must return FFA_INVALID_PARAMETER. The Arm FF-A driver operates exclusively as a guest or non-secure physical instance where the partition ID is always zero and is not invoked from a hypervisor context where w1 carries a VM ID. In this execution model, the partition ID observed by the driver is always zero, and passing a VM ID is unnecessary and potentially invalid. Remove the vm_id parameter from ffa_rxtx_unmap() and ensure that the SMC call is issued with w1 implicitly zeroed, as required by the specification. This prevents invalid parameter errors and aligns the implementation with the defined FF-A ABI behavior. Fixes: 3bbfe9871005 ("firmware: arm_ffa: Add initial Arm FFA driver support") Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Message-Id: <20260304120953.847671-1-yeoreum.yun@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
2026-03-06cxl/acpi: Fix CXL_ACPI and CXL_PMEM Kconfig tristate mismatchKeith Busch
Commit e7e222ad73d9 ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko") moves devm_cxl_add_nvdimm_bridge() into the cxl_pmem file, which has independent config compile options for built-in or module. The call from cxl_acpi_probe() is guarded by IS_ENABLED(CONFIG_CXL_PMEM), which evaluates to true for both =y and =m. When CONFIG_CXL_PMEM=m, a built-in cxl_acpi attempts to reference a symbol exported by a module, which fails to link. CXL_PMEM cannot simply be promoted to =y in this configuration because it depends on LIBNVDIMM, which may itself be =m. Add a Kconfig dependency to prevent CXL_ACPI from being built-in when CXL_PMEM is a module. This contrains CXL_ACPI to =m when CXL_PMEM=m, while still allowing CXL_ACPI to be freely configured when CXL_PMEM is either built-in or disabled. [ dj: Fix up commit reference formatting. ] Fixes: e7e222ad73d9 ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260305204057.1516948-1-kbusch@meta.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-03-06pinctrl: renesas: rzt2h: Fix invalid wait contextCosmin Tanislav
The rzt2h_gpio_get_direction() function is called from gpiod_get_direction(), which ends up being used within the __setup_irq() call stack when requesting an interrupt. __setup_irq() holds a raw_spinlock_t with IRQs disabled, which creates an atomic context. spinlock_t cannot be used within atomic context when PREEMPT_RT is enabled, since it may become a sleeping lock. An "[ BUG: Invalid wait context ]" splat is observed when running with CONFIG_PROVE_LOCKING enabled, describing exactly the aforementioned call stack. __setup_irq() needs to hold a raw_spinlock_t with IRQs disabled to serialize access against a concurrent hard interrupt. Switch to raw_spinlock_t to fix this. Fixes: 829dde3369a9 ("pinctrl: renesas: rzt2h: Add GPIO IRQ chip to handle interrupts") Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260205103930.666051-1-cosmin-gabriel.tanislav.xa@renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2026-03-06pinctrl: renesas: rzt2h: Fix device node leak in rzt2h_gpio_register()Felix Gu
When calling of_parse_phandle_with_fixed_args(), the caller is responsible for calling of_node_put() to release the device node reference. In rzt2h_gpio_register(), the driver fails to call of_node_put() to release the reference in of_args.np, which causes a memory leak. Add the missing of_node_put() call to fix the leak. Fixes: 34d4d093077a ("pinctrl: renesas: Add support for RZ/T2H") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260127-rzt2h-v1-1-86472e7421b8@gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2026-03-06ublk: fix NULL pointer dereference in ublk_ctrl_set_size()Mehul Rao
ublk_ctrl_set_size() unconditionally dereferences ub->ub_disk via set_capacity_and_notify() without checking if it is NULL. ub->ub_disk is NULL before UBLK_CMD_START_DEV completes (it is only assigned in ublk_ctrl_start_dev()) and after UBLK_CMD_STOP_DEV runs (ublk_detach_disk() sets it to NULL). Since the UBLK_CMD_UPDATE_SIZE handler performs no state validation, a user can trigger a NULL pointer dereference by sending UPDATE_SIZE to a device that has been added but not yet started, or one that has been stopped. Fix this by checking ub->ub_disk under ub->mutex before dereferencing it, and returning -ENODEV if the disk is not available. Fixes: 98b995660bff ("ublk: Add UBLK_U_CMD_UPDATE_SIZE") Cc: stable@vger.kernel.org Signed-off-by: Mehul Rao <mehulrao@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-06wifi: mac80211_hwsim: fully initialise PMSR capabilitiesJohannes Berg
Since the recent additions to PMSR capabilities, it's no longer sufficient to call parse_pmsr_capa() here since the capabilities that were added aren't represented/filled by it. Always init the data to zero to avoid using uninitialized memory. Fixes: 86c6b6e4d187 ("wifi: nl80211/cfg80211: add new FTM capabilities") Reported-by: syzbot+c686c6b197d10ff3a749@syzkaller.appspotmail.com Closes: https://lore.kernel.org/69a67aa3.a70a0220.b118c.000a.GAE@google.com/ Link: https://patch.msgid.link/20260303113739.176403-2-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-06Merge tag 'drm-xe-fixes-2026-03-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Cross-subsystem Changes: - mm: Fix a hmm_range_fault() livelock / starvation problem (Thomas) Core Changes: - Revert "drm/pagemap: Disable device-to-device migration" (Thomas) Driver Changes: - Do not preempt fence signaling CS instructions (Brost) - Some leak and finalization fixes (Shuicheng, Tomasz, Varun, Zhanjun) - Workaround fix (Roper) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aamGvvGRBRtX8-6u@intel.com
2026-03-06Merge tag 'drm-misc-fixes-2026-03-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Another early drm-misc-fixes PR to revert the previous uapi fix sent in drm-misc-fixes-2026-03-05, together with a UAF fix in TTM, an argument order fix for panthor, a fix for the firmware getting stuck on resource allocation error handling for amdxdna, and a few fixes for ethosu (size calculation and reference underflows, and a validation fix). Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260306-grumpy-pegasus-of-witchcraft-6bd2db@houat
2026-03-06Merge tag 'drm-misc-fixes-2026-03-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A return type fix for ttm, a display fix for solomon, several misc fixes for amdxdna, a DSI clock rate fix for rz-du, a uapi fix for syncobj, a possible build failure fix for dma-buf, a doc warning fix for sched, a build failure fix for ttm tests, and a crash fix when suspended for nouveau. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260305-ludicrous-quirky-raven-7cdafd@houat
2026-03-06ata: libata-eh: Fix detection of deferred qc timeoutsGuenter Roeck
If the ata_qc_for_each_raw() loop finishes without finding a matching SCSI command for any QC, the variable qc will hold a pointer to the last element examined, which has the tag i == ATA_MAX_QUEUE - 1. This qc can match the port deferred QC (ap->deferred_qc). If that happens, the condition qc == ap->deferred_qc evaluates to true despite the loop not breaking with a match on the SCSI command for this QC. In that case, the error handler mistakenly intercepts a command that has not been issued yet and that has not timed out, and thus erroneously returning a timeout error. Fix the problem by checking for i < ATA_MAX_QUEUE in addition to qc == ap->deferred_qc. The problem was found by an experimental code review agent based on gemini-3.1-pro while reviewing backports into v6.18.y. Assisted-by: Gemini:gemini-3.1-pro Fixes: eddb98ad9364 ("ata: libata-eh: correctly handle deferred qc timeouts") Signed-off-by: Guenter Roeck <linux@roeck-us.net> [cassel: modified commit log as suggested by Damien] Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-03-06Merge tag 'drm-intel-fixes-2026-03-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix for #7284: Lenovo T14 G7 display not refreshing Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patch.msgid.link/aakz17Jx3Ye9Vqci@jlahtine-mobl
2026-03-05net: dsa: realtek: rtl8365mb: remove ifOutDiscards from rx_packetsMieczyslaw Nalewaj
rx_packets should report the number of frames successfully received: unicast + multicast + broadcast. Subtracting ifOutDiscards (a TX counter) is incorrect and can undercount RX packets. RX drops are already reported via rx_dropped (e.g. etherStatsDropEvents), so there is no need to adjust rx_packets. This patch removes the subtraction of ifOutDiscards from rx_packets in rtl8365mb_stats_update(). Link: https://lore.kernel.org/netdev/878777925.105015.1763423928520@mail.yahoo.com/ Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Signed-off-by: Mieczyslaw Nalewaj <namiltd@yahoo.com> Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260303-realtek_namiltd_fix2-v1-1-bfa433d3401e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06drm/msm/dsi/phy: fix hardware revisionPengyu Luo
The hardware revision for TSMC 3nm-based Qualcomm SOCs should be 7.2, this can be confirmed from REG_DSI_7nm_PHY_CMN_REVISION_ID0, the value is 0x27, which means hardware revision is 7.2 No functional change. Fixes: 1337d7ebfb6d ("drm/msm/dsi/phy: Add support for SM8750") Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/707414/ Link: https://lore.kernel.org/r/20260226122958.22555-2-mitltlatltl@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-03-06drm/msm/dpu: Correct the SA8775P intr_underrun/intr_underrun indexAbhinav Kumar
The intr_underrun and intr_vsync indices have been swapped, just simply corrects them. Cc: stable@vger.kernel.org Fixes: b139c80d181c ("drm/msm/dpu: Add SA8775P support") Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Yongxing Mou <yongxing.mou@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/709209/ Link: https://lore.kernel.org/r/20260305-mdss_catalog-v5-2-06678ac39ac7@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-03-05drm/msm/a8xx: Fix ubwc config related to swizzlingAkhil P Oommen
To disable l2/l3 swizzling in A8x, set the respective bits in both GRAS_NC_MODE_CNTL and RB_CCU_NC_MODE_CNTL registers. This is required for Glymur where it is recommended to keep l2/l3 swizzling disabled. Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Message-ID: <20260305-a8xx-ubwc-fix-v1-1-d99b6da4c5a9@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2026-03-05accel: ethosu: Handle possible underflow in IFM size calculationsRob Herring (Arm)
If the command stream has larger padding sizes than the IFM and OFM diminsions, then the calculations will underflow to a negative value. The result is a very large region bounds which is caught on submit, but it's better to catch it earlier. Current mesa ethosu driver has a signedness bug which resulted in padding of 127 (the max) and triggers this issue. Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-3-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalarRob Herring (Arm)
The NPU_OP_ELEMENTWISE instruction uses a scalar value for IFM2 if the IFM2_BROADCAST "scalar" mode is set. It is a bit (7) on the u65 and part of a field (bits 3:0) on the u85. The driver was hardcoded to the u85. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-2-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05accel: ethosu: Fix job submit error clean-up refcount underflowsRob Herring (Arm)
If the job submit fails before adding the job to the scheduler queue such as when the GEM buffer bounds checks fail, then doing a ethosu_job_put() results in a pm_runtime_put_autosuspend() without the corresponding pm_runtime_resume_and_get(). The dma_fence_put()'s are also unnecessary, but seem to be harmless. Split the ethosu_job_cleanup() function into 2 parts for the before and after the job is queued. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-1-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05Merge tag 'acpi-7.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support fixes from Rafael Wysocki: - Revert a commit related to ACPI device power management that was not supposed to make any functional difference, but it did so and introduced a regression (Rafael Wysocki) - Update the _CPC object definition in ACPICA to match ACPI 6.6 and prevent the kernel from printing a false-positive warning regarding _CPC output package format on platforms shipping with firmware based on ACPI 6.6 (Saket Dumbre) * tag 'acpi-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: PM: Let acpi_dev_pm_attach() skip devices without ACPI PM" ACPICA: Update the _CPC definition to match ACPI 6.6
2026-03-05Merge tag 'net-7.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, netfilter and wireless. Current release - new code bugs: - sched: cake: fixup cake_mq rate adjustment for diffserv config - wifi: fix missing ieee80211_eml_params member initialization Previous releases - regressions: - tcp: give up on stronger sk_rcvbuf checks (for now) Previous releases - always broken: - net: fix rcu_tasks stall in threaded busypoll - sched: - fq: clear q->band_pkt_count[] in fq_reset() - only allow act_ct to bind to clsact/ingress qdiscs and shared blocks - bridge: check relevant per-VLAN options in VLAN range grouping - xsk: fix fragment node deletion to prevent buffer leak Misc: - spring cleanup of inactive maintainers" * tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits) xdp: produce a warning when calculated tailroom is negative net: enetc: use truesize as XDP RxQ info frag_size libeth, idpf: use truesize as XDP RxQ info frag_size i40e: use xdp.frame_sz as XDP RxQ info frag_size i40e: fix registering XDP RxQ info ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz ice: fix rxq info registering in mbuf packets xsk: introduce helper to determine rxq->frag_size xdp: use modulo operation to calculate XDP frag tailroom selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour net/sched: act_ife: Fix metalist update behavior selftests: net: add test for IPv4 route with loopback IPv6 nexthop net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled MAINTAINERS: remove Thomas Falcon from IBM ibmvnic MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC ...
2026-03-05Merge branch 'acpica'Rafael J. Wysocki
Merge a fix updating the _CPC object definition in ACPICA to avoid printing a false-positive output package format warning on new platforms (Saket Dumbre) * acpica: ACPICA: Update the _CPC definition to match ACPI 6.6
2026-03-05accel/amdxdna: Split mailbox channel create functionLizhi Hou
The management channel used for firmware control command submission is currently created after the firmware is started. If channel creation fails (for example, due to memory allocation failure or workqueue creation interruption), the firmware remains in a pending state and is unable to receive any control commands. To avoid leaving the firmware in this inconsistent state, split xdna_mailbox_create_channel() into two separate functions so that resource allocation can be completed before interacting with the hardware. xdna_mailbox_alloc_channel() Allocates memory and initializes the workqueue. This can be called earlier, before interacting with the hardware. xdna_mailbox_start_channel() Performs the hardware interaction required to start the channel. Rename xdna_mailbox_destroy_channel() to xdna_mailbox_free_channel(). Ensure that xdna_mailbox_stop_channel() and xdna_mailbox_free_channel() properly unwind the corresponding start and allocation steps, respectively. Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260305062041.3954024-1-lizhi.hou@amd.com
2026-03-05remoteproc: imx_rproc: Fix unreachable platform prepare_opsPeng Fan
Smatch reports unreachable code in imx_rproc_prepare(), where an early return inside the reserved-memory parsing loop prevents platform prepare_ops from being executed. When of_reserved_mem_region_to_resource() fails, imx_rproc_prepare() returns immediately, so the platform-specific prepare callback is never called. As a result, prepare_ops such as imx_rproc_sm_lmm_prepare() on i.MX95 have no chance to run. This is problematic when Linux controls the M7 Logical Machine and is responsible for preparing resources such as TCM. Without running the platform prepare callback, loading the M7 ELF into TCM may fail if the bootloader did not power up and initialize TCM. Fix this by breaking out of the reserved-memory loop instead of returning, allowing the platform prepare_ops to be executed as intended. Fixes: edd2a9956055 ("remoteproc: imx_rproc: Introduce prepare ops for imx_rproc_dcfg") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-remoteproc/aYYXAa2Fj36XG4yQ@p14s/T/#t Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Link: https://lore.kernel.org/r/20260208-imx-rproc-fix-v1-1-ad74555eb9a4@nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>