| Age | Commit message (Collapse) | Author |
|
Extend the MAC_TCR_SS (Speed Select) register field width from 2 bits
to 3 bits to properly support all speed settings.
The MAC_TCR register's SS field encoding requires 3 bits to represent
all supported speeds:
- 0x00: 10Gbps (XGMII)
- 0x02: 2.5Gbps (GMII) / 100Mbps
- 0x03: 1Gbps / 10Mbps
- 0x06: 2.5Gbps (XGMII) - P100a only
With only 2 bits, values 0x04-0x07 cannot be represented, which breaks
2.5G XGMII mode on newer platforms and causes incorrect speed select
values to be programmed.
Fixes: 07445f3c7ca1 ("amd-xgbe: Add support for 10 Mbps speed")
Co-developed-by: Guruvendra Punugupati <Guruvendra.Punugupati@amd.com>
Signed-off-by: Guruvendra Punugupati <Guruvendra.Punugupati@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20260226170753.250312-1-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
speed is not 1G
When both eth interfaces with links up are added to a bridge or hsr
interface, ping fails if the link speed is not 1Gbps (e.g., 100Mbps).
The issue is seen because when switching to offload (bridge/hsr) mode,
prueth_emac_restart() restarts the firmware and clears DRAM with
memset_io(), setting all memory to 0. This includes PORT_LINK_SPEED_OFFSET
which firmware reads for link speed. The value 0 corresponds to
FW_LINK_SPEED_1G (0x00), so for 1Gbps links the default value is correct
and ping works. For 100Mbps links, the firmware needs FW_LINK_SPEED_100M
(0x01) but gets 0 instead, causing ping to fail. The function
emac_adjust_link() is called to reconfigure, but it detects no state change
(emac->link is still 1, speed/duplex match PHY) so new_state remains false
and icssg_config_set_speed() is never called to correct the firmware speed
value.
The fix resets emac->link to 0 before calling emac_adjust_link() in
prueth_emac_common_start(). This forces new_state=true, ensuring
icssg_config_set_speed() is called to write the correct speed value to
firmware memory.
Fixes: 06feac15406f ("net: ti: icssg-prueth: Fix emac link speed handling")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20260226102356.2141871-1-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 31a7a0bbeb00 ("dpaa2-switch: add bounds check for if_id in IRQ
handler") introduces a range check for if_id to avoid an out-of-bounds
access. If an out-of-bounds if_id is detected, the interrupt status is
not cleared. This may result in an interrupt storm.
Clear the interrupt status after detecting an out-of-bounds if_id to avoid
the problem.
Found by an experimental AI code review agent at Google.
Fixes: 31a7a0bbeb00 ("dpaa2-switch: add bounds check for if_id in IRQ handler")
Cc: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://patch.msgid.link/20260227055812.1777915-1-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2026-02-19 (idpf, ice, i40e, ixgbevf, e1000e)
For idpf:
Li Li moves the check for software marker to occur after incrementing
next to clean to avoid re-encountering the same packet. He also adds a
couple of checks to prevent NULL pointer dereferences and NULLs rss_key,
after free, in error path so that later checks are properly evaluated.
Brian Vazquez adjusts IRQ naming to have correlation with netdev naming.
Sreedevi removes validation of action type as part of ntuple rule
deletion.
For ice:
Aaron Ma breaks RDMA initialization into two steps and adjusts calls so
that VSIs are entirely configured before plugging.
Michal Schmidt fixes initialization of loopback VSI to have proper
resources allocated to allow for loopback testing to occur.
For i40e:
Thomas Gleixner fixes a leak of preempt count by replacing get_cpu()
with smp_processor_id().
For ixgbevf:
Jedrzej adds a check for mailbox version before attempting to call an
associated link state call that is supported in that mailbox version.
For e1000e:
Vitaly clears power gating feature for Panther Lake systems to avoid
packet issues.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: clear DPG_EN after reset to avoid autonomous power-gating
e1000e: introduce new board type for Panther Lake PCH
ixgbevf: fix link setup issue
i40e: Fix preempt count leak in napi poll tracepoint
ice: fix crash in ethtool offline loopback test
ice: recap the VSI and QoS info after rebuild
idpf: Fix flow rule delete failure due to invalid validation
idpf: change IRQ naming to match netdev and ethtool queue numbering
idpf: nullify pointers after they are freed
idpf: skip deallocating txq group's txqs if it is NULL
idpf: skip deallocating bufq_sets from rx_qgrp if it is NULL
idpf: increment completion queue next_to_clean in sw marker wait routine
====================
Link: https://patch.msgid.link/20260225211546.1949260-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MANA hardware requires at least one doorbell ring every 8 wraparounds
of the CQ. The driver rings the doorbell as a form of flow control to
inform hardware that CQEs have been consumed.
The NAPI poll functions mana_poll_tx_cq() and mana_poll_rx_cq() can
poll up to CQE_POLLING_BUFFER (512) completions per call. If the CQ
has fewer than 512 entries, a single poll call can process more than
4 wraparounds without ringing the doorbell. The doorbell threshold
check also uses ">" instead of ">=", delaying the ring by one extra
CQE beyond 4 wraparounds. Combined, these issues can cause the driver
to exceed the 8-wraparound hardware limit, leading to missed
completions and stalled queues.
Fix this by capping the number of CQEs polled per call to 4 wraparounds
of the CQ in both TX and RX paths. Also change the doorbell threshold
from ">" to ">=" so the doorbell is rung as soon as 4 wraparounds are
reached.
Cc: stable@vger.kernel.org
Fixes: 58a63729c957 ("net: mana: Fix doorbell out of order violation and avoid unnecessary doorbell rings")
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20260226192833.1050807-1-longli@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The TRENDnet TUC-ET2G is a RTL8156 based usb ethernet adapter. Add its
vendor and product IDs.
Signed-off-by: Valentin Spreckels <valentin@spreckels.dev>
Link: https://patch.msgid.link/20260226195409.7891-2-valentin@spreckels.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ALE table
In the current implementation, flushing multicast entries in MAC mode
incorrectly deletes entries for all ports instead of only the target port,
disrupting multicast traffic on other ports. The cause is adding multicast
entries by setting only host port bit, and not setting the MAC port bits.
Fix this by setting the MAC port's bit in the port mask while adding the
multicast entry. Also fix the flush logic to preserve the host port bit
during removal of MAC port and free ALE entries when mask contains only
host port.
Fixes: 5c50a856d550 ("drivers: net: ethernet: cpsw: add multicast address to ALE table")
Signed-off-by: Chintan Vankar <c-vankar@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224181359.2055322-1-c-vankar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from IPsec, Bluetooth and netfilter
Current release - regressions:
- wifi: fix dev_alloc_name() return value check
- rds: fix recursive lock in rds_tcp_conn_slots_available
Current release - new code bugs:
- vsock: lock down child_ns_mode as write-once
Previous releases - regressions:
- core:
- do not pass flow_id to set_rps_cpu()
- consume xmit errors of GSO frames
- netconsole: avoid OOB reads, msg is not nul-terminated
- netfilter: h323: fix OOB read in decode_choice()
- tcp: re-enable acceptance of FIN packets when RWIN is 0
- udplite: fix null-ptr-deref in __udp_enqueue_schedule_skb().
- wifi: brcmfmac: fix potential kernel oops when probe fails
- phy: register phy led_triggers during probe to avoid AB-BA deadlock
- eth:
- bnxt_en: fix deleting of Ntuple filters
- wan: farsync: fix use-after-free bugs caused by unfinished tasklets
- xscale: check for PTP support properly
Previous releases - always broken:
- tcp: fix potential race in tcp_v6_syn_recv_sock()
- kcm: fix zero-frag skb in frag_list on partial sendmsg error
- xfrm:
- fix race condition in espintcp_close()
- always flush state and policy upon NETDEV_UNREGISTER event
- bluetooth:
- purge error queues in socket destructors
- fix response to L2CAP_ECRED_CONN_REQ
- eth:
- mlx5:
- fix circular locking dependency in dump
- fix "scheduling while atomic" in IPsec MAC address query
- gve: fix incorrect buffer cleanup for QPL
- team: avoid NETDEV_CHANGEMTU event when unregistering slave
- usb: validate USB endpoints"
* tag 'net-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
netfilter: nf_conntrack_h323: fix OOB read in decode_choice()
dpaa2-switch: validate num_ifs to prevent out-of-bounds write
net: consume xmit errors of GSO frames
vsock: document write-once behavior of the child_ns_mode sysctl
vsock: lock down child_ns_mode as write-once
selftests/vsock: change tests to respect write-once child ns mode
net/mlx5e: Fix "scheduling while atomic" in IPsec MAC address query
net/mlx5: Fix missing devlink lock in SRIOV enable error path
net/mlx5: E-switch, Clear legacy flag when moving to switchdev
net/mlx5: LAG, disable MPESW in lag_disable_change()
net/mlx5: DR, Fix circular locking dependency in dump
selftests: team: Add a reference count leak test
team: avoid NETDEV_CHANGEMTU event when unregistering slave
net: mana: Fix double destroy_workqueue on service rescan PCI path
MAINTAINERS: Update maintainer entry for QUALCOMM ETHQOS ETHERNET DRIVER
dpll: zl3073x: Remove redundant cleanup in devm_dpll_init()
selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0
tcp: re-enable acceptance of FIN packets when RWIN is 0
vsock: Use container_of() to get net namespace in sysctl handlers
net: usb: kaweth: validate USB endpoints
...
|
|
In ath12k_wmi_tlv_fw_stats_data_parse() and
ath12k_wmi_tlv_rssi_chain_parse(), the driver uses
ieee80211_find_sta_by_ifaddr() to look up the station associated with the
incoming firmware statistics. This works under normal conditions but fails
during AP disconnection, resulting in log messages like:
wlan0: deauthenticating from xxxxxx by local choice (Reason: 3=DEAUTH_LEAVING)
wlan0: moving STA xxxxxx to state 3
wlan0: moving STA xxxxxx to state 2
wlan0: moving STA xxxxxx to state 1
ath12k_pci 0000:02:00.0: not found station bssid xxxxxx for vdev stat
ath12k_pci 0000:02:00.0: not found station of bssid xxxxxx for rssi chain
ath12k_pci 0000:02:00.0: failed to pull fw stats: -71
ath12k_pci 0000:02:00.0: time out while waiting for get fw stats
wlan0: Removed STA xxxxxx
wlan0: Destroyed STA xxxxxx
The failure happens because the station has already been removed from
ieee80211_local::sta_hash by the time firmware statistics are requested
through drv_sta_statistics().
Switch the lookup to ath12k_link_sta_find_by_addr(), which searches the
driver's link station hash table that still has the station recorded
at that time. This also implicitly fixes another issue: the current code
always uses deflink regardless of which link the statistics belong to,
which is incorrect in MLO scenarios. The new helper returns the correct
link station.
Additionally, raise the log level on lookup failures. With the updated
helper, such failures should no longer occur under normal conditions.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 79e7b04b5388 ("wifi: ath12k: report station mode signal strength")
Fixes: 6af5bc381b36 ("wifi: ath12k: report station mode per-chain signal strength")
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20260129-ath12k-fw-stats-fixes-v1-2-55d66064f4d5@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
To get firmware statistics, currently ar->pdev->pdev_id is passed as an
argument to ath12k_mac_get_fw_stats() in ath12k_mac_op_sta_statistics().
For single pdev device like WCN7850, its value is 0 which represents the
SoC pdev id. As a result, WCN7850 firmware sends the same reply to host
twice, which further results in memory leak:
unreferenced object 0xffff88812e286000 (size 192):
comm "softirq", pid 0, jiffies 4294981997
hex dump (first 32 bytes):
10 a5 40 11 81 88 ff ff 10 a5 40 11 81 88 ff ff ..@.......@.....
00 00 00 00 00 00 00 00 80 ff ff ff 33 05 00 00 ............3...
backtrace (crc cecc8c82):
__kmalloc_cache_noprof
ath12k_wmi_tlv_fw_stats_parse
ath12k_wmi_tlv_iter
ath12k_wmi_op_rx
ath12k_htc_rx_completion_handler
ath12k_ce_per_engine_service
ath12k_pci_ce_workqueue
process_one_work
bh_worker
tasklet_action
handle_softirqs
Detailed explanation is:
1. ath12k_mac_get_fw_stats() called in ath12k_mac_op_sta_statistics() to
get vdev statistics, making the caller thread wait.
2. firmware sends the first reply, ath12k_wmi_tlv_fw_stats_data_parse()
allocates buffers to cache necessary information. Following that, in
ath12k_wmi_fw_stats_process() if events of all started vdev haved been
received, is_end flag is set hence the waiting thread gets waken up by
the ar->fw_stats_done/->fw_stats_complete signals.
3. ath12k_mac_get_fw_stats() wakes up and returns successfully.
ath12k_mac_op_sta_statistics() saves required parameters and calls
ath12k_fw_stats_reset() to free buffers allocated earlier.
4. firmware sends the second reply. As usual, buffers are allocated and
attached to the ar->fw_stats.vdevs list. Note this time there is no
thread waiting, therefore no chance to free those buffers.
5. ath12k module gets unloaded. If there has been no more firmware
statistics request made since step 4, or if the request fails (see
the example in the following patch), there is no chance to call
ath12k_fw_stats_reset(). Consequently those buffers leak.
Actually for single pdev device, using SoC pdev id in
ath12k_mac_op_sta_statistics() is wrong, because the purpose is to get
statistics of a specific station, which is mapped to a specific pdev. That
said, the id of actual individual pdev should be fetched and used instead.
The helper ath12k_mac_get_target_pdev_id() serves for this purpose, hence
use it to fix this issue. Note it also works for other devices as well due
to the single_pdev_only check inside.
The same applies to ath12k_mac_op_get_txpower() and
ath12k_mac_op_link_sta_statistics() as well.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 79e7b04b5388 ("wifi: ath12k: report station mode signal strength")
Fixes: e92c658b056b ("wifi: ath12k: add get_txpower mac ops")
Fixes: ebebe66ec208 ("wifi: ath12k: fill link station statistics for MLO")
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20260129-ath12k-fw-stats-fixes-v1-1-55d66064f4d5@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
The driver obtains sw_attr.num_ifs from firmware via dpsw_get_attributes()
but never validates it against DPSW_MAX_IF (64). This value controls
iteration in dpaa2_switch_fdb_get_flood_cfg(), which writes port indices
into the fixed-size cfg->if_id[DPSW_MAX_IF] array. When firmware reports
num_ifs >= 64, the loop can write past the array bounds.
Add a bound check for num_ifs in dpaa2_switch_init().
dpaa2_switch_fdb_get_flood_cfg() appends the control interface (port
num_ifs) after all matched ports. When num_ifs == DPSW_MAX_IF and all
ports match the flood filter, the loop fills all 64 slots and the control
interface write overflows by one entry.
The check uses >= because num_ifs == DPSW_MAX_IF is also functionally
broken.
build_if_id_bitmap() silently drops any ID >= 64:
if (id[i] < DPSW_MAX_IF)
bmap[id[i] / 64] |= ...
Fixes: 539dda3c5d19 ("staging: dpaa2-switch: properly setup switching domains")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://patch.msgid.link/SYBPR01MB78812B47B7F0470B617C408AAF74A@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix a "scheduling while atomic" bug in mlx5e_ipsec_init_macs() by
replacing mlx5_query_mac_address() with ether_addr_copy() to get the
local MAC address directly from netdev->dev_addr.
The issue occurs because mlx5_query_mac_address() queries the hardware
which involves mlx5_cmd_exec() that can sleep, but it is called from
the mlx5e_ipsec_handle_event workqueue which runs in atomic context.
The MAC address is already available in netdev->dev_addr, so no need
to query hardware. This avoids the sleeping call and resolves the bug.
Call trace:
BUG: scheduling while atomic: kworker/u112:2/69344/0x00000200
__schedule+0x7ab/0xa20
schedule+0x1c/0xb0
schedule_timeout+0x6e/0xf0
__wait_for_common+0x91/0x1b0
cmd_exec+0xa85/0xff0 [mlx5_core]
mlx5_cmd_exec+0x1f/0x50 [mlx5_core]
mlx5_query_nic_vport_mac_address+0x7b/0xd0 [mlx5_core]
mlx5_query_mac_address+0x19/0x30 [mlx5_core]
mlx5e_ipsec_init_macs+0xc1/0x720 [mlx5_core]
mlx5e_ipsec_build_accel_xfrm_attrs+0x422/0x670 [mlx5_core]
mlx5e_ipsec_handle_event+0x2b9/0x460 [mlx5_core]
process_one_work+0x178/0x2e0
worker_thread+0x2ea/0x430
Fixes: cee137a63431 ("net/mlx5e: Handle ESN update events")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224114652.1787431-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit miss to add locking in the error path of
mlx5_sriov_enable(). When pci_enable_sriov() fails,
mlx5_device_disable_sriov() is called to clean up. This cleanup function
now expects to be called with the devlink instance lock held.
Add the missing devl_lock(devlink) and devl_unlock(devlink)
Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224114652.1787431-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit introduced MLX5_PRIV_FLAGS_SWITCH_LEGACY to identify
when a transition to legacy mode is requested via devlink. However, the
logic failed to clear this flag if the mode was subsequently changed
back to MLX5_ESWITCH_OFFLOADS (switchdev). Consequently, if a user
toggled from legacy to switchdev, the flag remained set, leaving the
driver with wrong state indicating
Fix this by explicitly clearing the MLX5_PRIV_FLAGS_SWITCH_LEGACY bit
when the requested mode is MLX5_ESWITCH_OFFLOADS.
Fixes: 2a4f56fbcc47 ("net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224114652.1787431-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlx5_lag_disable_change() unconditionally called mlx5_disable_lag() when
LAG was active, which is incorrect for MLX5_LAG_MODE_MPESW.
Hnece, call mlx5_disable_mpesw() when running in MPESW mode.
Fixes: a32327a3a02c ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224114652.1787431-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a circular locking dependency between dbg_mutex and the domain
rx/tx mutexes that could lead to a deadlock.
The dump path in dr_dump_domain_all() was acquiring locks in the order:
dbg_mutex -> rx.mutex -> tx.mutex
While the table/matcher creation paths acquire locks in the order:
rx.mutex -> tx.mutex -> dbg_mutex
This inverted lock ordering creates a circular dependency. Fix this by
changing dr_dump_domain_all() to acquire the domain lock before
dbg_mutex, matching the order used in mlx5dr_table_create() and
mlx5dr_matcher_create().
Lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
6.19.0-rc6net_next_e817c4e #1 Not tainted
------------------------------------------------------
sos/30721 is trying to acquire lock:
ffff888102df5900 (&dmn->info.rx.mutex){+.+.}-{4:4}, at:
dr_dump_start+0x131/0x450 [mlx5_core]
but task is already holding lock:
ffff888102df5bc0 (&dmn->dump_info.dbg_mutex){+.+.}-{4:4}, at:
dr_dump_start+0x10b/0x450 [mlx5_core]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dmn->dump_info.dbg_mutex){+.+.}-{4:4}:
__mutex_lock+0x91/0x1060
mlx5dr_matcher_create+0x377/0x5e0 [mlx5_core]
mlx5_cmd_dr_create_flow_group+0x62/0xd0 [mlx5_core]
mlx5_create_flow_group+0x113/0x1c0 [mlx5_core]
mlx5_chains_create_prio+0x453/0x2290 [mlx5_core]
mlx5_chains_get_table+0x2e2/0x980 [mlx5_core]
esw_chains_create+0x1e6/0x3b0 [mlx5_core]
esw_create_offloads_fdb_tables.cold+0x62/0x63f [mlx5_core]
esw_offloads_enable+0x76f/0xd20 [mlx5_core]
mlx5_eswitch_enable_locked+0x35a/0x500 [mlx5_core]
mlx5_devlink_eswitch_mode_set+0x561/0x950 [mlx5_core]
devlink_nl_eswitch_set_doit+0x67/0xe0
genl_family_rcv_msg_doit+0xe0/0x130
genl_rcv_msg+0x188/0x290
netlink_rcv_skb+0x4b/0xf0
genl_rcv+0x24/0x40
netlink_unicast+0x1ed/0x2c0
netlink_sendmsg+0x210/0x450
__sock_sendmsg+0x38/0x60
__sys_sendto+0x119/0x180
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x70/0xd00
entry_SYSCALL_64_after_hwframe+0x4b/0x53
-> #1 (&dmn->info.tx.mutex){+.+.}-{4:4}:
__mutex_lock+0x91/0x1060
mlx5dr_table_create+0x11d/0x530 [mlx5_core]
mlx5_cmd_dr_create_flow_table+0x62/0x140 [mlx5_core]
__mlx5_create_flow_table+0x46f/0x960 [mlx5_core]
mlx5_create_flow_table+0x16/0x20 [mlx5_core]
esw_create_offloads_fdb_tables+0x136/0x240 [mlx5_core]
esw_offloads_enable+0x76f/0xd20 [mlx5_core]
mlx5_eswitch_enable_locked+0x35a/0x500 [mlx5_core]
mlx5_devlink_eswitch_mode_set+0x561/0x950 [mlx5_core]
devlink_nl_eswitch_set_doit+0x67/0xe0
genl_family_rcv_msg_doit+0xe0/0x130
genl_rcv_msg+0x188/0x290
netlink_rcv_skb+0x4b/0xf0
genl_rcv+0x24/0x40
netlink_unicast+0x1ed/0x2c0
netlink_sendmsg+0x210/0x450
__sock_sendmsg+0x38/0x60
__sys_sendto+0x119/0x180
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x70/0xd00
entry_SYSCALL_64_after_hwframe+0x4b/0x53
-> #0 (&dmn->info.rx.mutex){+.+.}-{4:4}:
__lock_acquire+0x18b6/0x2eb0
lock_acquire+0xd3/0x2c0
__mutex_lock+0x91/0x1060
dr_dump_start+0x131/0x450 [mlx5_core]
seq_read_iter+0xe3/0x410
seq_read+0xfb/0x130
full_proxy_read+0x53/0x80
vfs_read+0xba/0x330
ksys_read+0x65/0xe0
do_syscall_64+0x70/0xd00
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&dmn->dump_info.dbg_mutex);
lock(&dmn->info.tx.mutex);
lock(&dmn->dump_info.dbg_mutex);
lock(&dmn->info.rx.mutex);
*** DEADLOCK ***
Fixes: 9222f0b27da2 ("net/mlx5: DR, Add support for dumping steering info")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260224114652.1787431-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
A good number of fixes:
- cfg80211:
- cancel rfkill work appropriately
- fix radiotap parsing to correctly reject field 18
- fix wext (yes...) off-by-one for IGTK key ID
- mac80211:
- fix for mesh NULL pointer dereference
- fix for stack out-of-bounds (2 bytes) write on
specific multi-link action frames
- set default WMM parameters for all links
- mwifiex: check dev_alloc_name() return value correctly
- libertas: fix potential timer use-after-free
- brcmfmac: fix crash on probe failure
* tag 'wireless-2026-02-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: fix NULL pointer dereference in mesh_rx_csa_frame()
wifi: mac80211: bounds-check link_id in ieee80211_ml_reconfiguration
wifi: mac80211: set default WMM parameters on all links
wifi: libertas: fix use-after-free in lbs_free_adapter()
wifi: mwifiex: Fix dev_alloc_name() return value check
wifi: brcmfmac: Fix potential kernel oops when probe fails
wifi: radiotap: reject radiotap with unknown bits
wifi: cfg80211: cancel rfkill_block work in wiphy_unregister()
wifi: cfg80211: wext: fix IGTK key ID off-by-one
====================
Link: https://patch.msgid.link/20260225113159.360574-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot is reporting
unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 3
ref_tracker: netdev@ffff88807dcf8618 has 1/2 users at
__netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
netdev_hold include/linux/netdevice.h:4429 [inline]
inetdev_init+0x201/0x4e0 net/ipv4/devinet.c:286
inetdev_event+0x251/0x1610 net/ipv4/devinet.c:1600
notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85
call_netdevice_notifiers_mtu net/core/dev.c:2318 [inline]
netif_set_mtu_ext+0x5aa/0x800 net/core/dev.c:9886
netif_set_mtu+0xd7/0x1b0 net/core/dev.c:9907
dev_set_mtu+0x126/0x260 net/core/dev_api.c:248
team_port_del+0xb07/0xcb0 drivers/net/team/team_core.c:1333
team_del_slave drivers/net/team/team_core.c:1936 [inline]
team_device_event+0x207/0x5b0 drivers/net/team/team_core.c:2929
notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85
call_netdevice_notifiers_extack net/core/dev.c:2281 [inline]
call_netdevice_notifiers net/core/dev.c:2295 [inline]
__dev_change_net_namespace+0xcb7/0x2050 net/core/dev.c:12592
do_setlink+0x2ce/0x4590 net/core/rtnetlink.c:3060
rtnl_changelink net/core/rtnetlink.c:3776 [inline]
__rtnl_newlink net/core/rtnetlink.c:3935 [inline]
rtnl_newlink+0x15a9/0x1be0 net/core/rtnetlink.c:4072
rtnetlink_rcv_msg+0x7d5/0xbe0 net/core/rtnetlink.c:6958
netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894
problem. Ido Schimmel found steps to reproduce
ip link add name team1 type team
ip link add name dummy1 mtu 1499 master team1 type dummy
ip netns add ns1
ip link set dev dummy1 netns ns1
ip -n ns1 link del dev dummy1
and also found that the same issue was fixed in the bond driver in
commit f51048c3e07b ("bonding: avoid NETDEV_CHANGEMTU event when
unregistering slave").
Let's do similar thing for the team driver, with commit ad7c7b2172c3 ("net:
hold netdev instance lock during sysfs operations") and commit 303a8487a657
("net: s/__dev_set_mtu/__netif_set_mtu/") also applied.
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Suggested-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260224125709.317574-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While testing corner cases in the driver, a use-after-free crash
was found on the service rescan PCI path.
When mana_serv_reset() calls mana_gd_suspend(), mana_gd_cleanup()
destroys gc->service_wq. If the subsequent mana_gd_resume() fails
with -ETIMEDOUT or -EPROTO, the code falls through to
mana_serv_rescan() which triggers pci_stop_and_remove_bus_device().
This invokes the PCI .remove callback (mana_gd_remove), which calls
mana_gd_cleanup() a second time, attempting to destroy the already-
freed workqueue. Fix this by NULL-checking gc->service_wq in
mana_gd_cleanup() and setting it to NULL after destruction.
Call stack of issue for reference:
[Sat Feb 21 18:53:48 2026] Call Trace:
[Sat Feb 21 18:53:48 2026] <TASK>
[Sat Feb 21 18:53:48 2026] mana_gd_cleanup+0x33/0x70 [mana]
[Sat Feb 21 18:53:48 2026] mana_gd_remove+0x3a/0xc0 [mana]
[Sat Feb 21 18:53:48 2026] pci_device_remove+0x41/0xb0
[Sat Feb 21 18:53:48 2026] device_remove+0x46/0x70
[Sat Feb 21 18:53:48 2026] device_release_driver_internal+0x1e3/0x250
[Sat Feb 21 18:53:48 2026] device_release_driver+0x12/0x20
[Sat Feb 21 18:53:48 2026] pci_stop_bus_device+0x6a/0x90
[Sat Feb 21 18:53:48 2026] pci_stop_and_remove_bus_device+0x13/0x30
[Sat Feb 21 18:53:48 2026] mana_do_service+0x180/0x290 [mana]
[Sat Feb 21 18:53:48 2026] mana_serv_func+0x24/0x50 [mana]
[Sat Feb 21 18:53:48 2026] process_one_work+0x190/0x3d0
[Sat Feb 21 18:53:48 2026] worker_thread+0x16e/0x2e0
[Sat Feb 21 18:53:48 2026] kthread+0xf7/0x130
[Sat Feb 21 18:53:48 2026] ? __pfx_worker_thread+0x10/0x10
[Sat Feb 21 18:53:48 2026] ? __pfx_kthread+0x10/0x10
[Sat Feb 21 18:53:48 2026] ret_from_fork+0x269/0x350
[Sat Feb 21 18:53:48 2026] ? __pfx_kthread+0x10/0x10
[Sat Feb 21 18:53:48 2026] ret_from_fork_asm+0x1a/0x30
[Sat Feb 21 18:53:48 2026] </TASK>
Fixes: 505cc26bcae0 ("net: mana: Add support for auxiliary device servicing events")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/aZ2bzL64NagfyHpg@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The kaweth driver should validate that the device it is probing has the
proper number and types of USB endpoints it is expecting before it binds
to it. If a malicious device were to not have the same urbs the driver
will crash later on when it blindly accesses these endpoints.
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://patch.msgid.link/2026022305-substance-virtual-c728@gregkh
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The kalmia driver should validate that the device it is probing has the
proper number and types of USB endpoints it is expecting before it binds
to it. If a malicious device were to not have the same urbs the driver
will crash later on when it blindly accesses these endpoints.
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730")
Link: https://patch.msgid.link/2026022326-shack-headstone-ef6f@gregkh
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pegasus driver should validate that the device it is probing has the
proper number and types of USB endpoints it is expecting before it binds
to it. If a malicious device were to not have the same urbs the driver
will crash later on when it blindly accesses these endpoints.
Cc: Petko Manolov <petkan@nucleusys.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026022347-legibly-attest-cc5c@gregkh
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Panther Lake systems introduced an autonomous power gating feature for
the integrated Gigabit Ethernet in shutdown state (S5) state. As part of
it, the reset value of DPG_EN bit was changed to 1. Clear this bit after
performing hardware reset to avoid errors such as Tx/Rx hangs, or packet
loss/corruption.
Fixes: 0c9183ce61bc ("e1000e: Add support for the next LOM generation")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add new board type for Panther Lake devices for separating device-specific
features and flows.
Additionally, remove the deprecated device IDs 0x57B5 and 0x57B6, which
are not used by any existing devices.
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
It may happen that VF spawned for E610 adapter has problem with setting
link up. This happens when ixgbevf supporting mailbox API 1.6 cooperates
with PF driver which doesn't support this version of API, and hence
doesn't support new approach for getting PF link data.
In that case VF asks PF to provide link data but as PF doesn't support
it, returns -EOPNOTSUPP what leads to early bail from link configuration
sequence.
Avoid such situation by using legacy VFLINKS approach whenever negotiated
API version is less than 1.6.
To reproduce the issue just create VF and set its link up - adapter must
be any from the E610 family, ixgbevf must support API 1.6 or higher while
ixgbevf must not.
Fixes: 53f0eb62b4d2 ("ixgbevf: fix getting link speed data for E610 devices")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Using get_cpu() in the tracepoint assignment causes an obvious preempt
count leak because nothing invokes put_cpu() to undo it:
softirq: huh, entered softirq 3 NET_RX with preempt_count 00000100, exited with 00000101?
This clearly has seen a lot of testing in the last 3+ years...
Use smp_processor_id() instead.
Fixes: 6d4d584a7ea8 ("i40e: Add i40e_napi_poll tracepoint")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Since the conversion of ice to page pool, the ethtool loopback test
crashes:
BUG: kernel NULL pointer dereference, address: 000000000000000c
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 1100f1067 P4D 0
Oops: Oops: 0002 [#1] SMP NOPTI
CPU: 23 UID: 0 PID: 5904 Comm: ethtool Kdump: loaded Not tainted 6.19.0-0.rc7.260128g1f97d9dcf5364.49.eln154.x86_64 #1 PREEMPT(lazy)
Hardware name: [...]
RIP: 0010:ice_alloc_rx_bufs+0x1cd/0x310 [ice]
Code: 83 6c 24 30 01 66 41 89 47 08 0f 84 c0 00 00 00 41 0f b7 dc 48 8b 44 24 18 48 c1 e3 04 41 bb 00 10 00 00 48 8d 2c 18 8b 04 24 <89> 45 0c 41 8b 4d 00 49 d3 e3 44 3b 5c 24 24 0f 83 ac fe ff ff 44
RSP: 0018:ff7894738aa1f768 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000700 RDI: 0000000000000000
RBP: 0000000000000000 R08: ff16dcae79880200 R09: 0000000000000019
R10: 0000000000000001 R11: 0000000000001000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ff16dcae6c670000
FS: 00007fcf428850c0(0000) GS:ff16dcb149710000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 0000000121227005 CR4: 0000000000773ef0
PKRU: 55555554
Call Trace:
<TASK>
ice_vsi_cfg_rxq+0xca/0x460 [ice]
ice_vsi_cfg_rxqs+0x54/0x70 [ice]
ice_loopback_test+0xa9/0x520 [ice]
ice_self_test+0x1b9/0x280 [ice]
ethtool_self_test+0xe5/0x200
__dev_ethtool+0x1106/0x1a90
dev_ethtool+0xbe/0x1a0
dev_ioctl+0x258/0x4c0
sock_do_ioctl+0xe3/0x130
__x64_sys_ioctl+0xb9/0x100
do_syscall_64+0x7c/0x700
entry_SYSCALL_64_after_hwframe+0x76/0x7e
[...]
It crashes because we have not initialized libeth for the rx ring.
Fix it by treating ICE_VSI_LB VSIs slightly more like normal PF VSIs and
letting them have a q_vector. It's just a dummy, because the loopback
test does not use interrupts, but it contains a napi struct that can be
passed to libeth_rx_fq_create() called from ice_vsi_cfg_rxq() ->
ice_rxq_pp_create().
Fixes: 93f53db9f9dc ("ice: switch to Page Pool")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix IRDMA hardware initialization timeout (-110) after resume by
separating VSI-dependent configuration from RDMA resource allocation,
ensuring VSI is rebuilt before IRDMA accesses it.
After resume from suspend, IRDMA hardware initialization fails:
ice: IRDMA hardware initialization FAILED init_state=4 status=-110
Separate RDMA initialization into two phases:
1. ice_init_rdma() - Allocate resources only (no VSI/QoS access, no plug)
2. ice_rdma_finalize_setup() - Assign VSI/QoS info and plug device
This allows:
- ice_init_rdma() to stay in ice_resume() (mirrors ice_deinit_rdma()
in ice_suspend())
- VSI assignment deferred until after ice_vsi_rebuild() completes
- QoS info updated after ice_dcb_rebuild() completes
- Device plugged only when control queues, VSI, and DCB are all ready
Fixes: bc69ad74867db ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When deleting a flow rule using "ethtool -N <dev> delete <location>",
idpf_sideband_action_ena() incorrectly validates fsp->ring_cookie even
though ethtool doesn't populate this field for delete operations. The
uninitialized ring_cookie may randomly match RX_CLS_FLOW_DISC or
RX_CLS_FLOW_WAKE, causing validation to fail and preventing legitimate
rule deletions. Remove the unnecessary sideband action enable check and
ring_cookie validation during delete operations since action validation
is not required when removing existing rules.
Fixes: ada3e24b84a0 ("idpf: add flow steering support")
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The code uses the vidx for the IRQ name but that doesn't match ethtool
reporting nor netdev naming, this makes it hard to tune the device and
associate queues with IRQs. Sequentially requesting irqs starting from
'0' makes the output consistent.
This commit changes the interrupt numbering but preserves the name
format, maintaining ABI compatibility. Existing tools relying on the old
numbering are already non-functional, as they lack a useful correlation
to the interrupts.
Before:
ethtool -L eth1 tx 1 combined 3
grep . /proc/irq/*/*idpf*/../smp_affinity_list
/proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167
/proc/irq/68/idpf-eth1-TxRx-1/../smp_affinity_list:0
/proc/irq/70/idpf-eth1-TxRx-3/../smp_affinity_list:1
/proc/irq/71/idpf-eth1-TxRx-4/../smp_affinity_list:2
/proc/irq/72/idpf-eth1-Tx-5/../smp_affinity_list:3
ethtool -S eth1 | grep -v ': 0'
NIC statistics:
tx_q-0_pkts: 1002
tx_q-1_pkts: 2679
tx_q-2_pkts: 1113
tx_q-3_pkts: 1192 <----- tx_q-3 vs idpf-eth1-Tx-5
rx_q-0_pkts: 1143
rx_q-1_pkts: 3172
rx_q-2_pkts: 1074
After:
ethtool -L eth1 tx 1 combined 3
grep . /proc/irq/*/*idpf*/../smp_affinity_list
/proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167
/proc/irq/68/idpf-eth1-TxRx-0/../smp_affinity_list:0
/proc/irq/70/idpf-eth1-TxRx-1/../smp_affinity_list:1
/proc/irq/71/idpf-eth1-TxRx-2/../smp_affinity_list:2
/proc/irq/72/idpf-eth1-Tx-3/../smp_affinity_list:3
ethtool -S eth1 | grep -v ': 0'
NIC statistics:
tx_q-0_pkts: 118
tx_q-1_pkts: 134
tx_q-2_pkts: 228
tx_q-3_pkts: 138 <--- tx_q-3 matches idpf-eth1-Tx-3
rx_q-0_pkts: 111
rx_q-1_pkts: 366
rx_q-2_pkts: 120
Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport")
Signed-off-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
rss_data->rss_key needs to be nullified after it is freed.
Checks like "if (!rss_data->rss_key)" in the code could fail
if it is not nullified.
Tested: built and booted the kernel.
Fixes: 83f38f210b85 ("idpf: Fix RSS LUT NULL pointer crash on early ethtool operations")
Signed-off-by: Li Li <boolli@google.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In idpf_txq_group_alloc(), if any txq group's txqs failed to
allocate memory:
for (j = 0; j < tx_qgrp->num_txq; j++) {
tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
GFP_KERNEL);
if (!tx_qgrp->txqs[j])
goto err_alloc;
}
It would cause a NULL ptr kernel panic in idpf_txq_group_rel():
for (j = 0; j < txq_grp->num_txq; j++) {
if (flow_sch_en) {
kfree(txq_grp->txqs[j]->refillq);
txq_grp->txqs[j]->refillq = NULL;
}
kfree(txq_grp->txqs[j]);
txq_grp->txqs[j] = NULL;
}
[ 6.532461] BUG: kernel NULL pointer dereference, address: 0000000000000058
...
[ 6.534433] RIP: 0010:idpf_txq_group_rel+0xc9/0x110
...
[ 6.538513] Call Trace:
[ 6.538639] <TASK>
[ 6.538760] idpf_vport_queues_alloc+0x75/0x550
[ 6.538978] idpf_vport_open+0x4d/0x3f0
[ 6.539164] idpf_open+0x71/0xb0
[ 6.539324] __dev_open+0x142/0x260
[ 6.539506] netif_open+0x2f/0xe0
[ 6.539670] dev_open+0x3d/0x70
[ 6.539827] bond_enslave+0x5ed/0xf50
[ 6.540005] ? rcutree_enqueue+0x1f/0xb0
[ 6.540193] ? call_rcu+0xde/0x2a0
[ 6.540375] ? barn_get_empty_sheaf+0x5c/0x80
[ 6.540594] ? __kfree_rcu_sheaf+0xb6/0x1a0
[ 6.540793] ? nla_put_ifalias+0x3d/0x90
[ 6.540981] ? kvfree_call_rcu+0xb5/0x3b0
[ 6.541173] ? kvfree_call_rcu+0xb5/0x3b0
[ 6.541365] do_set_master+0x114/0x160
[ 6.541547] do_setlink+0x412/0xfb0
[ 6.541717] ? security_sock_rcv_skb+0x2a/0x50
[ 6.541931] ? sk_filter_trim_cap+0x7c/0x320
[ 6.542136] ? skb_queue_tail+0x20/0x50
[ 6.542322] ? __nla_validate_parse+0x92/0xe50
ro[o t t o6 .d5e4f2a5u4l0t]- ? security_capable+0x35/0x60
[ 6.542792] rtnl_newlink+0x95c/0xa00
[ 6.542972] ? __rtnl_unlock+0x37/0x70
[ 6.543152] ? netdev_run_todo+0x63/0x530
[ 6.543343] ? allocate_slab+0x280/0x870
[ 6.543531] ? security_capable+0x35/0x60
[ 6.543722] rtnetlink_rcv_msg+0x2e6/0x340
[ 6.543918] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 6.544138] netlink_rcv_skb+0x16a/0x1a0
[ 6.544328] netlink_unicast+0x20a/0x320
[ 6.544516] netlink_sendmsg+0x304/0x3b0
[ 6.544748] __sock_sendmsg+0x89/0xb0
[ 6.544928] ____sys_sendmsg+0x167/0x1c0
[ 6.545116] ? ____sys_recvmsg+0xed/0x150
[ 6.545308] ___sys_sendmsg+0xdd/0x120
[ 6.545489] ? ___sys_recvmsg+0x124/0x1e0
[ 6.545680] ? rcutree_enqueue+0x1f/0xb0
[ 6.545867] ? rcutree_enqueue+0x1f/0xb0
[ 6.546055] ? call_rcu+0xde/0x2a0
[ 6.546222] ? evict+0x286/0x2d0
[ 6.546389] ? rcutree_enqueue+0x1f/0xb0
[ 6.546577] ? kmem_cache_free+0x2c/0x350
[ 6.546784] __x64_sys_sendmsg+0x72/0xc0
[ 6.546972] do_syscall_64+0x6f/0x890
[ 6.547150] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 6.547393] RIP: 0033:0x7fc1a3347bd0
...
[ 6.551375] RIP: 0010:idpf_txq_group_rel+0xc9/0x110
...
[ 6.578856] Rebooting in 10 seconds..
We should skip deallocating txqs[j] if it is NULL in the first place.
Tested: with this patch, the kernel panic no longer appears.
Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues")
Signed-off-by: Li Li <boolli@google.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In idpf_rxq_group_alloc(), if rx_qgrp->splitq.bufq_sets failed to get
allocated:
rx_qgrp->splitq.bufq_sets = kcalloc(vport->num_bufqs_per_qgrp,
sizeof(struct idpf_bufq_set),
GFP_KERNEL);
if (!rx_qgrp->splitq.bufq_sets) {
err = -ENOMEM;
goto err_alloc;
}
idpf_rxq_group_rel() would attempt to deallocate it in
idpf_rxq_sw_queue_rel(), causing a kernel panic:
```
[ 7.967242] early-network-sshd-n-rexd[3148]: knetbase: Info: [ 8.127804] BUG: kernel NULL pointer dereference, address: 00000000000000c0
...
[ 8.129779] RIP: 0010:idpf_rxq_group_rel+0x101/0x170
...
[ 8.133854] Call Trace:
[ 8.133980] <TASK>
[ 8.134092] idpf_vport_queues_alloc+0x286/0x500
[ 8.134313] idpf_vport_open+0x4d/0x3f0
[ 8.134498] idpf_open+0x71/0xb0
[ 8.134668] __dev_open+0x142/0x260
[ 8.134840] netif_open+0x2f/0xe0
[ 8.135004] dev_open+0x3d/0x70
[ 8.135166] bond_enslave+0x5ed/0xf50
[ 8.135345] ? nla_put_ifalias+0x3d/0x90
[ 8.135533] ? kvfree_call_rcu+0xb5/0x3b0
[ 8.135725] ? kvfree_call_rcu+0xb5/0x3b0
[ 8.135916] do_set_master+0x114/0x160
[ 8.136098] do_setlink+0x412/0xfb0
[ 8.136269] ? security_sock_rcv_skb+0x2a/0x50
[ 8.136509] ? sk_filter_trim_cap+0x7c/0x320
[ 8.136714] ? skb_queue_tail+0x20/0x50
[ 8.136899] ? __nla_validate_parse+0x92/0xe50
[ 8.137112] ? security_capable+0x35/0x60
[ 8.137304] rtnl_newlink+0x95c/0xa00
[ 8.137483] ? __rtnl_unlock+0x37/0x70
[ 8.137664] ? netdev_run_todo+0x63/0x530
[ 8.137855] ? allocate_slab+0x280/0x870
[ 8.138044] ? security_capable+0x35/0x60
[ 8.138235] rtnetlink_rcv_msg+0x2e6/0x340
[ 8.138431] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 8.138650] netlink_rcv_skb+0x16a/0x1a0
[ 8.138840] netlink_unicast+0x20a/0x320
[ 8.139028] netlink_sendmsg+0x304/0x3b0
[ 8.139217] __sock_sendmsg+0x89/0xb0
[ 8.139399] ____sys_sendmsg+0x167/0x1c0
[ 8.139588] ? ____sys_recvmsg+0xed/0x150
[ 8.139780] ___sys_sendmsg+0xdd/0x120
[ 8.139960] ? ___sys_recvmsg+0x124/0x1e0
[ 8.140152] ? rcutree_enqueue+0x1f/0xb0
[ 8.140341] ? rcutree_enqueue+0x1f/0xb0
[ 8.140528] ? call_rcu+0xde/0x2a0
[ 8.140695] ? evict+0x286/0x2d0
[ 8.140856] ? rcutree_enqueue+0x1f/0xb0
[ 8.141043] ? kmem_cache_free+0x2c/0x350
[ 8.141236] __x64_sys_sendmsg+0x72/0xc0
[ 8.141424] do_syscall_64+0x6f/0x890
[ 8.141603] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 8.141841] RIP: 0033:0x7f2799d21bd0
...
[ 8.149905] Kernel panic - not syncing: Fatal exception
[ 8.175940] Kernel Offset: 0xf800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 8.176425] Rebooting in 10 seconds..
```
Tested: With this patch, the kernel panic no longer appears.
Fixes: 95af467d9a4e ("idpf: configure resources for RX queues")
Signed-off-by: Li Li <boolli@google.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently, in idpf_wait_for_sw_marker_completion(), when an
IDPF_TXD_COMPLT_SW_MARKER packet is found, the routine breaks out of
the for loop and does not increment the next_to_clean counter. This
causes the subsequent NAPI polls to run into the same
IDPF_TXD_COMPLT_SW_MARKER packet again and print out the following:
[ 23.261341] idpf 0000:05:00.0 eth1: Unknown TX completion type: 5
Instead, we should increment next_to_clean regardless when an
IDPF_TXD_COMPLT_SW_MARKER packet is found.
Tested: with the patch applied, we do not see the errors above from NAPI
polls anymore.
Fixes: 9d39447051a0 ("idpf: remove SW marker handling from NAPI")
Signed-off-by: Li Li <boolli@google.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When stmmac_init_timestamping() is called, it clears the receive and
transmit path booleans that allow timestamps to be read. These are
never re-initialised until after userspace requests timestamping
features to be enabled.
However, our copy of the timestamp configuration is not cleared, which
means we return the old configuration to userspace when requested.
This is inconsistent. Fix this by clearing the timestamp configuration.
Fixes: d6228b7cdd6e ("net: stmmac: implement the SIOCGHWTSTAMP ioctl")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/E1vuUu4-0000000Afea-0j9B@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is an AB-BA deadlock when both LEDS_TRIGGER_NETDEV and
LED_TRIGGER_PHY are enabled:
[ 1362.049207] [<8054e4b8>] led_trigger_register+0x5c/0x1fc <-- Trying to get lock "triggers_list_lock" via down_write(&triggers_list_lock);
[ 1362.054536] [<80662830>] phy_led_triggers_register+0xd0/0x234
[ 1362.060329] [<8065e200>] phy_attach_direct+0x33c/0x40c
[ 1362.065489] [<80651fc4>] phylink_fwnode_phy_connect+0x15c/0x23c
[ 1362.071480] [<8066ee18>] mtk_open+0x7c/0xba0
[ 1362.075849] [<806d714c>] __dev_open+0x280/0x2b0
[ 1362.080384] [<806d7668>] __dev_change_flags+0x244/0x24c
[ 1362.085598] [<806d7698>] dev_change_flags+0x28/0x78
[ 1362.090528] [<807150e4>] dev_ioctl+0x4c0/0x654 <-- Hold lock "rtnl_mutex" by calling rtnl_lock();
[ 1362.094985] [<80694360>] sock_ioctl+0x2f4/0x4e0
[ 1362.099567] [<802e9c4c>] sys_ioctl+0x32c/0xd8c
[ 1362.104022] [<80014504>] syscall_common+0x34/0x58
Here LED_TRIGGER_PHY is registering LED triggers during phy_attach
while holding RTNL and then taking triggers_list_lock.
[ 1362.191101] [<806c2640>] register_netdevice_notifier+0x60/0x168 <-- Trying to get lock "rtnl_mutex" via rtnl_lock();
[ 1362.197073] [<805504ac>] netdev_trig_activate+0x194/0x1e4
[ 1362.202490] [<8054e28c>] led_trigger_set+0x1d4/0x360 <-- Hold lock "triggers_list_lock" by down_read(&triggers_list_lock);
[ 1362.207511] [<8054eb38>] led_trigger_write+0xd8/0x14c
[ 1362.212566] [<80381d98>] sysfs_kf_bin_write+0x80/0xbc
[ 1362.217688] [<8037fcd8>] kernfs_fop_write_iter+0x17c/0x28c
[ 1362.223174] [<802cbd70>] vfs_write+0x21c/0x3c4
[ 1362.227712] [<802cc0c4>] ksys_write+0x78/0x12c
[ 1362.232164] [<80014504>] syscall_common+0x34/0x58
Here LEDS_TRIGGER_NETDEV is being enabled on an LED. It first takes
triggers_list_lock and then RTNL. A classical AB-BA deadlock.
phy_led_triggers_registers() does not require the RTNL, it does not
make any calls into the network stack which require protection. There
is also no requirement the PHY has been attached to a MAC, the
triggers only make use of phydev state. This allows the call to
phy_led_triggers_registers() to be placed elsewhere. PHY probe() and
release() don't hold RTNL, so solving the AB-BA deadlock.
Reported-by: Shiji Yang <yangshiji66@outlook.com>
Closes: https://lore.kernel.org/all/OS7PR01MB13602B128BA1AD3FA38B6D1FFBC69A@OS7PR01MB13602.jpnprd01.prod.outlook.com/
Fixes: 06f502f57d0d ("leds: trigger: Introduce a NETDEV trigger")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Shiji Yang <yangshiji66@outlook.com>
Link: https://patch.msgid.link/20260222152601.1978655-1-andrew@lunn.ch
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
pegasus_probe() fills URBs with hardcoded endpoint pipes without
verifying the endpoint descriptors:
- usb_rcvbulkpipe(dev, 1) for RX data
- usb_sndbulkpipe(dev, 2) for TX data
- usb_rcvintpipe(dev, 3) for status interrupts
A malformed USB device can present these endpoints with transfer types
that differ from what the driver assumes.
Add a pegasus_usb_ep enum for endpoint numbers, replacing magic
constants throughout. Add usb_check_bulk_endpoints() and
usb_check_int_endpoints() calls before any resource allocation to
verify endpoint types before use, rejecting devices with mismatched
descriptors at probe time, and avoid triggering assertion.
Similar fix to
- commit 90b7f2961798 ("net: usb: rtl8150: enable basic endpoint checking")
- commit 9e7021d2aeae ("net: usb: catc: enable basic endpoint checking")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260222050633.410165-1-n7l8m4@u.northwestern.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
msg passed to netconsole from the console subsystem is not guaranteed
to be nul-terminated. Before recent
commit 7eab73b18630 ("netconsole: convert to NBCON console infrastructure")
the message would be placed in printk_shared_pbufs, a static global
buffer, so KASAN had harder time catching OOB accesses. Now we see:
printk: console [netcon_ext0] enabled
BUG: KASAN: slab-out-of-bounds in string+0x1f7/0x240
Read of size 1 at addr ffff88813b6d4c00 by task pr/netcon_ext0/594
CPU: 65 UID: 0 PID: 594 Comm: pr/netcon_ext0 Not tainted 6.19.0-11754-g4246fd6547c9
Call Trace:
kasan_report+0xe4/0x120
string+0x1f7/0x240
vsnprintf+0x655/0xba0
scnprintf+0xba/0x120
netconsole_write+0x3fe/0xa10
nbcon_emit_next_record+0x46e/0x860
nbcon_kthread_func+0x623/0x750
Allocated by task 1:
nbcon_alloc+0x1ea/0x450
register_console+0x26b/0xe10
init_netconsole+0xbb0/0xda0
The buggy address belongs to the object at ffff88813b6d4000
which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 0 bytes to the right of
allocated 3072-byte region [ffff88813b6d4000, ffff88813b6d4c00)
Fixes: c62c0a17f9b7 ("netconsole: Append kernel version to message")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260219195021.2099699-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When the FarSync T-series card is being detached, the fst_card_info is
deallocated in fst_remove_one(). However, the fst_tx_task or fst_int_task
may still be running or pending, leading to use-after-free bugs when the
already freed fst_card_info is accessed in fst_process_tx_work_q() or
fst_process_int_work_q().
A typical race condition is depicted below:
CPU 0 (cleanup) | CPU 1 (tasklet)
| fst_start_xmit()
fst_remove_one() | tasklet_schedule()
unregister_hdlc_device()|
| fst_process_tx_work_q() //handler
kfree(card) //free | do_bottom_half_tx()
| card-> //use
The following KASAN trace was captured:
==================================================================
BUG: KASAN: slab-use-after-free in do_bottom_half_tx+0xb88/0xd00
Read of size 4 at addr ffff88800aad101c by task ksoftirqd/3/32
...
Call Trace:
<IRQ>
dump_stack_lvl+0x55/0x70
print_report+0xcb/0x5d0
? do_bottom_half_tx+0xb88/0xd00
kasan_report+0xb8/0xf0
? do_bottom_half_tx+0xb88/0xd00
do_bottom_half_tx+0xb88/0xd00
? _raw_spin_lock_irqsave+0x85/0xe0
? __pfx__raw_spin_lock_irqsave+0x10/0x10
? __pfx___hrtimer_run_queues+0x10/0x10
fst_process_tx_work_q+0x67/0x90
tasklet_action_common+0x1fa/0x720
? hrtimer_interrupt+0x31f/0x780
handle_softirqs+0x176/0x530
__irq_exit_rcu+0xab/0xe0
sysvec_apic_timer_interrupt+0x70/0x80
...
Allocated by task 41 on cpu 3 at 72.330843s:
kasan_save_stack+0x24/0x50
kasan_save_track+0x17/0x60
__kasan_kmalloc+0x7f/0x90
fst_add_one+0x1a5/0x1cd0
local_pci_probe+0xdd/0x190
pci_device_probe+0x341/0x480
really_probe+0x1c6/0x6a0
__driver_probe_device+0x248/0x310
driver_probe_device+0x48/0x210
__device_attach_driver+0x160/0x320
bus_for_each_drv+0x101/0x190
__device_attach+0x198/0x3a0
device_initial_probe+0x78/0xa0
pci_bus_add_device+0x81/0xc0
pci_bus_add_devices+0x7e/0x190
enable_slot+0x9b9/0x1130
acpiphp_check_bridge.part.0+0x2e1/0x460
acpiphp_hotplug_notify+0x36c/0x3c0
acpi_device_hotplug+0x203/0xb10
acpi_hotplug_work_fn+0x59/0x80
...
Freed by task 41 on cpu 1 at 75.138639s:
kasan_save_stack+0x24/0x50
kasan_save_track+0x17/0x60
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x43/0x70
kfree+0x135/0x410
fst_remove_one+0x2ca/0x540
pci_device_remove+0xa6/0x1d0
device_release_driver_internal+0x364/0x530
pci_stop_bus_device+0x105/0x150
pci_stop_and_remove_bus_device+0xd/0x20
disable_slot+0x116/0x260
acpiphp_disable_and_eject_slot+0x4b/0x190
acpiphp_hotplug_notify+0x230/0x3c0
acpi_device_hotplug+0x203/0xb10
acpi_hotplug_work_fn+0x59/0x80
...
The buggy address belongs to the object at ffff88800aad1000
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 28 bytes inside of
freed 1024-byte region
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xaad0
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x100000000000040(head|node=0|zone=1)
page_type: f5(slab)
raw: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
head: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000
head: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
head: 0100000000000003 ffffea00002ab401 00000000ffffffff 00000000ffffffff
head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88800aad0f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800aad0f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88800aad1000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88800aad1080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88800aad1100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fix this by ensuring that both fst_tx_task and fst_int_task are properly
canceled before the fst_card_info is released. Add tasklet_kill() in
fst_remove_one() to synchronize with any pending or running tasklets.
Since unregister_hdlc_device() stops data transmission and reception,
and fst_disable_intr() prevents further interrupts, it is appropriate
to place tasklet_kill() after these calls.
The bugs were identified through static analysis. To reproduce the issue
and validate the fix, a FarSync T-series card was simulated in QEMU and
delays(e.g., mdelay()) were introduced within the tasklet handler to
increase the likelihood of triggering the race condition.
Fixes: 2f623aaf9f31 ("net: farsync: Fix kmemleak when rmmods farsync")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260219124637.72578-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In DQ-QPL mode, gve_tx_clean_pending_packets() incorrectly uses the RDA
buffer cleanup path. It iterates num_bufs times and attempts to unmap
entries in the dma array.
This leads to two issues:
1. The dma array shares storage with tx_qpl_buf_ids (union).
Interpreting buffer IDs as DMA addresses results in attempting to
unmap incorrect memory locations.
2. num_bufs in QPL mode (counting 2K chunks) can significantly exceed
the size of the dma array, causing out-of-bounds access warnings
(trace below is how we noticed this issue).
UBSAN: array-index-out-of-bounds in
drivers/net/ethernet/drivers/net/ethernet/google/gve/gve_tx_dqo.c:178:5 index 18 is out of
range for type 'dma_addr_t[18]' (aka 'unsigned long long[18]')
Workqueue: gve gve_service_task [gve]
Call Trace:
<TASK>
dump_stack_lvl+0x33/0xa0
__ubsan_handle_out_of_bounds+0xdc/0x110
gve_tx_stop_ring_dqo+0x182/0x200 [gve]
gve_close+0x1be/0x450 [gve]
gve_reset+0x99/0x120 [gve]
gve_service_task+0x61/0x100 [gve]
process_scheduled_works+0x1e9/0x380
Fix this by properly checking for QPL mode and delegating to
gve_free_tx_qpl_bufs() to reclaim the buffers.
Cc: stable@vger.kernel.org
Fixes: a6fb8d5a8b69 ("gve: Tx path for DQO-QPL")
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260220215324.1631350-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The lbs_free_adapter() function uses timer_delete() (non-synchronous)
for both command_timer and tx_lockup_timer before the structure is
freed. This is incorrect because timer_delete() does not wait for
any running timer callback to complete.
If a timer callback is executing when lbs_free_adapter() is called,
the callback will access freed memory since lbs_cfg_free() frees the
containing structure immediately after lbs_free_adapter() returns.
Both timer callbacks (lbs_cmd_timeout_handler and lbs_tx_lockup_handler)
access priv->driver_lock, priv->cur_cmd, priv->dev, and other fields,
which would all be use-after-free violations.
Use timer_delete_sync() instead to ensure any running timer callback
has completed before returning.
This bug was introduced in commit 8f641d93c38a ("libertas: detect TX
lockups and reset hardware") where del_timer() was used instead of
del_timer_sync() in the cleanup path. The command_timer has had the
same issue since the driver was first written.
Fixes: 8f641d93c38a ("libertas: detect TX lockups and reset hardware")
Fixes: 954ee164f4f4 ("[PATCH] libertas: reorganize and simplify init sequence")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Hodges <git@danielhodges.dev>
Link: https://patch.msgid.link/20260206195356.15647-1-git@danielhodges.dev
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
dev_alloc_name() returns the allocated ID on success, which could be
over 0.
Fix the return value check to check for negative error codes.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aYmsQfujoAe5qO02@stanley.mountain/
Fixes: 7bab5bdb81e3 ("wifi: mwifiex: Allocate dev name earlier for interface workqueue name")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20260210100337.1131279-1-wenst@chromium.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When probe of the sdio brcmfmac device fails for some reasons (i.e.
missing firmware), the sdiodev->bus is set to error instead of NULL, thus
the cleanup later in brcmf_sdio_remove() tries to free resources via
invalid bus pointer. This happens because sdiodev->bus is set 2 times:
first in brcmf_sdio_probe() and second time in brcmf_sdiod_probe(). Fix
this by chaning the brcmf_sdio_probe() function to return the error code
and set sdio->bus only there.
Fixes: 0ff0843310b7 ("wifi: brcmfmac: Add optional lpo clock enable support")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Arend van Spriel<arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20260203102133.1478331-1-m.szyprowski@samsung.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Conversion performed via this Coccinelle script:
// SPDX-License-Identifier: GPL-2.0-only
// Options: --include-headers-for-types --all-includes --include-headers --keep-comments
virtual patch
@gfp depends on patch && !(file in "tools") && !(file in "samples")@
identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
kzalloc_obj,kzalloc_objs,kzalloc_flex,
kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
@@
ALLOC(...
- , GFP_KERNEL
)
$ make coccicheck MODE=patch COCCI=gfp.cocci
Build and boot tested x86_64 with Fedora 42's GCC and Clang:
Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.
As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Coccinelle doesn't handle re-indenting line escapes. Fix the 2 places
where these got misaligned.
Remove 2 now-redundant type casts, found with:
$ git grep -P 'struct (\S+).*\)\s*k\S+alloc_(objs?|flex)\(struct \1'
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
When processing TCP stream data in ovpn_tcp_recv, we receive large
cloned skbs from __strp_rcv that may contain multiple coalesced packets.
The current implementation has two bugs:
1. Header offset overflow: Using pskb_pull with large offsets on
coalesced skbs causes skb->data - skb->head to exceed the u16 storage
of skb->network_header. This causes skb_reset_network_header to fail
on the inner decapsulated packet, resulting in packet drops.
2. Unaligned protocol headers: Extracting packets from arbitrary
positions within the coalesced TCP stream provides no alignment
guarantees for the packet data causing performance penalties on
architectures without efficient unaligned access. Additionally,
openvpn's 2-byte length prefix on TCP packets causes the subsequent
4-byte opcode and packet ID fields to be inherently misaligned.
Fix both issues by allocating a new skb for each openvpn packet and
using skb_copy_bits to extract only the packet content into the new
buffer, skipping the 2-byte length prefix. Also, check the length before
invoking the function that performs the allocation to avoid creating an
invalid skb.
If the packet has to be forwarded to userspace the 2-byte prefix can be
pushed to the head safely, without misalignment.
As a side effect, this approach also avoids the expensive linearization
that pskb_pull triggers on cloned skbs with page fragments. In testing,
this resulted in TCP throughput improvements of up to 74%.
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|