summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
4 daysidpf: only assign num refillqs if allocation was successfulJoshua Hay
As reported by AI review [1], if the refillqs allocation fails, refillqs will be NULL but num_refillqs will be non-zero. The release function will then dereference refillqs since it thinks the refillqs are present, resulting in a NULL ptr dereference. Only assign the num refillqs if the allocation was successful. This will prevent the release function from entering the loop and accessing refillqs. [1] https://lore.kernel.org/netdev/20260227035625.2632753-1-kuba@kernel.org/ Fixes: 95af467d9a4e3 ("idpf: configure resources for RX queues") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 daysidpf: clear stale cdev_info ptrJoshua Hay
Deinit calls idpf_idc_deinit_core_aux_device to free the cdev_info memory, but leaves the adapter->cdev_info field with a stale pointer value. This will bypass subsequent "if (!cdev_info)" checks if cdev_info is not reallocated. For example, if idc_init fails after a reset, cdev_info will already have been freed during the reset handling, but it will not have been reallocated. The next reset or rmmod will result in a crash. [ +0.000008] BUG: kernel NULL pointer dereference, address: 00000000000000d0 [ +0.000033] #PF: supervisor read access in kernel mode [ +0.000020] #PF: error_code(0x0000) - not-present page [ +0.000017] PGD 2097dfa067 P4D 0 [ +0.000017] Oops: Oops: 0000 [#1] SMP NOPTI ... [ +0.000018] RIP: 0010:device_del+0x3e/0x3d0 [ +0.000010] Call Trace: [ +0.000010] <TASK> [ +0.000012] idpf_idc_deinit_core_aux_device+0x36/0x70 [idpf] [ +0.000034] idpf_vc_core_deinit+0x3e/0x180 [idpf] [ +0.000035] idpf_remove+0x40/0x1d0 [idpf] [ +0.000035] pci_device_remove+0x42/0xb0 [ +0.000020] device_release_driver_internal+0x19c/0x200 [ +0.000024] driver_detach+0x48/0x90 [ +0.000018] bus_remove_driver+0x6d/0x100 [ +0.000023] pci_unregister_driver+0x2e/0xb0 [ +0.000022] __do_sys_delete_module.isra.0+0x18c/0x2b0 [ +0.000025] ? kmem_cache_free+0x2c2/0x390 [ +0.000023] do_syscall_64+0x107/0x7d0 [ +0.000023] entry_SYSCALL_64_after_hwframe+0x76/0x7e Pass the adapter struct into idpf_idc_deinit_core_aux_device instead and clear the cdev_info ptr. Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 daysiavf: fix out-of-bounds writes in iavf_get_ethtool_stats()Kohei Enju
iavf incorrectly uses real_num_tx_queues for ETH_SS_STATS. Since the value could change in runtime, we should use num_tx_queues instead. Moreover iavf_get_ethtool_stats() uses num_active_queues while iavf_get_sset_count() and iavf_get_stat_strings() use real_num_tx_queues, which triggers out-of-bounds writes when we do "ethtool -L" and "ethtool -S" simultaneously [1]. For example when we change channels from 1 to 8, Thread 3 could be scheduled before Thread 2, and out-of-bounds writes could be triggered in Thread 3: Thread 1 (ethtool -L) Thread 2 (work) Thread 3 (ethtool -S) iavf_set_channels() ... iavf_alloc_queues() -> num_active_queues = 8 iavf_schedule_finish_config() iavf_get_sset_count() real_num_tx_queues: 1 -> buffer for 1 queue iavf_get_ethtool_stats() num_active_queues: 8 -> out-of-bounds! iavf_finish_config() -> real_num_tx_queues = 8 Use immutable num_tx_queues in all related functions to avoid the issue. [1] BUG: KASAN: vmalloc-out-of-bounds in iavf_add_one_ethtool_stat+0x200/0x270 Write of size 8 at addr ffffc900031c9080 by task ethtool/5800 CPU: 1 UID: 0 PID: 5800 Comm: ethtool Not tainted 6.19.0-enjuk-08403-g8137e3db7f1c #241 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6f/0xb0 print_report+0x170/0x4f3 kasan_report+0xe1/0x180 iavf_add_one_ethtool_stat+0x200/0x270 iavf_get_ethtool_stats+0x14c/0x2e0 __dev_ethtool+0x3d0c/0x5830 dev_ethtool+0x12d/0x270 dev_ioctl+0x53c/0xe30 sock_do_ioctl+0x1a9/0x270 sock_ioctl+0x3d4/0x5e0 __x64_sys_ioctl+0x137/0x1c0 do_syscall_64+0xf3/0x690 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f7da0e6e36d ... </TASK> The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900031c9000 allocated at __dev_ethtool+0x3cc9/0x5830 The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88813a013de0 pfn:0x13a013 flags: 0x200000000000000(node=0|zone=2) raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000 raw: ffff88813a013de0 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffc900031c8f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc900031c9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffc900031c9080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ^ ffffc900031c9100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffffc900031c9180: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 Fixes: 64430f70ba6f ("iavf: Fix displaying queue statistics shown by ethtool") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 daysice: use ice_update_eth_stats() for representor statsPetr Oros
ice_repr_get_stats64() and __ice_get_ethtool_stats() call ice_update_vsi_stats() on the VF's src_vsi. This always returns early because ICE_VSI_DOWN is permanently set for VF VSIs - ice_up() is never called on them since queues are managed by iavf through virtchnl. In __ice_get_ethtool_stats() the original code called ice_update_vsi_stats() for all VSIs including representors, iterated over ice_gstrings_vsi_stats[] to populate the data, and then bailed out with an early return before the per-queue ring stats section. That early return was necessary because representor VSIs have no rings on the PF side - the rings belong to the VF driver (iavf), so accessing per-queue stats would be invalid. Move the representor handling to the top of __ice_get_ethtool_stats() and call ice_update_eth_stats() directly to read the hardware GLV_* counters. This matches ice_get_vf_stats() which already uses ice_update_eth_stats() for the same VF VSI in legacy mode. Apply the same fix to ice_repr_get_stats64(). Note that ice_gstrings_vsi_stats[] contains five software ring counters (rx_buf_failed, rx_page_failed, tx_linearize, tx_busy, tx_restart) that are always zero for representors since the PF never processes packets on VF rings. This is pre-existing behavior unchanged by this patch. Fixes: 7aae80cef7ba ("ice: add port representor ethtool ops and stats") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Patryk Holda <patryk.holda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 daysice: fix inverted ready check for VF representorsPetr Oros
Commit 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops") refactored the VF readiness check into a generic repr->ops.ready() callback but implemented ice_repr_ready_vf() with inverted logic: return !ice_check_vf_ready_for_cfg(repr->vf); ice_check_vf_ready_for_cfg() returns 0 on success, so the negation makes ready() return non-zero when the VF is ready. All callers treat non-zero as "not ready, skip", causing ndo_get_stats64, get_drvinfo, get_strings and get_ethtool_stats to always bail out in switchdev mode. Remove the erroneous negation. The SF variant ice_repr_ready_sf() is already correct (returns !active, i.e. non-zero when not active). Fixes: 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Patryk Holda <patryk.holda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 daysice: set max queues in alloc_etherdev_mqs()Michal Swiatkowski
When allocating netdevice using alloc_etherdev_mqs() the maximum supported queues number should be passed. The vsi->alloc_txq/rxq is storing current number of queues, not the maximum ones. Use the same function for getting max Tx and Rx queues which is used during ethtool -l call to set maximum number of queues during netdev allocation. Reproduction steps: $ethtool -l $pf # says current 16, max 64 $ethtool -S $pf # fine $ethtool -L $pf combined 40 # crash [491187.472594] Call Trace: [491187.472829] <TASK> [491187.473067] netif_set_xps_queue+0x26/0x40 [491187.473305] ice_vsi_cfg_txq+0x265/0x3d0 [ice] [491187.473619] ice_vsi_cfg_lan_txqs+0x68/0xa0 [ice] [491187.473918] ice_vsi_cfg_lan+0x2b/0xa0 [ice] [491187.474202] ice_vsi_open+0x71/0x170 [ice] [491187.474484] ice_vsi_recfg_qs+0x17f/0x230 [ice] [491187.474759] ? dev_get_min_mp_channel_count+0xab/0xd0 [491187.474987] ice_set_channels+0x185/0x3d0 [ice] [491187.475278] ethnl_set_channels+0x26f/0x340 Fixes: ee13aa1a2c5a ("ice: use netif_get_num_default_rss_queues()") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 dayslibie: prevent memleak in fwlog codeMichal Swiatkowski
All cmd_buf buffers are allocated and need to be freed after usage. Add an error unwinding path that properly frees these buffers. The memory leak happens whenever fwlog configuration is changed. For example: $echo 256K > /sys/kernel/debug/ixgbe/0000\:32\:00.0/fwlog/log_size Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 daysiavf: fix VLAN filter lost on add/delete racePetr Oros
When iavf_add_vlan() finds an existing filter in IAVF_VLAN_REMOVE state, it transitions the filter to IAVF_VLAN_ACTIVE assuming the pending delete can simply be cancelled. However, there is no guarantee that iavf_del_vlans() has not already processed the delete AQ request and removed the filter from the PF. In that case the filter remains in the driver's list as IAVF_VLAN_ACTIVE but is no longer programmed on the NIC. Since iavf_add_vlans() only picks up filters in IAVF_VLAN_ADD state, the filter is never re-added, and spoof checking drops all traffic for that VLAN. CPU0 CPU1 Workqueue ---- ---- --------- iavf_del_vlan(vlan 100) f->state = REMOVE schedule AQ_DEL_VLAN iavf_add_vlan(vlan 100) f->state = ACTIVE iavf_del_vlans() f is ACTIVE, skip iavf_add_vlans() f is ACTIVE, skip Filter is ACTIVE in driver but absent from NIC. Transition to IAVF_VLAN_ADD instead and schedule IAVF_FLAG_AQ_ADD_VLAN_FILTER so iavf_add_vlans() re-programs the filter. A duplicate add is idempotent on the PF. Fixes: 0c0da0e95105 ("iavf: refactor VLAN filter states") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 daysigc: fix page fault in XDP TX timestamps handlingZdenek Bouska
If an XDP application that requested TX timestamping is shutting down while the link of the interface in use is still up the following kernel splat is reported: [ 883.803618] [ T1554] BUG: unable to handle page fault for address: ffffcfb6200fd008 ... [ 883.803650] [ T1554] Call Trace: [ 883.803652] [ T1554] <TASK> [ 883.803654] [ T1554] igc_ptp_tx_tstamp_event+0xdf/0x160 [igc] [ 883.803660] [ T1554] igc_tsync_interrupt+0x2d5/0x300 [igc] ... During shutdown of the TX ring the xsk_meta pointers are left behind, so that the IRQ handler is trying to touch them. This issue is now being fixed by cleaning up the stale xsk meta data on TX shutdown. TX timestamps on other queues remain unaffected. Fixes: 15fd021bc427 ("igc: Add Tx hardware timestamp request for AF_XDP zero-copy packet") Signed-off-by: Zdenek Bouska <zdenek.bouska@siemens.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 daysigc: fix missing update of skb->tail in igc_xmit_frame()Kohei Enju
igc_xmit_frame() misses updating skb->tail when the packet size is shorter than the minimum one. Use skb_put_padto() in alignment with other Intel Ethernet drivers. Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-10e1000/e1000e: Fix leak in DMA error cleanupMatt Vollrath
If an error is encountered while mapping TX buffers, the driver should unmap any buffers already mapped for that skb. Because count is incremented after a successful mapping, it will always match the correct number of unmappings needed when dma_error is reached. Decrementing count before the while loop in dma_error causes an off-by-one error. If any mapping was successful before an unsuccessful mapping, exactly one DMA mapping would leak. In these commits, a faulty while condition caused an infinite loop in dma_error: Commit 03b1320dfcee ("e1000e: remove use of skb_dma_map from e1000e driver") Commit 602c0554d7b0 ("e1000: remove use of skb_dma_map from e1000 driver") Commit c1fa347f20f1 ("e1000/e1000e/igb/igbvf/ixgb/ixgbe: Fix tests of unsigned in *_tx_map()") fixed the infinite loop, but introduced the off-by-one error. This issue may still exist in the igbvf driver, but I did not address it in this patch. Fixes: c1fa347f20f1 ("e1000/e1000e/igb/igbvf/ixgb/ixgbe: Fix tests of unsigned in *_tx_map()") Assisted-by: Claude:claude-4.6-opus Signed-off-by: Matt Vollrath <tactii@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-10i40e: fix src IP mask checks and memcpy argument names in cloud filterAlok Tiwari
Fix following issues in the IPv4 and IPv6 cloud filter handling logic in both the add and delete paths: - The source-IP mask check incorrectly compares mask.src_ip[0] against tcf.dst_ip[0]. Update it to compare against tcf.src_ip[0]. This likely goes unnoticed because the check is in an "else if" path that only executes when dst_ip is not set, most cloud filter use cases focus on destination-IP matching, and the buggy condition can accidentally evaluate true in some cases. - memcpy() for the IPv4 source address incorrectly uses ARRAY_SIZE(tcf.dst_ip) instead of ARRAY_SIZE(tcf.src_ip), although both arrays are the same size. - The IPv4 memcpy operations used ARRAY_SIZE(tcf.dst_ip) and ARRAY_SIZE (tcf.src_ip), Update these to use sizeof(cfilter->ip.v4.dst_ip) and sizeof(cfilter->ip.v4.src_ip) to ensure correct and explicit copy size. - In the IPv6 delete path, memcmp() uses sizeof(src_ip6) when comparing dst_ip6 fields. Replace this with sizeof(dst_ip6) to make the intent explicit, even though both fields are struct in6_addr. Fixes: e284fc280473 ("i40e: Add and delete cloud filter") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-10iavf: fix incorrect reset handling in callbacksPetr Oros
Three driver callbacks schedule a reset and wait for its completion: ndo_change_mtu(), ethtool set_ringparam(), and ethtool set_channels(). Waiting for reset in ndo_change_mtu() and set_ringparam() was added by commit c2ed2403f12c ("iavf: Wait for reset in callbacks which trigger it") to fix a race condition where adding an interface to bonding immediately after MTU or ring parameter change failed because the interface was still in __RESETTING state. The same commit also added waiting in iavf_set_priv_flags(), which was later removed by commit 53844673d555 ("iavf: kill "legacy-rx" for good"). Waiting in set_channels() was introduced earlier by commit 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") to ensure the PF has enough time to complete the VF reset when changing channel count, and to return correct error codes to userspace. Commit ef490bbb2267 ("iavf: Add net_shaper_ops support") added net_shaper_ops to iavf, which required reset_task to use _locked NAPI variants (napi_enable_locked, napi_disable_locked) that need the netdev instance lock. Later, commit 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") and commit 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") started holding the netdev instance lock during ndo and ethtool callbacks for drivers with net_shaper_ops. Finally, commit 120f28a6f314 ("iavf: get rid of the crit lock") replaced the driver's crit_lock with netdev_lock in reset_task, causing incorrect behavior: the callback holds netdev_lock and waits for reset_task, but reset_task needs the same lock: Thread 1 (callback) Thread 2 (reset_task) ------------------- --------------------- netdev_lock() [blocked on workqueue] ndo_change_mtu() or ethtool op iavf_schedule_reset() iavf_wait_for_reset() iavf_reset_task() waiting... netdev_lock() <- blocked This does not strictly deadlock because iavf_wait_for_reset() uses wait_event_interruptible_timeout() with a 5-second timeout. The wait eventually times out, the callback returns an error to userspace, and after the lock is released reset_task completes the reset. This leads to incorrect behavior: userspace sees an error even though the configuration change silently takes effect after the timeout. Fix this by extracting the reset logic from iavf_reset_task() into a new iavf_reset_step() function that expects netdev_lock to be already held. The three callbacks now call iavf_reset_step() directly instead of scheduling the work and waiting, performing the reset synchronously in the caller's context which already holds netdev_lock. This eliminates both the incorrect error reporting and the need for iavf_wait_for_reset(), which is removed along with the now-unused reset_waitqueue. The workqueue-based iavf_reset_task() becomes a thin wrapper that acquires netdev_lock and calls iavf_reset_step(), preserving its use for PF-initiated resets. The callbacks may block for several seconds while iavf_reset_step() polls hardware registers, but this is acceptable since netdev_lock is a per-device mutex and only serializes operations on the same interface. v3: - Remove netif_running() guard from iavf_set_channels(). Unlike set_ringparam where descriptor counts are picked up by iavf_open() directly, num_req_queues is only consumed during iavf_reinit_interrupt_scheme() in the reset path. Skipping the reset on a down device would silently discard the channel count change. - Remove dead reset_waitqueue code (struct field, init, and all wake_up calls) since iavf_wait_for_reset() was the only consumer. Fixes: 120f28a6f314 ("iavf: get rid of the crit lock") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-10iavf: fix PTP use-after-free during resetPetr Oros
Commit 7c01dbfc8a1c5f ("iavf: periodically cache PHC time") introduced a worker to cache PHC time, but failed to stop it during reset or disable. This creates a race condition where `iavf_reset_task()` or `iavf_disable_vf()` free adapter resources (AQ) while the worker is still running. If the worker triggers `iavf_queue_ptp_cmd()` during teardown, it accesses freed memory/locks, leading to a crash. Fix this by calling `iavf_ptp_release()` before tearing down the adapter. This ensures `ptp_clock_unregister()` synchronously cancels the worker and cleans up the chardev before the backing resources are destroyed. Fixes: 7c01dbfc8a1c5f ("iavf: periodically cache PHC time") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-10drivers: net: ice: fix devlink parameters get without irdmaNikolay Aleksandrov
If CONFIG_IRDMA isn't enabled but there are ice NICs in the system, the driver will prevent full devlink dev param show dump because its rdma get callbacks return ENODEV and stop the dump. For example: $ devlink dev param show pci/0000:82:00.0: name msix_vec_per_pf_max type generic values: cmode driverinit value 2 name msix_vec_per_pf_min type generic values: cmode driverinit value 2 kernel answers: No such device Returning EOPNOTSUPP allows the dump to continue so we can see all devices' devlink parameters. Fixes: c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-05libeth, idpf: use truesize as XDP RxQ info frag_sizeLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buffer size instead of DMA write size. Different assumptions in idpf driver configuration lead to negative tailroom. To make it worse, buffer sizes are not actually uniform in idpf when splitq is enabled, as there are several buffer queues, so rxq->rx_buf_size is meaningless in this case. Use truesize of the first bufq in AF_XDP ZC, as there is only one. Disable growing tail for regular splitq. Fixes: ac8a861f632e ("idpf: prepare structures to support XDP") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-8-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05i40e: use xdp.frame_sz as XDP RxQ info frag_sizeLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buffer size instead of DMA write size. Different assumptions in i40e driver configuration lead to negative tailroom. Set frag_size to the same value as frame_sz in shared pages mode, use new helper to set frag_size when AF_XDP ZC is active. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-7-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05i40e: fix registering XDP RxQ infoLarysa Zaremba
Current way of handling XDP RxQ info in i40e has a problem, where frag_size is not updated when xsk_buff_pool is detached or when MTU is changed, this leads to growing tail always failing for multi-buffer packets. Couple XDP RxQ info registering with buffer allocations and unregistering with cleaning the ring. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-6-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05ice: change XDP RxQ frag_size from DMA write length to xdp.frame_szLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buff size instead of DMA write size. Different assumptions in ice driver configuration lead to negative tailroom. This allows to trigger kernel panic, when using XDP_ADJUST_TAIL_GROW_MULTI_BUFF xskxceiver test and changing packet size to 6912 and the requested offset to a huge value, e.g. XSK_UMEM__MAX_FRAME_SIZE * 100. Due to other quirks of the ZC configuration in ice, panic is not observed in ZC mode, but tailroom growing still fails when it should not. Use fill queue buffer truesize instead of DMA write size in XDP RxQ info. Fix ZC mode too by using the new helper. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-5-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05ice: fix rxq info registering in mbuf packetsLarysa Zaremba
XDP RxQ info contains frag_size, which depends on the MTU. This makes the old way of registering RxQ info before calculating new buffer sizes invalid. Currently, it leads to frag_size being outdated, making it sometimes impossible to grow tailroom in a mbuf packet. E.g. fragments are actually 3K+, but frag size is still as if MTU was 1500. Always register new XDP RxQ info after reconfiguring memory pools. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-4-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-03igc: Fix trigger of incorrect irq in igc_xsk_wakeup functionVivek Behera
This patch addresses the issue where the igc_xsk_wakeup function was triggering an incorrect IRQ for tx-0 when the i226 is configured with only 2 combined queues or in an environment with 2 active CPU cores. This prevented XDP Zero-copy send functionality in such split IRQ configurations. The fix implements the correct logic for extracting q_vectors saved during rx and tx ring allocation and utilizes flags provided by the ndo_xsk_wakeup API to trigger the appropriate IRQ. Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy") Fixes: 15fd021bc427 ("igc: Add Tx hardware timestamp request for AF_XDP zero-copy packet") Signed-off-by: Vivek Behera <vivek.behera@siemens.com> Reviewed-by: Jacob Keller <jacob.keller@intel.com> Reviewed-by: Aleksandr loktinov <aleksandr.loktionov@intel.com> Reviewed-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03igb: Fix trigger of incorrect irq in igb_xsk_wakeupVivek Behera
The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues to share the same irq. This would lead to triggering of incorrect irq in split irq configuration. This patch addresses this issue which could impact environments with 2 active cpu cores or when the number of queues is reduced to 2 or less cat /proc/interrupts | grep eno2 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-rx-0 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-rx-1 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 3-edge eno2-tx-0 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 4-edge eno2-tx-1 Furthermore it uses the flags input argument to trigger either rx, tx or both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers") Signed-off-by: Vivek Behera <vivek.behera@siemens.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03iavf: fix netdev->max_mtu to respect actual hardware limitKohei Enju
iavf sets LIBIE_MAX_MTU as netdev->max_mtu, ignoring vf_res->max_mtu from PF [1]. This allows setting an MTU beyond the actual hardware limit, causing TX queue timeouts [2]. Set correct netdev->max_mtu using vf_res->max_mtu from the PF. Note that currently PF drivers such as ice/i40e set the frame size in vf_res->max_mtu, not MTU. Convert vf_res->max_mtu to MTU before setting netdev->max_mtu. [1] # ip -j -d link show $DEV | jq '.[0].max_mtu' 16356 [2] iavf 0000:00:05.0 enp0s5: NETDEV WATCHDOG: CPU: 1: transmit queue 0 timed out 5692 ms iavf 0000:00:05.0 enp0s5: NIC Link is Up Speed is 10 Gbps Full Duplex iavf 0000:00:05.0 enp0s5: NETDEV WATCHDOG: CPU: 6: transmit queue 3 timed out 5312 ms iavf 0000:00:05.0 enp0s5: NIC Link is Up Speed is 10 Gbps Full Duplex ... Fixes: 5fa4caff59f2 ("iavf: switch to Page Pool") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03libie: don't unroll if fwlog isn't supportedMichal Swiatkowski
The libie_fwlog_deinit() function can be called during driver unload even when firmware logging was never properly initialized. This led to call trace: [ 148.576156] Oops: Oops: 0000 [#1] SMP NOPTI [ 148.576167] CPU: 80 UID: 0 PID: 12843 Comm: rmmod Kdump: loaded Not tainted 6.17.0-rc7next-queue-3oct-01915-g06d79d51cf51 #1 PREEMPT(full) [ 148.576177] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 07/18/2020 [ 148.576182] RIP: 0010:__dev_printk+0x16/0x70 [ 148.576196] Code: 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 49 89 d4 55 48 89 fd 53 48 85 f6 74 3c <4c> 8b 6e 50 48 89 f3 4d 85 ed 75 03 4c 8b 2e 48 89 df e8 f3 27 98 [ 148.576204] RSP: 0018:ffffd2fd7ea17a48 EFLAGS: 00010202 [ 148.576211] RAX: ffffd2fd7ea17aa0 RBX: ffff8eb288ae2000 RCX: 0000000000000000 [ 148.576217] RDX: ffffd2fd7ea17a70 RSI: 00000000000000c8 RDI: ffffffffb68d3d88 [ 148.576222] RBP: ffffffffb68d3d88 R08: 0000000000000000 R09: 0000000000000000 [ 148.576227] R10: 00000000000000c8 R11: ffff8eb2b1a49400 R12: ffffd2fd7ea17a70 [ 148.576231] R13: ffff8eb3141fb000 R14: ffffffffc1215b48 R15: ffffffffc1215bd8 [ 148.576236] FS: 00007f5666ba6740(0000) GS:ffff8eb2472b9000(0000) knlGS:0000000000000000 [ 148.576242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 148.576247] CR2: 0000000000000118 CR3: 000000011ad17000 CR4: 0000000000350ef0 [ 148.576252] Call Trace: [ 148.576258] <TASK> [ 148.576269] _dev_warn+0x7c/0x96 [ 148.576290] libie_fwlog_deinit+0x112/0x117 [libie_fwlog] [ 148.576303] ixgbe_remove+0x63/0x290 [ixgbe] [ 148.576342] pci_device_remove+0x42/0xb0 [ 148.576354] device_release_driver_internal+0x19c/0x200 [ 148.576365] driver_detach+0x48/0x90 [ 148.576372] bus_remove_driver+0x6d/0xf0 [ 148.576383] pci_unregister_driver+0x2e/0xb0 [ 148.576393] ixgbe_exit_module+0x1c/0xd50 [ixgbe] [ 148.576430] __do_sys_delete_module.isra.0+0x1bc/0x2e0 [ 148.576446] do_syscall_64+0x7f/0x980 It can be reproduced by trying to unload ixgbe driver in recovery mode. Fix that by checking if fwlog is supported before doing unroll. Fixes: 641585bc978e ("ixgbe: fwlog support for e610") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03ice: Fix memory leak in ice_set_ringparam()Zilin Guan
In ice_set_ringparam, tx_rings and xdp_rings are allocated before rx_rings. If the allocation of rx_rings fails, the code jumps to the done label leaking both tx_rings and xdp_rings. Furthermore, if the setup of an individual Rx ring fails during the loop, the code jumps to the free_tx label which releases tx_rings but leaks xdp_rings. Fix this by introducing a free_xdp label and updating the error paths to ensure both xdp_rings and tx_rings are properly freed if rx_rings allocation or setup fails. Compile tested only. Issue found using a prototype static analysis tool and code review. Fixes: fcea6f3da546 ("ice: Add stats and ethtool support") Fixes: efc2214b6047 ("ice: Add support for XDP") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03ice: fix retry for AQ command 0x06EEJakub Staniszewski
Executing ethtool -m can fail reporting a netlink I/O error while firmware link management holds the i2c bus used to communicate with the module. According to Intel(R) Ethernet Controller E810 Datasheet Rev 2.8 [1] Section 3.3.10.4 Read/Write SFF EEPROM (0x06EE) request should to be retried upon receiving EBUSY from firmware. Commit e9c9692c8a81 ("ice: Reimplement module reads used by ethtool") implemented it only for part of ice_get_module_eeprom(), leaving all other calls to ice_aq_sff_eeprom() vulnerable to returning early on getting EBUSY without retrying. Remove the retry loop from ice_get_module_eeprom() and add Admin Queue (AQ) command with opcode 0x06EE to the list of commands that should be retried on receiving EBUSY from firmware. Cc: stable@vger.kernel.org Fixes: e9c9692c8a81 ("ice: Reimplement module reads used by ethtool") Signed-off-by: Jakub Staniszewski <jakub.staniszewski@linux.intel.com> Co-developed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://www.intel.com/content/www/us/en/content-details/613875/intel-ethernet-controller-e810-datasheet.html [1] Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03ice: reintroduce retry mechanism for indirect AQJakub Staniszewski
Add retry mechanism for indirect Admin Queue (AQ) commands. To do so we need to keep the command buffer. This technically reverts commit 43a630e37e25 ("ice: remove unused buffer copy code in ice_sq_send_cmd_retry()"), but combines it with a fix in the logic by using a kmemdup() call, making it more robust and less likely to break in the future due to programmer error. Cc: Michal Schmidt <mschmidt@redhat.com> Cc: stable@vger.kernel.org Fixes: 3056df93f7a8 ("ice: Re-send some AQ commands, as result of EBUSY AQ error") Signed-off-by: Jakub Staniszewski <jakub.staniszewski@linux.intel.com> Co-developed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-03ice: fix adding AQ LLDP filter for VFLarysa Zaremba
The referenced commit came from a misunderstanding of the FW LLDP filter AQ (Admin Queue) command due to the error in the internal documentation. Contrary to the assumptions in the original commit, VFs can be added and deleted from this filter without any problems. Introduced dev_info message proved to be useful, so reverting the whole commit does not make sense. Without this fix, trusted VFs do not receive LLDP traffic, if there is an AQ LLDP filter on PF. When trusted VF attempts to add an LLDP multicast MAC address, the following message can be seen in dmesg on host: ice 0000:33:00.0: Failed to add Rx LLDP rule on VSI 20 error: -95 Revert checking VSI type when adding LLDP filter through AQ. Fixes: 4d5a1c4e6d49 ("ice: do not add LLDP-specific filter if not necessary") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-28Merge branch '200GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-02-19 (idpf, ice, i40e, ixgbevf, e1000e) For idpf: Li Li moves the check for software marker to occur after incrementing next to clean to avoid re-encountering the same packet. He also adds a couple of checks to prevent NULL pointer dereferences and NULLs rss_key, after free, in error path so that later checks are properly evaluated. Brian Vazquez adjusts IRQ naming to have correlation with netdev naming. Sreedevi removes validation of action type as part of ntuple rule deletion. For ice: Aaron Ma breaks RDMA initialization into two steps and adjusts calls so that VSIs are entirely configured before plugging. Michal Schmidt fixes initialization of loopback VSI to have proper resources allocated to allow for loopback testing to occur. For i40e: Thomas Gleixner fixes a leak of preempt count by replacing get_cpu() with smp_processor_id(). For ixgbevf: Jedrzej adds a check for mailbox version before attempting to call an associated link state call that is supported in that mailbox version. For e1000e: Vitaly clears power gating feature for Panther Lake systems to avoid packet issues. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: clear DPG_EN after reset to avoid autonomous power-gating e1000e: introduce new board type for Panther Lake PCH ixgbevf: fix link setup issue i40e: Fix preempt count leak in napi poll tracepoint ice: fix crash in ethtool offline loopback test ice: recap the VSI and QoS info after rebuild idpf: Fix flow rule delete failure due to invalid validation idpf: change IRQ naming to match netdev and ethtool queue numbering idpf: nullify pointers after they are freed idpf: skip deallocating txq group's txqs if it is NULL idpf: skip deallocating bufq_sets from rx_qgrp if it is NULL idpf: increment completion queue next_to_clean in sw marker wait routine ==================== Link: https://patch.msgid.link/20260225211546.1949260-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-25e1000e: clear DPG_EN after reset to avoid autonomous power-gatingVitaly Lifshits
Panther Lake systems introduced an autonomous power gating feature for the integrated Gigabit Ethernet in shutdown state (S5) state. As part of it, the reset value of DPG_EN bit was changed to 1. Clear this bit after performing hardware reset to avoid errors such as Tx/Rx hangs, or packet loss/corruption. Fixes: 0c9183ce61bc ("e1000e: Add support for the next LOM generation") Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25e1000e: introduce new board type for Panther Lake PCHVitaly Lifshits
Add new board type for Panther Lake devices for separating device-specific features and flows. Additionally, remove the deprecated device IDs 0x57B5 and 0x57B6, which are not used by any existing devices. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25ixgbevf: fix link setup issueJedrzej Jagielski
It may happen that VF spawned for E610 adapter has problem with setting link up. This happens when ixgbevf supporting mailbox API 1.6 cooperates with PF driver which doesn't support this version of API, and hence doesn't support new approach for getting PF link data. In that case VF asks PF to provide link data but as PF doesn't support it, returns -EOPNOTSUPP what leads to early bail from link configuration sequence. Avoid such situation by using legacy VFLINKS approach whenever negotiated API version is less than 1.6. To reproduce the issue just create VF and set its link up - adapter must be any from the E610 family, ixgbevf must support API 1.6 or higher while ixgbevf must not. Fixes: 53f0eb62b4d2 ("ixgbevf: fix getting link speed data for E610 devices") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25i40e: Fix preempt count leak in napi poll tracepointThomas Gleixner
Using get_cpu() in the tracepoint assignment causes an obvious preempt count leak because nothing invokes put_cpu() to undo it: softirq: huh, entered softirq 3 NET_RX with preempt_count 00000100, exited with 00000101? This clearly has seen a lot of testing in the last 3+ years... Use smp_processor_id() instead. Fixes: 6d4d584a7ea8 ("i40e: Add i40e_napi_poll tracepoint") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25ice: fix crash in ethtool offline loopback testMichal Schmidt
Since the conversion of ice to page pool, the ethtool loopback test crashes: BUG: kernel NULL pointer dereference, address: 000000000000000c #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 1100f1067 P4D 0 Oops: Oops: 0002 [#1] SMP NOPTI CPU: 23 UID: 0 PID: 5904 Comm: ethtool Kdump: loaded Not tainted 6.19.0-0.rc7.260128g1f97d9dcf5364.49.eln154.x86_64 #1 PREEMPT(lazy) Hardware name: [...] RIP: 0010:ice_alloc_rx_bufs+0x1cd/0x310 [ice] Code: 83 6c 24 30 01 66 41 89 47 08 0f 84 c0 00 00 00 41 0f b7 dc 48 8b 44 24 18 48 c1 e3 04 41 bb 00 10 00 00 48 8d 2c 18 8b 04 24 <89> 45 0c 41 8b 4d 00 49 d3 e3 44 3b 5c 24 24 0f 83 ac fe ff ff 44 RSP: 0018:ff7894738aa1f768 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000700 RDI: 0000000000000000 RBP: 0000000000000000 R08: ff16dcae79880200 R09: 0000000000000019 R10: 0000000000000001 R11: 0000000000001000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ff16dcae6c670000 FS: 00007fcf428850c0(0000) GS:ff16dcb149710000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000c CR3: 0000000121227005 CR4: 0000000000773ef0 PKRU: 55555554 Call Trace: <TASK> ice_vsi_cfg_rxq+0xca/0x460 [ice] ice_vsi_cfg_rxqs+0x54/0x70 [ice] ice_loopback_test+0xa9/0x520 [ice] ice_self_test+0x1b9/0x280 [ice] ethtool_self_test+0xe5/0x200 __dev_ethtool+0x1106/0x1a90 dev_ethtool+0xbe/0x1a0 dev_ioctl+0x258/0x4c0 sock_do_ioctl+0xe3/0x130 __x64_sys_ioctl+0xb9/0x100 do_syscall_64+0x7c/0x700 entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] It crashes because we have not initialized libeth for the rx ring. Fix it by treating ICE_VSI_LB VSIs slightly more like normal PF VSIs and letting them have a q_vector. It's just a dummy, because the loopback test does not use interrupts, but it contains a napi struct that can be passed to libeth_rx_fq_create() called from ice_vsi_cfg_rxq() -> ice_rxq_pp_create(). Fixes: 93f53db9f9dc ("ice: switch to Page Pool") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25ice: recap the VSI and QoS info after rebuildAaron Ma
Fix IRDMA hardware initialization timeout (-110) after resume by separating VSI-dependent configuration from RDMA resource allocation, ensuring VSI is rebuilt before IRDMA accesses it. After resume from suspend, IRDMA hardware initialization fails: ice: IRDMA hardware initialization FAILED init_state=4 status=-110 Separate RDMA initialization into two phases: 1. ice_init_rdma() - Allocate resources only (no VSI/QoS access, no plug) 2. ice_rdma_finalize_setup() - Assign VSI/QoS info and plug device This allows: - ice_init_rdma() to stay in ice_resume() (mirrors ice_deinit_rdma() in ice_suspend()) - VSI assignment deferred until after ice_vsi_rebuild() completes - QoS info updated after ice_dcb_rebuild() completes - Device plugged only when control queues, VSI, and DCB are all ready Fixes: bc69ad74867db ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: Fix flow rule delete failure due to invalid validationSreedevi Joshi
When deleting a flow rule using "ethtool -N <dev> delete <location>", idpf_sideband_action_ena() incorrectly validates fsp->ring_cookie even though ethtool doesn't populate this field for delete operations. The uninitialized ring_cookie may randomly match RX_CLS_FLOW_DISC or RX_CLS_FLOW_WAKE, causing validation to fail and preventing legitimate rule deletions. Remove the unnecessary sideband action enable check and ring_cookie validation during delete operations since action validation is not required when removing existing rules. Fixes: ada3e24b84a0 ("idpf: add flow steering support") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: change IRQ naming to match netdev and ethtool queue numberingBrian Vazquez
The code uses the vidx for the IRQ name but that doesn't match ethtool reporting nor netdev naming, this makes it hard to tune the device and associate queues with IRQs. Sequentially requesting irqs starting from '0' makes the output consistent. This commit changes the interrupt numbering but preserves the name format, maintaining ABI compatibility. Existing tools relying on the old numbering are already non-functional, as they lack a useful correlation to the interrupts. Before: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq/*/*idpf*/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-1/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-3/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-4/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-5/../smp_affinity_list:3 ethtool -S eth1 | grep -v ': 0' NIC statistics: tx_q-0_pkts: 1002 tx_q-1_pkts: 2679 tx_q-2_pkts: 1113 tx_q-3_pkts: 1192 <----- tx_q-3 vs idpf-eth1-Tx-5 rx_q-0_pkts: 1143 rx_q-1_pkts: 3172 rx_q-2_pkts: 1074 After: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq/*/*idpf*/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-0/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-1/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-2/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-3/../smp_affinity_list:3 ethtool -S eth1 | grep -v ': 0' NIC statistics: tx_q-0_pkts: 118 tx_q-1_pkts: 134 tx_q-2_pkts: 228 tx_q-3_pkts: 138 <--- tx_q-3 matches idpf-eth1-Tx-3 rx_q-0_pkts: 111 rx_q-1_pkts: 366 rx_q-2_pkts: 120 Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Signed-off-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: nullify pointers after they are freedLi Li
rss_data->rss_key needs to be nullified after it is freed. Checks like "if (!rss_data->rss_key)" in the code could fail if it is not nullified. Tested: built and booted the kernel. Fixes: 83f38f210b85 ("idpf: Fix RSS LUT NULL pointer crash on early ethtool operations") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: skip deallocating txq group's txqs if it is NULLLi Li
In idpf_txq_group_alloc(), if any txq group's txqs failed to allocate memory: for (j = 0; j < tx_qgrp->num_txq; j++) { tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]), GFP_KERNEL); if (!tx_qgrp->txqs[j]) goto err_alloc; } It would cause a NULL ptr kernel panic in idpf_txq_group_rel(): for (j = 0; j < txq_grp->num_txq; j++) { if (flow_sch_en) { kfree(txq_grp->txqs[j]->refillq); txq_grp->txqs[j]->refillq = NULL; } kfree(txq_grp->txqs[j]); txq_grp->txqs[j] = NULL; } [ 6.532461] BUG: kernel NULL pointer dereference, address: 0000000000000058 ... [ 6.534433] RIP: 0010:idpf_txq_group_rel+0xc9/0x110 ... [ 6.538513] Call Trace: [ 6.538639] <TASK> [ 6.538760] idpf_vport_queues_alloc+0x75/0x550 [ 6.538978] idpf_vport_open+0x4d/0x3f0 [ 6.539164] idpf_open+0x71/0xb0 [ 6.539324] __dev_open+0x142/0x260 [ 6.539506] netif_open+0x2f/0xe0 [ 6.539670] dev_open+0x3d/0x70 [ 6.539827] bond_enslave+0x5ed/0xf50 [ 6.540005] ? rcutree_enqueue+0x1f/0xb0 [ 6.540193] ? call_rcu+0xde/0x2a0 [ 6.540375] ? barn_get_empty_sheaf+0x5c/0x80 [ 6.540594] ? __kfree_rcu_sheaf+0xb6/0x1a0 [ 6.540793] ? nla_put_ifalias+0x3d/0x90 [ 6.540981] ? kvfree_call_rcu+0xb5/0x3b0 [ 6.541173] ? kvfree_call_rcu+0xb5/0x3b0 [ 6.541365] do_set_master+0x114/0x160 [ 6.541547] do_setlink+0x412/0xfb0 [ 6.541717] ? security_sock_rcv_skb+0x2a/0x50 [ 6.541931] ? sk_filter_trim_cap+0x7c/0x320 [ 6.542136] ? skb_queue_tail+0x20/0x50 [ 6.542322] ? __nla_validate_parse+0x92/0xe50 ro[o t t o6 .d5e4f2a5u4l0t]- ? security_capable+0x35/0x60 [ 6.542792] rtnl_newlink+0x95c/0xa00 [ 6.542972] ? __rtnl_unlock+0x37/0x70 [ 6.543152] ? netdev_run_todo+0x63/0x530 [ 6.543343] ? allocate_slab+0x280/0x870 [ 6.543531] ? security_capable+0x35/0x60 [ 6.543722] rtnetlink_rcv_msg+0x2e6/0x340 [ 6.543918] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 6.544138] netlink_rcv_skb+0x16a/0x1a0 [ 6.544328] netlink_unicast+0x20a/0x320 [ 6.544516] netlink_sendmsg+0x304/0x3b0 [ 6.544748] __sock_sendmsg+0x89/0xb0 [ 6.544928] ____sys_sendmsg+0x167/0x1c0 [ 6.545116] ? ____sys_recvmsg+0xed/0x150 [ 6.545308] ___sys_sendmsg+0xdd/0x120 [ 6.545489] ? ___sys_recvmsg+0x124/0x1e0 [ 6.545680] ? rcutree_enqueue+0x1f/0xb0 [ 6.545867] ? rcutree_enqueue+0x1f/0xb0 [ 6.546055] ? call_rcu+0xde/0x2a0 [ 6.546222] ? evict+0x286/0x2d0 [ 6.546389] ? rcutree_enqueue+0x1f/0xb0 [ 6.546577] ? kmem_cache_free+0x2c/0x350 [ 6.546784] __x64_sys_sendmsg+0x72/0xc0 [ 6.546972] do_syscall_64+0x6f/0x890 [ 6.547150] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 6.547393] RIP: 0033:0x7fc1a3347bd0 ... [ 6.551375] RIP: 0010:idpf_txq_group_rel+0xc9/0x110 ... [ 6.578856] Rebooting in 10 seconds.. We should skip deallocating txqs[j] if it is NULL in the first place. Tested: with this patch, the kernel panic no longer appears. Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: skip deallocating bufq_sets from rx_qgrp if it is NULLLi Li
In idpf_rxq_group_alloc(), if rx_qgrp->splitq.bufq_sets failed to get allocated: rx_qgrp->splitq.bufq_sets = kcalloc(vport->num_bufqs_per_qgrp, sizeof(struct idpf_bufq_set), GFP_KERNEL); if (!rx_qgrp->splitq.bufq_sets) { err = -ENOMEM; goto err_alloc; } idpf_rxq_group_rel() would attempt to deallocate it in idpf_rxq_sw_queue_rel(), causing a kernel panic: ``` [ 7.967242] early-network-sshd-n-rexd[3148]: knetbase: Info: [ 8.127804] BUG: kernel NULL pointer dereference, address: 00000000000000c0 ... [ 8.129779] RIP: 0010:idpf_rxq_group_rel+0x101/0x170 ... [ 8.133854] Call Trace: [ 8.133980] <TASK> [ 8.134092] idpf_vport_queues_alloc+0x286/0x500 [ 8.134313] idpf_vport_open+0x4d/0x3f0 [ 8.134498] idpf_open+0x71/0xb0 [ 8.134668] __dev_open+0x142/0x260 [ 8.134840] netif_open+0x2f/0xe0 [ 8.135004] dev_open+0x3d/0x70 [ 8.135166] bond_enslave+0x5ed/0xf50 [ 8.135345] ? nla_put_ifalias+0x3d/0x90 [ 8.135533] ? kvfree_call_rcu+0xb5/0x3b0 [ 8.135725] ? kvfree_call_rcu+0xb5/0x3b0 [ 8.135916] do_set_master+0x114/0x160 [ 8.136098] do_setlink+0x412/0xfb0 [ 8.136269] ? security_sock_rcv_skb+0x2a/0x50 [ 8.136509] ? sk_filter_trim_cap+0x7c/0x320 [ 8.136714] ? skb_queue_tail+0x20/0x50 [ 8.136899] ? __nla_validate_parse+0x92/0xe50 [ 8.137112] ? security_capable+0x35/0x60 [ 8.137304] rtnl_newlink+0x95c/0xa00 [ 8.137483] ? __rtnl_unlock+0x37/0x70 [ 8.137664] ? netdev_run_todo+0x63/0x530 [ 8.137855] ? allocate_slab+0x280/0x870 [ 8.138044] ? security_capable+0x35/0x60 [ 8.138235] rtnetlink_rcv_msg+0x2e6/0x340 [ 8.138431] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 8.138650] netlink_rcv_skb+0x16a/0x1a0 [ 8.138840] netlink_unicast+0x20a/0x320 [ 8.139028] netlink_sendmsg+0x304/0x3b0 [ 8.139217] __sock_sendmsg+0x89/0xb0 [ 8.139399] ____sys_sendmsg+0x167/0x1c0 [ 8.139588] ? ____sys_recvmsg+0xed/0x150 [ 8.139780] ___sys_sendmsg+0xdd/0x120 [ 8.139960] ? ___sys_recvmsg+0x124/0x1e0 [ 8.140152] ? rcutree_enqueue+0x1f/0xb0 [ 8.140341] ? rcutree_enqueue+0x1f/0xb0 [ 8.140528] ? call_rcu+0xde/0x2a0 [ 8.140695] ? evict+0x286/0x2d0 [ 8.140856] ? rcutree_enqueue+0x1f/0xb0 [ 8.141043] ? kmem_cache_free+0x2c/0x350 [ 8.141236] __x64_sys_sendmsg+0x72/0xc0 [ 8.141424] do_syscall_64+0x6f/0x890 [ 8.141603] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 8.141841] RIP: 0033:0x7f2799d21bd0 ... [ 8.149905] Kernel panic - not syncing: Fatal exception [ 8.175940] Kernel Offset: 0xf800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 8.176425] Rebooting in 10 seconds.. ``` Tested: With this patch, the kernel panic no longer appears. Fixes: 95af467d9a4e ("idpf: configure resources for RX queues") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-25idpf: increment completion queue next_to_clean in sw marker wait routineLi Li
Currently, in idpf_wait_for_sw_marker_completion(), when an IDPF_TXD_COMPLT_SW_MARKER packet is found, the routine breaks out of the for loop and does not increment the next_to_clean counter. This causes the subsequent NAPI polls to run into the same IDPF_TXD_COMPLT_SW_MARKER packet again and print out the following: [ 23.261341] idpf 0000:05:00.0 eth1: Unknown TX completion type: 5 Instead, we should increment next_to_clean regardless when an IDPF_TXD_COMPLT_SW_MARKER packet is found. Tested: with the patch applied, we do not see the errors above from NAPI polls anymore. Fixes: 9d39447051a0 ("idpf: remove SW marker handling from NAPI") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-02-22Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL usesKees Cook
Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_flex' family to use the new default GFP_KERNEL argumentLinus Torvalds
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-12net: intel: fix PCI device ID conflict between i40e and ipw2200Ethan Nelson-Moore
The ID 8086:104f is matched by both i40e and ipw2200. The same device ID should not be in more than one driver, because in that case, which driver is used is unpredictable. Fix this by taking advantage of the fact that i40e devices use PCI_CLASS_NETWORK_ETHERNET and ipw2200 devices use PCI_CLASS_NETWORK_OTHER to differentiate the devices. Fixes: 2e45d3f4677a ("i40e: Add support for X710 B/P & SFP+ cards") Cc: stable@vger.kernel.org Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20260210021235.16315-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-06ice: Remove jumbo_remove step from TX pathAlice Mikityanska
Now that the kernel doesn't insert HBH for BIG TCP IPv6 packets, remove unnecessary steps from the ice TX path, that used to check and remove HBH. Signed-off-by: Alice Mikityanska <alice@isovalent.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260205133925.526371-8-alice.kernel@fastmail.im Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.19-rc9). No adjacent changes, conflicts: drivers/net/ethernet/spacemit/k1_emac.c 3125fc1701694 ("net: spacemit: k1-emac: fix jumbo frame support") f66086798f91f ("net: spacemit: Remove broken flow control support") https://lore.kernel.org/aYIysFIE9ooavWia@sirena.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-05ice: dpll: Support E825-C SyncE and dynamic pin discoveryArkadiusz Kubalewski
Implement SyncE support for the E825-C Ethernet controller using the DPLL subsystem. Unlike E810, the E825-C architecture relies on platform firmware (ACPI) to describe connections between the NIC's recovered clock outputs and external DPLL inputs. Implement the following mechanisms to support this architecture: 1. Discovery Mechanism: The driver parses the 'dpll-pins' and 'dpll-pin names' firmware properties to identify the external DPLL pins (parents) corresponding to its RCLK outputs ("rclk0", "rclk1"). It uses fwnode_dpll_pin_find() to locate these parent pins in the DPLL core. 2. Asynchronous Registration: Since the platform DPLL driver (e.g. zl3073x) may probe independently of the network driver, utilize the DPLL notifier chain The driver listens for DPLL_PIN_CREATED events to detect when the parent MUX pins become available, then registers its own Recovered Clock (RCLK) pins as children of those parents. 3. Hardware Configuration: Implement the specific register access logic for E825-C CGU (Clock Generation Unit) registers (R10, R11). This includes configuring the bypass MUXes and clock dividers required to drive SyncE signals. 4. Split Initialization: Refactor `ice_dpll_init()` to separate the static initialization path of E810 from the dynamic, firmware-driven path required for E825-C. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-10-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>