summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
2025-07-15ice: check correct pointer in fwlog debugfsMichal Swiatkowski
pf->ice_debugfs_pf_fwlog should be checked for an error here. Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-15ice: add NULL check in eswitch lag checkDave Ertman
The function ice_lag_is_switchdev_running() is being called from outside of the LAG event handler code. This results in the lag->upper_netdev being NULL sometimes. To avoid a NULL-pointer dereference, there needs to be a check before it is dereferenced. Fixes: 776fe19953b0 ("ice: block default rule setting on LAG interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-15ethernet: intel: fix building with large NR_CPUSArnd Bergmann
With large values of CONFIG_NR_CPUS, three Intel ethernet drivers fail to compile like: In function ‘i40e_free_q_vector’, inlined from ‘i40e_vsi_alloc_q_vectors’ at drivers/net/ethernet/intel/i40e/i40e_main.c:12112:3: 571 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) include/linux/rcupdate.h:1084:17: note: in expansion of macro ‘BUILD_BUG_ON’ 1084 | BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096); \ drivers/net/ethernet/intel/i40e/i40e_main.c:5113:9: note: in expansion of macro ‘kfree_rcu’ 5113 | kfree_rcu(q_vector, rcu); | ^~~~~~~~~ The problem is that the 'rcu' member in 'q_vector' is too far from the start of the structure. Move this member before the CPU mask instead, in all three drivers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: implement get LAN MMIO memory regionsJoshua Hay
The RDMA driver needs to map its own MMIO regions for the sake of performance, meaning the IDPF needs to avoid mapping portions of the BAR space. However, to be HW agnostic, the IDPF cannot assume where these are and must avoid mapping hard coded regions as much as possible. The IDPF maps the bare minimum to load and communicate with the control plane, i.e., the mailbox registers and the reset state registers. Because of how and when mailbox register offsets are initialized, it is easier to adjust the existing defines to be relative to the mailbox region starting address. Use a specific mailbox register write function that uses these relative offsets. The reset state register addresses are calculated the same way as for other registers, described below. The IDPF then calls a new virtchnl op to fetch a list of MMIO regions that it should map. The addresses for the registers in these regions are calculated by determining what region the register resides in, adjusting the offset to be relative to that region, and then adding the register's offset to that region's mapped address. If the new virtchnl op is not supported, the IDPF will fallback to mapping the whole bar. However, it will still map them as separate regions outside the mailbox and reset state registers. This way we can use the same logic in both cases to access the MMIO space. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: implement IDC vport aux driver MTU change handlerJoshua Hay
The only event an RDMA vport aux driver cares about right now is an MTU change on its underlying vport. Implement and plumb the handler to signal the pre MTU change event and post MTU change events to the RDMA vport aux driver. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: implement remaining IDC RDMA core callbacks and handlersJoshua Hay
Implement the idpf_idc_request_reset and idpf_idc_rdma_vc_send_sync callbacks for the rdma core auxiliary driver to issue reset events to the idpf and send (synchronous) virtchnl messages to the control plane respectively. Implement and plumb the reset handler for the opposite flow as well, i.e. when the idpf is resetiing and needs to notify the rdma core auxiliary driver. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: implement RDMA vport auxiliary dev create, init, and destroyJoshua Hay
Implement the functions to create, initialize, and destroy an RDMA vport auxiliary device. The vport aux dev creation is dependent on the core aux device to call idpf_idc_vport_dev_ctrl to signal that it is ready for vport aux devices. Implement that core callback to either create and initialize the vport aux dev or deinitialize. RDMA vport aux dev creation is also dependent on the control plane to tell us the vport is RDMA enabled. Add a flag in the create vport message to signal individual vport RDMA capabilities. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: implement core RDMA auxiliary dev create, init, and destroyJoshua Hay
Add the initial idpf_idc.c file with the functions to kick off the IDC initialization, create and initialize a core RDMA auxiliary device, and destroy said device. The RDMA core has a dependency on the vports being created by the control plane before it can be initialized. Therefore, once all the vports are up after a hard reset (either during driver load a function level reset), the core RDMA device info will be created. It is populated with the function type (as distinguished by the IDC initialization function pointer), the core idc_ops function points (just stubs for now), the reserved RDMA MSIX table, and various other info the core RDMA auxiliary driver will need. It is then plugged on to the bus. During a function level reset or driver unload, the device will be unplugged from the bus and destroyed. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-14idpf: use reserved RDMA vectors from control planeJoshua Hay
Fetch the number of reserved RDMA vectors from the control plane. Adjust the number of reserved LAN vectors if necessary. Adjust the minimum number of vectors the OS should reserve to include RDMA; and fail if the OS cannot reserve enough vectors for the minimum number of LAN and RDMA vectors required. Create a separate msix table for the reserved RDMA vectors, which will just get handed off to the RDMA core device to do with what it will. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: introduce ice_get_vf_by_dev() wrapperJacob Keller
The ice_get_vf_by_id() function is used to obtain a reference to a VF structure based on its ID. The ice_sriov_set_msix_vec_count() function needs to get a VF reference starting from the VF PCI device, and uses pci_iov_vf_id() to get the VF ID. This pattern is currently uncommon in the ice driver. However, the live migration module will introduce many more such locations. Add a helper wrapper ice_get_vf_by_dev() which takes the VF PCI device and calls ice_get_vf_by_id() using pci_iov_vf_id() to get the VF ID. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: avoid rebuilding if MSI-X vector count is unchangedJacob Keller
Commit 05c16687e0cc ("ice: set MSI-X vector count on VF") added support to change the vector count for VFs as part of ice_sriov_set_msix_vec_count(). This function modifies and rebuilds the target VF with the requested number of MSI-X vectors. Future support for live migration will add a call to ice_sriov_set_msix_vec_count() to ensure that a migrated VF has the proper MSI-X vector count. In most cases, this request will be to set the MSI-X vector count to its current value. In that case, no work is necessary. Rather than requiring the caller to check this, update the function to check and exit early if the vector count is already at the requested value. This avoids an unnecessary VF rebuild. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: use pci_iov_vf_id() to get VF IDJacob Keller
The ice_sriov_set_msix_vec_count() obtains the VF device ID in a strange way by iterating over the possible VF IDs and calling pci_iov_virtfn_devfn to calculate the device and function combos and compare them to the pdev->devfn. This is unnecessary. The pci_iov_vf_id() helper already exists which does the reverse calculation of pci_iov_virtfn_devfn(), which is much simpler and avoids the loop construction. Use this instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: expose VF functions used by live migrationJacob Keller
The live migration process will require configuring the target VF with the data provided from the source host. A few helper functions in ice_sriov.c and ice_virtchnl.c will be needed for this process, but are currently static. Expose these functions in their respective headers so that the live migration module can use them during the migration process. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: move ice_vsi_update_l2tsel to ice_lib.cJacob Keller
A future change is going to need to call ice_vsi_update_l2tsel from a new context outside of ice_virtchnl.c Since this function deals with a generic VSI, move it into ice_lib.c to enable calling it from other places in the ice driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: save RSS hash configuration for migrationJacob Keller
The VF can program the RSS hash configuration over virtchnl. It does this by sending a u64 bitmask which represents the current hash configuration. It is not trivial to reverse the hardware configuration back to this hash set for migration. Instead, save the value to the ice_vf structure when its modified by the VF. The rss_hashcfg value is an 8-byte field. Make room for it in ice_vf by re-arranging some of the existing fields. There is a 4-byte gap after the first_vector_idx, and a 4-byte gap between max_tx_rate and vf_states. Move first_vector_idx into the later 4-byte gap, creating an 8 byte area where rss_hashcfg can be placed. Also move the num_msix field near min_tx_rate, filling 2 bytes of a 3 byte hole. The end result of these changes enables placing the rss_hashcfg field into the structure while also saving 8 bytes in size. It looks like there are a handful of more possible cleanups to reduce the size even further, but those have been left as a future cleanup. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: add functions to get and set Tx queue contextJacob Keller
The live migration driver will need to save and restore the Tx queue context state from the hardware registers. This state contains both static fields which do not change during Tx traffic as well as dynamic fields which may change during Tx traffic. Unlike the Rx context, the Tx queue context is accessed indirectly from GLCOMM_QTX_CNTX_CTL and GLCOMM_QTX_CNTX_DATA registers. These registers are shared by multiple PFs on the same PCIe card. Multiple PFs cannot safely access the registers simultaneously, and there is no hardware semaphore or logic to control access. To handle this, introduce the txq_ctx_lock to the ice_adapter structure. This is similar to the ptp_gltsyn_time_lock. All PFs on the same adapter share this structure, and use it to serialize access to the registers to prevent error. Add a new functions to get and set the Tx queue context through the GLCOMM_QTX_CNTX_CTL interface. The hardware context values are stored in the registers using the same packed format as the Admin Queue buffer. The hardware buffer is 40 bytes wide, as it contains an additional 18 bytes of internal state not sent with the Admin Queue buffer. For this reason, a separate typedef and packing function must be used. We can share the same packed fields definitions because we never need to unpack the internal state. This is preferred, as it ensures the internal state is zero'd when writing into HW, and avoids issues with reading by u32 registers into a buffer of 22 bytes in length. Thanks to the typedefs, misuse of the API with the wrong size buffer can easily be caught at compile time. Note reading this data from hardware is essential because the current Tx queue context may be different from the context as initially programmed by the driver during VF initialization. When migrating a VF we must ensure the target VF has identical context as the source VF did. Co-developed-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: add support for reading and unpacking Rx queue contextJacob Keller
In order to support live migration, the ice driver will need to read certain data from the Rx queue context. This is stored in the hardware in a packed format. Since we use <linux/packing.h> for the mapping between the packed hardware format and the unpacked structure, it is trivial to enable unpacking support via the unpack_fields() function. Add the ice_unpack_rxq_ctx() function based on the unpack_fields() API. Re-use the same field definitions from the packing implementation. Add ice_copy_rxq_ctx_from_hw() to copy the Rx queue context data from the hardware registers. Use these to implement ice_read_rxq_ctx() which will return the Rx queue context to the caller in its unpacked ice_rlan_ctx struct. This will enable the migration logic access to the relevant data about the Rx device queues. It can easily be copied to the target system as part of the migration payload, where it will be used to configure the Rx queues. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-08eth: ice: drop the dead code related to rss_contextsJakub Kicinski
ICE appears to have some odd form of rss_context use plumbed in for .get_rxfh. The .set_rxfh side does not support creating contexts, however, so this must be dead code. For at least a year now (since commit 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")) we have not been calling .get_rxfh with a non-zero rss_context. We just get the info from the RSS XArray under dev->ethtool. Remove what must be dead code in the driver, clear the support flags. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07Merge branch '10GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-07-03 Vladimir Oltean converts Intel drivers (ice, igc, igb, ixgbe, i40e) to utilize new timestamping API (ndo_hwtstamp_get() and ndo_hwtstamp_set()). For ixgbe: Paul, Don, Slawomir, and Radoslaw add Malicious Driver Detection (MDD) support for X550 and E610 devices to detect, report, and handle potentially malicious VFs. Simon Horman corrects spelling mistakes. For igbvf: Kohei Enju removes a couple of unreported counters and adds reporting of Tx timeouts. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igbvf: add tx_timeout_count to ethtool statistics igbvf: remove unused interrupt counter fields from struct igbvf_adapter ixgbe: spelling corrections ixgbe: turn off MDD while modifying SRRCTL ixgbe: add Tx hang detection unhandled MDD ixgbe: check for MDD events ixgbe: add MDD support i40e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ixgbe: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() igb: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() igc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ==================== Link: https://patch.msgid.link/20250703174242.3829277-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR (net-6.16-rc5). No conflicts. No adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03igbvf: add tx_timeout_count to ethtool statisticsKohei Enju
Add `tx_timeout_count` to ethtool statistics to provide visibility into transmit timeout events, bringing igbvf in line with other Intel ethernet drivers. Currently `tx_timeout_count` is incremented in igbvf_watchdog_task() and igbvf_tx_timeout() but is not exposed to userspace nor used elsewhere in the driver. Before: # ethtool -S ens5 | grep tx tx_packets: 43 tx_bytes: 4408 tx_restart_queue: 0 After: # ethtool -S ens5 | grep tx tx_packets: 41 tx_bytes: 4241 tx_restart_queue: 0 tx_timeout_count: 0 Tested-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igbvf: remove unused interrupt counter fields from struct igbvf_adapterKohei Enju
Remove `int_counter0` and `int_counter1` from struct igbvf_adapter since they are only incremented in interrupt handlers igbvf_intr_msix_rx() and igbvf_msix_other(), but never read or used anywhere in the driver. Note that igbvf_intr_msix_tx() does not have similar counter increments, suggesting that these were likely overlooked during development. Eliminate the fields and their unnecessary accesses in interrupt handlers. Tested-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: spelling correctionsSimon Horman
Correct spelling as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: turn off MDD while modifying SRRCTLRadoslaw Tyl
Modifying SRRCTL register can generate MDD event. Turn MDD off during SRRCTL register write to prevent generating MDD. Fix RCT in ixgbe_set_rx_drop_en(). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: add Tx hang detection unhandled MDDSlawomir Mrozowicz
Add Tx Hang detection due to an unhandled MDD Event. Previously, a malicious VF could disable the entire port causing TX to hang on the E610 card. Those events that caused PF to freeze were not detected as an MDD event and usually required a Tx Hang watchdog timer to catch the suspension, and perform a physical function reset. Implement flows in the affected PF driver in such a way to check the cause of the hang, detect it as an MDD event and log an entry of the malicious VF that caused the Hang. The PF blocks the malicious VF, if it continues to be the source of several MDD events. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com> Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: check for MDD eventsDon Skidmore
When an event is detected it is logged and, for the time being, the queue is immediately re-enabled. This is due to the lack of an API to the hypervisor so it could deal with it as it chooses. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: add MDD supportPaul Greenwalt
Add malicious driver detection to ixgbe driver. The supported devices are E610 and X550. Handling MDD events is enabled while VFs are created and turned off when they are disabled. There is no runtime command to enable or disable MDD independently. MDD event is logged when malicious VF driver is detected. For example VF can try to send incorrect Tx descriptor (TSO on, but length field not correct). It can be reproduced by manipulating the driver, or using driver with incorrect descriptor values. Example log: "Malicious event on VF 0 tx:128 rx:128" Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03i40e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel i40e driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ixgbe: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel ixgbe driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igb: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel igb driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03igc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel igc driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-03ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel ice driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-01igc: disable L1.2 PCI-E link substate to avoid performance issueVitaly Lifshits
I226 devices advertise support for the PCI-E link L1.2 substate. However, due to a hardware limitation, the exit latency from this low-power state is longer than the packet buffer can tolerate under high traffic conditions. This can lead to packet loss and degraded performance. To mitigate this, disable the L1.2 substate. The increased power draw between L1.1 and L1.2 is insignificant. Fixes: 43546211738e ("igc: Add new device ID's") Link: https://lore.kernel.org/intel-wired-lan/15248b4f-3271-42dd-8e35-02bfc92b25e1@intel.com Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-01idpf: convert control queue mutex to a spinlockAhmed Zaki
With VIRTCHNL2_CAP_MACFILTER enabled, the following warning is generated on module load: [ 324.701677] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578 [ 324.701684] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1582, name: NetworkManager [ 324.701689] preempt_count: 201, expected: 0 [ 324.701693] RCU nest depth: 0, expected: 0 [ 324.701697] 2 locks held by NetworkManager/1582: [ 324.701702] #0: ffffffff9f7be770 (rtnl_mutex){....}-{3:3}, at: rtnl_newlink+0x791/0x21e0 [ 324.701730] #1: ff1100216c380368 (_xmit_ETHER){....}-{2:2}, at: __dev_open+0x3f0/0x870 [ 324.701749] Preemption disabled at: [ 324.701752] [<ffffffff9cd23b9d>] __dev_open+0x3dd/0x870 [ 324.701765] CPU: 30 UID: 0 PID: 1582 Comm: NetworkManager Not tainted 6.15.0-rc5+ #2 PREEMPT(voluntary) [ 324.701771] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 324.701774] Call Trace: [ 324.701777] <TASK> [ 324.701779] dump_stack_lvl+0x5d/0x80 [ 324.701788] ? __dev_open+0x3dd/0x870 [ 324.701793] __might_resched.cold+0x1ef/0x23d <..> [ 324.701818] __mutex_lock+0x113/0x1b80 <..> [ 324.701917] idpf_ctlq_clean_sq+0xad/0x4b0 [idpf] [ 324.701935] ? kasan_save_track+0x14/0x30 [ 324.701941] idpf_mb_clean+0x143/0x380 [idpf] <..> [ 324.701991] idpf_send_mb_msg+0x111/0x720 [idpf] [ 324.702009] idpf_vc_xn_exec+0x4cc/0x990 [idpf] [ 324.702021] ? rcu_is_watching+0x12/0xc0 [ 324.702035] idpf_add_del_mac_filters+0x3ed/0xb50 [idpf] <..> [ 324.702122] __hw_addr_sync_dev+0x1cf/0x300 [ 324.702126] ? find_held_lock+0x32/0x90 [ 324.702134] idpf_set_rx_mode+0x317/0x390 [idpf] [ 324.702152] __dev_open+0x3f8/0x870 [ 324.702159] ? __pfx___dev_open+0x10/0x10 [ 324.702174] __dev_change_flags+0x443/0x650 <..> [ 324.702208] netif_change_flags+0x80/0x160 [ 324.702218] do_setlink.isra.0+0x16a0/0x3960 <..> [ 324.702349] rtnl_newlink+0x12fd/0x21e0 The sequence is as follows: rtnl_newlink()-> __dev_change_flags()-> __dev_open()-> dev_set_rx_mode() - > # disables BH and grabs "dev->addr_list_lock" idpf_set_rx_mode() -> # proceed only if VIRTCHNL2_CAP_MACFILTER is ON __dev_uc_sync() -> idpf_add_mac_filter -> idpf_add_del_mac_filters -> idpf_send_mb_msg() -> idpf_mb_clean() -> idpf_ctlq_clean_sq() # mutex_lock(cq_lock) Fix by converting cq_lock to a spinlock. All operations under the new lock are safe except freeing the DMA memory, which may use vunmap(). Fix by requesting a contiguous physical memory for the DMA mapping. Fixes: a251eee62133 ("idpf: add SRIOV support and other ndo_ops") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-01idpf: return 0 size for RSS key if not supportedMichal Swiatkowski
Returning -EOPNOTSUPP from function returning u32 is leading to cast and invalid size value as a result. -EOPNOTSUPP as a size probably will lead to allocation fail. Command: ethtool -x eth0 It is visible on all devices that don't have RSS caps set. [ 136.615917] Call Trace: [ 136.615921] <TASK> [ 136.615927] ? __warn+0x89/0x130 [ 136.615942] ? __alloc_frozen_pages_noprof+0x322/0x330 [ 136.615953] ? report_bug+0x164/0x190 [ 136.615968] ? handle_bug+0x58/0x90 [ 136.615979] ? exc_invalid_op+0x17/0x70 [ 136.615987] ? asm_exc_invalid_op+0x1a/0x20 [ 136.616001] ? rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616016] ? __alloc_frozen_pages_noprof+0x322/0x330 [ 136.616028] __alloc_pages_noprof+0xe/0x20 [ 136.616038] ___kmalloc_large_node+0x80/0x110 [ 136.616072] __kmalloc_large_node_noprof+0x1d/0xa0 [ 136.616081] __kmalloc_noprof+0x32c/0x4c0 [ 136.616098] ? rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616105] rss_prepare_get.constprop.0+0xb9/0x170 [ 136.616114] ethnl_default_doit+0x107/0x3d0 [ 136.616131] genl_family_rcv_msg_doit+0x100/0x160 [ 136.616147] genl_rcv_msg+0x1b8/0x2c0 [ 136.616156] ? __pfx_ethnl_default_doit+0x10/0x10 [ 136.616168] ? __pfx_genl_rcv_msg+0x10/0x10 [ 136.616176] netlink_rcv_skb+0x58/0x110 [ 136.616186] genl_rcv+0x28/0x40 [ 136.616195] netlink_unicast+0x19b/0x290 [ 136.616206] netlink_sendmsg+0x222/0x490 [ 136.616215] __sys_sendto+0x1fd/0x210 [ 136.616233] __x64_sys_sendto+0x24/0x30 [ 136.616242] do_syscall_64+0x82/0x160 [ 136.616252] ? __sys_recvmsg+0x83/0xe0 [ 136.616265] ? syscall_exit_to_user_mode+0x10/0x210 [ 136.616275] ? do_syscall_64+0x8e/0x160 [ 136.616282] ? __count_memcg_events+0xa1/0x130 [ 136.616295] ? count_memcg_events.constprop.0+0x1a/0x30 [ 136.616306] ? handle_mm_fault+0xae/0x2d0 [ 136.616319] ? do_user_addr_fault+0x379/0x670 [ 136.616328] ? clear_bhb_loop+0x45/0xa0 [ 136.616340] ? clear_bhb_loop+0x45/0xa0 [ 136.616349] ? clear_bhb_loop+0x45/0xa0 [ 136.616359] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 136.616369] RIP: 0033:0x7fd30ba7b047 [ 136.616376] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d bd d5 0c 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 c3 55 48 83 ec 30 44 89 4c 24 2c 4c 89 44 [ 136.616381] RSP: 002b:00007ffde1796d68 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [ 136.616388] RAX: ffffffffffffffda RBX: 000055d7bd89f2a0 RCX: 00007fd30ba7b047 [ 136.616392] RDX: 0000000000000028 RSI: 000055d7bd89f3b0 RDI: 0000000000000003 [ 136.616396] RBP: 00007ffde1796e10 R08: 00007fd30bb4e200 R09: 000000000000000c [ 136.616399] R10: 0000000000000000 R11: 0000000000000202 R12: 000055d7bd89f340 [ 136.616403] R13: 000055d7bd89f3b0 R14: 000055d78943f200 R15: 0000000000000000 Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-01time/timecounter: Fix the lie that struct cyclecounter is constGreg Kroah-Hartman
In both the read callback for struct cyclecounter, and in struct timecounter, struct cyclecounter is declared as a const pointer. Unfortunatly, a number of users of this pointer treat it as a non-const pointer as it is burried in a larger structure that is heavily modified by the callback function when accessed. This lie had been hidden by the fact that container_of() "casts away" a const attribute of a pointer without any compiler warning happening at all. Fix this all up by removing the const attribute in the needed places so that everyone can see that the structure really isn't const, but can, and is, modified by the users of it. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/2025070124-backyard-hurt-783a@gregkh
2025-06-27ice: add ref-sync dpll pinsArkadiusz Kubalewski
Implement reference sync input pin get/set callbacks, allow user space control over dpll pin pairs capable of reference sync support. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20250626135219.1769350-4-arkadiusz.kubalewski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-26ice: default to TIME_REF instead of TXCO on E825-CJacob Keller
The driver currently defaults to the internal oscillator as the clock source for E825-C hardware. While this clock source is labeled TCXO, indicating a temperature compensated oscillator, this is only true for some board designs. Many board designs have a less capable oscillator. The E825-C hardware may also have its clock source set to the TIME_REF pin. This pin is connected to the DPLL and is often a more stable clock source. The choice of the internal oscillator is not suitable for all systems, especially those which want to enable SyncE support. There is currently no interface available for users to configure the clock source. Other variants of the E82x board have the clock source configured in the NVM, but E825-C lacks this capability, so different board designs cannot select a different default clock via firmware. In most setups, the TIME_REF is a suitable default clock source. Additionally, we now fall back to the internal oscillator automatically if the TIME_REF clock source cannot be locked. Change the default clock source for E825-C to TIME_REF. Note that the driver logs a dev_dbg message upon configuring the TSPLL which includes the clock source and frequency. This can be enabled to confirm which clock source is in use. Longterm a proper interface to dynamically introspect and change the clock source will be designed (perhaps some extension of the DPLL subsystem?) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: move TSPLL init calls to ice_ptp.cKarol Kolacinski
Initialize TSPLL after initializing PHC in ice_ptp.c instead of calling for each product in PHC init in ice_ptp_hw.c. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: fall back to TCXO on TSPLL lock failKarol Kolacinski
TSPLL can fail when trying to lock to TIME_REF as a clock source, e.g. when the external clock source is not stable or connected to the board. To continue operation after failure, try to lock again to internal TCXO and inform user about this. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: wait before enabling TSPLLKarol Kolacinski
To ensure proper operation, wait for 10 to 20 microseconds before enabling TSPLL. Adjust wait time after enabling TSPLL from 1-5 ms to 1-2 ms. Those values are empirical and tested on multiple HW configurations. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: add multiple TSPLL helpersKarol Kolacinski
Add helpers for checking TSPLL params, disabling sticky bits, configuring TSPLL and getting default clock frequency to simplify the code flows. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: use bitfields instead of unions for CGU regsKarol Kolacinski
Switch from unions with bitfield structs to definitions with bitfield masks. This is necessary, because some registers have different field definitions or even use a different register for the same fields based on HW type. Remove unused register fields. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: read TSPLL registers again before reporting statusJacob Keller
After programming the TSPLL, re-read the registers before reporting status. This ensures the debug log message will show what was actually programmed, rather than relying on a cached value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: clear time_sync_en field for E825-C during reprogrammingJacob Keller
When programming the Clock Generation Unit for E285-C hardware, we need to clear the time_sync_en bit of the DWORD 9 before we set the frequency. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-21Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Separate TSPLL from PTP and clean up [part] Jake Keller says: Separate TSPLL related functions and definitions from all PTP-related files and clean up the code by implementing multiple helpers. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: add TSPLL log config helper ice: use designated initializers for TSPLL consts ice: remove ice_tspll_params_e825 definitions ice: fix E825-C TSPLL register definitions ice: rename TSPLL and CGU functions and definitions ice: move TSPLL functions to a separate file ==================== Link: https://patch.msgid.link/20250618174231.3100231-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19igc: Make the const read-only array supported_sizes staticColin Ian King
Don't populate the const read-only array supported_sizes on the stack at run time, instead make it static. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>> Link: https://patch.msgid.link/20250618135408.1784120-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18udp_tunnel: remove rtnl_lock dependencyStanislav Fomichev
Drivers that are using ops lock and don't depend on RTNL lock still need to manage it because udp_tunnel's RTNL dependency. Introduce new udp_tunnel_nic_lock and use it instead of rtnl_lock. Drop non-UDP_TUNNEL_NIC_INFO_MAY_SLEEP mode from udp_tunnel infra (udp_tunnel_nic_device_sync_work needs to grab udp_tunnel_nic_lock mutex and might sleep). Cover more places in v4: - netlink - udp_tunnel_notify_add_rx_port (ndo_open) - triggers udp_tunnel_nic_device_sync_work - udp_tunnel_notify_del_rx_port (ndo_stop) - triggers udp_tunnel_nic_device_sync_work - udp_tunnel_get_rx_info (__netdev_update_features) - triggers NETDEV_UDP_TUNNEL_PUSH_INFO - udp_tunnel_drop_rx_info (__netdev_update_features) - triggers NETDEV_UDP_TUNNEL_DROP_INFO - udp_tunnel_nic_reset_ntf (ndo_open) - notifiers - udp_tunnel_nic_netdevice_event, depending on the event: - triggers NETDEV_UDP_TUNNEL_PUSH_INFO - triggers NETDEV_UDP_TUNNEL_DROP_INFO - ethnl_tunnel_info_reply_size - udp_tunnel_nic_set_port_priv (two intel drivers) Cc: Michael Chan <michael.chan@broadcom.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com> Link: https://patch.msgid.link/20250616162117.287806-4-stfomichev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18ice: add TSPLL log config helperKarol Kolacinski
Add a helper function to print new/current TSPLL config. This helps avoid unnecessary casts from u8 to enums. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>