path: root/drivers/net/ethernet/meta
Age | Commit message | Author
33 hours | eth: fbnic: Advertise supported XDP features. | Dimitri Daskalakis
Drivers are supposed to advertise the XDP features they support. This was missed while adding XDP support.

Before:
  $ ynl --family netdev --dump dev-get
  ...
  {'ifindex': 3, 'xdp-features': set(), 'xdp-rx-metadata-features': set(), 'xsk-features': set()},
  ...

After:
  $ ynl --family netdev --dump dev-get
  ...
  {'ifindex': 3, 'xdp-features': {'basic', 'rx-sg'}, 'xdp-rx-metadata-features': set(), 'xsk-features': set()},
  ...

Fixes: 168deb7b31b2 ("eth: fbnic: Add support for XDP_TX action")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260218030620.3329608-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 days | eth: fbnic: Add validation for MTU changes | Dimitri Daskalakis
Increasing the MTU beyond the HDS threshold causes the hardware to fragment packets across multiple buffers. If a single-buffer XDP program is attached, the driver will drop all multi-frag frames. While we can't prevent a remote sender from sending non-TCP packets larger than the MTU, this will prevent users from inadvertently breaking new TCP streams.

Traditionally, drivers supported XDP with an MTU of less than 4 KB (one packet per page). Fbnic currently prevents attaching XDP when the MTU is too high, but it does not prevent increasing the MTU after XDP is attached.

Fixes: 1b0a3950dbd4 ("eth: fbnic: Add XDP pass, drop, abort support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
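The kind of check this commit describes can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the helper name `mtu_change_allowed`, its parameters, and the assumption that the Ethernet header length is added to the MTU before comparing against the HDS threshold are all made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Ethernet header length in bytes. Assumption: it is added to the MTU
 * before comparing against the header-data-split (HDS) threshold. */
#define ETH_HLEN_BYTES 14u

/* Hypothetical helper: returns true if new_mtu may be applied.
 * A single-buffer XDP program cannot handle frames that the hardware
 * would split across multiple buffers, so such an MTU is rejected. */
static bool mtu_change_allowed(unsigned int new_mtu, unsigned int hds_thresh,
			       bool xdp_attached, bool xdp_has_frags)
{
	assert(hds_thresh > 0);

	/* No XDP program, or a multi-buffer one: nothing to restrict. */
	if (!xdp_attached || xdp_has_frags)
		return true;

	return new_mtu + ETH_HLEN_BYTES <= hds_thresh;
}
```

The same predicate would apply both when attaching a program and when changing the MTU, which is the gap the commit message says was left open.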
4 days | eth: fbnic: set DMA_HINT_L4 for all flows | Bobby Eshleman
fbnic always advertises ETHTOOL_TCP_DATA_SPLIT_ENABLED via ethtool .get_ringparam. To enable proper splitting for all flow types, even for IP/Ethernet flows, this patch sets DMA_HINT_L4 unconditionally for all RSS and NFC flow steering rules. According to the spec, L4 falls back to L3 if no valid L4 is found, and L3 falls back to L2 if no L3 is found. This makes sure that the correct header boundary is used regardless of traffic type. This is important for zero-copy use cases where we must ensure that all ZC packets are split correctly. Fixes: 2b30fc01a6c7 ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-3-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 days | eth: fbnic: increase FBNIC_HDR_BYTES_MIN from 128 to 256 bytes | Bobby Eshleman
Increase FBNIC_HDR_BYTES_MIN from 128 to 256 bytes. The previous minimum was too small to guarantee that very long L2+L3+L4 headers always fit within the header buffer. When EN_HDR_SPLIT is disabled and a packet exceeds MAX_HEADER_BYTES, splitting occurs at that byte offset instead of the header boundary, resulting in some of the header landing in the payload page. The increased minimum ensures headers always fit with the MAX_HEADER_BYTES cut off and land in the header page. Fixes: 2b30fc01a6c7 ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-2-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 days | eth: fbnic: set FBNIC_QUEUE_RDE_CTL0_EN_HDR_SPLIT on RDE_CTL0 | Bobby Eshleman
Fix EN_HDR_SPLIT configuration by writing the field to RDE_CTL0 instead of RDE_CTL1. Because drop mode configuration and header splitting enablement both use RDE_CTL0, we consolidate these configurations into the single function fbnic_config_drop_mode. Fixes: 2b30fc01a6c7 ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-1-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 days | fbnic: close fw_log race between users and teardown | Chengfeng Ye
Fixes a theoretical race on fw_log between the teardown path and fw_log write functions.

fw_log is written inside fbnic_fw_log_write() and can be reached from the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before IRQ/MBX teardown during cleanup, resulting in a potential data race of dereferencing a freed/null variable.

Possible Interleaving Scenario:
  CPU0: fbnic_fw_msix_intr() // Entry
          fbnic_fw_log_write()
            if (fbnic_fw_log_ready()) // true
          ... preempt ...
  CPU1: fbnic_remove() // Entry
          fbnic_fw_log_free()
            vfree(log->data_start);
            log->data_start = NULL;
  CPU0: continues, walks log->entries or writes to log->data_start

The initialization also has an incorrect order problem, as the fw_log is currently allocated after MBX setup during initialization. Fix the problems by adjusting the synchronization order to put initialization in place before the mailbox is enabled, and not cleared until after the mailbox has been disabled.

Fixes: ecc53b1b46c89 ("eth: fbnic: Enable firmware logging")
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Link: https://patch.msgid.link/20260211191329.530886-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-30 | eth fbnic: Add debugfs hooks for tx/rx rings | Mike Marciniszyn (Meta)
Add debugfs hooks to display tx/rx rings for each napi vector.

Note that the cloning mechanism in fbnic_ethtool.c for configuration changes protects against concurrency issues between simultaneous config changes and debugfs ring accesses. The configuration switch builds up the new configuration offline, takes the current config down, which removes the debugfs nv files, and switches to the new configuration. The new configuration is then brought up, which brings the debugfs files back on top of the new configuration rings.

The interaction with fbnic_queue_stop() and fbnic_queue_start() will similarly delete and add the files for the indicated vector.

Signed-off-by: Mike Marciniszyn (Meta) <mike.marciniszyn@gmail.com>
Link: https://patch.msgid.link/20260127200644.11640-3-mike.marciniszyn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-30 | eth fbnic: Add debugfs hooks for firmware mailbox | Mike Marciniszyn (Meta)
This patch adds reporting the Rx and Tx information interfacing with the firmware.

The result of reading fbnic/fw_mbx is:

  Rx Rdy: 1 Head: 11 Tail: 10
  Idx Len  E Addr         F H Raw
  ----------------------------------
  00  4096 0 000101fea000 0 1 1000000101fea001
  01  4096 0 000101feb000 0 1 1000000101feb001
  . . .
  15  4096 0 000101fe9000 0 1 1000000101fe9001

  Tx Rdy: 1 Head: 4 Tail: 4
  Idx Len  E Addr         F H Raw
  ----------------------------------
  00  0004 1 00010321b000 1 1 000440010321b003
  01  0004 1 00010228d000 1 1 000440010228d003
  . . .
  15  0004 1 00010321b000 1 1 000440010321b003

Signed-off-by: Mike Marciniszyn (Meta) <mike.marciniszyn@gmail.com>
Link: https://patch.msgid.link/20260127200644.11640-2-mike.marciniszyn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23 | net: fbnic: convert to use .get_rx_ring_count | Breno Leitao
Use the newly introduced .get_rx_ring_count ethtool ops callback instead of handling ETHTOOL_GRXRINGS directly in .get_rxnfc(). Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-5-94dbe4dcaa10@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 | eth: fbnic: Update RX mbox timeout value | Mohsin Bashir
While waiting for completions on read requests, the driver uses different timeout values for different messages. Make use of a single timeout value.

Introduce a wrapper function to handle the wait, which also simplifies maintaining the 80 char line limit.

Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20260115003353.4150771-6-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 | eth: fbnic: Remove retry support | Mohsin Bashir
The driver retries sensor read requests from firmware, but this is unnecessary. A functioning firmware should respond to each request within the timeout period. Remove the retry logic and set the timeout to the sum of all retry timeouts. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260115003353.4150771-5-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 | eth: fbnic: Reuse RX mailbox pages | Mohsin Bashir
Currently, the RX mailbox frees and reallocates a page for each received message. Since FW Rx messages are processed synchronously, and nothing holds these pages (unlike skbs, which we hand over to the stack), reuse the pages and put them back on the Rx ring. Now that we ensure the ring is always fully populated, we don't have to worry about filling it up after partial population during init, either.

Update fbnic_mbx_process_rx_msgs() to recycle pages after message processing.

Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20260115003353.4150771-4-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 | eth: fbnic: Allocate all pages for RX mailbox | Mohsin Bashir
Now that memory is allocated with GFP_KERNEL, allocation failures should be extremely rare. Ensure the FW communication ring is always fully populated with free pages, and hard fail initialization otherwise. This enables simplifications in next patches. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260115003353.4150771-3-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 | eth: fbnic: Use GFP_KERNEL to allocting mbx pages | Mohsin Bashir
Replace GFP_ATOMIC with GFP_KERNEL for mailbox RX page allocation. Since the interrupt handler is threaded, GFP_KERNEL is a safe option that reduces allocation failures. Also remove __GFP_NOWARN so the kernel reports a warning on allocation failure to aid debugging.

Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20260115003353.4150771-2-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-14 | net: add bare bone queue configs | Pavel Begunkov
We'll need to pass extra parameters when allocating a queue for memory providers. Define a new structure for queue configurations, and pass it to qapi callbacks. It's empty for now, actual parameters will be added in following patches. Configurations should persist across resets, and for that they're default-initialised on device registration and stored in struct netdev_rx_queue. We also add a new qapi callback for defaulting a given config. It must be implemented if a driver wants to use queue configs and is optional otherwise. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2025-12-04 | Merge tag 'pci-v6.19-changes' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan Williams) - Switch vmd from custom domain number allocator to the common allocator to prevent a potential race with new non-VMD buses (Dan Williams) - Enable Precision Time Measurement (PTM) only if device advertises support for a relevant role, to prevent invalid PTM Requests that cause ACS violations that are reported as AER Uncorrectable Non-Fatal errors (Mika Westerberg) Resource management: - Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen) - Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen) - Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu drivers so the PCI core can restore BARs if the resize fails (Ilpo Järvinen) - Move Resizable BAR code to rebar.c (Ilpo Järvinen) - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen) - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen) Power management and error handling: - For drivers using PCI legacy suspend, save config state at suspend so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - For devices with no driver or a driver that lacks power management, save config state at hibernate so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - Save device config space on device addition, before driver binding, so error recovery works more reliably (Lukas Wunner) - Drop pci_save_state() from several drivers that no longer need it since the PCI core always does it and pci_restore_state() no longer invalidates the saved state (Lukas Wunner) - Document use of pci_save_state() by drivers to capture the state they want restored during error recovery (Lukas Wunner) Power control: - Add a struct 
pci_ops.assert_perst() function pointer to assert/deassert PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya Chundru) - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch, which must be held in reset after poweron so the pwrctrl driver can configure the switch via I2C before bringing up the links (Krishna Chaitanya Chundru) Endpoint framework: - Convert the endpoint doorbell test to use a threaded IRQ to fix a 'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri) - Add endpoint VNTB MSI doorbell support to reduce latency between host and endpoint (Frank Li) New native PCIe controller drivers: - Add CIX Sky1 host controller DT binding and driver (Hans Zhang) - Add NXP S32G host controller DT binding and driver (Vincent Guittot) - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu Beznea) - Add SpacemiT K1 host controller DT binding and driver (Alex Elder) Amlogic Meson PCIe controller driver: - Update DT binding to name DBI region 'dbi', not 'elbi', and update driver to support both (Manivannan Sadhasivam) Apple PCIe controller driver: - Move struct pci_host_bridge allocation from pci_host_common_init() to callers, which significantly simplifies pcie-apple (Marc Zyngier) Broadcom STB PCIe controller driver: - Disable advertising ASPM L0s support correctly (Jim Quinlan) - Add a panic/die handler to print diagnostic info in case PCIe caused an unrecoverable abort (Jim Quinlan) Cadence PCIe controller driver: - Add module support for Cadence platform host and endpoint controller driver (Manikandan K Pillai) - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare for new CIX Sky1 driver (Manikandan K Pillai) MediaTek PCIe controller driver: - Convert DT binding to YAML schema (Christian Marangi) - Add Airoha AN7583 DT compatible and driver support (Christian Marangi) Qualcomm PCIe controller driver: - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu) - Add required 'power-domains' and 'resets' to 
qcom sa8775p, sc7280, sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT schemas (Krzysztof Kozlowski) - Look up OPP using both frequency and data rate (not just frequency) so RPMh votes can account for both (Krishna Chaitanya Chundru) Rockchip DesignWare PCIe controller driver: - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi) STMicroelectronics STM32MP25 PCIe controller driver: - Fix a race between link training and endpoint register initialization (Christian Bruel) - Align endpoint allocations to match the ATU requirements (Christian Bruel) Synopsys DesignWare PCIe controller driver: - Clear L1 PM Substate Capability 'Supported' bits unless glue driver says it's supported, which prevents users from enabling non-working L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas) - Remove now-superfluous L1SS disable code from tegra194 (Bjorn Helgaas) - Configure L1SS support in dw-rockchip when DT says 'supports-clkreq' (Shawn Lin) TI Keystone PCIe controller driver: - Fail the probe instead of silently succeeding if ks_pcie_of_data didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli) - Make keystone buildable as a loadable module, except on ARM32 where hook_fault_code() is __init (Siddharth Vadapalli)" * tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits) MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer PCI: sky1: Add PCIe host support for CIX Sky1 dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings PCI: cadence: Add support for High Perf Architecture (HPA) controller MAINTAINERS: Add NXP S32G PCIe controller driver maintainer PCI: s32g: Add NXP S32G PCIe controller driver (RC) PCI: dwc: Add register and bitfield definitions dt-bindings: PCI: s32g: Add NXP S32G PCIe controller PCI: Add Renesas RZ/G3S host controller driver PCI: host-generic: Move bridge allocation outside of 
pci_host_common_init() dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding PCI: Validate pci_rebar_size_supported() input Documentation: PCI: Amend error recovery doc with pci_save_state() rules treewide: Drop pci_save_state() after pci_restore_state() PCI/ERR: Ensure error recoverability at all times PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths PCI: dw-rockchip: Configure L1SS support PCI: tegra194: Remove unnecessary L1SS disable code ...
2025-11-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Conflicts:

net/xdp/xsk.c
  0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
  8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb")
  30ed05adca4a ("xsk: use a smaller new lock for shared pool case")
  https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au
  https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27 | fbnic: Replace use of internal PCS w/ Designware XPCS | Alexander Duyck
As we have exposed the PCS registers via the SWMII, we can now start looking at connecting the XPCS driver to those registers and let it manage the PCS instead of us doing it directly from the fbnic driver.

For now this just gets us the ability to detect link. The hope is in the future to add some of the vendor specific registers to begin enabling XPCS configuration of the interface.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374325295.959489.14521115864034905277.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27 | fbnic: Add SW shim for MDIO interface to PMD and PCS | Alexander Duyck
In order for us to support a PCS device we need to add an MDIO bus to allow the drivers to have access to the registers for the device. This change adds such an interface.

The interface will consist of 2 PHY addrs, the first one consisting of a PMD and PCS, and the second just being a PCS. There is a need for 2 PHY addrs because, in order to support the 50GBase-CR2 mode, we will need to access and configure the PCS vendor registers and RSFEC registers from the second lane identically to the first.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374324532.959489.15389723111560978054.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27 | fbnic: Add handler for reporting link down event statistics | Alexander Duyck
We were previously not displaying the number of link_down_events tracked by the device. With this change we should now be able to display the value. The value itself tracks the calls from the phylink interface to the mac_link_down call. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374323824.959489.6915296616773178954.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27 | fbnic: Add logic to track PMD state via MAC/PCS signals | Alexander Duyck
One complication with the design of our part is that the PMD doesn't provide a direct signal to the host. Instead we have visibility to signals that the PCS provides to the MAC that allow us to check the link state through that. We will need to account for several things in the PMD and firmware when managing the link. Specifically when the link first starts to come up the PMD will cause the link to flap. This is due to the firmware starting a training cycle when the link is first detected. This will cause link flapping if we were to immediately report link up when the PCS first detects it. To address that we are adding a pmd_state variable that is meant to be a countdown of sorts indicating the state of the PMD. If the link is down or has been reconfigured the PMD will start out in the initialize state. By default the link is assumed to be in the SEND_DATA state if it is available on initial link inspection. If link is detected while in the initialize state the PMD state will switch to training, and if after 4 seconds the link is still stable we will transition to link_ready, and finally the send_data state. With this we can avoid link flapping when a cable is first connected to the NIC. One side effect of this is that we need to pull the link state away from the PCS. For now we use a union of the PCS link state register value and the pmd_state. The plan is to add a PMD register to report the pmd_state to the phylink interface. With that we can then look at switching over to the use of the XPCS driver for fbnic instead of having an internal one. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374323107.959489.14951134213387615059.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
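The countdown-style state tracking described in this commit can be pictured with a small state machine. The sketch below is only an illustration under stated assumptions: the enum values, `struct pmd_tracker`, `pmd_poll()`, and the way elapsed time is passed in are hypothetical, and the "union" of PCS link state and pmd_state is interpreted here as requiring both to agree before link is reported up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state names modelled on the commit message. */
enum pmd_state {
	PMD_INITIALIZE,
	PMD_TRAINING,
	PMD_LINK_READY,
	PMD_SEND_DATA,
};

/* Time the link must stay up in training (4 seconds per the message). */
#define PMD_TRAINING_MS 4000u

struct pmd_tracker {
	enum pmd_state state;
	unsigned int stable_ms;	/* time spent in training with link up */
};

/* Advance the tracker by one poll. pcs_link is the link bit the PCS
 * reports to the MAC; elapsed_ms is the time since the previous poll. */
static enum pmd_state pmd_poll(struct pmd_tracker *t, bool pcs_link,
			       unsigned int elapsed_ms)
{
	assert(t != NULL);

	if (!pcs_link) {
		/* Any loss of link restarts the sequence. */
		t->state = PMD_INITIALIZE;
		t->stable_ms = 0;
		return t->state;
	}

	switch (t->state) {
	case PMD_INITIALIZE:
		t->state = PMD_TRAINING;
		t->stable_ms = 0;
		break;
	case PMD_TRAINING:
		t->stable_ms += elapsed_ms;
		if (t->stable_ms >= PMD_TRAINING_MS)
			t->state = PMD_LINK_READY;
		break;
	case PMD_LINK_READY:
		t->state = PMD_SEND_DATA;
		break;
	case PMD_SEND_DATA:
		break;
	}
	return t->state;
}

/* Link is only reported up once the PCS has link and training is done. */
static bool pmd_link_up(const struct pmd_tracker *t, bool pcs_link)
{
	return pcs_link && t->state == PMD_SEND_DATA;
}
```

Holding back the link-up report until the training window has passed is what keeps the firmware's initial training cycle from showing up as link flaps.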
2025-11-27 | fbnic: Rename PCS IRQ to MAC IRQ as it is actually a MAC interrupt | Alexander Duyck
Throughout several spots in the code I had called out the IRQ as being related to the PCS. However the actual IRQ is a part of the MAC and it is just exposing PCS data. To more accurately reflect the owner of the calls this change makes it so that we rename the functions and values that are taking in the interrupt value and processing it to reflect that it is a MAC call and not a PCS one. This change is mostly motivated by the fact that we will be moving the handling of this interrupt from being PCS focused to being more PMA/PMD focused as this will drive the phydev driver that I am adding instead of driving the PCS directly. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374322373.959489.12018231545479053860.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-26 | eth: fbnic: Fix counter roll-over issue | Mohsin Bashir
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs() when calculating descriptor slots. The issue occurs when head - tail results in a large positive value (unsigned) and the compiler interprets head - tail - 1 as a signed value.

Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation, which is a common way of avoiding this problem when dealing with this sort of ring space calculation.

Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20251125211704.3222413-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
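The masking idiom mentioned in this commit can be shown in isolation. This is a generic sketch rather than the driver's code: `RING_LEN` and `ring_free_slots()` are hypothetical names, and the ring length of 16 is only an example of a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Example ring length. The masking trick only works for powers of two. */
#define RING_LEN 16u

static_assert((RING_LEN & (RING_LEN - 1u)) == 0u,
	      "RING_LEN must be a power of two");

/* Number of descriptors that can still be posted, keeping one slot
 * empty so that a full ring can be told apart from an empty one.
 * The subtraction is done in unsigned arithmetic and then masked, so a
 * tail that is numerically ahead of head wraps around correctly instead
 * of producing a negative or huge count. */
static inline uint32_t ring_free_slots(uint32_t head, uint32_t tail)
{
	return (head - tail - 1u) & (RING_LEN - 1u);
}
```

For instance, with head = 4 and tail = 10 the unsigned difference wraps around, but the mask reduces it to 9, which matches 15 usable slots minus the 6 already posted.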
2025-11-25 | drivers: net: fbnic: Return the true error in fbnic_alloc_napi_vectors. | Dimitri Daskalakis
The error case in fbnic_alloc_napi_vectors defaulted to returning ENOMEM. This can mask the true error case, causing confusion. Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Link: https://patch.msgid.link/20251124200518.1848029-1-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24 | treewide: Drop pci_save_state() after pci_restore_state() | Lukas Wunner
In 2009, commit c82f63e411f1 ("PCI: check saved state before restore") changed the behavior of pci_restore_state() such that it became necessary to call pci_save_state() afterwards, lest recovery from subsequent PCI errors fails. The commit has just been reverted and so all the pci_save_state() after pci_restore_state() calls that have accumulated in the tree are now superfluous. Drop them. Two drivers chose a different approach to achieve the same result: drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the pci_dev's "state_saved" flag to true before calling pci_restore_state(). Drop this as well. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> # qat Link: https://patch.msgid.link/c2b28cc4defa1b743cf1dedee23c455be98b397a.1760274044.git.lukas@wunner.de
2025-11-20 | eth: fbnic: access @pp through netmem_desc instead of page | Byungchul Park
To eliminate the use of struct page in page pool, the page pool users should use netmem descriptor and APIs instead. Make fbnic access @pp through netmem_desc instead of page. Signed-off-by: Byungchul Park <byungchul@sk.com> Link: https://patch.msgid.link/20251120011118.73253-1-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 | eth: fbnic: Configure RDE settings for pause frame | Mohsin Bashir
fbnic supports pause frames. When pause frames are enabled presumably user expects lossless operation from the NIC. Make sure we configure RDE (Rx DMA Engine) to DROP_NEVER mode to avoid discards due to delays in fetching Rx descriptors from the host. While at it enable DROP_NEVER when NIC only has a single queue configured. In this case the NIC acts as a FIFO so there's no risk of head-of-line blocking other queues by making RDE wait. If pause is disabled this just moves the packet loss from the DMA engine to the Rx buffer. Remove redundant call to fbnic_config_drop_mode_rcq(), introduced by commit 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free"). This call does not add value as fbnic_enable_rcq(), which is called immediately afterward, already handles this. Although we do not support autoneg at this time, preserve tx_pause in .mac_link_up instead of fbnic_phylink_get_pauseparam() Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251113232610.1151712-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
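The policy described in this commit reduces to a small decision. The sketch below is an assumption-laden illustration: the enum names and `pick_drop_mode()` are hypothetical, and only the two conditions named in the message (pause enabled, or a single Rx queue) are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two behaviours discussed in the message. */
enum rde_drop_mode {
	RDE_DROP_ON_FULL,	/* discard when Rx descriptors are late */
	RDE_DROP_NEVER,		/* make the Rx DMA engine wait instead */
};

static enum rde_drop_mode pick_drop_mode(bool pause_enabled,
					 unsigned int num_rx_queues)
{
	assert(num_rx_queues > 0);

	/* With pause frames the user presumably expects lossless operation.
	 * With a single queue the NIC acts as a FIFO, so waiting cannot
	 * head-of-line block any other queue. */
	if (pause_enabled || num_rx_queues == 1)
		return RDE_DROP_NEVER;

	return RDE_DROP_ON_FULL;
}
```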
2025-10-22 | eth: fbnic: fix integer overflow warning in TLV_MAX_DATA definition | Pei Xiao
The TLV_MAX_DATA macro calculates (PAGE_SIZE - 512) which can exceed the maximum value of a 16-bit unsigned integer on architectures with large page sizes, causing compiler warnings:

  drivers/net/ethernet/meta/fbnic/fbnic_tlv.h:83:24: warning: conversion from 'long unsigned int' to 'short unsigned int' changes value from '261632' to '65024' [-Woverflow]

Fix this by explicitly masking the result to 16 bits using bitwise AND with 0xFFFF, ensuring the value fits within the expected data type while maintaining the intended behavior for normal page sizes.

This preserves the existing functionality while eliminating the compiler warning and potential undefined behavior from integer truncation.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510190832.3SQkTCHe-lkp@intel.com/
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/182b9d0235d044d69d7a57c1296cc6f46e395beb.1761039651.git.xiaopei01@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
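The arithmetic behind the warning and the mask can be checked with a few lines of C. This is a standalone sketch, not the contents of fbnic_tlv.h: `tlv_max_data()` is a hypothetical function that takes the page size as a parameter so both cases can be evaluated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the macro: (page size - 512), explicitly
 * masked to 16 bits before being stored in a 16-bit field. */
static inline uint16_t tlv_max_data(unsigned long page_size)
{
	assert(page_size >= 512ul);

	return (uint16_t)((page_size - 512ul) & 0xFFFFul);
}
```

With a 4 KiB page the mask is a no-op (4096 - 512 = 3584). With a 256 KiB page the unmasked value is 261632, and the mask yields 65024, the same number the compiler warning reports for the implicit truncation. The mask makes the narrowing explicit rather than changing the result.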
2025-10-16 | net: fbnic: Allow builds for all 64 bit architectures | Dimitri Daskalakis
This enables aarch64 testing, but there's no reason we cannot support other architectures. Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251013211449.1377054-3-dimitri.daskalakis1@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-16 | net: fbnic: Fix page chunking logic when PAGE_SIZE > 4K | Dimitri Daskalakis
The HW always works on a 4K page size. When the OS supports larger pages, we fragment them across multiple BDQ descriptors. We were not properly incrementing the descriptor, which resulted in us specifying the last chunk's id/addr and then 15 zero descriptors. This would cause packet loss and driver crashes. This is not a fix since the Kconfig prevents use outside of x86.

Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251013211449.1377054-2-dimitri.daskalakis1@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
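The chunking this commit describes, one descriptor per 4K piece of a larger OS page, can be sketched as below. The descriptor layout (`struct bdq_desc`), the function name, and the id numbering are hypothetical and chosen only to make the loop concrete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The hardware's fixed buffer granularity. */
#define HW_CHUNK_SIZE 4096u

/* Hypothetical, simplified buffer descriptor. */
struct bdq_desc {
	uint64_t addr;	/* DMA address of this 4K chunk */
	uint32_t id;	/* buffer id reported back on completion */
};

/* Write one descriptor per 4K chunk of a page and return how many were
 * written. The key point is that each iteration targets the next
 * descriptor (ring[i]); writing every chunk into the same slot would
 * leave only the last chunk visible and the remaining slots zeroed. */
static size_t fill_page_chunks(struct bdq_desc *ring, uint64_t page_dma,
			       size_t page_size, uint32_t first_id)
{
	size_t n = page_size / HW_CHUNK_SIZE;
	size_t i;

	assert(page_size % HW_CHUNK_SIZE == 0);

	for (i = 0; i < n; i++) {
		ring[i].addr = page_dma + (uint64_t)i * HW_CHUNK_SIZE;
		ring[i].id = first_id + (uint32_t)i;
	}
	return n;
}
```

With a 64K page this produces 16 descriptors, which is consistent with the "last chunk plus 15 zero descriptors" symptom in the message.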
2025-10-14 | eth: fbnic: fix various typos in comments and strings | Alok Tiwari
Fix several minor typos and grammatical errors in comments and log messages (in the fbnic firmware, PCI, and time modules).

Changes include:
- "cordeump" -> "coredump"
- "of" -> "off" in RPC config comment
- "healty" -> "healthy" in firmware heartbeat comment
- "Firmware crashed detected!" -> "Firmware crash detected!"
- "The could be caused" -> "This could be caused"
- "lockng" -> "locking" in fbnic_time.c

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251013160507.768820-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-09 | eth: fbnic: fix reporting of alloc_failed qstats | Jakub Kicinski
Rx processing under normal circumstances has 3 rings: 2 buffer rings (heads, payloads) and a completion ring. All the rings have a struct fbnic_ring. Make sure we expose the alloc_failed counter from the buffer rings. Previously only the alloc_failed from the completion ring was reported, even though all ring types may increment this counter (buffer rings in __fbnic_fill_bdq()).

This makes the pp_alloc_fail.py test pass; it expects the qstat to be incrementing as page pool injections happen.

Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 67dc4eb5fc92 ("eth: fbnic: report software Rx queue stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251007232653.2099376-7-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-09 | eth: fbnic: fix saving stats from XDP_TX rings on close | Jakub Kicinski
When rings are freed, their stats get added to the device level stat structs. Save the stats from the XDP_TX ring just as Tx stats. Previously they would be saved to both Rx and Tx stats, so we'd not see XDP_TX packets as Rx during runtime, but after a down/up cycle the packets would appear in stats.

Correct the helper used by ethtool code which does a runtime config switch.

Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 5213ff086344 ("eth: fbnic: Collect packet statistics for XDP")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251007232653.2099376-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-09 | eth: fbnic: fix accounting of XDP packets | Jakub Kicinski
Make XDP-handled packets appear in the Rx stats. The driver has been counting XDP_TX packets on the Tx ring, but there wasn't much accounting on the Rx side (the Rx bytes appear to be incremented on XDP_TX but XDP_DROP / XDP_ABORT are only counted as Rx drops).

Counting XDP_TX packets (not just bytes) in Rx stats looks like a simple bug of omission.

The XDP_DROP handling appears to be intentional. Whether XDP_DROP packets should be counted in interface-level Rx stats is a bit unclear historically. When we were defining qstats, however, we clarified based on operational experience that in this context:

  name: rx-packets
  doc: |
    Number of wire packets successfully received and passed to the stack.
    For drivers supporting XDP, XDP is considered the first layer
    of the stack, so packets consumed by XDP are still counted here.

fbnic does not obey this requirement. Since XDP support has been added in current release cycle, instead of splitting interface and qstat handling - make them both follow the qstat definition.

Another small tweak here is that we count bytes as received on the wire rather than post-XDP bytes (xdp_get_buff_len() vs skb->len).

Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 5213ff086344 ("eth: fbnic: Collect packet statistics for XDP")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251007232653.2099376-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-09eth: fbnic: fix missing programming of the default descriptorJakub Kicinski
XDP_TX typically uses no offloads. To optimize XDP we added a "default descriptor" feature to the chip, which allows us to send XDP frames with just the buffer descriptors (DMA address + length). All the metadata descriptors are derived from the queue config. Commit under Fixes missed setting the defaults up when transplanting the code from the prototype driver. Importantly, after reset the "request completion" bit is not set. Packets still get sent but there's no completion, so the ring is not cleaned up. We can send one ring's worth of packets and then will start dropping all frames that got the XDP_TX action from the XDP prog. Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 168deb7b31b2 ("eth: fbnic: Add support for XDP_TX action") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251007232653.2099376-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-26eth: fbnic: Add support to read lane countMohsin Bashir
We are reporting the lane count in the link settings but the flag is not set to indicate that the driver supports lanes. Set the flag to report lane count. ~]# ethtool eth0 | grep Lanes Lanes: 2 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250924184445.2293325-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-26ethtool: add FEC bins histogram reportVadim Fedorenko
IEEE 802.3ck-2022 defines counters for FEC bins and 802.3df-2024 clarifies it a bit further. Implement a reporting interface as an addition to the FEC stats available in ethtool. Drivers can leave the bin counter uninitialized if per-lane values are provided. In this case the core will recalculate the sum for the bin. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20250924124037.1508846-2-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-25eth: fbnic: Read module EEPROMMohsin Bashir
Add support to read module EEPROM for fbnic. Towards this, add required support to issue a new command to the firmware and to receive the response to the corresponding command. Create a local copy of the data in the completion struct before writing to ethtool_module_eeprom to avoid writing to data in case it is freed. Given that EEPROM pages are small, the overhead of additional copy is negligible. Do not block API with explicit checks since API has appropriate checks in place for length, offset, and page. Explicitly check bank, page, offset, and length in fbnic_fw_parse_qsfp_read_resp() to match EEPROM read responses to the correct request. This is important because if the driver times out waiting for an EEPROM read response, a subsequent read request with different values is susceptible to receiving an erroneous response (i.e., the response to the previous request). Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250922231855.3717483-1-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
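The request/response matching described above can be illustrated with a minimal sketch. It assumes a hypothetical `struct eeprom_read_key` and helper `eeprom_resp_matches()`; neither is taken from the driver, and the real check in fbnic_fw_parse_qsfp_read_resp() may be structured differently. The idea is only that a late response to a timed-out request differs from the current request in at least one field and is rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key; the field names follow the commit text, not the driver. */
struct eeprom_read_key {
	uint8_t bank;
	uint8_t page;
	uint32_t offset;
	uint32_t length;
};

/*
 * Accept a response only if every field echoes the outstanding request.
 * A stale response to an earlier, timed-out read fails this comparison
 * unless it happens to target exactly the same bank/page/offset/length.
 */
static bool eeprom_resp_matches(const struct eeprom_read_key *req,
				const struct eeprom_read_key *resp)
{
	return req->bank == resp->bank &&
	       req->page == resp->page &&
	       req->offset == resp->offset &&
	       req->length == resp->length;
}
```

Comparing all four fields, rather than only page and offset, keeps the check meaningful when two reads target the same page with different lengths.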
2025-09-18eth: fbnic: add OTP health reporterJakub Kicinski
OTP memory ("fuses") are used for secure boot and anti-rollback protection. The OTP memory is ECC protected. Check for its health periodically to notice when the chip is starting to go bad. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-10-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: report FW uptime in health diagnoseJakub Kicinski
FW crashes are detected based on uptime going back, expose the uptime via devlink health diagnose. $ devlink -j health diagnose pci/0000:01:00.0 reporter fw {"last_heartbeat":{"fw_uptime":{"sec":201,"msec":76}}} $ devlink -j health diagnose pci/0000:01:00.0 reporter fw last_heartbeat: fw_uptime: sec: 201 msec: 76 Reviewed-by: Lee Trager <lee@trager.us> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-9-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: add FW health reporterJakub Kicinski
Add a health reporter to catch FW crashes. Dumping the reporter if FW has not crashed will create a snapshot of FW memory. Reviewed-by: Lee Trager <lee@trager.us> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-8-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: support FW communication for core dumpJakub Kicinski
To read FW core dump we need to issue two commands to FW: - first get the FW core dump info - second read the dump chunk by chunk Implement these two FW commands. Subsequent commits will use them to expose FW dump via devlink health. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-7-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: support allocating FW completions with extra spaceJakub Kicinski
Support allocating extra space after the FW completion. This makes it easy to pass extra variable size buffer space to FW response handlers without worrying about synchronization (completion itself is already refcounted). Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-6-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: reprogram TCAMs after FW crashJakub Kicinski
FW may mess with the TCAM after it boots, to try to restore the traffic flow to the BMC (it may not be aware that the host is already up). Make sure that we reprogram the TCAMs after detecting a crash. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-5-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: factor out clearing the action TCAMJakub Kicinski
We'll want to wipe the driver TCAM state after FW crash, to force a re-programming. Factor out the clearing logic. Remove the micro-optimization to skip clearing the BMC entry twice, it doesn't hurt. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-4-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: use fw uptime to detect fw crashesJakub Kicinski
Currently we only detect FW crashes when it stops responding to heartbeat messages. FW has a watchdog which will reset it in case of crashes. Use FW uptime sent in the ownership and heartbeat messages to detect that the watchdog has fired (uptime went down). Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
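A minimal sketch of the uptime check described above, assuming a hypothetical `struct fw_uptime_tracker` and helper `fw_uptime_update()` (neither name comes from the driver, and the millisecond unit is an assumption). Each ownership or heartbeat message would feed its reported uptime into the helper; a value lower than the previous one suggests the watchdog reset the FW in between.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker; not the fbnic driver's actual state. */
struct fw_uptime_tracker {
	uint64_t last_uptime_ms;	/* uptime from the previous message */
	bool seen;			/* false until the first report arrives */
};

/*
 * Record a newly reported uptime. Returns true if it is lower than the
 * previous report, i.e. the uptime went backwards since the last message.
 */
static bool fw_uptime_update(struct fw_uptime_tracker *t, uint64_t uptime_ms)
{
	bool went_back = t->seen && uptime_ms < t->last_uptime_ms;

	t->last_uptime_ms = uptime_ms;
	t->seen = true;
	return went_back;
}
```

The first report never counts as a crash because there is nothing to compare it against, and after a detected reset the tracker continues from the new, lower uptime.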
2025-09-18eth: fbnic: make fbnic_fw_log_write() parameter constJakub Kicinski
Make the log message parameter const, it's not modified and this lets us pass in strings which are const for the caller. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916231420.1693955-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-18eth: fbnic: support devmem TxJakub Kicinski
Support devmem Tx. We already use skb_frag_dma_map(), we just need to make sure we don't try to unmap the frags. Check if frag is unreadable and mark the ring entry. # ./tools/testing/selftests/drivers/net/hw/devmem.py TAP version 13 1..3 ok 1 devmem.check_rx ok 2 devmem.check_tx ok 3 devmem.check_tx_chunks # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Acked-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250916145401.1464550-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-09eth: fbnic: support persistent NAPI configJakub Kicinski
No shenanigans in this driver, AFAIU, pass the vector index to NAPI registration. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250905022254.2635707-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-04eth: fbnic: support queue ops / zero-copy RxJakub Kicinski
Support queue ops. fbnic doesn't shut down the entire device just to restart a single queue. ./tools/testing/selftests/drivers/net/hw/iou-zcrx.py TAP version 13 1..3 ok 1 iou-zcrx.test_zcrx ok 2 iou-zcrx.test_zcrx_oneshot ok 3 iou-zcrx.test_zcrx_rss # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Acked-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250901211214.1027927-15-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>