.. SPDX-License-Identifier: GPL-2.0+

=====================================
Meta Platforms Host Network Interface
=====================================

Firmware Versions
-----------------

fbnic has three components stored on the flash which are provided in one PLDM
image:

1. fw - The control firmware used to view and modify firmware settings, request
   firmware actions, and retrieve firmware counters outside of the data path.
   This is the firmware which fbnic_fw.c interacts with.
2. bootloader - The firmware which validates firmware security and controls
   basic operations including loading and updating the firmware. This is also
   known as the cmrt firmware.
3. undi - This is the UEFI driver which is based on the Linux driver.

fbnic stores two copies of these three components on flash. This allows fbnic
to fall back to an older version of firmware automatically in case firmware
fails to boot. Version information for the fw and the bootloader is reported
as both running and stored. The undi is only reported as stored, as it is not
actively running once the Linux driver takes over.

``devlink dev info`` provides version information for all three components. In
addition to the version, the hg commit hash of the build is included as a
separate entry.

Configuration
-------------

Ringparams (ethtool -g / -G)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

fbnic has two submission (host -> device) rings for every completion
(device -> host) ring. The three ring objects together form a single
"queue" as used by higher layer software (an Rx or a Tx queue).

For Rx the two submission rings are used to pass empty pages to the NIC.
Ring 0 is the Header Page Queue (HPQ); the NIC uses its pages to place
L2-L4 headers (or full frames if the frame is not header-data split).
Ring 1 is the Payload Page Queue (PPQ) and is used for packet payloads.
The completion ring is used to receive packet notifications / metadata.
ethtool ``rx`` ringparam maps to the size of the completion ring,
``rx-mini`` to the HPQ, and ``rx-jumbo`` to the PPQ.

For Tx both submission rings can be used to submit packets, and the completion
ring carries notifications for both. fbnic uses one of the submission
rings for normal traffic from the stack and the second one for XDP frames.
ethtool ``tx`` ringparam controls both the size of the submission rings
and the completion ring.

Every single entry on the HPQ and PPQ (``rx-mini``, ``rx-jumbo``)
corresponds to 4kB of allocated memory, while entries on the remaining
rings are in units of descriptors (8B). The ideal ratio of submission
and completion ring sizes will depend on the workload, as for small packets
multiple packets will fit into a single page.
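
As a rough illustration of the units described above, the following sketch
estimates the host memory referenced by one Rx queue. It only uses the two
figures given in this section (4kB per HPQ/PPQ entry, 8B per descriptor on the
completion ring); the function name and the ring sizes in the usage line are
hypothetical, and the memory of the HPQ/PPQ rings themselves is not counted.

.. code-block:: python

   PAGE_ENTRY_BYTES = 4 * 1024  # each HPQ / PPQ entry maps to 4kB of memory
   DESC_BYTES = 8               # each completion ring entry is one 8B descriptor


   def rx_queue_bytes(rx, rx_mini, rx_jumbo):
       """Approximate bytes referenced by one Rx queue.

       rx       -- completion ring size (ethtool ``rx``)
       rx_mini  -- HPQ size (ethtool ``rx-mini``)
       rx_jumbo -- PPQ size (ethtool ``rx-jumbo``)
       """
       return {
           "hpq_pages": rx_mini * PAGE_ENTRY_BYTES,
           "ppq_pages": rx_jumbo * PAGE_ENTRY_BYTES,
           "rcq_descs": rx * DESC_BYTES,
       }


   # Arbitrary example sizes, not driver defaults.
   print(rx_queue_bytes(rx=1024, rx_mini=512, rx_jumbo=2048))

The page rings dominate the total, which is why the ``rx-mini`` and
``rx-jumbo`` settings matter far more for memory use than ``rx``.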

Upgrading Firmware
------------------

fbnic supports updating firmware using signed PLDM images with ``devlink dev
flash``. PLDM images are written into the flash. Flashing does not interrupt
the operation of the device.

On host boot the latest UEFI driver is always used; no explicit activation
is required. Firmware activation is required to run new control firmware. cmrt
firmware can only be activated by power cycling the NIC.

Health reporters
----------------

fw reporter
~~~~~~~~~~~

The ``fw`` health reporter tracks FW crashes. Dumping the reporter will
show the core dump of the most recent FW crash or, if no FW crash has
happened since the last power cycle, a snapshot of the FW memory. The diagnose
callback shows FW uptime based on the most recently received heartbeat message
(the crashes are detected by checking if uptime goes down).
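
The uptime check can be illustrated with a minimal sketch. This is not the
driver's code; it only shows the idea that a reported uptime lower than the
previous one implies a restart in between. The function name and the sample
values are hypothetical.

.. code-block:: python

   def count_fw_restarts(uptimes):
       """Count how often the reported uptime goes down between heartbeats."""
       restarts = 0
       prev = None
       for uptime in uptimes:
           # A smaller uptime than last time means the FW restarted.
           if prev is not None and uptime < prev:
               restarts += 1
           prev = uptime
       return restarts


   # Uptime drops twice in this made-up sequence (30 -> 5 and 15 -> 2).
   print(count_fw_restarts([10, 20, 30, 5, 15, 2]))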

otp reporter
~~~~~~~~~~~~

OTP memory ("fuses") is used for secure boot and anti-rollback
protection. The OTP memory is ECC protected; ECC errors indicate
either a manufacturing defect or the part deteriorating with age.

Statistics
----------

TX MAC Interface
~~~~~~~~~~~~~~~~

 - ``ptp_illegal_req``: packets sent to the NIC with PTP request bit set but routed to BMC/FW
 - ``ptp_good_ts``: packets successfully routed to MAC with PTP request bit set
 - ``ptp_bad_ts``: packets destined for MAC with PTP request bit set but aborted because of some error (e.g., DMA read error)

TX Extension (TEI) Interface (TTI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 - ``tti_cm_drop``: control messages dropped at the TX Extension (TEI) Interface because of credit starvation
 - ``tti_frame_drop``: packets dropped at the TX Extension (TEI) Interface because of credit starvation
 - ``tti_tbi_drop``: packets dropped at the TX BMC Interface (TBI) because of credit starvation

RXB (RX Buffer) Enqueue
~~~~~~~~~~~~~~~~~~~~~~~

 - ``rxb_integrity_err[i]``: frames enqueued with integrity errors (e.g., multi-bit ECC errors) on RXB input i
 - ``rxb_mac_err[i]``: frames enqueued with MAC end-of-frame errors (e.g., bad FCS) on RXB input i
 - ``rxb_parser_err[i]``: frames that experienced RPC parser errors
 - ``rxb_frm_err[i]``: frames that experienced signaling errors (e.g., missing end-of-packet/start-of-packet) on RXB input i
 - ``rxb_drbo[i]_frames``: frames received at RXB input i
 - ``rxb_drbo[i]_bytes``: bytes received at RXB input i

RXB (RX Buffer) FIFO
~~~~~~~~~~~~~~~~~~~~

 - ``rxb_fifo[i]_drop``: transitions into the drop state on RXB pool i
 - ``rxb_fifo[i]_dropped_frames``: frames dropped on RXB pool i
 - ``rxb_fifo[i]_ecn``: transitions into the ECN mark state on RXB pool i
 - ``rxb_fifo[i]_level``: current occupancy of RXB pool i

RXB (RX Buffer) Dequeue
~~~~~~~~~~~~~~~~~~~~~~~

 - ``rxb_intf[i]_frames``: frames sent to the output i
 - ``rxb_intf[i]_bytes``: bytes sent to the output i
 - ``rxb_pbuf[i]_frames``: frames sent to output i from the perspective of internal packet buffer
 - ``rxb_pbuf[i]_bytes``: bytes sent to output i from the perspective of internal packet buffer
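
When comparing the per-index RXB counters above, it can help to group them by
index. The following is a minimal sketch, assuming the counters are available
as ``name: value`` text lines and that the index appears either in brackets
(as written in the lists above) or as a bare number. The exact exported names
are an assumption and may differ; ``group_rxb_counters`` and the sample text
are hypothetical.

.. code-block:: python

   import re
   from collections import defaultdict

   # Matches e.g. "rxb_fifo[0]_drop: 3" or "rxb_fifo0_drop: 3".
   _RXB_STAT = re.compile(r"^\s*(rxb_[a-z_]+)\[?(\d+)\]?([a-z_]*):\s*(\d+)\s*$")


   def group_rxb_counters(text):
       """Return {"rxb_fifo[i]_drop": {0: 3, 1: 7}, ...} from stats text."""
       groups = defaultdict(dict)
       for line in text.splitlines():
           match = _RXB_STAT.match(line)
           if not match:
               continue  # not an indexed RXB counter
           prefix, index, suffix, value = match.groups()
           groups[prefix + "[i]" + suffix][int(index)] = int(value)
       return dict(groups)


   # Made-up sample input, not real device output.
   sample = """
        rxb_fifo[0]_drop: 3
        rxb_fifo[1]_drop: 7
        rpc_ipv4_frag: 5
        rxb_drbo2_frames: 10
   """
   print(group_rxb_counters(sample))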

RPC (Rx parser)
~~~~~~~~~~~~~~~

 - ``rpc_unkn_etype``: frames containing unknown EtherType
 - ``rpc_unkn_ext_hdr``: frames containing unknown IPv6 extension header
 - ``rpc_ipv4_frag``: frames containing IPv4 fragment
 - ``rpc_ipv6_frag``: frames containing IPv6 fragment
 - ``rpc_ipv4_esp``: frames with IPv4 ESP encapsulation
 - ``rpc_ipv6_esp``: frames with IPv6 ESP encapsulation
 - ``rpc_tcp_opt_err``: frames which encountered TCP option parsing error
 - ``rpc_out_of_hdr_err``: frames where header was larger than parsable region
 - ``ovr_size_err``: oversized frames

Hardware Queues
~~~~~~~~~~~~~~~

1. RX DMA Engine:

 - ``rde_[i]_pkt_err``: packets with MAC EOP, RPC parser, RXB truncation, or RDE frame truncation errors. These errors are flagged in the packet metadata because of cut-through support, but the actual drop happens once PCIE/RDE is reached.
 - ``rde_[i]_pkt_cq_drop``: packets dropped because RCQ is full
 - ``rde_[i]_pkt_bdq_drop``: packets dropped because HPQ or PPQ ran out of host buffer

PCIe
~~~~

The fbnic driver exposes PCIe hardware performance statistics through debugfs
(``pcie_stats``). These statistics provide insights into PCIe transaction
behavior and potential performance bottlenecks.

1. PCIe Transaction Counters:

   These counters track PCIe transaction activity:
        - ``pcie_ob_rd_tlp``: Outbound read Transaction Layer Packets count
        - ``pcie_ob_rd_dword``: DWORDs transferred in outbound read transactions
        - ``pcie_ob_wr_tlp``: Outbound write Transaction Layer Packets count
        - ``pcie_ob_wr_dword``: DWORDs transferred in outbound write
	  transactions
        - ``pcie_ob_cpl_tlp``: Outbound completion TLP count
        - ``pcie_ob_cpl_dword``: DWORDs transferred in outbound completion TLPs

2. PCIe Resource Monitoring:

   These counters indicate PCIe resource exhaustion events:
        - ``pcie_ob_rd_no_tag``: Read requests dropped due to tag unavailability
        - ``pcie_ob_rd_no_cpl_cred``: Read requests dropped due to completion
	  credit exhaustion
        - ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
	  credit exhaustion
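
The transaction counters can be combined into simple derived figures. The
sketch below assumes the counters have already been read into a dictionary
keyed by the names listed above (how ``pcie_stats`` is parsed is left out, as
its text layout is not described here). It relies only on a DWORD being 4
bytes; whether the DWORD counters include TLP headers is not stated in this
document, so treat the averages as approximate. ``summarize_pcie`` and the
sample values are hypothetical.

.. code-block:: python

   DWORD_BYTES = 4  # a PCIe DWORD is 32 bits


   def summarize_pcie(stats):
       """Derive bytes and average bytes per TLP for rd / wr / cpl traffic."""
       summary = {}
       for kind in ("rd", "wr", "cpl"):
           tlps = stats.get(f"pcie_ob_{kind}_tlp", 0)
           nbytes = stats.get(f"pcie_ob_{kind}_dword", 0) * DWORD_BYTES
           summary[kind] = {
               "bytes": nbytes,
               # Avoid dividing by zero when no TLPs were counted.
               "avg_bytes_per_tlp": nbytes / tlps if tlps else 0.0,
           }
       return summary


   # Made-up counter values for illustration only.
   print(summarize_pcie({"pcie_ob_wr_tlp": 4, "pcie_ob_wr_dword": 64}))

A low average size per TLP together with non-zero resource exhaustion
counters may point at many small transactions rather than raw bandwidth as
the limiting factor.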

XDP Length Error
~~~~~~~~~~~~~~~~

For XDP programs without frags support, fbnic tries to make sure that the MTU
fits into a single buffer. If an oversized frame is received and gets
fragmented, it is dropped and the following netlink counters are updated:

   - ``rx-length``: number of frames dropped due to lack of fragmentation
     support in the attached XDP program
   - ``rx-errors``: total number of packets with errors received on the interface