<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/infiniband/hw/mana, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>RDMA/mana: Fix error unwind in mana_ib_create_qp_rss()</title>
<updated>2026-05-02T18:30:48+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-04-28T16:17:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6aaa978c6b6218cfac15fe1dab17c76fe229ce3f'/>
<id>6aaa978c6b6218cfac15fe1dab17c76fe229ce3f</id>
<content type='text'>
Sashiko points out that mana_ib_cfg_vport_steering() is leaked, the normal
destroy path cleans it up.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/7-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko points out that mana_ib_cfg_vport_steering() is leaked, the normal
destroy path cleans it up.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=4
Link: https://patch.msgid.link/r/7-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana: Fix mana_destroy_wq_obj() cleanup in mana_ib_create_qp_rss()</title>
<updated>2026-05-02T18:30:48+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-04-28T16:17:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=34ecf795692ee57c393109f4a24ccc313091e137'/>
<id>34ecf795692ee57c393109f4a24ccc313091e137</id>
<content type='text'>
Sashiko points out there are two bugs here in the error unwind flow, both
related to how the WQ table is unwound.

First there is a double i-- on the first failure path due to the while loop
having a i--, remove it.

Second if mana_ib_install_cq_cb() fails then mana_create_wq_obj() is not
undone due to the above i--.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/6-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko points out there are two bugs here in the error unwind flow, both
related to how the WQ table is unwound.

First there is a double i-- on the first failure path due to the while loop
having a i--, remove it.

Second if mana_ib_install_cq_cb() fails then mana_create_wq_obj() is not
undone due to the above i--.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/6-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss()</title>
<updated>2026-05-02T18:30:48+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-04-28T16:17:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=159f2efabc89d3f931d38f2d35876535d4abf0a3'/>
<id>159f2efabc89d3f931d38f2d35876535d4abf0a3</id>
<content type='text'>
Sashiko points out that the user can specify WQs sharing the same CQ as a
part of the uAPI and this will trigger the WARN_ON() then go on to corrupt
the kernel.

Just reject it outright and fail the QP creation.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/5-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko points out that the user can specify WQs sharing the same CQ as a
part of the uAPI and this will trigger the WARN_ON() then go on to corrupt
the kernel.

Just reject it outright and fail the QP creation.

Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/5-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana: Validate rx_hash_key_len</title>
<updated>2026-05-02T18:30:48+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-04-28T16:17:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6dd2d4ad9c8429523b1c220c5132bd551c006425'/>
<id>6dd2d4ad9c8429523b1c220c5132bd551c006425</id>
<content type='text'>
Sashiko points out that rx_hash_key_len comes from a uAPI structure and is
blindly passed to memcpy, allowing the userspace to trash kernel
memory. Bounds check it so the memcpy cannot overflow.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/4-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko points out that rx_hash_key_len comes from a uAPI structure and is
blindly passed to memcpy, allowing the userspace to trash kernel
memory. Bounds check it so the memcpy cannot overflow.

Cc: stable@vger.kernel.org
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/4-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: Support memory windows</title>
<updated>2026-04-09T14:22:06+00:00</updated>
<author>
<name>Konstantin Taranov</name>
<email>kotaranov@microsoft.com</email>
</author>
<published>2026-03-31T09:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b21058880c454a06eeb0d146cd08e80b00caacb4'/>
<id>b21058880c454a06eeb0d146cd08e80b00caacb4</id>
<content type='text'>
Implement .alloc_mw() and .dealloc_mw() for mana device.

This is just the basic infrastructure, MW is not practically usable until
additional kernel support for allowing user space to submit MW work
requests is completed.

Link: https://patch.msgid.link/r/20260331090851.2276205-1-kotaranov@linux.microsoft.com
Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement .alloc_mw() and .dealloc_mw() for mana device.

This is just the basic infrastructure, MW is not practically usable until
additional kernel support for allowing user space to submit MW work
requests is completed.

Link: https://patch.msgid.link/r/20260331090851.2276205-1-kotaranov@linux.microsoft.com
Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA: Remove redundant = {} for udata req structs</title>
<updated>2026-03-31T07:18:16+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-03-25T21:27:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e3e07cca00434d6430e44c310b2ab2965dfb794'/>
<id>8e3e07cca00434d6430e44c310b2ab2965dfb794</id>
<content type='text'>
Now that all of the udata request structs are loaded with the helpers
the callers should not pre-zero them. The helpers all guarantee that
the entire struct is filled with something.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that all of the udata request structs are loaded with the helpers
the callers should not pre-zero them. The helpers all guarantee that
the entire struct is filled with something.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA: Consolidate patterns with sizeof() to ib_copy_validate_udata_in()</title>
<updated>2026-03-31T07:11:01+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-03-25T21:26:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e910d98dc440f1d9965289f52d455dc7a81fae82'/>
<id>e910d98dc440f1d9965289f52d455dc7a81fae82</id>
<content type='text'>
Similar to the prior patch, these patterns are open coding an
offsetofend() using sizeof(), which targets the last member of the
current struct.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Bernard Metzler &lt;bernard.metzler@linux.dev&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to the prior patch, these patterns are open coding an
offsetofend() using sizeof(), which targets the last member of the
current struct.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Bernard Metzler &lt;bernard.metzler@linux.dev&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA: Consolidate patterns with offsetof() to ib_copy_validate_udata_in()</title>
<updated>2026-03-31T07:11:01+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-03-25T21:26:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8d7573b194021b0b2baee9a8c563af488be733be'/>
<id>8d7573b194021b0b2baee9a8c563af488be733be</id>
<content type='text'>
Similar to the prior patch, these patterns are open coding an
offsetofend(). The use of offsetof() targets the prior field as the
last field in the struct.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to the prior patch, these patterns are open coding an
offsetofend(). The use of offsetof() targets the prior field as the
last field in the struct.

Reviewed-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: Disable RX steering on RSS QP destroy</title>
<updated>2026-03-30T17:47:45+00:00</updated>
<author>
<name>Long Li</name>
<email>longli@microsoft.com</email>
</author>
<published>2026-03-25T19:40:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dbeb256e8dd87233d891b170c0b32a6466467036'/>
<id>dbeb256e8dd87233d891b170c0b32a6466467036</id>
<content type='text'>
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss()
destroys the RX WQ objects but does not disable vPort RX steering in
firmware. This leaves stale steering configuration that still points to
the destroyed RX objects.

If traffic continues to arrive (e.g. peer VM is still transmitting) and
the VF interface is subsequently brought up (mana_open), the firmware
may deliver completions using stale CQ IDs from the old RX objects.
These CQ IDs can be reused by the ethernet driver for new TX CQs,
causing RX completions to land on TX CQs:

  WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana]  (is_sq == false)
  WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails)

Fix this by disabling vPort RX steering before destroying RX WQ objects.
Note that mana_fence_rqs() cannot be used here because the fence
completion is delivered on the CQ, which is polled by user-mode (e.g.
DPDK) and not visible to the kernel driver.

Refactor the disable logic into a shared mana_disable_vport_rx() in
mana_en, exported for use by mana_ib, replacing the duplicate code.
The ethernet driver's mana_dealloc_queues() is also updated to call
this common function.

Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss()
destroys the RX WQ objects but does not disable vPort RX steering in
firmware. This leaves stale steering configuration that still points to
the destroyed RX objects.

If traffic continues to arrive (e.g. peer VM is still transmitting) and
the VF interface is subsequently brought up (mana_open), the firmware
may deliver completions using stale CQ IDs from the old RX objects.
These CQ IDs can be reused by the ethernet driver for new TX CQs,
causing RX completions to land on TX CQs:

  WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana]  (is_sq == false)
  WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails)

Fix this by disabling vPort RX steering before destroying RX WQ objects.
Note that mana_fence_rqs() cannot be used here because the fence
completion is delivered on the CQ, which is polled by user-mode (e.g.
DPDK) and not visible to the kernel driver.

Refactor the disable logic into a shared mana_disable_vport_rx() in
mana_en, exported for use by mana_ib, replacing the duplicate code.
The ethernet driver's mana_dealloc_queues() is also updated to call
this common function.

Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mana_ib: cleanup the usage of mana_gd_send_request()</title>
<updated>2026-03-30T17:47:43+00:00</updated>
<author>
<name>Konstantin Taranov</name>
<email>kotaranov@microsoft.com</email>
</author>
<published>2026-03-18T17:39:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5aa437c93d90f486ff5428183323c971cfeb4294'/>
<id>5aa437c93d90f486ff5428183323c971cfeb4294</id>
<content type='text'>
Do not check the status of the response header returned by mana_gd_send_request(),
as the returned error code already indicates the request status.

The mana_gd_send_request() may return no error code and have the response status
GDMA_STATUS_MORE_ENTRIES, which is a successful completion. It is used
for checking the correctness of multi-request operations, such as creation of
a dma region with mana_ib_gd_create_dma_region().

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://patch.msgid.link/20260318173939.1417856-1-kotaranov@linux.microsoft.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Do not check the status of the response header returned by mana_gd_send_request(),
as the returned error code already indicates the request status.

The mana_gd_send_request() may return no error code and have the response status
GDMA_STATUS_MORE_ENTRIES, which is a successful completion. It is used
for checking the correctness of multi-request operations, such as creation of
a dma region with mana_ib_gd_create_dma_region().

Signed-off-by: Konstantin Taranov &lt;kotaranov@microsoft.com&gt;
Link: https://patch.msgid.link/20260318173939.1417856-1-kotaranov@linux.microsoft.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
