linux-toradex.git/drivers/infiniband/ulp, branch v6.19

RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation

2025-12-30T10:47:32+00:00

If device max_mr_size bits in the range [mr_page_shift+31:mr_page_shift]
are zero, the `min3` function will set clt_path::max_pages_per_mr to
zero.

`alloc_path_reqs` will pass zero, which is invalid, as the third parameter
to `ib_alloc_mr`.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Honggang LI 
Link: https://patch.msgid.link/20251229025617.13241-1-honggangli@163.com
Signed-off-by: Leon Romanovsky

RTRS/rtrs: clean up rtrs headers kernel-doc

2025-12-17T17:47:07+00:00

Fix all (30+) kernel-doc warnings in rtrs.h and rtrs-pri.h.  The changes
are:

- add ending ':' to enum member names
- change enum description separators from '-' to ':'
- add "struct" keyword to kernel-doc for structs where missing
- fix enum names in enum rtrs_clt_con_type
- add a '-' separator and drop the "()" in enum rtrs_clt_con_type
- convert struct rtrs_addr to kernel-doc format
- add missing struct member descriptions for struct rtrs_attrs

Link: https://patch.msgid.link/r/20251129022146.1498273-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap 
Acked-by: Jack Wang 
Signed-off-by: Jason Gunthorpe

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

2025-12-05T02:54:37+00:00

Pull rdma updates from Jason Gunthorpe:
 "This has another new RDMA driver 'bng_en' for latest generation
  Broadcom NICs. There might be one more new driver still to come.

  Otherwise it is a fairly quite cycle. Summary:

   - Minor driver bug fixes and updates to cxgb4, rxe, rdmavt, bnxt_re,
     mlx5

   - Many bug fix patches for irdma

   - WQ_PERCPU annotations and system_dfl_wq changes

   - Improved mlx5 support for "other eswitches" and multiple PFs

   - 1600Gbps link speed reporting support. Four Digits Now!

   - New driver bng_en for latest generation Broadcom NICs

   - Bonding support for hns

   - Adjust mlx5's hmm based ODP to work with the very large address
     space created by the new 5 level paging default on x86

   - Lockdep fixups in rxe and siw"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (65 commits)
  RDMA/rxe: reclassify sockets in order to avoid false positives from lockdep
  RDMA/siw: reclassify sockets in order to avoid false positives from lockdep
  RDMA/bng_re: Remove prefetch instruction
  RDMA/core: Reduce cond_resched() frequency in __ib_umem_release
  RDMA/irdma: Fix SRQ shadow area address initialization
  RDMA/irdma: Remove doorbell elision logic
  RDMA/irdma: Do not set IBK_LOCAL_DMA_LKEY for GEN3+
  RDMA/irdma: Do not directly rely on IB_PD_UNSAFE_GLOBAL_RKEY
  RDMA/irdma: Add missing mutex destroy
  RDMA/irdma: Fix SIGBUS in AEQ destroy
  RDMA/irdma: Add a missing kfree of struct irdma_pci_f for GEN2
  RDMA/irdma: Fix data race in irdma_free_pble
  RDMA/irdma: Fix data race in irdma_sc_ccq_arm
  RDMA/mlx5: Add support for 1600_8x lane speed
  RDMA/core: Add new IB rate for XDR (8x) support
  IB/mlx5: Reduce IMR KSM size when 5-level paging is enabled
  RDMA/bnxt_re: Pass correct flag for dma mr creation
  RDMA/bnxt_re: Fix the inline size for GenP7 devices
  RDMA/hns: Support reset recovery for bond
  RDMA/hns: Support link state reporting for bond
  ...

RDMA/rtrs: server: Fix error handling in get_or_create_srv

2025-11-10T08:09:41+00:00

After device_initialize() is called, use put_device() to release the
device according to kernel device management rules. While direct
kfree() work in this case, using put_device() is more correct.

Found by code review.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Ma Ke 
Link: https://patch.msgid.link/20251110005158.13394-1-make24@iscas.ac.cn
Acked-by: Jack Wang 
Signed-off-by: Leon Romanovsky

IB/isert: add WQ_PERCPU to alloc_workqueue users

2025-11-09T11:30:58+00:00

Currently if a user enqueues a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo 
Signed-off-by: Marco Crivellari 
Link: https://patch.msgid.link/20251107133626.190952-1-marco.crivellari@suse.com
Signed-off-by: Leon Romanovsky

IB/iser: add WQ_PERCPU to alloc_workqueue users

2025-11-09T11:30:48+00:00

Currently if a user enqueues a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo 
Signed-off-by: Marco Crivellari 
Link: https://patch.msgid.link/20251107133306.187939-1-marco.crivellari@suse.com
Signed-off-by: Leon Romanovsky

IB/IPoIB: Add support for hwtstamp get/set ndos

2025-10-31T23:36:43+00:00

Add support for the ndo_hwtstamp_get and ndo_hwtstamp_set operations in
IPoIB. This allows lower devices to handle hardware timestamp
configuration through the new ndos instead of the legacy ioctls.

Signed-off-by: Carolina Jubran 
Signed-off-by: Tariq Toukan 
Link: https://patch.msgid.link/1761819910-1011051-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski

RDMA: Use %pe format specifier for error pointers

2025-09-21T11:34:49+00:00

Convert error logging throughout the RDMA subsystem to use
the %pe format specifier instead of PTR_ERR() with integer
format specifiers.

Link: https://patch.msgid.link/e81ec02df1e474be20417fb62e779776e3f47a50.1758217936.git.leon@kernel.org
Reviewed-by: Zhu Yanjun 
Signed-off-by: Leon Romanovsky

IB/ipoib: Ignore L3 master device

2025-09-18T09:20:48+00:00

Currently, all master upper netdevices (e.g., bond, VRF) are treated
equally.

When a VRF netdevice is used over an IPoIB netdevice, the expected
netdev resolution is on the lower IPoIB device which has the IP address
assigned to it and not the VRF device.

The rdma_cm module (CMA) tries to match incoming requests to a
particular netdevice. When successful, it also validates that the return
path points to the same device by performing a routing table lookup.
Currently, the former would resolve to the VRF netdevice, while the
latter to the correct lower IPoIB netdevice, leading to failure in
rdma_cm.

Improve this by ignoring the VRF master netdevice, if it exists, and
instead return the lower IPoIB device.

Signed-off-by: Vlad Dumitrescu 
Reviewed-by: Parav Pandit 
Signed-off-by: Edward Srouji 
Link: https://patch.msgid.link/20250916111103.84069-5-edwards@nvidia.com
Signed-off-by: Leon Romanovsky

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

2025-07-31T19:19:55+00:00

Pull rdma updates from Jason Gunthorpe:

 - Various minor code cleanups and fixes for hns, iser, cxgb4, hfi1,
   rxe, erdma, mana_ib

 - Prefetch supprot for rxe ODP

 - Remove memory window support from hns as new device FW is no longer
   support it

 - Remove qib, it is very old and obsolete now, Cornelis wishes to
   restructure the hfi1/qib shared layer

 - Fix a race in destroying CQs where we can still end up with work
   running because the work is cancled before the driver stops
   triggering it

 - Improve interaction with namespaces:
     * Follow the devlink namespace for newly spawned RDMA devices
     * Create iopoib net devces in the parent IB device's namespace
     * Allow CAP_NET_RAW checks to pass in user namespaces

 - A new flow control scheme for IB MADs to try and avoid queue
   overflows in the network

 - Fix 2G message sizes in bnxt_re

 - Optimize mkey layout for mlx5 DMABUF

 - New "DMA Handle" concept to allow controlling PCI TPH and steering
   tags

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (71 commits)
  RDMA/siw: Change maintainer email address
  RDMA/mana_ib: add support of multiple ports
  RDMA/mlx5: Refactor optional counters steering code
  RDMA/mlx5: Add DMAH support for reg_user_mr/reg_user_dmabuf_mr
  IB: Extend UVERBS_METHOD_REG_MR to get DMAH
  RDMA/mlx5: Add DMAH object support
  RDMA/core: Introduce a DMAH object and its alloc/free APIs
  IB/core: Add UVERBS_METHOD_REG_MR on the MR object
  net/mlx5: Add support for device steering tag
  net/mlx5: Expose IFC bits for TPH
  PCI/TPH: Expose pcie_tph_get_st_table_size()
  RDMA/mlx5: Fix incorrect MKEY masking
  RDMA/mlx5: Fix returned type from _mlx5r_umr_zap_mkey()
  RDMA/mlx5: remove redundant check on err on return expression
  RDMA/mana_ib: add additional port counters
  RDMA/mana_ib: Fix DSCP value in modify QP
  RDMA/efa: Add CQ with external memory support
  RDMA/core: Add umem "is_contiguous" and "start_dma_addr" helpers
  RDMA/uverbs: Add a common way to create CQ with umem
  RDMA/mlx5: Optimize DMABUF mkey page size
  ...