| author | Ankit Agrawal <ankita@nvidia.com> | 2026-01-15 20:28:49 +0000 |
|---|---|---|
| committer | Alex Williamson <alex@shazbot.org> | 2026-01-19 10:06:31 -0700 |
| commit | e5f19b619fa0b691ccb537d72240bd20eb72087c (patch) | |
| tree | 6b40a5f1d1f77b39cd5001c8b4963c10a1de471b /include | |
| parent | 205e6d17cdf5b7f7b221bf64be9850eabce429c9 (diff) | |
vfio/nvgrace-gpu: register device memory for poison handling
The nvgrace-gpu module [1] maps the device memory to the user VA (QEMU)
without adding the memory to the kernel. The device memory pages are PFNMAP
and not backed by struct page. The module can thus utilize the MM's PFNMAP
memory_failure mechanism that handles ECC/poison on regions with no struct
pages.
The kernel MM code exposes register/unregister APIs allowing modules to
register the device memory for memory_failure handling. Make nvgrace-gpu
register the GPU memory with the MM on open.
The module registers its memory region and address_space with the
kernel MM for ECC handling and implements a callback function that converts
a PFN to the file page offset. The callback function checks that the
PFN belongs to the device memory region and is also contained in the
VMA range; otherwise it returns an error.
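The check described above can be sketched in plain userspace C. This is only an illustrative model, not the driver's code: the struct and function names (model_region, model_vma, model_pfn_to_pgoff) are hypothetical, the VMA is reduced to the first PFN it maps, its length in pages, and its file page offset, and -EFAULT is an assumed error code, since the message only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the registered device memory region, in PFNs. */
struct model_region {
	uint64_t start_pfn;	/* first PFN of the device memory */
	uint64_t nr_pages;	/* number of pages in the region */
};

/* Hypothetical, simplified stand-in for a VMA mapping part of the region. */
struct model_vma {
	uint64_t first_pfn;	/* PFN mapped at the start of the VMA */
	uint64_t nr_pages;	/* length of the VMA in pages */
	uint64_t vm_pgoff;	/* file page offset of the start of the VMA */
};

/*
 * Convert a PFN to a file page offset.
 * Returns 0 and stores the offset in *pgoff when the PFN lies inside both
 * the device memory region and the VMA range; returns -EFAULT (an assumed
 * error code) otherwise.
 */
int model_pfn_to_pgoff(const struct model_region *region,
		       const struct model_vma *vma,
		       uint64_t pfn, uint64_t *pgoff)
{
	/* The PFN must belong to the device memory region. */
	if (pfn < region->start_pfn ||
	    pfn - region->start_pfn >= region->nr_pages)
		return -EFAULT;

	/* The PFN must also fall within the range mapped by the VMA. */
	if (pfn < vma->first_pfn ||
	    pfn - vma->first_pfn >= vma->nr_pages)
		return -EFAULT;

	/* Offset into the file = VMA file offset + page index within the VMA. */
	*pgoff = vma->vm_pgoff + (pfn - vma->first_pfn);
	return 0;
}

/* Usage example: one PFN that is mapped and one that is outside the region. */
void model_example(void)
{
	struct model_region region = { .start_pfn = 0x1000, .nr_pages = 0x100 };
	struct model_vma vma = { .first_pfn = 0x1010, .nr_pages = 0x20,
				 .vm_pgoff = 0x10 };
	uint64_t pgoff = 0;

	assert(model_pfn_to_pgoff(&region, &vma, 0x1015, &pgoff) == 0);
	assert(pgoff == 0x15);
	assert(model_pfn_to_pgoff(&region, &vma, 0x0fff, &pgoff) == -EFAULT);
}
```

Both range checks subtract only after confirming the PFN is not below the base, so the unsigned arithmetic cannot wrap.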
Link: https://lore.kernel.org/all/20240220115055.23546-1-ankita@nvidia.com/ [1]
Suggested-by: Alex Williamson <alex@shazbot.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Reviewed-by: Jiaqi Yan <jiaqiyan@google.com>
Link: https://lore.kernel.org/r/20260115202849.2921-3-ankita@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
Diffstat (limited to 'include')
0 files changed, 0 insertions, 0 deletions
