linux-toradex.git/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h, branch v6.16-rc5

drm/amdkfd: Fix pasid value leak

2025-02-13T02:05:50+00:00

Curret kfd does not allocate pasid values, instead uses pasid value for each
vm from graphic driver. So should not prevent graphic driver from releasing
pasid values since the values are allocated by graphic driver, not kfd driver
anymore. This patch does not stop graphic driver release pasid values.

Fixes: 8544374c0f82 ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver")
Signed-off-by: Xiaogang Chen 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher

drm/amdgpu: Unlocked unmap only clear page table leaves

2025-02-13T02:05:49+00:00

SVM migration unmap pages from GPU and then update mapping to GPU to
recover page fault. Currently unmap clears the PDE entry for range
length >= huge page and free PTB bo, update mapping to alloc new PT bo.
There is race bug that the freed entry bo maybe still on the pt_free
list, reused when updating mapping and then freed, leave invalid PDE
entry and cause GPU page fault.

By setting the update to clear only one PDE entry or clear PTB, to
avoid unmap to free PTE bo. This fixes the race bug and improve the
unmap and map to GPU performance. Update mapping to huge page will
still free the PTB bo.

With this change, the vm->pt_freed list and work is not needed. Add
WARN_ON(unlocked) in amdgpu_vm_pt_free_dfs to catch if unmap to free the
PTB.

Signed-off-by: Philip Yang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher

drm/amdgpu: track bo memory stats at runtime

2024-12-19T15:56:28+00:00

Before, every time fdinfo is queried we try to lock all the BOs in the
VM and calculate memory usage from scratch. This works okay if the
fdinfo is rarely read and the VMs don't have a ton of BOs. If either of
these conditions is not true, we get a massive performance hit.

In this new revision, we track the BOs as they change states. This way
when the fdinfo is queried we only need to take the status lock and copy
out the usage stats with minimal impact to the runtime performance. With
this new approach however, we would no longer be able to track active
buffers.

Signed-off-by: Yunxiang Li 
Reviewed-by: Christian König 
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-6-Yunxiang.Li@amd.com
Signed-off-by: Christian König

drm/amdgpu: remove unused function parameter

2024-12-19T15:56:25+00:00

amdgpu_vm_bo_invalidate doesn't use the adev parameter and not all
callers have a reference to adev handy, so remove it for cleanliness.

Signed-off-by: Yunxiang Li 
Reviewed-by: Christian König 
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-5-Yunxiang.Li@amd.com
Signed-off-by: Christian König

drm/amdgpu: stop tracking visible memory stats

2024-11-04T16:32:40+00:00

Since on modern systems all of vram can be made visible anyways, to
simplify the new implementation, drops tracking how much memory is
visible for now. If this is really needed we can add it back on top of
the new implementation, or just report all the BOs as visible.

Signed-off-by: Yunxiang Li 
Reviewed-by: Christian König 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Alex Deucher

drm/amdgpu: Use drm_print_memory_stats helper from fdinfo

2024-10-08T13:43:25+00:00

Convert fdinfo memory stats to use the common drm_print_memory_stats
helper.

This achieves alignment with the common keys as documented in
drm-usage-stats.rst, adding specifically drm-total- key the driver was
missing until now.

Additionally I made the code stop skipping total size for objects which
currently do not have a backing store, and I added resident, active and
purgeable reporting.

Legacy keys have been preserved, with the outlook of only potentially
removing only the drm-memory- when the time gets right.

The example output now looks like this:

 pos:	0
 flags:	02100002
 mnt_id:	24
 ino:	1239
 drm-driver:	amdgpu
 drm-client-id:	4
 drm-pdev:	0000:04:00.0
 pasid:	32771
 drm-total-cpu:	0
 drm-shared-cpu:	0
 drm-active-cpu:	0
 drm-resident-cpu:	0
 drm-purgeable-cpu:	0
 drm-total-gtt:	2392 KiB
 drm-shared-gtt:	0
 drm-active-gtt:	0
 drm-resident-gtt:	2392 KiB
 drm-purgeable-gtt:	0
 drm-total-vram:	44564 KiB
 drm-shared-vram:	31952 KiB
 drm-active-vram:	0
 drm-resident-vram:	44564 KiB
 drm-purgeable-vram:	0
 drm-memory-vram:	44564 KiB
 drm-memory-gtt: 	2392 KiB
 drm-memory-cpu: 	0 KiB
 amd-memory-visible-vram:	44564 KiB
 amd-evicted-vram:	0 KiB
 amd-evicted-visible-vram:	0 KiB
 amd-requested-vram:	44564 KiB
 amd-requested-visible-vram:	11952 KiB
 amd-requested-gtt:	2392 KiB
 drm-engine-compute:	46464671 ns

v2:
 * Track purgeable via AMDGPU_GEM_CREATE_DISCARDABLE.

Acked-by: Daniel Vetter 
Reviewed-by: Alex Deucher 
Signed-off-by: Tvrtko Ursulin 
Cc: Alex Deucher 
Cc: Christian König 
Cc: Daniel Vetter 
Cc: Rob Clark 
Signed-off-by: Alex Deucher

drm/amdgpu: re-work VM syncing

2024-09-06T21:36:24+00:00

Rework how VM operations synchronize to submissions. Provide an
amdgpu_sync container to the backends instead of an reservation
object and fill in the amdgpu_sync object in the higher layers
of the code.

No intended functional change, just prepares for upcomming changes.

Signed-off-by: Christian König 
Reviewed-by: Friedrich Vock 
Acked-by: Felix Kuehling 
Signed-off-by: Alex Deucher

drm/amdkfd: Change kfd/svm page fault drain handling

2024-08-23T14:55:13+00:00

When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and
not handle any incoming pages fault of this process until a deferred work item
got executed by default system wq. The time period of "not handle page fault"
can be long and is unpredicable. That is advese to kfd performance on page
faults recovery.

This patch uses time stamp of incoming page fault to decide to drop or recover
page fault. When app unmap vm ranges kfd records each gpu device's ih ring
current time stamp. These time stamps are used at kfd page fault recovery
routine.

Any page fault happened on unmapped ranges after unmap events is application
bug that accesses vm range after unmap. It is not driver work to cover that.

By using time stamp of page fault do not need drain page faults at deferred
work. So, the time period that kfd does not handle page faults is reduced
and can be controlled.

Signed-off-by: Xiaogang.Chen 
Reviewed-by: Philip Yang 
Signed-off-by: Alex Deucher

drm/amdgpu: add additional VM bits

2024-06-14T19:20:56+00:00

Add additional VM PTE bits.

Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher

drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10

2024-06-05T15:25:13+00:00

This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10,
clear the bits before setting the new one.

Suggested-by: Alex Deucher 
Signed-off-by: longlyao 
Signed-off-by: Shane Xiao 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher