| Age | Commit message | Author |
|
When splitting and restoring vmas for madvise, we only copied the
XE_VMA_SYSTEM_ALLOCATOR flag. That meant we lost flags for read_only,
dumpable and sparse (in case anyone would call madvise for the latter).
Instead, define a mask of relevant flags and ensure all are replicated.
To simplify this and make the code a bit less fragile, remove the
conversion to VMA_CREATE flags and instead just pass around the
gpuva flags after initial conversion from user-space.
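As a rough sketch of the masking approach (the mask name below is made up; the individual flag definitions are the existing xe / GPUVA ones):

  /* Illustrative only: mask name is an assumption, flags as defined in xe/drm_gpuvm. */
  #define XE_VMA_MADVISE_PRESERVED_FLAGS (XE_VMA_SYSTEM_ALLOCATOR | \
                                          XE_VMA_READ_ONLY |        \
                                          XE_VMA_DUMPABLE |         \
                                          DRM_GPUVA_SPARSE)

  static void xe_vma_replicate_flags(struct xe_vma *dst, const struct xe_vma *src)
  {
          /* Carry over every relevant flag, not just XE_VMA_SYSTEM_ALLOCATOR. */
          dst->gpuva.flags |= src->gpuva.flags & XE_VMA_MADVISE_PRESERVED_FLAGS;
  }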
Fixes: a2eb8aec3ebe ("drm/xe: Reset VMA attributes to default in SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20251015170726.178685-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit b3af8658ec70f2196190c66103478352286aba3b)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Fix some duplicate includes in xe:
./drivers/gpu/drm/xe/xe_tlb_inval.c: xe_tlb_inval.h is included more than once.
./drivers/gpu/drm/xe/xe_pt.c: xe_tlb_inval_job.h is included more than once.
While at it, also sort the include lines alphabetically.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24705
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24706
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
[Reword commit message]
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250916021039.1632766-1-yang.lee@linux.alibaba.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Introduce an xe_bo_create_pin_map_novm() function that does not
take the drm_exec parameter to simplify the conversion of many
callsites.
For the rest, ensure that the same drm_exec context that was used
for locking the vm is passed down to validation.
Use xe_validation_guard() where appropriate.
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Break out the change to pf_provision_vf_lmem8 to a separate
patch.
- Adapt to signature change of xe_validation_guard().
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-12-thomas.hellstrom@linux.intel.com
|
|
We now use the same notifier lock for SVM and userptr; with that, we can
combine xe_pt_userptr_pre_commit and xe_pt_svm_pre_commit.
v2: (Matt B)
- Re-use xe_svm_notifier_lock/unlock for userptr.
- Combine svm/userptr handling further down into op_check_svm_userptr.
v3:
- Only hide the ops if we lack DRM_GPUSVM, since we also need them for
userptr.
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-18-matthew.auld@intel.com
|
|
The goal here is to cut over to gpusvm and remove xe_hmm, relying instead on
common code. The core facilities we need are get_pages(), unmap_pages()
and free_pages() for a given userptr range, plus a vm level notifier
lock, which is now provided by gpusvm.
v2:
- Reuse the same SVM vm struct we use for full SVM, that way we can
use the same lock (Matt B & Himal)
v3:
- Re-use svm_init/fini for userptr.
v4:
- Allow building xe without userptr if we are missing DRM_GPUSVM
config. (Matt B)
- Always make .read_only match xe_vma_read_only() for the ctx. (Dafna)
v5:
- Fix missing conversion with CONFIG_DRM_XE_USERPTR_INVAL_INJECT
v6:
- Convert the new user in xe_vm_madvise.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Dafna Hirschfeld <dafna.hirschfeld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-17-matthew.auld@intel.com
|
|
This will simplify compiling out the bits that depend on DRM_GPUSVM in a
later patch. Without this we end up littering the code with ifdef
checks, plus it becomes hard to be sure that something won't blow up at
runtime due to something not being initialised, even though it passed
the build. Should be no functional change here.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-16-matthew.auld@intel.com
|
|
Pull the pages stuff from the svm range into its own substructure, with
the idea of having the main pages-related routines, like get_pages(),
unmap_pages() and free_pages(), all operating on some lower-level
structures, which can then be re-used for stuff like userptr.
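A structural sketch of the split (struct and field names here are assumptions, not the exact xe layout):

  /* Pages state usable by both SVM ranges and, later, userptr. */
  struct xe_svm_pages {
          dma_addr_t *dma_addr;           /* device mappings of the backing pages */
          unsigned long notifier_seq;     /* validity seqno, moved in from the range */
  };

  struct xe_svm_range {
          struct drm_gpusvm_range base;   /* core GPUSVM range */
          struct xe_svm_pages pages;      /* everything get/unmap/free_pages() touch */
          u8 tile_present;
          u8 tile_invalidated;
  };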
v2:
- Move seq into pages struct (Matt B)
v3:
- Small kernel-doc fixes
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-13-matthew.auld@intel.com
|
|
Decouple TLB invalidations from the GT by updating the TLB invalidation
layer to accept a `struct xe_tlb_inval` instead of a `struct xe_gt`.
Also, rename *gt_tlb* to *tlb*. The internals of the TLB invalidation
code still operate on a GT, but this is now hidden from the rest of the
driver.
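Roughly, the interface change looks like the following sketch (types and names abridged; treat them as assumptions rather than the exact xe declarations):

  struct xe_tlb_inval {
          void *private;  /* backend data; resolving back to a GT stays internal */
          /* seqno tracking, pending-fence list, timeout worker, ... */
  };

  /* Before: callers passed the GT directly, e.g.
   *   int xe_gt_tlb_invalidation_range(struct xe_gt *gt, ...);
   * After: callers only see the TLB invalidation object. */
  int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
                         struct xe_tlb_inval_fence *fence,
                         u64 start, u64 end, u32 asid);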
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250826182911.392550-7-stuart.summers@intel.com
|
|
If the platform does not support atomic access on system memory, and the
ranges are in system memory, but the user requires atomic accesses on
the VMA, then migrate the ranges to VRAM. Apply this policy for prefetch
operations as well.
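A minimal sketch of the policy check (helper and field names are assumptions):

  static bool need_vram_for_atomic(struct xe_device *xe, struct xe_vma *vma)
  {
          /* VMA requires GPU atomics but the device can't do them on system memory. */
          return xe_vma_requires_atomic_access(vma) &&      /* assumed helper */
                 !xe->info.has_device_atomics_on_smem;      /* assumed field  */
  }

Prefetch goes through the same decision, so prefetching such a VMA also pulls the ranges into VRAM.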
v2:
- Drop unnecessary vm_dbg
v3 (Matthew Brost)
- fix atomic policy
- prefetch shouldn't have any impact on atomics
- bo can be accessed from vma, avoid duplicate parameter
v4 (Matthew Brost)
- Remove TODO comment
- Fix comment
- Don't allow gpu atomic ops when user is setting atomic attr as CPU
v5 (Matthew Brost)
- Fix atomic checks
- Add userptr checks
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-10-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Introduce xe_svm_ranges_zap_ptes_in_range(), a function to zap page table
entries (PTEs) for all SVM ranges within a user-specified address range.
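A call-shape sketch of how the helper is meant to be used, assuming it returns a tile mask and that the TLB invalidation helper name below is made up:

  static void zap_and_invalidate(struct xe_vm *vm, u64 start, u64 end)
  {
          u8 tile_mask;

          /* Zap and invalidate under the GPU SVM notifier lock (see v2 note). */
          xe_svm_notifier_lock(vm);
          tile_mask = xe_svm_ranges_zap_ptes_in_range(vm, start, end);
          if (tile_mask)
                  issue_range_tlb_inval(vm, start, end, tile_mask);  /* assumed */
          xe_svm_notifier_unlock(vm);
  }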
v2 (Matthew Brost)
- Lock should be called even for tlb_invalidation
v3 (Matthew Brost)
- Update comment
- s/notifier->itree.start/drm_gpusvm_notifier_start
- s/notifier->itree.last + 1/drm_gpusvm_notifier_end
- use WRITE_ONCE
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-8-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
The PAT index determines how PTEs are encoded and can be modified by
madvise. Therefore, it is now part of the vma attributes.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-4-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
For non-leaf paging structures we end up selecting a random index
between [0, 3], depending on the first user if the page-table is shared,
since non-leaf structures only have two bits in the HW for encoding the
PAT index, and here we are just passing along the full user provided
index, which can be an index as large as ~31 on xe2+. The user provided
index is meant for the leaf node, which maps the actual BO pages where
we have more PAT bits, and not the non-leaf nodes which are only mapping
other paging structures, and so only needs a minimal PAT index range.
Also the chosen index might need to consider how the driver mapped the
paging structures on the host side, like wc vs wb, which is separate
from the user provided index.
With that, move the PDE PAT index selection under driver control. For now
just use a coherent index on platforms with page-tables that are cached
on host side, and incoherent otherwise. Using a coherent index could
potentially be expensive, and would be overkill if we know the page-table
is always uncached on host side.
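A sketch of the selection helper with assumed names; the real patch documents this in its own helper:

  static u16 pde_pat_index(struct xe_device *xe)
  {
          /*
           * Page-tables cached on the host side want a coherent index so the
           * GPU observes CPU writes; otherwise an incoherent index is enough
           * and cheaper.
           */
          if (xe_device_pt_cached_on_host(xe))       /* assumed helper */
                  return xe->pat.idx[XE_CACHE_WB];   /* coherent index */
          return xe->pat.idx[XE_CACHE_NONE];         /* incoherent index */
  }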
v2 (Stuart):
- Add some documentation and split into separate helper.
BSpec: 59510
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250808103455.462424-2-matthew.auld@intel.com
|
|
Rather than open-coding GT TLB invalidations in the PT layer, use GT TLB
invalidation jobs. The real benefit is that GT TLB invalidation jobs use
a single dma-fence context, allowing the generated fences to be squashed
in dma-resv/DRM scheduler.
v2:
- s/;;/; (checkpatch)
- Move ijob/mjob job push after range fence install
v3:
- Remove extra newline (Stuart)
- Set ijob/mjob near creation (Stuart)
- Add comment back in (Stuart)
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250724191216.4076566-7-matthew.brost@intel.com
|
|
If a range or VMA is invalidated and the scratch page is disabled, there
is no reason to issue a TLB invalidation on unbind, so skip the TLB
invalidation when this condition is true. This is an opportunistic check
as it is done without the notifier lock, thus it is possible for the range
to be invalidated after this check is performed.
This should improve performance of the SVM garbage collector; for
example, xe_exec_system_allocator --r many-stride-new-prefetch went from
~20s to ~9.5s on a BMG.
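The opportunistic condition, sketched with an assumed range helper (the v3 caveats about higher-level PTE removal and VMAs are omitted for brevity):

  static bool can_skip_unbind_tlb_inval(struct xe_vm *vm,
                                        struct xe_svm_range *range)
  {
          /* A race after this check only costs an extra invalidation, never a missed one. */
          return !xe_vm_has_scratch(vm) &&
                 !xe_svm_range_has_valid_mapping(range);  /* assumed helper */
  }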
v2:
- Use helper for valid check (Thomas)
v3:
- Avoid skipping TLB invalidation if PTEs are removed at a higher
level than the range
- Never skip TLB invalidations for VMA
- Drop Himal's RB
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250616063024.2059829-3-matthew.brost@intel.com
|
|
Rather than having multiple READ_ONCE of the tile_* fields and comments
in code, use a helper with kernel doc as a single access point with clear
rules.
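A sketch of the helper, matching the documented semantics (treat the exact signature as an assumption):

  /* Callers doing opportunistic checks pass READ_ONCE() copies of the fields. */
  static inline bool xe_vm_has_valid_gpu_mapping(struct xe_tile *tile,
                                                 u8 tile_present,
                                                 u8 tile_invalidated)
  {
          /* Usable only if bound on this tile and not since invalidated. */
          return (tile_present & ~tile_invalidated) & BIT(tile->id);
  }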
v3:
- s/xe_vm_has_valid_gpu_pages/xe_vm_has_valid_gpu_mapping
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250616063024.2059829-2-matthew.brost@intel.com
|
|
No need to kill on -ENODATA, as this non-fatal error can occur when MMU
notifiers race with prefetches.
Fixes: 09ba0a8f06cd ("drm/xe/svm: Implement prefetch support for SVM ranges")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250613231808.752616-1-matthew.brost@intel.com
|
|
Document VMA tile_invalidated access rules, use READ_ONCE / WRITE_ONCE
for opportunistic checks of tile_present and tile_invalidated, move
tile_invalidated state change from page fault handler to PT code under
the correct locks, and add lockdep asserts to TLB invalidation paths.
v2:
- Assert VM dma-resv lock rather than BO in zap PTEs
v3:
- Back to BO's dma-resv lock, adjust documentation
v4:
- Add WRITE_ONCE in xe_vm_invalidate_vma (Thomas)
- Change lockdep assert for userptr in xe_vm_invalidate_vma (CI)
- Take userptr notifier lock in read mode in xe_vm_userptr_pin before
calling xe_vm_invalidate_vma (CI)
v5:
- Fix typos (Thomas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250602164412.1912293-1-matthew.brost@intel.com
|
|
This commit adds prefetch support for SVM ranges, utilizing the
existing ioctl vm_bind functionality to achieve this.
v2: rebase
v3:
- use xa_for_each() instead of manual loop
- check range is valid and in preferred location before adding to
xarray
- Fix naming conventions
- Fix return condition as -ENODATA instead of -EAGAIN (Matthew Brost)
- Handle sparsely populated cpu vma range (Matthew Brost)
v4:
- fix end address to find next cpu vma in case of -ENOENT
v5:
- Move find next vma logic to drm gpusvm layer
- Avoid mixing declaration and logic
v6:
- Use new function names
- Move eviction logic to prefetch_ranges
v7:
- devmem_only assigned 0
- Address nits
v8:
- initialize ctx with 0
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-15-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Introduce a helper to update the range's tile masks for bindings that are
present and invalidated. Add a lockdep_assert to ensure it is protected by
the GPU SVM notifier lock.
v7:
- Rebased
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-4-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Mixing GPU and CPU atomics does not work unless a strict migration
policy is enforced where GPU atomics must be in device memory. Enforce a
policy of must-be-in-VRAM with a retry loop of 3 attempts; if the retry
loop fails, abort the fault.
Remove the always_migrate_to_vram modparam as we now have a real
migration policy.
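A control-flow sketch of the retry policy (helper names and the error code are assumptions):

  #define ATOMIC_MIGRATE_RETRIES 3

  static int handle_atomic_fault(struct xe_svm_range *range)
  {
          int i;

          for (i = 0; i < ATOMIC_MIGRATE_RETRIES; i++) {
                  if (migrate_range_to_vram(range))        /* assumed helper */
                          continue;
                  if (xe_svm_range_in_vram(range))         /* assumed helper */
                          return 0;
          }
          return -EACCES;  /* retries exhausted, abort the fault */
  }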
v2:
- Only retry migration on atomics
- Drop always-migrate modparam
v3:
- Only set vram_only on DGFX (Himal)
- Bail on get_pages failure if vram_only and retry count exceeded (Himal)
- s/vram_only/devmem_only
- Update xe_svm_range_is_valid to accept devmem_only argument
v4:
- Fix logic bug get_pages failure
v5:
- Fix commit message (Himal)
- Mention removing always_migrate_to_vram in commit message (Lucas)
- Fix xe_svm_range_is_valid to check for devmem pages
- Bail on devmem_only && !migrate_devmem (Thomas)
v6:
- Add READ_ONCE barriers for opportunistic checks (Thomas)
- Pair READ_ONCE with WRITE_ONCE (Thomas)
v7:
- Adjust comments (Thomas)
Fixes: 2f118c949160 ("drm/xe: Add SVM VRAM migration")
Cc: stable@vger.kernel.org
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250512135500.1405019-3-matthew.brost@intel.com
|
|
When a vm runs under fault mode, if scratch page is enabled, we need
to clear the scratch page mapping on vm_bind for the vm_bind address
range. Under fault mode, we depend on recoverable page fault to
establish mapping in page table. If scratch page is not cleared, GPU
access of address won't cause page fault because it always hits the
existing scratch page mapping.
When vm_bind is called with the IMMEDIATE flag, there is no need for
clearing as an immediate bind can overwrite the scratch page mapping.
So far only xe2 and xe3 products are allowed to enable scratch page
under fault mode. On other platforms we don't allow scratch page under
fault mode, so there is no need for such clearing.
v2: Rework vm_bind pipeline to clear scratch page mapping. This is similar
to a map operation, with the exception that PTEs are cleared instead of
pointing to valid physical pages. (Matt, Thomas)
TLB invalidation is needed after clearing the scratch page mapping as a
larger scratch page mapping could be backed by a physical page and cached
in the TLB. (Matt, Thomas)
v3: Fix the case of clearing huge pte (Thomas)
Improve commit message (Thomas)
v4: TLB invalidation on all LR cases, not only the clear on bind
cases (Thomas)
v5: Misc cosmetic changes (Matt)
Drop pt_update_ops.invalidate_on_bind. Directly wire
xe_vma_op.map.invalidate_on_bind to bind_op_prepare/commit (Matt)
v6: checkpatch fix (Matt)
v7: No need to check platform needs_scratch deciding invalidate_on_bind
(Matt)
v8: rebase
v9: rebase
v10: fix an error in xe_pt_stage_bind_entry, introduced in v9 rebase
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250403165328.2438690-3-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Some users apply PINNED and some don't when using pin_map(). The pin in
pin_map() should imply PINNED so just unconditionally apply it and clean
up all users.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-14-matthew.auld@intel.com
|
|
With the idea of having more pinned objects use the blitter engine
where possible during suspend/resume, mark the pinned objects whose
restore can be done during the late phase, once submission/migration has
been set up. Start out simple with LRCs and page-tables from userspace.
v2:
- s/early_restore/late_restore; early restore was way too bold with too
many places being impacted at once.
v3:
- Split late vs early into separate lists, to align with newly added
apply-to-pinned infra.
v4:
- Rebase.
v5:
- Make sure we restore the late phase kernel_bo_present in igpu.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-13-matthew.auld@intel.com
|
|
The structure was missing a proper kerneldoc header, and once
that was added, a number of typos and errors became obvious.
Fix those.
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Closes: https://lore.kernel.org/intel-xe/x53tcs5bjldw6lcorjemuheklxcmepdvr2u7lvt3hpqrzqoc4h@nsu6hs25taqj/
Fixes: b2d4b03b03a7 ("drm/xe: Make the PT code handle placement per PTE rather than per vma / range")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250402122924.25526-1-thomas.hellstrom@linux.intel.com
|
|
With SVM, ranges forwarded to the PT code for binding can, mostly
due to races when migrating, point to both VRAM and system / foreign
device memory. Make the PT code able to handle that by checking,
for each PTE set up, whether it points to local VRAM or to system
memory.
v2:
- Fix system memory GPU atomic access.
v3:
- Avoid the UAPI change. It needs more thought.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250326080551.40201-6-thomas.hellstrom@linux.intel.com
|
|
Don't rely on CONFIG_DRM_GPUSVM because other drivers may enable it
causing us to compile in SVM support unintentionally.
Also take the opportunity to leave more code out of compilation if
!CONFIG_DRM_XE_GPUSVM and !CONFIG_DRM_XE_DEVMEM_MIRROR.
v3:
- Fixes for compilation errors on 32-bit. This changes the Kconfig
logic a bit.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250326080551.40201-2-thomas.hellstrom@linux.intel.com
|
|
Add some useful SVM debug logging for SVM ranges which prints the
range's state.
v2:
- Update logging with latest structure layout
v3:
- Better commit message (Thomas)
- New range structure (Thomas)
- s/COLLECTOT/COLLECTOR (Thomas)
v4:
- Drop partial evict message (Thomas)
- Use %p for pointers print (Thomas)
v6:
- Cast dma_addr to u64 (CI)
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-30-matthew.brost@intel.com
|
|
Add unbind to the SVM garbage collector. To facilitate this, add an unbind
support function to the VM layer which unbinds a SVM range. Also teach the
PT layer to understand unbinds of SVM ranges.
v3:
- s/INVALID_VMA/XE_INVALID_VMA (Thomas)
- Kernel doc (Thomas)
- New GPU SVM range structure (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v4:
- Use xe_vma_op_unmap_range (Himal)
v5:
- s/PY/PT (Thomas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-17-matthew.brost@intel.com
|
|
Add (re)bind to the SVM page fault handler. To facilitate this, add a
support function to the VM layer which (re)binds a SVM range. Also teach
the PT layer to understand (re)binds of SVM ranges.
v2:
- Don't assert BO lock held for range binds
- Use xe_svm_notifier_lock/unlock helper in xe_svm_close
- Use drm_pagemap dma cursor
- Take notifier lock in bind code to check range state
v3:
- Use new GPU SVM range structure (Thomas)
- Kernel doc (Thomas)
- s/DRM_GPUVA_OP_USER/DRM_GPUVA_OP_DRIVER (Thomas)
v5:
- Kernel doc (Thomas)
v6:
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Tested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-15-matthew.brost@intel.com
|
|
Add SVM range invalidation vfunc which invalidates PTEs. A new PT layer
function which accepts a SVM range is added to support this. In
addition, add the basic page fault handler which allocates a SVM range
that is then used by the SVM range invalidation vfunc.
v2:
- Don't run invalidation if VM is closed
- Cycle notifier lock in xe_svm_close
- Drop xe_gt_tlb_invalidation_fence_fini
v3:
- Better commit message (Thomas)
- Add lockdep asserts (Thomas)
- Add kernel doc (Thomas)
- s/change/changed (Thomas)
- Use new GPU SVM range / notifier structures
- Ensure PTEs are zapped / dma mappings are unmapped on VM close (Thomas)
v4:
- Fix macro (Checkpatch)
v5:
- Use range start/end helpers (Thomas)
- Use notifier start/end helpers (Thomas)
v6:
- Use min/max helpers (Himal)
- Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-13-matthew.brost@intel.com
|
|
Clear root PT entry and invalidate entire VM's address space when
closing the VM. Will prevent the GPU from accessing any of the VM's
memory after closing.
v2:
- s/vma/vm in kernel doc (CI)
- Don't nuke migration VM as this occurs at driver unload (CI)
v3:
- Rebase and pull into SVM series (Thomas)
- Wait for pending binds (Thomas)
v5:
- Remove xe_gt_tlb_invalidation_fence_fini in error case (Matt Auld)
- Drop local migration bool (Thomas)
v7:
- Add drm_dev_enter/exit protecting invalidation (CI, Matt Auld)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-12-matthew.brost@intel.com
|
|
Add the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag, which is used to
create unpopulated virtual memory areas (VMAs) without memory backing or
GPU page tables. These VMAs are referred to as CPU address mirror VMAs.
The idea is that upon a page fault or prefetch, the memory backing and
GPU page tables will be populated.
CPU address mirror VMAs only update GPUVM state; they do not have an
internal page table (PT) state, nor do they have GPU mappings.
It is expected that CPU address mirror VMAs will be mixed with buffer
object (BO) VMAs within a single VM. In other words, system allocations
and runtime allocations can be mixed within a single user-mode driver
(UMD) program.
Expected usage:
- Bind the entire virtual address (VA) space upon program load using the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- If a buffer object (BO) requires GPU mapping (runtime allocation),
allocate a CPU address using mmap(PROT_NONE), bind the BO to the
mmapped address using existing bind IOCTLs. If a CPU map of the BO is
needed, mmap it again to the same CPU address using mmap(MAP_FIXED)
- If a BO no longer requires GPU mapping, munmap it from the CPU address
space and then bind the mapping address with the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- Any malloc'd or mmapped CPU address accessed by the GPU will be
faulted in via the SVM implementation (system allocation).
- Upon freeing any mmapped or malloc'd data, the SVM implementation will
remove GPU mappings.
Only a 1-to-1 mapping between user address space and GPU address space is
supported at the moment, as that is the expected use case. The uAPI
defines an interface for non-1-to-1 mappings but enforces 1 to 1; this
restriction can be lifted if use cases arise for non-1-to-1 mappings.
This patch essentially short-circuits the code in the existing VM bind
paths to avoid populating page tables when the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag is set.
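A userspace-side sketch of the first step of the expected usage (field names follow the xe uAPI but are abridged here; consult the real xe_drm.h):

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <drm/xe_drm.h>

  /* Bind an entire VA range as a CPU address mirror: no BO, no PTEs yet. */
  static int bind_cpu_addr_mirror(int fd, uint32_t vm_id,
                                  uint64_t addr, uint64_t range)
  {
          struct drm_xe_vm_bind args = {
                  .vm_id = vm_id,
                  .num_binds = 1,
                  .bind = {
                          .addr = addr,
                          .range = range,
                          .op = DRM_XE_VM_BIND_OP_MAP,
                          .flags = DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR,
                  },
          };

          return ioctl(fd, DRM_IOCTL_XE_VM_BIND, &args);
  }

Subsequent faults or prefetches on addresses inside the mirrored range then populate the backing memory and GPU page tables.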
v3:
- Call vm_bind_ioctl_ops_fini on -ENODATA
- Don't allow DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR on non-faulting VMs
- s/DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATOR/DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR (Thomas)
- Rework commit message for expected usage (Thomas)
- Describe state of code after patch in commit message (Thomas)
v4:
- Fix alignment (Checkpatch)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306012657.3505757-9-matthew.brost@intel.com
|
|
Concurrent VM bind staging and zapping of PTEs from a userptr notifier
do not work because the view of PTEs is not stable. VM binds cannot
acquire the notifier lock during staging, as memory allocations are
required. To resolve this race condition, use a staging tree for VM
binds that is committed only under the userptr notifier lock during the
final step of the bind. This ensures a consistent view of the PTEs in
the userptr notifier.
A follow-up may use staging only for VMs in fault mode, as this is the
only mode in which the above race exists.
v3:
- Drop zap PTE change (Thomas)
- s/xe_pt_entry/xe_pt_entry_staging (Thomas)
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org>
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Fix fault mode invalidation racing with unbind leading to the
PTE zapping potentially traversing an invalid page-table tree.
Do this by holding the notifier lock across PTE zapping. This
might transfer any contention waiting on the notifier seqlock
read side to the notifier lock read side, but that shouldn't be
a major problem.
At the same time get rid of the open-coded invalidation in the bind
code by relying on the notifier even when the vma bind is not
yet committed.
Finally let userptr invalidation call a dedicated xe_vm function
performing a full invalidation.
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.12+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-4-thomas.hellstrom@linux.intel.com
|
|
Fix all typos in files of xe, reported by codespell tool.
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
Invalidation_fence_init takes a PM reference, which is released in its
_fini counterpart, so we need to make sure that the latter is called,
even if the fence is in an error state.
Since we already have a function that calls _fini() and signals the
fence in the tlb inval code, we can expose that and call it from the PT
code.
Fixes: f002702290fc ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: <stable@vger.kernel.org> # v6.11+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241206015022.1567113-1-daniele.ceraolospurio@intel.com
|
|
Use fault injection infrastructure to allow specific functions to
be configured over debugfs for failing during the execution of
xe_vm_create_ioctl() and xe_vm_bind_ioctl(). This allows more
thorough testing from user space by going through code paths for
error handling and unwinding which cannot be reached by simply
injecting errors in IOCTL arguments. This can help increase code
robustness.
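A sketch of the underlying mechanism: the patch tags the selected xe functions for the kernel's function error injection framework, after which failures can be requested at runtime through /sys/kernel/debug/fail_function. The function body below is a placeholder.

  #include <linux/error-injection.h>

  int xe_pt_update_ops_prepare(struct xe_tile *tile, struct xe_vma_ops *vops)
  {
          /* ... normal prepare work ... */
          return 0;
  }
  ALLOW_ERROR_INJECTION(xe_pt_update_ops_prepare, ERRNO);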
v2: Add xe_pt_update_ops_{prepare,run} (Matthew Brost)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241113162212.2154103-1-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
|
|
Make sure to call xe_pt_update_ops_fini in xe_pt_update_ops_abort to
free any memory the bind allocated.
Caught by kmemleak when running Vulkan CTS tests on LNL. The leak
seems to happen only when there's some kind of failure happening, like
the lack of memory. Example output:
unreferenced object 0xffff9120bdf62000 (size 8192):
comm "deqp-vk", pid 115008, jiffies 4310295728
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 1b 05 f9 28 01 00 00 40 ...........(...@
00 00 00 00 00 00 00 00 1b 15 f9 28 01 00 00 40 ...........(...@
backtrace (crc 7a56be79):
[<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0
[<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe]
[<ffffffffc08e8309>] xe_pt_insert_entry+0xb9/0x140 [xe]
[<ffffffffc08eab6d>] xe_pt_stage_bind_entry+0x12d/0x5b0 [xe]
[<ffffffffc08ecbca>] xe_pt_walk_range+0xea/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08e9eff>] xe_pt_stage_bind.constprop.0+0x25f/0x580 [xe]
[<ffffffffc08eb21a>] bind_op_prepare+0xea/0x6e0 [xe]
[<ffffffffc08ebab8>] xe_pt_update_ops_prepare+0x1c8/0x440 [xe]
[<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe]
[<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe]
[<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe]
[<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm]
unreferenced object 0xffff9120bdf72000 (size 8192):
comm "deqp-vk", pid 115008, jiffies 4310295728
hex dump (first 32 bytes):
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
backtrace (crc 23b2f0b5):
[<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0
[<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe]
[<ffffffffc08e8453>] xe_pt_stage_unbind_post_descend+0xb3/0x150 [xe]
[<ffffffffc08ecd26>] xe_pt_walk_range+0x246/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
[<ffffffffc08ece31>] xe_pt_walk_shared+0xc1/0x110 [xe]
[<ffffffffc08e7b2a>] xe_pt_stage_unbind+0x9a/0xd0 [xe]
[<ffffffffc08e913d>] unbind_op_prepare+0xdd/0x270 [xe]
[<ffffffffc08eb9f6>] xe_pt_update_ops_prepare+0x106/0x440 [xe]
[<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe]
[<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe]
[<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe]
[<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm]
[<ffffffffc05e95a0>] drm_ioctl+0x280/0x4e0 [drm]
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2877
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240927232228.3255246-1-matthew.brost@intel.com
|
|
Testing on LNL has shown media GT's TLBs need to be invalidated via the
GuC, update PT code appropriately.
v2:
- Do dma_fence_get before first call of invalidation_fence_init (Himal)
- No need to check for valid chain fence (Himal)
v3:
- Use dma-fence-array
Fixes: 3330361543fc ("drm/xe/lnl: Add LNL platform definition")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826170144.2492062-3-matthew.brost@intel.com
|
|
This reverts commit 40520283e0fd11237ed9dfc0991503b3403d5fa4.
We can't install dma-fence-chain in timeline sync objs.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240823162207.2168887-1-matthew.brost@intel.com
|
|
Testing on LNL has shown media GT's TLBs need to be invalidated via the
GuC, update PT code appropriately.
v2:
- Do dma_fence_get before first call of invalidation_fence_init (Himal)
- No need to check for valid chain fence (Himal)
Fixes: 3330361543fc ("drm/xe/lnl: Add LNL platform definition")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820161632.987369-1-matthew.brost@intel.com
|
|
We only set the last fence on user binds, so there is no need to check the
last fence on kernel-issued binds. This will avoid blowing up last fence
lockdep asserts.
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805200233.3050325-1-matthew.brost@intel.com
|
|
When restoring the children PT entries on a bind failure, the incorrect
loop index was used, resulting in PT entries being leaked. This is shown
by running xe_vm.bind-array-conflict-error-inject on a VRAM device going
into a suspend state after the test completes.
v2:
- s/childern/children (CI, Matt Auld)
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240723010230.1652707-1-matthew.brost@intel.com
|
|
Having two methods to wait on GT TLB invalidations is not ideal. Remove
xe_gt_tlb_invalidation_wait and only use GT TLB invalidation fences.
In addition to two methods being less than ideal, once GT TLB
invalidations are coalesced the seqno cannot be assigned during
xe_gt_tlb_invalidation_ggtt/range. Thus xe_gt_tlb_invalidation_wait
would not have a seqno to wait on. A fence, however, can be armed and
later signaled.
v3:
- Add explanation about coalescing to commit message
v4:
- Don't put dma fence if defined on stack (CI)
v5:
- Initialize ret to zero (CI)
v6:
- Use invalidation_fence_signal helper in tlb timeout (Matthew Auld)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-3-matthew.brost@intel.com
|
|
Other layers should not be touching struct xe_gt_tlb_invalidation_fence
directly, add helper for initialization.
v2:
- Add dma_fence_get and list init to xe_gt_tlb_invalidation_fence_init
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-2-matthew.brost@intel.com
|
|
Add VM bind IOCTL error injection which steals the MSB of the bind flags
field and, if set, injects errors at various points in the VM bind
IOCTL. Intended to validate error paths. Enabled by CONFIG_DRM_XE_DEBUG.
v4:
- Change define layout (Jonathan)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-8-matthew.brost@intel.com
|
|
Update the PT layer so that if a memory allocation for a PTE fails, the
error can be propagated to the user without requiring the VM to be killed.
v5:
- change return value invalidation_fence_init to void (Matthew Auld)
v7:
- Invert i,j usage in two places (Matthew Auld)
- s/0/NULL (Matthew Auld)
- Don't ignore return value of xe_pt_new_shared (Matthew Auld)
- Don't check for NULL in xe_pt_entry (Matthew Auld)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-7-matthew.brost@intel.com
|
|
This aligns with the uAPI, where an array of binds or a single bind that
results in multiple GPUVA ops is considered a single atomic operation.
The design is roughly:
- xe_vma_ops is a list of xe_vma_op (GPUVA op)
- each xe_vma_op resolves to 0-3 PT ops
- xe_vma_ops creates a single job
- if at any point during binding a failure occurs, xe_vma_ops contains
the information necessary to unwind the PT and VMA (GPUVA) state
v2:
- add missing dma-resv slot reservation (CI, testing)
v4:
- Fix TLB invalidation (Paulo)
- Add missing xe_sched_job_last_fence_add/test_dep check (Inspection)
v5:
- Invert i, j usage (Matthew Auld)
- Add helper to test and add job dep (Matthew Auld)
- Return on anything but -ETIME for cpu bind (Matthew Auld)
- Return -ENOBUFS if suballoc of BB fails due to size (Matthew Auld)
- s/do/Do (Matthew Auld)
- Add missing comma (Matthew Auld)
- Do not assign return value to xe_range_fence_insert (Matthew Auld)
v6:
- s/0x1ff/MAX_PTE_PER_SDI (Matthew Auld, CI)
- Check for too large SA in Xe to avoid triggering WARN (Matthew Auld)
- Fix checkpatch issues
v7:
- Rebase
- Support more than 510 PTE updates in a bind job (Paulo, mesa testing)
v8:
- Rebase
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-5-matthew.brost@intel.com
|
|
In multi-GPU environments it is important to know which device GT
events belong to. The tracing information now includes the device_id
to indicate the device the event is associated with.
v2: Use variable sized variant to display dev name (Gustavo)
v3: Pass single argument to __assign_str to fix kunit error
v4: Remove unused string_helper library include
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607182943.3572524-6-radhakrishna.sripada@intel.com
|
|
At xe_pt_zap_ptes_entry() and xe_pt_stage_unbind_entry, the level cannot
be 0. Therefore, add an independent check for the level. Since the level
cannot be zero at this point, there is no need to check for `is_compact`,
so remove that instead.
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521103623.11645-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|