summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe/xe_exec.c
diff options
context:
space:
mode:
authorThomas Hellström <thomas.hellstrom@linux.intel.com>2025-09-04 18:07:15 +0200
committerThomas Hellström <thomas.hellstrom@linux.intel.com>2025-09-05 17:03:35 +0200
commit599334572a5a99111015fbbd5152ce4dedc2f8b7 (patch)
treeee64a4c6aac5df6f74419495d2c7d6c588ad2d74 /drivers/gpu/drm/xe/xe_exec.c
parentebd546fdffddfcaeab08afdd68ec93052c8fa740 (diff)
drm/xe: Block exec and rebind worker while evicting for suspend / hibernate
When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or get stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker waiting for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com
Diffstat (limited to 'drivers/gpu/drm/xe/xe_exec.c')
-rw-r--r--drivers/gpu/drm/xe/xe_exec.c9
1 files changed, 9 insertions, 0 deletions
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 25a59b6934f6..1dc0c54cac5f 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -238,6 +238,15 @@ retry:
goto err_unlock_list;
}
+ /*
+ * It's OK to block interruptible here with the vm lock held, since
+ * on task freezing during suspend / hibernate, the call will
+ * return -ERESTARTSYS and the IOCTL will be rerun.
+ */
+ err = wait_for_completion_interruptible(&xe->pm_block);
+ if (err)
+ goto err_unlock_list;
+
vm_exec.vm = &vm->gpuvm;
vm_exec.flags = DRM_EXEC_INTERRUPTIBLE_WAIT;
if (xe_vm_in_lr_mode(vm)) {