summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)Author
2025-12-02drm/xe/vf: Enable VF migration only on supported GuC versionsSatyanarayana K V P
Enable VF migration starting with GuC 70.54.0 (compatibility version 1.27.0) which supports additional VF2GUC_RESFIX_START message required to handle migration recovery in a more robust way. Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251201095011.21453-7-satyanarayana.k.v.p@intel.com
2025-12-02drm/{i915, xe}/display: make pxp key check part of bo interfaceJani Nikula
Add intel_bo_key_check() next to intel_bo_is_protected() where it feels like it belongs, and drop the extra pxp compat header. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251201172730.2154668-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-02drm/xe/compat: remove unused i915_active.h and i915_active_types.hJani Nikula
Commit 965930962a41 ("drm/i915/frontbuffer: Fix intel_frontbuffer lifetime handling") dropped the last xe display users of the headers. They're still used in intel_overlay.c, but it's not built as part of xe. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251201171050.2145833-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-01drm/xe: Implement DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATEMatthew Brost
Implement DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE which sets the exec queue default state to user data passed in. The intent is for a Mesa tool to use this replay GPU hangs. v2: - Enable the flag DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE - Fix the page size math calculation to avoid a crash v4: - Use vmemdup_user (Maarten) - Copy default state first into LRC, then replay state (Testing, Carlos) Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-10-matthew.brost@intel.com
2025-12-01drm/xe: Add replay_offset and replay_length lines to LRC HWCTX snapshotMatthew Brost
Add replay_offset and replay_length lines to LRC HWCTX snapshot with the idea being this information can be used extract the data which needs to be pass to exec queue extension DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE so GPU hang can be replayed via a Mesa tool. The additional lines look like: [HWCTX].replay_offset: 0x%x [HWCTX].replay_length: 0x%x Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-9-matthew.brost@intel.com
2025-12-01drm/xe: Add VM.uapi_flags to VM snapshot captureMatthew Brost
Add VM.uapi_flags to VM snapshot capture VM snapshot capture. This is useful information for debug and will help build a robust GPU hang replay tool. The current format is: VM.uapi_flags: 0x%x Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-7-matthew.brost@intel.com
2025-12-01drm/xe: Add cpu_caching to properties line in VM snapshot captureMatthew Brost
Add CPU caching to properties line in VM snapshot capture indicating the BO caching properites. This is useful information for debug and will help build a robust GPU hang replay tool. The current format is: [<vma address>]: <permissions>|<type>|mem_region=0x%x|pat_index=%d|cpu_caching=%d Permissions has two options, either "read_only" or "read_write". Type has three options, either "userptr", "null_sparse", or "bo". Memory region is a bit mask of where the memory is located. Pat index corresponds to the value setup upon VM bind. CPU caching corresponds to the value of BO setup upon creation. v2: - Save off cpu_caching value rather than looking at BO (Carlos) v4: - Fix NULL ptr dereference (Carlos) Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-6-matthew.brost@intel.com
2025-12-01drm/xe: Add pat_index to properties line in VM snapshot captureMatthew Brost
Add pat index to properties line in VM snapshot capture indicating the VMA caching properites. This is useful information for debug and will help build a robust GPU hang replay tool. The current format is: [<vma address>]: <permissions>|<type>|mem_region=0x%x|pat_index=%d Permissions has two options, either "read_only" or "read_write". Type has three options, either "userptr", "null_sparse", or "bo". Memory region is a bit mask of where the memory is located. Pat index corresponds to the value setup upon VM bind. Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-5-matthew.brost@intel.com
2025-12-01drm/xe: Add mem_region to properties line in VM snapshot captureMatthew Brost
Add memory region to properties line in VM snapshot capture indicating where the memory is located. The memory region corresponds to regions in the uAPI. This is useful information for debug and will help build a robust GPU hang replay tool. The current format is: [<vma address>]: <permissions>|<type>|mem_region=0x%x Permissions has two options, either "read_only" or "read_write". Type has three options, either "userptr", "null_sparse", or "bo". Memory region is a bit mask of where the memory is located. Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-4-matthew.brost@intel.com
2025-12-01drm/xe: Add "null_sparse" type to VM snap propertiesMatthew Brost
Add "null_sparse" type to VM snap properties indicating the VMA reads zero and writes are droppped. This is useful information for debug and will help build a robust GPU hang replay tool. The current format is: [<vma address>]: <permissions>|<type> Permissions has two options, either "read_only" or "read_write". Type has three options, either "userptr", "null_sparse", or "bo". Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-3-matthew.brost@intel.com
2025-12-01drm/xe: Add properties line to VM snapshot captureMatthew Brost
Add properties line to VM snapshot capture which includes additional information about VMA being dumped. This is helpful for debug purposes but also to build a robust GPU hang replay tool. The current format is: [<vma address>]: <permissions>|<type> Permissions has two options, either "read_only" or "read_write". Type has two options, either "userptr" or "bo". Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20251126185952.546277-2-matthew.brost@intel.com
2025-12-01drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg()Vinay Belgaumkar
Wa_14020316580 was getting clobbered by power gating init code later in the driver load sequence. Move the Wa so that it applies correctly. Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds") Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-01drm/xe: Fix freq kobject leak on sysfs_create_files failureShuicheng Lin
Ensure gt->freq is released when sysfs_create_files() fails in xe_gt_freq_init(). Without this, the kobject would leak. Add kobject_put() before returning the error. Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Alex Zuo <alex.zuo@intel.com> Reviewed-by: Xin Wang <x.wang@intel.com> Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-01drm/xe/xe3_lpg: Apply Wa_16028005424Balasubramani Vivekanandan
Applied Wa_16028005424 to Graphics version from 30.00 to 30.05 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patch.msgid.link/20251121100822.20076-2-balasubramani.vivekanandan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-01drm/{i915,xe}/display: drop intel_wakeref.h usageJani Nikula
Drop the display dependency on intel_wakeref.h header. The contract in the parent interface is that -ENOENT means there's no tracking. It doesn't actually require us to use a shared macro for it. Duplicate the macro in the few places that need this instead of inlining, primarily for documentation reasons. This allows us to remove the xe compat intel_wakeref.h header. v2: Define INTEL_WAKEREF_DEF in intel_display_power.h Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Link: https://patch.msgid.link/3599d0ec168d7ce7030582706acba66b616ab9f3.1764076995.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-01drm/i915/power: convert intel_wakeref_t to struct ref_tracker *Jani Nikula
Under the hood, intel_wakeref_t is just struct ref_tracker *. Use the actual underlying type both for clarity (we *are* using intel_wakeref_t as a pointer though it doesn't look like one) and to help i915, xe and display coexistence without custom types. v2: Keep intel_wakeref.h includes as they are Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Link: https://patch.msgid.link/f182bd26d5f9a00e843246d4aac8b25ff7531c51.1764076995.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-01drm/xe: Protect against unset LRC when pausing submissionsTomasz Lis
While pausing submissions, it is possible to encouner an exec queue which is during creation, and therefore doesn't have a valid xe_lrc struct reference. Protect agains such situation, by checking for NULL before access. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: c25c1010df88 ("drm/xe/vf: Replay GuC submission state on pause / unpause") Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251124222853.1900800-1-tomasz.lis@intel.com (cherry picked from commit 07cf4b864f523f01d2bb522a05813df30b076ba8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/vf: Start re-emission from first unsignaled job during VF migrationMatthew Brost
The LRC software ring tail is reset to the first unsignaled pending job's head. Fix the re-emission logic to begin submitting from the first unsignaled job detected, rather than scanning all pending jobs, which can cause imbalance. v2: - Include missing local changes v3: - s/skip_replay/restore_replay (Tomasz) Fixes: c25c1010df88 ("drm/xe/vf: Replay GuC submission state on pause / unpause") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://patch.msgid.link/20251121152750.240557-1-matthew.brost@intel.com (cherry picked from commit 00937fe1921ab346b6f6a4beaa5c38e14733caa3) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/pf: Use div_u64 when calculating GGTT profileMichal Wajdeczko
This will fix the following error seen on some 32-bit config: "ERROR: modpost: "__udivdi3" [drivers/gpu/drm/xe/xe.ko] undefined!" Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511150929.3vUi6PEJ-lkp@intel.com/ Fixes: e448372e8a8e ("drm/xe/pf: Use migration-friendly GGTT auto-provisioning") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251115151323.10828-1-michal.wajdeczko@intel.com (cherry picked from commit 0f4435a1f46efc3177eb082cd3f73e29da5ab86a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe: Fix memory leak when handling pagefault vmaMika Kuoppala
When the pagefault handling code was moved to a new file, an extra drm_exec_init() was added to the VMA path. This call is unnecessary because xe_validation_ctx_init() already performs a drm_exec_init(), resulting in a memory leak reported by kmemleak. Remove the redundant drm_exec_init() from the VMA pagefault handling code. Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: intel-xe@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251120161435.3674556-1-mika.kuoppala@linux.intel.com (cherry picked from commit 62519b77aecad22b525eda482660ffa127e7ad80) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/pf: Export helpers for VFIOMichał Winiarski
Device specific VFIO driver variant for Xe will implement VF migration. Export everything that's needed for migration ops. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-4-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 17f22465c5a5573724c942ca7147b4024631ef87) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/pci: Introduce a helper to allow VF access to PF xe_deviceMichał Winiarski
In certain scenarios (such as VF migration), VF driver needs to interact with PF driver. Add a helper to allow VF driver access to PF xe_device. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-3-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 8b3cce3ad9c78ce3dae1c178f99352d50e12a3c0) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/pf: Enable SR-IOV VF migrationMichał Winiarski
All of the necessary building blocks are now in place to support SR-IOV VF migration. Flip the enable/disable logic to match VF code and disable the feature only for platforms that don't meet the necessary prerequisites. To allow more testing and experiments, on DEBUG builds any missing prerequisites will be ignored. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-2-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 01c724aa7bf84e9d081a56e0cbf1d282678ce144) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-01drm/xe/pm: Add scope-based cleanup helper for runtime PMMatt Roper
Add a scope-based helpers for runtime PM that may be used to simplify cleanup logic and potentially avoid goto-based cleanup. For example, using guard(xe_pm_runtime)(xe); will get runtime PM and cause a corresponding put to occur automatically when the current scope is exited. 'xe_pm_runtime_noresume' can be used as a guard replacement for the corresponding 'noresume' variant. There's also an xe_pm_runtime_ioctl conditional guard that can be used as a replacement for xe_runtime_ioctl(): ACQUIRE(xe_pm_runtime_ioctl, pm)(xe); if ((ret = ACQUIRE_ERR(xe_pm_runtime_ioctl, &pm)) < 0) /* failed */ In a few rare cases (such as gt_reset_worker()) we need to ensure that runtime PM is dropped when the function is exited by any means (including error paths), but the function does not need to acquire runtime PM because that has already been done earlier by a different function. For these special cases, an 'xe_pm_runtime_release_only' guard can be used to handle the release without doing an acquisition. These guards will be used in future patches to eliminate some of our goto-based cleanup. v2: - Specify success condition for xe_pm runtime_ioctl as _RET >= 0 so that positive values will be properly identified as success and trigger destructor cleanup properly. v3: - Add comments to the kerneldoc for the existing 'get' functions indicating that scope-based handling should be preferred where possible. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251118164338.3572146-31-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 59e7528dbfd52efbed05e0f11b2143217a12bc74) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-11-28drm/xe/pf: Export helpers for VFIOMichał Winiarski
Device specific VFIO driver variant for Xe will implement VF migration. Export everything that's needed for migration ops. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-4-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-28drm/xe/pci: Introduce a helper to allow VF access to PF xe_deviceMichał Winiarski
In certain scenarios (such as VF migration), VF driver needs to interact with PF driver. Add a helper to allow VF driver access to PF xe_device. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-3-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-28drm/xe/pf: Enable SR-IOV VF migrationMichał Winiarski
All of the necessary building blocks are now in place to support SR-IOV VF migration. Flip the enable/disable logic to match VF code and disable the feature only for platforms that don't meet the necessary prerequisites. To allow more testing and experiments, on DEBUG builds any missing prerequisites will be ignored. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251127093934.1462188-2-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-28drm/xe/dsb: drop the unnecessary struct i915_vmaJani Nikula
Now that struct intel_dsb_buffer is opaque, it can be made unique to both drivers, and we can drop the unnecessary struct i915_vma part. Only the struct xe_bo part is needed. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patch.msgid.link/f0bba09d2f185fe3e7f3b803036f036d845a8cc4.1764155417.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28drm/{i915,xe}/dsb: make struct intel_dsb_buffer opaqueJani Nikula
Move the definitions of struct intel_dsb_buffer to the driver specific files, hiding the implementation details from the shared DSB code. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patch.msgid.link/08a8a7745042afcffa647f82ae23ebbeda0234c9.1764155417.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28drm/{i915, xe}/dsb: allocate struct intel_dsb_buffer dynamicallyJani Nikula
Prepare for hiding the struct intel_dsb_buffer implementation details from the generic DSB code. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patch.msgid.link/af94dc06c55a866efa9105ae0a8d244e4c6b17ab.1764155417.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28drm/{i915, xe}/dsb: make {intel, xe}_dsb_buffer.c independent of displayJani Nikula
The DSB buffer implementation is really independent of display. Pass struct drm_device instead of struct intel_crtc to intel_dsb_buffer_create(), and drop the intel_display_types.h include. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patch.msgid.link/a8cee08e8c36c2cf84cb9cda1b9f318db76710af.1764155417.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28drm/{i915,xe}/hdcp: use parent interface for HDCP GSC callsJani Nikula
The HDCP GSC implementation is different for both i915 and xe. Add it to the display parent interface, and call the hooks via the parent interface. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/e397073e91f8aa7518754b3b79f65c1936be91ad.1764090990.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28drm/i915: Use hw.active instead of uapi.active in the initial plane readoutVille Syrjälä
We're interested in the actual hardware state rather than the uapi state, so grab the crtc active flag from the correct spot. In practice the two will be identical here becase .get_initial_plane_config() will reject the initial FB when joiner is active. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251119181606.17129-5-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2025-11-28Merge tag 'drm-xe-next-fixes-2025-11-21' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - A couple of SR-IOV fixes (Michal Winiarski) - Fix a potential UAF (Sanjay) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/aSA08EW9JMU3LkIu@fedora
2025-11-27drm/xe/gt: Introduce runtime suspend/resumeRaag Jadav
If power state is retained between suspend/resume cycle, we don't need to perform full GT re-initialization. Introduce runtime helpers for GT which greatly reduce suspend/resume delay. v2: Drop redundant xe_gt_sanitize() and xe_guc_ct_stop() (Daniele) Use runtime naming for guc helpers (Daniele) v3: Drop redundant logging, add kernel doc (Michal) Use runtime naming for ct helpers (Michal) v4: Fix tags (Rodrigo) v5: Include host_l2_vram workaround (Daniele) Reuse xe_guc_submit_enable/disable() helpers (Daniele) Co-developed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20251030122357.128825-5-raag.jadav@intel.com
2025-11-27drm/xe/pm: Assert on runtime suspend if VFs are enabledRaag Jadav
We hold an additional reference to the runtime PM to keep PF in D0 during VFs lifetime, as our VFs do not implement the PM capability. This means we should never be runtime suspending as long as VFs are enabled. v8: Add !IS_SRIOV_VF() assert (Matthew Brost) Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20251030122357.128825-4-raag.jadav@intel.com
2025-11-27drm/xe/guc_submit: Introduce pause/unpause() helpers for PFRaag Jadav
Introduce pause/unpause() helpers which stop/start further runs of submission tasks on given GuC and can be called from PF context. This is in preparation of usecases where we simply need to stop/start the scheduler without losing GuC state and don't require dealing with VF migration. v7: Reword commit message (Matthew Brost) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20251030122357.128825-3-raag.jadav@intel.com
2025-11-27drm/xe/vf: Update pause/unpause() helpers with VF namingRaag Jadav
Now that pause/unpause() helpers have been updated for VF migration usecase, update their naming to match the functionality and while at it, add IS_SRIOV_VF() assert to make sure they are not abused. v7: Add IS_SRIOV_VF() assert (Matthew Brost) Use "vf" suffix (Michal) Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20251030122357.128825-2-raag.jadav@intel.com
2025-11-27drm/xe: Move VRAM MM debugfs creation to tile levelPiotr Piórkowski
Previously, VRAM TTM resource manager debugfs entries (vram0_mm / vram1_mm) were created globally in the XE debugfs root directory. But technically, each tile has an associated VRAM TTM manager, which it can own. Let's create VRAM memory manager debugfs entries directly under each tile's debugfs directory for better alignment with the per-tile memory layout. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251127073643.144379-1-piotr.piorkowski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-27drm/xe/xe_sriov_packet: Return int from pf_descriptor_initJonathan Cavitt
pf_descriptor_init currently returns a size_t, which is an unsigned integer data type. This conflicts with it returning a negative errno value on failure. Make it return an int instead. This mirrors how pf_trailer_init is used later. Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Alex Zuo <alex.zuo@intel.com> Link: https://patch.msgid.link/20251117190114.69953-2-jonathan.cavitt@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-26drm/xe: Covert return of -EBUSY to -ENOMEM in VM bind IOCTLMatthew Brost
xe_vma_userptr_pin_pages can return -EBUSY but -EBUSY has special meaning in VM bind IOCTLs that user fence is pending that is attached to the VMA. Convert -EBUSY to -ENOMEM in this case as -EBUSY in practice means we are low or out of memory. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patch.msgid.link/20251122012502.382587-2-matthew.brost@intel.com
2025-11-26drm/xe: Add caching pagetable flagZbigniew Kempczyński
Introduce device xe_caching_pt flag to selectively turn it on for supported platforms. It allows to eliminate version check and enable this feature for the future platforms. Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251125153732.400766-2-zbigniew.kempczynski@intel.com
2025-11-26drm/xe/vm: Skip ufence association for CPU address mirror VMA during MAPHimal Prasad Ghimiray
The MAP operation for a CPU address mirror VMA does not require ufence association because such mappings are not GPU-synchronized and do not participate in GPU job completion signaling. Remove the unnecessary ufence addition for this case to avoid -EBUSY failure in check_ufence of unbind ops. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-6-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-11-26drm/xe/svm: Enable UNMAP for VMA merging operationsHimal Prasad Ghimiray
ALLOW UNMAP of VMAs associated with SVM mappings when the MAP operation is intended to merge adjacent CPU_ADDR_MIRROR VMAs. v2 - Remove mapping exist check in garbage collector Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-5-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-11-26drm/xe/svm: Extend MAP range to reduce vma fragmentationHimal Prasad Ghimiray
When DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR is set during VM_BIND_OP_MAP, the mapping logic now checks adjacent cpu_addr_mirror VMAs with default attributes and expands the mapping range accordingly. This ensures that bo_unmap operations ideally target the same area and helps reduce fragmentation by coalescing nearby compatible VMAs into a single mapping. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-4-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-11-26drm/xe: Merge adjacent default-attribute VMAs during garbage collectionHimal Prasad Ghimiray
While restoring default memory attributes for VMAs during garbage collection, extend the target range by checking neighboring VMAs. If adjacent VMAs are CPU-address-mirrored and have default attributes, include them in the mergeable range to reduce fragmentation and improve VMA reuse. v2 -Rebase Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-3-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-11-26drm/xe: Add helper to extend CPU-mirrored VMA range for mergeHimal Prasad Ghimiray
Introduce xe_vm_find_cpu_addr_mirror_vma_range(), which computes an extended range around a given range by including adjacent VMAs that are CPU-address-mirrored and have default memory attributes. This helper is useful for determining mergeable range without performing the actual merge. v2 - Add assert - Move unmap check to this patch v3 - Decrease offset to check by SZ_4K to avoid wrong vma return in fast lookup path v4 - *start should be >= SZ_4K (Matt) Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251125075628.1182481-2-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-11-25drm/xe: Fix conversion from clock ticks to millisecondsHarish Chegondi
When tick counts are large and multiplication by MSEC_PER_SEC is larger than 64 bits, the conversion from clock ticks to milliseconds can go bad. Use mul_u64_u32_div() instead. Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Suggested-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: 49cc215aad7f ("drm/xe: Add xe_gt_clock_interval_to_ms helper") Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/1562f1b62d5be3fbaee100f09107f3cc49e40dd1.1763408584.git.harish.chegondi@intel.com (cherry picked from commit 96b93ac214f9dd66294d975d86c5dee256faef91) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-25drm/xe/guc: Fix stack_depot usageLucas De Marchi
Add missing stack_depot_init() call when CONFIG_DRM_XE_DEBUG_GUC is enabled to fix the following call stack: [] BUG: kernel NULL pointer dereference, address: 0000000000000000 [] Workqueue: drm_sched_run_job_work [gpu_sched] [] RIP: 0010:stack_depot_save_flags+0x172/0x870 [] Call Trace: [] <TASK> [] fast_req_track+0x58/0xb0 [xe] Fixes: 16b7e65d299d ("drm/xe/guc: Track FAST_REQ H2Gs to report where errors came from") Tested-by: Sagar Ghuge <sagar.ghuge@intel.com> Cc: stable@vger.kernel.org # v6.17+ Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251118-fix-debug-guc-v1-1-9f780c6bedf8@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 64fdf496a6929a0a194387d2bb5efaf5da2b542f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-25drm/xe/guc: Fix resource leak in xe_guc_ct_init_noalloc()Shuicheng Lin
xe_guc_ct_init_noalloc() allocates the CT workqueue and other helpers before it tries to initialize ct->lock. If drmm_mutex_init() fails we currently bail out without releasing those resources because the guc_ct_fini() hasn’t been registered yet. Since destroy_workqueue() in guc_ct_fini() may flush the workqueue, which in turn can take the ct lock, the initialization sequence is restructured to first initialize the ct->lock, then set up all CT state, and finally register guc_ct_fini(). v2: guc_ct_fini() does take ct lock. (Matt) v3: move primelockdep() together with drmm_mutex_init(). (Lucas) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251110184522.1581001-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 2e4ad5b0667244f496783c58de0995b9562d3344) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>