summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)Author
2025-11-21drm/xe/pf: Fix kernel-doc warning in migration_save_consumeMichał Winiarski
The kernel-doc for xe_sriov_pf_migration_save_consume() contained multiple "Return:" sections, causing a warning. Fix it by removing the extra line. Fixes: 67df4a5cbc583 ("drm/xe/pf: Add data structures and handlers for migration rings") Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251114134030.1795947-1-michal.winiarski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 562b0f254d8b1515a1c8d2a650f940d4f719300e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-11-21Merge tag 'v6.18-rc6' into drm-nextDave Airlie
Linux 6.18-rc6 Backmerge in order to merge msm next Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-19drm/xe: Switch to use %ptSpAndy Shevchenko
Use %ptSp instead of open coded variants to print content of struct timespec64 in human readable format. Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251113150217.3030010-9-andriy.shevchenko@linux.intel.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-18drm/xe/irq: Handle msix vector0 interruptVenkata Ramana Nayana
Current gu2host handler registered as MSI-X vector 0 and as per bspec for a msix vector 0 interrupt, the driver must check the legacy registers 190008(TILE_INT_REG), 190060h (GT INTR Identity Reg 0) and other registers mentioned in "Interrupt Service Routine Pseudocode" otherwise it will block the next interrupts. To overcome this issue replacing guc2host handler with legacy xe_irq_handler. Fixes: da889070be7b2 ("drm/xe/irq: Separate MSI and MSI-X flows") Bspec: 62357 Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patch.msgid.link/20251107083141.2080189-1-venkata.ramana.nayana@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit c34a14bce7090862ebe5a64abe8d85df75e62737) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-18drm/xe: Remove duplicate DRM_EXEC selection from KconfigShuicheng Lin
There are 2 identical "select DRM_EXEC" lines for DRM_XE. Remove one to clean up the configuration. Fixes: d490ecf57790 ("drm/xe: Rework xe_exec and the VM rebind worker to use the drm_exec helper") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251110232657.1807998-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit b1aa02acd03bfef3ed39c511d33c4a4303d2f9b1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-18drm/xe/kunit: Fix forcewake assertion in mocs testMatt Roper
The MOCS kunit test calls KUNIT_ASSERT_TRUE_MSG() with a condition of 'true;' this prevents the assertion from ever failing. Replace KUNIT_ASSERT_TRUE_MSG with KUNIT_FAIL_AND_ABORT to get the intended failure behavior in cases where forcewake was not acquired successfully. Fixes: 51c0ee84e4dc ("drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs") Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251113234038.2256106-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 9be4f0f687048ba77428ceca11994676736507b7) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-18drm/xe: Prevent BIT() overflow when handling invalid prefetch regionShuicheng Lin
If user provides a large value (such as 0x80) for parameter prefetch_mem_region_instance in vm_bind ioctl, it will cause BIT(prefetch_region) overflow as below: " ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7 shift exponent 128 is too large for 64-bit type 'long unsigned int' CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G W 6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary) Tainted: [W]=WARN Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 Call Trace: <TASK> dump_stack_lvl+0xa0/0xc0 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_shift_out_of_bounds+0x10e/0x170 ? mutex_unlock+0x12/0x20 xe_vm_bind_ioctl.cold+0x20/0x3c [xe] ... " Fix it by validating prefetch_region before the BIT() usage. v2: Add Closes and Cc stable kernels. (Matt) Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478 Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com (cherry picked from commit 8f565bdd14eec5611cc041dba4650e42ccdf71d9) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-18drm/pcids: Split PTL pciids group to make wcl subplatformDnyaneshwar Bhadane
To form the WCL platform as a subplatform of PTL in definition, WCL pci ids are splited into saparate group from PTL. So update the pciidlist struct to cover all the pci ids. v2: - Squash wcl description in single patch for display and xe.(jani,gustavo) Fixes: 3c0f211bc8fc ("drm/xe: Add Wildcat Lake device IDs to PTL list") Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250922150317.2334680-2-dnyaneshwar.bhadane@intel.com (cherry picked from commit 32620e176443bf23ec81bfe8f177c6721a904864) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added the Fixes tag when porting it to fixes]
2025-11-18Merge tag 'drm-intel-next-2025-11-14' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 feature pull #2 for v6.19: Features and functionality: - Add initial display support for Xe3p_LPD, display version 35 (Sai Teja, Matt R, Gustavo, Matt A, Ankit, Juha-pekka, Luca, Ravi Kumar) - Compute LT PHY HDMI params when port clock not in predefined tables (Suraj) Refactoring and cleanups: - Refactor intel_frontbuffer split between i915, xe, and display (Ville) - Clean up intel_de_wait_custom() usage (Ville) - Unify display register polling interfaces (Ville) - Finish removal of the expensive format info lookups (Ville) - Cursor code cleanups (Ville) - Convert intel_rom interfaces to struct drm_device (Jani) Fixes: - Fix uninitialized variable in DSI exec packet (Jonathan) - Fix PIPEDMC logging (Alok Tiwari) - Fix PSR pipe to vblank conversion (Jani) - Fix intel_frontbuffer lifetime handling (Ville) - Disable Panel Replay on DP MST for the time being (Imre) Merges: - Backmerge drm-next to get the drm_print.h changes (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/b131309bb7310ab749f1770aa6e36fa8d6a82fa5@intel.com
2025-11-17Merge tag 'drm-xe-next-2025-11-14' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: Avoid TOCTOU when montoring throttle reasons (Lucas) Add/extend workaround (Nitin) SRIOV migration work / plumbing (Michal Wajdeczko, Michal Winiarski, Lukasz) Drop debug flag requirement for VF resource fixup Fix MTL vm_max_level (Rodrigo) Changes around TILE_ADDR_RANGE for platform compatibility (Fei, Lucas) Add runtime registers for GFX ver >= 35 (Piotr) Kerneldoc fix (Kriish) Rework pcode error mapping (Lucas) Allow lockdown the PF (Michal) Eliminate GUC code caching of some frequency values (Sk) Improvements around forcewake referencing (Matt Roper) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/aRcJOrisG2qPbucE@fedora
2025-11-17Merge tag 'drm-xe-next-2025-11-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: Limit number of jobs per exec queue (Shuicheng) Add sriov_admin sysfs tree (Michal) Driver Changes: Fix an uninitialized value (Thomas) Expose a residency counter through debugfs (Mohammed Thasleem) Workaround enabling and improvement (Tapani, Tangudu) More Crescent Island-specific support (Sk Anirban, Lucas) PAT entry dump imprement (Xin) Inline gt_reset in the worker (Lucas) Synchronize GT reset with device unbind (Balasubramani) Do clean shutdown also when using flr (Jouni) Fix serialization on burst of unbinds (Matt Brost) Pagefault Refactor (Matt Brost) Remove some unused code (Gwan-gyeong) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/aQuBECxNOhudc0Bz@fedora
2025-11-14PCI: Convert BAR sizes bitmasks to u64Ilpo Järvinen
PCIe r7.0, sec 7.8.6, defines resizable BAR sizes beyond the currently supported maximum of 128TB, which will require more than u32 to store the entire bitmask. Convert Resizable BAR related functions to use u64 bitmask for BAR sizes to make the typing more future-proof. The support for the larger BAR sizes themselves is not added at this point. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113180053.27944-12-ilpo.jarvinen@linux.intel.com
2025-11-14drm/xe/vram: Use pci_rebar_get_max_size()Ilpo Järvinen
Use pci_rebar_get_max_size() from PCI core in resize_vram_bar() to simplify code. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251113180053.27944-10-ilpo.jarvinen@linux.intel.com
2025-11-14drm/xe/vram: Use PCI rebar helpers in resize_vram_bar()Ilpo Järvinen
PCI core provides pci_rebar_size_supported() and pci_rebar_size_to_bytes(); use them in resize_vram_bar() to simplify code. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251113180053.27944-8-ilpo.jarvinen@linux.intel.com
2025-11-14drm/xe: Remove driver side BAR release before resizeIlpo Järvinen
PCI core handles releasing device's resources and their rollback in case of failure of a BAR resizing operation. Releasing resource prior to calling pci_resize_resource() prevents PCI core from restoring the BARs as they were. Remove driver-side release of BARs from the xe driver. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251113162628.5946-9-ilpo.jarvinen@linux.intel.com
2025-11-14PCI: Fix restoring BARs on BAR resize rollback pathIlpo Järvinen
BAR resize operation is implemented in the pci_resize_resource() and pbus_reassign_bridge_resources() functions. pci_resize_resource() can be called either from __resource_resize_store() from sysfs or directly by the driver for the Endpoint Device. The pci_resize_resource() requires that caller has released the device resources that share the bridge window with the BAR to be resized as otherwise the bridge window is pinned in place and cannot be changed. pbus_reassign_bridge_resources() rolls back resources if the resize operation fails, but rollback is performed only for the bridge windows. Because releasing the device resources are done by the caller of the BAR resize interface, these functions performing the BAR resize do not have access to the device resources as they were before the resize. pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources() after rolling back the bridge windows as they were, however, it will not guarantee the resource are assigned due to differences in how FW and the kernel assign the resources (alignment of the start address and tail). To perform rollback robustly, the BAR resize interface has to be altered to also release the device resources that share the bridge window with the BAR to be resized. Also, remove restoring from the entries failed list as saved list should now contain both the bridge windows and device resources so the extra restore is duplicated work. Some drivers (currently only amdgpu) want to prevent releasing some resources. Add exclude_bars param to pci_resize_resource() and make amdgpu pass its register BAR (BAR 2 or 5), which should never be released during resize operation. Normally 64-bit prefetchable resources do not share a bridge window with the 32-bit only register BAR, but there are various fallbacks in the resource assignment logic which may make the resources share the bridge window in rare cases. This change (together with the driver side changes) is to counter the resource releases that had to be done to prevent resource tree corruption in the ("PCI: Release assigned resource before restoring them") change. As such, it likely restores functionality in cases where device resources were released to avoid resource tree conflicts which appeared to be "working" when such conflicts were not correctly detected by the kernel. Reported-by: Simon Richter <Simon.Richter@hogyros.de> Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/ Reported-by: Alex Bennée <alex.bennee@linaro.org> Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash amdgpu BAR selection from https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
2025-11-13drm/xe/oa: Store forcewake reference in stream structureMatt Roper
Calls to xe_force_wake_put() should generally pass the exact reference returned by xe_force_wake_get(). Since OA grabs and releases forcewake in different functions, xe_oa_stream_destroy() is currently calling put with a hardcoded ALL mask. Although this works for now, it's somewhat fragile in case OA moves to more precise power domain management in the future. Stash the original reference obtained during stream initialization inside the stream structure so that we can use it directly when the stream is destroyed. Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20251110232017.1475869-35-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-13drm/xe/eustall: Store forcewake reference in stream structureMatt Roper
Calls to xe_force_wake_put() should generally pass the exact reference returned by xe_force_wake_get(). Since EU stall grabs and releases forcewake in different functions, xe_eu_stall_disable_locked() is currently calling put with a hardcoded RENDER domain. Although this works for now, it's somewhat fragile in case the power domain(s) required by stall sampling change in the future, or if workarounds show up that require us to obtain additional domains. Stash the original reference obtained during stream enable inside the stream structure so that we can use it directly when the stream is disabled. Cc: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251110232017.1475869-34-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-13drm/xe/forcewake: Improve kerneldocMatt Roper
Improve the kerneldoc for forcewake a bit to give more detail about what the structures represent. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251110232017.1475869-33-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-13drm/xe/pf: Use migration-friendly GGTT auto-provisioningMichal Wajdeczko
Instead of trying very hard to find the largest fair GGTT size that could be allocated for VFs on the current tile, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs | GGTT space (MiB) --------+----------------- 63..57 | 56 56..29 | 64 28..15 | 128 14..8 | 256 7..4 | 512 3..2 | 1024 1 | 2048 (regular PF) 1 | 3584 (admin only PF) Note that due to FW/HW limitations we can't share all 4GiB GGTT address space with VFs, so for the larger (>7) number of the VFs the change in the outcome is happening at different points than we have in case of GuC contexts/doorbells IDs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251112124408.8094-1-michal.wajdeczko@intel.com
2025-11-13drm/xe/pf: Add wait helper for VF FLRMichał Winiarski
VF FLR requires additional processing done by PF driver. The processing is done after FLR is already finished from PCIe perspective. In order to avoid a scenario where migration state transitions while PF processing is still in progress, additional synchronization point is needed. Add a helper that will be used as part of VF driver struct pci_error_handlers .reset_done() callback. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-24-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Handle VRAM migration data as part of PF controlMichał Winiarski
Connect the helpers to allow save and restore of VRAM migration data in stop_copy / resume device state. Co-developed-by: Lukasz Laguna <lukasz.laguna@intel.com> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251112132220.516975-23-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/migrate: Add function to copy of VRAM data in chunksLukasz Laguna
Introduce a new function to copy data between VRAM and sysmem objects. The existing xe_migrate_copy() is tailored for eviction and restore operations, which involves additional logic and operates on entire objects. The xe_migrate_vram_copy_chunk() allows copying chunks of data to or from a dedicated buffer object, which is essential in case of VF migration. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251112132220.516975-22-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add helper to retrieve VF's LMEM objectLukasz Laguna
Instead of accessing VF's lmem_obj directly, introduce a helper function to make the access more convenient. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-21-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Handle MMIO migration data as part of PF controlMichał Winiarski
Implement the helpers and use them for save and restore of MMIO migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-20-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Handle GGTT migration data as part of PF controlMichał Winiarski
Connect the helpers to allow save and restore of GGTT migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-19-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add helpers for VF GGTT migration data handlingMichał Winiarski
In an upcoming change, the VF GGTT migration data will be handled as part of VF control state machine. Add the necessary helpers to allow the migration data transfer to/from the HW GGTT resource. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-18-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Handle GuC migration data as part of PF controlMichał Winiarski
Connect the helpers to allow save and restore of GuC migration data in stop_copy / resume device state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-17-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Switch VF migration GuC save/restore to struct migration dataMichał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part of separate SAVE/RESTORE states in VF control state machine. Now that the data is decoupled from both guc_state debugfs and PAUSE state, we can safely remove the struct xe_gt_sriov_state_snapshot and modify the GuC save/restore functions to operate on struct xe_sriov_migration_data. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-16-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Don't save GuC VF migration data on pauseMichał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part of separate SAVE/RESTORE states in VF control state machine. Remove it from PAUSE state. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-15-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Remove GuC migration data save/restore from GT debugfsMichał Winiarski
In upcoming changes, SR-IOV VF migration data will be extended beyond GuC data and exported to userspace using VFIO interface (with a vendor-specific variant driver) and a device-level debugfs interface. Remove the GT-level debugfs. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-14-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migrationMichał Winiarski
Contiguous PF GGTT VMAs can be scarce after creating VFs. Increase the GuC buffer cache size to 8M for PF so that we can fit GuC migration data (which currently maxes out at just over 4M) and use the cache instead of allocating fresh BOs. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-13-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe: Allow the caller to pass guc_buf_cache sizeMichał Winiarski
An upcoming change will use GuC buffer cache as a place where GuC migration data will be stored, and the memory requirement for that is larger than indirect data. Allow the caller to pass the size based on the intended usecase. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-12-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe: Add sa/guc_buf_cache sync interfaceMichał Winiarski
In upcoming changes the cached buffers are going to be used to read data produced by the GuC. Add a counterpart to flush, which synchronizes the CPU-side of suballocation with the GPU data and propagate the interface to GuC Buffer Cache. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-11-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Expose VF migration data size over debugfsMichał Winiarski
The size is normally used to make a decision on when to stop the device (mainly when it's in a pre_copy state). Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-10-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add minimalistic migration descriptorMichał Winiarski
The descriptor reuses the KLV format used by GuC and contains metadata that can be used to quickly fail migration when source is incompatible with destination. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-9-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add support for encap/decap of bitstream to/from packetMichał Winiarski
Add debugfs handlers for migration state and handle bitstream .read()/.write() to convert from bitstream to/from migration data packets. As descriptor/trailer are handled at this layer - add handling for both save and restore side. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-8-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add helpers for migration data packet allocation / freeMichał Winiarski
Now that it's possible to free the packets - connect the restore handling logic with the ring. The helpers will also be used in upcoming changes that will start producing migration data packets. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-7-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add data structures and handlers for migration ringsMichał Winiarski
Migration data is queued in a per-GT ptr_ring to decouple the worker responsible for handling the data transfer from the .read() and .write() syscalls. Add the data structures and handlers that will be used in future commits. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-6-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Add save/restore control state stubs and connect to debugfsMichał Winiarski
The states will be used by upcoming changes to produce (in case of save) or consume (in case of resume) the VF migration data. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-5-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Convert control state to bitmapMichał Winiarski
In upcoming changes, the number of states will increase as a result of introducing SAVE and RESTORE states. This means that using unsigned long as underlying storage won't work on 32-bit architectures, as we'll run out of bits. Use bitmap instead. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510231918.XlOqymLC-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-4-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe: Move migration support to device-level structMichał Winiarski
Upcoming changes will allow users to control VF state and obtain its migration data with a device-level granularity (not tile/gt). Change the data structures to reflect that and move the GT-level migration init to happen after device-level init. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-3-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe/pf: Remove GuC version check for migration supportMichał Winiarski
Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC version to v70.29.2"), the minimum GuC version required by the driver is v70.29.2, which should already include everything that we need for migration. Remove the version check. Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-2-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-13drm/xe: remove stale runtime_pm memberJani Nikula
This has become unused and unnecessary. Remove. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251112185547.172113-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-12drm/xe/guc: Eliminate RPa frequency cachingSk Anirban
Remove the cached pc->rpa_freq field and refactor RPA frequency handling to fetch values directly from hardware registers on each request. v2: Check graphics version instead of platform (Rodrigo) v3: Fix graphics version check (Badal) Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Suggested-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251112185153.3593145-6-sk.anirban@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12drm/xe/guc: Eliminate RPe caching for SLPC parameter handlingSk Anirban
RPe is runtime-determined by PCODE and caching it caused stale values, leading to incorrect GuC SLPC parameter settings. Drop the cached rpe_freq field and query fresh values from hardware on each use to ensure GuC SLPC parameters reflect current RPe. v2: Remove cached RPe frequency field (Rodrigo) v3: Remove extra variable (Vinay) Modify function name (Vinay) v4: Maintain a separate function for PVC (Rodrigo) v5: Avoid RPn update while fetching RPe frequency (Rodrigo) v6: Split platform-specific RPe comments (Vinay) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5166 Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20251112185153.3593145-5-sk.anirban@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12drm/xe/pf: Allow to lockdown the PF using custom guardMichal Wajdeczko
Some driver components, like eudebug or ccs-mode, can't be used when VFs are enabled. Add functions to allow those components to block the PF from enabling VFs for the requested duration. Introduce trivial counter to allow lockdown or exclusive access that can be used in the scenarios where we can't follow the strict owner semantics as required by the rw_semaphore implementation. Before enabling VFs, the PF will try to arm the "vfs_enabling" guard for the exclusive access. This will fail if there are some lockdown requests already initiated by the other components. For testing purposes, add debugfs file which will call these new functions from the file's open/close hooks. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Christoph Manszewski <christoph.manszewski@intel.com> Reviewed-by: Christoph Manszewski <christoph.manszewski@intel.com> Link: https://patch.msgid.link/20251109162451.4779-1-michal.wajdeczko@intel.com
2025-11-12drm/xe/pcode: Rework error mappingLucas De Marchi
The sparse array used for error decoding from is unnecessarily big. It should be better handled by a switch statement that will also allow us to more easily improve this code. Add a CASE_ERR() macro to keep the table compact and use it instead of the 256-entries array, which saves some space: $ bloat-o-meter xe_pcode.o.old xe_pcode.o add/remove: 0/1 grow/shrink: 2/0 up/down: 190/-4096 (-3906) Function old new delta __pcode_mailbox_rw 363 465 +102 __pcode_mailbox_rw.cold 58 146 +88 err_decode 4096 - -4096 Total: Before=7890, After=3984, chg -49.51% Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251110-pcode-errmap-v2-1-cb18c8f54238@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-12drm/xe: fix kernel-doc function name mismatch in xe_pm.cKriish Sharma
Documentation build reported: WARNING: ./drivers/gpu/drm/xe/xe_pm.c:131 expecting prototype for xe_pm_might_block_on_suspend(). Prototype was for xe_pm_block_on_suspend() instead The kernel-doc comment for xe_pm_block_on_suspend() incorrectly used the function name xe_pm_might_block_on_suspend(). Fix the header to match the actual function prototype. No functional changes. Fixes: f73f6dd312a5 ("drm/xe/pm: Add lockdep annotation for the pm_block completion") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511061736.CiuroL7H-lkp@intel.com/ Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251110184206.2113830-1-kriish.sharma2006@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-11drm/i915/de: Implement register waits one wayVille Syrjälä
Currently we use a messy mix of intel_wait_for_register*() and __intel_wait_for_register*() to implement various register polling functions. Make the mess a bit more understandable by always using the __intel_wait_for_register*() stuff. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251110172756.2132-2-ville.syrjala@linux.intel.com Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com>