summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)Author
2025-11-11Merge drm/drm-next into drm-intel-nextJani Nikula
Primarily sync with the drm_print.h changes from drm-misc. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-10drm/xe/pf: Add runtime registers for GFX ver >= 35Piotr Piórkowski
Add a dedicated runtime register list for GFX ver >= 35. Compared to the list for GFX >= 30, this variant drops HUC_KERNEL_LOAD_INFO, MIRROR_FUSE1 and adds SERVICE_COPY_ENABLE. v2: - drop MIRROR_FUSE1 register - update commit message Fixes: 5e0de2dfbc1b ("drm/xe/cri: Add CRI platform definition") Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251107211845.3633633-1-piotr.piorkowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpgNitin Gote
Wa_15016589081 applies to Xe3_LPG renderCS Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 715974499a2199bd199fb4630501f55545342ea4) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe/xe3: Extend wa_14023061436Tangudu Tilak Tirumalesh
Extend wa_14023061436 to Graphics Versions 30.03, 30.04 and 30.05. Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251030154626.3124565-1-tilak.tirumalesh.tangudu@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 0dd656d06f50ae4cedf160634cf13fd9e0944cf7) Cc: stable@vger.kernel.org # v6.17+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe/xe3: Add WA_14024681466 for Xe3_LPGNitin Gote
Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05. v2: (Matthew Roper) - Remove stepping filter as workaround applies to all steppings. - Add an engine class filter so it only applies to the RENDER engine. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251027092643.335904-1-nitin.r.gote@intel.com Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 071089a69e199bd810ff31c4c933bd528e502743) Cc: stable@vger.kernel.org # v6.16+ Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe/vram: Move forcewake down to get_flat_ccs_offset()Lucas De Marchi
With SG_TILE_ADDR_RANGE use, the only thing requiring GT forcewake while probing for vram size is the get_flat_ccs_offset(). Move the forcewake down where it's needed. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251107-tile-addr-v1-2-a3014aadc2e7@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe: Use SG_TILE_ADDR_RANGE instead of TILE_ADDR_RANGEFei Yang
The TILE_ADDR_RANGE register is not available on all platforms going forward as it was deprecated and is being replaced by equivalent registers within SoC MMIO space. While that doesn't happen, the SG_TILE_ADDR_RANGE (base 0x1083a0) is still valid for all platforms supported by xe. Use that instead. BSpec: 59353, 54991 Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251107-tile-addr-v1-1-a3014aadc2e7@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10drm/xe: Fix MTL vm_max_levelRodrigo Vivi
MTL was broken after the vm_max_level movement. Get it back to a working value. [ 37.722413] xe 0000:00:02.0: [drm] Tile0: GT0: VM job timed out on non-killed execqueue [ 37.722465] WARNING: CPU: 0 PID: 12 at drivers/gpu/drm/xe/xe_guc_submit.c:1379 guc_exec_queue_timedout_job+0x2f3/0xe00 [xe] [ 37.722559] Modules linked in: xt_REDIRECT nft_compat nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc bnep snd_ctl_led snd_soc_s\ of_sdw snd_soc_intel_hda_dsp_common snd_soc_sdw_utils snd_sof_probes snd_soc_rt712_sdca regmap_sdw_mbq snd_hda_codec_intelhdmi regmap_sdw snd_soc_dmic snd_hda_intel snd_sof_pci_intel_mtl iwlmvm snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink\ snd_sof_intel_hda snd_hda_codec_hdmi soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp binfmt_misc snd_sof mac80211 vfat snd_sof_utils fat snd_hda_ext_core snd_hda_codec snd_hda_core snd_intel_dspcfg snd_intel_sdw_acpi snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_acpi snd_hwdep \ crc8 soundwire_bus libarc4 snd_soc_sdca snd_soc_core [ 37.722584] snd_compress ac97_bus uvcvideo snd_pcm_dmaengine iwlwifi snd_seq uvc videobuf2_vmalloc snd_seq_device videobuf2_memops videobuf2_v4l2 snd_pcm processor_thermal_device_pci videobuf2_common processor_thermal_device btusb intel_uncore_frequency processor_thermal_wt_hint intel_uncore_frequency_common platform_temp\ erature_control videodev btmtk spi_nor processor_thermal_soc_slider x86_pkg_temp_thermal btrtl snd_timer iTCO_wdt processor_thermal_rfim intel_powerclamp btbcm intel_pmc_bxt snd intel_rapl_msr processor_thermal_rapl coretemp iTCO_vendor_support mei_gsc_proxy btintel intel_rapl_common rapl intel_cstate cfg80211 bluetooth mc in\ tel_pmc_core mtd soundcore acer_wmi mei_me intel_uncore processor_thermal_wt_req i2c_i801 spi_intel_pci pmt_telemetry platform_profile mei processor_thermal_power_floor spi_intel i2c_smbus pmt_discovery igen6_edac pcspkr rfkill wmi_bmof idma64 processor_thermal_mbox intel_hid pmt_class int3403_thermal int3400_thermal joydev i\ nt340x_thermal_zone acpi_pad sparse_keymap [ 37.722611] intel_pmc_ssram_telemetry acpi_thermal_rel acer_wireless loop nfnetlink zram lz4hc_compress lz4_compress dm_crypt xe drm_ttm_helper drm_suballoc_helper gpu_sched drm_gpuvm drm_exec drm_gpusvm_helper i915 nvme i2c_algo_bit nvme_core drm_buddy ucsi_acpi ttm typec_ucsi typec nvme_keyring nvme_auth hkdf drm_displa\ y_helper hid_multitouch polyval_clmulni thunderbolt intel_vpu ghash_clmulni_intel cec vmd i2c_hid_acpi video intel_vsec i2c_hid wmi pinctrl_meteorlake serio_raw i2c_dev fuse [ 37.722638] CPU: 0 UID: 0 PID: 12 Comm: kworker/u88:0 Not tainted 6.18.0-rc2+ #37 PREEMPT(voluntary) [ 37.722641] Hardware name: Acer Swift SFG14-72/Coral_MTH, BIOS V1.01 11/06/2023 [ 37.722643] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched] [ 37.722649] RIP: 0010:guc_exec_queue_timedout_job+0x2f3/0xe00 [xe] [ 37.722722] Code: 4c 24 10 44 89 44 24 08 e8 5a 95 f1 d4 44 8b 44 24 08 8b 4c 24 10 48 c7 c7 00 b7 25 c1 48 8b 54 24 18 48 89 c6 e8 4d 59 37 d4 <0f> 0b 80 3c 24 00 0f 85 55 03 00 00 49 8b 47 58 a8 01 75 1a 49 8b [ 37.722723] RSP: 0018:ffffd468000f7d80 EFLAGS: 00010246 [ 37.722725] RAX: 0000000000000000 RBX: ffff8e3d4e215c00 RCX: 0000000000000027 [ 37.722726] RDX: ffff8e40ae61cfc8 RSI: 0000000000000001 RDI: ffff8e40ae61cfc0 [ 37.722727] RBP: 00000000fffffffb R08: 0000000000000000 R09: ffffd468000f7c20 [ 37.722727] R10: ffff8e40c09fffa8 R11: 00000000fffbffff R12: ffff8e3d44c00028 [ 37.722728] R13: ffff8e3d807d4000 R14: ffff8e3d807d4018 R15: ffff8e3d95c9d600 [ 37.722729] FS: 0000000000000000(0000) GS:ffff8e4116110000(0000) knlGS:0000000000000000 [ 37.722729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.722730] CR2: 00007ff1f3e02720 CR3: 0000000113c8d005 CR4: 0000000000f70ef0 [ 37.722731] PKRU: 55555554 [ 37.722731] Call Trace: [ 37.722734] <TASK> [ 37.722735] ? __pfx_autoremove_wake_function+0x10/0x10 [ 37.722740] drm_sched_job_timedout+0x81/0x170 [gpu_sched] Fixes: 50292f9af8ec ("drm/xe: Move 'vm_max_level' flag back to platform descriptor") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251108040634.6376-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-10drm/xe/vf: Enable VF resource fixup unconditionallyMichał Winiarski
All the feature enabling code is in place - drop the debug flag requirement for VF resource fixup. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251107161000.1938186-1-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-07drm/xe/tests: Add KUnit tests for PF fair provisioningMichal Wajdeczko
Add test cases to check outcome of fair GuC context or doorbells IDs allocations for regular and admin-only PF mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251106165932.2143-1-michal.wajdeczko@intel.com
2025-11-07drm/xe/pf: Use migration-friendly doorbells auto-provisioningMichal Wajdeczko
Instead of trying very hard to find the largest fair number of GuC doorbell IDs that could be allocated for VFs on the current GT, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs | num doorbells --------+-------------- 63..32 | 4 31..16 | 8 15..8 | 16 7..4 | 32 3..2 | 64 1 | 128 (regular PF) 1 | 240 (admin only PF) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251105183253.863-3-michal.wajdeczko@intel.com
2025-11-07drm/xe/pf: Use migration-friendly context IDs auto-provisioningMichal Wajdeczko
Instead of trying very hard to find the largest fair number of GuC context IDs that could be allocated for VFs on the current GT, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instance: num VFs | num contexts --------+------------- 63..32 | 1024 31..16 | 2048 15..8 | 4096 7..4 | 8192 3..2 | 16384 1 | 32768 (regular PF) 1 | 64512 (admin only PF) Add also helper function to determine if the PF is admin-only, and for now use .probe_display flag for that. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251105183253.863-2-michal.wajdeczko@intel.com
2025-11-07drm/i915/frontbuffer: Fix intel_frontbuffer lifetime handlingVille Syrjälä
The current attempted split between xe/i915 vs. display for intel_frontbuffer is a mess: - the i915 rcu leaks through the interface to the display side - the obj->frontbuffer write-side is now protected by a display specific spinlock even though the actual obj->framebuffer pointer lives in a i915 specific structure - the kref is getting poked directly from both sides - i915_active is still on the display side Clean up the mess by moving everything about the frontbuffer lifetime management to the i915/xe side: - the rcu usage is now completely contained in i915 - frontbuffer_lock is moved into i915 - kref is on the i915/xe side (xe needs the refcount as well due to intel_frontbuffer_queue_flush()->intel_frontbuffer_ref()) - the bo (and its refcounting) is no longer on the display side - i915_active is contained in i915 I was pondering whether we could do this in some kind of smaller steps, and perhaps we could, but it would probably have to start with a bunch of reverts (which for sure won't go cleanly anymore). So not convinced it's worth the hassle. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251016185408.22735-10-ville.syrjala@linux.intel.com
2025-11-07drm/i915/frontbuffer: Turn intel_bo_flush_if_display() into a frontbuffer ↵Ville Syrjälä
operation Convert intel_bo_flush_if_display() to be an operation on the frontbuffer object rather than the underlying gem bo. This will help with cleaning up the frontbuffer xe/i915 vs. display split. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20251016185408.22735-5-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2025-11-07drm/xe: Enforce correct user fence signaling order usingMatthew Brost
Prevent application hangs caused by out-of-order fence signaling when user fences are attached. Use drm_syncobj (via dma-fence-chain) to guarantee that each user fence signals in order, regardless of the signaling order of the attached fences. Ensure user fence writebacks to user space occur in the correct sequence. v7: - Skip drm_syncbj create of error (CI) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com (cherry picked from commit adda4e855ab6409a3edaa585293f1f2069ab7299) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07drm/xe: Do clean shutdown also when using flrJouni Högander
Currently Xe driver is triggering flr without any clean-up on shutdown. This is causing random warnings from pending related works as the underlying hardware is reset in the middle of their execution. Fix this by performing clean shutdown also when using flr. Fixes: 501d799a47e2 ("drm/xe: Wire up device shutdown handler") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://patch.msgid.link/20251031122312.1836534-1-jouni.hogander@intel.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> (cherry picked from commit a4ff26b7c8ef38e4dd34f77cbcd73576fdde6dd4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07drm/xe: Move declarations under conditional branchTejas Upadhyay
The xe_device_shutdown() function was needing a few declarations that were only required under a specific condition. This change moves those declarations to be within that conditional branch to avoid unnecessary declarations. Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20251007100208.1407021-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit 15b3036045188f4da4ca62b2ed01b0f160252e9b) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07drm/xe/guc: Synchronize Dead CT worker with unbindBalasubramani Vivekanandan
Cancel and wait for any Dead CT worker to complete before continuing with device unbinding. Else the worker will end up using resources freed by the undind operation. Cc: Zhanjun Dong <zhanjun.dong@intel.com> Fixes: d2c5a5a926f4 ("drm/xe/guc: Dead CT helper") Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-6-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 492671339114e376aaa38626d637a2751cdef263) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07Merge tag 'drm-misc-next-2025-11-05-1' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.19-rc1: UAPI Changes: - Add userptr support to ivpu. - Add IOCTL's for resource and telemetry data in amdxdna. Core Changes: - Improve some atomic state checking handling. - drm/client updates. - Use forward declarations instead of including drm_print.h - RUse allocation flags in ttm_pool/device_init and allow specifying max useful pool size and propagate ENOSPC. - Updates and fixes to scheduler and bridge code. - Add support for quirking DisplayID checksum errors. Driver Changes: - Assorted cleanups and fixes in rcar-du, accel/ivpu, panel/nv3052cf, sti, imxm, accel/qaic, accel/amdxdna, imagination, tidss, sti, panthor, vkms. - Add Samsung S6E3FC2X01 DDIC/AMS641RW, Synaptics TDDI series DSI, TL121BVMS07-00 (IL79900A) panels. - Add mali MediaTek MT8196 SoC gpu support. - Add etnaviv GC8000 Nano Ultra VIP r6205 support. - Document powervr ge7800 support in the devicetree. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/5afae707-c9aa-4a47-b726-5e1f1aa7a106@linux.intel.com
2025-11-07Merge tag 'drm-intel-next-2025-11-04' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 feature pull for v6.19: Features and functionality: - Enable LNL+ content adaptive sharpness filter (CASF) (Nemesa) - Use optimized VRR guardband (Ankit, Ville) - Enable Xe3p LT PHY (Suraj) - Enable FBC support for Xe3p_LPD display (Sai Teja, Vinod) - Specify DMC firmware for display version 30.02 (Dnyaneshwar) - Report reason for disabling PSR to debugfs (Michał) - Extend i915_display_info with Type-C port details (Khaled) - Log DSI send packet sequence errors and contents Refactoring and cleanups: - Refactoring to prepare for VRR guardband optimization (Ankit) - Abstract VRR live status wait (Ankit) - Refactor VRR and DSB timing to handle Set Context Latency explicitly (Ankit) - Helpers for prefill latency calculations (Ville) - Refactor SKL+ watermark latency setup (Ville) - VRR refactoring and cleanups (Ville) - SKL+ universal plane cleanups (Ville) - Decouple CDCLK from state->modeset refactor (Ville) - Refactor VLV/CHV clock functions (Jani) - Refactor fbdev handling (Jani) - Call i915 and xe runtime PM from display via function pointers (Jouni) - IRQ code refactoring (Jani) - Drop display dependency on i915 feature check macros (Jani) - Refactor and unify i915 and xe stolen memory interfaces towards display (Jani) - Switch to driver agnostic drm to display pointer chase (Jani) - Use display version over graphics version in display code (Matt A) - GVT cleanups (Jonathan, Andi) - Rename a VLV clock function to unify (Michał) - Explicitly sanitize DMC package header num entries (Luca) - Remove redundant port clock check from ALPM (Jouni) - Use sysfs_emit() instead of sprintf() in PMU sysfs (Madhur Kumar) - Clean up C20 PHY PLL register macros (Imre, Mika)) - Abstract "address in MMIO table" helper for general use (Matt A) - Improve VRR platform abstractions (Ville) - Move towards more standard PCI PM code usage (Ville) - Framebuffer refactoring (Ville) - Drop display dependency on i915_utils.h (Jani) - Include cleanups (Jani) Fixes: - Workaround docking station DSC issues with high pixel clock and bpp (Imre) - Fix Panel Replay in DSC mode (Imre) - Disable tracepoints for PREEMPT_RT as a workaround (Maarten) - Fix intel_crtc_get_vblank_counter() on PREEMPT_RT (Maarten) - Fix C10 PHY identification on PTL/WCL (Dnyaneshwar) - Take AS SDP into account with optimized guardband (Jouni) - Fix panic structure allocation memory leak (Jani) - Adjust an FBC workaround platforms (Vinod) - Add fallback for CDCLK selection (Naladala) - Avoid using invalid transcoder in MST transport select (Suraj) - Don't use cursor size reduction on display version 14+ (Nemesa) - Fix C20 PHY PLL register programming (Imre, Mika) - Fix PSR frontbuffer flush handling (Jouni) - Store ALPM parameters in crtc state (Jouni) - Defeature DRRS on LNL+ (Ville) - Fix the scope of the large DRAM DIMM workaround (Ville) - Fix PICA vs. AUX power ordering issue (Gustavo) - Fix pixel rate for computing watermark line time (Ville) - Fix framebuffer set_tiling vs. addfb race (Ville) - DMC event handler fixes (Ville) DRM Core: - CRTC sharpness strength property (Nemesa) - DPCD DSC quirk for Synaptics Panamera devices (Imre) - Helpers to query the branch DSC max throughput/line-width (Imre) Merges: - Backmerge drm-next for v6.18-rc and to sync with drm-xe-next (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/ec5a05f2df6d597a62033ee2d57225cce707b320@intel.com
2025-11-06drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpgNitin Gote
Wa_15016589081 applies to Xe3_LPG renderCS Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-05drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasonsLucas De Marchi
It's currently not possible to safely monitor if there's throttling happening and what are the reasons. The approach of reading the status and then reading the reasons is not reliable as by the time sysadmin reads the reason, the throttling could not be happening anymore. Previous tentative to fix that[1] was breaking the ABI and potentially sysadmin's scripts. This takes a different approach of adding and documenting the additional attribute. It's still valuable, though redundant, to provide the simpler 0/1 interface. In order to avoid userspace knowledge on the bitmask meaning and to be able to maintain the kernel side in sync with possible changes in future, just walk the attribute group and check what are the masks that match the value read. [1] https://lore.kernel.org/intel-xe/20241025092238.167042-1-raag.jadav@intel.com/ Cc: Raag Jadav <raag.jadav@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251104-gt-throttle-cri-v5-1-4948b060bbfd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-05drm/xe: Remove never used code in xe_vm_create()Gwan-gyeong Mun
Clang is not happy with set but unused variable (this is visible with `make LLVM=1` build: drivers/gpu/drm/xe/xe_vm.c:1462:11: error: variable 'number_tiles' set but not used [-Werror,-Wunused-but-set-variable] The use of this variable was removed in the commit mentioned below as "Fixes:" but only its declaration and update remain. It seems like the variable is not used along with the assignment that does not have side effects as far as I can see. Remove those altogether. Fixes: cb99e12ba8cb ("drm/xe: Decouple bind queue last fence from TLB invalidations") Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20251105011311.3177875-1-gwan-gyeong.mun@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2025-11-04drm/xe: Remove unused GT page fault codeMatthew Brost
With the Xe page fault layer and GuC page layer in place, this is now dead code and can be removed. ACC code is also removed, but this was dead code. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-8-matthew.brost@intel.com
2025-11-04drm/xe: Add xe_guc_pagefault layerMatthew Brost
Add xe_guc_pagefault layer (producer) which parses G2H fault messages messages into struct xe_pagefault, forwards them to the page fault layer (consumer) for servicing, and provides a vfunc to acknowledge faults to the GuC upon completion. Replace the old (and incorrect) GT page fault layer with this new layer throughout the driver. As part of this change, the ACC handling code has been removed, as it is dead code that is currently unused. v2: - Include engine instance (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-7-matthew.brost@intel.com
2025-11-04drm/xe: Implement xe_pagefault_queue_workMatthew Brost
Implement a worker that services page faults, using the same implementation as in xe_gt_pagefault.c. v2: - Rebase on exhaustive eviction changes - Include engine instance in debug prints (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-6-matthew.brost@intel.com
2025-11-04drm/xe: Implement xe_pagefault_handlerMatthew Brost
Enqueue (copy) the input struct xe_pagefault into a queue (i.e., into a memory buffer) and schedule a worker to service it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-5-matthew.brost@intel.com
2025-11-04drm/xe: Implement xe_pagefault_resetMatthew Brost
Squash any pending faults on the GT being reset by setting the GT field in struct xe_pagefault to NULL. v4: - Only do reset it page faults queues initialized (CI) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-4-matthew.brost@intel.com
2025-11-04drm/xe: Implement xe_pagefault_initMatthew Brost
Create pagefault queues and initialize them. v2: - Fix kernel doc + add comment for number PF queue (Francois) v4: - Move init after GT init (CI, Francois) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-3-matthew.brost@intel.com
2025-11-04drm/xe: Stub out new pagefault layerMatthew Brost
Stub out the new page fault layer and add kernel documentation. This is intended as a replacement for the GT page fault layer, enabling multiple producers to hook into a shared page fault consumer interface. v2: - Fix kernel doc typo (checkpatch) - Remove comment around GT (Stuart) - Add explaination around reclaim (Francois) - Add comment around u8 vs enum (Francois) - Include engine instance (Stuart) v3: - Fix XE_PAGEFAULT_TYPE_ATOMIC_ACCESS_VIOLATION kernel doc (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-2-matthew.brost@intel.com
2025-11-04drm/xe: Remove last fence dependency check from binds and execsMatthew Brost
Eliminate redundant last fence dependency checks in exec and bind jobs, as they are now equivalent to xe_exec_queue_is_idle. Simplify the code by removing this dead logic. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-7-matthew.brost@intel.com
2025-11-04drm/xe: Disallow input fences on zero batch execs and zero bindsMatthew Brost
Prevent input fences from being installed on zero batch execs or zero binds, which were originally added to support queue idling in Mesa via output fences. Although input fence support was introduced for interface consistency, it leads to incorrect behavior due to chained composite fences, which are disallowed. Avoid the complexity of fixing this by removing support, as input fences for these cases are not used in practice. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-6-matthew.brost@intel.com
2025-11-04drm/xe: Skip TLB invalidation waits in page fault bindsMatthew Brost
Avoid waiting on unrelated TLB invalidations when servicing page fault binds. Since the migrate queue is shared across processes, TLB invalidations triggered by other processes may occur concurrently but are not relevant to the current bind. Teach the bind pipeline to skip waits on such invalidations to prevent unnecessary serialization. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-5-matthew.brost@intel.com
2025-11-04drm/xe: Decouple bind queue last fence from TLB invalidationsMatthew Brost
Separate the bind queue’s last fence to apply exclusively to the bind job, avoiding unnecessary serialization on prior TLB invalidations. Preserve correct user fence signaling by merging bind and TLB invalidation fences later in the pipeline. v3: - Fix lockdep assert for migrate queues (CI) - Use individual dma fence contexts for array out fences (Testing) - Don't set last fence with arrays (Testing) - Move TLB invalid last fence under migrate lock (Testing) - Don't set queue last for migrate queues (Testing) Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6047 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-4-matthew.brost@intel.com
2025-11-04drm/xe: Attach last fence to TLB invalidation job queuesMatthew Brost
Add support for attaching the last fence to TLB invalidation job queues to address serialization issues during bursts of unbind jobs. Ensure that user fence signaling for a bind job reflects both the bind job itself and the last fences of all related TLB invalidations. Maintain submission order based solely on the state of the bind and TLB invalidation queues. Introduce support functions for last fence attachment to TLB invalidation queues. v3: - Fix assert in xe_exec_queue_tlb_inval_last_fence_set (CI) - Ensure migrate lock held for migrate queues (Testing) v5: - Style nits (Thomas) - Rewrite commit message (Thomas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-3-matthew.brost@intel.com
2025-11-04drm/xe: Enforce correct user fence signaling order usingMatthew Brost
Prevent application hangs caused by out-of-order fence signaling when user fences are attached. Use drm_syncobj (via dma-fence-chain) to guarantee that each user fence signals in order, regardless of the signaling order of the attached fences. Ensure user fence writebacks to user space occur in the correct sequence. v7: - Skip drm_syncbj create of error (CI) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com
2025-11-04drm/xe: Do clean shutdown also when using flrJouni Högander
Currently Xe driver is triggering flr without any clean-up on shutdown. This is causing random warnings from pending related works as the underlying hardware is reset in the middle of their execution. Fix this by performing clean shutdown also when using flr. Fixes: 501d799a47e2 ("drm/xe: Wire up device shutdown handler") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://patch.msgid.link/20251031122312.1836534-1-jouni.hogander@intel.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-11-03drm/xe/guc: Synchronize Dead CT worker with unbindBalasubramani Vivekanandan
Cancel and wait for any Dead CT worker to complete before continuing with device unbinding. Else the worker will end up using resources freed by the undind operation. Cc: Zhanjun Dong <zhanjun.dong@intel.com> Fixes: d2c5a5a926f4 ("drm/xe/guc: Dead CT helper") Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-6-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-03drm/xe/gt: Synchronize GT reset with device unbindBalasubramani Vivekanandan
When unbinding wait for any GT reset in progress to complete. Unbinding will release the mmio mapping but mmio operations are performed during GT reset causing Kernel panic. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-5-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-03drm/xe/display: Use display parent interface for xe runtime pmJouni Högander
Start using display parent interface for xe runtime pm. v2: keep xe_display_rpm.c Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20251030202836.1815680-7-jouni.hogander@intel.com
2025-11-03drm/xe/display: Runtime pm wrappers for display parent interfaceJouni Högander
Implement runtime pm wrappers for xe driver and add them into display parent interface. v3: - drop useless include - drop xe_display_rpm_{get, put}_raw v2: - move xe_display_rpm_interface code into xe_display_rpm.c - rename xe_rpm as xe_display_rpm Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20251030202836.1815680-5-jouni.hogander@intel.com
2025-11-03drm/{i915, xe}/display: pass parent interface to display probeJani Nikula
Let's gradually start calling i915 and xe parent, or core, drivers from display via function pointers passed at display probe. Going forward, the struct intel_display_parent_interface is expected to include const pointers to sub-structs by functionality, for example: struct intel_display_rpm { struct ref_tracker *(*get)(struct drm_device *drm); /* ... */ }; struct intel_display_parent_interface { /* ... */ const struct intel_display_rpm *rpm; }; This is a baby step towards not building display as part of both i915 and xe drivers, but rather making it an independent driver interfacing with the two. v3: useless include additions dropped v2: unrelated include removal dropped Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20251030202836.1815680-2-jouni.hogander@intel.com
2025-11-02drm/xe: Inline gt_reset in the workerLucas De Marchi
gt_reset() doesn't make sense by itself: it can only be called as part of the worker. Inline it there to avoid it being called from elsewhere and clarify the gt_reset() vs do_gt_reset() paths. Note that the error return from gt_reset() was just being ignored. Also add a comment to the xe_pm_runtime_put() to make sure the get()/put() pair is clear. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251031222244.37735-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-01drm/i915/ltphy: Phy lane reset for LT PhySuraj Kandpal
Define function to bring phy lane out of reset for LT Phy and the corresponding pre-requisite steps before we follow the steps for Phy lane reset. Also create a skeleton of LT PHY PLL enable sequence function in which we can place this function Bspec: 77449, 74749, 74499, 74495, 68960 Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patch.msgid.link/20251101032513.4171255-4-suraj.kandpal@intel.com
2025-10-31drm/xe/pf: Allow to stop the VF using sysfsMichal Wajdeczko
It is expected that VFs activity will be monitored and in some cases admin might want to silence specific VF without killing the VM where it was attached. Add write-only attribute to stop GuC scheduling at VFs level. /sys/bus/pci/drivers/xe/BDF/ ├── sriov_admin/ ├── vf1/ │ └── stop [WO] bool ├── vf2/ │ └── stop [WO] bool Writing "1" or "y" (or whatever is recognized by the strtobool() function) to this file will trigger the change of the VF state to STOP (GuC will stop servicing the VF). To go back to a READY state (to allow GuC to service this VF again) the VF FLR must be triggered (which can be done by writing 1 to device/reset file). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-17-michal.wajdeczko@intel.com
2025-10-31drm/xe/pf: Add sysfs device symlinks to enabled VFsMichal Wajdeczko
For convenience, for every enabled VF add 'device' symlink from our SR-IOV admin VF folder to enabled sysfs PCI VF device entry. Remove all those links when disabling PCI VFs. For completeness, add static 'device' symlink for the PF itself. /sys/bus/pci/drivers/xe/BDF/sriov_admin/ ├── pf │   └── device -> ../../../BDF # PF BDF ├── vf1 │   └── device -> ../../../BDF' # VF1 BDF ├── vf2 │   └── device -> ../../../BDF" # VF2 BDF Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-16-michal.wajdeczko@intel.com
2025-10-31drm/xe/pf: Promote xe_pci_sriov_get_vf_pdevMichal Wajdeczko
In the upcoming patch we would like to use this private helper during preparation of the sysfs links. Promote it. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251030222348.186658-15-michal.wajdeczko@intel.com
2025-10-31drm/xe/pf: Allow change PF scheduling priority using sysfsMichal Wajdeczko
We have just added bulk change of the scheduling priority for all VFs and PF, but that only allow to select LOW and NORMAL priority. Add read-write attribute under PF to allow changing its priority without impacting other VFs priority settings. For completeness also add read-only attributes under VFs, to show currently selected priority levels used by the VFs. /sys/bus/pci/drivers/xe/BDF/ ├── sriov_admin/ ├── pf/ │ └── profile │ └── sched_priority [RW] low, normal, high ├── vf1/ │ └── profile │ └── sched_priority [RO] low, normal Writing "high" to the PF read-write attribute will change PF priority on all tiles/GTs to HIGH (schedule function in the next time-slice after current one completes and it has work). Writing "low" or "normal" to change priority to LOW/NORMAL is supported. When read, those files will display the current and available scheduling priorities. The currently active priority level will be enclosed in square brackets, default output will be like: $ grep . -h sriov_admin/{pf,vf1,vf2}/profile/sched_priority [low] normal high [low] normal [low] normal Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-14-michal.wajdeczko@intel.com
2025-10-31drm/xe/pf: Allow bulk change all VFs priority using sysfsMichal Wajdeczko
It is expected to be a common practice to configure the same level of scheduling priority across all VFs and PF (at least as starting point). Due to current GuC FW limitations it is also the only way to change VFs priority. Add write-only sysfs attribute that will apply required priority level to all VFs and PF at once. /sys/bus/pci/drivers/xe/BDF/ ├── sriov_admin/ ├── .bulk_profile │   └── sched_priority [WO] low, normal Writing "low" to this write-only attribute will change PF and VFs scheduling priority on all tiles/GTs to LOW (function will be scheduled only if it has work submitted). Similarly, writing "normal" will change functions priority to NORMAL (functions will be scheduled irrespective of whether there is a work or not). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-13-michal.wajdeczko@intel.com
2025-10-31drm/xe/pf: Add functions to provision scheduling priorityMichal Wajdeczko
We already have function to configure PF (or VF) scheduling priority on a single GT, but we also need function that will cover all tiles and GTs. However, due to the current GuC FW limitation, we can't always rely on per-GT function as it actually only works for the PF case. The only way to change VFs scheduling priority is to use 'sched_if_idle' policy KLV that will change priorities for all VFs (and the PF). We will use these new functions in the upcoming patches. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251030222348.186658-12-michal.wajdeczko@intel.com