summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
32 hoursMerge tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "This is the weekly fixes for what is in next tree, mostly amdgpu and some i915, panthor and a core revert. core: - revert dumb bo 8 byte alignment amdgpu: - SI fix - DC reduce stack usage - HDMI fixes - VCN 4.0.5 fix - DP MST fix - DC memory allocation fix amdkfd: - SVM fix - Trap handler fix - VGPR fixes for GC 11.5 i915: - Fix format string truncation warning - FIx runtime PM reference during fbdev BO creation panthor: - fix UAF renesas: - fix sync flag handling" * tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/amd/display: Fix pbn to kbps Conversion" drm/amd: Fix unbind/rebind for VCN 4.0.5 drm/i915: Fix format string truncation warning drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation drm/amd/display: Improve HDMI info retrieval drm/amdkfd: bump minimum vgpr size for gfx1151 drm/amd/display: shrink struct members drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace drm/amd/display: Refactor dml_core_mode_support to reduce stack frame drm/amdgpu: don't attach the tlb fence for SI drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state() drm/amdkfd: Trap handler support for expert scheduling mode drm/amdkfd: Use huge page size to check split svm range alignment drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC drm/gem-shmem: revert the 8-byte alignment constraint drm/gem-dma: revert the 8-byte alignment constraint drm/panthor: Prevent potential UAF in group creation
4 daysdrm/amd: Fix unbind/rebind for VCN 4.0.5Mario Limonciello (AMD)
Unbinding amdgpu has no problems, but binding it again leads to an error of sysfs file already existing. This is because it wasn't actually cleaned up on unbind. Add the missing cleanup step. Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d717e62e9b6ccff0e3cec78a58dfbd00858448b3) Cc: stable@vger.kernel.org
6 daysdrm/amdgpu: don't attach the tlb fence for SIAlex Deucher
SI hardware doesn't support pasids, user mode queues, or KIQ/MES so there is no need for this. Doing so results in a segfault as these callbacks are non-existent for SI. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4744 Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 820b3d376e8a102c6aeab737ec6edebbbb710e04)
9 daysMerge tag 'pci-v6.19-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan Williams) - Switch vmd from custom domain number allocator to the common allocator to prevent a potential race with new non-VMD buses (Dan Williams) - Enable Precision Time Measurement (PTM) only if device advertises support for a relevant role, to prevent invalid PTM Requests that cause ACS violations that are reported as AER Uncorrectable Non-Fatal errors (Mika Westerberg) Resource management: - Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen) - Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen) - Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu drivers so the PCI core can restore BARs if the resize fails (Ilpo Järvinen) - Move Resizable BAR code to rebar.c (Ilpo Järvinen) - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen) - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen) Power management and error handling: - For drivers using PCI legacy suspend, save config state at suspend so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - For devices with no driver or a driver that lacks power management, save config state at hibernate so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - Save device config space on device addition, before driver binding, so error recovery works more reliably (Lukas Wunner) - Drop pci_save_state() from several drivers that no longer need it since the PCI core always does it and pci_restore_state() no longer invalidates the saved state (Lukas Wunner) - Document use of pci_save_state() by drivers to capture the state they want restored during error recovery (Lukas Wunner) Power control: - Add a struct pci_ops.assert_perst() function pointer to assert/deassert PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya Chundru) - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch, which must be held in reset after poweron so the pwrctrl driver can configure the switch via I2C before bringing up the links (Krishna Chaitanya Chundru) Endpoint framework: - Convert the endpoint doorbell test to use a threaded IRQ to fix a 'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri) - Add endpoint VNTB MSI doorbell support to reduce latency between host and endpoint (Frank Li) New native PCIe controller drivers: - Add CIX Sky1 host controller DT binding and driver (Hans Zhang) - Add NXP S32G host controller DT binding and driver (Vincent Guittot) - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu Beznea) - Add SpacemiT K1 host controller DT binding and driver (Alex Elder) Amlogic Meson PCIe controller driver: - Update DT binding to name DBI region 'dbi', not 'elbi', and update driver to support both (Manivannan Sadhasivam) Apple PCIe controller driver: - Move struct pci_host_bridge allocation from pci_host_common_init() to callers, which significantly simplifies pcie-apple (Marc Zyngier) Broadcom STB PCIe controller driver: - Disable advertising ASPM L0s support correctly (Jim Quinlan) - Add a panic/die handler to print diagnostic info in case PCIe caused an unrecoverable abort (Jim Quinlan) Cadence PCIe controller driver: - Add module support for Cadence platform host and endpoint controller driver (Manikandan K Pillai) - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare for new CIX Sky1 driver (Manikandan K Pillai) MediaTek PCIe controller driver: - Convert DT binding to YAML schema (Christian Marangi) - Add Airoha AN7583 DT compatible and driver support (Christian Marangi) Qualcomm PCIe controller driver: - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu) - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280, sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT schemas (Krzysztof Kozlowski) - Look up OPP using both frequency and data rate (not just frequency) so RPMh votes can account for both (Krishna Chaitanya Chundru) Rockchip DesignWare PCIe controller driver: - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi) STMicroelectronics STM32MP25 PCIe controller driver: - Fix a race between link training and endpoint register initialization (Christian Bruel) - Align endpoint allocations to match the ATU requirements (Christian Bruel) Synopsys DesignWare PCIe controller driver: - Clear L1 PM Substate Capability 'Supported' bits unless glue driver says it's supported, which prevents users from enabling non-working L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas) - Remove now-superfluous L1SS disable code from tegra194 (Bjorn Helgaas) - Configure L1SS support in dw-rockchip when DT says 'supports-clkreq' (Shawn Lin) TI Keystone PCIe controller driver: - Fail the probe instead of silently succeeding if ks_pcie_of_data didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli) - Make keystone buildable as a loadable module, except on ARM32 where hook_fault_code() is __init (Siddharth Vadapalli)" * tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits) MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer PCI: sky1: Add PCIe host support for CIX Sky1 dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings PCI: cadence: Add support for High Perf Architecture (HPA) controller MAINTAINERS: Add NXP S32G PCIe controller driver maintainer PCI: s32g: Add NXP S32G PCIe controller driver (RC) PCI: dwc: Add register and bitfield definitions dt-bindings: PCI: s32g: Add NXP S32G PCIe controller PCI: Add Renesas RZ/G3S host controller driver PCI: host-generic: Move bridge allocation outside of pci_host_common_init() dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding PCI: Validate pci_rebar_size_supported() input Documentation: PCI: Amend error recovery doc with pci_save_state() rules treewide: Drop pci_save_state() after pci_restore_state() PCI/ERR: Ensure error recoverability at all times PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths PCI: dw-rockchip: Configure L1SS support PCI: tegra194: Remove unnecessary L1SS disable code ...
10 daysMerge tag 'drm-next-2025-12-03' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm updates from Dave Airlie: "There was a rather late merge of a new color pipeline feature, that some userspace projects are blocked on, and has seen a lot of work in amdgpu. This should have seen some time in -next. There is additional support for this for Intel, that if it arrives in the next day or two I'll pass it on in another pull request and you can decide if you want to take it. Highlights: - Arm Ethos NPU accelerator driver - new DRM color pipeline support - amdgpu will now run discrete SI/CIK cards instead of radeon, which enables vulkan support in userspace - msm gets gen8 gpu support - initial Xe3P support in xe Full detail summary: New driver: - Arm Ethos-U65/U85 accel driver Core: - support the drm color pipeline in vkms/amdgfx - add support for drm colorop pipeline - add COLOR PIPELINE plane property - add DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE - throttle dirty worker with vblank - use drm_for_each_bridge_in_chain_scoped in drm's bridge code - Ensure drm_client_modeset tests are enabled in UML - add simulated vblank interrupt - use in drivers - dumb buffer sizing helper - move freeing of drm client memory to driver - crtc sharpness strength property - stop using system_wq in scheduler/drivers - support emergency restore in drm-client Rust: - make slice::as_flattened usable on all supported rustc - add FromBytes::from_bytes_prefix() method - remove redundant device ptr from Rust GEM object - Change how AlwaysRefCounted is implemented for GEM objects gpuvm: - Add deferred vm_bo cleanup to GPUVM (for rust) atomic: - cleanup and improve state handling interfaces buddy: - optimize block management dma-buf: - heaps: Create heap per CMA reserved location - improve userspace documentation dp: - add POST_LT_ADJ_REQ training sequence - DPCD dSC quirk for synaptics panamera devices - helpers to query branch DSC max throughput ttm: - Rename ttm_bo_put to ttm_bo_fini - allow page protection flags on risc-v - rework pipelined eviction fence handling amdgpu: - enable amdgpu by default for SI/CI dGPUs - enable DC by default on SI - refactor CIK/SI enablement - add ABM KMS property - Re-enable DM idle optimizations - DC Analog encoders support - Powerplay fixes for fiji/iceland - Enable DC on bonaire by default - HMM cleanup - Add new RAS framework - DML2.1 updates - YCbCr420 fixes - DC FP fixes - DMUB fixes - LTTPR fixes - DTBCLK fixes - DMU cursor offload handling - Userq validation improvements - Unify shutdown callback handling - Suspend improvements - Power limit code cleanup - SR-IOV fixes - AUX backlight fixes - DCN 3.5 fixes - HDMI compliance fixes - DCN 4.0.1 cursor updates - DCN interrupt fix - DC KMS full update improvements - Add additional HDCP traces - DCN 3.2 fixes - DP MST fixes - Add support for new SR-IOV mailbox interface - UQ reset support - HDP flush rework - VCE1 support amdkfd: - HMM cleanups - Relax checks on save area overallocations - Fix GPU mappings after prefetch radeon: - refactor CIK/SI enablement xe: - Initial Xe3P support - panic support on VRAM for display - fix stolen size check - Loosen used tracking restriction - New SR-IOV debugfs structure and debugfs updates - Hide the GPU madvise flag behind a VM_BIND flag - Always expose VRAM provisioning data on discrete GPUs - Allow VRAM mappings for userptr when used with SVM - Allow pinning of p2p dma-buf - Use per-tile debugfs where appropriate - Add documentation for Execution Queues - PF improvements - VF migration recovery redesign work - User / Kernel VRAM partitioning - Update Tile-based messages - Allow configfs to disable specific GT types - VF provisioning and migration improvements - use SVM range helpers in PT layer - Initial CRI support - access VF registers using dedicated MMIO view - limit number of jobs per exec queue - add sriov_admin sysfs tree - more crescent island specific support - debugfs residency counter - SRIOV migration work - runtime registers for GFX 35 i915: - add initial Xe3p_LPD display version 35 support - Enable LNL+ content adaptive sharpness filter - Use optimized VRR guardband - Enable Xe3p LT PHY - enable FBC support for Xe3p_LPD display - add display 30.02 firmware support - refactor SKL+ watermark latency setup - refactor fbdev handling - call i915/xe runtime PM via function pointers - refactor i915/xe stolen memory/display interfaces - use display version instead of gfx version in display code - extend i915_display_info with Type-C port details - lots of display cleanups/refactorings - set O_LARGEFILE in __create_shmem - skuip guc communication warning on reset - fix time conversions - defeature DRRS on LNL+ - refactor intel_frontbuffer split between i915/xe/display - convert inteL_rom interfaces to struct drm_device - unify display register polling interfaces - aovid lock inversion when pinning to GGTT on CHV/BXT+VTD panel: - Add KD116N3730A08/A12, chromebook mt8189 - JT101TM023, LQ079L1SX01, - GLD070WX3-SL01 MIPI DSI - Samsung LTL106AL0, Samsung LTL106AL01 - Raystar RFF500F-AWH-DNN - Winstar WF70A8SYJHLNGA - Wanchanglong w552946aaa - Samsung SOFEF00 - Lenovo X13s panel - ilitek-ili9881c - add rpi 5" support - visionx-rm69299 - add backlight support - edp - support AUI B116XAN02.0 bridge: - improve ref counting - ti-sn65dsi86 - add support for DP mode with HPD - synopsis: support CEC, init timer with correct freq - ASL CS5263 DP-to-HDMI bridge support nova-core: - introduce bitfield! macro - introduce safe integer converters - GSP inits to fully booted state on Ampere - Use more future-proof register for GPU identification nova-drm: - select NOVA_CORE - 64-bit only nouveau: - improve reclocking on tegra 186+ - add large page and compression support msm: - GPU: - Gen8 support: A840 (Kaanapali) and X2-85 (Glymur) - A612 support - MDSS: - Added support for Glymur and QCS8300 platforms - DPU: - Enabled Quad-Pipe support, unlocking higher resolutions support - Added support for Glymur platform - Documented DPU on QCS8300 platform as supported - DisplayPort: - Added support for Glymur platform - Added support lame remapping inside DP block - Documented DisplayPort controller on QCS8300 and SM6150/QCS615 as supported tegra: - NVJPG driver panfrost: - display JM contexts over debugfs - export JM contexts to userspace - improve error and job handling panthor: - support custom ASN_HASH for mt8196 - support mali-G1 GPU - flush shmem write before mapping buffers uncached - make timeout per-queue instead of per-job mediatek: - MT8195/88 HDMIv2/DDCv2 support rockchip: - dsi: add support for RK3368 amdxdna: - enhance runtime PM - last hardware error reading uapi - support firmware debug output - add resource and telemetry data uapi - preemption support imx: - add driver for HDMI TX Parallel audio interface ivpu: - add support for user-managed preemption buffer - add userptr support - update JSM firware API to 3.33.0 - add better alloc/free warnings - fix page fault in unbind all bos - rework bind/unbind of imported buffers - enable MCA ECC signalling - split fw runtime and global memory buffers - add fdinfo memory statistics tidss: - convert to drm logging - logging cleanup ast: - refactor generation init paths - add per chip generation detect_tx_chip - set quirks for each chip model atmel-hlcdc: - set LCDC_ATTRE register in plane disable - set correct values for plane scaler solomon: - use drm helper for get_modes and move_valid sitronix: - fix output position when clearing screens qaic: - support dma-buf exports - support new firmware's READ_DATA implementation - sahara AIC200 image table update - add sysfs support - add coredump support - add uevents support - PM support sun4i: - layer refactors to decouple plane from output - improve DE33 support vc4: - switch to generic CEC helpers komeda: - use drm_ logging functions vkms: - configfs support for display configuration vgem: - fix fence timer deadlock etnaviv: - add HWDB entry for GC8000 Nano Ultra VIP r6205" * tag 'drm-next-2025-12-03' of https://gitlab.freedesktop.org/drm/kernel: (1869 commits) Revert "drm/amd: Skip power ungate during suspend for VPE" drm/amdgpu: use common defines for HUB faults drm/amdgpu/gmc12: add amdgpu_vm_handle_fault() handling drm/amdgpu/gmc11: add amdgpu_vm_handle_fault() handling drm/amdgpu: use static ids for ACP platform devs drm/amdgpu/sdma6: Update SDMA 6.0.3 FW version to include UMQ protected-fence fix drm/amdgpu: Forward VMID reservation errors drm/amdgpu/gmc8: Delegate VM faults to soft IRQ handler ring drm/amdgpu/gmc7: Delegate VM faults to soft IRQ handler ring drm/amdgpu/gmc6: Delegate VM faults to soft IRQ handler ring drm/amdgpu/gmc6: Cache VM fault info drm/amdgpu/gmc6: Don't print MC client as it's unknown drm/amdgpu/cz_ih: Enable soft IRQ handler ring drm/amdgpu/tonga_ih: Enable soft IRQ handler ring drm/amdgpu/iceland_ih: Enable soft IRQ handler ring drm/amdgpu/cik_ih: Enable soft IRQ handler ring drm/amdgpu/si_ih: Enable soft IRQ handler ring drm/amd/display: fix typo in display_mode_core_structs.h drm/amd/display: fix Smart Power OLED not working after S4 drm/amd/display: Move RGB-type check for audio sync to DCE HW sequence ...
11 daysMerge tag 'printk-for-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Allow creaing nbcon console drivers with an unsafe write_atomic() callback that can only be called by the final nbcon_atomic_flush_unsafe(). Otherwise, the driver would rely on the kthread. It is going to be used as the-best-effort approach for an experimental nbcon netconsole driver, see https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org Note that a safe .write_atomic() callback is supposed to work in NMI context. But some networking drivers are not safe even in IRQ context: https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35 In an ideal world, all networking drivers would be fixed first and the atomic flush would be blocked only in NMI context. But it brings the question how reliable networking drivers are when the system is in a bad state. They might block flushing more reliable serial consoles which are more suitable for serious debugging anyway. - Allow to use the last 4 bytes of the printk ring buffer. - Prevent queuing IRQ work and block printk kthreads when consoles are suspended. Otherwise, they create non-necessary churn or even block the suspend. - Release console_lock() between each record in the kthread used for legacy consoles on RT. It might significantly speed up the boot. - Release nbcon context between each record in the atomic flush. It prevents stalls of the related printk kthread after it has lost the ownership in the middle of a record - Add support for NBCON consoles into KDB - Add %ptsP modifier for printing struct timespec64 and use it where possible - Misc code clean up * tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits) printk: Use console_is_usable on console_unblank arch: um: kmsg_dump: Use console_is_usable drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT lib/vsprintf: Unify FORMAT_STATE_NUM handlers printk: Avoid irq_work for printk_deferred() on suspend printk: Avoid scheduling irq_work on suspend printk: Allow printk_trigger_flush() to flush all types tracing: Switch to use %ptSp scsi: snic: Switch to use %ptSp scsi: fnic: Switch to use %ptSp s390/dasd: Switch to use %ptSp ptp: ocp: Switch to use %ptSp pps: Switch to use %ptSp PCI: epf-test: Switch to use %ptSp net: dsa: sja1105: Switch to use %ptSp mmc: mmc_test: Switch to use %ptSp media: av7110: Switch to use %ptSp ipmi: Switch to use %ptSp igb: Switch to use %ptSp e1000e: Switch to use %ptSp ...
12 daysMerge tag 'amd-drm-next-6.19-2025-12-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.19-2025-12-02: amdgpu: - Unified MES fix - SMU 11 unbalanced irq fix - Fix for driver reloading on APUs - pp_table sysfs fix - Fix memory leak in fence handling - HDMI fix - DC cursor fixes - eDP panel parsing fix - Brightness fix - DC analog fixes - EDID retry fixes - UserQ fixes - RAS fixes - IP discovery fix - Add missing locking in amdgpu_ttm_access_memory_sdma() - Smart Power OLED fix - PRT and page fault fixes for GC 6-8 - VMID reservation fix - ACP platform device fix - Add missing vm fault handling for GC 11-12 - VPE fix amdkfd: - Partitioning fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20251202220101.2039347-1-alexander.deucher@amd.com
12 daysRevert "drm/amd: Skip power ungate during suspend for VPE"Mario Limonciello (AMD)
Skipping power ungate exposed some scenarios that will fail like below: ``` amdgpu: Register(0) [regVPEC_QUEUE_RESET_REQ] failed to reach value 0x00000000 != 0x00000001n amdgpu 0000:c1:00.0: amdgpu: VPE queue reset failed ... amdgpu: [drm] *ERROR* wait_for_completion_timeout timeout! ``` The underlying s2idle issue that prompted this commit is going to be fixed in BIOS. This reverts commit 2a6c826cfeedd7714611ac115371a959ead55bda. Fixes: 2a6c826cfeed ("drm/amd: Skip power ungate during suspend for VPE") Cc: stable@vger.kernel.org Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Konstantin <answer2019@yandex.ru> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220812 Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu: use common defines for HUB faultsAlex Deucher
Use common definitions for the fault bits in the IH sourc data for the gmc9-12 memory hub faults Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc12: add amdgpu_vm_handle_fault() handlingAlex Deucher
We need to call amdgpu_vm_handle_fault() on page fault on all gfx9 and newer parts to properly update the page tables, not just for recoverable page faults. Cc: stable@vger.kernel.org Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc11: add amdgpu_vm_handle_fault() handlingAlex Deucher
We need to call amdgpu_vm_handle_fault() on page fault on all gfx9 and newer parts to properly update the page tables, not just for recoverable page faults. Cc: stable@vger.kernel.org Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu: use static ids for ACP platform devsBrady Norander
mfd_add_hotplug_devices() assigns child platform devices with PLATFORM_DEVID_AUTO, but the ACP machine drivers expect the platform device names to never change. Use mfd_add_devices() instead and give each cell a unique id. Signed-off-by: Brady Norander <bradynorander@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/sdma6: Update SDMA 6.0.3 FW version to include UMQ ↵Srinivasan Shanmugam
protected-fence fix On GFX11.0.3, earlier SDMA firmware versions issue the PROTECTED_FENCE write from the user VMID (e.g. VMID 8) instead of VMID 0. This causes a GPU VM protection fault when SDMA tries to write the secure fence location, as seen in the UMQ SDMA test (cs-sdma-with-IP-DMA-UMQ) Fixes the below GPU page fault: [ 514.037189] amdgpu 0000:0b:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32770) [ 514.037199] amdgpu 0000:0b:00.0: amdgpu: Process pid 0 thread pid 0 [ 514.037205] amdgpu 0000:0b:00.0: amdgpu: in page starting at address 0x00007fff00409000 from client 10 [ 514.037212] amdgpu 0000:0b:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841A51 [ 514.037217] amdgpu 0000:0b:00.0: amdgpu: Faulty UTCL2 client ID: SDMA0 (0xd) [ 514.037223] amdgpu 0000:0b:00.0: amdgpu: MORE_FAULTS: 0x1 [ 514.037227] amdgpu 0000:0b:00.0: amdgpu: WALKER_ERROR: 0x0 [ 514.037232] amdgpu 0000:0b:00.0: amdgpu: PERMISSION_FAULTS: 0x5 [ 514.037236] amdgpu 0000:0b:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 514.037241] amdgpu 0000:0b:00.0: amdgpu: RW: 0x1 v2: Updated commit message v3: s/gfx11.0.3/sdma 6.0.3/ in patch title (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu: Forward VMID reservation errorsNatalie Vock
Otherwise userspace may be fooled into believing it has a reserved VMID when in reality it doesn't, ultimately leading to GPU hangs when SPM is used. Fixes: 80e709ee6ecc ("drm/amdgpu: add option params to enforce process isolation between graphics and compute") Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Natalie Vock <natalie.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc8: Delegate VM faults to soft IRQ handler ringTimur Kristóf
On old GPUs, it may be an issue that handling the interrupts from VM faults is too slow and the interrupt handler (IH) ring may overflow, which can cause an eventual hang. Delegate the processing of all VM faults to the soft IRQ handler ring. As a result, we spend much less time in the IRQ handler that interacts with the HW IH ring, which significantly reduces the chance of hangs/reboots. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc7: Delegate VM faults to soft IRQ handler ringTimur Kristóf
On old GPUs, it may be an issue that handling the interrupts from VM faults is too slow and the interrupt handler (IH) ring may overflow, which can cause an eventual hang. Delegate the processing of all VM faults to the soft IRQ handler ring. As a result, we spend much less time in the IRQ handler that interacts with the HW IH ring, which significantly reduces the chance of hangs/reboots. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc6: Delegate VM faults to soft IRQ handler ringTimur Kristóf
On old GPUs, it may be an issue that handling the interrupts from VM faults is too slow and the interrupt handler (IH) ring may overflow, which can cause an eventual hang. Delegate the processing of all VM faults to the soft IRQ handler ring. As a result, we spend much less time in the IRQ handler that interacts with the HW IH ring, which significantly reduces the chance of hangs/reboots. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc6: Cache VM fault infoTimur Kristóf
Call amdgpu_vm_update_fault_cache on GMC v6 similarly to how we do in GMC v7-v8 so that VM fault info can be used later by userspace for debugging. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/gmc6: Don't print MC client as it's unknownTimur Kristóf
The VM_CONTEXT1_PROTECTION_FAULT_MCCLIENT register doesn't exist on GMC v6 so we can't print the MC client as a string like we do on GMC v7-v8. However, we still print the mc_id from VM_CONTEXT1_PROTECTION_FAULT_STATUS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/cz_ih: Enable soft IRQ handler ringTimur Kristóf
We are going to use the soft IRQ handler ring on GMC v8 to process interrupts from VM faults. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/tonga_ih: Enable soft IRQ handler ringTimur Kristóf
We are going to use the soft IRQ handler ring on GMC v8 to process interrupts from VM faults. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/iceland_ih: Enable soft IRQ handler ringTimur Kristóf
We are going to use the soft IRQ handler ring on GMC v8 to process interrupts from VM faults. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/cik_ih: Enable soft IRQ handler ringTimur Kristóf
We are going to use the soft IRQ handler ring on GMC v7 (CIK) to process interrupts from VM faults. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu/si_ih: Enable soft IRQ handler ringTimur Kristóf
We are going to use the soft IRQ handler ring on GMC v6 (SI) to process interrupts from VM faults. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysdrm/amdgpu: add missing lock to amdgpu_ttm_access_memory_sdmaPierre-Eric Pelloux-Prayer
Users of ttm entities need to hold the gtt_window_lock before using them to guarantee proper ordering of jobs. Cc: stable@vger.kernel.org Fixes: cb5cc4f573e1 ("drm/amdgpu: improve debug VRAM access performance using sdma") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysMerge tag 'drm-misc-next-2025-12-01-1' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Extra drm-misc-next for v6.19-rc1: UAPI Changes: - Add support for drm colorop pipeline. - Add COLOR PIPELINE plane property. - Add DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE. Cross-subsystem Changes: - Attempt to use higher order mappings in system heap allocator. - Always taint kernel with sw-sync. Core Changes: - Small fixes to drm/gem. - Support emergency restore to drm-client. - Allocate and release fb_info in single place. - Rework ttm pipelined eviction fence handling. Driver Changes: - Support the drm color pipeline in vkms, amdgfx. - Add NVJPG driver for tegra. - Assorted small fixes and updates to rockchip, bridge/dw-hdmi-qp, panthor. - Add ASL CS5263 DP-to-HDMI simple bridge. - Add and improve support for G LD070WX3-SL01 MIPI DSI, Samsung LTL106AL0, Samsung LTL106AL01, Raystar RFF500F-AWH-DNN, Winstar WF70A8SYJHLNGA, Wanchanglong w552946aaa, Samsung SOFEF00, Lenovo X13s panel. - Add support for it66122 to it66121. - Support mali-G1 gpu in panthor. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/aa5cbd50-7676-4a59-bbed-e8428af86804@linux.intel.com
2025-11-26drm/amdgpu: fix cyan_skillfish2 gpu info fw handlingAlex Deucher
If the board supports IP discovery, we don't need to parse the gpu info firmware. Backport to 6.18. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4721 Fixes: fa819e3a7c1e ("drm/amdgpu: add support for cyan skillfish gpu_info") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5427e32fa3a0ba9a016db83877851ed277b065fb)
2025-11-26drm/amdgpu: attach tlb fence to the PTs updatePrike Liang
Ensure the userq TLB flush is emitted only after the VM update finishes and the PT BOs have been annotated with bookkeeping fences. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f3854e04b708d73276c4488231a8bd66d30b4671) Cc: stable@vger.kernel.org
2025-11-26drm/amdgpu: fix cyan_skillfish2 gpu info fw handlingAlex Deucher
If the board supports IP discovery, we don't need to parse the gpu info firmware. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4721 Fixes: fa819e3a7c1e ("drm/amdgpu: add support for cyan skillfish gpu_info") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-26drm/amdgpu: Fix CPER ring debugfs read buffer overflow riskSrinivasan Shanmugam
The CPER ring debugfs read code always writes a 12-byte header when the file is read for the first time (*offset == 0): copy_to_user(buf, ring_header, 12); But the code never checks whether the user buffer (@size) is at least 12 bytes long. After writing the 12-byte header, the code then gives the full original @size to the CPER payload handler: record_req->buf_size = size; This means the function can write: 12 bytes (header) + payload bytes (up to @size) into a buffer that is only @size bytes big. In other words, the kernel may write more data than the user asked for. This can overflow the user buffer. The fix is: - If the user buffer is smaller than 12 bytes on the first read, return -EINVAL instead of copying the header. - After writing the 12-byte header, subtract 12 from @size and pass the reduced size to record_req->buf_size. This ensures the CPER payload only uses the remaining free space in the buffer. Reads after the first one (*offset != 0) do not write the header, so their behavior stays exactly the same. The only user-visible change is that tiny buffers now fail safely instead of risking an overflow. Fixes: drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:523 amdgpu_ras_cper_debugfs_read() warn: userbuf overflow? is 'ring_header_size' <= 'size' Fixes: 527e3d40339b ("drm/amd/ras: Add CPER ring read for uniras") Reported by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Xiang Liu <xiang.liu@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-26drm/amdgpu: attach tlb fence to the PTs updatePrike Liang
Ensure the userq TLB flush is emitted only after the VM update finishes and the PT BOs have been annotated with bookkeeping fences. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-26drm/amdgpu: free job fences on failure in amdgpu_job_alloc_with_ibPierre-Eric Pelloux-Prayer
Otherwise we're leaking memory. Fixes: db36632ea51e ("drm/amdgpu: clean up and unify hw fence handling") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-26drm/amdgpu: clear job on failure in amdgpu_job_alloc(_with_ib)Pierre-Eric Pelloux-Prayer
If memory is freed we need to nullify the pointer or the caller might call kfree again (eg: amdgpu_cs_parser_fini calls kfree on all non-null job pointers). Fixes: db36632ea51e ("drm/amdgpu: clean up and unify hw fence handling") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-26drm/amdgpu: use ttm_resource_manager_cleanupPierre-Eric Pelloux-Prayer
Rather than open-coding it. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20251121101315.3585-3-pierre-eric.pelloux-prayer@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2025-11-24drm/amd/amdgpu: reserve vm invalidation engine for uni_mesMichael Chen
Reserve vm invalidation engine 6 when uni_mes enabled. It is used in processing tlb flush request from host. Signed-off-by: Michael Chen <michael.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 873373739b9b150720ea2c5390b4e904a4d21505) Cc: stable@vger.kernel.org
2025-11-24Revert "drm/amd: fix gfx hang on renoir in IGT reload test"Rodrigo Siqueira
The original patch introduced additional latency during boot time because it triggers a driver reload to avoid a CP hang when the driver is reloaded multiple times. This has been addressed with a more generic solution that triggers the GPU reset only during the unload phase, avoiding extra latency during boot time. For this reason, this commit reverts the original change. This reverts commit 72a98763b473890e6605604bfcaf71fc212b4720. This patch should only be applied if commit: 4355e61835e7 ("drm/amdgpu: Fix GFX hang on SteamDeck when amdgpu is reloaded") is present. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-24drm/amdgpu: Fix GFX hang on SteamDeck when amdgpu is reloadedRodrigo Siqueira
When trying to unload amdgpu in the SteamDeck (TTY mode), the following set of errors happens and the system gets unstable: [..] [drm] Initialized amdgpu 3.64.0 for 0000:04:00.0 on minor 0 amdgpu 0000:04:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx_0.0.0 (-110). amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110). [..] amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff! amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff! [..] When the driver initializes the GPU, the PSP validates all the firmware loaded, and after that, it is not possible to load any other firmware unless the device is reset. What is happening in the load/unload situation is that PSP halts the GC engine because it suspects that something is amiss. To address this issue, this commit ensures that the GPU is reset (mode 2 reset) in the unload sequence. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-24drm/amd/amdgpu: reserve vm invalidation engine for uni_mesMichael Chen
Reserve vm invalidation engine 6 when uni_mes enabled. It is used in processing tlb flush request from host. Signed-off-by: Michael Chen <michael.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-21Merge tag 'v6.18-rc6' into drm-nextDave Airlie
Linux 6.18-rc6 Backmerge in order to merge msm next Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-19drm/amdgpu: Add sriov vf check for VCN per queue reset support.Shikang Fan
Add SRIOV check when setting VCN ring's supported reset mask. Signed-off-by: Shikang Fan <shikang.fan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ee9b603ad43f9870eb75184f9fb0a84f8c3cc852) Cc: stable@vger.kernel.org
2025-11-19drm/amdgpu/ttm: Fix crash when handling MMIO_REMAP in PDE flagsSrinivasan Shanmugam
The MMIO_REMAP BO is a special 4K IO page that does not have a ttm_tt behind it. However, amdgpu_ttm_tt_pde_flags() was treating it like normal TT/doorbell/preempt memory and unconditionally accessed ttm->caching. For the MMIO_REMAP BO, ttm is NULL, so this leads to a NULL pointer dereference when computing PDE flags. Fix this by checking that ttm is non-NULL before reading ttm->caching. This prevents the crash for MMIO_REMAP and also makes the code more defensive if other BOs ever come through without a ttm_tt. Fixes: fb5a52dbe9fe ("drm/amdgpu: Implement TTM handling for MMIO_REMAP placement") Suggested-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Tested-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0db94da5a0a1cacda080b9ec8425fcbe4babc141)
2025-11-19drm/amdgpu/vm: Check PRT uAPI flag instead of PTE flagTimur Kristóf
This fixes sparse mappings (aka. partially resident textures). Check the correct flags. Since a recent refactor, the code works with uAPI flags (for mapping buffer objects), and not PTE (page table entry) flags. Fixes: 6716a823d18d ("drm/amdgpu: rework how PTE flags are generated v3") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8feeab26c80635b802f72b3ed986c693ff8f3212)
2025-11-19drm/amdgpu: Skip emit de meta data on gfx11 with rs64 enabledYifan Zha
[Why] Accoreding to CP updated to RS64 on gfx11, WRITE_DATA with PREEMPTION_META_MEMORY(dst_sel=8) is illegal for CP FW. That packet is used for MCBP on F32 based system. So it would lead to incorrect GRBM write and FW is not handling that extra case correctly. [How] With gfx11 rs64 enabled, skip emit de meta data. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8366cd442d226463e673bed5d199df916f4ecbcf) Cc: stable@vger.kernel.org
2025-11-19drm/amd: Skip power ungate during suspend for VPEMario Limonciello
During the suspend sequence VPE is already going to be power gated as part of vpe_suspend(). It's unnecessary to call during calls to amdgpu_device_set_pg_state(). It actually can expose a race condition with the firmware if s0i3 sequence starts as well. Drop these calls. Cc: Peyton.Lee@amd.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a6c826cfeedd7714611ac115371a959ead55bda) Cc: stable@vger.kernel.org
2025-11-19drm/amdgpu: Add sriov vf check for VCN per queue reset support.Shikang Fan
Add SRIOV check when setting VCN ring's supported reset mask. Signed-off-by: Shikang Fan <shikang.fan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-19drm/amdgpu/ttm: Fix crash when handling MMIO_REMAP in PDE flagsSrinivasan Shanmugam
The MMIO_REMAP BO is a special 4K IO page that does not have a ttm_tt behind it. However, amdgpu_ttm_tt_pde_flags() was treating it like normal TT/doorbell/preempt memory and unconditionally accessed ttm->caching. For the MMIO_REMAP BO, ttm is NULL, so this leads to a NULL pointer dereference when computing PDE flags. Fix this by checking that ttm is non-NULL before reading ttm->caching. This prevents the crash for MMIO_REMAP and also makes the code more defensive if other BOs ever come through without a ttm_tt. Fixes: fb5a52dbe9fe ("drm/amdgpu: Implement TTM handling for MMIO_REMAP placement") Suggested-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Tested-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-19drm/amdgpu/vm: Check PRT uAPI flag instead of PTE flagTimur Kristóf
This fixes sparse mappings (aka. partially resident textures). Check the correct flags. Since a recent refactor, the code works with uAPI flags (for mapping buffer objects), and not PTE (page table entry) flags. Fixes: 6716a823d18d ("drm/amdgpu: rework how PTE flags are generated v3") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-19drm/amdgpu: Skip emit de meta data on gfx11 with rs64 enabledYifan Zha
[Why] Accoreding to CP updated to RS64 on gfx11, WRITE_DATA with PREEMPTION_META_MEMORY(dst_sel=8) is illegal for CP FW. That packet is used for MCBP on F32 based system. So it would lead to incorrect GRBM write and FW is not handling that extra case correctly. [How] With gfx11 rs64 enabled, skip emit de meta data. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-19drm/amd: Skip power ungate during suspend for VPEMario Limonciello
During the suspend sequence VPE is already going to be power gated as part of vpe_suspend(). It's unnecessary to call during calls to amdgpu_device_set_pg_state(). It actually can expose a race condition with the firmware if s0i3 sequence starts as well. Drop these calls. Cc: Peyton.Lee@amd.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-19drm/amdgpu: Switch to use %ptSpAndy Shevchenko
Use %ptSp instead of open coded variants to print content of struct timespec64 in human readable format. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251113150217.3030010-6-andriy.shevchenko@linux.intel.com Signed-off-by: Petr Mladek <pmladek@suse.com>