| Age | Commit message (Collapse) | Author |
|
Pull drm updates from Dave Airlie:
"Highlights:
- xe: add initial CRI platform support
- amdgpu: initial HDMI 2.1 FRL support
- rust: add some new type concepts for device lifetimes
- scheduler: moves to a fair algorithm and lots of cleanups
But it's mostly the usual mountain of changes across the board.
core:
- add docbook for DRM_IOCTL_SYNCOBJ_EVENTFD
- change signature of drm_connector_attach_hdr_output_metadata_property
- dedup counter and timestamp retrieval in vblank code
- parse AMD VSDB v3 in CTA extension blocks
- add P230, Y7, XYYY2101010, T430, XVUY210101010 formats
- don't call drop master on file close if not master
- use drm_printf_indent in atomic / bridge
- fix 32b format descriptions
- docs: fix toctree
- hdmi: add common TMDS character rates
- fix drm_syncobj_find_fence leak
rust:
- introduce Higher-Ranked lifetime types
- replace drvdata with scoped registration data
- add GPUVM immediate mode abstraction for rust GPU drivers
- introduce DeviceContext type state for drm::Device
bridge:
- clarify drm_bridge_get/put
- create drm_get_bridge_by_endpoint and use it
- analogix_dp: add panel probing
- ite-it6211 - use drm audio hdmi helpers
buddy:
- add lockdep annotations
dp:
- add PR and VRR updates
- mst: fix buffer overflows
- add Adaptive Sync SDP decoding support
- fix OOB reads in dp-mst
ttm:
- bump fpfn/lpfn to 64-bit
scheduler:
- change default to fair scheduler
- map runqueue 1:1 with scheduler
dma-buf:
- port selftests to kunit
- convert dma-buf system/heap allocators to module
- add separate DMABUF_HEAPS_SYSTEM_CC_SHARED Kconfig
udmabuf:
- revert hugetlb support
- fix error with CONFIG_DMA_API_DEBUG
dma-fence:
- fix tracepoints lifetime
- remove unused signal on any support
ras:
- add clear error counter netlink command to drm ras
gpusvm:
- reject VMAs with VM_IO or VM_PFNMAP when creating SVM ranges
- use IOVA allocations
pagemap:
- use IOVA allocations
panels:
- update to use ref counts
- add support for CSW PNB601LS1-2, LGD LP116WHA-SPB1
- add support for waveshare panels
- CMN N116BCN-EA1, CMN N140HCA-EEK, IVO M140NWFQ R5,
- IVO, R140NWFW R0, BOE NT140*, BOE NV133FHM-N4F,
- AUO B140*, AUO B133HAN06.6 and AUO B116XTN02.3 eDP panels
- Surface Pro 12 Panel
xe:
- add CRI PCI-IDs
- debugfs add multi-lrc info
- engine init cleanup
- PF fair scheduling auto provisioning
- system controller support for CRI/Xe3p
- PXP state machine fixes
- Reset/wedge/unload corner case fixes
- Wedge path memory allocation fixes
- PAT type cleanups
- Reject unsafe PAT for CPU cached memory
- OA improvements for CRI device memory
- kernel doc syntax in xe headers
- xe_drm.h documentation fixes
- include guard cleanups
- VF CCS memory pool
- i915/xe step unification
- Xe3p GT tuning fixes
- forcewake cleanup in GT and GuC
- admin-only PF mode
- enable hwmon energy attributes for CRI
- enable GT_MI_USER_INTERRUPT
- refactor emit functions
- oa workarounds
- multi_queue: allow QUEUE_TIMESTAMP register
- convert stolen memory to ttm range manager
- use xe2 style blitter as a feature flag
- make drm_driver const
- add/use IRQ page to HW engine definition
- fix oops when display disabled
i915:
- enable PIPEDMC_ERROR interrupt
- more common display code refactoring
- restructure DP/HDMI sink format handling
- eliminate FB usage from lowlevel pinning code
- panel replay bw optimization
- integrate sharpness filter into the scaler
- new fb_pin abstraction for xe/i915 fb transparent handling
- skip inactive MST connectors on HDCP
- start switching to display specific registers
- use polling when irq unavailable
- Adaptive-sync SDP prep
amdgpu:
- use drm_display_info for AMD VSDB data
- Initial HDMI 2.1 FRL support
- Initial DCN 4.2.1 support
- GART fixes for non-4k pages
- GC 11.5.6/SDMA 6.4.0/and other new IPs
- GFX9/DCE6/Hawaii/SDMA4/GART/Userq fixes
- Finish support for using multiple SDMA queues for TTM operations
- SWSMU updates
- GC 12.1 updates
- SMU 15.0.8 updates
- DCN 4.2 updates
- DC type conversion fixes
- Enable DC power module
- Replay/PSR updates
- SMU 13.x updates
- Compute queue quantum MQD updates
- ASPM fix
- Align VKMS with common implementation
- DC analog support fixes
- UVD 3 fixes
- TCC harvesting fixes for SI
- GC 11 APU module reload fix
- NBIO 6.3.2 support
- IH 7.1 updates
- DC cursor fixes
- VCN/JPEG user fence fixes
- DC support for connectors without DDC
- Prefer ROM BAR for default VGA device
- DC bandwidth fixes
- Add PTL support for profiler
- Introduce dc_plane_cm and migrate surface update color path
- Add FRL registers for HDMI 2.1
- Restructure VM state machine
- Auxless ALPM support
- GEM_OP locking/warning fixes
- switch to system_dfl_wq
amdkfd:
- GPUVM TLB flush fix
- Hotplug fix
- Boundary check fixes
- SVM fixes
- CRIU fixes
- add profiler API
- MES 12.1 updates
msm:
- core:
- fix shrinker documentation
- IFPC enabled for gen8
- PERFCNTR_CONFIG ioctl support
- GPU:
- reworked UBWC handling
- a810 support
- MDSS:
- add support for Milos platform
- reworked UBWC handling
- DisplayPort:
- reworked HPD handling as prep for MST
- DPU:
- Milos platform support
- reworked UBWC handling
- DSI:
- Milos platform support
nova:
- Hopper/Blackwell enablement (GH100/GB100/GB202)
- FSP support
- 32-bit firmware support
- HAL functions
- refactor GSP boot/unload
- GA100 support
- VBIOS hardening/refactoring
- Adopt higher order lifetime types
tyr:
- define register blocks
- add shmem backed GEM objects
- adopt higher order lifetime types
- move clock cleanup into Drop
radeon:
- Hawaii SMU fixes
- CS parser fix
- use struct drm_edid instead of edid
amdxdna:
- export per-client BO memory via fdinfo
- AIE4 device support
- support medium/lower power modes
- expandable device heap support
- revert read-only user-pointer BO mappings
ivpu:
- support frequency limiting
panthor:
- enable GEM shrinker support
- add eviction and reclaim info to fdinfo
v3d:
- enable runtime PM
mgag200:
- support XRGB1555 + C8
ast:
- support XRGB1555 + C8
- use constants for lots of registers
- fix register handling
imagination:
- fence handling refactoring
nouveau:
- fix sched double call
- expose VBIOS on GSP-RM systems
- add GA100 support
virtio:
- add VIRTIO_GPU_F_BLOB_ALIGNMENT flag
- add deferred mapping support
gud:
- add RCade Display Adapter
hibmc:
- fix no connectors usage
mediatek:
- hdmi: convert error handling
- simplify mtk_crtc allocation
exynos:
- move fbdev emulation to drm client buffers
- use drm format helpers for geometry/size
- adopt core DMA tracking
- fix framebuffer offset handling
renesas:
- add RZ/T2H SOC support
versilicon:
- add cursor plane support
tegra:
- use drm client for framebuffer"
* tag 'drm-next-2026-06-17' of https://gitlab.freedesktop.org/drm/kernel: (1731 commits)
dma-buf: move system_cc_shared heap under separate Kconfig
accel/amdxdna: Clear sva pointer after unbind
agp/amd64: Fix broken error propagation in agp_amd64_probe()
accel/amdxdna: Require carveout when PASID and force_iova are disabled
drm/amdkfd: always resume_all after suspend_all
drm/amdgpu/gfx: move fault and EOP IRQ get/put to hw_init/hw_fini
drm/amd/display: Consult MCCS FreeSync cap only if requested & supported
drm/amd/pm: Use strscpy in profile mode parsing
drm/amdkfd: Fix infinite loop parsing CRAT with zero subtype length
drm/amdkfd: fix sysfs topology prop length on buffer truncation
drm/amdgpu: drop retry loop in amdgpu_hmm_range_get_pages
drm/amd/pm: bound OD parameter parsing to stack array size
drm/amd/pm: Stop pp_od_clk_voltage emit at PAGE_SIZE
drm/amdkfd: Unwind debug trap enable on copy_to_user failure
drm/amdgpu: validate the mes firmware version for gfx12.1
drm/amdgpu: validate the mes firmware version for gfx12
drm/amdgpu: compare MES firmware version ucode for gfx11
drm/amdkfd: Add bounds check for AMDKFD_IOC_WAIT_EVENTS
drm/amdgpu: restart the CS if some parts of the VM are still invalidated
drm/amd/display: use unsigned types for local pipe and REG_GET counters
...
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
amd:
- track colorop changes correctly
amdxdna:
- fix possible leak of mm_struct
colorop:
- make lut interpolation mutable
- track colorop updates correctly
ivpu:
- fix integer truncation
vc4:
- fix leak in krealloc() error handling
virtio:
- fix dma_fence ref-count leak
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260612081418.GA17001@2a02-2455-9062-2500-e496-5a17-62ba-545e.dyn6.pyur.net
|
|
Ensure the driver tracks changes in any colorop property of a plane
color pipeline by using the same mechanism of CRTC color management and
update plane color blocks when any colorop property changes. It fixes an
issue observed on gamescope settings for night mode which is done via
shaper/3D-LUT updates.
Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patch.msgid.link/20260609110420.1298352-5-mwen@igalia.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-7.2-2026-06-04:
amdgpu:
- UserQ fix
- Userptr fix
- MCCS freesync fix
- Remove some triggerable BUG() calls
- DCN 4.2.1 fixes
- Lockdep annotations
- Guilty handling fix
- VCN 5.3 fix
- FRL fixes
- Bounds checking fixes
- HMM fix
- IRQ accounting fix
amdkfd:
- Fix an event information leak
- Events bounds check fix
- Trap cleanup fix
- Bounds checking fixes
- MES fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260604231801.19979-1-alexander.deucher@amd.com
|
|
When the do_mccs parameter is false, we don't call
dm_helpers_read_mccs_caps, so sink->mccs_caps.freesync_supported is
unlikely to be true.
Fixes: 6f71d5dd3206 ("drm/amd/display: Read sink freesync support via mccs")
Bug: https://gitlab.freedesktop.org/drm/amd/-/work_items/5286
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 115bf5ca318e18a3dc1888ec6271c7052774952a)
|
|
If kfd_dbg_trap_enable() fails while copying runtime_info to userspace,
it had already activated the trap, set debug_trap_enabled, taken an extra
process reference, and opened the debug event file. Return -EFAULT without
unwinding that state, leaving inconsistent trap state and a refcount
imbalance that could break later DISABLE/ENABLE.
On copy_to_user failure, deactivate the trap and undo the rest of the
enable setup before returning.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 01112e241e37f9ac98b6f418d93ce2e0b87b7ee0)
|
|
The kfd_wait_on_events ioctl passes a user-supplied num_events parameter
directly to alloc_event_waiters() which calls kcalloc() without validation.
This allows unprivileged users with /dev/kfd access to trigger large kernel
memory allocations, potentially causing memory exhaustion and denial of
service via the OOM killer.
Add a check to reject num_events values exceeding KFD_SIGNAL_EVENT_LIMIT
(4096), which is the maximum number of events a single process can create.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 39eb6da7acee8d0cc12a8959235b590f295d7b4c)
|
|
Make sure that we only submit work with full up to date VM page tables.
Backport to 7.1 and older.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 59720bfd8c6dbebeb8d5a7ab64241b007efd9213)
Cc: stable@vger.kernel.org
|
|
Use correct u64 type.
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0ac98160dfb6ab3c6d7b38e0ff9687780beed9cb)
|
|
kfd_smi_ev_enabled() skips the suser privilege check when pid=0.
PROCESS_START, PROCESS_END, and VMFAULT events are emitted with
pid=0 while carrying another process's PID and command name, so any
/dev/kfd user in the render group can monitor all GPU workloads.
Pass the target process PID into kfd_smi_event_add() for these events
so the existing per-client filter restricts delivery to the owning
process or CAP_SYS_ADMIN subscribers.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 92a8dba246d371fe268280e5fd74b0955688e6df)
|
|
Need to restore any good queues even if the suspend_all
failed for some. Always run remove_queue as that will
schedule a GPU reset is removing the queue fails.
v2: move resume_all after remove
Fixes: eb067d65c33e ("drm/amdkfd: Update BadOpcode Interrupt handling with MES")
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
priv_reg / priv_inst / bad_op and (on v11+) userq EOP IRQs are
acquired in late_init but released in hw_fini. This split forced
gfx_v9_0_hw_fini() to defensively guard each put with
amdgpu_irq_enabled() because hw_fini runs on paths that may not
reach late_init.
amdgpu_ip_block_hw_fini() only runs after hw_init returns success,
and suspend / resume cycle the refs through the same path, so
hw_init / hw_fini pair without any extra tracking. Move the gets
there and drop the guards.
While here, fix the pre-existing partial-failure leak in
set_userq_eop_interrupts() (gfx11 / 12_0 / 12_1). amdgpu_irq_get()
increments the refcount before calling .set, so a failure partway
through the loop leaves earlier successful gets stranded. Track
the loop position and roll back on the enable path.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When the do_mccs parameter is false, we don't call
dm_helpers_read_mccs_caps, so sink->mccs_caps.freesync_supported is
unlikely to be true.
Fixes: 6f71d5dd3206 ("drm/amd/display: Read sink freesync support via mccs")
Bug: https://gitlab.freedesktop.org/drm/amd/-/work_items/5286
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use strscpy to copy the buffer which makes it explicit that a valid NULL
terminated string gets copied. Also, make it explicit that the source
buffer can be copied safely to the temporary buffer by checking against
its size.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Malformed ACPI CRAT tables can advertise a zero or undersized subtype
length. The parser then fails to advance the cursor and loops forever
while the remaining image still looks large enough for a generic header.
Validate sub_type_hdr->length on each iteration before parsing or
advancing. Return -EINVAL and warn when length is zero or smaller than
the generic subtype header.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sysfs_show_gen_prop() accumulated snprintf()'s return value into the
offset. snprintf() reports bytes that would have been written, not
bytes actually written, so a truncated sysfs show could over-report
its length. Use sysfs_emit_at(), which returns only the bytes written.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Since commit c08972f55594 ("drm/amdgpu: fix amdgpu_hmm_range_get_pages")
moved mmu_interval_read_begin() out of the per-chunk loop, the
captured notifier_seq is no longer refreshed across retries. As a
result, the existing -EBUSY retry path can never make progress:
hmm_range_fault() returns -EBUSY only when
mmu_interval_check_retry(notifier, notifier_seq) reports that the
sequence is stale. Once the sequence has advanced, the stored seq
will never match again, so every subsequent call within the same
invocation returns -EBUSY immediately.
The "goto retry" therefore degenerates into a busy spin that simply
burns CPU for the full HMM_RANGE_DEFAULT_TIMEOUT (~1s) window before
finally bailing out with -EAGAIN. This is pure latency with no chance
of recovery, and it actively hurts the KFD userptr stack: the caller
ends up blocked for a second while holding mmap_lock, only to return
-EAGAIN to the restore worker (or to userspace) which would have
re-driven the operation immediately anyway.
Drop the retry/timeout entirely and let -EBUSY propagate straight to
out_free_pfns, where it is already translated to -EAGAIN. Recovery is
handled at a higher level: the KFD restore_userptr_worker reschedules
itself, and the userptr ioctl path returns -EAGAIN to userspace.
No functional regression: the previous behaviour on -EBUSY was already
to fail with -EAGAIN after a 1s stall; we just skip the stall.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Honglei Huang <honghuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reject inputs once parameter_size reaches the array limit, and pass
ARRAY_SIZE(parameter) into parse_input_od_command_lines() for defense in
depth.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Stop appending OD sections in amdgpu_get_pp_od_clk_voltage()
once the sysfs page is full, instead of checking every sysfs_emit_at()
in SMU helpers. This is purely defensive hardening.
v2: Drop the prior series that checked sysfs_emit_at()
return values in every SMU *_emit_clk_levels() helper and
smu_cmn_print_*().(Kevin)
v3: Update description, remove all clamping
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If kfd_dbg_trap_enable() fails while copying runtime_info to userspace,
it had already activated the trap, set debug_trap_enabled, taken an extra
process reference, and opened the debug event file. Return -EFAULT without
unwinding that state, leaving inconsistent trap state and a refcount
imbalance that could break later DISABLE/ENABLE.
On copy_to_user failure, deactivate the trap and undo the rest of the
enable setup before returning.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The kfd_wait_on_events ioctl passes a user-supplied num_events parameter
directly to alloc_event_waiters() which calls kcalloc() without validation.
This allows unprivileged users with /dev/kfd access to trigger large kernel
memory allocations, potentially causing memory exhaustion and denial of
service via the OOM killer.
Add a check to reject num_events values exceeding KFD_SIGNAL_EVENT_LIMIT
(4096), which is the maximum number of events a single process can create.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make sure that we only submit work with full up to date VM page tables.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Two small type fixes that match how the values are actually consumed:
- decide_zstate_support() iterates from 0 to pipe_count, which is
unsigned. Make the loop index unsigned int.
- hpo_enc401_read_state() reads HDMI_PIXEL_ENCODING and
HDMI_DEEP_COLOR_DEPTH via REG_GET_2(), which internally casts the
output pointer to (uint32_t *). Passing the address of an int is a
strict-aliasing wart even when the sizes match. Declare the locals
as uint32_t.
No behavioural change since the values are only compared against small
non-negative constants.
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
dc_hdmi_frl_flags.force_frl_rate mirrors dc_debug_options.force_frl_rate,
which was just widened to unsigned int. Match the type here too so the
assignment in link_hdmi_frl.c does not narrow from unsigned to signed.
All call sites in link_hdmi_frl.c only compare the value against 0, 0xF,
or an hdmi_frl_link_rate enum whose values are non-negative, so the
change is behaviour-preserving and does not introduce sign-compare
warnings.
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use correct u64 type.
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable secure submission support on the unified ring for VCN IP version
5.3.0 by setting `secure_submission_supported = true` in
vcn_v5_0_0_unified_ring_vm_funcs.
Secure IB submission is supported on VCN 5.3.0 hardware/firmware,
allowing protected decode workloads to bypass the common IB gate.
Without this, secure playback submissions can be blocked and fail.
Other VCN 5.x variants using the same vcn_v5_0_0_ip_block
(e.g. IP_VERSION(5, 0, 0)) do not support secure submission
on the unified ring and therefore continue using non-secure paths.
This change only advertises existing hardware/firmware capability;
non-secure decode paths remain unaffected.
Signed-off-by: Jeevana Muthyala <Jeevana.Muthyala2@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The guilty handling tried to establish a second way of signaling problems with
the GPU back to userspace. This caused quite a bunch of issue we had to work
around, especially lifetime issues with the drm_sched_entity.
Just drop the handling altogether and use the dma_fence based approach instead.
v2: fix reversed condition in entity check (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add lockdep annotations to teach lockdep the correct lock hierarchy
and catch ordering violations during development. This follows the
pattern established by dma-resv in drivers/dma-buf/dma-resv.c.
Lock ordering hierarchy (outermost to innermost):
1. userq_sch_mutex - Global userq scheduler (enforce_isolation)
2. userq_mutex - Per-context userq (held across queue create/destroy)
3. notifier_lock - MMU notifier synchronization
4. vram_lock - VRAM memory allocator
5. reset_domain->sem - GPU reset synchronization
6. reset_lock - Reset control mutex
7. srbm_mutex - SRBM register access
8. grbm_idx_mutex - GRBM index register access
9. mmio_idx_lock - MMIO index access (spinlock)
The implementation provides:
- Lock ordering training at module init (amdgpu_lockdep_init)
- Lock class association for real driver locks (amdgpu_lockdep_set_class)
Dummy locks are associated with the same class keys as real driver locks
via lockdep_set_class(), ensuring lockdep connects the training ordering
with actual runtime locks.
Testing:
Build the kernel with CONFIG_PROVE_LOCKING=y (enables CONFIG_LOCKDEP):
scripts/config --enable PROVE_LOCKING
scripts/config --enable DEBUG_LOCKDEP
make -j$(nproc)
On boot, dmesg should show:
AMDGPU: Lockdep annotations initialized (9 lock levels)
The companion IGT test (tests/amdgpu/amd_lockdep) exercises lock-heavy
GPU code paths concurrently to trigger lockdep warnings on violations:
sudo ./build/tests/amdgpu/amd_lockdep
sudo dmesg | grep -A 50 "circular locking dependency"
IGT subtests:
concurrent-reset-and-submit - reset_sem vs submission locks
concurrent-mmap-and-evict - mmap_lock vs vram_lock
concurrent-userptr-and-reset - notifier_lock vs reset_sem
stress-all-paths - all of the above simultaneously
A clean dmesg (no "circular locking dependency" or "possible recursive
locking detected" messages) confirms no lock ordering violations.
For CI integration, the test should be run on kernels compiled with
CONFIG_LOCKDEP=y; dmesg is scanned post-run for lockdep splats.
v2: (Christian)
- Move notifier_lock and vram_lock before reset locks in hierarchy.
HMM invalidation holds notifier_lock and can wait for GPU reset
completion, so notifier_lock must be outer to reset_domain->sem.
- Associate dummy locks with lock class keys via lockdep_set_class()
so lockdep connects training with real driver locks.
- Update commit message to list all 9 lock levels.
Requires CONFIG_PROVE_LOCKING=y to activate.
Cc: Christian Konig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
kfd_smi_ev_enabled() skips the suser privilege check when pid=0.
PROCESS_START, PROCESS_END, and VMFAULT events are emitted with
pid=0 while carrying another process's PID and command name, so any
/dev/kfd user in the render group can monitor all GPU workloads.
Pass the target process PID into kfd_smi_event_add() for these events
so the existing per-client filter restricts delivery to the owning
process or CAP_SYS_ADMIN subscribers.
Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Needed for DML to function with DCN42B.
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In resource_parse_asic_id, the check for GC_11_0_4 was unbounded, which
caused it to override the detection of DCN42B.
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Replace BUG()/BUG_ON() with error logs and safe returns in several
places where they can be triggered by invalid userspace input,
preventing DoS via kernel panic.
Signed-off-by: Ce Sun <cesun102@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-7.2-2026-06-03:
amdgpu:
- BT.2020 fix for DCE
- DC bounds checking fixes
- SDMA 7.1 fix
- UserQ fixes
- SI fix
- SMU 13 fixes
- SMU 14 fixes
- GC 12.1 fix
- Userptr fix
- GC 10.1 fix
- GART fix for non-4K pages
- DCN 4.x fixes
- DCN 4.2 updates
- More DC KUnit tests
- PSR cleanup
- Support for connectors without DDC pins
- Initial DCN 4.2.1 support
- Initial HDMI 2.1 FRL support
- Misc bounds check fixes
- RAS fixes
- GC 11.5.6 support
- SDMA 6.4.0 support
- NBIO 7.11.5 support
- IH 6.4.0 support
- HDP 6.4.0 support
- MMHUB 3.4.2 support
- SMU 15.0.5 support
- ATHUB 3.4.2 support
- VPE 2.2 support
- Devcoredump fixes
- _PR3 fix
amdkfd:
- UAF race fix
- Fix a potential NULL pointer dereference
- GC 11 buffer overflow fix for SDMA
- Profiler locking order fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260604013527.2373534-1-alexander.deucher@amd.com
|
|
In smu_v14_0_0_set_soft_freq_limited_range(), the gfxclk floor is
programmed via SetHardMinGfxClk together with SetSoftMaxGfxClk. Under
power_dpm_force_performance_level=high this pins HardMin to peak gfxclk.
In PMFW arbitration HardMin has higher priority than SoftMax, so the
firmware thermal/PPT throttler cannot clamp gfxclk via SoftMax once
HardMin is set to peak. Replace SetHardMinGfxClk with SetSoftMinGfxclk
so the driver still requests peak performance but the firmware
throttler retains the ability to clamp gfxclk under thermal/PPT
pressure. SoftMax handling is unchanged and no other clock domains
are affected.
Signed-off-by: Priya Hosur <Priya.Hosur@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3ea273267fd29cbf6d83ee72329f59eb5042605b)
Cc: stable@vger.kernel.org
|
|
When mapping VRAM pages into the GART page table,
amdgpu_gart_map_vram_range() assumes that the system page size is the
same as the GPU page size.
On systems with non-4K page sizes, multiple GPU pages can exist within
a single CPU page. As a result, the mappings are created incorrectly
because fewer page table entries are programmed than required.
Fix this by programming the mappings correctly for non-4K page size
systems.
Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a8f0bc22388f74e0cf4ed8b7d1846c580eaf44cc)
Cc: stable@vger.kernel.org
|
|
In case when queue_create fails and mqd has already been
allocated and hence wptr_obj is not cleaned up.
So moving that cleanup part to mqd_destroy so it takes
care of all the cases of clean up and during tear down of
the queue.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 43355f62cd2ef5386c2693df537c232ea0f2ce6c)
|
|
Use find_next_zero_bit() to locate the next free seq slot bit
instead of the current walk, for more efficient bitmap scanning.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ff905a9b6228de9eedd0db71ecb1bdde91fb898d)
|
|
Mesa userqueues free does not wait for the free to complete and go ahead
in unmapping the vital bos while kernel is still in queue free and
corresponding cleanup.
So ideally we don't need the logging for that and hence remove the warn
message as this is expected behaviour and functionally, we are making
sure to wait for the required fences before unmap.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 758a868043dcb07eca923bc451c16da3e73dc47c)
|
|
The v11 MQD manager incorrectly assigned the CP-compute variants of
checkpoint_mqd/restore_mqd for KFD_MQD_TYPE_SDMA queues. These functions
use sizeof(struct v11_compute_mqd) (2048 bytes) instead of sizeof(struct
v11_sdma_mqd) (512 bytes), causing a 1536-byte overflow.
During CRIU checkpoint of an SDMA queue on Navi3x:
- checkpoint_mqd() reads 2048 bytes from a 512-byte SDMA MQD buffer,
leaking 1536 bytes of adjacent GTT memory to userspace
During CRIU restore:
- restore_mqd() writes 2048 bytes into a 512-byte SDMA MQD buffer,
corrupting 1536 bytes of adjacent GTT memory (often the ring buffer
or neighboring MQDs)
This is a copy-paste regression unique to v11. All other ASIC backends
(cik, vi, v9, v10, v12) correctly use the SDMA-specific variants.
Add checkpoint_mqd_sdma() and restore_mqd_sdma() functions that properly
handle the smaller v11_sdma_mqd structure, matching the pattern used in
other MQD managers.
Fixes: cc009e613de6 ("drm/amdkfd: Add KFD support for soc21 v3")
Assisted-by: Claude:Sonnet 4-5
Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6fa41db7ffdec97d62433adf03b7b9b759af8c2c)
Cc: stable@vger.kernel.org
|
|
When usr_queue_id_array is NULL and num_queues is non-zero,
get_queue_ids() returns NULL. The callers check only IS_ERR() on the
return value; since IS_ERR(NULL) == false the check passes, and
suspend_queues() calls q_array_invalidate() which immediately
dereferences NULL while iterating num_queues times.
Userspace can trigger this via kfd_ioctl_set_debug_trap() by supplying
num_queues > 0 with a zero queue_array_ptr, causing a kernel panic.
A NULL usr_queue_id_array with num_queues == 0 is a legitimate no-op
(q_array_invalidate never executes, and resume_queues already guards
all queue_ids dereferences behind a NULL check). Return ERR_PTR(-EINVAL)
only when num_queues is non-zero and the pointer is absent; both callers
already propagate IS_ERR() returns correctly to userspace.
Fixes: a70a93fa568b ("drm/amdkfd: add debug suspend and resume process queues operation")
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f165a82cdf503884bb1797771c61b2fcc72113d4)
Cc: stable@vger.kernel.org
|
|
Problem:
While developing the amd_close_race IGT test (which intentionally triggers
execute permission faults by removing VM_PAGE_EXECUTABLE from GPU page table
entries), we discovered that on Navi10 (GFX 10.1.x) these faults produce
zero diagnostic output. The GPU simply hangs silently for ~10s until the
scheduler timeout fires. There is no way to distinguish an execute
permission fault from any other type of GPU hang.
Root cause:
GFX 10.1.x defaults to noretry=0, which sets
RETRY_PERMISSION_OR_INVALID_PAGE_FAULT=1 in the GFXHUB UTCL2 registers
(gfxhub_v2_0.c line 313). With this bit set, permission faults (valid PTE,
wrong R/W/X bits) are handled entirely within the UTCL1/UTCL2 hardware
loop: UTCL2 returns an XNACK to UTCL1, and UTCL1 re-requests the
translation indefinitely, expecting software to eventually fix the
permission bits (as happens in SVM/HMM recovery). No interrupt of any kind
reaches the IH ring.
This is different from invalid-page faults (V=0) which DO generate a retry
fault interrupt that the driver can escalate to a no-retry fault. Permission
faults with valid PTEs loop silently forever in hardware.
GFX 10.3+ already defaults to noretry=1, which makes permission faults
generate immediate L2 protection fault interrupts. GFX 10.1.x was
inadvertently left out of this default.
Fix:
Change the noretry=1 threshold from IP_VERSION(10, 3, 0) to
IP_VERSION(10, 1, 0) in amdgpu_gmc_noretry_set(). This is a one-line
change that aligns GFX 10.1.x behavior with GFX 10.3+ and all newer
generations.
With noretry=1, the existing non-retry fault handler
(gmc_v10_0_process_interrupt) already decodes and prints the full
GCVM_L2_PROTECTION_FAULT_STATUS register including PERMISSION_FAULTS,
faulting address, VMID, PASID, and process name. No additional logging
code is needed — the fix is purely routing permission faults to the
existing, fully-capable non-retry interrupt handler.
v2: Dropped GFX10-specific logging from gmc_v10_0.c and
kfd_int_process_v10.c (Felix Kuehling). v1 added logging in the retry
fault handler, but with noretry=1 permission faults take the non-retry
path — the v1 retry handler code was dead and would never execute.
Tested on Navi10 (GFX 10.1.10):
- Execute permission faults now produce immediate, clear output:
[gfxhub] page fault (src_id:0 ring:64 vmid:4 pasid:592)
Process amd_close_race pid 13380 thread amd_close_race pid 13384
in page at address 0x40001000 from client 0x1b (UTCL2)
GCVM_L2_PROTECTION_FAULT_STATUS:0x00700881
PERMISSION_FAULTS: 0x8
- No regressions with properly-mapped GPU workloads
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb21edd24c40d81066753f8ac6f23bce15745395)
Cc: stable@vger.kernel.org
|
|
When the fault stop mode isn't AMDGPU_VM_FAULT_STOP_ALWAYS,
these bits should be programmed to 0.
Program CRASH_ON_NO_RETRY_FAULT and CRASH_ON_RETRY_FAULT
always, to make sure to clear the bits when we don't want
to crash.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d0cd99e73090700b7a942b98a3327ec966597d0a)
|
|
Wait for all submissions when userptrs need to be invalidated by the MMU
notifier, not just the one the userptr was involved into.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 91250893cbaa25c86872deca95a540d08de1f91e)
Cc: stable@vger.kernel.org
|
|
Set correct DMA mask for gfx12
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a2ef14ee2593b48242b8d90f229f71c1710529da)
|
|
For PTE creation use asic specific physical page base address mask
v2: Change variable name from pa_mask to pte_addr_mask
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2ea989885941a6e5607ef86dbe309e90b7191f21)
|
|
SMU messages may use fewer arguments than the available argument registers,
the previous code only wrote used registers and left the rest unchanged,
so stale values from a prior message could persist.
Write all argument registers for each message and zero the unused tail
to keep command arguments deterministic and avoid unintended carry-over.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e03b635f61f77ebd5107ef82f48e3221cb695856)
|
|
EnergyAccumulator is unsupported on SMU 14.0.2, mark it invalid.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 646b05043eeed04b51c14aad22a400a8250af4b7)
Cc: stable@vger.kernel.org
|