| Age | Commit message (Collapse) | Author |
|
Pull drm fixes from Dave Airlie:
"Seems to be a bit quieter this week, mostly xe and amdgpu, with msm
and imx fixes and one WARN_ON from user blocked. Nothing of note
outstanding either.
uapi:
- Fix a WARN_ON() when passing an invalid handle to
drm_gem_change_handle_ioctl()
msm:
- GPU:
- Fix bogus hwcg register update for a690
xe:
- Skip address copy for sync-only execs
- Fix a WA
- Derive mem_copy cap from graphics version
- Fix is_bound() pci_dev lifetime
- xe nvm cleanup fixes
amdgpu:
- SMU 13 fixes
- SMU 14 fixes
- GPUVM fault filter fix
- Powergating fix
- HDMI debounce fix
- Xclk fix for soc21 APUs
- Fix COND_EXEC handling for GC 11
- GC 10-12 KGQ init fixes
- GC 11-12 KGQ reset fixes
imx/tve:
- drop ddc device reference when unloading"
* tag 'drm-fixes-2026-01-30' of https://gitlab.freedesktop.org/drm/kernel: (21 commits)
drm/xe/nvm: Fix double-free on aux add failure
drm/xe/nvm: Manage nvm aux cleanup with devres
drm/amdgpu/gfx12: adjust KGQ reset sequence
drm/amdgpu/gfx11: adjust KGQ reset sequence
drm/amdgpu/gfx12: fix wptr reset in KGQ init
drm/amdgpu/gfx11: fix wptr reset in KGQ init
drm/amdgpu/gfx10: fix wptr reset in KGQ init
drm/xe/configfs: Fix is_bound() pci_dev lifetime
drm/amdgpu: Fix cond_exec handling in amdgpu_ib_schedule()
drm/amdgpu/soc21: fix xclk for APUs
drm/amd/display: Clear HDMI HPD pending work only if it is enabled
drm/imx/tve: fix probe device leak
drm/amd/pm: fix race in power state check before mutex lock
drm/amdgpu: fix NULL pointer dereference in amdgpu_gmc_filter_faults_remove
drm/amd/pm: fix smu v14 soft clock frequency setting issue
drm/amd/pm: fix smu v13 soft clock frequency setting issue
drm/xe: derive mem copy capability from graphics version
drm/xe/xelp: Fix Wa_18022495364
drm/xe: Skip address copy for sync-only execs
drm: Do not allow userspace to trigger kernel warnings in drm_gem_change_handle_ioctl()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes. 9 are cc:stable, 12 are for MM.
There's a patch series from Pratyush Yadav which fixes a few things in
the new-in-6.19 LUO memfd code.
Plus the usual shower of singletons - please see the changelogs for
details"
* tag 'mm-hotfixes-stable-2026-01-29-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
vmcoreinfo: make hwerr_data visible for debugging
mm/zone_device: reinitialize large zone device private folios
mm/mm_init: don't cond_resched() in deferred_init_memmap_chunk() if called from deferred_grow_zone()
mm/kfence: randomize the freelist on initialization
kho: kho_preserve_vmalloc(): don't return 0 when ENOMEM
kho: init alloc tags when restoring pages from reserved memory
mm: memfd_luo: restore and free memfd_luo_ser on failure
mm: memfd_luo: use memfd_alloc_file() instead of shmem_file_setup()
memfd: export alloc_file()
flex_proportions: make fprop_new_period() hardirq safe
mailmap: add entry for Viacheslav Bocharov
mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
mm, swap: restore swap_space attr aviod kernel panic
mm/kasan: fix KASAN poisoning in vrealloc()
mm/shmem, swap: fix race of truncate and swap entry split
|
|
Kernel gfx queues do not need to be reinitialized or
remapped after a reset. Align with gfx11.
v2: preserve init and remap for MMIO case.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0a6d6ed694d72b66b0ed7a483d5effa01acd3951)
Cc: stable@vger.kernel.org
|
|
Kernel gfx queues do not need to be reinitialized or
remapped after a reset. This fixes queue reset failures
on APUs.
v2: preserve init and remap for MMIO case.
Fixes: b3e9bfd86658 ("drm/amdgpu/gfx11: add ring reset callbacks")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b340ff216fdabfe71ba0cdd47e9835a141d08e10)
Cc: stable@vger.kernel.org
|
|
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a2918f958d3f677ea93c0ac257cb6ba69b7abb7c)
Cc: stable@vger.kernel.org
|
|
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1f16866bdb1daed7a80ca79ae2837a9832a74fbc)
Cc: stable@vger.kernel.org
|
|
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e80b1d1aa1073230b6c25a1a72e88f37e425ccda)
Cc: stable@vger.kernel.org
|
|
The EXEC_COUNT field must be > 0. In the gfx shadow
handling we always emit a cond_exec packet after the gfx_shadow
packet, but the EXEC_COUNT never gets patched. This leads
to a hang when we try and reset queues on gfx11 APUs.
Fixes: c68cbbfd54c6 ("drm/amdgpu: cleanup conditional execution")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba205ac3d6e83f56c4f824f23f1b4522cb844ff3)
Cc: stable@vger.kernel.org
|
|
The reference clock is supposed to be 100Mhz, but it
appears to actually be slightly lower (99.81Mhz).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14451
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 637fee3954d4bd509ea9d95ad1780fc174489860)
Cc: stable@vger.kernel.org
|
|
[Why&How]
On amdgpu_dm_connector_destroy(), the driver attempts to cancel pending
HDMI HPD work without checking if the HDMI HPD is enabled.
Added a check that it is enabled before clearing it.
Fixes: 6a681cd90345 ("drm/amd/display: Add an hdmi_hpd_debounce_delay_ms module")
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 17b2c526fd8026d8e0f4c0e7f94fc517e3901589)
|
|
The power state check in amdgpu_dpm_set_powergating_by_smu() is done
before acquiring the pm mutex, leading to a race condition where:
1. Thread A checks state and thinks no change is needed
2. Thread B acquires mutex and modifies the state
3. Thread A returns without updating state, causing inconsistency
Fix this by moving the mutex lock before the power state check,
ensuring atomicity of the state check and modification.
Fixes: 6ee27ee27ba8 ("drm/amd/pm: avoid duplicate powergate/ungate setting")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7a3fbdfd19ec5992c0fc2d0bd83888644f5f2f38)
|
|
On APUs such as Raven and Renoir (GC 9.1.0, 9.2.2, 9.3.0), the ih1 and
ih2 interrupt ring buffers are not initialized. This is by design, as
these secondary IH rings are only available on discrete GPUs. See
vega10_ih_sw_init() which explicitly skips ih1/ih2 initialization when
AMD_IS_APU is set.
However, amdgpu_gmc_filter_faults_remove() unconditionally uses ih1 to
get the timestamp of the last interrupt entry. When retry faults are
enabled on APUs (noretry=0), this function is called from the SVM page
fault recovery path, resulting in a NULL pointer dereference when
amdgpu_ih_decode_iv_ts_helper() attempts to access ih->ring[].
The crash manifests as:
BUG: kernel NULL pointer dereference, address: 0000000000000004
RIP: 0010:amdgpu_ih_decode_iv_ts_helper+0x22/0x40 [amdgpu]
Call Trace:
amdgpu_gmc_filter_faults_remove+0x60/0x130 [amdgpu]
svm_range_restore_pages+0xae5/0x11c0 [amdgpu]
amdgpu_vm_handle_fault+0xc8/0x340 [amdgpu]
gmc_v9_0_process_interrupt+0x191/0x220 [amdgpu]
amdgpu_irq_dispatch+0xed/0x2c0 [amdgpu]
amdgpu_ih_process+0x84/0x100 [amdgpu]
This issue was exposed by commit 1446226d32a4 ("drm/amdgpu: Remove GC HW
IP 9.3.0 from noretry=1") which changed the default for Renoir APU from
noretry=1 to noretry=0, enabling retry fault handling and thus
exercising the buggy code path.
Fix this by adding a check for ih1.ring_size before attempting to use
it. Also restore the soft_ih support from commit dd299441654f ("drm/amdgpu:
Rework retry fault removal"). This is needed if the hardware doesn't
support secondary HW IH rings.
v2: additional updates (Alex)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3814
Fixes: dd299441654f ("drm/amdgpu: Rework retry fault removal")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6ce8d536c80aa1f059e82184f0d1994436b1d526)
Cc: stable@vger.kernel.org
|
|
v1:
resolve the issue where some freq frequencies cannot be set correctly
due to insufficient floating-point precision.
v2:
patch this convert on 'max' value only.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 53868dd8774344051999c880115740da92f97feb)
Cc: stable@vger.kernel.org
|
|
v1:
resolve the issue where some freq frequencies cannot be set correctly
due to insufficient floating-point precision.
v2:
patch this convert on 'max' value only.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6194f60c707e3878e120adeb36997075664d8429)
Cc: stable@vger.kernel.org
|
|
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio. This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio. As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.
Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.
Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.19-2026-01-22:
amdgpu:
- GC 12 fix
- Misc error path fixes
- DC analog fix
- SMU 6 fixes
- TLB flush fix
- DC idle optimization fix
amdkfd:
- GC 11 cooperative launch fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260122204308.946339-1-alexander.deucher@amd.com
|
|
This reverts commit bc6d54ac7e7436721a19443265f971f890c13cc5.
The workload profile needs to be in the default state when
the dc idle optimizaion state is entered. However, when
jobs come in for video or GFX or compute, the profile may
be set to a non-default profile resulting in the dc idle
optimizations not taking affect and resulting in higher
power usage. As such we need to pause the workload profile
changes during this transition. When this patch was originally
committed, it caused a regression with a Dell U3224KB display,
but no other problems were reported at the time. When it
was reapplied (this patch) to address increased power usage, it
seems to have caused additional regressions. This change seems
to have a number of side affects (audio issues, stuttering,
etc.). I suspect the pause should only happen when all displays
are off or in static screen mode, but I think this call site
gets called more often than that which results in idle state
entry more often than intended. For now revert.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4894
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4717
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4725
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4517
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4806
Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1412482b714358ffa30d38fd3dd0b05795163648)
|
|
dm_plane_init_colorops() allocates enum names for color pipelines.
These are eventually passed to drm_property_create_enum() which create
its own copies of the string. Free the strings after initialization
is done.
Also, allocate color pipeline enum names only after successfully creating
color pipeline.
Fixes: 9ba25915efba ("drm/amd/display: Add support for sRGB EOTF in DEGAM block")
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Acked-by: Alex Deucher <alexander.deucher@amd.com> #irc
Link: https://patch.msgid.link/20260113102303.724205-3-chaitanya.kumar.borah@intel.com
|
|
Needs to be a u64.
Fixes: 77cc0da39c7c ("drm/amdgpu: track ring state associated with a fence")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 56fff1941abd3ca3b6f394979614ca7972552f7f)
|
|
When a function holds a lock and we return without unlocking it,
it deadlocks the kernel. We should always unlock before returning.
This commit fixes suspend/resume on SI.
Tested on two Tahiti GPUs: FirePro W9000 and R9 280X.
Fixes: f4db9913e4d3 ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202601190121.z9C0uml5-lkp@intel.com/
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e3a6eff92bbd960b471966d9afccb4d584546d17)
|
|
Radeon 430 and 520 are OEM GPUs from 2016~2017
They have the same device id: 0x6611 and revision: 0x87
On the Radeon 430, powertune is buggy and throttles the GPU,
never allowing it to reach its maximum SCLK. Work around this
bug by raising the TDP limits we program to the SMC from
24W (specified by the VBIOS on Radeon 430) to 32W.
Disabling powertune entirely is not a viable workaround,
because it causes the Radeon 520 to heat up above 100 C,
which I prefer to avoid.
Additionally, revise the maximum SCLK limit. Considering the
above issue, these GPUs never reached a high SCLK on Linux,
and the workarounds were added before the GPUs were released,
so the workaround likely didn't target these specifically.
Use 780 MHz (the maximum SCLK according to the VBIOS on the
Radeon 430). Note that the Radeon 520 VBIOS has a higher
maximum SCLK: 905 MHz, but in practice it doesn't seem to
perform better with the higher clock, only heats up more.
v2:
Move the workaround to si_populate_smc_tdp_limits.
Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 966d70f1e160bdfdecaf7ff2b3f22ad088516e9f)
|
|
There is no reason to clear the SMC table.
We also don't need to recalculate the power limit then.
Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e214d626253f5b180db10dedab161b7caa41f5e9)
|
|
Use WREG32 to write mmCG_THERMAL_INT.
This is a direct access register.
Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2555f4e4a741d31e0496572a8ab4f55941b4e30e)
|
|
Analog connectors may be hot-plugged unlike other connector
types that don't support HPD.
Stop DRM from polling other connector types that don't
support HPD, such as eDP, LVDS, etc. These were wrongly
polled when analog connector support was added,
causing issues with the seamless boot process.
Fixes: c4f3f114e73c ("drm/amd/display: Poll analog connectors (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e924c7004b08e4e173782bad60b27841d889e371)
|
|
If fence emit fails, free the fence if necessary.
Fixes: db36632ea51e ("drm/amdgpu: clean up and unify hw fence handling")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5eb680a06007f2f6ea333d11a4e29039da90614b)
|
|
Restrictions on debugging cooperative launch for GFX11 devices should
align to CWSR work around requirements.
i.e. devices without the need for the work around should not be subject
to such restrictions.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: James Zhu <james.zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 230ef3977d6ffdd498ffa9baa6f5a061786189bf)
|
|
If drm_sched_job_init fails, hw_vm_fence is not freed currently,
then cause memory leak.
Fixes: db36632ea51e ("drm/amdgpu: clean up and unify hw fence handling")
Link: https://lore.kernel.org/amd-gfx/a5a828cb-0e4a-41f0-94c3-df31e5ddad52@amd.com/T/#t
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5d42ee457ccd1fb5da4c7f817825b2806ec36956)
|
|
Remove emit_frame_cntl function for gfx v12, which is not support.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5aaa5058dec5bfdcb24c42fe17ad91565a3037ca)
Cc: stable@vger.kernel.org
|
|
[Why&How]
Right now, the HDMI HPD filter is enabled by default at 1500ms.
We want to disable it by default, as most modern displays with HDMI do
not require it for DPMS mode.
The HPD can instead be enabled as a driver parameter with a custom delay
value in ms (up to 5000ms).
Fixes: c918e75e1ed9 ("drm/amd/display: Add an HPD filter for HDMI")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4859
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6a681cd9034587fe3550868bacfbd639d1c6891f)
|
|
The user mode queue keeps a pointer to the most recent fence in
userq->last_fence. This pointer holds an extra dma_fence reference.
When the queue is destroyed, we free the fence driver and its xarray,
but we forgot to drop the last_fence reference.
Because of the missing dma_fence_put(), the last fence object can stay
alive when the driver unloads. This leaves an allocated object in the
amdgpu_userq_fence slab cache and triggers
This is visible during driver unload as:
BUG amdgpu_userq_fence: Objects remaining on __kmem_cache_shutdown()
kmem_cache_destroy amdgpu_userq_fence: Slab cache still has objects
Call Trace:
kmem_cache_destroy
amdgpu_userq_fence_slab_fini
amdgpu_exit
__do_sys_delete_module
Fix this by putting userq->last_fence and clearing the pointer during
amdgpu_userq_fence_driver_free().
This makes sure the fence reference is released and the slab cache is
empty when the module exits.
v2: Update to only release userq->last_fence with dma_fence_put()
(Christian)
Fixes: edc762a51c71 ("drm/amdgpu/userq: move some code around")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8e051e38a8d45caf6a866d4ff842105b577953bb)
|
|
Each queue of the process is individually removed and there is not need
to suspend whole mes. Suspending mes stops kernel mode queues also
causing unnecessary timeouts when running mixed work loads
Fixes: 079ae5118e1f ("drm/amdkfd: fix suspend/resume all calls in mes based eviction path")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4765
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3fd20580b96a6e9da65b94ac3b58ee288239b731)
|
|
This reverts commit 820b3d376e8a102c6aeab737ec6edebbbb710e04.
It’s better to validate VM TLB flushes in the flush‑TLB backend
rather than in the generic VM layer.
Reverting this patch depends on
commit fa7c231fc2b0 ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
being present in the tree.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9163fe4d790fb4e16d6b0e23f55b43cddd3d4a65)
|
|
Validate flush_gpu_tlb_pasid() availability before flushing tlb.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f4db9913e4d3dabe9ff3ea6178f2c1bc286012b8)
|
|
resolving the issue of incorrect type definitions potentially causing calculation errors.
Fixes: 54f7f3ca982a ("drm/amdgpu/swm14: Update power limit logic")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e3a03d0ae16d6b56e893cce8e52b44140e1ed985)
|
|
Internal backlight levels are initialised from ACPI but the values
are sometimes out of sync with the levels in effect until there has
been a read from hardware (eg triggered by reading from sysfs).
This means that the first drm_commit can cause the levels to be set
to a different value than the actual starting one, which results in
a sudden change in brightness.
This path shows the problem (when the values are out of sync):
amdgpu_dm_atomic_commit_tail()
-> amdgpu_dm_commit_streams()
-> amdgpu_dm_backlight_set_level(..., dm->brightness[n])
This patch calls the backlight ops get_brightness explicitly
at the end of backlight registration to make sure dm->brightness[n]
is in sync with the actual hardware levels.
Fixes: 2fe87f54abdc ("drm/amd/display: Set default brightness according to ACPI")
Signed-off-by: Vivek Das Mohapatra <vivek@collabora.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 318b1c36d82a0cd2b06a4bb43272fa6f1bc8adc1)
Cc: stable@vger.kernel.org
|
|
[Why]
DP-HDMI dongles can execeed bandwidth requirements on high resolution
monitors. This can lead to pruning the high resolution modes.
HDMI 1.3 bumped the clock to 340MHz, but display code never matched it.
[How]
Set default to (DVI) 165MHz. Once HDMI display is identified update
to 340MHz.
Reported-by: Dianne Skoll <dianne@skoll.ca>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4780
Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ac1e65d8ade46c09fb184579b81acadf36dcb91e)
Cc: stable@vger.kernel.org
|
|
[Why]
The PSR message was moved in commit 4321742c394e ("drm/amd/display:
Move PSR support message into amdgpu_dm"). This message however shows
for every single link without showing which link is which. This can
send a confusing message to the user.
[How]
Add link name into the message.
Fixes: 4321742c394e ("drm/amd/display: Move PSR support message into amdgpu_dm")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 99f77f6229c0766b980ae05affcf9f742d97de6a)
|
|
If dqm->ops.initialize() fails, add deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up to ensure proper function
visibility at the point of use.
Fixes: 11614c36bc8f ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7cccc8286bb9919a0952c812872da1dcfe9d390)
Cc: stable@vger.kernel.org
|
|
These IOCTLs shouldn't be called when userqs are not
enabled. Make sure they are enabled before executing
the IOCTLs.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d967509651601cddce7ff2a9f09479f3636f684d)
Cc: stable@vger.kernel.org
|
|
Use dst input parameter to setup gart page table entries instead of using fixed
location.
Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)")
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ca5d4db8db843be7ed35fc9334737490c2b58d32)
|
|
GC12 VRAM surfaces"
This reverts commit 22a36e660d014925114feb09a2680bb3c2d1e279 once,
which was merged twice due to an incorrect backmerge resolution.
Fixes: ce0478b02ed2 ("Merge tag 'v6.18-rc6' into drm-next")
Signed-off-by: Peter Colberg <pcolberg@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 38a0f4cf8c6147fd10baa206ab349f8ff724e391)
|
|
When an eGPU is unplugged the KFD topology should also be destroyed
for that GPU. This never happens because the fini_sw callbacks never
get to run. Run them manually before calling amdgpu_device_ip_fini_early()
when a device has already been disconnected.
This location is intentionally chosen to make sure that the kfd locking
refcount doesn't get incremented unintentionally.
Cc: kent.russell@amd.com
Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6a23e7b4332c10f8b56c33a9c5431b52ecff9aab)
Cc: stable@vger.kernel.org
|
|
When driver not support atomic, fb using plane->fb rather than
plane->state->fb.
Fixes: fe151ed7af54 ("drm/amdgpu: add generic display panic helper code")
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2f2a72de673513247cd6fae14e53f6c40c5841ef)
|
|
Fix copy&paste error, that should have been an assignment instead of an or,
otherwise MTYPE_UC 0x3 can not be updated to MTYPE_RW 0x1.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fc1366016abe4103c0f0fac882811aea961ef213)
Cc: stable@vger.kernel.org
|
|
Skipping power ungate exposed some scenarios that will fail
like below:
```
amdgpu: Register(0) [regVPEC_QUEUE_RESET_REQ] failed to reach value 0x00000000 != 0x00000001n
amdgpu 0000:c1:00.0: amdgpu: VPE queue reset failed
...
amdgpu: [drm] *ERROR* wait_for_completion_timeout timeout!
```
The underlying s2idle issue that prompted this commit is going to
be fixed in BIOS.
This reverts commit 2a6c826cfeedd7714611ac115371a959ead55bda.
This was lost in the 6.19 merge so reapply it.
Fixes: 2a6c826cfeed ("drm/amd: Skip power ungate during suspend for VPE")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Konstantin <answer2019@yandex.ru>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220812
Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3925683515e93844be204381d2d5a1df5de34f31)
|
|
dac_load_detection can be NULL in some scenario, so checking it before
calling.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 179176134b535246f0b368b30e8ecad50066f896)
|
|
During Mode 1 reset, the ASIC undergoes a reset cycle and becomes
temporarily inaccessible via PCIe. Any attempt to access MMIO registers
during this window (e.g., from interrupt handlers or other driver threads)
can result in uncompleted PCIe transactions, leading to NMI panics or
system hangs.
To prevent this, set the `no_hw_access` flag to true immediately after
triggering the reset. This signals other driver components to skip
register accesses while the device is offline.
A memory barrier `smp_mb()` is added to ensure the flag update is
globally visible to all cores before the driver enters the sleep/wait
state.
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7edb503fe4b6d67f47d8bb0dfafb8e699bb0f8a4)
|
|
[Why]
Query for VPE block_type and ip_count is missing.
[How]
Add VPE case in ip_block_type and hw_ip_count query.
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a6ea0a430aca5932b9c75d8e38deeb45665dd2ae)
Cc: stable@vger.kernel.org
|
|
Apparently the DAC encoder needs to be set up before use.
The BIOS parser in DC did not support this so I assumed it was
not necessary, but the DAC doesn't work without it on some GPUs.
Fixes: 69b29b894660 ("drm/amd/display: Hook up DAC to bios_parser_encoder_control")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bb5dfe2f5630ce344c654c705d28b4e20cb9d334)
|
|
Pass the correct enum values as expected by the VBIOS.
Previously the actual bit depth integer value was passed,
which was a mistake.
Fixes: 7fb4f254c8eb ("drm/amd/display: Add SelectCRTC_Source to BIOS parser")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cdf6e4c0cdab129ffc4e41a8ac53a0738f805072)
|