| Age | Commit message (Collapse) | Author |
|
Add a generic function to map IH node-id to XCC instance.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a separate interrupt handler for handling interrupts,
both retry and no-retry, for GFX 12.1.0.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the UTCL2 retry fault interrupt for both GCVM and MMVM for GFX 12.1.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Needed to query the CSA size and alignment for SDMA
user queues.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
v4:
use func "amdgpu_gfx_get_hdp_flush_mask" to get ref_and_mask for
gfx9 through gfx12.
v3:
Unify the get_ref_and_mask function in amdgpu_gfx_funcs,
to support both GFX11 and earlier generations
v2:
place "get_ref_and_mask" in amdgpu_gfx_funcs instead of amdgpu_ring,
since this function only assigns the cp entry.
v1:
both gfx ring and mes ring use cp0 to flush hdp, cause conflict.
use function get_ref_and_mask to assign the cp entry.
reassign mes to use cp8 instead.
Signed-off-by: chong li <chongli2@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This is used by firmware for compute user queues.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add vram_type to ras_ta_init_flags.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update SDMA instances/masks according to xcc num for
multi-xcc models on soc v1.0.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Initialize xcp manager for soc v1_0
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Implement xcp mgr callbacks for soc v1_0
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To be used by soc v1_0 xcp manager
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To be used by soc v1_0 xcp manager
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add vram_type to ras init_flags.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_virt_ras_get_cper_records() was using a large stack array
of ras_log_info pointers. This contributed to the frame size
warning on this function.
Replace the fixed-size stack array:
struct ras_log_info *trace[MAX_RECORD_PER_BATCH];
with a heap-allocated array using kcalloc().
We free the trace buffer together with out_buf on all exit paths.
If allocation of trace or out_buf fails, we return a generic RAS
error code.
This reduces stack usage and keeps the runtime behaviour
unchanged.
Fixes:
stack frame size: 1112 bytes (limit: 1024)
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only check and drain IH1 ring if CAM is not enabled.
If GPU is under reset, don't access IH to drain retry fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Deeply daisy chained DP/MST displays are no longer able to light
up. This reverts commit e0dec00f3d05 ("drm/amd/display: Fix pbn
to kbps Conversion")
Cc: Jerry Zuo <jerry.zuo@amd.com>
Reported-by: nat@nullable.se
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4756
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To enable switching compute partition mode
v2: cleanup (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use gfx v12_1 callback to query the numbers of xccs
per xcp
v2: add todo (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove the redundant check for async_gfx_ring,
as it is not required for gfx v12_1
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable doorbell range for graphics engine on gfx v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable unmapped doorbell handling for gfx v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Revision doorbell range on muti-XCC mode for gfx v12_1.
Clean up doorbell range set for graphics engine.
V2: Remove doorbell range set from gfx_v12_1_xcc_kiq_init_register.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Shader messages to deallocate VGPRs prior to shader end can prevent
the trap handler from saving context, making debugging and core dumps
unreliable.
VGPR deallocations for performance gain is negligible.
GC 12.1 will NOP shader VGPR deallocation messages via HW
settings on driver boot.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
PMFW is integrated into ifwi for gfx 12_1 adapter,
making PMFW backdoor loading unnecessary.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Unbinding amdgpu has no problems, but binding it again leads to an
error of sysfs file already existing. This is because it wasn't
actually cleaned up on unbind. Add the missing cleanup step.
Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_acpi_detect() calls some helper functions it calls have large
local structures. When the compiler inlines these helpers, their local
data adds to the amdgpu_acpi_detect() stack frame.
Mark the helpers with noinline_for_stack:
- amdgpu_atif_verify_interface()
- amdgpu_atif_get_notification_params()
- amdgpu_atif_query_backlight_caps()
- amdgpu_atcs_verify_interface()
- amdgpu_acpi_enumerate_xcc()
This keeps the large temporary objects inside the helper’s own stack
frame instead of being inlined into the caller, preventing the caller
from growing beyond the stack limit.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1403:6: warning: stack frame size (1688) exceeds limit (1024) in 'amdgpu_acpi_detect' [-Wframe-larger-than]
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY & HOW]
Make a dedicated function to read HDMI-related monitor info, including
monitor's SCDC support.
Fixes: 3471b9a31ce3 ("drm/amd/display: Rework HDMI data channel reads")
Suggested-by: Fangzhi Zuo <jerry.zuo@amd.com>
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c78e31bcf586f1c910a2636650840f5ce1cb1c63)
|
|
GFX1151 has 1.5x the number of available physical VGPRs per SIMD.
Bump total memory availability for acquire checks on queue creation.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b42f3bf9536c9b710fd1d4deb7d1b0dc819dc72d)
Cc: stable@vger.kernel.org
|
|
On a 32-bit ARM system, the audio_decoder struct ends up being too large
for dp_retrain_link_dp_test.
link_dp_cts.c:157:1: error: the frame size of 1328 bytes is larger than
1280 bytes [-Werror=frame-larger-than=]
This is mitigated by shrinking the members of the struct and avoids
having to deal with dynamic allocation.
feed_back_divider is assigned but otherwise unused. Remove both.
pixel_repetition looks like it should be a bool since it's only ever
assigned to 1. But there are checks for 2 and 4. Reduce to uint8_t.
Remove ss_percentage_divider. Unused.
Shrink refresh_rate as it gets assigned to at most a 3 digit integer
value.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3849efdc7888d537f09c3dcfaea4b3cd377a102e)
|
|
This is important for userspace to avoid hardcoding VGPR size.
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 71776e0965f9f730af19c5f548827f2a7c91f5a8)
Cc: stable@vger.kernel.org
|
|
[WHAT]
When compiling Linux kernel with clang, the following warning / error
messages pops up:
drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/dml2_0/display_mode_core.c:6853:12:
error: stack frame size (2120) exceeds limit (2056) in
'dml_core_mode_support' [-Werror,-Wframe-larger-than]
6853 | dml_bool_t dml_core_mode_support(struct display_mode_lib_st
*mode_lib)
[HOW]
Refactoring CalculateVMRowAndSwath_params assignments to a new function
helps reduce the stack frame size in dml_core_mode_support.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4733
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 29a4dc4b5d82e6b3da343391f9e784cf5c48732c)
|
|
SI hardware doesn't support pasids, user mode queues, or
KIQ/MES so there is no need for this. Doing so results in
a segfault as these callbacks are non-existent for SI.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4744
Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 820b3d376e8a102c6aeab737ec6edebbbb710e04)
|
|
This can get called from an atomic context.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4470
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8acdad9344cc7b4e7bc01f0dfea80093eb3768db)
Cc: stable@vger.kernel.org
|
|
The trap may be entered with dependency checking disabled.
Wait for dependency counters and save/restore scheduling mode.
v2:
Use ttmp1 instead of ttmp11. ttmp11 is not zero-initialized.
While the trap handler does zero this field before use, a user-mode
second-level trap handler could not rely on this being zero when
using an older kernel mode driver.
v3:
Use ttmp11 primarily but copy to ttmp1 before jumping to the
second level trap handler. ttmp1 is inspectable by a debugger.
Unexpected bits in the unused space may regress existing software.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 423888879412e94725ca2bdccd89414887d98e31)
Cc: stable@vger.kernel.org
|
|
When split svm ranges that have been mapped using huge page should use huge
page size(2MB) to check split range alignment, not prange->granularity that
means migration granularity.
Fixes: 7ef6b2d4b7e5 ("drm/amdkfd: remap unaligned svm ranges that have split")
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 448ee45353ef9fb1a34f5f26eb3f48923c6f0898)
|
|
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
drm_sched_job_add_resv_dependencies can fail in amdgpu_ttm_prepare_job.
In this case we need to use amdgpu_job_free to release memory.
---
v4: moved job pointer clearing to a different patchset
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Deduplicate the IB padding code and will also be used
later to check locking.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No functional change for now, but this struct will have more
fields added in the next commit.
This change would introduce synchronisation issue, because
dependencies between successive jobs are not taken care of
properly. For instance, amdgpu_ttm_clear_buffer uses
amdgpu_ttm_map_buffer then amdgpu_ttm_fill_mem which should
use different entities (default_entity then move/clear entity).
To prevent failures for this commit, we limit ourselves to
2 entities: default_entity (which replaces high_pr usages) and
clear_entity (which replaces low_pr usages).
The next commits will deal with these dependencies correctly,
and then we'll be able to use move_entity.
---
v2: renamed amdgpu_ttm_buffer_entity
v4: don't use move_entity in ttm yet
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Acked-by: Felix Kuehling <felix.kuehling@amd.com> (v3)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add IMU support for gc version 12.1.0.
Only support imu fw loading for imu 12.1.0.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix some code error for muti-xcc on mes v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update number of mmhub and mid_mask via reuse aid_mask.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set GFXHUB accodring to xcc_mask for gfx version 12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add XCC id info for compute ring name on gfx version 12.1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
GFX1151 has 1.5x the number of available physical VGPRs per SIMD.
Bump total memory availability for acquire checks on queue creation.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Revert the change to enable retry based thrashing prevention on GFX 12.1.0
for now as its causing data mismatch and slowness issues with multiple HIP
tests.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Configure a single mes instance if the xcc_mask remains
uninitialized.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For GFX 12.1.0, setup correct MTYPE for a BO depending on
its current location relative to the mapping GPU.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|