<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/gpu/drm/amd/amdgpu, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge tag 'amd-drm-next-7.2-2026-06-04' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2026-06-08T09:57:35+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2026-06-08T09:56:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5ea194bc25518376dee1209e64d32f940dd0cc4c'/>
<id>5ea194bc25518376dee1209e64d32f940dd0cc4c</id>
<content type='text'>
amd-drm-next-7.2-2026-06-04:

amdgpu:
- UserQ fix
- Userptr fix
- MCCS freesync fix
- Remove some triggerable BUG() calls
- DCN 4.2.1 fixes
- Lockdep annotations
- Guilty handling fix
- VCN 5.3 fix
- FRL fixes
- Bounds checking fixes
- HMM fix
- IRQ accounting fix

amdkfd:
- Fix an event information leak
- Events bounds check fix
- Trap cleanup fix
- Bounds checking fixes
- MES fix

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patch.msgid.link/20260604231801.19979-1-alexander.deucher@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
amd-drm-next-7.2-2026-06-04:

amdgpu:
- UserQ fix
- Userptr fix
- MCCS freesync fix
- Remove some triggerable BUG() calls
- DCN 4.2.1 fixes
- Lockdep annotations
- Guilty handling fix
- VCN 5.3 fix
- FRL fixes
- Bounds checking fixes
- HMM fix
- IRQ accounting fix

amdkfd:
- Fix an event information leak
- Events bounds check fix
- Trap cleanup fix
- Bounds checking fixes
- MES fix

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patch.msgid.link/20260604231801.19979-1-alexander.deucher@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/gfx: move fault and EOP IRQ get/put to hw_init/hw_fini</title>
<updated>2026-06-04T19:35:19+00:00</updated>
<author>
<name>Yunxiang Li</name>
<email>Yunxiang.Li@amd.com</email>
</author>
<published>2026-05-27T22:05:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9117d8be850baf0f89b65ff399442fb59b1a1beb'/>
<id>9117d8be850baf0f89b65ff399442fb59b1a1beb</id>
<content type='text'>
priv_reg / priv_inst / bad_op and (on v11+) userq EOP IRQs are
acquired in late_init but released in hw_fini.  This split forced
gfx_v9_0_hw_fini() to defensively guard each put with
amdgpu_irq_enabled() because hw_fini runs on paths that may not
reach late_init.

amdgpu_ip_block_hw_fini() only runs after hw_init returns success,
and suspend / resume cycle the refs through the same path, so
hw_init / hw_fini pair without any extra tracking.  Move the gets
there and drop the guards.

While here, fix the pre-existing partial-failure leak in
set_userq_eop_interrupts() (gfx11 / 12_0 / 12_1).  amdgpu_irq_get()
increments the refcount before calling .set, so a failure partway
through the loop leaves earlier successful gets stranded.  Track
the loop position and roll back on the enable path.

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
priv_reg / priv_inst / bad_op and (on v11+) userq EOP IRQs are
acquired in late_init but released in hw_fini.  This split forced
gfx_v9_0_hw_fini() to defensively guard each put with
amdgpu_irq_enabled() because hw_fini runs on paths that may not
reach late_init.

amdgpu_ip_block_hw_fini() only runs after hw_init returns success,
and suspend / resume cycle the refs through the same path, so
hw_init / hw_fini pair without any extra tracking.  Move the gets
there and drop the guards.

While here, fix the pre-existing partial-failure leak in
set_userq_eop_interrupts() (gfx11 / 12_0 / 12_1).  amdgpu_irq_get()
increments the refcount before calling .set, so a failure partway
through the loop leaves earlier successful gets stranded.  Track
the loop position and roll back on the enable path.

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: drop retry loop in amdgpu_hmm_range_get_pages</title>
<updated>2026-06-04T19:26:05+00:00</updated>
<author>
<name>Honglei Huang</name>
<email>honghuan@amd.com</email>
</author>
<published>2026-05-29T02:23:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=342981fff32802a819d6fc7cf3c9fedf9f3d9d60'/>
<id>342981fff32802a819d6fc7cf3c9fedf9f3d9d60</id>
<content type='text'>
Since commit c08972f55594 ("drm/amdgpu: fix amdgpu_hmm_range_get_pages")
moved mmu_interval_read_begin() out of the per-chunk loop, the
captured notifier_seq is no longer refreshed across retries. As a
result, the existing -EBUSY retry path can never make progress:

  hmm_range_fault() returns -EBUSY only when
  mmu_interval_check_retry(notifier, notifier_seq) reports that the
  sequence is stale. Once the sequence has advanced, the stored seq
  will never match again, so every subsequent call within the same
  invocation returns -EBUSY immediately.

The "goto retry" therefore degenerates into a busy spin that simply
burns CPU for the full HMM_RANGE_DEFAULT_TIMEOUT (~1s) window before
finally bailing out with -EAGAIN. This is pure latency with no chance
of recovery, and it actively hurts the KFD userptr stack: the caller
ends up blocked for a second while holding mmap_lock, only to return
-EAGAIN to the restore worker (or to userspace) which would have
re-driven the operation immediately anyway.

Drop the retry/timeout entirely and let -EBUSY propagate straight to
out_free_pfns, where it is already translated to -EAGAIN. Recovery is
handled at a higher level: the KFD restore_userptr_worker reschedules
itself, and the userptr ioctl path returns -EAGAIN to userspace.

No functional regression: the previous behaviour on -EBUSY was already
to fail with -EAGAIN after a 1s stall; we just skip the stall.

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Honglei Huang &lt;honghuan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit c08972f55594 ("drm/amdgpu: fix amdgpu_hmm_range_get_pages")
moved mmu_interval_read_begin() out of the per-chunk loop, the
captured notifier_seq is no longer refreshed across retries. As a
result, the existing -EBUSY retry path can never make progress:

  hmm_range_fault() returns -EBUSY only when
  mmu_interval_check_retry(notifier, notifier_seq) reports that the
  sequence is stale. Once the sequence has advanced, the stored seq
  will never match again, so every subsequent call within the same
  invocation returns -EBUSY immediately.

The "goto retry" therefore degenerates into a busy spin that simply
burns CPU for the full HMM_RANGE_DEFAULT_TIMEOUT (~1s) window before
finally bailing out with -EAGAIN. This is pure latency with no chance
of recovery, and it actively hurts the KFD userptr stack: the caller
ends up blocked for a second while holding mmap_lock, only to return
-EAGAIN to the restore worker (or to userspace) which would have
re-driven the operation immediately anyway.

Drop the retry/timeout entirely and let -EBUSY propagate straight to
out_free_pfns, where it is already translated to -EAGAIN. Recovery is
handled at a higher level: the KFD restore_userptr_worker reschedules
itself, and the userptr ioctl path returns -EAGAIN to userspace.

No functional regression: the previous behaviour on -EBUSY was already
to fail with -EAGAIN after a 1s stall; we just skip the stall.

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Honglei Huang &lt;honghuan@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: validate the mes firmware version for gfx12.1</title>
<updated>2026-06-04T19:25:52+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-06-01T14:45:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bfc6042540b7795d2f96a6ddc71442f74438dc73'/>
<id>bfc6042540b7795d2f96a6ddc71442f74438dc73</id>
<content type='text'>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: validate the mes firmware version for gfx12</title>
<updated>2026-06-04T19:25:49+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-06-01T14:44:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1720970f607d17b43274a06a7fd919e37a429281'/>
<id>1720970f607d17b43274a06a7fd919e37a429281</id>
<content type='text'>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: compare MES firmware version ucode for gfx11</title>
<updated>2026-06-04T19:25:46+00:00</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-06-01T14:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd7bd8e0f0c47361a3a513d6aa8ea2b36dd70deb'/>
<id>dd7bd8e0f0c47361a3a513d6aa8ea2b36dd70deb</id>
<content type='text'>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MES firmware should report the same version whether read from
the register or from the firmware ucode binary. This is not
always the case, so add a log when they mismatch.

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: restart the CS if some parts of the VM are still invalidated</title>
<updated>2026-06-04T19:24:48+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-02-25T14:12:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=59720bfd8c6dbebeb8d5a7ab64241b007efd9213'/>
<id>59720bfd8c6dbebeb8d5a7ab64241b007efd9213</id>
<content type='text'>
Make sure that we only submit work with full up to date VM page tables.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Tested-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make sure that we only submit work with full up to date VM page tables.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Tested-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/userq: Fix reading timeline points in wait ioctl</title>
<updated>2026-06-04T19:24:40+00:00</updated>
<author>
<name>David Rosca</name>
<email>david.rosca@amd.com</email>
</author>
<published>2025-09-13T14:51:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0ac98160dfb6ab3c6d7b38e0ff9687780beed9cb'/>
<id>0ac98160dfb6ab3c6d7b38e0ff9687780beed9cb</id>
<content type='text'>
Use correct u64 type.

Signed-off-by: David Rosca &lt;david.rosca@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use correct u64 type.

Signed-off-by: David Rosca &lt;david.rosca@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu/vcn5.0.0: enable secure submission on unified ring for VCN 5.3.0</title>
<updated>2026-06-04T19:24:34+00:00</updated>
<author>
<name>Jeevana Muthyala</name>
<email>Jeevana.Muthyala2@amd.com</email>
</author>
<published>2026-05-14T10:56:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=869de64649cf54d55e196597f819b8c8befe39d0'/>
<id>869de64649cf54d55e196597f819b8c8befe39d0</id>
<content type='text'>
Enable secure submission support on the unified ring for VCN IP version
5.3.0 by setting `secure_submission_supported = true` in
vcn_v5_0_0_unified_ring_vm_funcs.

Secure IB submission is supported on VCN 5.3.0 hardware/firmware,
allowing protected decode workloads to bypass the common IB gate.
Without this, secure playback submissions can be blocked and fail.

Other VCN 5.x variants using the same vcn_v5_0_0_ip_block
(e.g. IP_VERSION(5, 0, 0)) do not support secure submission
on the unified ring and therefore continue using non-secure paths.

This change only advertises existing hardware/firmware capability;
non-secure decode paths remain unaffected.

Signed-off-by: Jeevana Muthyala &lt;Jeevana.Muthyala2@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Enable secure submission support on the unified ring for VCN IP version
5.3.0 by setting `secure_submission_supported = true` in
vcn_v5_0_0_unified_ring_vm_funcs.

Secure IB submission is supported on VCN 5.3.0 hardware/firmware,
allowing protected decode workloads to bypass the common IB gate.
Without this, secure playback submissions can be blocked and fail.

Other VCN 5.x variants using the same vcn_v5_0_0_ip_block
(e.g. IP_VERSION(5, 0, 0)) do not support secure submission
on the unified ring and therefore continue using non-secure paths.

This change only advertises existing hardware/firmware capability;
non-secure decode paths remain unaffected.

Signed-off-by: Jeevana Muthyala &lt;Jeevana.Muthyala2@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/amdgpu: deprecate guilty handling</title>
<updated>2026-06-04T19:24:29+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2026-05-05T13:40:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=182bdd59be41595e211ac98406d3637fc6141017'/>
<id>182bdd59be41595e211ac98406d3637fc6141017</id>
<content type='text'>
The guilty handling tried to establish a second way of signaling problems with
the GPU back to userspace. This caused quite a bunch of issue we had to work
around, especially lifetime issues with the drm_sched_entity.

Just drop the handling altogether and use the dma_fence based approach instead.

v2: fix reversed condition in entity check (Alex)

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The guilty handling tried to establish a second way of signaling problems with
the GPU back to userspace. This caused quite a bunch of issue we had to work
around, especially lifetime issues with the drm_sched_entity.

Just drop the handling altogether and use the dma_fence based approach instead.

v2: fix reversed condition in entity check (Alex)

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
