path: root/drivers/gpu/drm
Age  Commit message  Author
2026-03-30  drm/amd/display: Enable Replay support for dcn42  Roman Li
Add DCN4.2 to the list that supports Panel Replay feature. Reviewed-by: Alex Hung <Alex.Hung@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amd/display: Remove check for DC_DMCUB_ENABLE on DCN42  Gabe Teeger
[why] DCN without DMCUB is not a supported configuration on DCN42. [how] Remove the DC_DMCUB_ENABLE fuse register check and remove the corresponding entries in the DCN42 DMUB register list. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amd/display: bios_parser: fix GPIO I2C line off-by-one  Pengpeng Hou
get_gpio_i2c_info() computes the number of GPIO I2C assignment records present in the BIOS table and then uses bfI2C_LineMux as an array index into header->asGPIO_Info[]. The current check only rejects values strictly larger than the record count, so an index equal to count still falls through and reaches the fixed table one element past the end. Reject indices at or above the number of available records before using them as an array index. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
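The bounds check described above can be sketched in plain C as follows. This is only an illustration of the off-by-one, not the driver code: `struct gpio_i2c_record` and `lookup_record()` are hypothetical names, and the table is a simple array rather than the real BIOS structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one GPIO I2C assignment record. */
struct gpio_i2c_record {
	unsigned char id;
};

/*
 * Store a pointer to table[index] in *out and return true only when
 * index names a record that is actually present. Valid indices are
 * 0 .. count - 1, so index == count has to be rejected too.
 */
bool lookup_record(const struct gpio_i2c_record *table, size_t count,
		   size_t index, const struct gpio_i2c_record **out)
{
	if (index >= count)	/* a strict "index > count" test lets count through */
		return false;

	*out = &table[index];
	return true;
}
```

With three records, indices 0 to 2 succeed and index 3 is refused, which is the case the older strict comparison missed.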
2026-03-30  drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB  Donet Tom
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with 4K pages, both values match (8KB), so allocation and reserved space are consistent. However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB, while the reserved trap area remains 8KB. This mismatch causes the kernel to crash when running rocminfo or rccl unit tests. Kernel attempted to read user page (2) - exploit attempt? (uid: 1001) BUG: Kernel NULL pointer dereference on read at 0x00000002 Faulting instruction address: 0xc0000000002c8a64 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY Tainted: [E]=UNSIGNED_MODULE Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730 REGS: c0000001e0957580 TRAP: 0300 Tainted: G E MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268 XER: 00000036 CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000 IRQMASK: 1 GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100 c00000013d814540 GPR04: 0000000000000002 c00000013d814550 0000000000000045 0000000000000000 GPR08: c00000013444d000 c00000013d814538 c00000013d814538 0000000084002268 GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff 0000000000020000 GPR16: 0000000000000000 0000000000000002 c00000015f653000 0000000000000000 GPR20: c000000138662400 c00000013d814540 0000000000000000 c00000013d814500 GPR24: 0000000000000000 0000000000000002 c0000001e0957888 c0000001e0957878 GPR28: c00000013d814548 0000000000000000 c00000013d814540 c0000001e0957888 NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0 LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00 Call Trace: 0xc0000001e0957890 (unreliable) __mutex_lock.constprop.0+0x58/0xd00 
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu] kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu] kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu] kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu] kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu] kfd_ioctl+0x514/0x670 [amdgpu] sys_ioctl+0x134/0x180 system_call_exception+0x114/0x300 system_call_vectored_common+0x15c/0x2ec This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve 64 KB for the trap in the address space, but only allocate 8 KB within it. With this approach, the allocation size never exceeds the reserved area. Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
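The size mismatch in this commit message comes down to a little arithmetic, sketched below. The macro and function names are illustrative, not the kernel's; the sketch assumes the new allocation is two 4 KB GPU pages, which matches the "only allocate 8 KB" statement above.

```c
#include <stdbool.h>
#include <stdint.h>

#define KIB(x) ((uint64_t)(x) << 10)

/* Sizes taken from the commit message; the names are illustrative. */
#define OLD_RESERVED_TRAP_SIZE	KIB(8)
#define NEW_RESERVED_TRAP_SIZE	KIB(64)
#define GPU_PAGE_SIZE		KIB(4)

/* Old scheme: the TBA/TMA allocation is two CPU pages. */
uint64_t old_cwsr_size(uint64_t cpu_page_size)
{
	return 2 * cpu_page_size;
}

/* New scheme (assumed): the allocation is two GPU pages, 8 KB everywhere. */
uint64_t new_cwsr_size(void)
{
	return 2 * GPU_PAGE_SIZE;
}

/* The allocation must never be larger than the reserved VA range. */
bool fits(uint64_t alloc_size, uint64_t reserved_size)
{
	return alloc_size <= reserved_size;
}
```

With 4K CPU pages the old allocation is 8 KB and fits the 8 KB reservation. With 64K CPU pages it grows to 128 KB and does not fit, whereas the new 8 KB allocation fits the 64 KB reservation on either page size.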
2026-03-30  drm/amdgpu/userq: Fix the code alignment for readability  Sunil Khatri
Fix the code alignment of the if condition and add a blank line between the multi-line if condition and the next statement. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu: reset ras eeprom table when it is invalid  Gangliang Xie
Reset the RAS EEPROM table when it is invalid. Signed-off-by: Gangliang Xie <ganglxie@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu/userq: fix memory leak in MQD creation error paths  Junrui Luo
In mes_userq_mqd_create(), the memdup_user() allocations for IP-specific MQD structs are not freed when subsequent VA validation fails. The goto free_mqd label only cleans up the MQD BO object and userq_props. Fix by adding kfree() before each goto free_mqd on VA validation failure in the COMPUTE, GFX, and SDMA branches. Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
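The leak pattern and its fix can be sketched in userspace C. Everything here is a stand-in: `dup_buf()` plays the role of memdup_user(), `free_buf()` the role of kfree(), and `create_queue_sketch()` is a hypothetical reduction of the creation path. The `live_copies` counter exists only so the sketch can be checked for leaks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

int live_copies;	/* outstanding duplicated buffers, for leak checking */

/* Userspace stand-in for memdup_user(): duplicate a caller buffer. */
void *dup_buf(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p) {
		memcpy(p, src, len);
		live_copies++;
	}
	return p;
}

/* Userspace stand-in for kfree(). */
void free_buf(void *p)
{
	if (p) {
		free(p);
		live_copies--;
	}
}

int create_queue_sketch(const void *user_mqd, size_t len, bool va_valid)
{
	int ret = 0;
	void *mqd_copy = dup_buf(user_mqd, len);

	if (!mqd_copy)
		return -ENOMEM;

	if (!va_valid) {
		free_buf(mqd_copy);	/* the fix: release the copy before bailing out */
		ret = -EINVAL;
		goto free_mqd;
	}

	/* ... the copy would be consumed here ... */
	free_buf(mqd_copy);
	return 0;

free_mqd:
	/* This label only tears down other resources, never mqd_copy. */
	return ret;
}
```

Because the shared error label does not know about the per-branch copy, each branch has to free its own copy before jumping, which is what the commit does for the COMPUTE, GFX, and SDMA cases.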
2026-03-30  drm/amd: Fix MQD and control stack alignment for non-4K  Donet Tom
For gfxV9, due to a hardware bug ("based on the comments in the code here [1]"), the control stack of a user-mode compute queue must be allocated immediately after the page boundary of its regular MQD buffer. To handle this, we allocate an enlarged MQD buffer where the first page is used as the MQD and the remaining pages store the control stack. Although these regions share the same BO, they require different memory types: the MQD must be UC (uncached), while the control stack must be NC (non-coherent), matching the behavior when the control stack is allocated in user space. This logic works correctly on systems where the CPU page size matches the GPU page size (4K). However, the current implementation aligns both the MQD and the control stack to the CPU PAGE_SIZE. On systems with a larger CPU page size, the entire first CPU page is marked UC—even though that page may contain multiple GPU pages. The GPU treats the second 4K GPU page inside that CPU page as part of the control stack, but it is incorrectly mapped as UC. This patch fixes the issue by aligning both the MQD and control stack sizes to the GPU page size (4K). The first 4K page is correctly marked as UC for the MQD, and the remaining GPU pages are marked NC for the control stack. This ensures proper memory type assignment on systems with larger CPU page sizes. [1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118 Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
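A small sketch of the layout computation may help show why the alignment unit matters. The struct and function names are hypothetical and the MQD and control stack byte counts in the usage note are made-up inputs; only the 4K GPU page size comes from the commit message.

```c
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u	/* GPU page size stated in the commit message */

/* Round v up to a multiple of a; a must be a power of two. */
uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

struct mqd_layout {
	uint64_t uc_bytes;	/* MQD region, mapped uncached (UC) */
	uint64_t nc_offset;	/* start of the control stack */
	uint64_t nc_bytes;	/* control stack region, mapped non-coherent (NC) */
};

/* Split one buffer into an MQD part and a control stack part. */
struct mqd_layout compute_mqd_layout(uint64_t mqd_bytes,
				     uint64_t ctl_stack_bytes,
				     uint64_t align)
{
	struct mqd_layout l;

	l.uc_bytes = align_up(mqd_bytes, align);
	l.nc_offset = l.uc_bytes;
	l.nc_bytes = align_up(ctl_stack_bytes, align);
	return l;
}
```

For a 512-byte MQD and a 5000-byte control stack, aligning to `GPU_PAGE_SIZE` gives a 4096-byte UC region followed by an 8192-byte NC region. Aligning to a 64K CPU page instead makes the UC region 65536 bytes, that is 16 GPU pages, so GPU pages that the hardware treats as control stack end up with the wrong memory type.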
2026-03-30  drm/amdkfd: Align expected_queue_size to PAGE_SIZE  Donet Tom
The AQL queue size can be 4K, but the minimum buffer object (BO) allocation size is PAGE_SIZE. On systems with a page size larger than 4K, the expected queue size does not match the allocated BO size, causing queue creation to fail. Align the expected queue size to PAGE_SIZE so that it matches the allocated BO size and allows queue creation to succeed. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amd/pm: Use str_enabled_disabled in amdgpu_pm sysfs  Asad Kamal
Coccinelle flags hand-rolled "enabled"/"disabled" strings; use the shared str_enabled_disabled() helper from string_choices.h for npm_status and thermal throttling logging sysfs text. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603251434.zIN2QYWn-lkp@intel.com/ Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
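For readers unfamiliar with the helper, here is a userspace stand-in that mirrors what str_enabled_disabled() does. The `show_npm_status()` formatter is hypothetical and only illustrates how a sysfs-style show function might use it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel helper of the same name. */
const char *str_enabled_disabled(bool v)
{
	return v ? "enabled" : "disabled";
}

/* Hypothetical sysfs-style formatter; returns the number of bytes written. */
int show_npm_status(char *buf, size_t len, bool npm_on)
{
	return snprintf(buf, len, "%s\n", str_enabled_disabled(npm_on));
}
```

Using the shared helper keeps the two strings in one place instead of repeating the ternary at every call site.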
2026-03-30  drm/amdgpu: fix the idr allocation flags  Prike Liang
Fix the IDR allocation flags by using atomic GFP flags in non‑sleepable contexts to avoid the __might_sleep() complaint. [ 268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0 [ 268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323 [ 268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe [ 268.295705] preempt_count: 1, expected: 0 [ 268.295886] RCU nest depth: 0, expected: 0 [ 268.296072] 2 locks held by modprobe/1744: [ 268.296077] #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210 [ 268.296100] #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu] [ 268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G U OE 6.19.0-custom #16 PREEMPT(voluntary) [ 268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 268.296501] Call Trace: Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu: flush coredump work before HW teardown  Jesse Zhang
In amdgpu_device_fini_hw(), deferred coredump formatting work may still be pending when hardware and IP components are being torn down. Since the work may access device registers and memory that will be freed or powered off, it must be completed before proceeding. Add a flush_work() call for adev->coredump_work, guarded by CONFIG_DEV_COREDUMP, to ensure any pending coredump work finishes before the device enters the early IP fini stage. This avoids potential use-after-free or accessing hardware resources that are no longer available. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu: guard atom_context in devcoredump VBIOS dump  Jesse Zhang
During GPU reset coredump generation, amdgpu_devcoredump_fw_info() unconditionally dereferences adev->mode_info.atom_context to print VBIOS fields. On reset/teardown paths this pointer can be NULL, causing a kernel page fault from the deferred coredump workqueue. Fix by checking ctx before printing VBIOS fields: if ctx is valid, print full VBIOS information as before; This prevents NULL-dereference crashes while preserving coredump output. Observed page fault log: [ 667.933329] RIP: 0010:amdgpu_devcoredump_format+0x780/0xc00 [amdgpu] [ 667.941517] amdgpu 0002:01:00.0: Dumping IP State [ 667.949660] Code: 8d 57 74 48 c7 c6 01 65 9f c2 48 8d 7d 98 e8 97 96 7a ff 49 8d 97 b4 00 00 00 48 c7 c6 18 65 9f c2 48 8d 7d 98 e8 80 96 7a ff <41> 8b 97 f4 00 00 00 48 c7 c6 2f 65 9f c2 48 8d 7d 98 e8 69 96 7a [ 667.949666] RSP: 0018:ffffc9002302bd50 EFLAGS: 00010246 [ 667.949673] RAX: 0000000000000000 RBX: ffff888110600000 RCX: 0000000000000000 [ 667.949676] RDX: 000000000000a9b5 RSI: 0000000000000405 RDI: 000000000000a999 [ 667.949680] RBP: ffffc9002302be00 R08: ffffffffc09c3084 R09: ffffffffc09c3085 [ 667.949684] R10: 0000000000000000 R11: 0000000000000004 R12: 00000000000048e0 [ 667.993908] amdgpu 0002:01:00.0: Dumping IP State Completed [ 667.994229] R13: 0000000000000025 R14: 000000000000000c R15: 0000000000000000 [ 667.994233] FS: 0000000000000000(0000) GS:ffff88c44c2c9000(0000) knlGS:0000000000000000 [ 668.000076] amdgpu 0002:01:00.0: [drm] AMDGPU device coredump file has been created [ 668.008025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 668.008030] CR2: 00000000000000f4 CR3: 000000011195f001 CR4: 0000000000770ef0 [ 668.008035] PKRU: 55555554 [ 668.008040] Call Trace: [ 668.008045] <TASK> [ 668.016010] amdgpu 0002:01:00.0: [drm] Check your /sys/class/drm/card16/device/devcoredump/data [ 668.023967] ? srso_alias_return_thunk+0x5/0xfbef5 [ 668.023988] ? __pfx___drm_printfn_coredump+0x10/0x10 [drm] [ 668.031950] amdgpu 0003:01:00.0: Dumping IP State [ 668.038159] ? 
__pfx___drm_puts_coredump+0x10/0x10 [drm] [ 668.083017] amdgpu 0003:01:00.0: Dumping IP State Completed [ 668.083824] amdgpu_devcoredump_deferred_work+0x26/0xc0 [amdgpu] [ 668.086163] amdgpu 0003:01:00.0: [drm] AMDGPU device coredump file has been created [ 668.095863] process_scheduled_works+0xa6/0x420 [ 668.095880] worker_thread+0x12a/0x270 [ 668.101223] amdgpu 0003:01:00.0: [drm] Check your /sys/class/drm/card24/device/devcoredump/data [ 668.107441] kthread+0x10d/0x230 [ 668.107451] ? __pfx_worker_thread+0x10/0x10 [ 668.107458] ? __pfx_kthread+0x10/0x10 [ 668.112709] amdgpu 0000:01:00.0: ring vcn_unified_1 timeout, signaled seq=9, emitted seq=10 [ 668.118630] ret_from_fork+0x17c/0x1f0 [ 668.118640] ? __pfx_kthread+0x10/0x10 [ 668.118647] ret_from_fork_asm+0x1a/0x30 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu/userq: amdgpu_userq_vm_validate does not need userq mutex  Sunil Khatri
The amdgpu_userq_vm_validate() function does not need userq_mutex; the exec lock is sufficient for locking all BOs and updating the eviction fence. Since userq_mutex is only needed for amdgpu_userq_restore_all(), move the locking into that function itself. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30  drm/amdgpu: validate doorbell_offset in user queue creation  Junrui Luo
amdgpu_userq_get_doorbell_index() passes the user-provided doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds checking. An arbitrarily large doorbell_offset can cause the calculated doorbell index to fall outside the allocated doorbell BO, potentially corrupting kernel doorbell space. Validate that doorbell_offset falls within the doorbell BO before computing the BAR index, using u64 arithmetic to prevent overflow. Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
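The role of the u64 arithmetic can be shown with a minimal sketch. It assumes, purely for illustration, that the offset counts doorbell slots of `slot_bytes` each and that the doorbell BO size is known in bytes; the function name and units are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the slot selected by doorbell_offset lies entirely
 * inside a doorbell BO of bo_size bytes. The multiplication is done in
 * 64 bits so that a huge user-supplied offset cannot wrap around.
 */
bool doorbell_offset_valid(uint32_t doorbell_offset, uint32_t slot_bytes,
			   uint64_t bo_size)
{
	uint64_t start = (uint64_t)doorbell_offset * slot_bytes;

	return start + slot_bytes <= bo_size;
}
```

As an example of the wraparound being avoided: with 8-byte slots, an offset of 0xFFFFFFFF multiplied in 32 bits yields 0xFFFFFFF8, and adding the slot size wraps to 0, which would pass a naive bounds check. In 64 bits the same offset is correctly rejected.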
2026-03-30  drm/xe: Avoid memory allocations in xe_device_declare_wedged()  Matthew Brost
xe_device_declare_wedged() runs in the DMA-fence signaling path, where GFP_KERNEL memory allocations are not allowed. However, registering xe_device_wedged_fini via drmm_add_action_or_reset() triggers a GFP_KERNEL allocation. Fix this by deferring the registration of xe_device_wedged_fini until late in the driver load sequence. Additionally, drop the wedged PM reference only if the device is actually wedged in xe_device_wedged_fini. Fixes: 452bca0edbd0 ("drm/xe: Don't suspend device upon wedge") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260326210116.202585-2-matthew.brost@intel.com (cherry picked from commit b08ceb443866808b881b12d4183008d214d816c1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-30  drm/xe: Disable garbage collector work item on SVM close  Matthew Brost
When an SVM is closed, the garbage collector work item must be stopped synchronously and any future queuing must be prevented. Replace flush_work() with disable_work_sync() to ensure both conditions are met. Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260227015225.3081787-1-matthew.brost@intel.com (cherry picked from commit 2247feb9badca5a4774df9a437bfc44fba4f22de) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-30  drm/xe/pxp: Don't allow PXP on older PTL GSC FWs  Daniele Ceraolo Spurio
On PTL, older GSC FWs have a bug that can cause them to crash during PXP invalidation events, which leads to a complete loss of power management on the media GT. Therefore, we can't use PXP on FWs that have this bug, which was fixed in PTL GSC build 1396. Fixes: b1dcec9bd8a1 ("drm/xe/ptl: Enable PXP for PTL") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-10-daniele.ceraolospurio@intel.com (cherry picked from commit 6eb04caaa972934c9b6cea0e0c29e466bf9a346f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-30  drm/xe/pxp: Clear restart flag in pxp_start after jumping back  Daniele Ceraolo Spurio
If we don't clear the flag we'll keep jumping back at the beginning of the function once we reach the end. Fixes: ccd3c6820a90 ("drm/xe/pxp: Decouple queue addition from PXP start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-9-daniele.ceraolospurio@intel.com (cherry picked from commit 0850ec7bb2459602351639dccf7a68a03c9d1ee0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
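The control flow problem can be reduced to a toy function. This is a hypothetical sketch, not the structure of the real pxp_start code: a single "raced" event requests one restart, and the function returns how many passes it made.

```c
#include <stdbool.h>

/* Returns the number of passes through the body. */
int start_with_restart(bool raced_once)
{
	bool restart = false;
	bool raced = raced_once;
	int passes = 0;

again:
	passes++;
	if (restart)
		restart = false;	/* the fix: consume the flag after jumping back */

	if (raced) {
		raced = false;		/* the race happens only once in this sketch */
		restart = true;
	}

	if (restart)
		goto again;

	return passes;
}
```

With the flag cleared at the top, one race costs exactly one extra pass. If the clearing line were removed, `restart` would stay true after the first jump and the final check would send execution back to `again` on every pass.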
2026-03-30  drm/xe/pxp: Remove incorrect handling of impossible state during suspend  Daniele Ceraolo Spurio
The default case of the PXP suspend switch is incorrectly exiting without releasing the lock. However, this case is impossible to hit because we're switching on an enum and all the valid enum values have their own cases. Therefore, we can just get rid of the default case and rely on the compiler to warn us if a new enum value is added and we forget to add it to the switch. Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-8-daniele.ceraolospurio@intel.com (cherry picked from commit f1b5a77fc9b6a90cd9a5e3db9d4c73ae1edfcfac) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
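The pattern of switching over every enum value without a default case might look like the sketch below. The enum values and the `lock_depth` counter are hypothetical; the counter stands in for a mutex so that the lock balance can be checked.

```c
/* Hypothetical session states; the real PXP status enum differs. */
enum session_state {
	SESSION_IDLE,
	SESSION_ACTIVE,
	SESSION_SUSPENDED,
};

int lock_depth;	/* stand-in for a mutex: incremented on lock, decremented on unlock */

/* Returns 1 when the session needs to be torn down before suspend. */
int suspend_session(enum session_state state)
{
	int needs_teardown = 0;

	lock_depth++;			/* lock */

	switch (state) {
	case SESSION_IDLE:
		break;
	case SESSION_ACTIVE:
		needs_teardown = 1;
		break;
	case SESSION_SUSPENDED:
		break;
	/* No default: every path falls through to the unlock below. */
	}

	lock_depth--;			/* unlock */
	return needs_teardown;
}
```

Since there is no default case, GCC and Clang with `-Wswitch` (part of `-Wall`) warn when a new enumerator is added without a matching case, and no branch can return early while still holding the lock.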
2026-03-30  drm/xe/pxp: Clean up termination status on failure  Daniele Ceraolo Spurio
If the PXP HW termination fails during PXP start, the normal completion code won't be called, so the termination will remain incomplete. To avoid unnecessary waits, mark the termination as completed from the error path. Note that we already do this if the termination fails when handling a termination irq from the HW. Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-7-daniele.ceraolospurio@intel.com (cherry picked from commit 5d9e708d2a69ab1f64a17aec810cd7c70c5b9fab) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-30  drm/xe/madvise: Accept canonical GPU addresses in xe_vm_madvise_ioctl  Arvind Yadav
Userspace passes canonical (sign-extended) GPU addresses where bits 63:48 mirror bit 47. The internal GPUVM uses non-canonical form (upper bits zeroed), so passing raw canonical addresses into GPUVM lookups causes mismatches for addresses above 128TiB. Strip the sign extension with xe_device_uncanonicalize_addr() at the top of xe_vm_madvise_ioctl(). Non-canonical addresses are unaffected. Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-13-arvind.yadav@intel.com (cherry picked from commit 05c8b1cdc54036465ea457a0501a8c2f9409fce7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
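The address conversion can be sketched as plain bit manipulation. The function names below are illustrative and take the number of virtual address bits as a parameter (48 in the case described above); this is not the xe helper itself.

```c
#include <stdint.h>

/* Clear the sign-extension bits above va_bits; va_bits must be 1..63. */
uint64_t uncanonicalize_addr(uint64_t addr, unsigned int va_bits)
{
	uint64_t mask = (1ULL << va_bits) - 1;

	return addr & mask;
}

/* Copy bit (va_bits - 1) into all higher bits; va_bits must be 1..63. */
uint64_t canonicalize_addr(uint64_t addr, unsigned int va_bits)
{
	uint64_t mask = (1ULL << va_bits) - 1;
	uint64_t sign = 1ULL << (va_bits - 1);

	addr &= mask;
	return (addr & sign) ? (addr | ~mask) : addr;
}
```

With 48 address bits, the canonical address 0xFFFF800000000000 becomes 0x0000800000000000, which is exactly the 128 TiB boundary (2^47), while an address such as 0x1000 is unchanged because its upper bits are already zero.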
2026-03-30  drm/xe/xe_pagefault: Disallow writes to read-only VMAs  Jonathan Cavitt
The page fault handler should reject write/atomic access to read only VMAs. Add code to handle this in xe_pagefault_service after the VMA lookup. v2: - Apply max line length (Matthew) Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work") Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260324152935.72444-7-jonathan.cavitt@intel.com (cherry picked from commit 714ee6754ac5fa3dc078856a196a6b124cd797a0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
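The access check amounts to a small predicate, sketched here with hypothetical names. The error code is an assumption made for the example; the commit message does not say which value the handler returns.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical access types reported with a GPU page fault. */
enum fault_access {
	FAULT_ACCESS_READ,
	FAULT_ACCESS_WRITE,
	FAULT_ACCESS_ATOMIC,
};

/*
 * Return 0 when the access may be serviced, or a negative errno when a
 * write or atomic access targets a read-only mapping. -EPERM is an
 * assumption chosen for this sketch.
 */
int check_fault_access(bool vma_read_only, enum fault_access access)
{
	if (vma_read_only && access != FAULT_ACCESS_READ)
		return -EPERM;

	return 0;
}
```

Atomic accesses are treated like writes because they modify memory, so only plain reads are allowed through on a read-only VMA.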
2026-03-30  drm/ast: dp501: Fix initialization of SCU2C  Thomas Zimmermann
Ast's DP501 initialization reads the register SCU2C at offset 0x1202c and tries to set it to source data from VGA, but writes the update to offset 0x0, with unknown results. Write the result to SCU2C instead. The bug only happens in ast_init_analog(). There's similar code in ast_init_dvo(), which works correctly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 83c6620bae3f ("drm/ast: initial DP501 support (v0.2)") Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.16+ Link: https://patch.msgid.link/20260327133532.79696-2-tzimmermann@suse.de
2026-03-30  Merge drm/drm-fixes into drm-misc-next-fixes  Maxime Ripard
Boris needs 7.0-rc6 for a shmem helper fix. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-03-30  drm/i915/dsi: Don't do DSC horizontal timing adjustments in command mode  Ville Syrjälä
Stop adjusting the horizontal timing values based on the compression ratio in command mode. Bspec seems to be telling us to do this only in video mode, and this is also how the Windows driver does things. This should also fix a div-by-zero on some machines because the adjusted htotal ends up being so small that we end up with line_time_us==0 when trying to determine the vtotal value in command mode. Note that this doesn't actually make the display on the Huawei Matebook E work, but at least the kernel no longer explodes when the driver loads. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings") Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 0b475e91ecc2313207196c6d7fd5c53e1a878525) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-03-30  Merge tag 'drm-xe-next-2026-03-26-1' of ↵  Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Hi Dave and Sima, Here goes our late, final drm-xe-next PR towards 7.1. We just got purgeable BO uAPI in today, hence the late pull. In the big things we have: - Add support for purgeable buffer objects Thanks, Matt UAPI Changes: - Add support for purgeable buffer objects (Arvind, Himal) Driver Changes: - Remove useless comment (Maarten) - Issue GGTT invalidation under lock in ggtt_node_remove (Brost, Fixes) - Fix mismatched include guards in header files (Shuicheng) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/acX4fWxPkZrrfwnT@gsse-cloud1.jf.intel.com
2026-03-28  Merge tag 'mediatek-drm-fixes-20260323' of ↵  Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20260323 1. dsi: Store driver data before invoking mipi_dsi_host_register Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org
2026-03-27  rust: drm: gem: shmem: Add DRM shmem helper abstraction  Asahi Lina
The DRM shmem helper includes common code useful for drivers which allocate GEM objects as anonymous shmem. Add a Rust abstraction for this. Drivers can choose the raw GEM implementation or the shmem layer, depending on their needs. Signed-off-by: Asahi Lina <lina@asahilina.net> Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Janne Grunau <j@jananu.net> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com> Link: https://patch.msgid.link/20260316211646.650074-6-lyude@redhat.com [ * DRM_GEM_SHMEM_HELPER is a tristate; when a module driver selects it, it becomes =m. The Rust kernel crate and its C helpers are always built into vmlinux and can't reference symbols from a module, causing link errors. Thus, add RUST_DRM_GEM_SHMEM_HELPER bool Kconfig that selects DRM_GEM_SHMEM_HELPER, forcing it built-in when Rust drivers need it; use cfg(CONFIG_RUST_DRM_GEM_SHMEM_HELPER) for the shmem module. * Add cfg_attr(not(CONFIG_RUST_DRM_GEM_SHMEM_HELPER), expect(unused)) on pub(crate) use impl_aref_for_gem_obj and BaseObjectPrivate, so that unused warnings are suppressed when shmem is not enabled. * Enable const_refs_to_static (stabilized in 1.83) to prevent build errors with older compilers. * Use &raw const for bindings::drm_gem_shmem_vm_ops and add #[allow(unused_unsafe, reason = "Safe since Rust 1.82.0")]. * Fix incorrect C Header path and minor spelling and formatting issues. * Drop shmem::Object::sg_table() as the current implementation is unsound. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-27  drm/i915/uncore: Do GT FIFO checks in early sanitize and forcewake get  Ville Syrjälä
We're mixing up the GT FIFO debug checks (overflows and such) with RMbus unclaimed register checks. The two are quite different things as RMbus is only relevant for display registers, and the GT FIFO only relevant for GT registers. Split the GT FIFO debugs out from the unclaimed register logic and just do the checks during forcewake_get() and early init. That is still sufficient to detect if any errors have happened. Any errors would anyway be caused by overflowing the FIFO rather than accessing specific registers, so trying to figure out exactly when the error happened isn't particularly useful. To fix such issues we'd rather have to do something to slow down the rate at which registers are accessed (eg. increase GT_FIFO_NUM_RESERVED_ENTRIES or something). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260323101609.8391-3-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/selftests: Nuke live_forcewake_domains selftest  Ville Syrjälä
The live_forcewake_domains selftest doesn't really test anything particularly sensible. It only runs on platforms that have RMbus unclaimer error detection, but that only catches display registers which the test doesn't even access. I suppose if we really wanted to we might try to make the test exercise the GT FIFO instead by writing GT registers as fast as possible, and then checking GTFIFODBG to see if the FIFO has overflowed. But dunno if there's much point in that. I think a GT FIFO overflow might even be fatal to the machine. So in its current form the test doesn't really make sense, and it's in the way of moving all the RMbus noclaim stuff to the display driver side. So let's just get rid of it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260323101609.8391-2-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/dsi: Place clock into LP during LPM if requested  Ville Syrjälä
TGL/ADL DSI can be configured to place the clock lane into LP state during LPM, if otherwise configured for continuous HS clock. Hook that up. VBT tells us whether this should be done. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-6-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/dsi: Fill BLLPs with blanking packets if requested  Ville Syrjälä
TGL/ADL DSI can be configured to fill all BLLPs with blanking packets. Currently we enable that always, but the VBT actually tells us whether this is desired or not. Hook that up. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-5-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/dsi: Make 'clock_stop' boolean  Ville Syrjälä
The DSI 'clock_stop' parameter is a boolean, so use a real 'bool' for it. And pimp the debug print while at it. Note that we also remove the incorrect negation of the value in the debug print. That has been there since the code was introduced in commit 2ab8b458c6a1 ("drm/i915: Add support for Generic MIPI panel driver"). An earlier version of the patch https://lore.kernel.org/intel-gfx/1397454507-10273-5-git-send-email-shobhit.kumar@intel.com/ got it right, but looks like it got fumbled while dealing with other review comments. v2: Highlight the removal of the '!' (Jani) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-4-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/dsi: s/eotp_pkt/eot_pkt/  Ville Syrjälä
eotp == "End of Transmission Packet". Drop the redundant extra 'p' from 'eotp_pkt', and make the thing a boolean while at it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-3-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27  drm/i915/dsi: Don't do DSC horizontal timing adjustments in command mode  Ville Syrjälä
Stop adjusting the horizontal timing values based on the compression ratio in command mode. Bspec seems to be telling us to do this only in video mode, and this is also how the Windows driver does things. This should also fix a div-by-zero on some machines because the adjusted htotal ends up being so small that we end up with line_time_us==0 when trying to determine the vtotal value in command mode. Note that this doesn't actually make the display on the Huawei Matebook E work, but at least the kernel no longer explodes when the driver loads. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings") Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2026-03-27Merge tag 'drm-xe-fixes-2026-03-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Fix UAF in SRIOV migration restore (Winiarski)
- Updates to HW W/a (Roper)
- VMBind remap fix (Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/acUgq2q2DrCUzFql@intel.com
2026-03-27Merge tag 'drm-misc-fixes-2026-03-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A page mapping fix for shmem fault handler, a power-off fix for ivpu, a GFP_* flag fix for syncobj, and a MAINTAINERS update. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260326-lush-cuddly-limpet-ab2aa9@houat
2026-03-27Merge tag 'drm-intel-fixes-2026-03-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- DP tunnel error handling fix
- Spurious GMBUS timeout fix
- Unlink NV12 planes earlier
- Order OP vs. timeout correctly in __wait_for()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/acTdjAoOGkzl3dcc@jlahtine-mobl
2026-03-27Merge tag 'mediatek-drm-next-20260325' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next

Mediatek DRM Next - 20260325

1. mtk_dsi: enable hs clock during pre-enable
2. Remove all conflicting aperture devices during probe
3. Add support for mt8167 display blocks

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260325160721.4891-1-chunkuang.hu@kernel.org
2026-03-26drm/xe/madvise: Accept canonical GPU addresses in xe_vm_madvise_ioctlArvind Yadav
Userspace passes canonical (sign-extended) GPU addresses where bits 63:48 mirror bit 47. The internal GPUVM uses non-canonical form (upper bits zeroed), so passing raw canonical addresses into GPUVM lookups causes mismatches for addresses above 128TiB. Strip the sign extension with xe_device_uncanonicalize_addr() at the top of xe_vm_madvise_ioctl(). Non-canonical addresses are unaffected. Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-13-arvind.yadav@intel.com
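The address conversion described above can be sketched as follows. This is an illustration only: the helper names are made up, a 48-bit virtual address width is assumed from the "bits 63:48 mirror bit 47" wording, and the real xe_device_uncanonicalize_addr() takes its width from the device rather than a constant.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed VA width for this sketch; the driver derives it per device. */
#define SKETCH_VA_BITS 48

static_assert(SKETCH_VA_BITS > 0 && SKETCH_VA_BITS < 64,
	      "mask arithmetic below needs a width in 1..63");

#define SKETCH_VA_MASK ((UINT64_C(1) << SKETCH_VA_BITS) - 1)

/* Drop the sign extension: keep only the low SKETCH_VA_BITS bits. */
static inline uint64_t sketch_uncanonicalize(uint64_t addr)
{
	return addr & SKETCH_VA_MASK;
}

/* Inverse direction: copy bit 47 into bits 63:48. */
static inline uint64_t sketch_canonicalize(uint64_t addr)
{
	addr &= SKETCH_VA_MASK;
	if (addr & (UINT64_C(1) << (SKETCH_VA_BITS - 1)))
		addr |= ~SKETCH_VA_MASK;
	return addr;
}
```

An address at the 128TiB boundary (bit 47 set) arrives from userspace as 0xFFFF800000000000 and is looked up as 0x0000800000000000; addresses below the boundary pass through unchanged.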
2026-03-26drm/xe/madvise: Enable purgeable buffer object IOCTL supportArvind Yadav
Hook the madvise_purgeable() handler into the madvise IOCTL now that all supporting infrastructure is complete:

- Core purge implementation (patch 3)
- BO state tracking and helpers (patches 1-2)
- Per-VMA purgeable state tracking (patch 6)
- Shrinker integration for memory reclamation (patch 10)

This final patch enables userspace to use the DRM_XE_VMA_ATTR_PURGEABLE_STATE madvise type to mark buffers as WILLNEED/DONTNEED and receive the retained status indicating whether buffers were purged.

The feature was kept disabled in earlier patches to maintain bisectability and ensure all components are in place before exposing to userspace.

Userspace can detect kernel support for purgeable BOs by checking the DRM_XE_QUERY_CONFIG_FLAG_HAS_PURGING_SUPPORT flag in the query_config response.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-12-arvind.yadav@intel.com
2026-03-26drm/xe/bo: Add purgeable shrinker state helpersArvind Yadav
Encapsulate TTM purgeable flag updates and shrinker page accounting into helper functions to prevent desynchronization between the TTM tt->purgeable flag and the shrinker's page bucket counters. Without these helpers, direct manipulation of xe_ttm_tt->purgeable risks forgetting to update the corresponding shrinker counters, leading to incorrect memory pressure calculations. Update purgeable BO state to PURGED after successful shrinker purge for DONTNEED BOs. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-11-arvind.yadav@intel.com
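The idea of updating the flag and the counters in one place can be sketched like this. The struct and function names are hypothetical stand-ins, not the xe helpers, and the sketch assumes the pages of a non-purgeable tt are already accounted in a "shrinkable" bucket.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real xe structures have more fields. */
struct sketch_shrinker {
	long shrinkable_pages;	/* pages accounted as shrinkable */
	long purgeable_pages;	/* pages accounted as purgeable */
};

struct sketch_tt {
	long num_pages;
	bool purgeable;
};

/*
 * Flip the purgeable flag and move the tt's pages between the two
 * buckets in the same step, so the flag and the counters cannot drift.
 */
static inline void sketch_tt_set_purgeable(struct sketch_shrinker *s,
					   struct sketch_tt *tt,
					   bool purgeable)
{
	if (tt->purgeable == purgeable)
		return;	/* no state change, counters stay as they are */

	if (purgeable) {
		assert(s->shrinkable_pages >= tt->num_pages);
		s->shrinkable_pages -= tt->num_pages;
		s->purgeable_pages += tt->num_pages;
	} else {
		assert(s->purgeable_pages >= tt->num_pages);
		s->purgeable_pages -= tt->num_pages;
		s->shrinkable_pages += tt->num_pages;
	}

	tt->purgeable = purgeable;
}
```

Because callers never touch the flag directly, a repeated call with the same value is a no-op and the bucket totals always sum to the same number of pages.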
2026-03-26drm/xe/dma_buf: Block export of DONTNEED/purged BOsArvind Yadav
Don't allow exporting BOs marked DONTNEED or PURGED as dma-bufs. DONTNEED BOs can have their contents discarded at any time, making the exported dma-buf unusable for external devices. PURGED BOs have no backing store and are permanently invalid. Return -EBUSY for DONTNEED BOs (temporary purgeable state) and -EINVAL for purged BOs (permanent, no backing store). The export path now checks the BO's purgeable state before creating the dma-buf, preventing external devices from accessing memory that may be purged at any time. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-10-arvind.yadav@intel.com
2026-03-26drm/xe/bo: Block mmap of DONTNEED/purged BOsArvind Yadav
Don't allow new CPU mmaps to BOs marked DONTNEED or PURGED. DONTNEED BOs can have their contents discarded at any time, making CPU access undefined behavior. PURGED BOs have no backing store and are permanently invalid. Return -EBUSY for DONTNEED BOs (temporary purgeable state) and -EINVAL for purged BOs (permanent, no backing store). The mmap offset ioctl now checks the BO's purgeable state before allowing userspace to establish a new CPU mapping. This prevents the race where userspace gets a valid offset but the BO is purged before actual faulting begins. Existing mmaps (established before DONTNEED) may still work until pages are purged, at which point CPU faults fail with SIGBUS. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-9-arvind.yadav@intel.com
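The mmap and dma-buf export patches use the same error-code convention, which can be sketched as a single gate. The enum and function names below are invented for illustration; only the numeric state values and the -EBUSY/-EINVAL mapping come from the commit messages.

```c
#include <assert.h>
#include <errno.h>

/* Values as listed for xe_madv_purgeable_state; identifiers are made up. */
enum sketch_purgeable_state {
	SKETCH_WILLNEED = 0,
	SKETCH_DONTNEED = 1,
	SKETCH_PURGED = 2,
};

/*
 * Gate for operations that hand the BO to a new user (a new CPU mmap
 * offset or a dma-buf export). Returns 0 or a negative errno.
 */
static inline int sketch_check_bo_usable(enum sketch_purgeable_state state)
{
	switch (state) {
	case SKETCH_WILLNEED:
		return 0;
	case SKETCH_DONTNEED:
		return -EBUSY;	/* temporary purgeable state */
	case SKETCH_PURGED:
		return -EINVAL;	/* permanent, no backing store */
	}

	return -EINVAL;	/* unknown state: refuse */
}
```

Distinguishing the two errors lets a caller tell a condition it might retry after changing the hint (-EBUSY) from one that cannot recover (-EINVAL).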
2026-03-26drm/xe/madvise: Block imported and exported dma-bufsArvind Yadav
Prevent marking imported or exported dma-bufs as purgeable. External devices may be accessing these buffers without our knowledge, making purging unsafe. Check drm_gem_is_imported() for buffers created by other drivers and obj->dma_buf for buffers exported to other drivers. Silently skip these BOs during madvise processing. This follows drm_gem_shmem's purgeable implementation and prevents data corruption from purging actively-used shared buffers. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-8-arvind.yadav@intel.com
2026-03-26drm/xe/madvise: Implement per-VMA purgeable state trackingArvind Yadav
Track purgeable state per-VMA instead of using a coarse shared BO check. This prevents purging shared BOs until all VMAs across all VMs are marked DONTNEED. Add xe_bo_all_vmas_dontneed() to check all VMAs before marking a BO purgeable. Add xe_bo_recheck_purgeable_on_vma_unbind() to handle state transitions when VMAs are destroyed - if all remaining VMAs are DONTNEED the BO can become purgeable, or if no VMAs remain it transitions to WILLNEED. The per-VMA purgeable_state field stores the madvise hint for each mapping. Shared BOs can only be purged when all VMAs unanimously indicate DONTNEED. This prevents the bug where unmapping the last VMA would incorrectly flip a DONTNEED BO back to WILLNEED. The enum-based state check preserves BO state when no VMAs remain, only updating when VMAs provide explicit hints. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-7-arvind.yadav@intel.com
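The unanimity rule can be sketched as a three-way summary over the per-VMA hints. All names here are hypothetical. The message describes the no-VMA case in two ways; this sketch assumes the later statement, that the BO state is preserved when no VMAs remain, and it also assumes a purged BO never leaves the PURGED state.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_purgeable_state {
	SKETCH_WILLNEED = 0,
	SKETCH_DONTNEED = 1,
	SKETCH_PURGED = 2,
};

/* Result of looking at every VMA that maps the BO. */
enum sketch_vma_summary {
	SKETCH_NO_VMAS,
	SKETCH_SOME_WILLNEED,
	SKETCH_ALL_DONTNEED,
};

static inline enum sketch_vma_summary
sketch_summarize_vmas(const enum sketch_purgeable_state *vma_states,
		      size_t count)
{
	size_t i;

	if (count == 0)
		return SKETCH_NO_VMAS;

	for (i = 0; i < count; i++)
		if (vma_states[i] != SKETCH_DONTNEED)
			return SKETCH_SOME_WILLNEED;

	return SKETCH_ALL_DONTNEED;
}

/* Recompute the BO state, e.g. after a VMA was unbound. */
static inline enum sketch_purgeable_state
sketch_recheck_bo(enum sketch_purgeable_state bo_state,
		  const enum sketch_purgeable_state *vma_states, size_t count)
{
	if (bo_state == SKETCH_PURGED)
		return SKETCH_PURGED;

	switch (sketch_summarize_vmas(vma_states, count)) {
	case SKETCH_ALL_DONTNEED:
		return SKETCH_DONTNEED;
	case SKETCH_SOME_WILLNEED:
		return SKETCH_WILLNEED;
	case SKETCH_NO_VMAS:
		break;	/* no explicit hints left: keep the current state */
	}

	return bo_state;
}
```

Returning a distinct "no VMAs" result, rather than a boolean, is what keeps an empty VMA list from being mistaken for either a unanimous DONTNEED or a WILLNEED vote.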
2026-03-26drm/xe/vm: Prevent binding of purged buffer objectsArvind Yadav
Add purge checking to vma_lock_and_validate() to block new mapping operations on purged BOs while allowing cleanup operations to proceed. Purged BOs have their backing pages freed by the kernel. New mapping operations (MAP, PREFETCH, REMAP) must be rejected with -EINVAL to prevent GPU access to invalid memory. Cleanup operations (UNMAP) must be allowed so applications can release resources after detecting purge via the retained field. REMAP operations require mixed handling - reject new prev/next VMAs if the BO is purged, but allow the unmap portion to proceed for cleanup. The check_purged flag in struct xe_vma_lock_and_validate_flags distinguishes between these cases: true for new mappings (must reject), false for cleanup (allow). Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-6-arvind.yadav@intel.com
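A rough sketch of how a per-call flag can separate the two cases. Only the check_purged member and the -EINVAL result are taken from the message; the operation enum, the helper names, and the reduced struct are assumptions, and the real struct xe_vma_lock_and_validate_flags may carry other members.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Reduced stand-in for struct xe_vma_lock_and_validate_flags. */
struct sketch_validate_flags {
	bool check_purged;
};

/* Hypothetical operation kinds; REMAP is split into its two halves. */
enum sketch_op {
	SKETCH_OP_MAP,
	SKETCH_OP_PREFETCH,
	SKETCH_OP_REMAP_NEW,	/* new prev/next VMAs created by a remap */
	SKETCH_OP_REMAP_UNMAP,	/* the unmap portion of a remap */
	SKETCH_OP_UNMAP,
};

static inline struct sketch_validate_flags sketch_flags_for_op(enum sketch_op op)
{
	struct sketch_validate_flags flags = { .check_purged = false };

	switch (op) {
	case SKETCH_OP_MAP:
	case SKETCH_OP_PREFETCH:
	case SKETCH_OP_REMAP_NEW:
		flags.check_purged = true;	/* new mapping: must reject */
		break;
	case SKETCH_OP_REMAP_UNMAP:
	case SKETCH_OP_UNMAP:
		break;				/* cleanup: always allowed */
	}

	return flags;
}

/* Returns 0 if the operation may proceed, -EINVAL otherwise. */
static inline int sketch_validate(bool bo_purged,
				  struct sketch_validate_flags flags)
{
	if (flags.check_purged && bo_purged)
		return -EINVAL;

	return 0;
}
```

Keeping the policy in the flag rather than in the validator means the same validation routine serves both halves of a REMAP without knowing which operation it is part of.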
2026-03-26drm/xe/bo: Block CPU faults to purgeable buffer objectsArvind Yadav
Block CPU page faults to buffer objects marked as purgeable (DONTNEED) or already purged. Once a BO is marked DONTNEED, its contents can be discarded by the kernel at any time, making access undefined behavior. Return VM_FAULT_SIGBUS immediately to fail consistently instead of allowing erratic behavior where access sometimes works (if not yet purged) and sometimes fails (if purged).

For DONTNEED BOs:
- Block new CPU faults with SIGBUS to prevent undefined behavior.
- Existing CPU PTEs may still work until TLB flush, but new faults fail immediately.

For PURGED BOs:
- Backing store has been reclaimed, making CPU access invalid.
- Without this check, accessing existing mmap mappings would trigger xe_bo_fault_migrate() on freed backing store, causing kernel hangs or crashes.

The purgeable check is added to both CPU fault paths:
- Fastpath (xe_bo_cpu_fault_fastpath): Returns VM_FAULT_SIGBUS immediately under dma-resv lock, preventing attempts to migrate/validate DONTNEED/purged pages.
- Slowpath (xe_bo_cpu_fault): Returns -EFAULT under drm_exec lock, converted to VM_FAULT_SIGBUS.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-5-arvind.yadav@intel.com
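Both fault paths end in the same result for a DONTNEED or purged BO, which a small model can show. The names and the two-value fault enum are stand-ins invented for this sketch (they are not the kernel's vm_fault_t codes), and the errno conversion models only success and -EFAULT.

```c
#include <assert.h>
#include <errno.h>

enum sketch_purgeable_state {
	SKETCH_WILLNEED = 0,
	SKETCH_DONTNEED = 1,
	SKETCH_PURGED = 2,
};

/* Stand-ins for fault results; the values are arbitrary. */
enum sketch_fault {
	SKETCH_FAULT_OK,
	SKETCH_FAULT_SIGBUS,
};

/* Fast path: decide directly on the BO state. */
static inline enum sketch_fault
sketch_fault_fastpath(enum sketch_purgeable_state state)
{
	return state == SKETCH_WILLNEED ? SKETCH_FAULT_OK : SKETCH_FAULT_SIGBUS;
}

/* Slow path, step 1: report the condition as an errno. */
static inline int sketch_fault_slowpath_errno(enum sketch_purgeable_state state)
{
	return state == SKETCH_WILLNEED ? 0 : -EFAULT;
}

/* Slow path, step 2: convert the errno into a fault result. */
static inline enum sketch_fault sketch_errno_to_fault(int err)
{
	return err == -EFAULT ? SKETCH_FAULT_SIGBUS : SKETCH_FAULT_OK;
}
```

Whichever path handles the fault, userspace sees the same outcome, which is the consistency the message asks for.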
2026-03-26drm/xe/madvise: Implement purgeable buffer object supportArvind Yadav
Add the core implementation for purgeable buffer objects, enabling memory reclamation of user-designated DONTNEED buffers during eviction. This allows userspace applications to provide memory usage hints to the kernel for better memory management under pressure.

This patch implements the purge operation and state machine transitions.

Purgeable States (from xe_madv_purgeable_state):
- WILLNEED (0): BO should be retained, actively used
- DONTNEED (1): BO eligible for purging, not currently needed
- PURGED (2): BO backing store reclaimed, permanently invalid

Design Rationale:
- Async TLB invalidation via trigger_rebind (no blocking xe_vm_invalidate_vma)
- i915 compatibility: retained field, "once purged always purged" semantics
- Shared BO protection prevents multi-process memory corruption
- Scratch PTE reuse avoids new infrastructure, safe for fault mode

Note: The madvise_purgeable() function is implemented but not hooked into the IOCTL handler (madvise_funcs[] entry is NULL) to maintain bisectability. The feature will be enabled in the final patch when all supporting infrastructure (shrinker, per-VMA tracking) is complete.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-4-arvind.yadav@intel.com
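The three states and the "once purged always purged" rule can be modelled in a few lines. This is a sketch with invented names, not the xe implementation; it assumes the retained value is simply "backing store still present" and that only DONTNEED BOs may be purged.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_purgeable_state {
	SKETCH_WILLNEED = 0,
	SKETCH_DONTNEED = 1,
	SKETCH_PURGED = 2,
};

struct sketch_bo {
	enum sketch_purgeable_state state;
};

/*
 * Apply a WILLNEED/DONTNEED hint. Returns the "retained" value:
 * true while the backing store still exists, false once purged.
 */
static inline bool sketch_madvise(struct sketch_bo *bo,
				  enum sketch_purgeable_state hint)
{
	assert(hint == SKETCH_WILLNEED || hint == SKETCH_DONTNEED);

	if (bo->state == SKETCH_PURGED)
		return false;	/* once purged, always purged */

	bo->state = hint;
	return true;
}

/* Eviction-time purge: only a DONTNEED BO may lose its backing store. */
static inline bool sketch_try_purge(struct sketch_bo *bo)
{
	if (bo->state != SKETCH_DONTNEED)
		return false;

	bo->state = SKETCH_PURGED;
	return true;
}
```

A caller that marks a BO DONTNEED and later asks for WILLNEED again learns from the returned value whether the contents survived or must be regenerated.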