<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/vfio/pci, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>vfio/qat: fix f_pos race in qat_vf_resume_write()</title>
<updated>2026-06-10T20:33:05+00:00</updated>
<author>
<name>Giovanni Cabiddu</name>
<email>giovanni.cabiddu@intel.com</email>
</author>
<published>2026-06-08T15:12:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4ec5e932e636896e97e4c6a8205b0ac76d52421a'/>
<id>4ec5e932e636896e97e4c6a8205b0ac76d52421a</id>
<content type='text'>
qat_vf_resume_write() checks filp-&gt;f_pos before taking migf-&gt;lock, but
copies into the migration-state buffer after taking the lock and
re-reading the shared file position.

Two concurrent writers could therefore pass the bounds check with the
old offset, then have the second writer copy after the first advanced
f_pos, writing past the end of the migration-state buffer.

Take migf-&gt;lock before doing the boundary checks.

Fixes: bb208810b1ab ("vfio/qat: Add vfio_pci driver for Intel QAT SR-IOV VF devices")
Reviewed-by: Ahsan Atta &lt;ahsan.atta@intel.com&gt;
Signed-off-by: Giovanni Cabiddu &lt;giovanni.cabiddu@intel.com&gt;
Link: https://lore.kernel.org/r/20260608151317.136613-1-giovanni.cabiddu@intel.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
qat_vf_resume_write() checks filp-&gt;f_pos before taking migf-&gt;lock, but
copies into the migration-state buffer after taking the lock and
re-reading the shared file position.

Two concurrent writers could therefore pass the bounds check with the
old offset, then have the second writer copy after the first advanced
f_pos, writing past the end of the migration-state buffer.

Take migf-&gt;lock before doing the boundary checks.

Fixes: bb208810b1ab ("vfio/qat: Add vfio_pci driver for Intel QAT SR-IOV VF devices")
Reviewed-by: Ahsan Atta &lt;ahsan.atta@intel.com&gt;
Signed-off-by: Giovanni Cabiddu &lt;giovanni.cabiddu@intel.com&gt;
Link: https://lore.kernel.org/r/20260608151317.136613-1-giovanni.cabiddu@intel.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/nvgrace-gpu: Add Blackwell-Next GPU readiness check via CXL DVSEC</title>
<updated>2026-06-05T16:43:32+00:00</updated>
<author>
<name>Ankit Agrawal</name>
<email>ankita@nvidia.com</email>
</author>
<published>2026-06-02T06:30:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=682ecb14e83840e87ea36c6d7c16c5111ce18784'/>
<id>682ecb14e83840e87ea36c6d7c16c5111ce18784</id>
<content type='text'>
Add a CXL DVSEC-based readiness check for Blackwell-Next GPUs alongside
the existing legacy BAR0 polling path. The CXL Device DVSEC offset is
discovered at probe time. Probe, fault and read/write paths then branch
on that to use either the legacy BAR0 polling or the CXL DVSEC polling.

The CXL path polls Memory_Active, requiring MEM_INFO_VALID within 1s and
MEM_ACTIVE within Memory_Active_Timeout (up to 256s) as per CXL spec r4.0
sec 8.1.3.8.2. Given the long worst-case wait, the CXL poll runs outside
memory_lock with only a quick readiness check is done under the lock.

The poll loops sleep with schedule_timeout_killable() and return -EINTR
on a fatal signal. This avoids hung-task panics during the long
uninterruptible wait. Extend this to the legacy based wait as well for
improvement.

In the fault handler the wait runs locklessly before memory_lock. If a
reset races in, the in-lock recheck returns -EAGAIN and the wait is
retried rather than returning a spurious VM_FAULT_SIGBUS.

Add PCI_DVSEC_CXL_MEM_ACTIVE_TIMEOUT to pci_regs.h for the timeout field.

Cc: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Cc: Kevin Tian &lt;kevin.tian@intel.com&gt;
Suggested-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Ankit Agrawal &lt;ankita@nvidia.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20260602063015.3915-1-ankita@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a CXL DVSEC-based readiness check for Blackwell-Next GPUs alongside
the existing legacy BAR0 polling path. The CXL Device DVSEC offset is
discovered at probe time. Probe, fault and read/write paths then branch
on that to use either the legacy BAR0 polling or the CXL DVSEC polling.

The CXL path polls Memory_Active, requiring MEM_INFO_VALID within 1s and
MEM_ACTIVE within Memory_Active_Timeout (up to 256s) as per CXL spec r4.0
sec 8.1.3.8.2. Given the long worst-case wait, the CXL poll runs outside
memory_lock with only a quick readiness check is done under the lock.

The poll loops sleep with schedule_timeout_killable() and return -EINTR
on a fatal signal. This avoids hung-task panics during the long
uninterruptible wait. Extend this to the legacy based wait as well for
improvement.

In the fault handler the wait runs locklessly before memory_lock. If a
reset races in, the in-lock recheck returns -EAGAIN and the wait is
retried rather than returning a spurious VM_FAULT_SIGBUS.

Add PCI_DVSEC_CXL_MEM_ACTIVE_TIMEOUT to pci_regs.h for the timeout field.

Cc: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Cc: Kevin Tian &lt;kevin.tian@intel.com&gt;
Suggested-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Ankit Agrawal &lt;ankita@nvidia.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20260602063015.3915-1-ankita@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Use a private flag to prevent power state change with VFs</title>
<updated>2026-05-22T15:14:16+00:00</updated>
<author>
<name>Raghavendra Rao Ananta</name>
<email>rananta@google.com</email>
</author>
<published>2026-05-14T17:34:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=40ef3edf151e184d021917a5c4c771cc0870844a'/>
<id>40ef3edf151e184d021917a5c4c771cc0870844a</id>
<content type='text'>
The current implementation uses pci_num_vf() while holding the
memory_lock to prevent changing the power state of a PF when
VFs are enabled. This creates a lockdep circular dependency
warning because memory_lock is held during device probing.

[  286.997167] ======================================================
[  287.003363] WARNING: possible circular locking dependency detected
[  287.009562] 7.0.0-dbg-DEV #3 Tainted: G S
[  287.015074] ------------------------------------------------------
[  287.021270] vfio_pci_sriov_/18636 is trying to acquire lock:
[  287.026942] ff45bea2294d4968 (&amp;vdev-&gt;memory_lock){+.+.}-{4:4}, at:
vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.036530]
[  287.036530] but task is already holding lock:
[  287.042383] ff45bea3a96b8230 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4}, at:
vfio_group_fops_unl_ioctl+0x44d/0x7b0
[  287.051879]
[  287.051879] which lock already depends on the new lock.
[  287.051879]
[  287.060070]
[  287.060070] the existing dependency chain (in reverse order) is:
[  287.067568]
[  287.067568] -&gt; #2 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4}:
[  287.073941]        __mutex_lock+0x92/0xb80
[  287.078058]        vfio_assign_device_set+0x66/0x1b0
[  287.083042]        vfio_pci_core_register_device+0xd1/0x2a0
[  287.088638]        vfio_pci_probe+0xd2/0x100
[  287.092933]        local_pci_probe_callback+0x4d/0xa0
[  287.098001]        process_scheduled_works+0x2ca/0x680
[  287.103158]        worker_thread+0x1e8/0x2f0
[  287.107452]        kthread+0x10c/0x140
[  287.111230]        ret_from_fork+0x18e/0x360
[  287.115519]        ret_from_fork_asm+0x1a/0x30
[  287.119983]
[  287.119983] -&gt; #1 ((work_completion)(&amp;arg.work)){+.+.}-{0:0}:
[  287.127219]        __flush_work+0x345/0x490
[  287.131429]        pci_device_probe+0x2e3/0x490
[  287.135979]        really_probe+0x1f9/0x4e0
[  287.140180]        __driver_probe_device+0x77/0x100
[  287.145079]        driver_probe_device+0x1e/0x110
[  287.149803]        __device_attach_driver+0xe3/0x170
[  287.154789]        bus_for_each_drv+0x125/0x150
[  287.159346]        __device_attach+0xca/0x1a0
[  287.163720]        device_initial_probe+0x34/0x50
[  287.168445]        pci_bus_add_device+0x6e/0x90
[  287.172995]        pci_iov_add_virtfn+0x3c9/0x3e0
[  287.177719]        sriov_add_vfs+0x2c/0x60
[  287.181838]        sriov_enable+0x306/0x4a0
[  287.186038]        vfio_pci_core_sriov_configure+0x184/0x220
[  287.191715]        sriov_numvfs_store+0xd9/0x1c0
[  287.196351]        kernfs_fop_write_iter+0x13f/0x1d0
[  287.201338]        vfs_write+0x2be/0x3b0
[  287.205286]        ksys_write+0x73/0x100
[  287.209233]        do_syscall_64+0x14d/0x750
[  287.213529]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.219120]
[  287.219120] -&gt; #0 (&amp;vdev-&gt;memory_lock){+.+.}-{4:4}:
[  287.225491]        __lock_acquire+0x14c6/0x2800
[  287.230048]        lock_acquire+0xd3/0x2f0
[  287.234168]        down_write+0x3a/0xc0
[  287.238019]        vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.243436]        __rpm_callback+0x8c/0x310
[  287.247730]        rpm_resume+0x529/0x6f0
[  287.251765]        __pm_runtime_resume+0x68/0x90
[  287.256402]        vfio_pci_core_enable+0x44/0x310
[  287.261216]        vfio_pci_open_device+0x1c/0x80
[  287.265947]        vfio_df_open+0x10f/0x150
[  287.270148]        vfio_group_fops_unl_ioctl+0x4a4/0x7b0
[  287.275476]        __se_sys_ioctl+0x71/0xc0
[  287.279679]        do_syscall_64+0x14d/0x750
[  287.283975]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.289559]
[  287.289559] other info that might help us debug this:
[  287.289559]
[  287.297582] Chain exists of:
[  287.297582]   &amp;vdev-&gt;memory_lock --&gt; (work_completion)(&amp;arg.work)
--&gt; &amp;new_dev_set-&gt;lock
[  287.297582]
[  287.310023]  Possible unsafe locking scenario:
[  287.310023]
[  287.315961]        CPU0                    CPU1
[  287.320510]        ----                    ----
[  287.325059]   lock(&amp;new_dev_set-&gt;lock);
[  287.328917]
lock((work_completion)(&amp;arg.work));
[  287.336153]                                lock(&amp;new_dev_set-&gt;lock);
[  287.342523]   lock(&amp;vdev-&gt;memory_lock);
[  287.346382]
[  287.346382]  *** DEADLOCK ***
[  287.346382]
[  287.352315] 2 locks held by vfio_pci_sriov_/18636:
[  287.357125]  #0: ff45bea208ed3e18 (&amp;group-&gt;group_lock){+.+.}-{4:4},
at: vfio_group_fops_unl_ioctl+0x3e3/0x7b0
[  287.367048]  #1: ff45bea3a96b8230 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4},
at: vfio_group_fops_unl_ioctl+0x44d/0x7b0
[  287.376976]
[  287.376976] stack backtrace:
[  287.381353] CPU: 191 UID: 0 PID: 18636 Comm: vfio_pci_sriov_
Tainted: G S                  7.0.0-dbg-DEV #3 PREEMPTLAZY
[  287.381355] Tainted: [S]=CPU_OUT_OF_SPEC
[  287.381356] Call Trace:
[  287.381357]  &lt;TASK&gt;
[  287.381358]  dump_stack_lvl+0x54/0x70
[  287.381361]  print_circular_bug+0x2e1/0x300
[  287.381363]  check_noncircular+0xf9/0x120
[  287.381364]  ? __lock_acquire+0x5b4/0x2800
[  287.381366]  __lock_acquire+0x14c6/0x2800
[  287.381368]  ? pci_mmcfg_read+0x4f/0x220
[  287.381370]  ? pci_mmcfg_write+0x57/0x220
[  287.381371]  ? lock_acquire+0xd3/0x2f0
[  287.381373]  ? pci_mmcfg_write+0x57/0x220
[  287.381374]  ? lock_release+0xef/0x360
[  287.381376]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381377]  lock_acquire+0xd3/0x2f0
[  287.381378]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381379]  ? lock_is_held_type+0x76/0x100
[  287.381382]  down_write+0x3a/0xc0
[  287.381382]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381383]  vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381384]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[  287.381385]  __rpm_callback+0x8c/0x310
[  287.381386]  ? ktime_get_mono_fast_ns+0x3d/0xb0
[  287.381389]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[  287.381390]  rpm_resume+0x529/0x6f0
[  287.381392]  ? lock_is_held_type+0x76/0x100
[  287.381394]  __pm_runtime_resume+0x68/0x90
[  287.381396]  vfio_pci_core_enable+0x44/0x310
[  287.381398]  vfio_pci_open_device+0x1c/0x80
[  287.381399]  vfio_df_open+0x10f/0x150
[  287.381401]  vfio_group_fops_unl_ioctl+0x4a4/0x7b0
[  287.381402]  __se_sys_ioctl+0x71/0xc0
[  287.381404]  do_syscall_64+0x14d/0x750
[  287.381405]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.381406]  ? trace_irq_disable+0x25/0xd0
[  287.381409]  entry_SYSCALL_64_after_hwframe+0x77/0x7f

Introduce a private flag 'sriov_active' in the vfio_pci_core_device
struct. This  allows the driver to track the SR-IOV power state requirement
without  relying on pci_num_vf() while holding the memory_lock. The lock is
now  only held to set the flag and ensure the device is in D0, after which
pci_enable_sriov() can be called without the lock.

Fixes: f4162eb1e2fc ("vfio/pci: Change the PF power state to D0 before enabling VFs")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Suggested-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Raghavendra Rao Ananta &lt;rananta@google.com&gt;
Link: https://lore.kernel.org/r/20260514173449.3282188-1-rananta@google.com
[promote bitfield to plain bool to avoid storage-unit races]
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current implementation uses pci_num_vf() while holding the
memory_lock to prevent changing the power state of a PF when
VFs are enabled. This creates a lockdep circular dependency
warning because memory_lock is held during device probing.

[  286.997167] ======================================================
[  287.003363] WARNING: possible circular locking dependency detected
[  287.009562] 7.0.0-dbg-DEV #3 Tainted: G S
[  287.015074] ------------------------------------------------------
[  287.021270] vfio_pci_sriov_/18636 is trying to acquire lock:
[  287.026942] ff45bea2294d4968 (&amp;vdev-&gt;memory_lock){+.+.}-{4:4}, at:
vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.036530]
[  287.036530] but task is already holding lock:
[  287.042383] ff45bea3a96b8230 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4}, at:
vfio_group_fops_unl_ioctl+0x44d/0x7b0
[  287.051879]
[  287.051879] which lock already depends on the new lock.
[  287.051879]
[  287.060070]
[  287.060070] the existing dependency chain (in reverse order) is:
[  287.067568]
[  287.067568] -&gt; #2 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4}:
[  287.073941]        __mutex_lock+0x92/0xb80
[  287.078058]        vfio_assign_device_set+0x66/0x1b0
[  287.083042]        vfio_pci_core_register_device+0xd1/0x2a0
[  287.088638]        vfio_pci_probe+0xd2/0x100
[  287.092933]        local_pci_probe_callback+0x4d/0xa0
[  287.098001]        process_scheduled_works+0x2ca/0x680
[  287.103158]        worker_thread+0x1e8/0x2f0
[  287.107452]        kthread+0x10c/0x140
[  287.111230]        ret_from_fork+0x18e/0x360
[  287.115519]        ret_from_fork_asm+0x1a/0x30
[  287.119983]
[  287.119983] -&gt; #1 ((work_completion)(&amp;arg.work)){+.+.}-{0:0}:
[  287.127219]        __flush_work+0x345/0x490
[  287.131429]        pci_device_probe+0x2e3/0x490
[  287.135979]        really_probe+0x1f9/0x4e0
[  287.140180]        __driver_probe_device+0x77/0x100
[  287.145079]        driver_probe_device+0x1e/0x110
[  287.149803]        __device_attach_driver+0xe3/0x170
[  287.154789]        bus_for_each_drv+0x125/0x150
[  287.159346]        __device_attach+0xca/0x1a0
[  287.163720]        device_initial_probe+0x34/0x50
[  287.168445]        pci_bus_add_device+0x6e/0x90
[  287.172995]        pci_iov_add_virtfn+0x3c9/0x3e0
[  287.177719]        sriov_add_vfs+0x2c/0x60
[  287.181838]        sriov_enable+0x306/0x4a0
[  287.186038]        vfio_pci_core_sriov_configure+0x184/0x220
[  287.191715]        sriov_numvfs_store+0xd9/0x1c0
[  287.196351]        kernfs_fop_write_iter+0x13f/0x1d0
[  287.201338]        vfs_write+0x2be/0x3b0
[  287.205286]        ksys_write+0x73/0x100
[  287.209233]        do_syscall_64+0x14d/0x750
[  287.213529]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.219120]
[  287.219120] -&gt; #0 (&amp;vdev-&gt;memory_lock){+.+.}-{4:4}:
[  287.225491]        __lock_acquire+0x14c6/0x2800
[  287.230048]        lock_acquire+0xd3/0x2f0
[  287.234168]        down_write+0x3a/0xc0
[  287.238019]        vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.243436]        __rpm_callback+0x8c/0x310
[  287.247730]        rpm_resume+0x529/0x6f0
[  287.251765]        __pm_runtime_resume+0x68/0x90
[  287.256402]        vfio_pci_core_enable+0x44/0x310
[  287.261216]        vfio_pci_open_device+0x1c/0x80
[  287.265947]        vfio_df_open+0x10f/0x150
[  287.270148]        vfio_group_fops_unl_ioctl+0x4a4/0x7b0
[  287.275476]        __se_sys_ioctl+0x71/0xc0
[  287.279679]        do_syscall_64+0x14d/0x750
[  287.283975]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.289559]
[  287.289559] other info that might help us debug this:
[  287.289559]
[  287.297582] Chain exists of:
[  287.297582]   &amp;vdev-&gt;memory_lock --&gt; (work_completion)(&amp;arg.work)
--&gt; &amp;new_dev_set-&gt;lock
[  287.297582]
[  287.310023]  Possible unsafe locking scenario:
[  287.310023]
[  287.315961]        CPU0                    CPU1
[  287.320510]        ----                    ----
[  287.325059]   lock(&amp;new_dev_set-&gt;lock);
[  287.328917]
lock((work_completion)(&amp;arg.work));
[  287.336153]                                lock(&amp;new_dev_set-&gt;lock);
[  287.342523]   lock(&amp;vdev-&gt;memory_lock);
[  287.346382]
[  287.346382]  *** DEADLOCK ***
[  287.346382]
[  287.352315] 2 locks held by vfio_pci_sriov_/18636:
[  287.357125]  #0: ff45bea208ed3e18 (&amp;group-&gt;group_lock){+.+.}-{4:4},
at: vfio_group_fops_unl_ioctl+0x3e3/0x7b0
[  287.367048]  #1: ff45bea3a96b8230 (&amp;new_dev_set-&gt;lock){+.+.}-{4:4},
at: vfio_group_fops_unl_ioctl+0x44d/0x7b0
[  287.376976]
[  287.376976] stack backtrace:
[  287.381353] CPU: 191 UID: 0 PID: 18636 Comm: vfio_pci_sriov_
Tainted: G S                  7.0.0-dbg-DEV #3 PREEMPTLAZY
[  287.381355] Tainted: [S]=CPU_OUT_OF_SPEC
[  287.381356] Call Trace:
[  287.381357]  &lt;TASK&gt;
[  287.381358]  dump_stack_lvl+0x54/0x70
[  287.381361]  print_circular_bug+0x2e1/0x300
[  287.381363]  check_noncircular+0xf9/0x120
[  287.381364]  ? __lock_acquire+0x5b4/0x2800
[  287.381366]  __lock_acquire+0x14c6/0x2800
[  287.381368]  ? pci_mmcfg_read+0x4f/0x220
[  287.381370]  ? pci_mmcfg_write+0x57/0x220
[  287.381371]  ? lock_acquire+0xd3/0x2f0
[  287.381373]  ? pci_mmcfg_write+0x57/0x220
[  287.381374]  ? lock_release+0xef/0x360
[  287.381376]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381377]  lock_acquire+0xd3/0x2f0
[  287.381378]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381379]  ? lock_is_held_type+0x76/0x100
[  287.381382]  down_write+0x3a/0xc0
[  287.381382]  ? vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381383]  vfio_pci_core_runtime_resume+0x1f/0xa0
[  287.381384]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[  287.381385]  __rpm_callback+0x8c/0x310
[  287.381386]  ? ktime_get_mono_fast_ns+0x3d/0xb0
[  287.381389]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[  287.381390]  rpm_resume+0x529/0x6f0
[  287.381392]  ? lock_is_held_type+0x76/0x100
[  287.381394]  __pm_runtime_resume+0x68/0x90
[  287.381396]  vfio_pci_core_enable+0x44/0x310
[  287.381398]  vfio_pci_open_device+0x1c/0x80
[  287.381399]  vfio_df_open+0x10f/0x150
[  287.381401]  vfio_group_fops_unl_ioctl+0x4a4/0x7b0
[  287.381402]  __se_sys_ioctl+0x71/0xc0
[  287.381404]  do_syscall_64+0x14d/0x750
[  287.381405]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  287.381406]  ? trace_irq_disable+0x25/0xd0
[  287.381409]  entry_SYSCALL_64_after_hwframe+0x77/0x7f

Introduce a private flag 'sriov_active' in the vfio_pci_core_device
struct. This  allows the driver to track the SR-IOV power state requirement
without  relying on pci_num_vf() while holding the memory_lock. The lock is
now  only held to set the flag and ensure the device is in D0, after which
pci_enable_sriov() can be called without the lock.

Fixes: f4162eb1e2fc ("vfio/pci: Change the PF power state to D0 before enabling VFs")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Suggested-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Raghavendra Rao Ananta &lt;rananta@google.com&gt;
Link: https://lore.kernel.org/r/20260514173449.3282188-1-rananta@google.com
[promote bitfield to plain bool to avoid storage-unit races]
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/xe: avoid duplicate reset in xe_vfio_pci_reset_done</title>
<updated>2026-05-20T20:52:21+00:00</updated>
<author>
<name>GuoHan Zhao</name>
<email>zhaoguohan@kylinos.cn</email>
</author>
<published>2026-04-27T01:21:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b9285405c5f6144f4444f97bf3048d865e11cc1d'/>
<id>b9285405c5f6144f4444f97bf3048d865e11cc1d</id>
<content type='text'>
xe_vfio_pci_reset_done() sets deferred_reset and, when it manages to
acquire state_mutex itself, hands the cleanup off to
xe_vfio_pci_state_mutex_unlock().

That helper already clears deferred_reset and runs xe_vfio_pci_reset()
before dropping the mutex. Calling xe_vfio_pci_reset() again right
afterwards repeats the reset handling unnecessarily.

Fixes: 1f5556ec8b9e ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Signed-off-by: GuoHan Zhao &lt;zhaoguohan@kylinos.cn&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Acked-by: Michał Winiarski &lt;michal.winiarski@intel.com&gt;
Link: https://lore.kernel.org/r/20260427012128.117051-1-zhaoguohan@kylinos.cn
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
xe_vfio_pci_reset_done() sets deferred_reset and, when it manages to
acquire state_mutex itself, hands the cleanup off to
xe_vfio_pci_state_mutex_unlock().

That helper already clears deferred_reset and runs xe_vfio_pci_reset()
before dropping the mutex. Calling xe_vfio_pci_reset() again right
afterwards repeats the reset handling unnecessarily.

Fixes: 1f5556ec8b9e ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Signed-off-by: GuoHan Zhao &lt;zhaoguohan@kylinos.cn&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Acked-by: Michał Winiarski &lt;michal.winiarski@intel.com&gt;
Link: https://lore.kernel.org/r/20260427012128.117051-1-zhaoguohan@kylinos.cn
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hisi_acc_vfio_pci: simplify the command for reading device information</title>
<updated>2026-05-20T20:52:09+00:00</updated>
<author>
<name>Weili Qian</name>
<email>qianweili@huawei.com</email>
</author>
<published>2026-05-14T09:20:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e1b3b4d6ff6dc7f615c2259961ce860ceb93e74e'/>
<id>e1b3b4d6ff6dc7f615c2259961ce860ceb93e74e</id>
<content type='text'>
The mailbox operation for the Hisi accelerator device now provides a
new read function that supports direct information retrieval by
specifying commands, thereby simplifying the related mailbox command
handling in the driver.

Signed-off-by: Weili Qian &lt;qianweili@huawei.com&gt;
Signed-off-by: Longfang Liu &lt;liulongfang@huawei.com&gt;
Link: https://lore.kernel.org/r/20260514092026.2018844-1-liulongfang@huawei.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The mailbox operation for the Hisi accelerator device now provides a
new read function that supports direct information retrieval by
specifying commands, thereby simplifying the related mailbox command
handling in the driver.

Signed-off-by: Weili Qian &lt;qianweili@huawei.com&gt;
Signed-off-by: Longfang Liu &lt;liulongfang@huawei.com&gt;
Link: https://lore.kernel.org/r/20260514092026.2018844-1-liulongfang@huawei.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Replace vfio_pci_core_setup_barmap() with vfio_pci_core_get_iomap()</title>
<updated>2026-05-20T17:54:10+00:00</updated>
<author>
<name>Matt Evans</name>
<email>mattev@meta.com</email>
</author>
<published>2026-05-11T14:58:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=859dc0f6253b0d65fd4d79e29aa29b3d939c51a9'/>
<id>859dc0f6253b0d65fd4d79e29aa29b3d939c51a9</id>
<content type='text'>
Since "vfio/pci: Set up barmap in vfio_pci_core_enable()", the
resource request and iomap for the BARs was performed early, and
vfio_pci_core_setup_barmap() just checks those actions succeeded.

Move this logic to a new helper that checks success and returns the
iomap address, replacing the various bare vdev-&gt;barmap[] lookups.
This maintains the error behaviour of the previous on-demand
vfio_pci_core_setup_barmap() scheme.

Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-4-mattev@meta.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since "vfio/pci: Set up barmap in vfio_pci_core_enable()", the
resource request and iomap for the BARs was performed early, and
vfio_pci_core_setup_barmap() just checks those actions succeeded.

Move this logic to a new helper that checks success and returns the
iomap address, replacing the various bare vdev-&gt;barmap[] lookups.
This maintains the error behaviour of the previous on-demand
vfio_pci_core_setup_barmap() scheme.

Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-4-mattev@meta.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Check BAR resources before exporting a DMABUF</title>
<updated>2026-05-14T17:39:03+00:00</updated>
<author>
<name>Matt Evans</name>
<email>mattev@meta.com</email>
</author>
<published>2026-05-11T14:58:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=702809dabdecca807bdd50cfdcc1c980feb2ba62'/>
<id>702809dabdecca807bdd50cfdcc1c980feb2ba62</id>
<content type='text'>
A DMABUF exports access to BAR resources and, although they are
requested at startup time, we need to ensure they really were reserved
before exporting.  Otherwise, it's possible to access unreserved
resources through the export.

Add a check to the DMABUF-creation path.

Fixes: 5d74781ebc86c ("vfio/pci: Add dma-buf export support for MMIO regions")
Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-3-mattev@meta.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A DMABUF exports access to BAR resources and, although they are
requested at startup time, we need to ensure they really were reserved
before exporting.  Otherwise, it's possible to access unreserved
resources through the export.

Add a check to the DMABUF-creation path.

Fixes: 5d74781ebc86c ("vfio/pci: Add dma-buf export support for MMIO regions")
Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-3-mattev@meta.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Set up BAR resources and maps in vfio_pci_core_enable()</title>
<updated>2026-05-14T17:38:04+00:00</updated>
<author>
<name>Matt Evans</name>
<email>mattev@meta.com</email>
</author>
<published>2026-05-11T14:58:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=05f2a68b407a6817fe141dd64972c6ab8725312d'/>
<id>05f2a68b407a6817fe141dd64972c6ab8725312d</id>
<content type='text'>
Previously BAR resource requests and the corresponding pci_iomap()
were performed on-demand and without synchronisation, which was racy.
Rather than add synchronisation, it's simplest to address this by
doing both activities from vfio_pci_core_enable().

The resource allocation and/or pci_iomap() can still fail; their
status is tracked and existing calls to vfio_pci_core_setup_barmap()
will fail in a similar way to before.  This keeps the point of failure
as observed by userspace the same, i.e. failures to request/map unused
BARs are benign.

Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-2-mattev@meta.com
[ERR_PTR -&gt; IOMEM_ERR_PTR per lkp report]
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously BAR resource requests and the corresponding pci_iomap()
were performed on-demand and without synchronisation, which was racy.
Rather than add synchronisation, it's simplest to address this by
doing both activities from vfio_pci_core_enable().

The resource allocation and/or pci_iomap() can still fail; their
status is tracked and existing calls to vfio_pci_core_setup_barmap()
will fail in a similar way to before.  This keeps the point of failure
as observed by userspace the same, i.e. failures to request/map unused
BARs are benign.

Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Signed-off-by: Matt Evans &lt;mattev@meta.com&gt;
Link: https://lore.kernel.org/r/20260511145829.2993601-2-mattev@meta.com
[ERR_PTR -&gt; IOMEM_ERR_PTR per lkp report]
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: fix dma-buf kref underflow after revoke</title>
<updated>2026-05-13T19:58:27+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@nvidia.com</email>
</author>
<published>2026-05-07T14:35:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c64a647c84f30be368404f50f9052ed6c75c0f17'/>
<id>c64a647c84f30be368404f50f9052ed6c75c0f17</id>
<content type='text'>
vfio_pci_dma_buf_move(revoked=true) and vfio_pci_dma_buf_cleanup()
ran the same drain sequence: set priv-&gt;revoked, invalidate mappings,
wait for fences, drop the registered kref, wait for completion.
When the VFIO device fd was closed after PCI_COMMAND_MEMORY had been
cleared, both ran in turn -- the second kref_put underflowed and the
subsequent wait_for_completion() blocked on a completion that the
first run had already consumed:

  refcount_t: underflow; use-after-free.
  WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x59/0x90
  Call Trace:
   vfio_pci_dma_buf_cleanup+0x163/0x168 [vfio_pci_core]
   vfio_pci_core_close_device+0x67/0xe0 [vfio_pci_core]
   vfio_df_close+0x4c/0x80 [vfio]
   vfio_df_group_close+0x36/0x80 [vfio]
   vfio_device_fops_release+0x21/0x40 [vfio]
   __fput+0xe6/0x2b0
   __x64_sys_close+0x3d/0x80

Collapse the duplication: vfio_pci_dma_buf_cleanup() now delegates
the drain to vfio_pci_dma_buf_move(true), which is idempotent for
already-revoked dma-bufs.  cleanup retains only list removal and
the device registration drop; the dma_resv_lock that bracketed
those is dropped along with the in-line drain that required it,
memory_lock continues to protect them.

Re-arm the kref and the completion at the end of move()'s revoke
branch so post-revoke state matches post-creation (kref == 1,
completion ready).  This keeps cleanup's call into move() a no-op
when revoke already ran, and replaces the explicit kref_init() that
the un-revoke branch used to perform for the un-revoke -&gt; remap
path.

Fixes: 1a8a5227f229 ("vfio: Wait for dma-buf invalidation to complete")
Reported-by: Joonas Kylmälä &lt;joonas.kylmala@netum.fi&gt;
Closes: https://lore.kernel.org/all/GVXPR02MB12019AA6014F27EF5D773E89BFB372@GVXPR02MB12019.eurprd02.prod.outlook.com/
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Reviewed-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@nvidia.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20260507143548.1018405-1-alex.williamson@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vfio_pci_dma_buf_move(revoked=true) and vfio_pci_dma_buf_cleanup()
ran the same drain sequence: set priv-&gt;revoked, invalidate mappings,
wait for fences, drop the registered kref, wait for completion.
When the VFIO device fd was closed after PCI_COMMAND_MEMORY had been
cleared, both ran in turn -- the second kref_put underflowed and the
subsequent wait_for_completion() blocked on a completion that the
first run had already consumed:

  refcount_t: underflow; use-after-free.
  WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x59/0x90
  Call Trace:
   vfio_pci_dma_buf_cleanup+0x163/0x168 [vfio_pci_core]
   vfio_pci_core_close_device+0x67/0xe0 [vfio_pci_core]
   vfio_df_close+0x4c/0x80 [vfio]
   vfio_df_group_close+0x36/0x80 [vfio]
   vfio_device_fops_release+0x21/0x40 [vfio]
   __fput+0xe6/0x2b0
   __x64_sys_close+0x3d/0x80

Collapse the duplication: vfio_pci_dma_buf_cleanup() now delegates
the drain to vfio_pci_dma_buf_move(true), which is idempotent for
already-revoked dma-bufs.  cleanup retains only list removal and
the device registration drop; the dma_resv_lock that bracketed
those is dropped along with the in-line drain that required it,
memory_lock continues to protect them.

Re-arm the kref and the completion at the end of move()'s revoke
branch so post-revoke state matches post-creation (kref == 1,
completion ready).  This keeps cleanup's call into move() a no-op
when revoke already ran, and replaces the explicit kref_init() that
the un-revoke branch used to perform for the un-revoke -&gt; remap
path.

Fixes: 1a8a5227f229 ("vfio: Wait for dma-buf invalidation to complete")
Reported-by: Joonas Kylmälä &lt;joonas.kylmala@netum.fi&gt;
Closes: https://lore.kernel.org/all/GVXPR02MB12019AA6014F27EF5D773E89BFB372@GVXPR02MB12019.eurprd02.prod.outlook.com/
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Reviewed-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@nvidia.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20260507143548.1018405-1-alex.williamson@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/virtio: Use guard() for bar_mutex in legacy I/O</title>
<updated>2026-04-21T18:01:21+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@nvidia.com</email>
</author>
<published>2026-04-14T20:06:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b0eab97305ae97190a605117095d84f12ecef187'/>
<id>b0eab97305ae97190a605117095d84f12ecef187</id>
<content type='text'>
Convert the bar_mutex acquisition in virtiovf_issue_legacy_rw_cmd()
to use guard(), eliminating the out label and goto-based error paths
in favor of direct returns.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Alex Williamson &lt;alex.williamson@nvidia.com&gt;
Reviewed-by: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Link: https://lore.kernel.org/r/20260414200625.3601509-5-alex.williamson@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert the bar_mutex acquisition in virtiovf_issue_legacy_rw_cmd()
to use guard(), eliminating the out label and goto-based error paths
in favor of direct returns.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Alex Williamson &lt;alex.williamson@nvidia.com&gt;
Reviewed-by: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Link: https://lore.kernel.org/r/20260414200625.3601509-5-alex.williamson@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
