summaryrefslogtreecommitdiff
path: root/drivers/virtio
AgeCommit message (Collapse)Author
30 hoursMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - new virtio CAN driver - support for LoongArch architecture in fw_cfg - support for firmware notifications in vdpa/octeon_ep - support for VFs in virtio core - fixes, cleanups all over the place, notably: - vhost: fix vhost_get_avail_idx for a non empty ring fixing an significant old perf regression - READ_ONCE() annotations mean virtio ring is now free of KCSAN warnings * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (37 commits) can: virtio: Fix comment in UAPI header can: virtio: Add virtio CAN driver virtio: add num_vf callback to virtio_bus fw_cfg: Add support for LoongArch architecture vdpa/octeon_ep: fix IRQ-to-ring mapping in interrupt handler vdpa/octeon_ep: Add vDPA device event handling for firmware notifications vdpa/octeon_ep: Use 4 bytes for mailbox signature vdpa/octeon_ep: Fix PF->VF mailbox data address calculation vhost_task_create: kill unnecessary .exit_signal initialization vhost: remove unnecessary module_init/exit functions vdpa/mlx5: Use kvzalloc_flex() for MTT command memory vdpa_sim_net: switch to dynamic root device vdpa_sim_blk: switch to dynamic root device virtio-mem: Destroy mutex before freeing virtio_mem virtio-balloon: Destroy mutex before freeing virtio_balloon tools/virtio: fix build for kmalloc_obj API and missing stubs virtio_ring: Add READ_ONCE annotations for device-writable fields vduse: fix compat handling for VDUSE_IOTLB_GET_FD/VDUSE_VQ_GET_INFO tools/virtio: check mmap return value in vringh_test vhost/net: complete zerocopy ubufs only once ...
9 daysvirtio: add num_vf callback to virtio_busYui Washizu
Recent QEMU versions added support for virtio SR-IOV emulation, allowing virtio devices to expose SR-IOV VFs to the guest. However, virtio_bus does not implement the num_vf callback of bus_type, causing dev_num_vf() to return 0 for virtio devices even when SR-IOV VFs are active. net/core/rtnetlink.c calls dev_num_vf(dev->dev.parent) to populate IFLA_NUM_VF in RTM_GETLINK responses. For a virtio-net device, dev.parent points to the virtio_device, whose busis virtio_bus. Without num_vf, SR-IOV VF information is silently omitted from tools that rely on rtnetlink, such as 'ip link show'. Add a num_vf callback that delegates to dev_num_vf(dev->parent), which in turn reaches the underlying transport (pci_bus_type for virtio-pci) where the actual VF count is tracked. Non-PCI transports are unaffected as dev_num_vf() returns 0 when no num_vf callback is present. Signed-off-by: Yui Washizu <yui.washidu@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20260310061454.683894-1-yui.washidu@gmail.com>
9 daysvirtio-mem: Destroy mutex before freeing virtio_memMaurice Hieronymus
Add a call to mutex_destroy in the error code path as well as in the virtio_mem_remove code path. Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20251123175750.445461-3-mhi@mailbox.org>
9 daysvirtio-balloon: Destroy mutex before freeing virtio_balloonMaurice Hieronymus
Add a call to mutex_destroy in the error code path as well as in the virtballoon_remove code path. Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20251123175750.445461-2-mhi@mailbox.org>
9 daysvirtio_ring: Add READ_ONCE annotations for device-writable fieldsAlexander Graf
KCSAN reports data races when accessing virtio ring fields that are concurrently written by the device (host). These are legitimate concurrent accesses where the CPU reads fields that the device updates via DMA-like mechanisms. Add accessor functions that use READ_ONCE() to properly annotate these device-writable fields and prevent compiler optimizations that could in theory break the code. This also serves as documentation showing which fields are shared with the device. The affected fields are: - Split ring: used->idx, used->ring[].id, used->ring[].len - Packed ring: desc[].flags, desc[].id, desc[].len This patch was partially written using the help of Kiro, an AI coding assistant, to automate the mechanical work of generating the inline function definition. Signed-off-by: Alexander Graf <graf@amazon.com> [jth: Add READ_ONCE in virtqueue_kick_prepare_split ] Co-developed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20260131102810.1254845-1-johannes.thumshirn@wdc.com>
9 daysvirtio: rtc: tear down old virtqueues before restoreJia Jia
virtio_device_restore() resets the device and restores the negotiated features before calling ->restore(). viortc_freeze() intentionally leaves the existing virtqueues in place so the alarm queue can still wake the system, but viortc_restore() immediately calls viortc_init_vqs() without first deleting those old queues. If virtqueue reinitialization fails on virtio-pci, the transport error path can run vp_del_vqs() against a newly allocated vp_dev->vqs array while vdev->vqs still contains the old virtqueues. vp_del_vqs() then looks up queue state through the new array and can dereference a NULL info pointer in vp_del_vq(), crashing the guest kernel during restore. This can also happen during a non-faulty reinitialization, when one of the vp_find_vqs_msix() attempts is unsuccessful before a later attempt would succeed. Delete the stale virtqueues before rebuilding them. If restore fails before virtio_device_ready(), reuse the remove path to stop the device. Once the device is ready, return errors directly instead of deleting the virtqueues again. Fixes: 0623c7592768 ("virtio_rtc: Add module and driver core") Signed-off-by: Jia Jia <physicalmtea@gmail.com> Reviewed-by: Peter Hilber <peter.hilber@oss.qualcomm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20260507120801.3677552-1-physicalmtea@gmail.com>
9 daysvirtio-mmio: fix device release warning on module unloadJohan Hovold
Driver core expects devices to be allocated dynamically and complains loudly when a device that lacks a release function is freed. Use __root_device_register() to allocate and register the root device instead of open coding using a static device. Note that root_device_register(), which also creates a link to the module, cannot be used as the device is registered when parsing the module parameters which happens before the module kobject has been set up. Fixes: 81a054ce0b46 ("virtio-mmio: Devices parameter parsing") Cc: stable@vger.kernel.org # 3.5 Cc: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20260427143710.14702-1-johan@kernel.org>
2026-06-04virtio_rtc: Use provided clock ID for history snapshotThomas Gleixner
The PTP core indicates in system_device_crosststamp::clock_id the clock ID for which the system time stamp should be taken. That allows to utilize hardware timestamps with e.g. AUX clocks. Use ktime_get_snapshot_id() and hand the provided clock ID in. No functional change. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Arthur Kiyanovski <akiyano@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20260529195557.744271454@kernel.org
2026-06-04virtio_pci: fix vq info pointer lookup via wrong indexAmmar Faizi
Unbinding a virtio balloon device: echo virtio0 > /sys/bus/virtio/drivers/virtio_balloon/unbind triggers a NULL pointer dereference. The dmesg says: BUG: kernel NULL pointer dereference, address: 0000000000000008 [...] RIP: 0010:__list_del_entry_valid_or_report+0x5/0xf0 Call Trace: <TASK> vp_del_vqs+0x121/0x230 remove_common+0x135/0x150 virtballoon_remove+0xee/0x100 virtio_dev_remove+0x3b/0x80 device_release_driver_internal+0x187/0x2c0 unbind_store+0xb9/0xe0 kernfs_fop_write_iter.llvm.11660790530567441834+0xf6/0x180 vfs_write+0x2a9/0x3b0 ksys_write+0x5c/0xd0 do_syscall_64+0x54/0x230 entry_SYSCALL_64_after_hwframe+0x29/0x31 [...] </TASK> The virtio_balloon device registers 5 queues (inflate, deflate, stats, free_page, reporting) but only the first two are unconditional. The stats, free_page and reporting queues are each conditional on their respective feature bits. When any of these features are absent, the corresponding vqs_info entry has name == NULL, creating holes in the array. The root cause is an indexing mismatch introduced when vq info storage was changed to be passed as an argument. vp_find_vqs_msix() and vp_find_vqs_intx() store the info pointer at vp_dev->vqs[i], where 'i' is the caller's sparse array index. However, the virtqueue itself gets vq->index assigned from queue_idx, a dense index that skips NULL entries. When holes exist, 'i' and queue_idx diverge. Later, vp_del_vqs() looks up info via vp_dev->vqs[vq->index] using the dense index into the sparsely-populated array, and hits NULL. Fix this by storing info at vp_dev->vqs[queue_idx] instead of vp_dev->vqs[i], so the store index matches the lookup index (vq->index). Apply the fix to both the MSIX and INTX paths. Cc: Yichun Zhang <yichun@openresty.com> Cc: Jiri Pirko <jiri@nvidia.com> Cc: stable@vger.kernel.org # v6.11+ Tested-by: Yuka <yuka@umeyashiki.org> Fixes: 89a1c435aec2 ("virtio_pci: pass vq info as an argument to vp_setup_vq()") Signed-off-by: Ammar Faizi <ammarfaizi2@openresty.com> Message-Id: <20260315141808.547081-1-ammarfaizi2@openresty.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2026-04-15Merge tag 'mm-stable-2026-04-13-21-45' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...
2026-04-05virtio_balloon: set unspecified page reporting orderYuvraj Sakshith
virtio_balloon page reporting order is set to MAX_PAGE_ORDER implicitly as vb->prdev.order is never initialised and is auto-set to zero. Explicitly mention usage of default page order by making use of PAGE_REPORTING_ORDER_UNSPECIFIED fallback value. Link: https://lkml.kernel.org/r/20260303113032.3008371-3-yuvraj.sakshith@oss.qualcomm.com Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Liu <wei.liu@kernel.org> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: move pgscan, pgsteal, pgrefill to node statsJP Kobryn (Meta)
There are situations where reclaim kicks in on a system with free memory. One possible cause is a NUMA imbalance scenario where one or more nodes are under pressure. It would help if we could easily identify such nodes. Move the pgscan, pgsteal, and pgrefill counters from vm_event_item to node_stat_item to provide per-node reclaim visibility. With these counters as node stats, the values are now displayed in the per-node section of /proc/zoneinfo, which allows for quick identification of the affected nodes. /proc/vmstat continues to report the same counters, aggregated across all nodes. But the ordering of these items within the readout changes as they move from the vm events section to the node stats section. Memcg accounting of these counters is preserved. The relocated counters remain visible in memory.stat alongside the existing aggregate pgscan and pgsteal counters. However, this change affects how the global counters are accumulated. Previously, the global event count update was gated on !cgroup_reclaim(), excluding memcg-based reclaim from /proc/vmstat. Now that mod_lruvec_state() is being used to update the counters, the global counters will include all reclaim. This is consistent with how pgdemote counters are already tracked. Finally, the virtio_balloon driver is updated to use global_node_page_state() to fetch the counters, as they are no longer accessible through the vm_events array. Link: https://lkml.kernel.org/r/20260219235846.161910-1-jp.kobryn@linux.dev Signed-off-by: JP Kobryn <jp.kobryn@linux.dev> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@kernel.org> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-20dma-mapping: Clarify valid conditions for CPU cache line overlapLeon Romanovsky
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to better reflect that it is debugging aid to inform DMA core code that CPU cache line overlaps are allowed, and refine the documentation describing its use. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-3-1dde90a7f08b@nvidia.com
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-13Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - in-order support in virtio core - multiple address space support in vduse - fixes, cleanups all over the place, notably dma alignment fixes for non-cache-coherent systems * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (59 commits) vduse: avoid adding implicit padding vhost: fix caching attributes of MMIO regions by setting them explicitly vdpa/mlx5: update MAC address handling in mlx5_vdpa_set_attr() vdpa/mlx5: reuse common function for MAC address updates vdpa/mlx5: update mlx_features with driver state check crypto: virtio: Replace package id with numa node id crypto: virtio: Remove duplicated virtqueue_kick in virtio_crypto_skcipher_crypt_req crypto: virtio: Add spinlock protection with virtqueue notification Documentation: Add documentation for VDUSE Address Space IDs vduse: bump version number vduse: add vq group asid support vduse: merge tree search logic of IOTLB_GET_FD and IOTLB_GET_INFO ioctls vduse: take out allocations from vduse_dev_alloc_coherent vduse: remove unused vaddr parameter of vduse_domain_free_coherent vduse: refactor vdpa_dev_add for goto err handling vhost: forbid change vq groups ASID if DRIVER_OK is set vdpa: document set_group_asid thread safety vduse: return internal vq group struct as map token vduse: add vq group support vduse: add v1 API definition ...
2026-01-31mm: rename CONFIG_MEMORY_BALLOON -> CONFIG_BALLOONDavid Hildenbrand (Red Hat)
Let's make it consistent with the naming of the files but also with the naming of CONFIG_BALLOON_MIGRATION. While at it, add a "/* CONFIG_BALLOON */". Link: https://lkml.kernel.org/r/20260119230133.3551867-24-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm: rename CONFIG_BALLOON_COMPACTION to CONFIG_BALLOON_MIGRATIONDavid Hildenbrand (Red Hat)
While compaction depends on migration, the other direction is not the case. So let's make it clearer that this is all about migration of balloon pages. Adjust all comments/docs in the core to talk about "migration" instead of "compaction". While at it add some "/* CONFIG_BALLOON_MIGRATION */". Link: https://lkml.kernel.org/r/20260119230133.3551867-23-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm: rename balloon_compaction.(c|h) to balloon.(c|h)David Hildenbrand (Red Hat)
Even without CONFIG_BALLOON_COMPACTION this infrastructure implements basic list and page management for a memory balloon. Link: https://lkml.kernel.org/r/20260119230133.3551867-21-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31drivers/virtio/virtio_balloon: stop using balloon_page_push/pop()David Hildenbrand (Red Hat)
Let's stop using these functions so we can remove them. They look like belonging to the balloon API for managing the device balloon list when really they are just simple helpers only used by virtio-balloon. Let's just inline them and switch to a proper list_for_each_entry_safe(). Link: https://lkml.kernel.org/r/20260119230133.3551867-13-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm/balloon_compaction: centralize adjust_managed_page_count() handlingDavid Hildenbrand (Red Hat)
Let's centralize it, by allowing for the driver to enable this handling through a new flag (bool for now) in the balloon device info. Note that we now adjust the counter when adding/removing a page into the balloon list: when removing a page to deflate it, it will now happen before the driver communicated with hypervisor, not afterwards. This shouldn't make a difference in practice. Link: https://lkml.kernel.org/r/20260119230133.3551867-7-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm/balloon_compaction: centralize basic page migration handlingDavid Hildenbrand (Red Hat)
Let's update the balloon page references, the balloon page list, the BALLOON_MIGRATE counter and the isolated-pages counter in balloon_page_migrate(), after letting the balloon->migratepage() callback deal with the actual inflation+deflation. Note that we now perform the balloon list modifications outside of any implementation-specific locks: which is fine, there is nothing special about these page actions that the lock would be protecting. The old page is already no longer in the list (isolated) and the new page is not yet in the list. Let's use -ENOENT to communicate the special "inflation of new page failed after already deflating the old page" to balloon_page_migrate() so it can handle it accordingly. While at it, rename balloon->b_dev_info to make it match the other functions. Also, drop the comment above balloon_page_migrate(), which seems unnecessary. Link: https://lkml.kernel.org/r/20260119230133.3551867-6-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-08virtio_input: use virtqueue_add_inbuf_cache_clean for eventsMichael S. Tsirkin
The evts array contains 64 small (8-byte) input events that share cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled, this can trigger warnings about overlapping DMA mappings within the same cacheline. Previous patch isolated the array in its own cachelines, so the warnings are now spurious. Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not write into these cache lines, suppressing these warnings. Message-ID: <4c885b4046323f68cf5cadc7fbfb00216b11dd20.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2026-01-08virtio_input: fix DMA alignment for evtsMichael S. Tsirkin
On non-cache-coherent platforms, when a structure contains a buffer used for DMA alongside fields that the CPU writes to, cacheline sharing can cause data corruption. The evts array is used for DMA_FROM_DEVICE operations via virtqueue_add_inbuf(). The adjacent lock and ready fields are written by the CPU during normal operation. If these share cachelines with evts, CPU writes can corrupt DMA data. Add __dma_from_device_group_begin()/end() annotations to ensure evts is isolated in its own cachelines. Message-ID: <cd328233198a76618809bb5cd9a6ddcaa603a8a1.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2026-01-02virtio: add virtqueue_add_inbuf_cache_clean APIMichael S. Tsirkin
Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN to virtqueue operations. This suppresses DMA debug cacheline overlap warnings for buffers where proper cache management is ensured by the caller. Message-ID: <e50d38c974859e731e50bda7a0ee5691debf5bc4.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-12-31virtio_ring: add in order supportJason Wang
This patch implements in order support for both split virtqueue and packed virtqueue. Performance could be gained for the device where the memory access could be expensive (e.g vhost-net or a real PCI device): Benchmark with KVM guest: Vhost-net on the host: (pktgen + XDP_DROP): in_order=off | in_order=on | +% TX: 4.51Mpps | 5.30Mpps | +17% RX: 3.47Mpps | 3.61Mpps | + 4% Vhost-user(testpmd) on the host: (pktgen/XDP_DROP): For split virtqueue: in_order=off | in_order=on | +% TX: 5.60Mpps | 5.60Mpps | +0.0% RX: 9.16Mpps | 9.61Mpps | +4.9% For packed virtqueue: in_order=off | in_order=on | +% TX: 5.60Mpps | 5.70Mpps | +1.7% RX: 10.6Mpps | 10.8Mpps | +1.8% Benchmark also shows no performance impact for in_order=off for queue size with 256 and 1024. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-20-jasowang@redhat.com>
2025-12-31virtio_ring: factor out split detaching logicJason Wang
This patch factors out the split core detaching logic that could be reused by in order feature into a dedicated function. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-19-jasowang@redhat.com>
2025-12-31virtio_ring: factor out split indirect detaching logicJason Wang
Factor out the split indirect descriptor detaching logic in order to allow it to be reused by the in order support. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-18-jasowang@redhat.com>
2025-12-31virtio_ring: factor out core logic for updating last_used_idxJason Wang
Factor out the core logic for updating last_used_idx to be reused by the packed in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-17-jasowang@redhat.com>
2025-12-31virtio_ring: factor out core logic of buffer detachingJason Wang
Factor out core logic of buffer detaching and leave the free list management to the caller so in_order can just call the core logic. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-16-jasowang@redhat.com>
2025-12-31virtio_ring: determine descriptor flags at one timeJason Wang
Let's determine the last descriptor by counting the number of sg. This would be consistent with packed virtqueue implementation and ease the future in-order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-15-jasowang@redhat.com>
2025-12-31virtio_ring: introduce virtqueue opsJason Wang
This patch introduces virtqueue ops which is a set of callbacks that will be called for different queue layout or features. This would help to avoid branches for split/packed and will ease the future implementation like in order. Note that in order to eliminate the indirect calls this patch uses global array of const ops to allow compiler to avoid indirect branches. Tested with CONFIG_MITIGATION_RETPOLINE, no performance differences were noticed. Acked-by: Eugenio Pérez <eperezma@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-14-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use unsigned int for virtqueue_poll_packed()Jason Wang
Switch to use unsigned int for virtqueue_poll_packed() to match virtqueue_poll() and virtqueue_poll_split() and to ease the abstraction of the virtqueue ops. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-13-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for detach_unused_buf variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-12-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for disable_cb variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-11-jasowang@redhat.com>
2025-12-31virtio_ring: use vring_virtqueue for enable_cb_delayed variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-10-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-9-jasowang@redhat.com>
2025-12-31virtio: switch to use vring_virtqueue for virtqueue_get variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-8-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue_add variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-7-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-6-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue resize variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-5-jasowang@redhat.com>
2025-12-31virtio_ring: unify logic of virtqueue_poll() and more_used()Jason Wang
This patch unifies the logic of virtqueue_poll() and more_used() for better code reusing and ease the future in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-4-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue in virtqueue_poll variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-3-jasowang@redhat.com>
2025-12-31virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()Jason Wang
To be consistent with virtqueue_reset(). Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-2-jasowang@redhat.com>
2025-12-26virtio_ring: code cleanup in detach_buf_splitzhangdongchuan@eswincomputing.com
Since the return value of vring_unmap_one_split() is exactly vq->split.desc_extra[i].next, 'i = vq->split.desc_extra[i].next' is redundant. Assign vring_unmap_one_split() to i instead. Since vq->split.desc_extra is assigned to extra, use extra[i].next instead of vq->split.desc_extra[i].next to improve readability. No change in functionality. Signed-off-by: zhangdongchuan <zhangdongchuan@eswincomputing.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <202511261140162936986@eswincomputing.com>
2025-11-27virtio: clean up features qword/dword termsMichael S. Tsirkin
virtio pci uses word to mean "16 bits". mmio uses it to mean "32 bits". To avoid confusion, let's avoid the term in core virtio altogether. Just say U64 to mean "64 bit". Fixes: e7d4c1c5a546 ("virtio: introduce extended features") Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-ID: <ad53b7b6be87fc524f45abaeca0bb05fb3633397.1764225384.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27virtio_balloon: add WQ_PERCPU to alloc_workqueue usersMarco Crivellari
Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251107154917.313090-2-marco.crivellari@suse.com>
2025-11-27virtio: fix kernel-doc for mapping/free_coherent functionsKriish Sharma
Documentation build reported: WARNING: ./drivers/virtio/virtio_ring.c:3174 function parameter 'vaddr' not described in 'virtqueue_map_free_coherent' WARNING: ./drivers/virtio/virtio_ring.c:3308 expecting prototype for virtqueue_mapping_error(). Prototype was for virtqueue_map_mapping_error() instead The kernel-doc block for virtqueue_map_free_coherent() omitted the @vaddr parameter, and the kernel-doc header for virtqueue_map_mapping_error() used the wrong function name (virtqueue_mapping_error) instead of the actual function name. This change updates: - the function name in the comment to virtqueue_map_mapping_error() - adds the missing @vaddr description in the comment for virtqueue_map_free_coherent() Fixes: b41cb3bcf67f ("virtio: rename dma helpers") Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251110202920.2250244-1-kriish.sharma2006@gmail.com>
2025-11-27virtio_vdpa: fix misleading return in void functionAlok Tiwari
virtio_vdpa_set_status() is declared as returning void, but it used "return vdpa_set_status()" Since vdpa_set_status() also returns void, the return statement is unnecessary and misleading. Remove it. Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Message-Id: <20251001191653.1713923-1-alok.a.tiwari@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>