summaryrefslogtreecommitdiff
path: root/drivers/virtio
AgeCommit message (Collapse)Author
13 hoursMerge tag 'mm-stable-2026-04-13-21-45' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...
11 daysvirtio_balloon: set unspecified page reporting orderYuvraj Sakshith
virtio_balloon page reporting order is set to MAX_PAGE_ORDER implicitly as vb->prdev.order is never initialised and is auto-set to zero. Explicitly mention usage of default page order by making use of PAGE_REPORTING_ORDER_UNSPECIFIED fallback value. Link: https://lkml.kernel.org/r/20260303113032.3008371-3-yuvraj.sakshith@oss.qualcomm.com Signed-off-by: Yuvraj Sakshith <yuvraj.sakshith@oss.qualcomm.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Liu <wei.liu@kernel.org> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 daysmm: move pgscan, pgsteal, pgrefill to node statsJP Kobryn (Meta)
There are situations where reclaim kicks in on a system with free memory. One possible cause is a NUMA imbalance scenario where one or more nodes are under pressure. It would help if we could easily identify such nodes. Move the pgscan, pgsteal, and pgrefill counters from vm_event_item to node_stat_item to provide per-node reclaim visibility. With these counters as node stats, the values are now displayed in the per-node section of /proc/zoneinfo, which allows for quick identification of the affected nodes. /proc/vmstat continues to report the same counters, aggregated across all nodes. But the ordering of these items within the readout changes as they move from the vm events section to the node stats section. Memcg accounting of these counters is preserved. The relocated counters remain visible in memory.stat alongside the existing aggregate pgscan and pgsteal counters. However, this change affects how the global counters are accumulated. Previously, the global event count update was gated on !cgroup_reclaim(), excluding memcg-based reclaim from /proc/vmstat. Now that mod_lruvec_state() is being used to update the counters, the global counters will include all reclaim. This is consistent with how pgdemote counters are already tracked. Finally, the virtio_balloon driver is updated to use global_node_page_state() to fetch the counters, as they are no longer accessible through the vm_events array. Link: https://lkml.kernel.org/r/20260219235846.161910-1-jp.kobryn@linux.dev Signed-off-by: JP Kobryn <jp.kobryn@linux.dev> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@kernel.org> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-20dma-mapping: Clarify valid conditions for CPU cache line overlapLeon Romanovsky
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to better reflect that it is debugging aid to inform DMA core code that CPU cache line overlaps are allowed, and refine the documentation describing its use. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-3-1dde90a7f08b@nvidia.com
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-13Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - in-order support in virtio core - multiple address space support in vduse - fixes, cleanups all over the place, notably dma alignment fixes for non-cache-coherent systems * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (59 commits) vduse: avoid adding implicit padding vhost: fix caching attributes of MMIO regions by setting them explicitly vdpa/mlx5: update MAC address handling in mlx5_vdpa_set_attr() vdpa/mlx5: reuse common function for MAC address updates vdpa/mlx5: update mlx_features with driver state check crypto: virtio: Replace package id with numa node id crypto: virtio: Remove duplicated virtqueue_kick in virtio_crypto_skcipher_crypt_req crypto: virtio: Add spinlock protection with virtqueue notification Documentation: Add documentation for VDUSE Address Space IDs vduse: bump version number vduse: add vq group asid support vduse: merge tree search logic of IOTLB_GET_FD and IOTLB_GET_INFO ioctls vduse: take out allocations from vduse_dev_alloc_coherent vduse: remove unused vaddr parameter of vduse_domain_free_coherent vduse: refactor vdpa_dev_add for goto err handling vhost: forbid change vq groups ASID if DRIVER_OK is set vdpa: document set_group_asid thread safety vduse: return internal vq group struct as map token vduse: add vq group support vduse: add v1 API definition ...
2026-01-31mm: rename CONFIG_MEMORY_BALLOON -> CONFIG_BALLOONDavid Hildenbrand (Red Hat)
Let's make it consistent with the naming of the files but also with the naming of CONFIG_BALLOON_MIGRATION. While at it, add a "/* CONFIG_BALLOON */". Link: https://lkml.kernel.org/r/20260119230133.3551867-24-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm: rename CONFIG_BALLOON_COMPACTION to CONFIG_BALLOON_MIGRATIONDavid Hildenbrand (Red Hat)
While compaction depends on migration, the other direction is not the case. So let's make it clearer that this is all about migration of balloon pages. Adjust all comments/docs in the core to talk about "migration" instead of "compaction". While at it add some "/* CONFIG_BALLOON_MIGRATION */". Link: https://lkml.kernel.org/r/20260119230133.3551867-23-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm: rename balloon_compaction.(c|h) to balloon.(c|h)David Hildenbrand (Red Hat)
Even without CONFIG_BALLOON_COMPACTION this infrastructure implements basic list and page management for a memory balloon. Link: https://lkml.kernel.org/r/20260119230133.3551867-21-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31drivers/virtio/virtio_balloon: stop using balloon_page_push/pop()David Hildenbrand (Red Hat)
Let's stop using these functions so we can remove them. They look like belonging to the balloon API for managing the device balloon list when really they are just simple helpers only used by virtio-balloon. Let's just inline them and switch to a proper list_for_each_entry_safe(). Link: https://lkml.kernel.org/r/20260119230133.3551867-13-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm/balloon_compaction: centralize adjust_managed_page_count() handlingDavid Hildenbrand (Red Hat)
Let's centralize it, by allowing for the driver to enable this handling through a new flag (bool for now) in the balloon device info. Note that we now adjust the counter when adding/removing a page into the balloon list: when removing a page to deflate it, it will now happen before the driver communicated with hypervisor, not afterwards. This shouldn't make a difference in practice. Link: https://lkml.kernel.org/r/20260119230133.3551867-7-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31mm/balloon_compaction: centralize basic page migration handlingDavid Hildenbrand (Red Hat)
Let's update the balloon page references, the balloon page list, the BALLOON_MIGRATE counter and the isolated-pages counter in balloon_page_migrate(), after letting the balloon->migratepage() callback deal with the actual inflation+deflation. Note that we now perform the balloon list modifications outside of any implementation-specific locks: which is fine, there is nothing special about these page actions that the lock would be protecting. The old page is already no longer in the list (isolated) and the new page is not yet in the list. Let's use -ENOENT to communicate the special "inflation of new page failed after already deflating the old page" to balloon_page_migrate() so it can handle it accordingly. While at it, rename balloon->b_dev_info to make it match the other functions. Also, drop the comment above balloon_page_migrate(), which seems unnecessary. Link: https://lkml.kernel.org/r/20260119230133.3551867-6-david@kernel.org Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-08virtio_input: use virtqueue_add_inbuf_cache_clean for eventsMichael S. Tsirkin
The evts array contains 64 small (8-byte) input events that share cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled, this can trigger warnings about overlapping DMA mappings within the same cacheline. Previous patch isolated the array in its own cachelines, so the warnings are now spurious. Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not write into these cache lines, suppressing these warnings. Message-ID: <4c885b4046323f68cf5cadc7fbfb00216b11dd20.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2026-01-08virtio_input: fix DMA alignment for evtsMichael S. Tsirkin
On non-cache-coherent platforms, when a structure contains a buffer used for DMA alongside fields that the CPU writes to, cacheline sharing can cause data corruption. The evts array is used for DMA_FROM_DEVICE operations via virtqueue_add_inbuf(). The adjacent lock and ready fields are written by the CPU during normal operation. If these share cachelines with evts, CPU writes can corrupt DMA data. Add __dma_from_device_group_begin()/end() annotations to ensure evts is isolated in its own cachelines. Message-ID: <cd328233198a76618809bb5cd9a6ddcaa603a8a1.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2026-01-02virtio: add virtqueue_add_inbuf_cache_clean APIMichael S. Tsirkin
Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN to virtqueue operations. This suppresses DMA debug cacheline overlap warnings for buffers where proper cache management is ensured by the caller. Message-ID: <e50d38c974859e731e50bda7a0ee5691debf5bc4.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-12-31virtio_ring: add in order supportJason Wang
This patch implements in order support for both split virtqueue and packed virtqueue. Performance could be gained for the device where the memory access could be expensive (e.g vhost-net or a real PCI device): Benchmark with KVM guest: Vhost-net on the host: (pktgen + XDP_DROP): in_order=off | in_order=on | +% TX: 4.51Mpps | 5.30Mpps | +17% RX: 3.47Mpps | 3.61Mpps | + 4% Vhost-user(testpmd) on the host: (pktgen/XDP_DROP): For split virtqueue: in_order=off | in_order=on | +% TX: 5.60Mpps | 5.60Mpps | +0.0% RX: 9.16Mpps | 9.61Mpps | +4.9% For packed virtqueue: in_order=off | in_order=on | +% TX: 5.60Mpps | 5.70Mpps | +1.7% RX: 10.6Mpps | 10.8Mpps | +1.8% Benchmark also shows no performance impact for in_order=off for queue size with 256 and 1024. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-20-jasowang@redhat.com>
2025-12-31virtio_ring: factor out split detaching logicJason Wang
This patch factors out the split core detaching logic that could be reused by in order feature into a dedicated function. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-19-jasowang@redhat.com>
2025-12-31virtio_ring: factor out split indirect detaching logicJason Wang
Factor out the split indirect descriptor detaching logic in order to allow it to be reused by the in order support. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-18-jasowang@redhat.com>
2025-12-31virtio_ring: factor out core logic for updating last_used_idxJason Wang
Factor out the core logic for updating last_used_idx to be reused by the packed in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-17-jasowang@redhat.com>
2025-12-31virtio_ring: factor out core logic of buffer detachingJason Wang
Factor out core logic of buffer detaching and leave the free list management to the caller so in_order can just call the core logic. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-16-jasowang@redhat.com>
2025-12-31virtio_ring: determine descriptor flags at one timeJason Wang
Let's determine the last descriptor by counting the number of sg. This would be consistent with packed virtqueue implementation and ease the future in-order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-15-jasowang@redhat.com>
2025-12-31virtio_ring: introduce virtqueue opsJason Wang
This patch introduces virtqueue ops which is a set of callbacks that will be called for different queue layout or features. This would help to avoid branches for split/packed and will ease the future implementation like in order. Note that in order to eliminate the indirect calls this patch uses global array of const ops to allow compiler to avoid indirect branches. Tested with CONFIG_MITIGATION_RETPOLINE, no performance differences were noticed. Acked-by: Eugenio Pérez <eperezma@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-14-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use unsigned int for virtqueue_poll_packed()Jason Wang
Switch to use unsigned int for virtqueue_poll_packed() to match virtqueue_poll() and virtqueue_poll_split() and to ease the abstraction of the virtqueue ops. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-13-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for detach_unused_buf variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-12-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for disable_cb variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-11-jasowang@redhat.com>
2025-12-31virtio_ring: use vring_virtqueue for enable_cb_delayed variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-10-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-9-jasowang@redhat.com>
2025-12-31virtio: switch to use vring_virtqueue for virtqueue_get variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-8-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue_add variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-7-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-6-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue for virtqueue resize variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-5-jasowang@redhat.com>
2025-12-31virtio_ring: unify logic of virtqueue_poll() and more_used()Jason Wang
This patch unifies the logic of virtqueue_poll() and more_used() for better code reusing and ease the future in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-4-jasowang@redhat.com>
2025-12-31virtio_ring: switch to use vring_virtqueue in virtqueue_poll variantsJason Wang
Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-3-jasowang@redhat.com>
2025-12-31virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()Jason Wang
To be consistent with virtqueue_reset(). Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-2-jasowang@redhat.com>
2025-12-26virtio_ring: code cleanup in detach_buf_splitzhangdongchuan@eswincomputing.com
Since the return value of vring_unmap_one_split() is exactly vq->split.desc_extra[i].next, 'i = vq->split.desc_extra[i].next' is redundant. Assign vring_unmap_one_split() to i instead. Since vq->split.desc_extra is assigned to extra, use extra[i].next instead of vq->split.desc_extra[i].next to improve readability. No change in functionality. Signed-off-by: zhangdongchuan <zhangdongchuan@eswincomputing.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <202511261140162936986@eswincomputing.com>
2025-11-27virtio: clean up features qword/dword termsMichael S. Tsirkin
virtio pci uses word to mean "16 bits". mmio uses it to mean "32 bits". To avoid confusion, let's avoid the term in core virtio altogether. Just say U64 to mean "64 bit". Fixes: e7d4c1c5a546 ("virtio: introduce extended features") Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-ID: <ad53b7b6be87fc524f45abaeca0bb05fb3633397.1764225384.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27virtio_balloon: add WQ_PERCPU to alloc_workqueue usersMarco Crivellari
Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251107154917.313090-2-marco.crivellari@suse.com>
2025-11-27virtio: fix kernel-doc for mapping/free_coherent functionsKriish Sharma
Documentation build reported: WARNING: ./drivers/virtio/virtio_ring.c:3174 function parameter 'vaddr' not described in 'virtqueue_map_free_coherent' WARNING: ./drivers/virtio/virtio_ring.c:3308 expecting prototype for virtqueue_mapping_error(). Prototype was for virtqueue_map_mapping_error() instead The kernel-doc block for virtqueue_map_free_coherent() omitted the @vaddr parameter, and the kernel-doc header for virtqueue_map_mapping_error() used the wrong function name (virtqueue_mapping_error) instead of the actual function name. This change updates: - the function name in the comment to virtqueue_map_mapping_error() - adds the missing @vaddr description in the comment for virtqueue_map_free_coherent() Fixes: b41cb3bcf67f ("virtio: rename dma helpers") Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251110202920.2250244-1-kriish.sharma2006@gmail.com>
2025-11-27virtio_vdpa: fix misleading return in void functionAlok Tiwari
virtio_vdpa_set_status() is declared as returning void, but it used "return vdpa_set_status()" Since vdpa_set_status() also returns void, the return statement is unnecessary and misleading. Remove it. Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Message-Id: <20251001191653.1713923-1-alok.a.tiwari@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2025-10-04Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: "Just fixes and cleanups this time around. The mapping cleanups are preparing the ground for new features, though" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-vdpa: Drop redundant conversion to bool vduse: Use fixed 4KB bounce pages for non-4KB page size vduse: switch to use virtio map API instead of DMA API vdpa: introduce map ops vdpa: support virtio_map virtio: introduce map ops in virtio core virtio_ring: rename dma_handle to map_handle virtio: introduce virtio_map container union virtio: rename dma helpers virtio_ring: switch to use dma_{map|unmap}_page() virtio_ring: constify virtqueue pointer for DMA helpers virtio_balloon: Remove redundant __GFP_NOWARN vhost: vringh: Fix copy_to_iter return value check vhost: vringh: Modify the return value check
2025-10-03Merge tag 'dma-mapping-6.18-2025-09-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: - Refactoring of DMA mapping API to physical addresses as the primary interface instead of page+offset parameters This gets much closer to Matthew Wilcox's long term wish for struct-pageless IO to cacheable DRAM and is supporting memdesc project which seeks to substantially transform how struct page works. An advantage of this approach is the possibility of introducing DMA_ATTR_MMIO, which covers existing 'dma_map_resource' flow in the common paths, what in turn lets to use recently introduced dma_iova_link() API to map PCI P2P MMIO without creating struct page Developped by Leon Romanovsky and Jason Gunthorpe - Minor clean-up by Petr Tesarik and Qianfeng Rong * tag 'dma-mapping-6.18-2025-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: kmsan: fix missed kmsan_handle_dma() signature conversion mm/hmm: properly take MMIO path mm/hmm: migrate to physical address-based DMA mapping API dma-mapping: export new dma_*map_phys() interface xen: swiotlb: Open code map_resource callback dma-mapping: implement DMA_ATTR_MMIO for dma_(un)map_page_attrs() kmsan: convert kmsan_handle_dma to use physical addresses dma-mapping: convert dma_direct_*map_page to be phys_addr_t based iommu/dma: implement DMA_ATTR_MMIO for iommu_dma_(un)map_phys() iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys dma-debug: refactor to use physical addresses for page mapping iommu/dma: implement DMA_ATTR_MMIO for dma_iova_link(). dma-mapping: introduce new DMA attribute to indicate MMIO memory swiotlb: Remove redundant __GFP_NOWARN dma-direct: clean up the logic in __dma_direct_alloc_pages()
2025-10-01virtio-vdpa: Drop redundant conversion to boolXichao Zhao
The result of integer comparison already evaluates to bool. No need for explicit conversion. Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Message-Id: <20250818102848.578875-1-zhao.xichao@vivo.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-10-01vdpa: introduce map opsJason Wang
Virtio core allows the transport to provide device or transport specific mapping functions. This patch adds this support to vDPA. We can simply do this by allowing the vDPA parent to register a virtio_map_ops. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250924070045.10361-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01vdpa: support virtio_mapJason Wang
Virtio core switches from DMA device to virtio_map, let's do that as well for vDPA. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-8-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01virtio: introduce map ops in virtio coreJason Wang
This patch introduces map operations for virtio device. Virtio used to use DMA API which is not necessarily the case since some devices doesn't do DMA. Instead of using tricks and abusing DMA API, let's simply abstract the current mapping logic into a virtio specific mapping operations. For the device or transport that doesn't do DMA, they can implement their own mapping logic without the need to trick DMA core. In this case the mapping metadata is opaque to the virtio core that will be passed back to the transport or device specific map operations. For other devices, DMA API will still be used, so map token will still be the dma device to minimize the changeset and performance impact. The mapping operations are abstracted as a independent structure instead of reusing virtio_config_ops. This allows the transport can simply reuse the structure for lower layers like vDPA. A set of new mapping helpers were introduced for the device that want to do mapping by themselves. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-7-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01virtio_ring: rename dma_handle to map_handleJason Wang
Following patch will introduce virtio map operations which means the address is not necessarily used for DMA. Let's rename the dma_handle to map_handle first. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-6-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01virtio: introduce virtio_map container unionJason Wang
Following patch will introduce the mapping operations for virtio device. In order to achieve this, besides the dma device, virtio core needs to support a transport or device specific mapping metadata as well. So this patch introduces a union container of a dma device. The idea is the allow the transport layer to pass device specific mapping metadata which will be used as a parameter for the virtio mapping operations. For the transport or device that is using DMA, dma device is still being used. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-5-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2025-10-01virtio: rename dma helpersJason Wang
Following patch will introduce virtio mapping function to avoid abusing DMA API for device that doesn't do DMA. To ease the introduction, this patch rename "dma" to "map" for the current dma mapping helpers. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-4-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>