linux-toradex.git/drivers/block, branch master

Merge tag 'mm-stable-2026-06-18-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-06-19T17:14:34+00:00

Pull MM updates from Andrew Morton:

 - "selftests/mm: clean up build output and verbosity" (Li Wang)

   Remove some noise from the MM selftests build

 - "mm: Free contiguous order-0 pages efficiently" (Ryan Roberts)

   Speed up the freeing of a batch of 0-order pages by first scanning
   them for coalescing opportunities. This is applicable to vfree() and
   to the releasing of frozen pages

 - "mm/damon: introduce DAMOS failed region quota charge ratio"
   (SeongJae Park)

   Address a DAMOS usability issue: The DAMOS quota often exhausts
   prematurely because it charges for all memory attempted, causing slow
   and inconsistent performance when actions fail on unreclaimable
   memory.

   To fix this, a new feature lets users set a smaller, flexible quota
   charge ratio (via a numerator and denominator) for failed regions.
   Since failed actions cause less overhead, reducing their quota cost
   ensures more predictable and efficient DAMOS processing

 - "selftests/cgroup: improve zswap tests robustness and support large
   page sizes" (Li Wang)

   Fix various spurious failures and improves the overall robustness of
   the cgroup zswap selftests

 - "fix MAP_DROPPABLE not supported errno" (Anthony Yznaga)

   Fix an issue in the mlock selftests on arm32

 - "mm: huge_memory: clean up defrag sysfs with shared" (Breno Leitao)

   Some maintenance work in the huge_memory code

 - "treewide: fixup gfp_t printks" (Brendan Jackman)

   Use the special vprintf() gfp_t conversion in various places

 - "mm: Fix vmemmap optimization accounting and initialization" (Muchun
   Song)

   Fix several bugs in the vmemmap optimization, mainly around incorrect
   page accounting and memmap initialization in the DAX and memory
   hotplug paths. It also fixes pageblock migratetype initialization and
   struct page initialization for ZONE_DEVICE compound pages

 - "mm/damon: repost non-hotfix reviewed patches in damon/next tree"

   A sprinkle of unrelated minor bugfixes for DAMON

 - "mm: remove page_mapped()" (David Hildenbrand)

   Remove this function from the tree, replacing it with folio_mapped()

 - "mm/damon: let DAMON be paused and resumed" (SeongJae Park)

   Allow DAMON to be paused and resumed without losing its current state

 - "kasan: hw_tags: Disable tagging for stack and page-tables" (Muhammad
   Usama Anjum)

   Simplify and speed up kasan by removing its ineffective tagging of
   stacks and page tables

 - "mm/damon/reclaim,lru_sort: monitor all system rams by default"
   (SeongJae Park)

   Simplify deployment on diverse hardware like NUMA systems by updating
   DAMON_RECLAIM and DAMON_LRU_SORT to automatically monitor the
   physical address range covering all System RAM areas by default,
   replacing the overly restrictive behavior that only targeted the
   single largest memory block to save on negligible overhead

 - "mm/damon/sysfs: document filters/ directory as deprecated" (SeongJae
   Park)

   Update some DAMON docs

 - "mm: use spinlock guards for zone lock" (Dmitry Ilvokhin)

   Switch zone->lock handling over to using the guard() mechanisms

 - "mm/filemap: tighten mmap_miss hit accounting" (fujunjie)

   Fix a flaw where the mmap_miss counter over-credited page cache hits
   during fault-arounds and page-fault retries. This results in
   significant reduction of redundant synchronous mmap readahead I/O,
   drastically cutting down execution time and gigabytes read for sparse
   random or strided memory access workloads

 - "selftests/cgroup: Fix false positive failures in test_percpu_basic"
   (Li Wang)

   Fix a couple of false-positives in the cgroup kmem selftests

 - "mm/damon/reclaim: support monitoring intervals auto-tuning"
   (SeongJae Park)

   Add a new parameter to DAMON permitting DAMON_RECLAIM to
   automatically tune DAMON's sampling and aggregation intervals

 - "mm/damon/stat: add kdamond_pid parameter" (SeongJae Park)

   Change DAMON_STAT to provide the pid of its kdamond

 - "mm/kmemleak: dedupe verbose scan output" (Breno Leitao)

   Remove large amounts of duplicated backtraces from the verbose-mode
   kmemleak output

 - "mm: remove CONFIG_HAVE_BOOTMEM_INFO_NODE (Part 1)" (David
   Hildenbrand)

   Reduce our use of CONFIG_HAVE_BOOTMEM_INFO_NODE, with a view to
   removing it entirely in a later series

 - "mm/damon: validate min_region_size to be power of 2" (Liew Rui Yan)

   Prevent users from passing a non-power-of-2 value of `addr_unit', as
   this later results in undesirable behavior

 - "mm: document read_pages and simplify usage" (Frederick Mayle)

 - "tools/mm/page-types: Fix misc bugs" (Ye Liu)

   Fix three issues in tools/mm/page-types.c

 - "mm: misc cleanups from __GFP_UNMAPPED series" (Brendan Jackman)

   Implement several cleanups in the page allocator and related code

 - "mm, swap: swap table phase IV: unify allocation" (Kairui Song)

   Unify the allocation and charging of anon and shmem swap in folios,
   provides better synchronization, consolidates the metadata
   management, hence dropping the static array and map, and improves
   performance

 - "mm/damon: introduce data attributes monitoring" (SeongJae Park(

   Extend DAMON to monitor general data attributes other than accesses

 - "mm/vmalloc: free unused pages on vrealloc() shrink" (Shivam Kalra)

   Implement the TODO in vrealloc() to unmap and free unused pages when
   shrinking across a page boundary

 - "mm/damon: documentation and comment fixes" (niecheng)

 - "remove mmap_action success, error hooks" (Lorenzo Stoakes)

   Eliminate custom hooks from mmap_action by removing the problematic
   success_hook which allowed drivers to improperly access uninitialized
   VMAs. It replaces the error_hook with a simple error-code field and
   updates the memory char driver accordingly

 - "mm/damon: minor improvements for code readability and tests"
   (SeongJae Park)

 - "mm/damon: fix macro arguments and clarify quota goals doc" (Maksym
   Shcherba)

 - "userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c" (Mike
   Rapoport)

 - "mm/mglru: improve reclaim loop and dirty folio" (Kairui Song and
   others)

   Clean up and slightly improves MGLRU's reclaim loop and dirty
   writeback handling. Large performance improvements are measured

 - "use vma locks for proc/pid/{smaps|numa_maps} reads" (Suren
   Baghdasaryan)

   Use per-vma locks when reading /proc/pid/smaps and numa_maps similar
   to reduce contention on central mmap_lock

 - "refactors thpsize_shmem_enabled_store() and thpsize_shmem_enabled_show()"
   (Ran Xiaokai)

   Some cleanup work in the THP code

 - "selftests/memfd: fix compilation warnings" (Konstantin Khorenko)

   Fix a few build glitches in the memfd selftest code.

 - "memcg: shrink obj_stock_pcp and cache multiple objcgs" (Shakeel
   Butt)

   Resolve a 68% performance regression caused by NUMA-node cache
   thrashing around struct obj_stock_pcp by shrinking its existing
   fields and expanding it into a multi-slot array that caches up to
   five obj_cgroup pointers per CPU, allowing per-node variants of the
   same memcg to coexist within a single 64-byte cache line.

 - "zram: writeback fixes" (Sergey Senozhatsky)

   address a couple of unrelated zram writeback issues

 - "mm: switch THP shrinker to list_lru" (Johannes Weiner)

   Resolve NUMA-awareness issues and streamlines callsite interaction by
   refactoring and extending the list_lru API to completely replace the
   complex, open-coded deferred split queue for Transparent Huge Pages

 - "mm: improve large folio readahead for exec memory" (Usama Arif)

   Improve large-folio readahead on systems like 64K-page arm64 by
   preventing the mmap_miss check from permanently disabling
   target-oriented VM_EXEC readahead, and by generalizing the
   force_thp_readahead gate to support mappings with any usefully large
   maximum folio order under the cache cap.

 - "userfaultfd/pagemap: pre-existing fixes" (Kiryl Shutsemau)

   Fix a bunch of minor issues in the userfaultfd/pagemap, all of which
   were flagged by Sashiko review of proposed new material

 - "mm/sparse-vmemmap: Provide generic vmemmap_set_pmd() and
   vmemmap_check_pmd()" (Muchun Song)

   Provide generic versions of these two functions so the four
   arch-specific implementations can be removed.

 - "mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap
   device" (Youngjun Park)

   Address a uswsusp-vs-swapoff race and reduces the swap device
   reference taking/releasing frequency.

 - "mm/hmm: A fix and a selftest" (Dev Jain)

* tag 'mm-stable-2026-06-18-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  selftests/mm/hmm-tests: test pagemap reads of PMD device-private entries
  fs/proc/task_mmu: do not warn on seeing non-migration pmd entry
  lib/test_hmm: check alloc_page_vma() return value and handle OOM
  mm/compaction: cap compact_gap() at COMPACT_CLUSTER_MAX
  mm/swap: remove redundant swap device reference in alloc/free
  mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap device
  mm/filemap: use folio_next_index() for start
  vmalloc: fix NULL pointer dereference in is_vm_area_hugepages()
  sparc/mm: drop vmemmap_check_pmd helper and use generic code
  loongarch/mm: drop vmemmap_check_pmd helper and use generic code
  riscv/mm: drop vmemmap_pmd helpers and use generic code
  arm64/mm: drop vmemmap_pmd helpers and use generic code
  mm/sparse-vmemmap: provide generic vmemmap_set_pmd() and vmemmap_check_pmd()
  rust: page: mark Page::nid as inline
  userfaultfd: build __VMA_UFFD_FLAGS from config-gated masks
  userfaultfd: gate must_wait writability check on pte_present()
  mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade
  fs/proc/task_mmu: fix hugetlb self-deadlock in pagemap_scan_pte_hole()
  fs/proc/task_mmu: use huge_page_size() in pagemap_scan_hugetlb_entry()
  fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race
  ...

Merge tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

2026-06-16T07:32:47+00:00

Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
     - Per-controller admin and IO timeout sysfs attributes, and
       letting the block layer set request timeouts (Maurizio,
       Maximilian)
     - Multipath passthrough iostats, and PCI P2PDMA enablement for
       multipath devices (Keith, Kiran)
     - A new diag sysfs attribute group exporting per-controller
       counters (retries, multipath failover, error counters, requeue
       and failure counts, reset and reconnect events) (Nilay)
     - FDP configuration validation and bounds check fixes (liuxixin)
     - Various nvmet fixes, including a pre-auth out-of-bounds read in
       the Discovery Get Log Page handler, auth payload bounds
       validation, and tcp error-path leak fixes (Bryam, Tianchu,
       Geliang)
     - nvme-tcp lockdep and workqueue fixes (Shin'ichiro, Kuniyuki,
       Eric)
     - Assorted other fixes and cleanups (John, Yao, Chao, Mateusz,
       Achkinazi, Wentao)

 - MD pull request via Yu Kuai:
     - raid1/raid10 fixes for a deadlock in the read error recovery
       path, error-path detection and bio accounting with cloned bios,
       and an nr_pending leak in the REQ_ATOMIC bad-block error path
       (Abd-Alrhman)
     - PCI P2PDMA propagation from member devices to the RAID device
       (Kiran)
     - dm-raid bio requeue fix, and various smaller fixes and cleanups
       (Benjamin, Chen, Li, Thorsten)

 - Enable Clang lock context analysis for the block layer, with the
   accompanying annotations across queue limits, the blk_holder_ops
   callbacks, crypto, cgroup, iocost, kyber and mq-deadline (Bart)

 - Block status code infrastructure work: a tagged status table, a
   str_to_blk_op() helper, a bio_endio_status() helper, and on top of
   that a new configurable block-layer error injection facility
   (Christoph)

 - DRBD netlink rework, replacing the genl_magic machinery with explicit
   netlink serialization and moving the DRBD UAPI headers to
   include/uapi/linux/ (Christoph Böhmwalder)

 - bvec improvements: a bvec_folio() helper and making the bvec_iter
   helpers proper inline functions (Willy, Christoph)

 - ublk cleanups and a canceling-flag fix for the disk-not-allocated
   case (Caleb, Ming)

 - Partition handling fixes: bound the AIX pp_count scan, fix an of_node
   refcount leak, and replace __get_free_page() with kmalloc() (Bryam,
   Wentao, Mike)

 - Convert numa_node to int in blk_mq_hw_ctx and ->init_request, and add
   WQ_PERCPU to the block workqueue users (Mateusz, Marco)

 - Block statistics and tracing: propagate in-flight to the whole disk
   on partition IO, export passthrough stats, and a new
   block_rq_tag_wait tracepoint (Tang, Keith, Aaron)

 - A round of removals, unexports and cleanups across bio, direct-io and
   the bvec helpers (Christoph)

 - Various driver fixes (mtip32xx use-after-free, rbd snap_count
   validation and strscpy conversion, nbd socket lockdep reclassify,
   virtio-blk zone report clamp, floppy) and a batch of MAINTAINERS
   email/list updates (Coly, Li, Yu, Christoph Böhmwalder)

 - Other little fixes and cleanups all over

* tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (117 commits)
  MAINTAINERS: Update Coly Li's email address
  block: check bio split for unaligned bvec
  nbd: Reclassify sockets to avoid lockdep circular dependency
  block: add configurable error injection
  block: add a str_to_blk_op helper
  block: add a "tag" for block status codes
  block: add a macro to initialize the status table
  floppy: Drop unused pnp driver data
  block: propagate in_flight to whole disk on partition I/O
  virtio-blk: clamp zone report to the report buffer capacity
  block: optimize I/O merge hot path with unlikely() hints
  drivers/block/rbd: Use strscpy() to copy strings into arrays
  partitions: aix: bound the pp_count scan to the ppe array
  block: Enable lock context analysis
  block/mq-deadline: Make the lock context annotations compatible with Clang
  block/Kyber: Make the lock context annotations compatible with Clang
  block/blk-mq-debugfs: Improve lock context annotations
  block/blk-iocost: Inline iocg_lock() and iocg_unlock()
  block/blk-iocost: Split ioc_rqos_throttle()
  block/crypto: Annotate the crypto functions
  ...

nbd: Reclassify sockets to avoid lockdep circular dependency

2026-06-13T12:34:34+00:00

syzbot reported a possible circular locking dependency in udp_sendmsg()
where fs_reclaim can be triggered while holding sk_lock, and fs_reclaim
can eventually depend on another sk_lock (e.g., if NBD is used for swap
or writeback and NBD uses TLS/TCP which acquires sk_lock).

Since the UDP socket and the NBD TCP/TLS socket are different, this is a
false positive. Fix this by reclassifying NBD sockets to a separate lock
class when they are added to the NBD device.

This is similar to what nvme-tcp and other network block devices do.

Fixes: ffa1e7ada456 ("block: Make request_queue lockdep splats show up earlier")
Reported-by: syzbot+607cdcf978b3e79da878@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a2cdafe.428ffe26.258b27.0161.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet 
Link: https://patch.msgid.link/20260613042619.1108126-1-edumazet@google.com
Signed-off-by: Jens Axboe

floppy: Drop unused pnp driver data

2026-06-10T12:23:51+00:00

The pnp_device_id array is only used for module data to support
auto-loading the floppy module. So the .driver_data member is unused and
this assignment can be dropped.

While touching that array, align the coding style to what is used most
for these.

This patch doesn't modify the compiled array, only its representation
in source form benefits. The former was confirmed with x86 and arm64
builds.

Signed-off-by: Uwe Kleine-König (The Capable Hub) 
Reviewed-by: Denis Efremov (Oracle) 
Link: https://patch.msgid.link/99dbf851ffb99229ea1dcfd8f58e9ee6a1f05349.1781075967.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jens Axboe

virtio-blk: clamp zone report to the report buffer capacity

2026-06-09T16:01:50+00:00

virtblk_report_zones() trusts the device-reported number of zones when
walking the report buffer:

	nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
		   nr_zones);
	...
	for (i = 0; i < nz && zone_idx < nr_zones; i++) {
		ret = virtblk_parse_zone(vblk, &report->zones[i], ...);

The buffer is allocated by virtblk_alloc_report_buffer(), whose size is
capped by the queue's max hardware sectors and max segments and can
therefore hold fewer descriptors than nr_zones. nz is bounded only by
the device-supplied report->nr_zones and the requested nr_zones, never
by the buffer's descriptor capacity. At probe time the request count is
unbounded (blk_revalidate_disk_zones() calls report_zones() with
nr_zones == UINT_MAX), so the device-supplied report->nr_zones is the
sole gate: a device that reports more zones than fit in the buffer
drives the loop to read report->zones[i] past the end of the allocation.

A malicious or buggy virtio-blk device that reports an inflated nr_zones
triggers this during zone revalidation at probe. KASAN reports a
vmalloc-out-of-bounds read in virtblk_report_zones() against the report
buffer allocated a few lines earlier.

Clamp nz to the number of descriptors that actually fit in the report
buffer.

Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
Link: https://patch.msgid.link/20260607124834.3059944-1-michael.bommarito@gmail.com
Signed-off-by: Jens Axboe

zram: drop unused bio parameter from write helpers

2026-06-09T01:21:27+00:00

After "zram: fix use-after-free in zram_bvec_write_partial()",
zram_bvec_write_partial() always passes NULL to zram_read_page() and no
longer needs the parent bio.  Mirror the read side
(zram_bvec_read_partial() has not taken a bio since commit 4e3c87b9421d
("zram: fix synchronous reads")) and drop the parameter from
zram_bvec_write_partial() and zram_bvec_write().

No functional change.

Link: https://lore.kernel.org/20260528-zram-v3-2-cab86eef8764@gmail.com
Signed-off-by: Cunlong Li 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Sergey Senozhatsky 
Cc: Jens Axboe 
Cc: Minchan Kim 
Cc: Yisheng Xie 
Signed-off-by: Andrew Morton

drivers/block/rbd: Use strscpy() to copy strings into arrays

2026-06-08T13:45:56+00:00

Replacing strcpy() with strscpy() ensures than overflow of the target
buffer cannot happen.

Signed-off-by: David Laight 
Reviewed-by: Alex Elder 
Link: https://patch.msgid.link/20260606202744.5113-5-david.laight.linux@gmail.com
Signed-off-by: Jens Axboe

zram: clear trailing bytes of compressed writeback pages

2026-06-04T21:45:07+00:00

Patch series "zram: writeback fixes", v2.

Brian (privately) reported a "leak" of writeback bitmap in certain cases,
so that backing device can store less pages; and a theoretical data leak
in the trailing bytes of compressed writeback pages.  Both issues are low
risk.


This patch (of 2):

When compressed writeback is available writtenback pages contain "garbage"
in PAGE_SIZE - obj_size trailing bytes.  That "garbage" is, basically,
whatever data that page held before we got it for writeback.  To get
advantage of it an attacker needs to be able to read from active backing
swap device, which is already catastrophic.  Still, just in case, zero out
those trailing bytes before writeback to a backing device so that we only
store swap-ed out data there.

Link: https://lore.kernel.org/20260526022754.2377730-1-senozhatsky@chromium.org
Link: https://lore.kernel.org/20260526022754.2377730-3-senozhatsky@chromium.org
Fixes: d38fab605c66 ("zram: introduce compressed data writeback")
Signed-off-by: Sergey Senozhatsky 
Suggested-by: Brian Geffon 
Cc: Jens Axboe 
Cc: Minchan Kim 
Cc: Richard Chang 
Signed-off-by: Andrew Morton

zram: do not leak blk idx at the end of writeback

2026-06-04T21:45:07+00:00

zram_writeback_slots() loop can terminate with valid reserved backing
device blk_idx.  The problem is that cleanup code doesn't release that
reserved blk_idx before zram_writeback_slots() returns, which leads to
blk_idx leak (it becomes permanently busy and can not be used for actual
writeback.) This does not lead to any system instabilities, it only means
that we can writeback less pages.  The scenario is hard to hit in practice
as it requires writeabck to race with modification (slot-free or
overwrite) of the final post-processing slot.

Release reserved but unused blk_idx before returning from
zram_writeback_slots().

Link: https://lore.kernel.org/20260526022754.2377730-2-senozhatsky@chromium.org
Fixes: f405066a1f0db ("zram: introduce writeback bio batching")
Signed-off-by: Sergey Senozhatsky 
Suggested-by: Brian Geffon 
Cc: Jens Axboe 
Cc: Minchan Kim 
Cc: Richard Chang 
Signed-off-by: Andrew Morton

zram: fix use-after-free in zram_bvec_write_partial()

2026-06-03T23:25:50+00:00

zram_read_page() picks the sync or async backing device read path based on
whether the parent bio is NULL.  zram_bvec_write_partial() passes its
parent bio down, so for ZRAM_WB slots the read is dispatched
asynchronously and zram_read_page() returns 0 while the bio is still in
flight.  The caller then runs memcpy_from_bvec(), zram_write_page() and
__free_page() on the buffer, leaving the async read to write into a freed
page.

zram_bvec_read_partial() was switched to NULL in commit 4e3c87b9421d
("zram: fix synchronous reads") for the same reason; the write_partial
counterpart was missed.

Link: https://lore.kernel.org/20260528-zram-v3-1-cab86eef8764@gmail.com
Fixes: 8e654f8fbff5 ("zram: read page from backing device")
Reviewed-by: Christoph Hellwig 
Reviewed-by: Sergey Senozhatsky 
Signed-off-by: Cunlong Li 
Cc: Jens Axboe 
Cc: Minchan Kim 
Cc: Yisheng Xie 
Cc: 
Signed-off-by: Andrew Morton