linux-toradex.git/drivers/block/zloop.c, branch v7.0-rc7

Merge tag 'block-7.0-20260227' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

2026-02-27T18:42:02+00:00

Pull block fixes from Jens Axboe:
 "Two sets of fixes, one for drbd, and one for the zoned loop driver"

* tag 'block-7.0-20260227' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  zloop: check for spurious options passed to remove
  zloop: advertise a volatile write cache
  drbd: fix null-pointer dereference on local read error
  drbd: Replace deprecated strcpy with strscpy
  drbd: fix "LOGIC BUG" in drbd_al_begin_io_nonblock()

zloop: check for spurious options passed to remove

2026-02-24T21:50:18+00:00

Zloop uses a command option parser for all control commands,
but most options are only valid for adding a new device.  Check
for incorrectly specified options in the remove handler.

Fixes: eb0570c7df23 ("block: new zoned loop block device driver")
Signed-off-by: Christoph Hellwig 
Reviewed-by: Damien Le Moal 
Signed-off-by: Jens Axboe

zloop: advertise a volatile write cache

2026-02-24T21:50:18+00:00

Zloop is file system backed and thus needs to sync the underlying file
system to persist data.  Set BLK_FEAT_WRITE_CACHE so that the block
layer actually send flush commands, and fix the flush implementation
as sync_filesystem requires s_umount to be held and the code currently
misses that.

Fixes: eb0570c7df23 ("block: new zoned loop block device driver")
Signed-off-by: Christoph Hellwig 
Reviewed-by: Damien Le Moal 
Signed-off-by: Jens Axboe

Convert 'alloc_flex' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook

zloop: use READ_ONCE() to read lo->lo_state in queue_rq path

2025-12-15T16:32:42+00:00

In the queue_rq path, zlo->state is accessed without locking, and direct
access may read stale data. This patch uses READ_ONCE() to read
zlo->state and data_race() to silence code checkers, and changes all
assignments to use WRITE_ONCE().

Reviewed-by: Damien Le Moal 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Yongpeng Yang 
Signed-off-by: Jens Axboe

blk-mq: add blk_rq_nr_bvec() helper

2025-12-04T14:19:26+00:00

Add a new helper function blk_rq_nr_bvec() that returns the number of
bvecs in a request. This count represents the number of iterations
rq_for_each_bvec() would perform on a request.

Drivers need to pre-allocate bvec arrays before iterating through
a request's bvecs. Currently, they manually count bvecs using
rq_for_each_bvec() in a loop, which is repetitive. The new helper
centralizes this logic.

This pattern exists in loop and zloop drivers, where multi-bio requests
require copying bvecs into a contiguous array before creating
an iov_iter for file operations.

Update loop and zloop drivers to use the new helper, eliminating
duplicate code.

This patch also provides a clear API to avoid any potential misuse of
blk_nr_phys_segments() for calculating the bvecs since, one bvec can
have more than one segments and use of blk_nr_phys_segments() can
lead to extra memory allocation :-

[ 6155.673749] nullb_bio: 128K bio as ONE bvec: sector=0, size=131072
[ 6155.673846] null_blk: #### null_handle_data_transfer:1375
[ 6155.673850] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=2
[ 6155.674263] null_blk: #### null_handle_data_transfer:1375
[ 6155.674267] null_blk: nr_bvec=1 blk_rq_nr_phys_segments=1

Reviewed-by: Niklas Cassel 
Signed-off-by: Chaitanya Kulkarni 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jens Axboe

zloop: clear nowait flag in workqueue context

2025-11-20T14:42:32+00:00

The zloop driver advertises REQ_NOWAIT support through BLK_FEAT_NOWAIT
(enabled by default for all blk-mq devices), and honors the nowait
behavior throughout zloop_queue_rq().

However, actual I/O to the backing file is performed in a workqueue,
where blocking is allowed.

To avoid imposing unnecessary non-blocking constraints in this blocking
context, clear the REQ_NOWAIT flag before processing the request in the
workqueue context.

Signed-off-by: Chaitanya Kulkarni 
Reviewed-by: Damien Le Moal 
Signed-off-by: Jens Axboe

zloop: fix zone append check in zloop_rw()

2025-11-19T14:39:42+00:00

While commit cf28f6f923cb ("zloop: fail zone append operations that are
targeting full zones") added a check in zloop_rw() that a zone append is
not issued to a full zone, commit e3a96ca90462 ("zloop: simplify checks
for writes to sequential zones") inadvertently removed the check to
verify that there is enough unwritten space in a zone for an incoming
zone append opration.

Re-add this check in zloop_rw() to make sure we do not write beyond the
end of a zone. Of note is that this same check is already present in the
function zloop_set_zone_append_sector() when ordered zone append is in
use.

Reported-by: Hans Holmberg 
Fixes: e3a96ca90462 ("zloop: simplify checks for writes to sequential zones")
Signed-off-by: Damien Le Moal 
Reviewed-by: Hans Holmberg 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jens Axboe

zloop: introduce the ordered_zone_append configuration parameter

2025-11-17T16:40:09+00:00

The zone append operation processing for zloop devices is similar to any
other command, that is, the operation is processed as a command work
item, without any special serialization between the work items (beside
the zone mutex for mutually exclusive code sections).

This processing is fine and gives excellent performance. However, it has
a side effect: zone append operation are very often reordered and
processed in a sequence that is very different from their issuing order
by the user. This effect is very visible using an XFS file system on top
of a zloop device. A simple file write leads to many file extents as the
data writes using zone append are reordered and so result in the
physical order being different than the file logical order.
E.g. executing:

$ dd if=/dev/zero of=/mnt/test bs=1M count=10 && sync
$ xfs_bmap /mnt/test
/mnt/test:
	0: [0..4095]: 2162688..2166783
	1: [4096..6143]: 2168832..2170879
	2: [6144..8191]: 2166784..2168831
	3: [8192..10239]: 2170880..2172927
	4: [10240..12287]: 2174976..2177023
	5: [12288..14335]: 2172928..2174975
	6: [14336..20479]: 2177024..2183167

For 10 IOs, 6 extents are created.

This is fine and actually allows to exercise XFS zone garbage collection
very well. However, this also makes debugging/working on XFS data
placement harder as the underlying device will most of the time reorder
IOs, resulting in many file extents.

Allow a user to mitigate this with the new ordered_zone_append
configuration parameter. For a zloop device created with this parameter
specified, the sector of a zone append command is set early, when the
command is submitted by the block layer with the zloop_queue_rq()
function, instead of in the zloop_rw() function which is exectued later
in the command work item context. This change ensures that more often
than not, zone append operations data end up being written in the same
order as the command submission by the user.

In the case of XFS, this leads to far less file data extents. E.g., for
the previous example, we get a single file data extent for the written
file.

$ dd if=/dev/zero of=/mnt/test bs=1M count=10 && sync
$ xfs_bmap /mnt/test
/mnt/test:
	0: [0..20479]: 2162688..2183167

Since we cannot use a mutex in the context of the zloop_queue_rq()
function to atomically set a zone append operation sector to the target
zone write pointer location and increment that the write pointer, a new
per-zone spinlock is introduced to protect a zone write pointer access
and modifications. To check a zone write pointer location and set a zone
append operation target sector to that value, the function
zloop_set_zone_append_sector() is introduced and called from
zloop_queue_rq().

Signed-off-by: Damien Le Moal 
Signed-off-by: Jens Axboe