summaryrefslogtreecommitdiff
path: root/io_uring
AgeCommit message (Collapse)Author
2025-11-13io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRLPavel Begunkov
It'll be annoying and take enough of boilerplate code to implement new zcrx features as separate io_uring register opcode. Introduce IORING_REGISTER_ZCRX_CTRL that will multiplex such calls to zcrx. Note, there are no real users of the opcode in this patch. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring/zcrx: elide passing msg flagsPavel Begunkov
zcrx sqe->msg_flags has never been defined and checked to be zero. It doesn't need to be a MSG_* bitmask. Keep them undefined, don't mix with MSG_DONTWAIT, and don't pass into io_zcrx_recv() as it's ignored anyway. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring/zcrx: use folio_nr_pages() instead of shift operationPedro Demarchi Gomes
folio_nr_pages() is a faster helper function to get the number of pages when NR_PAGES_IN_LARGE_FOLIO is enabled. Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring/zcrx: convert to use netmem_descPavel Begunkov
Convert zcrx to struct netmem_desc, and use struct net_iov::desc to access its fields instead of struct net_iov inner union alises. zcrx only directly reads niov->pp, so with this patch it doesn't depend on the union anymore. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Byungchul Park <byungchul@sk.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring/query: introduce rings info queryPavel Begunkov
Same problem as with zcrx in the previous patch, the user needs to know SQ/CQ header sizes to allocated memory before setup to use it for user provided rings, i.e. IORING_SETUP_NO_MMAP, however that information is only returned after registration, hence the user is guessing kernel implementation details. Return the header size and alignment, which is split with the same motivation, to allow the user to know the real structure size without alignment in case there will be more flexible placement schemes in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring/query: introduce zcrx queryPavel Begunkov
Add a new query type IO_URING_QUERY_ZCRX returning the user some basic information about the interface, which includes allowed flags for areas and registration and supported IORING_REGISTER_ZCRX_CTRL subcodes. There is also a chicken-egg problem with user provided refill queue memory, where offsets and size information is returned after registration, but to properly allocate memory you need to know it beforehand, which is why the userspace currently has to guess the RQ headers size and severely overestimates it. Return the size information. It's split into "size" and "alignment" fields because for default placement modes the user is interested in the aligned size, however if it gets support for more flexible placement, it'll need to only know the actual header size. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: move cq/sq user offset init aroundPavel Begunkov
Move user SQ/CQ offset initialisation at the end of io_prepare_config() where it already calculated all information to set it properly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: pre-calculate scq layoutPavel Begunkov
Move ring layouts calculations into io_prepare_config(), so that more misconfiguration checking can be done earlier before creating a ctx. It also deduplicates some code with ring resizing. And as a bonus, now it initialises params->sq_off.array, which is closer to all other user offset init, and also applies it to ring resizing, which was previously missing it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: keep ring laoyut in a structurePavel Begunkov
Add a structure keeping SQ/CQ sizes and offsets. For now it only records data previously returned from rings_size and the SQ size. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: introduce struct io_ctx_configPavel Begunkov
There will be more information needed during ctx setup, and instead of passing a handful of pointers around, wrap them all into a new structure. Add a helper for encapsulating all configuration checks and preparation, that's also reused for ring resizing. Note, it indirectly adds a io_uring_sanitise_params() check to ring resizing, which is a good thing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: convert params to pointer in ring reiszePavel Begunkov
The parameters in io_register_resize_rings() will be moved into another structure in a later patch. In preparation to that, convert the params variable it to a pointer, but still store the data on stack. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: use size_add helpers for ring offsetsPavel Begunkov
Use size_add / size_mul set of functions for rings_size() calculations. It's more consistent with struct_size(), and errors are preserved across a series of calculations, so intermediate result checks can be omitted. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13io_uring: refactor rings_size nosqarray handlingPavel Begunkov
A preparation patch inversing the IORING_SETUP_NO_SQARRAY check, this way there is only one successful return path from the function, which will be helpful later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-13Merge branch 'io_uring-6.18' into for-6.19/io_uringJens Axboe
Merge 6.18-rc io_uring fixes, as certain coming changes depend on some of these. * io_uring-6.18: io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_uring/query: return number of available queries io_uring/rw: ensure allocated iovec gets cleared for early failure io_uring: fix regbuf vector size truncation io_uring: fix types for region size calulation io_uring/zcrx: remove sync refill uapi io_uring: fix buffer auto-commit for multishot uring_cmd io_uring: correct __must_hold annotation in io_install_fixed_file io_uring zcrx: add MAINTAINERS entry io_uring: Fix code indentation error io_uring/sqpoll: be smarter on when to update the stime usage io_uring/sqpoll: switch away from getrusage() for CPU accounting io_uring: fix incorrect unlikely() usage in io_waitid_prep() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-12io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecsCaleb Sander Mateos
io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as the number of bvecs in the request. However, bvecs may be split into multiple segments depending on the queue limits. Thus, the number of segments may overestimate the number of bvecs. For ublk devices, the only current users of io_buffer_register_bvec(), virt_boundary_mask, seg_boundary_mask, max_segments, and max_segment_size can all be set arbitrarily by the ublk server process. Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec() loop actually yields. However, continue using blk_rq_nr_phys_segments() as an upper bound on the number of bvecs when allocating imu to avoid needing to iterate the bvecs a second time. Link: https://lore.kernel.org/io-uring/20251111191530.1268875-1-csander@purestorage.com/ Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 27cb27b6d5ea ("io_uring: add support for kernel registered bvecs") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring: move flags check to io_uring_sanitise_paramsPavel Begunkov
io_uring_sanitise_params() sanitises most of the setup flags invariants, move the IORING_SETUP_FLAGS check from io_uring_setup() into it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring: use mem_is_zero to check ring paramsPavel Begunkov
mem_is_zero() does the job without hand rolled loops, use that to verify reserved fields of ring params. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring: pass sq entries in the params structPavel Begunkov
There is no need to pass the user requested number of SQ entries separately from the main parameter structure io_uring_params. Initialise it at the beginning and stop passing it in favour of struct io_uring_params::sq_entries. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring: add helper calculating region byte sizePavel Begunkov
There has been type related issues with region size calculation, add an utility helper function that returns the size and handles type conversions right. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring/query: buffer size calculations with a unionPavel Begunkov
Instead of having an array of a calculated size as a buffer, put all query uapi structures into a union and pass that around. That way everything is well typed, and the compiler will prevent opcode handling using a structure not accounted into the buffer size. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11io_uring/zcrx: call netdev_queue_get_dma_dev() under instance lockDavid Wei
netdev ops must be called under instance lock or rtnl_lock, but io_register_zcrx_ifq() isn't doing this for netdev_queue_get_dma_dev(). Fix this by taking the instance lock using netdev_get_by_index_lock(). Extended the instance lock section to include attaching a memory provider. Could not move io_zcrx_create_area() outside, since the dmabuf codepath IORING_ZCRX_AREA_DMABUF requires ifq->dev. Fixes: 59b8b32ac8d4 ("io_uring/zcrx: add support for custom DMA devices") Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-10io_uring/query: return number of available queriesPavel Begunkov
It's useful to know which query opcodes are available. Extend the structure and return that. It's a trivial change, and even though it can be painlessly extended later, it'd still require adding a v2 of the structure. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-10io_uring/rw: ensure allocated iovec gets cleared for early failureJens Axboe
A previous commit reused the recyling infrastructure for early cleanup, but this is not enough for the case where our internal caches have overflowed. If this happens, then the allocated iovec can get leaked if the request is also aborted early. Reinstate the previous forced free of the iovec for that situation. Cc: stable@vger.kernel.org Reported-by: syzbot+3c93637d7648c24e1fd0@syzkaller.appspotmail.com Tested-by: syzbot+3c93637d7648c24e1fd0@syzkaller.appspotmail.com Fixes: 9ac273ae3dc2 ("io_uring/rw: use io_rw_recycle() from cleanup path") Link: https://lore.kernel.org/io-uring/69122a59.a70a0220.22f260.00fd.GAE@google.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-07io_uring: fix regbuf vector size truncationPavel Begunkov
There is a report of io_estimate_bvec_size() truncating the calculated number of segments that leads to corruption issues. Check it doesn't overflow "int"s used later. Rough but simple, can be improved on top. Cc: stable@vger.kernel.org Fixes: 9ef4cbbcb4ac3 ("io_uring: add infra for importing vectored reg buffers") Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-458654612@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Günther Noack <gnoack@google.com> Tested-by: Günther Noack <gnoack@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring: use WRITE_ONCE for user shared memoryPavel Begunkov
IORING_SETUP_NO_MMAP rings remain user accessible even before the ctx setup is finalised, so use WRITE_ONCE consistently when initialising rings. Fixes: 03d89a2de25bb ("io_uring: support for user allocated memory for rings/sqes") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/zcrx: reverse ifq refcountDavid Wei
Add a refcount to struct io_zcrx_ifq to reverse the refcounting relationship i.e. rings now reference ifqs instead. As a result of this, remove ctx->refs that an ifq holds on a ring via the page pool memory provider. This ref ifq->refs is held by internal users of an ifq, namely rings and the page pool memory provider associated with an ifq. This is needed to keep the ifq around until the page pool is destroyed. Since ifqs now no longer hold refs to ring ctx, there isn't a need to split the cleanup of ifqs into two: io_shutdown_zcrx_ifqs() in io_ring_exit_work() while waiting for ctx->refs to drop to 0, and io_unregister_zcrx_ifqs() after. Remove io_shutdown_zcrx_ifqs(). Signed-off-by: David Wei <dw@davidwei.uk> Co-developed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/zcrx: move io_unregister_zcrx_ifqs() downDavid Wei
In preparation for removing the ref on ctx->refs held by an ifq and removing io_shutdown_zcrx_ifqs(), move io_unregister_zcrx_ifqs() down such that it can call io_zcrx_scrub(). Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/zcrx: add user_struct and mm_struct to io_zcrx_ifqDavid Wei
In preparation for removing ifq->ctx and making ifq lifetime independent of ring ctx, add user_struct and mm_struct to io_zcrx_ifq. In the ifq cleanup path, these are the only fields used from the main ring ctx to do accounting. Taking a copy in the ifq allows ifq->ctx to be removed later, including the ctx->refs held by the ifq. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/zcrx: add io_zcrx_ifq arg to io_zcrx_free_area()David Wei
Add io_zcrx_ifq arg to io_zcrx_free_area(). A QOL change to reduce line widths. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/rsrc: refactor io_{un}account_mem() to take {user,mm}_struct paramDavid Wei
Refactor io_{un}account_mem() to take user_struct and mm_struct directly, instead of accessing it from the ring ctx. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/memmap: refactor io_free_region() to take user_struct paramDavid Wei
Refactor io_free_region() to take user_struct directly, instead of accessing it from the ring ctx. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-06io_uring/memmap: remove unneeded io_ring_ctx argDavid Wei
Remove io_ring_ctx arg from io_region_pin_pages() and io_region_allocate_pages() that isn't used. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-05io_uring/futex: move futexv owned status to struct io_futexv_dataJens Axboe
Free up a bit of space in the shared futex opcode private data, by moving the futexv specific futexv_owned out of there and into the struct specific to vectored futexes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-05io_uring/futex: move futexv async data handling to struct io_futexv_dataJens Axboe
Rather than alloc an array of struct futex_vector for the futexv wait handling, wrap it in a struct io_futexv_data struct, similar to what the non-vectored futex wait handling does. No functional changes in this patch. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-05io_uring: fix types for region size calulationPavel Begunkov
->nr_pages is int, it needs type extension before calculating the region size. Fixes: a90558b36ccee ("io_uring/memmap: helper for pinning region pages") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> [axboe: style fixup] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-05io_uring: fix typos and comment wordingAlok Tiwari
Corrected spelling mistakes in comments "reuqests" -> "requests", "noifications" -> "notifications", "seperately" -> "separately"). Fixed a small grammar issue ("then" -> "than"). Updated "flag" -> "flags" in fdinfo.c Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04io_uring/memmap: return bool from io_mem_alloc_compound()Caleb Sander Mateos
io_mem_alloc_compound() returns either ERR_PTR(-ENOMEM) or a virtual address for the allocated memory, but its caller just checks whether the result is an error. Return a bool success value instead. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04io_uring/cancel: move cancelation code from io_uring.c to cancel.cJens Axboe
There's a bunch of code strictly dealing with cancelations, and that code really belongs in cancel.c rather than in the core io_uring.c file. Move the code there. Mostly mechanical, only real oddity here is that struct io_defer_entry now needs to be visible across both io_uring.c and cancel.c. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04io_uring/cancel: move __io_uring_cancel() into cancel.cJens Axboe
Yet another function that should be in cancel.c, move it over. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04io_uring/cancel: move request/task cancelation logic into cancel.cJens Axboe
Move io_match_task_safe() and helpers into cancel.c, where it belongs. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04io_uring/memmap: remove dead io_create_region_mmap_safe() declarationJens Axboe
No longer used and doesn't even exist, kill it from the memmap header file. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring/rsrc: use get/put_user() for integer copyJens Axboe
It's just getting an integer from userspace, installing a file, then copying the output direct descriptor back. No need to use the full copy_to/from_user() for that. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring/slist: remove unused wq list splice helpersJens Axboe
Nobody is using those helpers anymore, get rid of them. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring/zcrx: remove sync refill uapiPavel Begunkov
There is a better way to handle the problem IORING_REGISTER_ZCRX_REFILL solves. The uapi can also be slightly adjusted to accommodate future extensions. Remove the feature for now, it'll be reworked for the next release. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring/uring_cmd: avoid double indirect call in task work dispatchCaleb Sander Mateos
io_uring task work dispatch makes an indirect call to struct io_kiocb's io_task_work.func field to allow running arbitrary task work functions. In the uring_cmd case, this calls io_uring_cmd_work(), which immediately makes another indirect call to struct io_uring_cmd's task_work_cb field. Change the uring_cmd task work callbacks to functions whose signatures match io_req_tw_func_t. Add a function io_uring_cmd_from_tw() to convert from the task work's struct io_tw_req argument to struct io_uring_cmd *. Define a constant IO_URING_CMD_TASK_WORK_ISSUE_FLAGS to avoid manufacturing issue_flags in the uring_cmd task work callbacks. Now uring_cmd task work dispatch makes a single indirect call to the uring_cmd implementation's callback. This also allows removing the task_work_cb field from struct io_uring_cmd, freeing up 8 bytes for future storage. Since fuse_uring_send_in_task() now has access to the io_tw_token_t, check its cancel field directly instead of relying on the IO_URING_F_TASK_DEAD issue flag. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring: add wrapper type for io_req_tw_func_t argCaleb Sander Mateos
In preparation for uring_cmd implementations to implement functions with the io_req_tw_func_t signature, introduce a wrapper struct io_tw_req to hide the struct io_kiocb * argument. The intention is for only the io_uring core to access the inner struct io_kiocb *. uring_cmd implementations should instead call a helper from io_uring/cmd.h to convert struct io_tw_req to struct io_uring_cmd *. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-03io_uring: only call io_should_terminate_tw() once for ctxCaleb Sander Mateos
io_fallback_req_func() calls io_should_terminate_tw() on each req's ctx. But since the reqs all come from the ctx's fallback_llist, req->ctx will be ctx for all of the reqs. Therefore, compute ts.cancel as io_should_terminate_tw(ctx) just once, outside the loop. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-30io_uring/fdinfo: validate opcode before checking if it's an 128b oneJens Axboe
The mixed SQE support assumes that userspace always passes valid data, that is not the case. Validate the opcode properly before indexing the io_issue_defs[] array, and pass it through the nospec indexing as well as it's a user valid indexing a kernel array. Fixes: 1cba30bf9fdd ("io_uring: add support for IORING_SETUP_SQE_MIXED") Reported-by: syzbot+b883b008a0b1067d5833@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-27io_uring/fdinfo: cap SQ iteration at max SQ entriesJens Axboe
A previous commit changed the logic around how SQ entries are iterated, and as a result, had a few bugs. One is that it fully trusts the SQ head and tail, which are user exposed. Another is that it fails to increment the SQ head if the SQ index is out of range. Fix both of those up, reverting to the previous logic of how to iterate SQ entries. Link: https://lore.kernel.org/io-uring/68ffdf18.050a0220.3344a1.039e.GAE@google.com/ Fixes: 1cba30bf9fdd ("io_uring: add support for IORING_SETUP_SQE_MIXED") Reported-by: syzbot+10a9b495f54a17b607a6@syzkaller.appspotmail.com Tested-by: syzbot+10a9b495f54a17b607a6@syzkaller.appspotmail.com Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-23io_uring: fix buffer auto-commit for multishot uring_cmdMing Lei
Commit 620a50c92700 ("io_uring: uring_cmd: add multishot support") added multishot uring_cmd support with explicit buffer upfront commit via io_uring_mshot_cmd_post_cqe(). However, the buffer selection path in io_ring_buffer_select() was auto-committing buffers for non-pollable files, which conflicts with uring_cmd's explicit upfront commit model. This way consumes the whole selected buffer immediately, and causes failure on the following buffer selection. Fix this by checking uring_cmd to identify operations that handle buffer commit explicitly, and skip auto-commit for these operations. Cc: Caleb Sander Mateos <csander@purestorage.com> Fixes: 620a50c92700 ("io_uring: uring_cmd: add multishot support") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>