<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/uapi/linux/io_uring.h, branch v6.3-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2023-02-21T19:07:23+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-02-21T19:07:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4a7d37e824f57dbace61abf62f53843800bd245c'/>
<id>4a7d37e824f57dbace61abf62f53843800bd245c</id>
<content type='text'>
Pull hardening updates from Kees Cook:
 "Beyond some specific LoadPin, UBSAN, and fortify features, there are
  other fixes scattered around in various subsystems where maintainers
  were okay with me carrying them in my tree or were non-responsive but
  the patches were reviewed by others:

   - Replace 0-length and 1-element arrays with flexible arrays in
     various subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees
     Cook)

   - randstruct: Disable Clang 15 support (Eric Biggers)

   - GCC plugins: Drop -std=gnu++11 flag (Sam James)

   - strpbrk(): Refactor to use strchr() (Andy Shevchenko)

   - LoadPin LSM: Allow root filesystem switching when non-enforcing

   - fortify: Use dynamic object size hints when available

   - ext4: Fix CFI function prototype mismatch

   - Nouveau: Fix DP buffer size arguments

   - hisilicon: Wipe entire crypto DMA pool on error

   - coda: Fully allocate sig_inputArgs

   - UBSAN: Improve arm64 trap code reporting

   - copy_struct_from_user(): Add minimum bounds check on kernel buffer
     size"

* tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randstruct: disable Clang 15 support
  uaccess: Add minimum bounds check on kernel buffer size
  arm64: Support Clang UBSAN trap codes for better reporting
  coda: Avoid partial allocation of sig_inputArgs
  gcc-plugins: drop -std=gnu++11 to fix GCC 13 build
  lib/string: Use strchr() in strpbrk()
  crypto: hisilicon: Wipe entire pool on error
  net/i40e: Replace 0-length array with flexible array
  io_uring: Replace 0-length array with flexible array
  ext4: Fix function prototype mismatch for ext4_feat_ktype
  i915/gvt: Replace one-element array with flexible-array member
  drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size
  LoadPin: Allow filesystem switch when not enforcing
  LoadPin: Move pin reporting cleanly out of locking
  LoadPin: Refactor sysctl initialization
  LoadPin: Refactor read-only check into a helper
  ARM: ixp4xx: Replace 0-length arrays with flexible arrays
  fortify: Use __builtin_dynamic_object_size() when available
  rxrpc: replace zero-lenth array with DECLARE_FLEX_ARRAY() helper
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull hardening updates from Kees Cook:
 "Beyond some specific LoadPin, UBSAN, and fortify features, there are
  other fixes scattered around in various subsystems where maintainers
  were okay with me carrying them in my tree or were non-responsive but
  the patches were reviewed by others:

   - Replace 0-length and 1-element arrays with flexible arrays in
     various subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees
     Cook)

   - randstruct: Disable Clang 15 support (Eric Biggers)

   - GCC plugins: Drop -std=gnu++11 flag (Sam James)

   - strpbrk(): Refactor to use strchr() (Andy Shevchenko)

   - LoadPin LSM: Allow root filesystem switching when non-enforcing

   - fortify: Use dynamic object size hints when available

   - ext4: Fix CFI function prototype mismatch

   - Nouveau: Fix DP buffer size arguments

   - hisilicon: Wipe entire crypto DMA pool on error

   - coda: Fully allocate sig_inputArgs

   - UBSAN: Improve arm64 trap code reporting

   - copy_struct_from_user(): Add minimum bounds check on kernel buffer
     size"

* tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randstruct: disable Clang 15 support
  uaccess: Add minimum bounds check on kernel buffer size
  arm64: Support Clang UBSAN trap codes for better reporting
  coda: Avoid partial allocation of sig_inputArgs
  gcc-plugins: drop -std=gnu++11 to fix GCC 13 build
  lib/string: Use strchr() in strpbrk()
  crypto: hisilicon: Wipe entire pool on error
  net/i40e: Replace 0-length array with flexible array
  io_uring: Replace 0-length array with flexible array
  ext4: Fix function prototype mismatch for ext4_feat_ktype
  i915/gvt: Replace one-element array with flexible-array member
  drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size
  LoadPin: Allow filesystem switch when not enforcing
  LoadPin: Move pin reporting cleanly out of locking
  LoadPin: Refactor sysctl initialization
  LoadPin: Refactor read-only check into a helper
  ARM: ixp4xx: Replace 0-length arrays with flexible arrays
  fortify: Use __builtin_dynamic_object_size() when available
  rxrpc: replace zero-lenth array with DECLARE_FLEX_ARRAY() helper
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: Support calling io_uring_register with a registered ring fd</title>
<updated>2023-02-16T13:09:30+00:00</updated>
<author>
<name>Josh Triplett</name>
<email>josh@joshtriplett.org</email>
</author>
<published>2023-02-15T00:42:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7d3fd88d61a41016da01889f076fd1c60c7298fc'/>
<id>7d3fd88d61a41016da01889f076fd1c60c7298fc</id>
<content type='text'>
Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
of the opcode) to treat the fd as a registered index rather than a file
descriptor.

This makes it possible for a library to open an io_uring, register the
ring fd, close the ring fd, and subsequently use the ring entirely via
registered index.

Signed-off-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
Link: https://lore.kernel.org/r/f2396369e638284586b069dbddffb8c992afba95.1676419314.git.josh@joshtriplett.org
[axboe: remove extra high bit clear]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
of the opcode) to treat the fd as a registered index rather than a file
descriptor.

This makes it possible for a library to open an io_uring, register the
ring fd, close the ring fd, and subsequently use the ring entirely via
registered index.

Signed-off-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
Link: https://lore.kernel.org/r/f2396369e638284586b069dbddffb8c992afba95.1676419314.git.josh@joshtriplett.org
[axboe: remove extra high bit clear]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/msg_ring: Pass custom flags to the cqe</title>
<updated>2023-01-29T22:17:40+00:00</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2023-01-03T16:05:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cbeb47a7b5f003429ded32b1fb3a7108ce5c1b54'/>
<id>cbeb47a7b5f003429ded32b1fb3a7108ce5c1b54</id>
<content type='text'>
This patch adds a new flag (IORING_MSG_RING_FLAGS_PASS) in the message
ring operations (IORING_OP_MSG_RING). This new flag enables the sender
to specify custom flags, which will be copied over to cqe-&gt;flags in the
receiving ring.  These custom flags should be specified using the
sqe-&gt;file_index field.

This mechanism provides additional flexibility when sending messages
between rings.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Link: https://lore.kernel.org/r/20230103160507.617416-1-leitao@debian.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds a new flag (IORING_MSG_RING_FLAGS_PASS) in the message
ring operations (IORING_OP_MSG_RING). This new flag enables the sender
to specify custom flags, which will be copied over to cqe-&gt;flags in the
receiving ring.  These custom flags should be specified using the
sqe-&gt;file_index field.

This mechanism provides additional flexibility when sending messages
between rings.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Link: https://lore.kernel.org/r/20230103160507.617416-1-leitao@debian.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: Replace 0-length array with flexible array</title>
<updated>2023-01-27T19:42:57+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-01-05T19:05:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36632d062975a9ff4410c90dd6d37922b68d0920'/>
<id>36632d062975a9ff4410c90dd6d37922b68d0920</id>
<content type='text'>
Zero-length arrays are deprecated[1]. Replace struct io_uring_buf_ring's
"bufs" with a flexible array member. (How is the size of this array
verified?) Detected with GCC 13, using -fstrict-flex-arrays=3:

In function 'io_ring_buffer_select',
    inlined from 'io_buffer_select' at io_uring/kbuf.c:183:10:
io_uring/kbuf.c:141:23: warning: array subscript 255 is outside the bounds of an interior zero-length array 'struct io_uring_buf[0]' [-Wzero-length-bounds]
  141 |                 buf = &amp;br-&gt;bufs[head];
      |                       ^~~~~~~~~~~~~~~
In file included from include/linux/io_uring.h:7,
                 from io_uring/kbuf.c:10:
include/uapi/linux/io_uring.h: In function 'io_buffer_select':
include/uapi/linux/io_uring.h:628:41: note: while referencing 'bufs'
  628 |                 struct io_uring_buf     bufs[0];
      |                                         ^~~~

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Cc: "Gustavo A. R. Silva" &lt;gustavoars@kernel.org&gt;
Cc: stable@vger.kernel.org
Cc: io-uring@vger.kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Link: https://lore.kernel.org/r/20230105190507.gonna.131-kees@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Zero-length arrays are deprecated[1]. Replace struct io_uring_buf_ring's
"bufs" with a flexible array member. (How is the size of this array
verified?) Detected with GCC 13, using -fstrict-flex-arrays=3:

In function 'io_ring_buffer_select',
    inlined from 'io_buffer_select' at io_uring/kbuf.c:183:10:
io_uring/kbuf.c:141:23: warning: array subscript 255 is outside the bounds of an interior zero-length array 'struct io_uring_buf[0]' [-Wzero-length-bounds]
  141 |                 buf = &amp;br-&gt;bufs[head];
      |                       ^~~~~~~~~~~~~~~
In file included from include/linux/io_uring.h:7,
                 from io_uring/kbuf.c:10:
include/uapi/linux/io_uring.h: In function 'io_buffer_select':
include/uapi/linux/io_uring.h:628:41: note: while referencing 'bufs'
  628 |                 struct io_uring_buf     bufs[0];
      |                                         ^~~~

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Cc: "Gustavo A. R. Silva" &lt;gustavoars@kernel.org&gt;
Cc: stable@vger.kernel.org
Cc: io-uring@vger.kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Link: https://lore.kernel.org/r/20230105190507.gonna.131-kees@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>uapi:io_uring.h: allow linux/time_types.h to be skipped</title>
<updated>2022-12-27T14:32:51+00:00</updated>
<author>
<name>Stefan Metzmacher</name>
<email>metze@samba.org</email>
</author>
<published>2022-11-16T20:25:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9eb803402a2a83400c6c6afd900e3b7c87c06816'/>
<id>9eb803402a2a83400c6c6afd900e3b7c87c06816</id>
<content type='text'>
include/uapi/linux/io_uring.h is synced 1:1 into
liburing:src/include/liburing/io_uring.h.

liburing has a configure check to detect the need for
linux/time_types.h. It can opt-out by defining
UAPI_LINUX_IO_URING_H_SKIP_LINUX_TIME_TYPES_H

Fixes: 78a861b94959 ("io_uring: add sync cancelation API through io_uring_register()")
Link: https://github.com/axboe/liburing/issues/708
Link: https://github.com/axboe/liburing/pull/709
Link: https://lore.kernel.org/io-uring/20221115212614.1308132-1-ammar.faizi@intel.com/T/#m9f5dd571cd4f6a5dee84452dbbca3b92ba7a4091
CC: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Reviewed-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Link: https://lore.kernel.org/r/7071a0a1d751221538b20b63f9160094fc7e06f4.1668630247.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
include/uapi/linux/io_uring.h is synced 1:1 into
liburing:src/include/liburing/io_uring.h.

liburing has a configure check to detect the need for
linux/time_types.h. It can opt-out by defining
UAPI_LINUX_IO_URING_H_SKIP_LINUX_TIME_TYPES_H

Fixes: 78a861b94959 ("io_uring: add sync cancelation API through io_uring_register()")
Link: https://github.com/axboe/liburing/issues/708
Link: https://github.com/axboe/liburing/pull/709
Link: https://lore.kernel.org/io-uring/20221115212614.1308132-1-ammar.faizi@intel.com/T/#m9f5dd571cd4f6a5dee84452dbbca3b92ba7a4091
CC: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Reviewed-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Link: https://lore.kernel.org/r/7071a0a1d751221538b20b63f9160094fc7e06f4.1668630247.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/net: introduce IORING_SEND_ZC_REPORT_USAGE flag</title>
<updated>2022-11-21T14:38:31+00:00</updated>
<author>
<name>Stefan Metzmacher</name>
<email>metze@samba.org</email>
</author>
<published>2022-10-27T18:34:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e307e6698165ca6508ed42c69cb1be76c8eb6a3c'/>
<id>e307e6698165ca6508ed42c69cb1be76c8eb6a3c</id>
<content type='text'>
It might be useful for applications to detect if a zero copy transfer with
SEND[MSG]_ZC was actually possible or not. The application can fallback to
plain SEND[MSG] in order to avoid the overhead of two cqes per request. Or
it can generate a log message that could indicate to an administrator that
no zero copy was possible and could explain degraded performance.

Cc: stable@vger.kernel.org # 6.1
Link: https://lore.kernel.org/io-uring/fb6a7599-8a9b-15e5-9b64-6cd9d01c6ff4@gmail.com/T/#m2b0d9df94ce43b0e69e6c089bdff0ce6babbdfaa
Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Reviewed-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/8945b01756d902f5d5b0667f20b957ad3f742e5e.1666895626.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It might be useful for applications to detect if a zero copy transfer with
SEND[MSG]_ZC was actually possible or not. The application can fallback to
plain SEND[MSG] in order to avoid the overhead of two cqes per request. Or
it can generate a log message that could indicate to an administrator that
no zero copy was possible and could explain degraded performance.

Cc: stable@vger.kernel.org # 6.1
Link: https://lore.kernel.org/io-uring/fb6a7599-8a9b-15e5-9b64-6cd9d01c6ff4@gmail.com/T/#m2b0d9df94ce43b0e69e6c089bdff0ce6babbdfaa
Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Reviewed-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/8945b01756d902f5d5b0667f20b957ad3f742e5e.1666895626.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix typo in io_uring.h comment</title>
<updated>2022-11-06T20:17:27+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-11-06T20:17:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6dcabcd398946e2b0b776a8310291aeebe1ca0e6'/>
<id>6dcabcd398946e2b0b776a8310291aeebe1ca0e6</id>
<content type='text'>
Just a basic s/thig/this swap, fixing up a typo introduced by a commit
added in the 6.1 release.

Fixes: 9cda70f622cd ("io_uring: introduce fixed buffer support for io_uring_cmd")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just a basic s/thig/this swap, fixing up a typo introduced by a commit
added in the 6.1 release.

Fixes: 9cda70f622cd ("io_uring: introduce fixed buffer support for io_uring_cmd")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: introduce fixed buffer support for io_uring_cmd</title>
<updated>2022-09-30T13:50:59+00:00</updated>
<author>
<name>Anuj Gupta</name>
<email>anuj20.g@samsung.com</email>
</author>
<published>2022-09-30T06:27:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9cda70f622cdcf049521a9c2886e5fd8a90a0591'/>
<id>9cda70f622cdcf049521a9c2886e5fd8a90a0591</id>
<content type='text'>
Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring
command with previously registered buffers. User-space passes the buffer
index in sqe-&gt;buf_index, same as done in read/write variants that uses
fixed buffers.

Signed-off-by: Anuj Gupta &lt;anuj20.g@samsung.com&gt;
Signed-off-by: Kanchan Joshi &lt;joshi.k@samsung.com&gt;
Link: https://lore.kernel.org/r/20220930062749.152261-3-anuj20.g@samsung.com
[axboe: shuffle valid flags check before acting on it]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring
command with previously registered buffers. User-space passes the buffer
index in sqe-&gt;buf_index, same as done in read/write variants that uses
fixed buffers.

Signed-off-by: Anuj Gupta &lt;anuj20.g@samsung.com&gt;
Signed-off-by: Kanchan Joshi &lt;joshi.k@samsung.com&gt;
Link: https://lore.kernel.org/r/20220930062749.152261-3-anuj20.g@samsung.com
[axboe: shuffle valid flags check before acting on it]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/net: zerocopy sendmsg</title>
<updated>2022-09-21T19:15:02+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-09-21T11:17:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=493108d95f1464ccd101d4e5cfa7e93f1fc64d47'/>
<id>493108d95f1464ccd101d4e5cfa7e93f1fc64d47</id>
<content type='text'>
Add a zerocopy version of sendmsg.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/6aabc4bdfc0ec78df6ec9328137e394af9d4e7ef.1663668091.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a zerocopy version of sendmsg.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/6aabc4bdfc0ec78df6ec9328137e394af9d4e7ef.1663668091.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: add IORING_SETUP_DEFER_TASKRUN</title>
<updated>2022-09-21T16:30:42+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@fb.com</email>
</author>
<published>2022-08-30T12:50:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c0e0d6ba25f180ab76d3c18f8b360a119dffa634'/>
<id>c0e0d6ba25f180ab76d3c18f8b360a119dffa634</id>
<content type='text'>
Allow deferring async tasks until the user calls io_uring_enter(2) with
the IORING_ENTER_GETEVENTS flag. Enable this mode with a flag at
io_uring_setup time. This functionality requires that the later
io_uring_enter will be called from the same submission task, and therefore
restrict this flag to work only when IORING_SETUP_SINGLE_ISSUER is also
set.

Being able to hand pick when tasks are run prevents the problem where
there is current work to be done, however task work runs anyway.

For example, a common workload would obtain a batch of CQEs, and process
each one. Interrupting this to additional taskwork would add latency but
not gain anything. If instead task work is deferred to just before more
CQEs are obtained then no additional latency is added.

The way this is implemented is by trying to keep task work local to a
io_ring_ctx, rather than to the submission task. This is required, as the
application will want to wake up only a single io_ring_ctx at a time to
process work, and so the lists of work have to be kept separate.

This has some other benefits like not having to check the task continually
in handle_tw_list (and potentially unlocking/locking those), and reducing
locks in the submit &amp; process completions path.

There are networking cases where using this option can reduce request
latency by 50%. For example a contrived example using [1] where the client
sends 2k data and receives the same data back while doing some system
calls (to trigger task work) shows this reduction. The reason ends up
being that if sending responses is delayed by processing task work, then
the client side sits idle. Whereas reordering the sends first means that
the client runs it's workload in parallel with the local task work.

[1]:
Using https://github.com/DylanZA/netbench/tree/defer_run
Client:
./netbench  --client_only 1 --control_port 10000 --host &lt;host&gt; --tx "epoll --threads 16 --per_thread 1 --size 2048 --resp 2048 --workload 1000"
Server:
./netbench  --server_only 1 --control_port 10000  --rx "io_uring --defer_taskrun 0 --workload 100"   --rx "io_uring  --defer_taskrun 1 --workload 100"

Signed-off-by: Dylan Yudaken &lt;dylany@fb.com&gt;
Link: https://lore.kernel.org/r/20220830125013.570060-5-dylany@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow deferring async tasks until the user calls io_uring_enter(2) with
the IORING_ENTER_GETEVENTS flag. Enable this mode with a flag at
io_uring_setup time. This functionality requires that the later
io_uring_enter will be called from the same submission task, and therefore
restrict this flag to work only when IORING_SETUP_SINGLE_ISSUER is also
set.

Being able to hand pick when tasks are run prevents the problem where
there is current work to be done, however task work runs anyway.

For example, a common workload would obtain a batch of CQEs, and process
each one. Interrupting this to additional taskwork would add latency but
not gain anything. If instead task work is deferred to just before more
CQEs are obtained then no additional latency is added.

The way this is implemented is by trying to keep task work local to a
io_ring_ctx, rather than to the submission task. This is required, as the
application will want to wake up only a single io_ring_ctx at a time to
process work, and so the lists of work have to be kept separate.

This has some other benefits like not having to check the task continually
in handle_tw_list (and potentially unlocking/locking those), and reducing
locks in the submit &amp; process completions path.

There are networking cases where using this option can reduce request
latency by 50%. For example a contrived example using [1] where the client
sends 2k data and receives the same data back while doing some system
calls (to trigger task work) shows this reduction. The reason ends up
being that if sending responses is delayed by processing task work, then
the client side sits idle. Whereas reordering the sends first means that
the client runs it's workload in parallel with the local task work.

[1]:
Using https://github.com/DylanZA/netbench/tree/defer_run
Client:
./netbench  --client_only 1 --control_port 10000 --host &lt;host&gt; --tx "epoll --threads 16 --per_thread 1 --size 2048 --resp 2048 --workload 1000"
Server:
./netbench  --server_only 1 --control_port 10000  --rx "io_uring --defer_taskrun 0 --workload 100"   --rx "io_uring  --defer_taskrun 1 --workload 100"

Signed-off-by: Dylan Yudaken &lt;dylany@fb.com&gt;
Link: https://lore.kernel.org/r/20220830125013.570060-5-dylany@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
