<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/virtio_vsock.h, branch v6.12-rc4</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>vsock/virtio: add SIOCOUTQ support for all virtio based transports</title>
<updated>2024-08-02T08:20:28+00:00</updated>
<author>
<name>Luigi Leonardi</name>
<email>luigi.leonardi@outlook.com</email>
</author>
<published>2024-07-30T19:43:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e6ab45005772014ce49a693a63f928203c0cbbb0'/>
<id>e6ab45005772014ce49a693a63f928203c0cbbb0</id>
<content type='text'>
Introduce support for virtio_transport_unsent_bytes
ioctl for virtio_transport, vhost_vsock and vsock_loopback.

For all transports the unsent bytes counter is incremented
in virtio_transport_get_credit.

In virtio_transport (G2H) and in vhost-vsock (H2G) the counter
is decremented when the skbuff is consumed. In vsock_loopback the
same skbuff is passed from the transmitter to the receiver, so
the counter is decremented before queuing the skbuff to the
receiver.

Signed-off-by: Luigi Leonardi &lt;luigi.leonardi@outlook.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce support for virtio_transport_unsent_bytes
ioctl for virtio_transport, vhost_vsock and vsock_loopback.

For all transports the unsent bytes counter is incremented
in virtio_transport_get_credit.

In virtio_transport (G2H) and in vhost-vsock (H2G) the counter
is decremented when the skbuff is consumed. In vsock_loopback the
same skbuff is passed from the transmitter to the receiver, so
the counter is decremented before queuing the skbuff to the
receiver.

Signed-off-by: Luigi Leonardi &lt;luigi.leonardi@outlook.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio/vsock: send credit update during setting SO_RCVLOWAT</title>
<updated>2023-12-15T10:37:35+00:00</updated>
<author>
<name>Arseniy Krasnov</name>
<email>avkrasnov@salutedevices.com</email>
</author>
<published>2023-12-14T12:52:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0fe1798968115488c0c02f4633032a015b1faf97'/>
<id>0fe1798968115488c0c02f4633032a015b1faf97</id>
<content type='text'>
Send credit update message when SO_RCVLOWAT is updated and it is bigger
than number of bytes in rx queue. It is needed, because 'poll()' will
wait until number of bytes in rx queue will be not smaller than
O_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup
for tx/rx is possible: sender waits for free space and receiver is
waiting data in 'poll()'.

Rename 'set_rcvlowat' callback to 'notify_set_rcvlowat' and set
'sk-&gt;sk_rcvlowat' only in one place (i.e. 'vsock_set_rcvlowat'), so the
transport doesn't need to do it.

Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages")
Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Send credit update message when SO_RCVLOWAT is updated and it is bigger
than number of bytes in rx queue. It is needed, because 'poll()' will
wait until number of bytes in rx queue will be not smaller than
O_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup
for tx/rx is possible: sender waits for free space and receiver is
waiting data in 'poll()'.

Rename 'set_rcvlowat' callback to 'notify_set_rcvlowat' and set
'sk-&gt;sk_rcvlowat' only in one place (i.e. 'vsock_set_rcvlowat'), so the
transport doesn't need to do it.

Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages")
Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock/virtio: MSG_ZEROCOPY flag support</title>
<updated>2023-09-21T10:34:00+00:00</updated>
<author>
<name>Arseniy Krasnov</name>
<email>avkrasnov@salutedevices.com</email>
</author>
<published>2023-09-16T13:09:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=581512a6dc939ef122e49336626ae159f3b8a345'/>
<id>581512a6dc939ef122e49336626ae159f3b8a345</id>
<content type='text'>
This adds handling of MSG_ZEROCOPY flag on transmission path:

1) If this flag is set and zerocopy transmission is possible (enabled
   in socket options and transport allows zerocopy), then non-linear
   skb will be created and filled with the pages of user's buffer.
   Pages of user's buffer are locked in memory by 'get_user_pages()'.
2) Replaces way of skb owning: instead of 'skb_set_owner_sk_safe()' it
   calls 'skb_set_owner_w()'. Reason of this change is that
   '__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc' of socket, so
   to decrease this field correctly, proper skb destructor is needed:
   'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'.
3) Adds new callback to 'struct virtio_transport': 'can_msgzerocopy'.
   If this callback is set, then transport needs extra check to be able
   to send provided number of buffers in zerocopy mode. Currently, the
   only transport that needs this callback set is virtio, because this
   transport adds new buffers to the virtio queue and we need to check,
   that number of these buffers is less than size of the queue (it is
   required by virtio spec). vhost and loopback transports don't need
   this check.

Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds handling of MSG_ZEROCOPY flag on transmission path:

1) If this flag is set and zerocopy transmission is possible (enabled
   in socket options and transport allows zerocopy), then non-linear
   skb will be created and filled with the pages of user's buffer.
   Pages of user's buffer are locked in memory by 'get_user_pages()'.
2) Replaces way of skb owning: instead of 'skb_set_owner_sk_safe()' it
   calls 'skb_set_owner_w()'. Reason of this change is that
   '__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc' of socket, so
   to decrease this field correctly, proper skb destructor is needed:
   'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'.
3) Adds new callback to 'struct virtio_transport': 'can_msgzerocopy'.
   If this callback is set, then transport needs extra check to be able
   to send provided number of buffers in zerocopy mode. Currently, the
   only transport that needs this callback set is virtio, because this
   transport adds new buffers to the virtio queue and we need to check,
   that number of these buffers is less than size of the queue (it is
   required by virtio spec). vhost and loopback transports don't need
   this check.

Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock/virtio/vhost: read data from non-linear skb</title>
<updated>2023-09-21T10:34:00+00:00</updated>
<author>
<name>Arseniy Krasnov</name>
<email>avkrasnov@salutedevices.com</email>
</author>
<published>2023-09-16T13:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0df7cd3c13e44d01f9f28e29cbce74e2931b00fe'/>
<id>0df7cd3c13e44d01f9f28e29cbce74e2931b00fe</id>
<content type='text'>
This is preparation patch for MSG_ZEROCOPY support. It adds handling of
non-linear skbs by replacing direct calls of 'memcpy_to_msg()' with
'skb_copy_datagram_iter()'. Main advantage of the second one is that it
can handle paged part of the skb by using 'kmap()' on each page, but if
there are no pages in the skb, it behaves like simple copying to iov
iterator. This patch also adds new field to the control block of skb -
this value shows current offset in the skb to read next portion of data
(it doesn't matter linear it or not). Idea behind this field is that
'skb_copy_datagram_iter()' handles both types of skb internally - it
just needs an offset from which to copy data from the given skb. This
offset is incremented on each read from skb. This approach allows to
simplify handling of both linear and non-linear skbs, because for
linear skb we need to call 'skb_pull()' after reading data from it,
while in non-linear case we need to update 'data_len'.

Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is preparation patch for MSG_ZEROCOPY support. It adds handling of
non-linear skbs by replacing direct calls of 'memcpy_to_msg()' with
'skb_copy_datagram_iter()'. Main advantage of the second one is that it
can handle paged part of the skb by using 'kmap()' on each page, but if
there are no pages in the skb, it behaves like simple copying to iov
iterator. This patch also adds new field to the control block of skb -
this value shows current offset in the skb to read next portion of data
(it doesn't matter linear it or not). Idea behind this field is that
'skb_copy_datagram_iter()' handles both types of skb internally - it
just needs an offset from which to copy data from the given skb. This
offset is incremented on each read from skb. This approach allows to
simplify handling of both linear and non-linear skbs, because for
linear skb we need to call 'skb_pull()' after reading data from it,
while in non-linear case we need to update 'data_len'.

Signed-off-by: Arseniy Krasnov &lt;avkrasnov@salutedevices.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock: support sockmap</title>
<updated>2023-03-29T07:19:38+00:00</updated>
<author>
<name>Bobby Eshleman</name>
<email>bobby.eshleman@bytedance.com</email>
</author>
<published>2023-03-27T19:11:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=634f1a7110b439c65fd8a809171c1d2d28bcea6f'/>
<id>634f1a7110b439c65fd8a809171c1d2d28bcea6f</id>
<content type='text'>
This patch adds sockmap support for vsock sockets. It is intended to be
usable by all transports, but only the virtio and loopback transports
are implemented.

SOCK_STREAM, SOCK_DGRAM, and SOCK_SEQPACKET are all supported.

Signed-off-by: Bobby Eshleman &lt;bobby.eshleman@bytedance.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds sockmap support for vsock sockets. It is intended to be
usable by all transports, but only the virtio and loopback transports
are implemented.

SOCK_STREAM, SOCK_DGRAM, and SOCK_SEQPACKET are all supported.

Signed-off-by: Bobby Eshleman &lt;bobby.eshleman@bytedance.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio/vsock: replace virtio_vsock_pkt with sk_buff</title>
<updated>2023-01-16T13:26:33+00:00</updated>
<author>
<name>Bobby Eshleman</name>
<email>bobby.eshleman@bytedance.com</email>
</author>
<published>2023-01-13T22:21:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=71dc9ec9ac7d3eee785cdc986c3daeb821381e20'/>
<id>71dc9ec9ac7d3eee785cdc986c3daeb821381e20</id>
<content type='text'>
This commit changes virtio/vsock to use sk_buff instead of
virtio_vsock_pkt. Beyond better conforming to other net code, using
sk_buff allows vsock to use sk_buff-dependent features in the future
(such as sockmap) and improves throughput.

This patch introduces the following performance changes:

Tool: Uperf
Env: Phys Host + L1 Guest
Payload: 64k
Threads: 16
Test Runs: 10
Type: SOCK_STREAM
Before: commit b7bfaa761d760 ("Linux 6.2-rc3")

Before
------
g2h: 16.77Gb/s
h2g: 10.56Gb/s

After
-----
g2h: 21.04Gb/s
h2g: 10.76Gb/s

Signed-off-by: Bobby Eshleman &lt;bobby.eshleman@bytedance.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit changes virtio/vsock to use sk_buff instead of
virtio_vsock_pkt. Beyond better conforming to other net code, using
sk_buff allows vsock to use sk_buff-dependent features in the future
(such as sockmap) and improves throughput.

This patch introduces the following performance changes:

Tool: Uperf
Env: Phys Host + L1 Guest
Payload: 64k
Threads: 16
Test Runs: 10
Type: SOCK_STREAM
Before: commit b7bfaa761d760 ("Linux 6.2-rc3")

Before
------
g2h: 16.77Gb/s
h2g: 10.56Gb/s

After
-----
g2h: 21.04Gb/s
h2g: 10.76Gb/s

Signed-off-by: Bobby Eshleman &lt;bobby.eshleman@bytedance.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio/vsock: rest of SOCK_SEQPACKET support</title>
<updated>2021-06-11T20:32:47+00:00</updated>
<author>
<name>Arseny Krasnov</name>
<email>arseny.krasnov@kaspersky.com</email>
</author>
<published>2021-06-11T11:13:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9ac841f5e9f261245d9d2841ad123566bd160a6e'/>
<id>9ac841f5e9f261245d9d2841ad123566bd160a6e</id>
<content type='text'>
Small updates to make SOCK_SEQPACKET work:
1) Send SHUTDOWN on socket close for SEQPACKET type.
2) Set SEQPACKET packet type during send.
3) Set 'VIRTIO_VSOCK_SEQ_EOR' bit in flags for last
   packet of message.
4) Implement data check function for SEQPACKET.
5) Check for max datagram size.

Signed-off-by: Arseny Krasnov &lt;arseny.krasnov@kaspersky.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Small updates to make SOCK_SEQPACKET work:
1) Send SHUTDOWN on socket close for SEQPACKET type.
2) Set SEQPACKET packet type during send.
3) Set 'VIRTIO_VSOCK_SEQ_EOR' bit in flags for last
   packet of message.
4) Implement data check function for SEQPACKET.
5) Check for max datagram size.

Signed-off-by: Arseny Krasnov &lt;arseny.krasnov@kaspersky.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio/vsock: dequeue callback for SOCK_SEQPACKET</title>
<updated>2021-06-11T20:32:47+00:00</updated>
<author>
<name>Arseny Krasnov</name>
<email>arseny.krasnov@kaspersky.com</email>
</author>
<published>2021-06-11T11:12:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=44931195a5412a97c46d299227fbabad4e09010d'/>
<id>44931195a5412a97c46d299227fbabad4e09010d</id>
<content type='text'>
Callback fetches RW packets from rx queue of socket until whole record
is copied(if user's buffer is full, user is not woken up). This is done
to not stall sender, because if we wake up user and it leaves syscall,
nobody will send credit update for rest of record, and sender will wait
for next enter of read syscall at receiver's side. So if user buffer is
full, we just send credit update and drop data.

Signed-off-by: Arseny Krasnov &lt;arseny.krasnov@kaspersky.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Callback fetches RW packets from rx queue of socket until whole record
is copied(if user's buffer is full, user is not woken up). This is done
to not stall sender, because if we wake up user and it leaves syscall,
nobody will send credit update for rest of record, and sender will wait
for next enter of read syscall at receiver's side. So if user buffer is
full, we just send credit update and drop data.

Signed-off-by: Arseny Krasnov &lt;arseny.krasnov@kaspersky.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock/virtio: fix multiple packet delivery to monitoring devices</title>
<updated>2020-04-27T17:18:01+00:00</updated>
<author>
<name>Stefano Garzarella</name>
<email>sgarzare@redhat.com</email>
</author>
<published>2020-04-24T15:08:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a78d163978567adc2733465289293dad479d842a'/>
<id>a78d163978567adc2733465289293dad479d842a</id>
<content type='text'>
In virtio_transport.c, if the virtqueue is full, the transmitting
packet is queued up and it will be sent in the next iteration.
This causes the same packet to be delivered multiple times to
monitoring devices.

We want to continue to deliver packets to monitoring devices before
it is put in the virtqueue, to avoid that replies can appear in the
packet capture before the transmitted packet.

This patch fixes the issue, adding a new flag (tap_delivered) in
struct virtio_vsock_pkt, to check if the packet is already delivered
to monitoring devices.

In vhost/vsock.c, we are splitting packets, so we must set
'tap_delivered' to false when we queue up the same virtio_vsock_pkt
to handle the remaining bytes.

Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In virtio_transport.c, if the virtqueue is full, the transmitting
packet is queued up and it will be sent in the next iteration.
This causes the same packet to be delivered multiple times to
monitoring devices.

We want to continue to deliver packets to monitoring devices before
it is put in the virtqueue, to avoid that replies can appear in the
packet capture before the transmitted packet.

This patch fixes the issue, adding a new flag (tap_delivered) in
struct virtio_vsock_pkt, to check if the packet is already delivered
to monitoring devices.

In vhost/vsock.c, we are splitting packets, so we must set
'tap_delivered' to false when we queue up the same virtio_vsock_pkt
to handle the remaining bytes.

Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock: handle buffer_size sockopts in the core</title>
<updated>2019-11-15T02:12:18+00:00</updated>
<author>
<name>Stefano Garzarella</name>
<email>sgarzare@redhat.com</email>
</author>
<published>2019-11-14T09:57:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b9f2b0ffde0c9b666b2b1672eb468b8f805a9b97'/>
<id>b9f2b0ffde0c9b666b2b1672eb468b8f805a9b97</id>
<content type='text'>
virtio_transport and vmci_transport handle the buffer_size
sockopts in a very similar way.

In order to support multiple transports, this patch moves this
handling in the core to allow the user to change the options
also if the socket is not yet assigned to any transport.

This patch also adds the '.notify_buffer_size' callback in the
'struct virtio_transport' in order to inform the transport,
when the buffer_size is changed by the user. It is also useful
to limit the 'buffer_size' requested (e.g. virtio transports).

Acked-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
virtio_transport and vmci_transport handle the buffer_size
sockopts in a very similar way.

In order to support multiple transports, this patch moves this
handling in the core to allow the user to change the options
also if the socket is not yet assigned to any transport.

This patch also adds the '.notify_buffer_size' callback in the
'struct virtio_transport' in order to inform the transport,
when the buffer_size is changed by the user. It is also useful
to limit the 'buffer_size' requested (e.g. virtio transports).

Acked-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
