<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/vmw_vsock, branch v4.12</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>vsock: use new wait API for vsock_stream_sendmsg()</title>
<updated>2017-05-22T18:39:36+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2017-05-19T18:21:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=499fde662f1957e3cb8d192a94a099ebe19c714b'/>
<id>499fde662f1957e3cb8d192a94a099ebe19c714b</id>
<content type='text'>
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():

  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock

Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").

Reported-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Cc: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Cc: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.vnet.ibm.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():

  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock

Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").

Reported-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Cc: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Cc: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.vnet.ibm.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2017-05-10T18:33:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-05-10T18:33:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c44b59430393c38873fd933333d945f426857a59'/>
<id>c44b59430393c38873fd933333d945f426857a59</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:
 "Fixes, cleanups, performance

  A bunch of changes to virtio, most affecting virtio net. Also ptr_ring
  batched zeroing - first of batching enhancements that seems ready."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  s390/virtio: change maintainership
  tools/virtio: fix spelling mistake: "wakeus" -&gt; "wakeups"
  virtio_net: tidy a couple debug statements
  ptr_ring: support testing different batching sizes
  ringtest: support test specific parameters
  ptr_ring: batch ring zeroing
  virtio: virtio_driver doc
  virtio_net: don't reset twice on XDP on/off
  virtio_net: fix support for small rings
  virtio_net: reduce alignment for buffers
  virtio_net: rework mergeable buffer handling
  virtio_net: allow specifying context for rx
  virtio: allow extra context per descriptor
  tools/virtio: fix build breakage
  virtio: add context flag to find vqs
  virtio: wrap find_vqs
  ringtest: fix an assert statement
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull virtio updates from Michael Tsirkin:
 "Fixes, cleanups, performance

  A bunch of changes to virtio, most affecting virtio net. Also ptr_ring
  batched zeroing - first of batching enhancements that seems ready."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  s390/virtio: change maintainership
  tools/virtio: fix spelling mistake: "wakeus" -&gt; "wakeups"
  virtio_net: tidy a couple debug statements
  ptr_ring: support testing different batching sizes
  ringtest: support test specific parameters
  ptr_ring: batch ring zeroing
  virtio: virtio_driver doc
  virtio_net: don't reset twice on XDP on/off
  virtio_net: fix support for small rings
  virtio_net: reduce alignment for buffers
  virtio_net: rework mergeable buffer handling
  virtio_net: allow specifying context for rx
  virtio: allow extra context per descriptor
  tools/virtio: fix build breakage
  virtio: add context flag to find vqs
  virtio: wrap find_vqs
  ringtest: fix an assert statement
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio: wrap find_vqs</title>
<updated>2017-05-02T20:41:42+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2017-03-06T16:19:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9b2bbdb227588455afcc3b03475fa9b0a35d83af'/>
<id>9b2bbdb227588455afcc3b03475fa9b0a35d83af</id>
<content type='text'>
We are going to add more parameters to find_vqs, let's wrap the call so
we don't need to tweak all drivers every time.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We are going to add more parameters to find_vqs, let's wrap the call so
we don't need to tweak all drivers every time.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>VSOCK: Add virtio vsock vsockmon hooks</title>
<updated>2017-04-24T16:35:56+00:00</updated>
<author>
<name>Gerard Garcia</name>
<email>ggarcia@deic.uab.cat</email>
</author>
<published>2017-04-21T09:10:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82dfb540aeb277d945bf646ff780493b8a520d8a'/>
<id>82dfb540aeb277d945bf646ff780493b8a520d8a</id>
<content type='text'>
The virtio drivers deal with struct virtio_vsock_pkt.  Add
virtio_transport_deliver_tap_pkt(pkt) for handing packets to the
vsockmon device.

We call virtio_transport_deliver_tap_pkt(pkt) from
net/vmw_vsock/virtio_transport.c and drivers/vhost/vsock.c instead of
common code.  This is because the drivers may drop packets before
handing them to common code - we still want to capture them.

Signed-off-by: Gerard Garcia &lt;ggarcia@deic.uab.cat&gt;
Signed-off-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The virtio drivers deal with struct virtio_vsock_pkt.  Add
virtio_transport_deliver_tap_pkt(pkt) for handing packets to the
vsockmon device.

We call virtio_transport_deliver_tap_pkt(pkt) from
net/vmw_vsock/virtio_transport.c and drivers/vhost/vsock.c instead of
common code.  This is because the drivers may drop packets before
handing them to common code - we still want to capture them.

Signed-off-by: Gerard Garcia &lt;ggarcia@deic.uab.cat&gt;
Signed-off-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>VSOCK: Add vsockmon tap functions</title>
<updated>2017-04-24T16:35:56+00:00</updated>
<author>
<name>Gerard Garcia</name>
<email>ggarcia@deic.uab.cat</email>
</author>
<published>2017-04-21T09:10:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=531b374834c891ae2abf800693074df35a7d1a36'/>
<id>531b374834c891ae2abf800693074df35a7d1a36</id>
<content type='text'>
Add tap functions that can be used by the vsock transports to
deliver packets to vsockmon virtual network devices.

Signed-off-by: Gerard Garcia &lt;ggarcia@deic.uab.cat&gt;
Signed-off-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add tap functions that can be used by the vsock transports to
deliver packets to vsockmon virtual network devices.

Signed-off-by: Gerard Garcia &lt;ggarcia@deic.uab.cat&gt;
Signed-off-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>VSOCK: remove unnecessary ternary operator on return value</title>
<updated>2017-03-30T18:07:08+00:00</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2017-03-29T15:33:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ed8bfd5c1ca6e2911c797da611fdada958ab44c0'/>
<id>ed8bfd5c1ca6e2911c797da611fdada958ab44c0</id>
<content type='text'>
Rather than assign the positive errno values to ret and then
checking if it is positive and flip the sign, just return the
errno value.

Detected by CoverityScan, CID#986649 ("Logically Dead Code")

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than assign the positive errno values to ret and then
checking if it is positive and flip the sign, just return the
errno value.

Detected by CoverityScan, CID#986649 ("Logically Dead Code")

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock: cancel packets when failing to connect</title>
<updated>2017-03-21T21:41:47+00:00</updated>
<author>
<name>Peng Tao</name>
<email>bergwolf@gmail.com</email>
</author>
<published>2017-03-15T01:32:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=380feae0def7e6a115124a3219c3ec9b654dca32'/>
<id>380feae0def7e6a115124a3219c3ec9b654dca32</id>
<content type='text'>
Otherwise we'll leave the packets queued until releasing vsock device.
E.g., if guest is slow to start up, resulting ETIMEDOUT on connect, guest
will get the connect requests from failed host sockets.

Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise we'll leave the packets queued until releasing vsock device.
E.g., if guest is slow to start up, resulting ETIMEDOUT on connect, guest
will get the connect requests from failed host sockets.

Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock: add pkt cancel capability</title>
<updated>2017-03-21T21:41:47+00:00</updated>
<author>
<name>Peng Tao</name>
<email>bergwolf@gmail.com</email>
</author>
<published>2017-03-15T01:32:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=073b4f2c50fe67c7c66a059a4d6db52bb1465490'/>
<id>073b4f2c50fe67c7c66a059a4d6db52bb1465490</id>
<content type='text'>
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vsock: track pkt owner vsock</title>
<updated>2017-03-21T21:41:46+00:00</updated>
<author>
<name>Peng Tao</name>
<email>bergwolf@gmail.com</email>
</author>
<published>2017-03-15T01:32:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36d277bac8080202684e67162ebb157f16631581'/>
<id>36d277bac8080202684e67162ebb157f16631581</id>
<content type='text'>
So that we can cancel a queued pkt later if necessary.

Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So that we can cancel a queued pkt later if necessary.

Signed-off-by: Peng Tao &lt;bergwolf@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Work around lockdep limitation in sockets that use sockets</title>
<updated>2017-03-10T02:23:27+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2017-03-09T08:09:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cdfbabfb2f0ce983fdaa42f20e5f7842178fc01e'/>
<id>cdfbabfb2f0ce983fdaa42f20e5f7842178fc01e</id>
<content type='text'>
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

The theory lockdep comes up with is as follows:

 (1) If the pagefault handler decides it needs to read pages from AFS, it
     calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
     creating a call requires the socket lock:

	mmap_sem must be taken before sk_lock-AF_RXRPC

 (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
     binds the underlying UDP socket whilst holding its socket lock.
     inet_bind() takes its own socket lock:

	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

 (3) Reading from a TCP socket into a userspace buffer might cause a fault
     and thus cause the kernel to take the mmap_sem, but the TCP socket is
     locked whilst doing this:

	sk_lock-AF_INET must be taken before mmap_sem

However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks.  The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace.  This is
a limitation in the design of lockdep.

Fix the general case by:

 (1) Double up all the locking keys used in sockets so that one set are
     used if the socket is created by userspace and the other set is used
     if the socket is created by the kernel.

 (2) Store the kern parameter passed to sk_alloc() in a variable in the
     sock struct (sk_kern_sock).  This informs sock_lock_init(),
     sock_init_data() and sk_clone_lock() as to the lock keys to be used.

     Note that the child created by sk_clone_lock() inherits the parent's
     kern setting.

 (3) Add a 'kern' parameter to -&gt;accept() that is analogous to the one
     passed in to -&gt;create() that distinguishes whether kernel_accept() or
     sys_accept4() was the caller and can be passed to sk_alloc().

     Note that a lot of accept functions merely dequeue an already
     allocated socket.  I haven't touched these as the new socket already
     exists before we get the parameter.

     Note also that there are a couple of places where I've made the accepted
     socket unconditionally kernel-based:

	irda_accept()
	rds_rcp_accept_one()
	tcp_accept_from_sock()

     because they follow a sock_create_kern() and accept off of that.

Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal.  I wonder if these should do that so
that they use the new set of lock keys.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

The theory lockdep comes up with is as follows:

 (1) If the pagefault handler decides it needs to read pages from AFS, it
     calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
     creating a call requires the socket lock:

	mmap_sem must be taken before sk_lock-AF_RXRPC

 (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
     binds the underlying UDP socket whilst holding its socket lock.
     inet_bind() takes its own socket lock:

	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

 (3) Reading from a TCP socket into a userspace buffer might cause a fault
     and thus cause the kernel to take the mmap_sem, but the TCP socket is
     locked whilst doing this:

	sk_lock-AF_INET must be taken before mmap_sem

However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks.  The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace.  This is
a limitation in the design of lockdep.

Fix the general case by:

 (1) Double up all the locking keys used in sockets so that one set are
     used if the socket is created by userspace and the other set is used
     if the socket is created by the kernel.

 (2) Store the kern parameter passed to sk_alloc() in a variable in the
     sock struct (sk_kern_sock).  This informs sock_lock_init(),
     sock_init_data() and sk_clone_lock() as to the lock keys to be used.

     Note that the child created by sk_clone_lock() inherits the parent's
     kern setting.

 (3) Add a 'kern' parameter to -&gt;accept() that is analogous to the one
     passed in to -&gt;create() that distinguishes whether kernel_accept() or
     sys_accept4() was the caller and can be passed to sk_alloc().

     Note that a lot of accept functions merely dequeue an already
     allocated socket.  I haven't touched these as the new socket already
     exists before we get the parameter.

     Note also that there are a couple of places where I've made the accepted
     socket unconditionally kernel-based:

	irda_accept()
	rds_rcp_accept_one()
	tcp_accept_from_sock()

     because they follow a sock_create_kern() and accept off of that.

Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal.  I wonder if these should do that so
that they use the new set of lock keys.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
