<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/unix, branch v4.4.78</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers</title>
<updated>2017-07-05T12:37:13+00:00</updated>
<author>
<name>Mateusz Jurczyk</name>
<email>mjurczyk@google.com</email>
</author>
<published>2017-06-08T09:13:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0fc0fad07722e7ff1e4322e2155b8cd4d963e42a'/>
<id>0fc0fad07722e7ff1e4322e2155b8cd4d963e42a</id>
<content type='text'>
[ Upstream commit defbcf2decc903a28d8398aa477b6881e711e3ea ]

Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.

Signed-off-by: Mateusz Jurczyk &lt;mjurczyk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit defbcf2decc903a28d8398aa477b6881e711e3ea ]

Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.

Signed-off-by: Mateusz Jurczyk &lt;mjurczyk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: unix: properly re-increment inflight counter of GC discarded candidates</title>
<updated>2017-03-30T07:35:13+00:00</updated>
<author>
<name>Andrey Ulanov</name>
<email>andreyu@google.com</email>
</author>
<published>2017-03-15T03:16:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=610c6bcc5fcfb6d02d63cfded2375a829df7faba'/>
<id>610c6bcc5fcfb6d02d63cfded2375a829df7faba</id>
<content type='text'>
[ Upstream commit 7df9c24625b9981779afb8fcdbe2bb4765e61147 ]

Dmitry has reported that a BUG_ON() condition in unix_notinflight()
may be triggered by a simple code that forwards unix socket in an
SCM_RIGHTS message.
That is caused by incorrect unix socket GC implementation in unix_gc().

The GC first collects list of candidates, then (a) decrements their
"children's" inflight counter, (b) checks which inflight counters are
now 0, and then (c) increments all inflight counters back.
(a) and (c) are done by calling scan_children() with inc_inflight or
dec_inflight as the second argument.

Commit 6209344f5a37 ("net: unix: fix inflight counting bug in garbage
collector") changed scan_children() such that it no longer considers
sockets that do not have UNIX_GC_CANDIDATE flag. It also added a block
of code that that unsets this flag _before_ invoking
scan_children(, dec_iflight, ). This may lead to incorrect inflight
counters for some sockets.

This change fixes this bug by changing order of operations:
UNIX_GC_CANDIDATE is now unset only after all inflight counters are
restored to the original state.

  kernel BUG at net/unix/garbage.c:149!
  RIP: 0010:[&lt;ffffffff8717ebf4&gt;]  [&lt;ffffffff8717ebf4&gt;]
  unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149
  Call Trace:
   [&lt;ffffffff8716cfbf&gt;] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487
   [&lt;ffffffff8716f6a9&gt;] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496
   [&lt;ffffffff86a90a01&gt;] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655
   [&lt;ffffffff86a9808a&gt;] skb_release_all+0x1a/0x60 net/core/skbuff.c:668
   [&lt;ffffffff86a980ea&gt;] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684
   [&lt;ffffffff86a98284&gt;] kfree_skb+0x184/0x570 net/core/skbuff.c:705
   [&lt;ffffffff871789d5&gt;] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559
   [&lt;ffffffff87179039&gt;] unix_release+0x49/0x90 net/unix/af_unix.c:836
   [&lt;ffffffff86a694b2&gt;] sock_release+0x92/0x1f0 net/socket.c:570
   [&lt;ffffffff86a6962b&gt;] sock_close+0x1b/0x20 net/socket.c:1017
   [&lt;ffffffff81a76b8e&gt;] __fput+0x34e/0x910 fs/file_table.c:208
   [&lt;ffffffff81a771da&gt;] ____fput+0x1a/0x20 fs/file_table.c:244
   [&lt;ffffffff81483ab0&gt;] task_work_run+0x1a0/0x280 kernel/task_work.c:116
   [&lt;     inline     &gt;] exit_task_work include/linux/task_work.h:21
   [&lt;ffffffff8141287a&gt;] do_exit+0x183a/0x2640 kernel/exit.c:828
   [&lt;ffffffff8141383e&gt;] do_group_exit+0x14e/0x420 kernel/exit.c:931
   [&lt;ffffffff814429d3&gt;] get_signal+0x663/0x1880 kernel/signal.c:2307
   [&lt;ffffffff81239b45&gt;] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
   [&lt;ffffffff8100666a&gt;] exit_to_usermode_loop+0x1ea/0x2d0
  arch/x86/entry/common.c:156
   [&lt;     inline     &gt;] prepare_exit_to_usermode arch/x86/entry/common.c:190
   [&lt;ffffffff81009693&gt;] syscall_return_slowpath+0x4d3/0x570
  arch/x86/entry/common.c:259
   [&lt;ffffffff881478e6&gt;] entry_SYSCALL_64_fastpath+0xc4/0xc6

Link: https://lkml.org/lkml/2017/3/6/252
Signed-off-by: Andrey Ulanov &lt;andreyu@google.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7df9c24625b9981779afb8fcdbe2bb4765e61147 ]

Dmitry has reported that a BUG_ON() condition in unix_notinflight()
may be triggered by a simple code that forwards unix socket in an
SCM_RIGHTS message.
That is caused by incorrect unix socket GC implementation in unix_gc().

The GC first collects list of candidates, then (a) decrements their
"children's" inflight counter, (b) checks which inflight counters are
now 0, and then (c) increments all inflight counters back.
(a) and (c) are done by calling scan_children() with inc_inflight or
dec_inflight as the second argument.

Commit 6209344f5a37 ("net: unix: fix inflight counting bug in garbage
collector") changed scan_children() such that it no longer considers
sockets that do not have UNIX_GC_CANDIDATE flag. It also added a block
of code that that unsets this flag _before_ invoking
scan_children(, dec_iflight, ). This may lead to incorrect inflight
counters for some sockets.

This change fixes this bug by changing order of operations:
UNIX_GC_CANDIDATE is now unset only after all inflight counters are
restored to the original state.

  kernel BUG at net/unix/garbage.c:149!
  RIP: 0010:[&lt;ffffffff8717ebf4&gt;]  [&lt;ffffffff8717ebf4&gt;]
  unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149
  Call Trace:
   [&lt;ffffffff8716cfbf&gt;] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487
   [&lt;ffffffff8716f6a9&gt;] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496
   [&lt;ffffffff86a90a01&gt;] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655
   [&lt;ffffffff86a9808a&gt;] skb_release_all+0x1a/0x60 net/core/skbuff.c:668
   [&lt;ffffffff86a980ea&gt;] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684
   [&lt;ffffffff86a98284&gt;] kfree_skb+0x184/0x570 net/core/skbuff.c:705
   [&lt;ffffffff871789d5&gt;] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559
   [&lt;ffffffff87179039&gt;] unix_release+0x49/0x90 net/unix/af_unix.c:836
   [&lt;ffffffff86a694b2&gt;] sock_release+0x92/0x1f0 net/socket.c:570
   [&lt;ffffffff86a6962b&gt;] sock_close+0x1b/0x20 net/socket.c:1017
   [&lt;ffffffff81a76b8e&gt;] __fput+0x34e/0x910 fs/file_table.c:208
   [&lt;ffffffff81a771da&gt;] ____fput+0x1a/0x20 fs/file_table.c:244
   [&lt;ffffffff81483ab0&gt;] task_work_run+0x1a0/0x280 kernel/task_work.c:116
   [&lt;     inline     &gt;] exit_task_work include/linux/task_work.h:21
   [&lt;ffffffff8141287a&gt;] do_exit+0x183a/0x2640 kernel/exit.c:828
   [&lt;ffffffff8141383e&gt;] do_group_exit+0x14e/0x420 kernel/exit.c:931
   [&lt;ffffffff814429d3&gt;] get_signal+0x663/0x1880 kernel/signal.c:2307
   [&lt;ffffffff81239b45&gt;] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
   [&lt;ffffffff8100666a&gt;] exit_to_usermode_loop+0x1ea/0x2d0
  arch/x86/entry/common.c:156
   [&lt;     inline     &gt;] prepare_exit_to_usermode arch/x86/entry/common.c:190
   [&lt;ffffffff81009693&gt;] syscall_return_slowpath+0x4d3/0x570
  arch/x86/entry/common.c:259
   [&lt;ffffffff881478e6&gt;] entry_SYSCALL_64_fastpath+0xc4/0xc6

Link: https://lkml.org/lkml/2017/3/6/252
Signed-off-by: Andrey Ulanov &lt;andreyu@google.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: move unix_mknod() out of bindlock</title>
<updated>2017-02-04T08:45:09+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2017-01-23T19:17:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0492a033fb71b5ef0573aa157aadda049e8a7770'/>
<id>0492a033fb71b5ef0573aa157aadda049e8a7770</id>
<content type='text'>
[ Upstream commit 0fb44559ffd67de8517098b81f675fa0210f13f0 ]

Dmitry reported a deadlock scenario:

unix_bind() path:
u-&gt;bindlock ==&gt; sb_writer

do_splice() path:
sb_writer ==&gt; pipe-&gt;mutex ==&gt; u-&gt;bindlock

In the unix_bind() code path, unix_mknod() does not have to
be done with u-&gt;bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Tested-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0fb44559ffd67de8517098b81f675fa0210f13f0 ]

Dmitry reported a deadlock scenario:

unix_bind() path:
u-&gt;bindlock ==&gt; sb_writer

do_splice() path:
sb_writer ==&gt; pipe-&gt;mutex ==&gt; u-&gt;bindlock

In the unix_bind() code path, unix_mknod() does not have to
be done with u-&gt;bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Tested-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: conditionally use freezable blocking calls in read</title>
<updated>2016-12-10T18:07:22+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2016-11-17T23:55:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6ef59b986190f07386b062884ab48006c16cf152'/>
<id>6ef59b986190f07386b062884ab48006c16cf152</id>
<content type='text'>
[ Upstream commit 06a77b07e3b44aea2b3c0e64de420ea2cfdcbaa9 ]

Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
converts schedule_timeout() to its freezable version, it was probably
correct at that time, but later, commit 2b514574f7e8
("net: af_unix: implement splice for stream af_unix sockets") breaks
the strong requirement for a freezable sleep, according to
commit 0f9548ca1091:

    We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
    deadlock if the lock is later acquired in the suspend or hibernate path
    (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
    cgroup_freezer if a lock is held inside a frozen cgroup that is later
    acquired by a process outside that group.

The pipe_lock is still held at that point.

So use freezable version only for the recvmsg call path, avoid impact for
Android.

Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 06a77b07e3b44aea2b3c0e64de420ea2cfdcbaa9 ]

Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
converts schedule_timeout() to its freezable version, it was probably
correct at that time, but later, commit 2b514574f7e8
("net: af_unix: implement splice for stream af_unix sockets") breaks
the strong requirement for a freezable sleep, according to
commit 0f9548ca1091:

    We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
    deadlock if the lock is later acquired in the suspend or hibernate path
    (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
    cgroup_freezer if a lock is held inside a frozen cgroup that is later
    acquired by a process outside that group.

The pipe_lock is still held at that point.

So use freezable version only for the recvmsg call path, avoid impact for
Android.

Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: split 'u-&gt;readlock' into two: 'iolock' and 'bindlock'</title>
<updated>2016-09-30T08:18:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-09-01T21:43:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9b5390d7b6abf1fe04ccc6437691881c369f7ff9'/>
<id>9b5390d7b6abf1fe04ccc6437691881c369f7ff9</id>
<content type='text'>
commit 6e1ce3c3451291142a57c4f3f6f999a29fb5b3bc upstream.

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regards to filesystem locking.  The bind
locking would want to nest outside the VSF pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.

Acked-by: Rainer Weikusat &lt;rweikusat@cyberadapt.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6e1ce3c3451291142a57c4f3f6f999a29fb5b3bc upstream.

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regards to filesystem locking.  The bind
locking would want to nest outside the VSF pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.

Acked-by: Rainer Weikusat &lt;rweikusat@cyberadapt.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "af_unix: Fix splice-bind deadlock"</title>
<updated>2016-09-30T08:18:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-09-01T21:56:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=941f6995440400a67cebce678e612459ea60b96b'/>
<id>941f6995440400a67cebce678e612459ea60b96b</id>
<content type='text'>
commit 38f7bd94a97b542de86a2be9229289717e33a7a4 upstream.

This reverts commit c845acb324aa85a39650a14e7696982ceea75dc1.

It turns out that it just replaces one deadlock with another one: we can
still get the wrong lock ordering with the readlock due to overlayfs
calling back into the filesystem layer and still taking the vfs locks
after the readlock.

The proper solution ends up being to just split the readlock into two
pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
(taken *inside* the filesystem locks).  The two locks are independent
anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reviewed-by: Shmulik Ladkani &lt;shmulik.ladkani@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 38f7bd94a97b542de86a2be9229289717e33a7a4 upstream.

This reverts commit c845acb324aa85a39650a14e7696982ceea75dc1.

It turns out that it just replaces one deadlock with another one: we can
still get the wrong lock ordering with the readlock due to overlayfs
calling back into the filesystem layer and still taking the vfs locks
after the readlock.

The proper solution ends up being to just split the readlock into two
pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
(taken *inside* the filesystem locks).  The two locks are independent
anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reviewed-by: Shmulik Ladkani &lt;shmulik.ladkani@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: fix hard linked sockets on overlay</title>
<updated>2016-07-27T16:47:33+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2016-05-20T20:13:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0da3127a76c2cf7ad2c56e88841fa83613a67a77'/>
<id>0da3127a76c2cf7ad2c56e88841fa83613a67a77</id>
<content type='text'>
commit eb0a4a47ae89aaa0674ab3180de6a162f3be2ddf upstream.

Overlayfs uses separate inodes even in the case of hard links on the
underlying filesystems.  This is a problem for AF_UNIX socket
implementation which indexes sockets based on the inode.  This resulted in
hard linked sockets not working.

The fix is to use the real, underlying inode.

Test case follows:

-- ovl-sock-test.c --
#include &lt;unistd.h&gt;
#include &lt;err.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;sys/un.h&gt;

#define SOCK "test-sock"
#define SOCK2 "test-sock2"

int main(void)
{
	int fd, fd2;
	struct sockaddr_un addr = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK,
	};
	struct sockaddr_un addr2 = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK2,
	};

	unlink(SOCK);
	unlink(SOCK2);
	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (bind(fd, (struct sockaddr *) &amp;addr, sizeof(addr)) == -1)
		err(1, "bind");
	if (listen(fd, 0) == -1)
		err(1, "listen");
	if (link(SOCK, SOCK2) == -1)
		err(1, "link");
	if ((fd2 = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (connect(fd2, (struct sockaddr *) &amp;addr2, sizeof(addr2)) == -1)
		err (1, "connect");
	return 0;
}
----

Reported-by: Alexander Morozov &lt;alexandr.morozov@docker.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit eb0a4a47ae89aaa0674ab3180de6a162f3be2ddf upstream.

Overlayfs uses separate inodes even in the case of hard links on the
underlying filesystems.  This is a problem for AF_UNIX socket
implementation which indexes sockets based on the inode.  This resulted in
hard linked sockets not working.

The fix is to use the real, underlying inode.

Test case follows:

-- ovl-sock-test.c --
#include &lt;unistd.h&gt;
#include &lt;err.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;sys/un.h&gt;

#define SOCK "test-sock"
#define SOCK2 "test-sock2"

int main(void)
{
	int fd, fd2;
	struct sockaddr_un addr = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK,
	};
	struct sockaddr_un addr2 = {
		.sun_family = AF_UNIX,
		.sun_path = SOCK2,
	};

	unlink(SOCK);
	unlink(SOCK2);
	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (bind(fd, (struct sockaddr *) &amp;addr, sizeof(addr)) == -1)
		err(1, "bind");
	if (listen(fd, 0) == -1)
		err(1, "listen");
	if (link(SOCK, SOCK2) == -1)
		err(1, "link");
	if ((fd2 = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		err(1, "socket");
	if (connect(fd2, (struct sockaddr *) &amp;addr2, sizeof(addr2)) == -1)
		err (1, "connect");
	return 0;
}
----

Reported-by: Alexander Morozov &lt;alexandr.morozov@docker.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>unix_diag: fix incorrect sign extension in unix_lookup_by_ino</title>
<updated>2016-03-03T23:07:07+00:00</updated>
<author>
<name>Dmitry V. Levin</name>
<email>ldv@altlinux.org</email>
</author>
<published>2016-02-19T01:27:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=82f26aa4a5537b080c0cf71f0f1016c37f01d25e'/>
<id>82f26aa4a5537b080c0cf71f0f1016c37f01d25e</id>
<content type='text'>
[ Upstream commit b5f0549231ffb025337be5a625b0ff9f52b016f0 ]

The value passed by unix_diag_get_exact to unix_lookup_by_ino has type
__u32, but unix_lookup_by_ino's argument ino has type int, which is not
a problem yet.
However, when ino is compared with sock_i_ino return value of type
unsigned long, ino is sign extended to signed long, and this results
to incorrect comparison on 64-bit architectures for inode numbers
greater than INT_MAX.

This bug was found by strace test suite.

Fixes: 5d3cae8bc39d ("unix_diag: Dumping exact socket core")
Signed-off-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Acked-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b5f0549231ffb025337be5a625b0ff9f52b016f0 ]

The value passed by unix_diag_get_exact to unix_lookup_by_ino has type
__u32, but unix_lookup_by_ino's argument ino has type int, which is not
a problem yet.
However, when ino is compared with sock_i_ino return value of type
unsigned long, ino is sign extended to signed long, and this results
to incorrect comparison on 64-bit architectures for inode numbers
greater than INT_MAX.

This bug was found by strace test suite.

Fixes: 5d3cae8bc39d ("unix_diag: Dumping exact socket core")
Signed-off-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Acked-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: Guard against other == sk in unix_dgram_sendmsg</title>
<updated>2016-03-03T23:07:06+00:00</updated>
<author>
<name>Rainer Weikusat</name>
<email>rweikusat@mobileactivedefense.com</email>
</author>
<published>2016-02-11T19:37:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1bd367857b7f884db302703c624f88adfd8eb171'/>
<id>1bd367857b7f884db302703c624f88adfd8eb171</id>
<content type='text'>
[ Upstream commit a5527dda344fff0514b7989ef7a755729769daa1 ]

The unix_dgram_sendmsg routine use the following test

if (unlikely(unix_peer(other) != sk &amp;&amp; unix_recvq_full(other))) {

to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.

Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn &lt;pmhahn@pmhahn.de&gt;
Signed-off-by: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Tested-by: Philipp Hahn &lt;pmhahn@pmhahn.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a5527dda344fff0514b7989ef7a755729769daa1 ]

The unix_dgram_sendmsg routine use the following test

if (unlikely(unix_peer(other) != sk &amp;&amp; unix_recvq_full(other))) {

to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.

Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn &lt;pmhahn@pmhahn.de&gt;
Signed-off-by: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Tested-by: Philipp Hahn &lt;pmhahn@pmhahn.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_unix: Don't set err in unix_stream_read_generic unless there was an error</title>
<updated>2016-03-03T23:07:06+00:00</updated>
<author>
<name>Rainer Weikusat</name>
<email>rweikusat@mobileactivedefense.com</email>
</author>
<published>2016-02-08T18:47:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2f46f069ccfb28e6fdaa6798544fd30b72835b04'/>
<id>2f46f069ccfb28e6fdaa6798544fd30b72835b04</id>
<content type='text'>
[ Upstream commit 1b92ee3d03af6643df395300ba7748f19ecdb0c5 ]

The present unix_stream_read_generic contains various code sequences of
the form

err = -EDISASTER;
if (&lt;test&gt;)
	goto out;

This has the unfortunate side effect of possibly causing the error code
to bleed through to the final

out:
	return copied ? : err;

and then to be wrongly returned if no data was copied because the caller
didn't supply a data buffer, as demonstrated by the program available at

http://pad.lv/1540731

Change it such that err is only set if an error condition was detected.

Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code")
Reported-by: Joseph Salisbury &lt;joseph.salisbury@canonical.com&gt;
Signed-off-by: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 1b92ee3d03af6643df395300ba7748f19ecdb0c5 ]

The present unix_stream_read_generic contains various code sequences of
the form

err = -EDISASTER;
if (&lt;test&gt;)
	goto out;

This has the unfortunate side effect of possibly causing the error code
to bleed through to the final

out:
	return copied ? : err;

and then to be wrongly returned if no data was copied because the caller
didn't supply a data buffer, as demonstrated by the program available at

http://pad.lv/1540731

Change it such that err is only set if an error condition was detected.

Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code")
Reported-by: Joseph Salisbury &lt;joseph.salisbury@canonical.com&gt;
Signed-off-by: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
