<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/packet, branch v4.4.93</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>net/packet: check length in getsockopt() called with PACKET_HDRLEN</title>
<updated>2017-10-08T08:14:18+00:00</updated>
<author>
<name>Alexander Potapenko</name>
<email>glider@google.com</email>
</author>
<published>2017-04-25T16:51:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fa63895f47c9253a0305a5d0862e98ab6f11e718'/>
<id>fa63895f47c9253a0305a5d0862e98ab6f11e718</id>
<content type='text'>
[ Upstream commit fd2c83b35752f0a8236b976978ad4658df14a59f ]

In the case getsockopt() is called with PACKET_HDRLEN and optlen &lt; 4
|val| remains uninitialized and the syscall may behave differently
depending on its value, and even copy garbage to userspace on certain
architectures. To fix this we now return -EINVAL if optlen is too small.

This bug has been detected with KMSAN.

Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit fd2c83b35752f0a8236b976978ad4658df14a59f ]

In the case getsockopt() is called with PACKET_HDRLEN and optlen &lt; 4
|val| remains uninitialized and the syscall may behave differently
depending on its value, and even copy garbage to userspace on certain
architectures. To fix this we now return -EINVAL if optlen is too small.

This bug has been detected with KMSAN.

Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: fix tp_reserve race in packet_set_ring</title>
<updated>2017-08-13T02:29:08+00:00</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2017-08-10T16:41:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=63364a508d24944abb0975bd823cb11367c56283'/>
<id>63364a508d24944abb0975bd823cb11367c56283</id>
<content type='text'>
[ Upstream commit c27927e372f0785f3303e8fad94b85945e2c97b7 ]

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c27927e372f0785f3303e8fad94b85945e2c97b7 ]

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: fix use-after-free in prb_retire_rx_blk_timer_expired()</title>
<updated>2017-08-11T16:08:53+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2017-07-24T17:07:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=499338964af84436c0306e061c7b0212a181fccb'/>
<id>499338964af84436c0306e061c7b0212a181fccb</id>
<content type='text'>
[ Upstream commit c800aaf8d869f2b9b47b10c5c312fe19f0a94042 ]

There are multiple reports showing we have a use-after-free in
the timer prb_retire_rx_blk_timer_expired(), where we use struct
tpacket_kbdq_core::pkbdq, a pg_vec, after it gets freed by
free_pg_vec().

The interesting part is it is not freed via packet_release() but
via packet_setsockopt(), which means we are not closing the socket.
Looking into the big and fat function packet_set_ring(), this could
happen if we satisfy the following conditions:

1. closing == 0, not on packet_release() path
2. req-&gt;tp_block_nr == 0, we don't allocate a new pg_vec
3. rx_ring-&gt;pg_vec is already set as V3, which means we already called
   packet_set_ring() wtih req-&gt;tp_block_nr &gt; 0 previously
4. req-&gt;tp_frame_nr == 0, pass sanity check
5. po-&gt;mapped == 0, never called mmap()

In this scenario we are clearing the old rx_ring-&gt;pg_vec, so we need
to free this pg_vec, but we don't stop the timer on this path because
of closing==0.

The timer has to be stopped as long as we need to free pg_vec, therefore
the check on closing!=0 is wrong, we should check pg_vec!=NULL instead.

Thanks to liujian for testing different fixes.

Reported-by: alexander.levin@verizon.com
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Reported-by: liujian (CE) &lt;liujian56@huawei.com&gt;
Tested-by: liujian (CE) &lt;liujian56@huawei.com&gt;
Cc: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
Cc: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c800aaf8d869f2b9b47b10c5c312fe19f0a94042 ]

There are multiple reports showing we have a use-after-free in
the timer prb_retire_rx_blk_timer_expired(), where we use struct
tpacket_kbdq_core::pkbdq, a pg_vec, after it gets freed by
free_pg_vec().

The interesting part is it is not freed via packet_release() but
via packet_setsockopt(), which means we are not closing the socket.
Looking into the big and fat function packet_set_ring(), this could
happen if we satisfy the following conditions:

1. closing == 0, not on packet_release() path
2. req-&gt;tp_block_nr == 0, we don't allocate a new pg_vec
3. rx_ring-&gt;pg_vec is already set as V3, which means we already called
   packet_set_ring() wtih req-&gt;tp_block_nr &gt; 0 previously
4. req-&gt;tp_frame_nr == 0, pass sanity check
5. po-&gt;mapped == 0, never called mmap()

In this scenario we are clearing the old rx_ring-&gt;pg_vec, so we need
to free this pg_vec, but we don't stop the timer on this path because
of closing==0.

The timer has to be stopped as long as we need to free pg_vec, therefore
the check on closing!=0 is wrong, we should check pg_vec!=NULL instead.

Thanks to liujian for testing different fixes.

Reported-by: alexander.levin@verizon.com
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Reported-by: liujian (CE) &lt;liujian56@huawei.com&gt;
Tested-by: liujian (CE) &lt;liujian56@huawei.com&gt;
Cc: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
Cc: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/packet: fix overflow in check for tp_reserve</title>
<updated>2017-05-03T04:19:51+00:00</updated>
<author>
<name>Andrey Konovalov</name>
<email>andreyknvl@google.com</email>
</author>
<published>2017-03-29T14:11:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25adf4e32a890244f4f53d2ab30b423efb5aab41'/>
<id>25adf4e32a890244f4f53d2ab30b423efb5aab41</id>
<content type='text'>
[ Upstream commit bcc5364bdcfe131e6379363f089e7b4108d35b70 ]

When calculating po-&gt;tp_hdrlen + po-&gt;tp_reserve the result can overflow.

Fix by checking that tp_reserve &lt;= INT_MAX on assign.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit bcc5364bdcfe131e6379363f089e7b4108d35b70 ]

When calculating po-&gt;tp_hdrlen + po-&gt;tp_reserve the result can overflow.

Fix by checking that tp_reserve &lt;= INT_MAX on assign.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/packet: fix overflow in check for tp_frame_nr</title>
<updated>2017-05-03T04:19:51+00:00</updated>
<author>
<name>Andrey Konovalov</name>
<email>andreyknvl@google.com</email>
</author>
<published>2017-03-29T14:11:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cf71bd41f8091588c7b28ebba6eaa635a6874493'/>
<id>cf71bd41f8091588c7b28ebba6eaa635a6874493</id>
<content type='text'>
[ Upstream commit 8f8d28e4d6d815a391285e121c3a53a0b6cb9e7b ]

When calculating rb-&gt;frames_per_block * req-&gt;tp_block_nr the result
can overflow.

Add a check that tp_block_size * tp_block_nr &lt;= UINT_MAX.

Since frames_per_block &lt;= tp_block_size, the expression would
never overflow.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 8f8d28e4d6d815a391285e121c3a53a0b6cb9e7b ]

When calculating rb-&gt;frames_per_block * req-&gt;tp_block_nr the result
can overflow.

Add a check that tp_block_size * tp_block_nr &lt;= UINT_MAX.

Since frames_per_block &lt;= tp_block_size, the expression would
never overflow.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/packet: fix overflow in check for priv area size</title>
<updated>2017-04-18T05:14:37+00:00</updated>
<author>
<name>Andrey Konovalov</name>
<email>andreyknvl@google.com</email>
</author>
<published>2017-03-29T14:11:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d35f8fa0b93e61dd95b8f86928a783c4d8a32d3e'/>
<id>d35f8fa0b93e61dd95b8f86928a783c4d8a32d3e</id>
<content type='text'>
commit 2b6867c2ce76c596676bec7d2d525af525fdc6e2 upstream.

Subtracting tp_sizeof_priv from tp_block_size and casting to int
to check whether one is less then the other doesn't always work
(both of them are unsigned ints).

Compare them as is instead.

Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as
it can overflow inside BLK_PLUS_PRIV otherwise.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2b6867c2ce76c596676bec7d2d525af525fdc6e2 upstream.

Subtracting tp_sizeof_priv from tp_block_size and casting to int
to check whether one is less then the other doesn't always work
(both of them are unsigned ints).

Compare them as is instead.

Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as
it can overflow inside BLK_PLUS_PRIV otherwise.

Signed-off-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>net: don't call strlen() on the user buffer in packet_bind_spkt()</title>
<updated>2017-03-22T11:04:14+00:00</updated>
<author>
<name>Alexander Potapenko</name>
<email>glider@google.com</email>
</author>
<published>2017-03-01T11:57:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f331d6445a3e4013428b06169acf3ae33614e69b'/>
<id>f331d6445a3e4013428b06169acf3ae33614e69b</id>
<content type='text'>
[ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ]

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
 0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
 [&lt;     inline     &gt;] __dump_stack lib/dump_stack.c:15
 [&lt;ffffffff82559ae8&gt;] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [&lt;ffffffff818a6626&gt;] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
 [&lt;ffffffff818a783b&gt;] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
 [&lt;     inline     &gt;] strlen lib/string.c:484
 [&lt;ffffffff8259b58d&gt;] strlcpy+0x9d/0x200 lib/string.c:144
 [&lt;ffffffff84b2eca4&gt;] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
 [&lt;ffffffff84242e4d&gt;] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
 [&lt;ffffffff84242a22&gt;] SyS_bind+0x82/0xa0 net/socket.c:1356
 [&lt;ffffffff8515991b&gt;] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
 [&lt;ffffffff810bb787&gt;] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
 [&lt;     inline     &gt;] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [&lt;     inline     &gt;] kmsan_save_stack mm/kmsan/kmsan.c:334
 [&lt;ffffffff818a59f8&gt;] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
 [&lt;ffffffff818a7773&gt;] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
 [&lt;ffffffff84242b69&gt;] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [&lt;ffffffff84242a22&gt;] SyS_bind+0x82/0xa0 net/socket.c:1356
 [&lt;ffffffff8515991b&gt;] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=====================================
 #include &lt;string.h&gt;
 #include &lt;sys/socket.h&gt;
 #include &lt;netpacket/packet.h&gt;
 #include &lt;net/ethernet.h&gt;

 int main() {
   struct sockaddr addr;
   memset(&amp;addr, 0xff, sizeof(addr));
   addr.sa_family = AF_PACKET;
   int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
   bind(fd, &amp;addr, sizeof(addr));
   return 0;
 }
=====================================

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ]

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
 0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
 [&lt;     inline     &gt;] __dump_stack lib/dump_stack.c:15
 [&lt;ffffffff82559ae8&gt;] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [&lt;ffffffff818a6626&gt;] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
 [&lt;ffffffff818a783b&gt;] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
 [&lt;     inline     &gt;] strlen lib/string.c:484
 [&lt;ffffffff8259b58d&gt;] strlcpy+0x9d/0x200 lib/string.c:144
 [&lt;ffffffff84b2eca4&gt;] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
 [&lt;ffffffff84242e4d&gt;] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
 [&lt;ffffffff84242a22&gt;] SyS_bind+0x82/0xa0 net/socket.c:1356
 [&lt;ffffffff8515991b&gt;] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
 [&lt;ffffffff810bb787&gt;] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
 [&lt;     inline     &gt;] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [&lt;     inline     &gt;] kmsan_save_stack mm/kmsan/kmsan.c:334
 [&lt;ffffffff818a59f8&gt;] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
 [&lt;ffffffff818a7773&gt;] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
 [&lt;ffffffff84242b69&gt;] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [&lt;ffffffff84242a22&gt;] SyS_bind+0x82/0xa0 net/socket.c:1356
 [&lt;ffffffff8515991b&gt;] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=====================================
 #include &lt;string.h&gt;
 #include &lt;sys/socket.h&gt;
 #include &lt;netpacket/packet.h&gt;
 #include &lt;net/ethernet.h&gt;

 int main() {
   struct sockaddr addr;
   memset(&amp;addr, 0xff, sizeof(addr));
   addr.sa_family = AF_PACKET;
   int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
   bind(fd, &amp;addr, sizeof(addr));
   return 0;
 }
=====================================

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: Do not call fanout_release from atomic contexts</title>
<updated>2017-02-26T10:07:50+00:00</updated>
<author>
<name>Anoob Soman</name>
<email>anoob.soman@citrix.com</email>
</author>
<published>2017-02-15T20:25:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe41cfb48f2d82577ca5b56909ee251ae52cd172'/>
<id>fe41cfb48f2d82577ca5b56909ee251ae52cd172</id>
<content type='text'>
[ Upstream commit 2bd624b4611ffee36422782d16e1c944d1351e98 ]

Commit 6664498280cf ("packet: call fanout_release, while UNREGISTERING a
netdev"), unfortunately, introduced the following issues.

1. calling mutex_lock(&amp;fanout_mutex) (fanout_release()) from inside
rcu_read-side critical section. rcu_read_lock disables preemption, most often,
which prohibits calling sleeping functions.

[  ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
[  ]
[  ] rcu_scheduler_active = 1, debug_locks = 0
[  ] 4 locks held by ovs-vswitchd/1969:
[  ]  #0:  (cb_lock){++++++}, at: [&lt;ffffffff8158a6c9&gt;] genl_rcv+0x19/0x40
[  ]  #1:  (ovs_mutex){+.+.+.}, at: [&lt;ffffffffa04878ca&gt;] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
[  ]  #2:  (rtnl_mutex){+.+.+.}, at: [&lt;ffffffff81564157&gt;] rtnl_lock+0x17/0x20
[  ]  #3:  (rcu_read_lock){......}, at: [&lt;ffffffff81614165&gt;] packet_notifier+0x5/0x3f0
[  ]
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff810c9077&gt;] lockdep_rcu_suspicious+0x107/0x110
[  ]  [&lt;ffffffff810a2da7&gt;] ___might_sleep+0x57/0x210
[  ]  [&lt;ffffffff810a2fd0&gt;] __might_sleep+0x70/0x90
[  ]  [&lt;ffffffff8162e80c&gt;] mutex_lock_nested+0x3c/0x3a0
[  ]  [&lt;ffffffff810de93f&gt;] ? vprintk_default+0x1f/0x30
[  ]  [&lt;ffffffff81186e88&gt;] ? printk+0x4d/0x4f
[  ]  [&lt;ffffffff816106dd&gt;] fanout_release+0x1d/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

2. calling mutex_lock(&amp;fanout_mutex) inside spin_lock(&amp;po-&gt;bind_lock).
"sleeping function called from invalid context"

[  ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[  ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
[  ] INFO: lockdep is turned off.
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff810a2f52&gt;] ___might_sleep+0x202/0x210
[  ]  [&lt;ffffffff810a2fd0&gt;] __might_sleep+0x70/0x90
[  ]  [&lt;ffffffff8162e80c&gt;] mutex_lock_nested+0x3c/0x3a0
[  ]  [&lt;ffffffff816106dd&gt;] fanout_release+0x1d/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

3. calling dev_remove_pack(&amp;fanout-&gt;prot_hook), from inside
spin_lock(&amp;po-&gt;bind_lock) or rcu_read-side critical-section. dev_remove_pack()
-&gt; synchronize_net(), which might sleep.

[  ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
[  ] INFO: lockdep is turned off.
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff81186274&gt;] __schedule_bug+0x64/0x73
[  ]  [&lt;ffffffff8162b8cb&gt;] __schedule+0x6b/0xd10
[  ]  [&lt;ffffffff8162c5db&gt;] schedule+0x6b/0x80
[  ]  [&lt;ffffffff81630b1d&gt;] schedule_timeout+0x38d/0x410
[  ]  [&lt;ffffffff810ea3fd&gt;] synchronize_sched_expedited+0x53d/0x810
[  ]  [&lt;ffffffff810ea6de&gt;] synchronize_rcu_expedited+0xe/0x10
[  ]  [&lt;ffffffff8154eab5&gt;] synchronize_net+0x35/0x50
[  ]  [&lt;ffffffff8154eae3&gt;] dev_remove_pack+0x13/0x20
[  ]  [&lt;ffffffff8161077e&gt;] fanout_release+0xbe/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

4. fanout_release() races with calls from different CPU.

To fix the above problems, remove the call to fanout_release() under
rcu_read_lock(). Instead, call __dev_remove_pack(&amp;fanout-&gt;prot_hook) and
netdev_run_todo will be happy that &amp;dev-&gt;ptype_specific list is empty. In order
to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to
__fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure
fanout-&gt;prot_hook is removed as well.

Fixes: 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev")
Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Anoob Soman &lt;anoob.soman@citrix.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2bd624b4611ffee36422782d16e1c944d1351e98 ]

Commit 6664498280cf ("packet: call fanout_release, while UNREGISTERING a
netdev"), unfortunately, introduced the following issues.

1. calling mutex_lock(&amp;fanout_mutex) (fanout_release()) from inside
rcu_read-side critical section. rcu_read_lock disables preemption, most often,
which prohibits calling sleeping functions.

[  ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
[  ]
[  ] rcu_scheduler_active = 1, debug_locks = 0
[  ] 4 locks held by ovs-vswitchd/1969:
[  ]  #0:  (cb_lock){++++++}, at: [&lt;ffffffff8158a6c9&gt;] genl_rcv+0x19/0x40
[  ]  #1:  (ovs_mutex){+.+.+.}, at: [&lt;ffffffffa04878ca&gt;] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
[  ]  #2:  (rtnl_mutex){+.+.+.}, at: [&lt;ffffffff81564157&gt;] rtnl_lock+0x17/0x20
[  ]  #3:  (rcu_read_lock){......}, at: [&lt;ffffffff81614165&gt;] packet_notifier+0x5/0x3f0
[  ]
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff810c9077&gt;] lockdep_rcu_suspicious+0x107/0x110
[  ]  [&lt;ffffffff810a2da7&gt;] ___might_sleep+0x57/0x210
[  ]  [&lt;ffffffff810a2fd0&gt;] __might_sleep+0x70/0x90
[  ]  [&lt;ffffffff8162e80c&gt;] mutex_lock_nested+0x3c/0x3a0
[  ]  [&lt;ffffffff810de93f&gt;] ? vprintk_default+0x1f/0x30
[  ]  [&lt;ffffffff81186e88&gt;] ? printk+0x4d/0x4f
[  ]  [&lt;ffffffff816106dd&gt;] fanout_release+0x1d/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

2. calling mutex_lock(&amp;fanout_mutex) inside spin_lock(&amp;po-&gt;bind_lock).
"sleeping function called from invalid context"

[  ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[  ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
[  ] INFO: lockdep is turned off.
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff810a2f52&gt;] ___might_sleep+0x202/0x210
[  ]  [&lt;ffffffff810a2fd0&gt;] __might_sleep+0x70/0x90
[  ]  [&lt;ffffffff8162e80c&gt;] mutex_lock_nested+0x3c/0x3a0
[  ]  [&lt;ffffffff816106dd&gt;] fanout_release+0x1d/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

3. calling dev_remove_pack(&amp;fanout-&gt;prot_hook), from inside
spin_lock(&amp;po-&gt;bind_lock) or rcu_read-side critical-section. dev_remove_pack()
-&gt; synchronize_net(), which might sleep.

[  ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
[  ] INFO: lockdep is turned off.
[  ] Call Trace:
[  ]  [&lt;ffffffff813770c1&gt;] dump_stack+0x85/0xc4
[  ]  [&lt;ffffffff81186274&gt;] __schedule_bug+0x64/0x73
[  ]  [&lt;ffffffff8162b8cb&gt;] __schedule+0x6b/0xd10
[  ]  [&lt;ffffffff8162c5db&gt;] schedule+0x6b/0x80
[  ]  [&lt;ffffffff81630b1d&gt;] schedule_timeout+0x38d/0x410
[  ]  [&lt;ffffffff810ea3fd&gt;] synchronize_sched_expedited+0x53d/0x810
[  ]  [&lt;ffffffff810ea6de&gt;] synchronize_rcu_expedited+0xe/0x10
[  ]  [&lt;ffffffff8154eab5&gt;] synchronize_net+0x35/0x50
[  ]  [&lt;ffffffff8154eae3&gt;] dev_remove_pack+0x13/0x20
[  ]  [&lt;ffffffff8161077e&gt;] fanout_release+0xbe/0xe0
[  ]  [&lt;ffffffff81614459&gt;] packet_notifier+0x2f9/0x3f0

4. fanout_release() races with calls from different CPU.

To fix the above problems, remove the call to fanout_release() under
rcu_read_lock(). Instead, call __dev_remove_pack(&amp;fanout-&gt;prot_hook) and
netdev_run_todo will be happy that &amp;dev-&gt;ptype_specific list is empty. In order
to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to
__fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure
fanout-&gt;prot_hook is removed as well.

Fixes: 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev")
Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Anoob Soman &lt;anoob.soman@citrix.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: fix races in fanout_add()</title>
<updated>2017-02-26T10:07:49+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-02-14T17:03:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=abd672deb170c4443e41173160de0ba2ae1abc08'/>
<id>abd672deb170c4443e41173160de0ba2ae1abc08</id>
<content type='text'>
[ Upstream commit d199fab63c11998a602205f7ee7ff7c05c97164b ]

Multiple threads can call fanout_add() at the same time.

We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po-&gt;rollover that was set by another thread.

Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.

Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d199fab63c11998a602205f7ee7ff7c05c97164b ]

Multiple threads can call fanout_add() at the same time.

We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po-&gt;rollover that was set by another thread.

Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.

Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: round up linear to header len</title>
<updated>2017-02-18T15:39:27+00:00</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2017-02-07T20:57:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9117c897c9aa213fa0f682720f88a259b7ce7af9'/>
<id>9117c897c9aa213fa0f682720f88a259b7ce7af9</id>
<content type='text'>
[ Upstream commit 57031eb794906eea4e1c7b31dc1e2429c0af0c66 ]

Link layer protocols may unconditionally pull headers, as Ethernet
does in eth_type_trans. Ensure that the entire link layer header
always lies in the skb linear segment. tpacket_snd has such a check.
Extend this to packet_snd.

Variable length link layer headers complicate the computation
somewhat. Here skb-&gt;len may be smaller than dev-&gt;hard_header_len.

Round up the linear length to be at least as long as the smallest of
the two.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 57031eb794906eea4e1c7b31dc1e2429c0af0c66 ]

Link layer protocols may unconditionally pull headers, as Ethernet
does in eth_type_trans. Ensure that the entire link layer header
always lies in the skb linear segment. tpacket_snd has such a check.
Extend this to packet_snd.

Variable length link layer headers complicate the computation
somewhat. Here skb-&gt;len may be smaller than dev-&gt;hard_header_len.

Round up the linear length to be at least as long as the smallest of
the two.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
