<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net, branch v4.6-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2016-04-21T19:57:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-04-21T19:57:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c5edde3a81149d29ceae4221f09f4c7bc2f70846'/>
<id>c5edde3a81149d29ceae4221f09f4c7bc2f70846</id>
<content type='text'>
Pull networking fixes from David Miller:

 1) Fix memory leak in iwlwifi, from Matti Gottlieb.

 2) Add missing registration of netfilter arp_tables into initial
    namespace, from Florian Westphal.

 3) Fix potential NULL deref in DecNET routing code.

 4) Restrict NETLINK_URELEASE to truly bound sockets only, from Dmitry
    Ivanov.

 5) Fix dst ref counting in VRF, from David Ahern.

 6) Fix TSO segmenting limits in i40e driver, from Alexander Duyck.

 7) Fix heap leak in PACKET_DIAG_MCLIST, from Mathias Krause.

 8) Revalidate IPV6 datagram socket cached routes properly, particularly
    with UDP, from Martin KaFai Lau.

 9) Fix endian bug in RDS dp_ack_seq handling, from Qing Huang.

10) Fix stats typing in bcmgenet driver, from Eric Dumazet.

11) Openvswitch needs to orphan SKBs before ipv6 fragmentation handling,
    from Joe Stringer.

12) SPI device reference leak in spi_ks8895 PHY driver, from Mark Brown.

13) atl2 doesn't actually support scatter-gather, so don't advertise the
    feature.  From Ben Hutchings.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
  openvswitch: use flow protocol when recalculating ipv6 checksums
  Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets
  atl2: Disable unimplemented scatter/gather feature
  net/mlx4_en: Split SW RX dropped counter per RX ring
  net/mlx4_core: Don't allow to VF change global pause settings
  net/mlx4_core: Avoid repeated calls to pci enable/disable
  net/mlx4_core: Implement pci_resume callback
  net: phy: spi_ks8895: Don't leak references to SPI devices
  net: ethernet: davinci_emac: Fix platform_data overwrite
  net: ethernet: davinci_emac: Fix Unbalanced pm_runtime_enable
  qede: Fix single MTU sized packet from firmware GRO flow
  qede: Fix setting Skb network header
  qede: Fix various memory allocation error flows for fastpath
  tcp: Merge tx_flags and tskey in tcp_shifted_skb
  tcp: Merge tx_flags and tskey in tcp_collapse_retrans
  drivers: net: cpsw: fix wrong regs access in cpsw_ndo_open
  tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks
  openvswitch: Orphan skbs before IPv6 defrag
  Revert "Prevent NUll pointer dereference with two PHYs on cpsw"
  VSOCK: Only check error on skb_recv_datagram when skb is NULL
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking fixes from David Miller:

 1) Fix memory leak in iwlwifi, from Matti Gottlieb.

 2) Add missing registration of netfilter arp_tables into initial
    namespace, from Florian Westphal.

 3) Fix potential NULL deref in DecNET routing code.

 4) Restrict NETLINK_URELEASE to truly bound sockets only, from Dmitry
    Ivanov.

 5) Fix dst ref counting in VRF, from David Ahern.

 6) Fix TSO segmenting limits in i40e driver, from Alexander Duyck.

 7) Fix heap leak in PACKET_DIAG_MCLIST, from Mathias Krause.

 8) Revalidate IPV6 datagram socket cached routes properly, particularly
    with UDP, from Martin KaFai Lau.

 9) Fix endian bug in RDS dp_ack_seq handling, from Qing Huang.

10) Fix stats typing in bcmgenet driver, from Eric Dumazet.

11) Openvswitch needs to orphan SKBs before ipv6 fragmentation handling,
    from Joe Stringer.

12) SPI device reference leak in spi_ks8895 PHY driver, from Mark Brown.

13) atl2 doesn't actually support scatter-gather, so don't advertise the
    feature.  From Ben Hutchings.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
  openvswitch: use flow protocol when recalculating ipv6 checksums
  Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets
  atl2: Disable unimplemented scatter/gather feature
  net/mlx4_en: Split SW RX dropped counter per RX ring
  net/mlx4_core: Don't allow to VF change global pause settings
  net/mlx4_core: Avoid repeated calls to pci enable/disable
  net/mlx4_core: Implement pci_resume callback
  net: phy: spi_ks8895: Don't leak references to SPI devices
  net: ethernet: davinci_emac: Fix platform_data overwrite
  net: ethernet: davinci_emac: Fix Unbalanced pm_runtime_enable
  qede: Fix single MTU sized packet from firmware GRO flow
  qede: Fix setting Skb network header
  qede: Fix various memory allocation error flows for fastpath
  tcp: Merge tx_flags and tskey in tcp_shifted_skb
  tcp: Merge tx_flags and tskey in tcp_collapse_retrans
  drivers: net: cpsw: fix wrong regs access in cpsw_ndo_open
  tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks
  openvswitch: Orphan skbs before IPv6 defrag
  Revert "Prevent NUll pointer dereference with two PHYs on cpsw"
  VSOCK: Only check error on skb_recv_datagram when skb is NULL
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: use flow protocol when recalculating ipv6 checksums</title>
<updated>2016-04-21T19:28:47+00:00</updated>
<author>
<name>Simon Horman</name>
<email>simon.horman@netronome.com</email>
</author>
<published>2016-04-21T01:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b4f70527f052b0c00be4d7cac562baa75b212df5'/>
<id>b4f70527f052b0c00be4d7cac562baa75b212df5</id>
<content type='text'>
When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.

This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.

Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
Cc: Jarno Rajahalme &lt;jrajahalme@nicira.com&gt;
Signed-off-by: Simon Horman &lt;simon.horman@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.

This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.

Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
Cc: Jarno Rajahalme &lt;jrajahalme@nicira.com&gt;
Signed-off-by: Simon Horman &lt;simon.horman@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Merge tx_flags and tskey in tcp_shifted_skb</title>
<updated>2016-04-21T18:40:55+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2016-04-20T05:39:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cfea5a688eb37bcd1081255df9f9f777f4e61999'/>
<id>cfea5a688eb37bcd1081255df9f9f777f4e61999</id>
<content type='text'>
After receiving sacks, tcp_shifted_skb() will collapse
skbs if possible.  tx_flags and tskey also have to be
merged.

This patch reuses the tcp_skb_collapse_tstamp() to handle
them.

BPF Output Before:
~~~~~
&lt;no-output-due-to-missing-tstamp-event&gt;

BPF Output After:
~~~~~
&lt;...&gt;-2024  [007] d.s.    88.644374: : ee_data:14599

Packetdrill Script:
~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 1460) = 1460
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 13140) = 13140

0.200 &gt; P. 1:1461(1460) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:14601(5840) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:14601,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 14601 win 257

0.400 close(4) = 0
0.400 &gt; F. 14601:14601(0) ack 1
0.500 &lt; F. 1:1(0) ack 14602 win 257
0.500 &gt; . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After receiving sacks, tcp_shifted_skb() will collapse
skbs if possible.  tx_flags and tskey also have to be
merged.

This patch reuses the tcp_skb_collapse_tstamp() to handle
them.

BPF Output Before:
~~~~~
&lt;no-output-due-to-missing-tstamp-event&gt;

BPF Output After:
~~~~~
&lt;...&gt;-2024  [007] d.s.    88.644374: : ee_data:14599

Packetdrill Script:
~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 1460) = 1460
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 13140) = 13140

0.200 &gt; P. 1:1461(1460) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:14601(5840) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:14601,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 14601 win 257

0.400 close(4) = 0
0.400 &gt; F. 14601:14601(0) ack 1
0.500 &lt; F. 1:1(0) ack 14602 win 257
0.500 &gt; . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Merge tx_flags and tskey in tcp_collapse_retrans</title>
<updated>2016-04-21T18:40:55+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2016-04-20T05:39:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=082ac2d51d9f19ec1c29bdaaaf7fb49889e4fade'/>
<id>082ac2d51d9f19ec1c29bdaaaf7fb49889e4fade</id>
<content type='text'>
If two skbs are merged/collapsed during retransmission, the current
logic does not merge the tx_flags and tskey.  The end result is
the SCM_TSTAMP_ACK timestamp could be missing for a packet.

The patch:
1. Merge the tx_flags
2. Overwrite the prev_skb's tskey with the next_skb's tskey

BPF Output Before:
~~~~~~
&lt;no-output-due-to-missing-tstamp-event&gt;

BPF Output After:
~~~~~~
packetdrill-2092  [001] d.s.   453.998486: : ee_data:1459

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
0.200 write(4, ..., 11680) = 11680
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0

0.200 &gt; P. 1:731(730) ack 1
0.200 &gt; P. 731:1461(730) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:13141(4380) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:2921,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:4381,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:5841,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 13141 win 257

0.400 close(4) = 0
0.400 &gt; F. 13141:13141(0) ack 1
0.500 &lt; F. 1:1(0) ack 13142 win 257
0.500 &gt; . 13142:13142(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If two skbs are merged/collapsed during retransmission, the current
logic does not merge the tx_flags and tskey.  The end result is
the SCM_TSTAMP_ACK timestamp could be missing for a packet.

The patch:
1. Merge the tx_flags
2. Overwrite the prev_skb's tskey with the next_skb's tskey

BPF Output Before:
~~~~~~
&lt;no-output-due-to-missing-tstamp-event&gt;

BPF Output After:
~~~~~~
packetdrill-2092  [001] d.s.   453.998486: : ee_data:1459

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
0.200 write(4, ..., 11680) = 11680
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0

0.200 &gt; P. 1:731(730) ack 1
0.200 &gt; P. 731:1461(730) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:13141(4380) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:2921,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:4381,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:5841,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 13141 win 257

0.400 close(4) = 0
0.400 &gt; F. 13141:13141(0) ack 1
0.500 &lt; F. 1:1(0) ack 13142 win 257
0.500 &gt; . 13142:13142(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks</title>
<updated>2016-04-21T17:45:43+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2016-04-18T22:39:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=479f85c36688f5c855ad463b71902ef5992628b7'/>
<id>479f85c36688f5c855ad463b71902ef5992628b7</id>
<content type='text'>
Assume SOF_TIMESTAMPING_TX_ACK is on. When dup acks are received,
the stack could incorrectly conclude that an skb has already
been acked and queue a SCM_TSTAMP_ACK cmsg to the
sk-&gt;sk_error_queue.

In tcp_ack_tstamp(), it checks
'between(shinfo-&gt;tskey, prior_snd_una, tcp_sk(sk)-&gt;snd_una - 1)'.
If prior_snd_una == tcp_sk(sk)-&gt;snd_una like the following packetdrill
script, between() returns true but the tskey is actually not acked.
e.g. try between(3, 2, 1).

The fix is to replace between() with one before() and one !before().
By doing this, the -1 offset on the tcp_sk(sk)-&gt;snd_una can also be
removed.
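
The wrap-around behaviour described above can be checked with a small
model. This is only a sketch in Python, not the kernel code: between()
and before() are re-implemented from their usual u32 definitions, and
acked_old() and acked_new() are hypothetical names for the check before
and after the fix as this message describes it.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def between(seq1, seq2, seq3):
    # Model of the u32 helper: seq1 lies in [seq2, seq3], modulo 2**32.
    return (seq3 - seq2) % MOD >= (seq1 - seq2) % MOD


def before(seq1, seq2):
    # Model of the signed-difference helper: seq1 precedes seq2.
    return (seq1 - seq2) % MOD >= 2 ** 31


def acked_old(tskey, prior_snd_una, snd_una):
    # Check before the fix (hypothetical name).
    return between(tskey, prior_snd_una, snd_una - 1)


def acked_new(tskey, prior_snd_una, snd_una):
    # Check after the fix (hypothetical name): no -1 offset needed.
    return (not before(tskey, prior_snd_una)) and before(tskey, snd_una)


if __name__ == "__main__":
    print(between(3, 2, 1))           # True: the range end wrapped around
    # Dup ack: snd_una did not advance, so nothing was newly acked.
    print(acked_old(1459, 1, 1))      # True, a spurious timestamp
    print(acked_new(1459, 1, 1))      # False
    # Cumulative ack up to 14601 covers tskey 1459.
    print(acked_new(1459, 1, 14601))  # True
```

With prior_snd_una equal to snd_una, the old range end becomes
prior_snd_una - 1, which the unsigned comparison treats as almost the
whole sequence space, so nearly every tskey passes.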

A packetdrill script is used to reproduce the dup ack scenario.
Because packetdrill appears to lack cmsg support (maybe I
cannot find it), a BPF prog is used to kprobe
sock_queue_err_skb() and print out the value of
serr-&gt;ee.ee_data.

Both the packetdrill and the bcc BPF scripts are attached at the end of
this commit message.

BPF Output Before Fix:
~~~~~~
      &lt;...&gt;-2056  [001] d.s.   433.927987: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   433.929563: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   433.930765: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   434.028177: : ee_data:1459
packetdrill-2056  [001] d.s.   434.029686: : ee_data:14599

BPF Output After Fix:
~~~~~~
      &lt;...&gt;-2049  [000] d.s.   113.517039: : ee_data:1459
      &lt;...&gt;-2049  [000] d.s.   113.517253: : ee_data:14599

BCC BPF Script:
~~~~~~
#!/usr/bin/env python

from __future__ import print_function
from bcc import BPF

bpf_text = """
#include &lt;uapi/linux/ptrace.h&gt;
#include &lt;net/sock.h&gt;
#include &lt;bcc/proto.h&gt;
#include &lt;linux/errqueue.h&gt;

#ifdef memset
#undef memset
#endif

int trace_err_skb(struct pt_regs *ctx)
{
	struct sk_buff *skb = (struct sk_buff *)ctx-&gt;si;
	struct sock *sk = (struct sock *)ctx-&gt;di;
	struct sock_exterr_skb *serr;
	u32 ee_data = 0;

	if (!sk || !skb)
		return 0;

	serr = SKB_EXT_ERR(skb);
	bpf_probe_read(&amp;ee_data, sizeof(ee_data), &amp;serr-&gt;ee.ee_data);
	bpf_trace_printk("ee_data:%u\\n", ee_data);

	return 0;
};
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
print("Attached to kprobe")
b.trace_print()

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 1460) = 1460
0.200 write(4, ..., 13140) = 13140

0.200 &gt; P. 1:1461(1460) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:14601(5840) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:2921,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:4381,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:5841,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 14601 win 257

0.400 close(4) = 0
0.400 &gt; F. 14601:14601(0) ack 1
0.500 &lt; F. 1:1(0) ack 14602 win 257
0.500 &gt; . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil.kdev@gmail.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Assume SOF_TIMESTAMPING_TX_ACK is on. When dup acks are received,
the stack could incorrectly conclude that an skb has already
been acked and queue a SCM_TSTAMP_ACK cmsg to the
sk-&gt;sk_error_queue.

In tcp_ack_tstamp(), it checks
'between(shinfo-&gt;tskey, prior_snd_una, tcp_sk(sk)-&gt;snd_una - 1)'.
If prior_snd_una == tcp_sk(sk)-&gt;snd_una like the following packetdrill
script, between() returns true but the tskey is actually not acked.
e.g. try between(3, 2, 1).

The fix is to replace between() with one before() and one !before().
By doing this, the -1 offset on the tcp_sk(sk)-&gt;snd_una can also be
removed.

A packetdrill script is used to reproduce the dup ack scenario.
Because packetdrill appears to lack cmsg support (maybe I
cannot find it), a BPF prog is used to kprobe
sock_queue_err_skb() and print out the value of
serr-&gt;ee.ee_data.

Both the packetdrill and the bcc BPF scripts are attached at the end of
this commit message.

BPF Output Before Fix:
~~~~~~
      &lt;...&gt;-2056  [001] d.s.   433.927987: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   433.929563: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   433.930765: : ee_data:1459  #incorrect
packetdrill-2056  [001] d.s.   434.028177: : ee_data:1459
packetdrill-2056  [001] d.s.   434.029686: : ee_data:14599

BPF Output After Fix:
~~~~~~
      &lt;...&gt;-2049  [000] d.s.   113.517039: : ee_data:1459
      &lt;...&gt;-2049  [000] d.s.   113.517253: : ee_data:14599

BCC BPF Script:
~~~~~~
#!/usr/bin/env python

from __future__ import print_function
from bcc import BPF

bpf_text = """
#include &lt;uapi/linux/ptrace.h&gt;
#include &lt;net/sock.h&gt;
#include &lt;bcc/proto.h&gt;
#include &lt;linux/errqueue.h&gt;

#ifdef memset
#undef memset
#endif

int trace_err_skb(struct pt_regs *ctx)
{
	struct sk_buff *skb = (struct sk_buff *)ctx-&gt;si;
	struct sock *sk = (struct sock *)ctx-&gt;di;
	struct sock_exterr_skb *serr;
	u32 ee_data = 0;

	if (!sk || !skb)
		return 0;

	serr = SKB_EXT_ERR(skb);
	bpf_probe_read(&amp;ee_data, sizeof(ee_data), &amp;serr-&gt;ee.ee_data);
	bpf_trace_printk("ee_data:%u\\n", ee_data);

	return 0;
};
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
print("Attached to kprobe")
b.trace_print()

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 &lt; S 0:0(0) win 32792 &lt;mss 1460,sackOK,nop,nop,nop,wscale 7&gt;
0.100 &gt; S. 0:0(0) ack 1 &lt;mss 1460,nop,nop,sackOK,nop,wscale 7&gt;
0.200 &lt; . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 1460) = 1460
0.200 write(4, ..., 13140) = 13140

0.200 &gt; P. 1:1461(1460) ack 1
0.200 &gt; . 1461:8761(7300) ack 1
0.200 &gt; P. 8761:14601(5840) ack 1

0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:2921,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:4381,nop,nop&gt;
0.300 &lt; . 1:1(0) ack 1 win 257 &lt;sack 1461:5841,nop,nop&gt;
0.300 &gt; P. 1:1461(1460) ack 1
0.400 &lt; . 1:1(0) ack 14601 win 257

0.400 close(4) = 0
0.400 &gt; F. 14601:14601(0) ack 1
0.500 &lt; F. 1:1(0) ack 14602 win 257
0.500 &gt; . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Soheil Hassas Yeganeh &lt;soheil.kdev@gmail.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Tested-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: Orphan skbs before IPv6 defrag</title>
<updated>2016-04-21T17:42:05+00:00</updated>
<author>
<name>Joe Stringer</name>
<email>joe@ovn.org</email>
</author>
<published>2016-04-18T21:51:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=49e261a8a21e0960a3f7ff187a453ba1c1149053'/>
<id>49e261a8a21e0960a3f7ff187a453ba1c1149053</id>
<content type='text'>
This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
orphan skbs inside ip_defrag()").

Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
cloned (implicitly orphaning) prior to queueing for reassembly. As such,
when the IPv6 message is eventually reassembled, the skb-&gt;sk for all
fragments would be NULL. After that commit was introduced, rather than
cloning, the original skbs were queued directly without orphaning. The
end result is that all frags except for the first and last may have a
socket attached.

This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
prevent BUG_ON(skb-&gt;sk) during a later call to ip6_fragment().

kernel BUG at net/ipv6/ip6_output.c:631!
[...]
Call Trace:
 &lt;IRQ&gt;
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffffa042c7c0&gt;] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
 [&lt;ffffffff810bb8a2&gt;] ? __lock_is_held+0x52/0x70
 [&lt;ffffffffa042c587&gt;] ovs_fragment+0x1f7/0x280 [openvswitch]
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff817be416&gt;] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [&lt;ffffffff81697ea0&gt;] ? dst_discard_out+0x20/0x20
 [&lt;ffffffff81697e80&gt;] ? dst_ifdown+0x80/0x80
 [&lt;ffffffffa042c703&gt;] do_output.isra.28+0xf3/0x1b0 [openvswitch]
 [&lt;ffffffffa042d279&gt;] do_execute_actions+0x709/0x12c0 [openvswitch]
 [&lt;ffffffffa04340a4&gt;] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
 [&lt;ffffffffa04340d1&gt;] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
 [&lt;ffffffff817be387&gt;] ? _raw_spin_unlock+0x27/0x40
 [&lt;ffffffffa042de75&gt;] ovs_execute_actions+0x45/0x120 [openvswitch]
 [&lt;ffffffffa0432d65&gt;] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [&lt;ffffffff817be387&gt;] ? _raw_spin_unlock+0x27/0x40
 [&lt;ffffffffa042def4&gt;] ovs_execute_actions+0xc4/0x120 [openvswitch]
 [&lt;ffffffffa0432d65&gt;] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [&lt;ffffffffa04337f2&gt;] ? key_extract+0x442/0xc10 [openvswitch]
 [&lt;ffffffffa043b26d&gt;] ovs_vport_receive+0x5d/0xb0 [openvswitch]
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff817be416&gt;] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [&lt;ffffffffa043c11d&gt;] internal_dev_xmit+0x6d/0x150 [openvswitch]
 [&lt;ffffffffa043c0b5&gt;] ? internal_dev_xmit+0x5/0x150 [openvswitch]
 [&lt;ffffffff8168fb5f&gt;] dev_hard_start_xmit+0x2df/0x660
 [&lt;ffffffff8168f5ea&gt;] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
 [&lt;ffffffff81690925&gt;] __dev_queue_xmit+0x8f5/0x950
 [&lt;ffffffff81690080&gt;] ? __dev_queue_xmit+0x50/0x950
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff81690990&gt;] dev_queue_xmit+0x10/0x20
 [&lt;ffffffff8169a418&gt;] neigh_resolve_output+0x178/0x220
 [&lt;ffffffff81752759&gt;] ? ip6_finish_output2+0x219/0x7b0
 [&lt;ffffffff81752759&gt;] ip6_finish_output2+0x219/0x7b0
 [&lt;ffffffff817525a5&gt;] ? ip6_finish_output2+0x65/0x7b0
 [&lt;ffffffff816cde2b&gt;] ? ip_idents_reserve+0x6b/0x80
 [&lt;ffffffff8175488f&gt;] ? ip6_fragment+0x93f/0xc50
 [&lt;ffffffff81754af1&gt;] ip6_fragment+0xba1/0xc50
 [&lt;ffffffff81752540&gt;] ? ip6_flush_pending_frames+0x40/0x40
 [&lt;ffffffff81754c6b&gt;] ip6_finish_output+0xcb/0x1d0
 [&lt;ffffffff81754dcf&gt;] ip6_output+0x5f/0x1a0
 [&lt;ffffffff81754ba0&gt;] ? ip6_fragment+0xc50/0xc50
 [&lt;ffffffff81797fbd&gt;] ip6_local_out+0x3d/0x80
 [&lt;ffffffff817554df&gt;] ip6_send_skb+0x2f/0xc0
 [&lt;ffffffff817555bd&gt;] ip6_push_pending_frames+0x4d/0x50
 [&lt;ffffffff817796cc&gt;] icmpv6_push_pending_frames+0xac/0xe0
 [&lt;ffffffff8177a4be&gt;] icmpv6_echo_reply+0x42e/0x500
 [&lt;ffffffff8177acbf&gt;] icmpv6_rcv+0x4cf/0x580
 [&lt;ffffffff81755ac7&gt;] ip6_input_finish+0x1a7/0x690
 [&lt;ffffffff81755925&gt;] ? ip6_input_finish+0x5/0x690
 [&lt;ffffffff817567a0&gt;] ip6_input+0x30/0xa0
 [&lt;ffffffff81755920&gt;] ? ip6_rcv_finish+0x1a0/0x1a0
 [&lt;ffffffff817557ce&gt;] ip6_rcv_finish+0x4e/0x1a0
 [&lt;ffffffff8175640f&gt;] ipv6_rcv+0x45f/0x7c0
 [&lt;ffffffff81755fe6&gt;] ? ipv6_rcv+0x36/0x7c0
 [&lt;ffffffff81755780&gt;] ? ip6_make_skb+0x1c0/0x1c0
 [&lt;ffffffff8168b649&gt;] __netif_receive_skb_core+0x229/0xb80
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff8168c07f&gt;] ? process_backlog+0x6f/0x230
 [&lt;ffffffff8168bfb6&gt;] __netif_receive_skb+0x16/0x70
 [&lt;ffffffff8168c088&gt;] process_backlog+0x78/0x230
 [&lt;ffffffff8168c0ed&gt;] ? process_backlog+0xdd/0x230
 [&lt;ffffffff8168db43&gt;] net_rx_action+0x203/0x480
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff817c156e&gt;] __do_softirq+0xde/0x49f
 [&lt;ffffffff81752768&gt;] ? ip6_finish_output2+0x228/0x7b0
 [&lt;ffffffff817c070c&gt;] do_softirq_own_stack+0x1c/0x30
 &lt;EOI&gt;
 [&lt;ffffffff8106f88b&gt;] do_softirq.part.18+0x3b/0x40
 [&lt;ffffffff8106f946&gt;] __local_bh_enable_ip+0xb6/0xc0
 [&lt;ffffffff81752791&gt;] ip6_finish_output2+0x251/0x7b0
 [&lt;ffffffff81754af1&gt;] ? ip6_fragment+0xba1/0xc50
 [&lt;ffffffff816cde2b&gt;] ? ip_idents_reserve+0x6b/0x80
 [&lt;ffffffff8175488f&gt;] ? ip6_fragment+0x93f/0xc50
 [&lt;ffffffff81754af1&gt;] ip6_fragment+0xba1/0xc50
 [&lt;ffffffff81752540&gt;] ? ip6_flush_pending_frames+0x40/0x40
 [&lt;ffffffff81754c6b&gt;] ip6_finish_output+0xcb/0x1d0
 [&lt;ffffffff81754dcf&gt;] ip6_output+0x5f/0x1a0
 [&lt;ffffffff81754ba0&gt;] ? ip6_fragment+0xc50/0xc50
 [&lt;ffffffff81797fbd&gt;] ip6_local_out+0x3d/0x80
 [&lt;ffffffff817554df&gt;] ip6_send_skb+0x2f/0xc0
 [&lt;ffffffff817555bd&gt;] ip6_push_pending_frames+0x4d/0x50
 [&lt;ffffffff81778558&gt;] rawv6_sendmsg+0xa28/0xe30
 [&lt;ffffffff81719097&gt;] ? inet_sendmsg+0xc7/0x1d0
 [&lt;ffffffff817190d6&gt;] inet_sendmsg+0x106/0x1d0
 [&lt;ffffffff81718fd5&gt;] ? inet_sendmsg+0x5/0x1d0
 [&lt;ffffffff8166d078&gt;] sock_sendmsg+0x38/0x50
 [&lt;ffffffff8166d4d6&gt;] SYSC_sendto+0xf6/0x170
 [&lt;ffffffff8100201b&gt;] ? trace_hardirqs_on_thunk+0x1b/0x1d
 [&lt;ffffffff8166e38e&gt;] SyS_sendto+0xe/0x10
 [&lt;ffffffff817bebe5&gt;] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a &lt;0f&gt; 0b 41 8b 86 cc 00 00 00 49 8#
RIP  [&lt;ffffffff8175468a&gt;] ip6_fragment+0x73a/0xc50
 RSP &lt;ffff880072803120&gt;

Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
operations")
Reported-by: Daniele Di Proietto &lt;diproiettod@vmware.com&gt;
Signed-off-by: Joe Stringer &lt;joe@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
orphan skbs inside ip_defrag()").

Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
cloned (implicitly orphaning) prior to queueing for reassembly. As such,
when the IPv6 message is eventually reassembled, the skb-&gt;sk for all
fragments would be NULL. After that commit was introduced, rather than
cloning, the original skbs were queued directly without orphaning. The
end result is that all frags except for the first and last may have a
socket attached.

This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
prevent BUG_ON(skb-&gt;sk) during a later call to ip6_fragment().

kernel BUG at net/ipv6/ip6_output.c:631!
[...]
Call Trace:
 &lt;IRQ&gt;
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffffa042c7c0&gt;] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
 [&lt;ffffffff810bb8a2&gt;] ? __lock_is_held+0x52/0x70
 [&lt;ffffffffa042c587&gt;] ovs_fragment+0x1f7/0x280 [openvswitch]
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff817be416&gt;] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [&lt;ffffffff81697ea0&gt;] ? dst_discard_out+0x20/0x20
 [&lt;ffffffff81697e80&gt;] ? dst_ifdown+0x80/0x80
 [&lt;ffffffffa042c703&gt;] do_output.isra.28+0xf3/0x1b0 [openvswitch]
 [&lt;ffffffffa042d279&gt;] do_execute_actions+0x709/0x12c0 [openvswitch]
 [&lt;ffffffffa04340a4&gt;] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
 [&lt;ffffffffa04340d1&gt;] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
 [&lt;ffffffff817be387&gt;] ? _raw_spin_unlock+0x27/0x40
 [&lt;ffffffffa042de75&gt;] ovs_execute_actions+0x45/0x120 [openvswitch]
 [&lt;ffffffffa0432d65&gt;] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [&lt;ffffffff817be387&gt;] ? _raw_spin_unlock+0x27/0x40
 [&lt;ffffffffa042def4&gt;] ovs_execute_actions+0xc4/0x120 [openvswitch]
 [&lt;ffffffffa0432d65&gt;] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [&lt;ffffffffa04337f2&gt;] ? key_extract+0x442/0xc10 [openvswitch]
 [&lt;ffffffffa043b26d&gt;] ovs_vport_receive+0x5d/0xb0 [openvswitch]
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff810be8f7&gt;] ? __lock_acquire+0x927/0x20a0
 [&lt;ffffffff817be416&gt;] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [&lt;ffffffffa043c11d&gt;] internal_dev_xmit+0x6d/0x150 [openvswitch]
 [&lt;ffffffffa043c0b5&gt;] ? internal_dev_xmit+0x5/0x150 [openvswitch]
 [&lt;ffffffff8168fb5f&gt;] dev_hard_start_xmit+0x2df/0x660
 [&lt;ffffffff8168f5ea&gt;] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
 [&lt;ffffffff81690925&gt;] __dev_queue_xmit+0x8f5/0x950
 [&lt;ffffffff81690080&gt;] ? __dev_queue_xmit+0x50/0x950
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff81690990&gt;] dev_queue_xmit+0x10/0x20
 [&lt;ffffffff8169a418&gt;] neigh_resolve_output+0x178/0x220
 [&lt;ffffffff81752759&gt;] ? ip6_finish_output2+0x219/0x7b0
 [&lt;ffffffff81752759&gt;] ip6_finish_output2+0x219/0x7b0
 [&lt;ffffffff817525a5&gt;] ? ip6_finish_output2+0x65/0x7b0
 [&lt;ffffffff816cde2b&gt;] ? ip_idents_reserve+0x6b/0x80
 [&lt;ffffffff8175488f&gt;] ? ip6_fragment+0x93f/0xc50
 [&lt;ffffffff81754af1&gt;] ip6_fragment+0xba1/0xc50
 [&lt;ffffffff81752540&gt;] ? ip6_flush_pending_frames+0x40/0x40
 [&lt;ffffffff81754c6b&gt;] ip6_finish_output+0xcb/0x1d0
 [&lt;ffffffff81754dcf&gt;] ip6_output+0x5f/0x1a0
 [&lt;ffffffff81754ba0&gt;] ? ip6_fragment+0xc50/0xc50
 [&lt;ffffffff81797fbd&gt;] ip6_local_out+0x3d/0x80
 [&lt;ffffffff817554df&gt;] ip6_send_skb+0x2f/0xc0
 [&lt;ffffffff817555bd&gt;] ip6_push_pending_frames+0x4d/0x50
 [&lt;ffffffff817796cc&gt;] icmpv6_push_pending_frames+0xac/0xe0
 [&lt;ffffffff8177a4be&gt;] icmpv6_echo_reply+0x42e/0x500
 [&lt;ffffffff8177acbf&gt;] icmpv6_rcv+0x4cf/0x580
 [&lt;ffffffff81755ac7&gt;] ip6_input_finish+0x1a7/0x690
 [&lt;ffffffff81755925&gt;] ? ip6_input_finish+0x5/0x690
 [&lt;ffffffff817567a0&gt;] ip6_input+0x30/0xa0
 [&lt;ffffffff81755920&gt;] ? ip6_rcv_finish+0x1a0/0x1a0
 [&lt;ffffffff817557ce&gt;] ip6_rcv_finish+0x4e/0x1a0
 [&lt;ffffffff8175640f&gt;] ipv6_rcv+0x45f/0x7c0
 [&lt;ffffffff81755fe6&gt;] ? ipv6_rcv+0x36/0x7c0
 [&lt;ffffffff81755780&gt;] ? ip6_make_skb+0x1c0/0x1c0
 [&lt;ffffffff8168b649&gt;] __netif_receive_skb_core+0x229/0xb80
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff8168c07f&gt;] ? process_backlog+0x6f/0x230
 [&lt;ffffffff8168bfb6&gt;] __netif_receive_skb+0x16/0x70
 [&lt;ffffffff8168c088&gt;] process_backlog+0x78/0x230
 [&lt;ffffffff8168c0ed&gt;] ? process_backlog+0xdd/0x230
 [&lt;ffffffff8168db43&gt;] net_rx_action+0x203/0x480
 [&lt;ffffffff810bdab5&gt;] ? mark_held_locks+0x75/0xa0
 [&lt;ffffffff817c156e&gt;] __do_softirq+0xde/0x49f
 [&lt;ffffffff81752768&gt;] ? ip6_finish_output2+0x228/0x7b0
 [&lt;ffffffff817c070c&gt;] do_softirq_own_stack+0x1c/0x30
 &lt;EOI&gt;
 [&lt;ffffffff8106f88b&gt;] do_softirq.part.18+0x3b/0x40
 [&lt;ffffffff8106f946&gt;] __local_bh_enable_ip+0xb6/0xc0
 [&lt;ffffffff81752791&gt;] ip6_finish_output2+0x251/0x7b0
 [&lt;ffffffff81754af1&gt;] ? ip6_fragment+0xba1/0xc50
 [&lt;ffffffff816cde2b&gt;] ? ip_idents_reserve+0x6b/0x80
 [&lt;ffffffff8175488f&gt;] ? ip6_fragment+0x93f/0xc50
 [&lt;ffffffff81754af1&gt;] ip6_fragment+0xba1/0xc50
 [&lt;ffffffff81752540&gt;] ? ip6_flush_pending_frames+0x40/0x40
 [&lt;ffffffff81754c6b&gt;] ip6_finish_output+0xcb/0x1d0
 [&lt;ffffffff81754dcf&gt;] ip6_output+0x5f/0x1a0
 [&lt;ffffffff81754ba0&gt;] ? ip6_fragment+0xc50/0xc50
 [&lt;ffffffff81797fbd&gt;] ip6_local_out+0x3d/0x80
 [&lt;ffffffff817554df&gt;] ip6_send_skb+0x2f/0xc0
 [&lt;ffffffff817555bd&gt;] ip6_push_pending_frames+0x4d/0x50
 [&lt;ffffffff81778558&gt;] rawv6_sendmsg+0xa28/0xe30
 [&lt;ffffffff81719097&gt;] ? inet_sendmsg+0xc7/0x1d0
 [&lt;ffffffff817190d6&gt;] inet_sendmsg+0x106/0x1d0
 [&lt;ffffffff81718fd5&gt;] ? inet_sendmsg+0x5/0x1d0
 [&lt;ffffffff8166d078&gt;] sock_sendmsg+0x38/0x50
 [&lt;ffffffff8166d4d6&gt;] SYSC_sendto+0xf6/0x170
 [&lt;ffffffff8100201b&gt;] ? trace_hardirqs_on_thunk+0x1b/0x1d
 [&lt;ffffffff8166e38e&gt;] SyS_sendto+0xe/0x10
 [&lt;ffffffff817bebe5&gt;] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a &lt;0f&gt; 0b 41 8b 86 cc 00 00 00 49 8#
RIP  [&lt;ffffffff8175468a&gt;] ip6_fragment+0x73a/0xc50
 RSP &lt;ffff880072803120&gt;

Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
operations")
Reported-by: Daniele Di Proietto &lt;diproiettod@vmware.com&gt;
Signed-off-by: Joe Stringer &lt;joe@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>VSOCK: Only check error on skb_recv_datagram when skb is NULL</title>
<updated>2016-04-20T00:42:01+00:00</updated>
<author>
<name>Jorgen Hansen</name>
<email>jhansen@vmware.com</email>
</author>
<published>2016-04-19T06:58:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9c995cc9a206a008699da82f6cd01e9b2615649a'/>
<id>9c995cc9a206a008699da82f6cd01e9b2615649a</id>
<content type='text'>
If skb_recv_datagram returns an skb, we should ignore the err
value returned. Otherwise, datagram receives will return EAGAIN
when they have to wait for a datagram.

Acked-by: Adit Ranadive &lt;aditr@vmware.com&gt;
Signed-off-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If skb_recv_datagram returns an skb, we should ignore the err
value returned. Otherwise, datagram receives will return EAGAIN
when they have to wait for a datagram.

Acked-by: Adit Ranadive &lt;aditr@vmware.com&gt;
Signed-off-by: Jorgen Hansen &lt;jhansen@vmware.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDS: Fix the atomicity for congestion map update</title>
<updated>2016-04-16T23:01:05+00:00</updated>
<author>
<name>santosh.shilimkar@oracle.com</name>
<email>santosh.shilimkar@oracle.com</email>
</author>
<published>2016-04-14T17:43:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e47db94e10447fc467777a40302f2b393e9af2fa'/>
<id>e47db94e10447fc467777a40302f2b393e9af2fa</id>
<content type='text'>
Two different threads with different rds sockets may be in
rds_recv_rcvbuf_delta() via the receive path. If their ports
both map to the same word in the congestion map, then
using non-atomic ops to update it could cause the map to
be incorrect. Let's use atomics to avoid such an issue.

Full credit to Wengang &lt;wen.gang.wang@oracle.com&gt; for
finding the issue, analysing it and also pointing out
the offending code with a spin-lock-based fix.

Reviewed-by: Leon Romanovsky &lt;leon@leon.nu&gt;
Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Signed-off-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Two different threads with different rds sockets may be in
rds_recv_rcvbuf_delta() via the receive path. If their ports
both map to the same word in the congestion map, then
using non-atomic ops to update it could cause the map to
be incorrect. Let's use atomics to avoid such an issue.

Full credit to Wengang &lt;wen.gang.wang@oracle.com&gt; for
finding the issue, analysing it and also pointing out
the offending code with a spin-lock-based fix.

Reviewed-by: Leon Romanovsky &lt;leon@leon.nu&gt;
Signed-off-by: Wengang Wang &lt;wen.gang.wang@oracle.com&gt;
Signed-off-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDS: fix endianness for dp_ack_seq</title>
<updated>2016-04-16T23:01:05+00:00</updated>
<author>
<name>Qing Huang</name>
<email>qing.huang@oracle.com</email>
</author>
<published>2016-04-14T17:43:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a7c556546f610a331c22cb7edd9d1afe63f0cd52'/>
<id>a7c556546f610a331c22cb7edd9d1afe63f0cd52</id>
<content type='text'>
dp-&gt;dp_ack_seq is used in big-endian format. We need to do the
big-endian conversion when we assign a value in host format
to it.

Signed-off-by: Qing Huang &lt;qing.huang@oracle.com&gt;
Signed-off-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dp-&gt;dp_ack_seq is used in big-endian format. We need to do the
big-endian conversion when we assign a value in host format
to it.

Signed-off-by: Qing Huang &lt;qing.huang@oracle.com&gt;
Signed-off-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vlan: pull on __vlan_insert_tag error path and fix csum correction</title>
<updated>2016-04-16T03:20:11+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2016-04-16T00:27:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9241e2df4fbc648a92ea0752918e05c26255649e'/>
<id>9241e2df4fbc648a92ea0752918e05c26255649e</id>
<content type='text'>
When __vlan_insert_tag() fails in the skb_vlan_push() path due to
skb_cow_head(), the error path also needs to undo the __skb_push()
that was done earlier to move the skb-&gt;data pointer to the mac header.

Moreover, I noticed that in the non-error path, when the __skb_pull()
is done and the original offset to the mac header was non-zero, we fix
up from a wrong skb-&gt;data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb-&gt;data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When __vlan_insert_tag() fails in the skb_vlan_push() path due to
skb_cow_head(), the error path also needs to undo the __skb_push()
that was done earlier to move the skb-&gt;data pointer to the mac header.

Moreover, I noticed that in the non-error path, when the __skb_pull()
is done and the original offset to the mac header was non-zero, we fix
up from a wrong skb-&gt;data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb-&gt;data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
