linux-toradex.git/net/core, branch v4.4.29

tunnels: Don't apply GRO to multiple layers of encapsulation.

2016-10-31T10:13:59+00:00

commit fac8e0f579695a3ecbc4d3cac369139d7f819971 upstream.

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross 
Signed-off-by: David S. Miller 
Signed-off-by: Juerg Haefliger 
Signed-off-by: Greg Kroah-Hartman
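
The guard described above can be modelled in user space. What follows
is a minimal sketch, not the kernel code: struct gro_state,
gro_enter_tunnel() and gro_accepted_layers() are hypothetical names
invented for illustration. It assumes a per-packet flag that the first
tunnel handler sets and that any further tunnel handler checks before
continuing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-packet GRO state. */
struct gro_state {
	bool encap_mark;	/* a tunnel header has already been seen */
	bool flush;		/* packet must not be aggregated */
};

/*
 * Called by each (hypothetical) tunnel receive handler before it
 * parses its own header. Returns true if aggregation may continue,
 * false if this would be a second layer of encapsulation.
 */
static bool gro_enter_tunnel(struct gro_state *st)
{
	assert(st != NULL);

	if (st->encap_mark) {
		st->flush = true;
		return false;
	}
	st->encap_mark = true;
	return true;
}

/* Walk a stack of tunnel headers; return how many layers were accepted. */
static size_t gro_accepted_layers(struct gro_state *st, size_t tunnel_layers)
{
	size_t accepted = 0;

	for (size_t i = 0; i < tunnel_layers; i++) {
		if (!gro_enter_tunnel(st))
			break;
		accepted++;
	}
	return accepted;
}
```

In this model the check is independent of the tunnel type, which is the
generalization the message describes: any second tunnel layer is
refused, not only UDP stacked on UDP.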

bonding: Fix bonding crash

2016-09-30T08:18:36+00:00

[ Upstream commit 24b27fc4cdf9e10c5e79e5923b6b7c2c5c95096c ]

The following steps will crash the kernel:

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      

Bonding does a lot of work before checking whether the device being
enslaved is busy.

In this case, when the notifier call chain sends notifications,
bond_netdev_event() assumes that rx_handler/rx_handler_data is
registered, while bond_enslave() has not progressed far enough to
register the rx_handler for the new slave.

This patch adds an rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.

Signed-off-by: Mahesh Bandewar 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
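
The ordering fix can be illustrated with a small user-space model.
This is a sketch under stated assumptions, not the bonding driver:
struct model_netdev, model_rx_handler_busy() and model_enslave() are
hypothetical names, and setup_steps_done merely stands in for the work
that bond_enslave() performs before registering its handler.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*rx_handler_fn)(void *pkt);	/* hypothetical handler type */

/* Hypothetical, heavily reduced model of a network device. */
struct model_netdev {
	rx_handler_fn rx_handler;	/* NULL when no handler is registered */
	int setup_steps_done;		/* counts side effects of enslaving */
};

/* Placeholder handlers for a macvlan and a bond. */
static int macvlan_rx(void *pkt) { (void)pkt; return 0; }
static int bond_rx(void *pkt) { (void)pkt; return 0; }

/* True when some other user already owns the device's rx_handler. */
static int model_rx_handler_busy(const struct model_netdev *dev)
{
	return dev->rx_handler != NULL;
}

/* Check for a busy device first, before any setup work is done. */
static int model_enslave(struct model_netdev *slave, rx_handler_fn handler)
{
	if (model_rx_handler_busy(slave))
		return -EBUSY;

	slave->setup_steps_done++;	/* stands in for the real setup work */
	slave->rx_handler = handler;
	return 0;
}
```

Because the check runs first, a device that already carries a macvlan
handler is rejected before any notifier could observe a half-enslaved
slave.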

net_sched: fix mirrored packets checksum

2016-07-27T16:47:31+00:00

[ Upstream commit 82a31b9231f02d9c1b7b290a46999d517b0d312a ]

Similar to commit 9b368814b336 ("net: fix bridge multicast packet checksum validation"),
we need to fix up the checksum for CHECKSUM_COMPLETE when
pushing the skb on the RX path. Otherwise we get similar splats.

Cc: Jamal Hadi Salim 
Cc: Tom Herbert 
Signed-off-by: Cong Wang 
Acked-by: Jamal Hadi Salim 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

packet: Use symmetric hash for PACKET_FANOUT_HASH.

2016-07-27T16:47:31+00:00

[ Upstream commit eb70db8756717b90c01ccc765fdefc4dd969fc74 ]

People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
they want packets going in both directions on a flow to hash to the
same bucket.

The core kernel SKB hash became non-symmetric when the ipv6 flow label
and other entities were incorporated into the standard flow hash order
to increase entropy.

But no users of PACKET_FANOUT_HASH want an asymmetric hash; they
all want a symmetric one.

Therefore, use the flow dissector to compute a flat symmetric hash
over only the protocol, addresses and ports.  This hash is not
installed into, and does not override, the normal skb hash, so this
change has no effect whatsoever on the rest of the stack.

Reported-by: Eric Leblond 
Tested-by: Eric Leblond 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
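
One common way to obtain a symmetric hash over protocol, addresses and
ports is to put the two endpoints into a canonical order before
hashing. The sketch below shows that technique in user space; struct
flow_key, mix32() and symmetric_flow_hash() are hypothetical names, the
key is IPv4-only for brevity, and the mixing function is an arbitrary
placeholder rather than the hash the kernel uses.

```c
#include <stdint.h>

/* Hypothetical IPv4-only flow key. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Arbitrary placeholder mixing step; not the kernel's hash. */
static uint32_t mix32(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x9E3779B1u;
	h ^= h >> 15;
	return h;
}

/*
 * Order the endpoints so that (saddr, sport) <= (daddr, dport), then
 * hash. Address and port are swapped together so each endpoint keeps
 * its own port.
 */
static uint32_t symmetric_flow_hash(struct flow_key k)
{
	if (k.saddr > k.daddr ||
	    (k.saddr == k.daddr && k.sport > k.dport)) {
		uint32_t a = k.saddr;
		uint16_t p = k.sport;

		k.saddr = k.daddr;
		k.daddr = a;
		k.sport = k.dport;
		k.dport = p;
	}

	uint32_t h = mix32(0, k.proto);

	h = mix32(h, k.saddr);
	h = mix32(h, k.daddr);
	h = mix32(h, ((uint32_t)k.sport << 16) | k.dport);
	return h;
}
```

Both directions of a flow reduce to the same canonical key, so they
produce the same value and land in the same fanout bucket.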

bpf: try harder on clones when writing into skb

2016-07-11T16:31:12+00:00

[ Upstream commit 3697649ff29e0f647565eed04b27a7779c646a22 ]

When we're dealing with clones and the area is not writeable, try
harder and get a copy via pskb_expand_head(). Also replace other
occurrences in tc actions with the new skb_try_make_writable().

Reported-by: Ashhad Sheikh 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()

2016-07-11T16:31:12+00:00

[ Upstream commit b560f03ddfb072bca65e9440ff0dc4f9b1d1f056 ]

neigh_xmit() expects to be called inside an RCU-bh read side critical
section, and while one of its two current callers gets this right, the
other one doesn't.

More specifically, neigh_xmit() has two callers, mpls_forward() and
mpls_output(), and while both callers call neigh_xmit() under
rcu_read_lock(), this provides sufficient protection for neigh_xmit()
only in the case of mpls_forward(), as that is always called from
softirq context and therefore doesn't need explicit BH protection,
while mpls_output() can be called from process context with softirqs
enabled.

When mpls_output() is called from process context, with softirqs
enabled, we can be preempted by a softirq at any time, and RCU-bh
considers the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq
while we are in the part of neigh_xmit() that expects to be run inside
an RCU-bh read side critical section, we can end up with an unexpected
RCU grace period running right in the middle of that critical section,
making things go boom.

This patch fixes this impedance mismatch in the callee, by making
neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that
expects to be treated as an RCU-bh read side critical section, as this
seems a safer option than fixing it in the callers.

Fixes: 4fd3d7d9e868f ("neigh: Add helper function neigh_xmit")
Signed-off-by: David Barroso 
Signed-off-by: Lennert Buytenhek 
Acked-by: David Ahern 
Acked-by: Robert Shearman 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: fix infoleak in rtnetlink

2016-05-19T00:06:41+00:00

[ Upstream commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 ]

The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by the compiler. These padding bytes are
not initialized and are sent out via “nla_put”.

Signed-off-by: Kangjie Lu 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
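
The usual remedy for this class of leak is to zero the whole object
before filling in its fields, so that the padding is cleared too. The
user-space sketch below assumes a layout chosen to match the figures in
the message (28 bytes of fields, padded to 32 on common 64-bit ABIs).
struct ifmap_model and fill_ifmap() are hypothetical names for
illustration only.

```c
#include <stdint.h>
#include <string.h>

/*
 * Assumed layout: three 64-bit fields plus 4 bytes of small fields.
 * On common 64-bit ABIs the compiler appends 4 bytes of tail padding.
 */
struct ifmap_model {
	uint64_t mem_start;
	uint64_t mem_end;
	uint64_t base_addr;
	uint16_t irq;
	uint8_t dma;
	uint8_t port;
};

/* Clear the whole object, padding included, then set the fields. */
static void fill_ifmap(struct ifmap_model *map, uint64_t mem_start,
		       uint64_t mem_end, uint64_t base_addr,
		       uint16_t irq, uint8_t dma, uint8_t port)
{
	memset(map, 0, sizeof(*map));
	map->mem_start = mem_start;
	map->mem_end = mem_end;
	map->base_addr = base_addr;
	map->irq = irq;
	map->dma = dma;
	map->port = port;
}
```

A field-by-field or designated initializer would leave the padding
bytes unspecified, which is why the sketch uses memset() on the whole
object before the object is copied out.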

vlan: pull on __vlan_insert_tag error path and fix csum correction

2016-05-19T00:06:36+00:00

[ Upstream commit 9241e2df4fbc648a92ea0752918e05c26255649e ]

When __vlan_insert_tag() fails in the skb_vlan_push() path due to
skb_cow_head(), we also need to undo, in the error path, the
__skb_push() that was done earlier to move the skb->data pointer to
the mac header.

Moreover, I noticed that in the non-error path, when the __skb_pull()
is done and the original offset to the mac header was non-zero, we fix
up from a wrong skb->data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb->data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann 
Reviewed-by: Jiri Pirko 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: use skb_postpush_rcsum instead of own implementations

2016-05-19T00:06:36+00:00

[ Upstream commit 6b83d28a55a891a9d70fc61ccb1c138e47dcbe74 ]

Replace individual implementations with the recently introduced
skb_postpush_rcsum() helper.

Signed-off-by: Daniel Borkmann 
Acked-by: Tom Herbert 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
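
The idea behind a post-push checksum fixup can be shown with plain
ones' complement arithmetic: when bytes are pushed in front of data
whose sum is already known, the sum of the pushed bytes is folded into
the stored value. The sketch below is a user-space model with
hypothetical names (csum16(), csum16_add(), model_postpush_rcsum()); it
works on an unfolded 16-bit sum, assumes the pushed region has even
length, and does not reproduce the kernel helper's types.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of big-endian 16-bit words, folded to 16 bits. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;	/* pad odd tail with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* end-around carry */
	return (uint16_t)sum;
}

/* Add two folded sums with end-around carry. */
static uint16_t csum16_add(uint16_t a, uint16_t b)
{
	uint32_t s = (uint32_t)a + b;

	return (uint16_t)((s & 0xffff) + (s >> 16));
}

/*
 * Model of a post-push fixup: 'old' covers the data after the pushed
 * region; the result covers pushed region plus data. 'len' must be
 * even so that word boundaries line up.
 */
static uint16_t model_postpush_rcsum(uint16_t old, const uint8_t *pushed,
				     size_t len)
{
	return csum16_add(old, csum16(pushed, len));
}
```

The fixup yields the same value as summing the whole frame from
scratch, but only touches the newly pushed bytes.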

tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter

2016-04-20T06:42:06+00:00

[ Upstream commit 5a5abb1fa3b05dd6aa821525832644c1e7d2905f ]

Sasha Levin reported a suspicious rcu_dereference_protected() warning
found while fuzzing with trinity that is similar to this one:

  [   52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
  [   52.765688] other info that might help us debug this:
  [   52.765695] rcu_scheduler_active = 1, debug_locks = 1
  [   52.765701] 1 lock held by a.out/1525:
  [   52.765704]  #0:  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
  [   52.765721] stack backtrace:
  [   52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
  [...]
  [   52.765768] Call Trace:
  [   52.765775]  [] dump_stack+0x85/0xc8
  [   52.765784]  [] lockdep_rcu_suspicious+0xd5/0x110
  [   52.765792]  [] sk_detach_filter+0x82/0x90
  [   52.765801]  [] tun_detach_filter+0x35/0x90 [tun]
  [   52.765810]  [] __tun_chr_ioctl+0x354/0x1130 [tun]
  [   52.765818]  [] ? selinux_file_ioctl+0x130/0x210
  [   52.765827]  [] tun_chr_ioctl+0x13/0x20 [tun]
  [   52.765834]  [] do_vfs_ioctl+0x96/0x690
  [   52.765843]  [] ? security_file_ioctl+0x43/0x60
  [   52.765850]  [] SyS_ioctl+0x79/0x90
  [   52.765858]  [] do_syscall_64+0x62/0x140
  [   52.765866]  [] entry_SYSCALL64_slow_path+0x25/0x25

The same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.

Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
fixes"), sk_attach_filter()/sk_detach_filter() now dereference the
filter with rcu_dereference_protected(), checking whether the socket
lock is held in the control path.

Since their introduction in 994051625981 ("tun: socket filter support"),
tap filters are managed under the RTNL lock from __tun_chr_ioctl(). Thus
the sock_owned_by_user(sk) check doesn't apply in this specific case and
therefore triggers the false positive.

Extend the BPF API with a __sk_attach_filter()/__sk_detach_filter() pair
that is used by tap filters and pass in lockdep_rtnl_is_held() for the
rcu_dereference_protected() checks instead.

Reported-by: Sasha Levin 
Signed-off-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman