linux-toradex.git/include, branch v4.4.35

tcp: take care of truncations done by sk_filter()

2016-11-21T09:06:40+00:00

[ Upstream commit ac6e780070e30e4c35bd395acfe9191e6268bdd3 ]

With syzkaller help, Marco Grassi found a bug in the TCP stack
that causes a crash in tcp_collapse().

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of the TCP header can be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq.
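
The two steps above can be sketched in user space as follows. This is a
simplified model, not the kernel patch: struct model_seg and model_trim()
are hypothetical names, and SYN/FIN sequence accounting is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a received TCP segment. */
struct model_seg {
	uint32_t seq;      /* sequence number of the first payload byte */
	uint32_t end_seq;  /* seq + payload length (SYN/FIN ignored) */
	uint32_t len;      /* header + payload bytes currently held */
	uint32_t hdr_len;  /* TCP header length in bytes */
};

/*
 * Trim the segment to at most filter_len bytes, as a truncating filter
 * might request, but never cut into the TCP header. Then shrink end_seq
 * by the number of payload bytes that were removed.
 */
static void model_trim(struct model_seg *seg, uint32_t filter_len)
{
	uint32_t old_len = seg->len;
	uint32_t new_len = filter_len;

	assert(seg->len >= seg->hdr_len);

	if (new_len < seg->hdr_len)
		new_len = seg->hdr_len;	/* cap: keep the whole header */

	if (new_len < old_len) {
		seg->len = new_len;
		seg->end_seq -= old_len - new_len;
	}
}
```

Without the end_seq adjustment, later code would believe the segment still
covers bytes that are no longer in the buffer.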

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet 
Reported-by: Marco Grassi 
Reported-by: Vladis Dronov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()

2016-11-21T09:06:39+00:00

[ Upstream commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 ]

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.
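
A minimal user-space sketch of why the clearing matters. All names here
(model_pkt, model_ip6cb, MODEL_CB_SIZE) are hypothetical and the layout is
illustrative only; it is not the kernel's struct inet6_skb_parm.

```c
#include <assert.h>
#include <string.h>

#define MODEL_CB_SIZE 48	/* illustrative size of the scratch area */

struct model_pkt {
	unsigned char cb[MODEL_CB_SIZE];	/* reused by each layer */
};

/* Hypothetical IPv6-layer view of the scratch area. */
struct model_ip6cb {
	unsigned short flags;
	unsigned short frag_max_size;
};

/* Called on the transmit path before the IPv6 layer reads the area. */
static void model_tunnel_xmit_prepare(struct model_pkt *pkt)
{
	memset(pkt->cb, 0, sizeof(pkt->cb));
}

/* Read frag_max_size the way the IPv6 layer would interpret the bytes. */
static unsigned short model_frag_max_size(const struct model_pkt *pkt)
{
	struct model_ip6cb view;

	assert(sizeof(view) <= sizeof(pkt->cb));
	memcpy(&view, pkt->cb, sizeof(view));
	return view.frag_max_size;
}
```

If an earlier layer left arbitrary bytes in cb, model_frag_max_size() returns
a meaningless non-zero value until model_tunnel_xmit_prepare() has run.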

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

udp: fix IP_CHECKSUM handling

2016-11-15T06:46:39+00:00

[ Upstream commit 10df8e6152c6c400a563a673e9956320bfce1871 ]

First bug was added in commit ad6f939ab193 ("ip: Add offset parameter to
ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on
AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));

Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before
queueing") forgot to adjust the offsets now that UDP headers are pulled
before skbs are put in the receive queue.

Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet 
Cc: Sam Kumar 
Cc: Willem de Bruijn 
Tested-by: Willem de Bruijn 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: add recursion limit to GRO

2016-11-15T06:46:38+00:00

[ Upstream commit fcd91dd449867c6bfe56a81cabba76b829fd05cd ]

Currently, GRO can do unlimited recursion through the gro_receive
handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem.  Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow.  When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally.  This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.
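
The idea can be sketched in user space as below. This is a toy model under
stated assumptions: the packet is reduced to an array of header type codes,
and model_gro_cb, model_gro_receive() and the limit value are hypothetical
names and numbers chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_GRO_RECURSION_LIMIT 15	/* illustrative value */

enum { MODEL_HDR_PAYLOAD = 0, MODEL_HDR_VLAN = 1 };

/* Per-packet control block carrying the recursion counter. */
struct model_gro_cb {
	unsigned int recursion_counter;
};

/*
 * Walk nested headers recursively, one call per encapsulation level.
 * Returns 0 when the innermost layer is reached (aggregation may go on)
 * and -1 when the limit is hit (caller falls back to normal processing).
 */
static int model_gro_receive(struct model_gro_cb *cb,
			     const unsigned char *hdrs, size_t n)
{
	assert(cb != NULL);

	if (n == 0 || hdrs[0] == MODEL_HDR_PAYLOAD)
		return 0;

	if (++cb->recursion_counter > MODEL_GRO_RECURSION_LIMIT)
		return -1;

	return model_gro_receive(cb, hdrs + 1, n - 1);
}
```

A packet made only of VLAN headers now stops after a bounded number of calls
instead of consuming stack in proportion to its length.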

Thanks to Vladimír Beneš for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca 
Reviewed-by: Jiri Benc 
Acked-by: Hannes Frederic Sowa 
Acked-by: Tom Herbert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

rtnetlink: Add rtnexthop offload flag to compare mask

2016-11-15T06:46:38+00:00

[ Upstream commit 85dda4e5b0ee1f5b4e8cc93d39e475006bc61ccd ]

The offload flag is a status flag and should not be used by
FIB semantics for comparison.
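
A sketch of what excluding a status flag from a comparison looks like. The
flag names and bit values below are placeholders for illustration, not the
kernel's RTNH_F_* definitions.

```c
#include <stdbool.h>

/* Illustrative flag bits (placeholder names and values). */
#define MODEL_F_DEAD     0x01u
#define MODEL_F_ONLINK   0x04u
#define MODEL_F_OFFLOAD  0x08u
#define MODEL_F_LINKDOWN 0x10u

/* Status flags describe run-time state, not the identity of a nexthop. */
#define MODEL_COMPARE_MASK \
	(MODEL_F_DEAD | MODEL_F_LINKDOWN | MODEL_F_OFFLOAD)

/* Two nexthops match if they differ only in masked-out status bits. */
static bool model_nh_flags_equal(unsigned int a, unsigned int b)
{
	return ((a ^ b) & ~MODEL_COMPARE_MASK) == 0;
}
```

With the offload bit in the mask, a nexthop that has been offloaded to
hardware still compares equal to the same nexthop as supplied by user space.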

Fixes: 37ed9493699c ("rtnetlink: add RTNH_F_EXTERNAL flag for fib offload")
Signed-off-by: Jiri Pirko 
Reviewed-by: Andy Gospodarek 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions

2016-11-15T06:46:37+00:00

[ Upstream commit f39acc84aad10710e89835c60d3b6694c43a8dd9 ]

Generic skb_vlan_push/skb_vlan_pop functions don't properly handle the
case where the input skb data pointer does not point at the mac header:

- They're doing push/pop, but fail to properly unwind data back to its
  original location.
  For example, in the skb_vlan_push case, any subsequent
  'skb_push(skb, skb->mac_len)' calls make the skb->data point 4 bytes
  BEFORE start of frame, leading to bogus frames that may be transmitted.

- They update rcsum per the added/removed 4 bytes tag.
  Alas if data is originally after the vlan/eth headers, then these
  bytes were already pulled out of the csum.

OTOH calling skb_vlan_push/skb_vlan_pop with skb->data at mac_header
presents no issues.

act_vlan is the only caller to skb_vlan_*() that has skb->data pointing
at network header (upon ingress).
Other callers (ovs, bpf) already adjust skb->data at mac_header.

This patch fixes act_vlan to point to the mac_header prior to calling
skb_vlan_*() functions, as other callers do.

Signed-off-by: Shmulik Ladkani 
Cc: Daniel Borkmann 
Cc: Pravin Shelar 
Cc: Jiri Pirko 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route

2016-11-15T06:46:37+00:00

[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid which was copied from in_skb's portid.
Since the skb is new the portid is 0 at that point so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]    [] dump_stack+0x85/0xc1
[ 7858.215662]  [] ___might_sleep+0x192/0x250
[ 7858.215868]  [] __might_sleep+0x6f/0x100
[ 7858.216072]  [] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [] netlink_unicast+0x19c/0x260
[ 7858.216900]  [] rtnl_unicast+0x20/0x30
[ 7858.217128]  [] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [] call_timer_fn+0xa5/0x350
[ 7858.218192]  [] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [] run_timer_softirq+0x260/0x640
[ 7858.218865]  [] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [] __do_softirq+0xe8/0x54f
[ 7858.219269]  [] irq_exit+0xb8/0xc0
[ 7858.219463]  [] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]    [] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [] default_idle+0x23/0x190
[ 7858.220574]  [] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [] default_idle_call+0x4c/0x60
[ 7858.221016]  [] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [] rest_init+0x135/0x140
[ 7858.221469]  [] start_kernel+0x50e/0x51b
[ 7858.221670]  [] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: avoid sk_forward_alloc overflows

2016-11-15T06:46:36+00:00

[ Upstream commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d ]

A malicious TCP receiver, sending SACK, can force the sender to split
skbs in write queue and increase its memory usage.

Then, when the socket is closed and its write queue purged, we might
overflow sk_forward_alloc (it becomes negative).

sk_mem_reclaim() does nothing in this case, and more than 2GB
are leaked from TCP perspective (tcp_memory_allocated is not changed)

Then warnings trigger from inet_sock_destruct() and
sk_stream_kill_queues() seeing a non-zero sk_forward_alloc.

The whole TCP stack can then be stuck because TCP is under memory pressure.

A simple fix is to preemptively reclaim from sk_mem_uncharge().

This makes sure a socket won't have more than 2 MB forward allocated,
after burst and idle period.
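
The preemptive reclaim can be sketched in user space as follows. This is a
byte-granular model with hypothetical names (model_sock, model_uncharge,
model_reclaim); the 2 MB threshold comes from the text above, while the
1 MB reclaim amount is an assumption of this sketch.

```c
#include <assert.h>

#define MODEL_RECLAIM_THRESHOLD (1 << 21)	/* 2 MB, per the text */
#define MODEL_RECLAIM_AMOUNT    (1 << 20)	/* 1 MB, assumed here */

struct model_sock {
	int forward_alloc;	/* bytes reserved by this socket, unused */
	long global_allocated;	/* protocol-wide accounting, in bytes */
};

/* Hand up to 'amount' reserved bytes back to the global pool. */
static void model_reclaim(struct model_sock *sk, int amount)
{
	if (amount > sk->forward_alloc)
		amount = sk->forward_alloc;
	sk->forward_alloc -= amount;
	sk->global_allocated -= amount;
}

/* Release 'size' bytes of in-use memory back to the socket's reserve. */
static void model_uncharge(struct model_sock *sk, int size)
{
	assert(size >= 0);
	sk->forward_alloc += size;

	/* Reclaim early so the signed counter can never creep toward 2 GB. */
	if (sk->forward_alloc >= MODEL_RECLAIM_THRESHOLD)
		model_reclaim(sk, MODEL_RECLAIM_AMOUNT);
}
```

As long as each individual uncharge is smaller than the reclaim amount, the
reserve stays below the threshold after every call, so purging a very large
write queue no longer accumulates into an overflow.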

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

pwm: Unexport children before chip removal

2016-11-10T15:36:37+00:00

commit 0733424c9ba9f42242409d1ece780777272f7ea1 upstream.

Exported pwm channels aren't removed before the pwmchip and are
leaked. This results in invalid sysfs files. This fix removes
all exported pwm channels before chip removal.

Signed-off-by: David Hsu 
Fixes: 76abbdde2d95 ("pwm: Add sysfs interface")
Signed-off-by: Thierry Reding 
Signed-off-by: Greg Kroah-Hartman

tunnels: Remove encapsulation offloads on decap.

2016-10-31T10:13:59+00:00

commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168 upstream.

If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.

This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.
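
A sketch of the stripping step in user space. The structure, function name
and bit values are placeholders for illustration; they do not reproduce the
kernel's SKB_GSO_* definitions.

```c
/* Illustrative offload-type bits (placeholder names and values). */
#define MODEL_GSO_TCPV4      0x01u
#define MODEL_GSO_GRE        0x02u
#define MODEL_GSO_UDP_TUNNEL 0x04u

/* Every bit that only makes sense while the packet is encapsulated. */
#define MODEL_GSO_ENCAP_ALL (MODEL_GSO_GRE | MODEL_GSO_UDP_TUNNEL)

struct model_pkt_offload {
	unsigned int gso_type;	/* offloads the packet still requires */
	int encapsulation;	/* non-zero while an outer header exists */
};

/* After the outer header is removed, drop the tunnel-only indications. */
static void model_decap(struct model_pkt_offload *p)
{
	p->gso_type &= ~MODEL_GSO_ENCAP_ALL;
	p->encapsulation = 0;
}
```

After model_decap() only the inner protocol's offload bit remains, so a
device without tunnel offload support can still handle the packet in
hardware.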

The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.

Reported-by: Ramu Ramamurthy 
Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
Signed-off-by: Jesse Gross 
Signed-off-by: David S. Miller 
(backported from commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168)
[adapt iptunnel_pull_header arguments, avoid 7f290c9]
Signed-off-by: Stefan Bader 
Signed-off-by: Juerg Haefliger 
Signed-off-by: Greg Kroah-Hartman