linux-toradex.git/net, branch v4.2.5

svcrdma: handle rdma read with a non-zero initial page offset

2015-10-27T00:53:43+00:00

commit c91aed9896946721bb30705ea2904edb3725dd61 upstream.

The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
were not taking into account the initial page_offset when determining
the rdma read length.  This resulted in a read who's starting address
and length exceeded the base/bounds of the frmr.

The server gets an async error from the rdma device and kills the
connection, and the client then reconnects and resends.  This repeats
indefinitely, and the application hangs.

Most work loads don't tickle this bug apparently, but one test hit it
every time: building the linux kernel on a 16 core node with 'make -j
16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.

This bug seems to only be tripped with devices having small fastreg page
list depths.  I didn't see it with mlx4, for instance.

Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
Signed-off-by: Steve Wise 
Tested-by: Chuck Lever 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

netlink: Trim skb to alloc size to avoid MSG_TRUNC

2015-10-27T00:53:37+00:00

[ Upstream commit db65a3aaf29ecce2e34271d52e8d2336b97bd9fe ]

netlink_dump() allocates skb based on the calculated min_dump_alloc or
a per socket max_recvmsg_len.
min_alloc_size is maximum space required for any single netdev
attributes as calculated by rtnl_calcit().
max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
It is capped at 16KiB.
The intention is to avoid small allocations and to minimize the number
of calls required to obtain dump information for all net devices.

netlink_dump packs as many small messages as could fit within an skb
that was sized for the largest single netdev information. The actual
space available within an skb is larger than what is requested. It could
be much larger and up to near 2x with align to next power of 2 approach.

Allowing netlink_dump to use all the space available within the
allocated skb increases the buffer size a user has to provide to avoid
truncaion (i.e. MSG_TRUNG flag set).

It was observed that with many VLANs configured on at least one netdev,
a larger buffer of near 64KiB was necessary to avoid "Message truncated"
error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
min_alloc_size was only little over 32KiB.

This patch trims skb to allocated size in order to allow the user to
avoid truncation with more reasonable buffer size.

Signed-off-by: Ronen Arad 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

tipc: move fragment importance field to new header position

2015-10-27T00:53:37+00:00

[ Upstream commit dde4b5ae65de659b9ec64bafdde0430459fcb495 ]

In commit e3eea1eb47a ("tipc: clean up handling of message priorities")
we introduced a field in the packet header for keeping track of the
priority of fragments, since this value is not present in the specified
protocol header. Since the value so far only is used at the transmitting
end of the link, we have not yet officially defined it as part of the
protocol.

Unfortunately, the field we use for keeping this value, bits 13-15 in
in word 5, has turned out to be a poor choice; it is already used by the
broadcast protocol for carrying the 'network id' field of the sending
node. Since packet fragments also need to be transported across the
broadcast protocol, the risk of conflict is obvious, and we see this
happen when we use network identities larger than 2^13-1. This has
escaped our testing because we have so far only been using small network
id values.

We now move this field to bits 0-2 in word 9, a field that is guaranteed
to be unused by all involved protocols.

Fixes: e3eea1eb47a ("tipc: clean up handling of message priorities")
Signed-off-by: Jon Maloy 
Acked-by: Ying Xue 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings

2015-10-27T00:53:37+00:00

[ Upstream commit 077cb37fcf6f00a45f375161200b5ee0cd4e937b ]

It seems that kernel memory can leak into userspace by a
kmalloc, ethtool_get_strings, then copy_to_user sequence.

Avoid this by using kcalloc to zero fill the copied buffer.

Signed-off-by: Joe Perches 
Acked-by: Ben Hutchings 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: Don't call with rt6_uncached_list_flush_dev

2015-10-27T00:53:37+00:00

[ Upstream commit e332bc67cf5e5e5b71a1aec9750d0791aac65183 ]

As originally written rt6_uncached_list_flush_dev makes no sense when
called with dev == NULL as it attempts to flush all uncached routes
regardless of network namespace when dev == NULL.  Which is simply
incorrect behavior.

Furthermore at the point rt6_ifdown is called with dev == NULL no more
network devices exist in the network namespace so even if the code in
rt6_uncached_list_flush_dev were to attempt something sensible it
would be meaningless.

Therefore remove support in rt6_uncached_list_flush_dev for handling
network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
 when rt6_ifdown is called with a network device.

Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Signed-off-by: "Eric W. Biederman" 
Reviewed-by: Martin KaFai Lau 
Tested-by: Martin KaFai Lau 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

bpf: clear sender_cpu before xmit

2015-10-27T00:53:37+00:00

[ Upstream commit 6bf0577374cfb6c2301dbf4934a4f23ad3d72763 ]

Similar to commit c29390c6dfee ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared before xmit.

Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper")
Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

act_mirred: clear sender cpu before sending to tx

2015-10-27T00:53:36+00:00

[ Upstream commit d40496a56430eac0d330378816954619899fe303 ]

Similar to commit c29390c6dfee ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared when moving from Rx
Tx, otherwise kernel could crash.

Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices")
Cc: Eric Dumazet 
Cc: Jamal Hadi Salim 
Signed-off-by: Cong Wang 
Signed-off-by: Cong Wang 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ovs: do not allocate memory from offline numa node

2015-10-27T00:53:36+00:00

[ Upstream commit 598c12d0ba6de9060f04999746eb1e015774044b ]

When openvswitch tries allocate memory from offline numa node 0:
stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
[ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
This patch disables numa affinity in this case.

Signed-off-by: Konstantin Khlebnikov 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

bpf: fix panic in SO_GET_FILTER with native ebpf programs

2015-10-27T00:53:36+00:00

[ Upstream commit 93d08b6966cf730ea669d4d98f43627597077153 ]

When sockets have a native eBPF program attached through
setsockopt(sk, SOL_SOCKET, SO_ATTACH_BPF, ...), and then try to
dump these over getsockopt(sk, SOL_SOCKET, SO_GET_FILTER, ...),
the following panic appears:

  [49904.178642] BUG: unable to handle kernel NULL pointer dereference at (null)
  [49904.178762] IP: [] sk_get_filter+0x39/0x90
  [49904.182000] PGD 86fc9067 PUD 531a1067 PMD 0
  [49904.185196] Oops: 0000 [#1] SMP
  [...]
  [49904.224677] Call Trace:
  [49904.226090]  [] sock_getsockopt+0x319/0x740
  [49904.227535]  [] ? sock_has_perm+0x63/0x70
  [49904.228953]  [] ? release_sock+0x108/0x150
  [49904.230380]  [] ? selinux_socket_getsockopt+0x23/0x30
  [49904.231788]  [] SyS_getsockopt+0xa6/0xc0
  [49904.233267]  [] entry_SYSCALL_64_fastpath+0x12/0x71

The underlying issue is the very same as in commit b382c0865600
("sock, diag: fix panic in sock_diag_put_filterinfo"), that is,
native eBPF programs don't store an original program since this
is only needed in cBPF ones.

However, sk_get_filter() wasn't updated to test for this at the
time when eBPF could be attached. Just throw an error to the user
to indicate that eBPF cannot be dumped over this interface.
That way, it can also be known that a program _is_ attached (as
opposed to just return 0), and a different (future) method needs
to be consulted for a dump.

Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

inet: fix race in reqsk_queue_unlink()

2015-10-27T00:53:36+00:00

[ Upstream commit 2306c704ce280c97a60d1f45333b822b40281dea ]

reqsk_timer_handler() tests if icsk_accept_queue.listen_opt
is NULL at its beginning.

By the time it calls inet_csk_reqsk_queue_drop() and
reqsk_queue_unlink(), listener might have been closed and
inet_csk_listen_stop() had called reqsk_queue_yank_acceptq()
which sets icsk_accept_queue.listen_opt to NULL

We therefore need to correctly check listen_opt being NULL
after holding syn_wait_lock for proper synchronization.

Fixes: fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer")
Fixes: b357a364c57c ("inet: fix possible panic in reqsk_queue_unlink()")
Signed-off-by: Eric Dumazet 
Cc: Yuchung Cheng 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman