linux-toradex.git/include/net, branch v4.1.13

af_unix: Convert the unix_sk macro to an inline function for type safety

2015-10-27T00:51:52+00:00

[ Upstream commit 4613012db1d911f80897f9446a49de817b2c4c47 ]

As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.
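
To illustrate the difference, here is a minimal userspace sketch. The
structures are simplified stand-ins, not the kernel definitions, and
the name unix_sk_macro is hypothetical; it only marks the old macro
form. The macro casts whatever pointer it is given, whereas the inline
function lets the compiler check the argument type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct sock {
	int sk_family;
};

struct unix_sock {
	struct sock sk;		/* first member, so the cast below is valid */
	int inflight;
};

/* Old style (hypothetical name): casts any pointer without complaint. */
#define unix_sk_macro(__sk) ((struct unix_sock *)(__sk))

/* New style: the argument must be a struct sock pointer. */
static inline struct unix_sock *unix_sk(struct sock *sk)
{
	return (struct unix_sock *)sk;
}

/* Example caller that goes through the type-checked accessor. */
static int unix_inflight(struct sock *sk)
{
	assert(sk != NULL);
	return unix_sk(sk)->inflight;
}
```

Passing an unrelated pointer type such as an int * to unix_sk() draws
an incompatible-pointer-type diagnostic, while unix_sk_macro() accepts
it silently.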

Signed-off-by: Aaron Conole 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: add pfmemalloc check in sk_add_backlog()

2015-10-27T00:51:49+00:00

[ Upstream commit c7c49b8fde26b74277188bdc6c9dca38db6fa35b ]

Greg reported crashes hitting the following check in __sk_backlog_rcv()

	BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));

The pfmemalloc bit is currently checked in sk_filter().

This works correctly for TCP, because sk_filter() is run in
tcp_v[46]_rcv() before hitting the prequeue or backlog checks.

For UDP or other protocols, this does not work, because sk_filter()
is run from sock_queue_rcv_skb(), which might be called _after_ backlog
queuing if the socket is owned by the user by the time the packet is
processed by the softirq handler.
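
The idea named in the subject line can be sketched as a small userspace
model. The types and the model_ name are simplified stand-ins and not
the kernel code; the sketch only assumes that the check rejects a
pfmemalloc packet at backlog time unless the socket has SOCK_MEMALLOC
set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins (assumption): only the fields the sketch needs. */
struct sk_buff {
	bool pfmemalloc;	/* buffer came from the emergency reserves */
};

struct sock {
	bool memalloc;		/* models the SOCK_MEMALLOC flag */
	int backlog_len;	/* packets waiting in the backlog */
};

/*
 * Model of a backlog enqueue with the pfmemalloc check done up front,
 * so a reserve-backed packet never reaches a socket that may not use it.
 */
static int model_sk_add_backlog(struct sock *sk, const struct sk_buff *skb)
{
	assert(sk && skb);

	if (skb->pfmemalloc && !sk->memalloc)
		return -ENOMEM;	/* caller is expected to drop the packet */

	sk->backlog_len++;
	return 0;
}
```

With the check in the enqueue path, the later backlog processing only
sees pfmemalloc packets on sockets that are allowed to consume them.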

Fixes: b4b9e35585089 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
Signed-off-by: Eric Dumazet 
Reported-by: Greg Thelen 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

netfilter: nf_queue: Drop queue entries on nf_unregister_hook

2015-10-22T21:43:17+00:00

commit 8405a8fff3f8545c888a872d6e3c0c8eecd4d348 upstream.

Add code to nf_unregister_hook to flush the nf_queue when a hook is
unregistered.  This guarantees that the pointer that the nf_queue code
retains into the nf_hook list will remain valid while a packet is
queued.
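
A rough userspace model of the flush, assuming a fixed-size array in
place of the real queue; all names below are hypothetical and chosen
for illustration only. Each queued entry retains a pointer to its hook,
so unregistering a hook first drops every entry that still points at
it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUED 8

struct hook {
	int id;
};

struct queue_entry {
	const struct hook *hook;	/* retained while the packet is queued */
	int packet_id;
};

struct queue {
	struct queue_entry entries[MAX_QUEUED];
	size_t count;
};

/* Queue a packet together with the hook it was queued from. */
static int queue_add(struct queue *q, const struct hook *h, int packet_id)
{
	assert(q && h);
	if (q->count == MAX_QUEUED)
		return -1;
	q->entries[q->count].hook = h;
	q->entries[q->count].packet_id = packet_id;
	q->count++;
	return 0;
}

/* Drop all entries that reference h; returns how many were dropped. */
static size_t queue_flush_hook(struct queue *q, const struct hook *h)
{
	size_t kept = 0, dropped = 0;

	assert(q && h);
	for (size_t i = 0; i < q->count; i++) {
		if (q->entries[i].hook == h) {
			dropped++;
			continue;
		}
		q->entries[kept++] = q->entries[i];
	}
	q->count = kept;
	return dropped;
}
```

After queue_flush_hook() returns, no remaining entry points at the hook,
so the hook can be freed without leaving a dangling pointer behind.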

I tested what would happen if we do not flush queued packets and was
trivially able to obtain the oops below.  All that was required was
to stop the nf_queue listening process, to delete all of the nf_tables,
and to awaken the nf_queue listening process.

> BUG: unable to handle kernel paging request at 0000000100000001
> IP: [<0000000100000001>] 0x100000001
> PGD b9c35067 PUD 0
> Oops: 0010 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 519 Comm: lt-nfqnl_test Not tainted
> task: ffff8800b9c8c050 ti: ffff8800ba9d8000 task.ti: ffff8800ba9d8000
> RIP: 0010:[<0000000100000001>]  [<0000000100000001>] 0x100000001
> RSP: 0018:ffff8800ba9dba40  EFLAGS: 00010a16
> RAX: ffff8800bab48a00 RBX: ffff8800ba9dba90 RCX: ffff8800ba9dba90
> RDX: ffff8800b9c10128 RSI: ffff8800ba940900 RDI: ffff8800bab48a00
> RBP: ffff8800b9c10128 R08: ffffffff82976660 R09: ffff8800ba9dbb28
> R10: dead000000100100 R11: dead000000200200 R12: ffff8800ba940900
> R13: ffffffff8313fd50 R14: ffff8800b9c95200 R15: 0000000000000000
> FS:  00007fb91fc34700(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000100000001 CR3: 00000000babfb000 CR4: 00000000000007f0
> Stack:
>  ffffffff8206ab0f ffffffff82982240 ffff8800bab48a00 ffff8800b9c100a8
>  ffff8800b9c10100 0000000000000001 ffff8800ba940900 ffff8800b9c10128
>  ffffffff8206bd65 ffff8800bfb0d5e0 ffff8800bab48a00 0000000000014dc0
> Call Trace:
>  [] ? nf_iterate+0x4f/0xa0
>  [] ? nf_reinject+0x125/0x190
>  [] ? nfqnl_recv_verdict+0x255/0x360
>  [] ? nla_parse+0x80/0xf0
>  [] ? nfnetlink_rcv_msg+0x13c/0x240
>  [] ? __memcg_kmem_get_cache+0x4c/0x150
>  [] ? nfnl_lock+0x20/0x20
>  [] ? netlink_rcv_skb+0xa9/0xc0
>  [] ? netlink_unicast+0x12f/0x1c0
>  [] ? netlink_sendmsg+0x28e/0x650
>  [] ? sock_sendmsg+0x44/0x50
>  [] ? ___sys_sendmsg+0x2ab/0x2c0
>  [] ? __wake_up+0x43/0x70
>  [] ? tty_write+0x1c4/0x2a0
>  [] ? __sys_sendmsg+0x44/0x80
>  [] ? system_call_fastpath+0x12/0x6a
> Code:  Bad RIP value.
> RIP  [<0000000100000001>] 0x100000001
>  RSP 
> CR2: 0000000100000001
> ---[ end trace 08eb65d42362793f ]---

Signed-off-by: "Eric W. Biederman" 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

netfilter: nf_tables: Use 32 bit addressing register from nft_type_to_reg()

2015-10-22T21:43:17+00:00

commit bf798657eb5ba57552096843c315f096fdf9b715 upstream.

nft_type_to_reg() needs to return the register in the new 32 bit addressing,
otherwise we hit EINVAL when using mappings.
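
The conversion can be sketched as plain arithmetic. The constants and
names below are assumptions made for illustration (16-byte registers in
the old scheme, 4-byte registers in the 32 bit scheme, verdict register
at index 0); they are not taken from this commit.

```c
#include <assert.h>

/* Assumed sizes in bytes; hypothetical names, not the kernel's. */
enum {
	MODEL_REG_SIZE = 16,	/* one register in the old addressing */
	MODEL_REG32_SIZE = 4	/* one register in the 32 bit addressing */
};

/* Assumed old-style register numbers. */
enum {
	MODEL_REG_VERDICT = 0,
	MODEL_REG_1 = 1,
	MODEL_REG_4 = 4
};

/* Translate an old-style register number into its 32 bit index. */
static unsigned int model_reg_to_reg32(unsigned int legacy_reg)
{
	assert(legacy_reg <= MODEL_REG_4);
	return legacy_reg * MODEL_REG_SIZE / MODEL_REG32_SIZE;
}
```

Under these assumptions the first data register lands on 32 bit index
4, directly after the four slots covered by the verdict register.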

Fixes: 49499c3 ("netfilter: nf_tables: switch registers to 32 bit addressing")
Reported-by: Andreas Schultz 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Greg Kroah-Hartman

net: sched: fix refcount imbalance in actions

2015-09-29T17:26:24+00:00

[ Upstream commit 28e6b67f0b292f557468c139085303b15f1a678f ]

Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action
outside"), we end up with a wrong reference count for a tc action.

Test case 1:

  FOO="1,6 0 0 4294967295,"
  BAR="1,6 0 0 4294967294,"
  tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \
     action bpf bytecode "$FOO"
  tc actions show action bpf
    action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
    index 1 ref 1 bind 1
  tc actions replace action bpf bytecode "$BAR" index 1
  tc actions show action bpf
    action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
    index 1 ref 2 bind 1
  tc actions replace action bpf bytecode "$FOO" index 1
  tc actions show action bpf
    action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
    index 1 ref 3 bind 1

Test case 2:

  FOO="1,6 0 0 4294967295,"
  tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
  tc actions show action gact
    action order 0: gact action pass
    random type none pass val 0
     index 1 ref 1 bind 1
  tc actions add action drop index 1
    RTNETLINK answers: File exists [...]
  tc actions show action gact
    action order 0: gact action pass
     random type none pass val 0
     index 1 ref 2 bind 1
  tc actions add action drop index 1
    RTNETLINK answers: File exists [...]
  tc actions show action gact
    action order 0: gact action pass
     random type none pass val 0
     index 1 ref 3 bind 1

What happens is that in tcf_hash_check(), we check tcf_common for a given
index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've
found an existing action. Now there are the following cases:

  1) We do a late binding of an action. In that case, we leave the
     tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init()
     handler. This is correctly handled.

  2) We replace the given action, or we try to add one without replacing
     and find out that the action at a specific index already exists
     (thus, we go out with error in that case).

In case of 2), we have to undo the reference count increase from
tcf_hash_check() in the tcf_hash_release() function. Currently, we fail to
do so because of the 'tcfc_bindcnt > 0' check which bails out early with
an -EPERM error.

Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an
already classifier-bound action from dropping the reference count (which
could then become negative, wrap around, etc.), this restriction only
accounts for invocations outside a specific action's ->init() handler.

One possible solution would be to add a flag so that we trigger the
-EPERM only in situations where it is indeed relevant.
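
A minimal model of such a flag, with hypothetical names and a plain
struct in place of tcf_common; it is a sketch of the idea, not the
patch itself. The "strict" parameter is an assumed shape for the flag:
it is set for explicit deletion from outside and clear when undoing the
reference taken during ->init().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the two counters discussed above. */
struct model_action {
	int refcnt;
	int bindcnt;
};

/* Finding an existing action takes a reference (and a bind reference). */
static void model_hash_check(struct model_action *a, bool bind)
{
	a->refcnt++;
	if (bind)
		a->bindcnt++;
}

/*
 * Drop a reference. Only a strict release refuses to touch an action
 * that is still bound to a classifier.
 */
static int model_hash_release(struct model_action *a, bool bind, bool strict)
{
	assert(a->refcnt > 0);

	if (bind)
		a->bindcnt--;
	else if (strict && a->bindcnt > 0)
		return -EPERM;

	a->refcnt--;
	return 0;
}
```

In this model a replace on a bound action pairs model_hash_check() with
a non-strict release, so the count returns to its previous value, while
a strict release on the same action still fails with -EPERM.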

After the patch, the above test cases have the correct reference count again.

Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Daniel Borkmann 
Reviewed-by: Cong Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: lock socket in ip6_datagram_connect()

2015-09-29T17:26:22+00:00

[ Upstream commit 03645a11a570d52e70631838cb786eb4253eb463 ]

ip6_datagram_connect() is doing a lot of socket changes without
socket being locked.

This looks wrong, at least for udp_lib_rehash() which could corrupt
lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.
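
The usual shape of such a fix is a locked wrapper around an unlocked
worker. The sketch below models that pattern only; a boolean stands in
for the real socket lock and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in socket: a flag models the lock, one field models the hash. */
struct model_sock {
	bool locked;
	unsigned int portaddr_hash;
};

static void model_lock_sock(struct model_sock *sk)
{
	assert(!sk->locked);
	sk->locked = true;
}

static void model_release_sock(struct model_sock *sk)
{
	assert(sk->locked);
	sk->locked = false;
}

/* Worker: must only be called with the socket lock held. */
static int model_connect_unlocked(struct model_sock *sk, unsigned int hash)
{
	assert(sk->locked);
	sk->portaddr_hash = hash;	/* the change that needs protection */
	return 0;
}

/* Entry point: performs all socket changes under the lock. */
static int model_datagram_connect(struct model_sock *sk, unsigned int hash)
{
	int res;

	model_lock_sock(sk);
	res = model_connect_unlocked(sk, hash);
	model_release_sock(sk);
	return res;
}
```

Keeping the worker separate lets callers that already hold the lock
reuse it without taking the lock twice.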

Signed-off-by: Eric Dumazet 
Acked-by: Herbert Xu 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

sctp: fix ASCONF list handling

2015-07-10T16:49:28+00:00

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf(), which don't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backing up
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using the ->addr_wq_lock
spinlock to protect the list. Special handling is done upon socket
creation and destruction for that. Error handling in sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addr_wq_lock. The lock will
now be taken in sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewriting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen 
Suggested-by: Neil Horman 
Suggested-by: Hannes Frederic Sowa 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Marcelo Ricardo Leitner 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

tcp: fix child sockets to use system default congestion control if not set

2015-06-01T04:49:14+00:00

Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets the current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.
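
The selection rule summarized above can be written as a small decision
function. This is a userspace sketch with hypothetical names; the order
in which the route lock and the listener's explicit choice are checked
is an assumption, since the text does not say which wins when both are
present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the listener state relevant to the decision. */
struct model_listener {
	const char *ca_name;	/* CC module attached to the listener */
	bool ca_set_by_app;	/* true if the app set TCP_CONGESTION */
};

/*
 * Pick the CC module name for a child socket.
 * route_locked_ca is the name locked by the route, or NULL if none.
 */
static const char *model_child_ca(const struct model_listener *l,
				  const char *route_locked_ca,
				  const char *system_default)
{
	assert(l && system_default);

	if (route_locked_ca)
		return route_locked_ca;
	if (l->ca_set_by_app)
		return l->ca_name;
	return system_default;	/* evaluated when the child is created */
}
```

The point of the last branch is that the default is read when the child
is created, not when the listener was, so a later change to the system
default takes effect for new connections.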

Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal 
Cc: Daniel Borkmann 
Cc: Glenn Judd 
Cc: Stephen Hemminger 
Signed-off-by: Neal Cardwell 
Signed-off-by: Eric Dumazet 
Signed-off-by: Yuchung Cheng 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller

Merge tag 'mac80211-for-davem-2015-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

2015-05-31T06:37:46+00:00

Johannes Berg says:

====================
This just has a single docbook build fix. In my confusion
I'd already sent the same fix for -next, but Ben Hutchings
noted it's necessary in 4.1.
====================

Signed-off-by: David S. Miller

mac80211: Fix mac80211.h docbook comments

2015-05-28T12:37:43+00:00

A couple of enums in mac80211.h became structures recently, but the
comments didn't follow suit, leading to errors like:

  Error(.//include/net/mac80211.h:367): Cannot parse enum!
  Documentation/DocBook/Makefile:93: recipe for target 'Documentation/DocBook/80211.xml' failed
  make[1]: *** [Documentation/DocBook/80211.xml] Error 1
  Makefile:1361: recipe for target 'mandocs' failed
  make: *** [mandocs] Error 2

Fix the comments accordingly.  Added a couple of other small
comment fixes while I was there to silence other recently-added docbook
warnings.

Reported-by: Jim Davis 
Signed-off-by: Jonathan Corbet 
Signed-off-by: Johannes Berg