summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2021-10-17netfilter: ip6_tables: zero-initialize fragment offsetJeremy Sowden
[ Upstream commit 310e2d43c3ad429c1fba4b175806cf1f55ed73a6 ] ip6tables only sets the `IP6T_F_PROTO` flag on a rule if a protocol is specified (`-p tcp`, for example). However, if the flag is not set, `ip6_packet_match` doesn't call `ipv6_find_hdr` for the skb, in which case the fragment offset is left uninitialized and a garbage value is passed to each matcher. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-28ipv6: tcp: drop silly ICMPv6 packet too big messagesEric Dumazet
commit c7bb4b89033b764eb07db4e060548a6311d801ee upstream. While TCP stack scales reasonably well, there is still one part that can be used to DDOS it. IPv6 Packet too big messages have to lookup/insert a new route, and if abused by attackers, can easily put hosts under high stress, with many cpus contending on a spinlock while one is stuck in fib6_run_gc() ip6_protocol_deliver_rcu() icmpv6_rcv() icmpv6_notify() tcp_v6_err() tcp_v6_mtu_reduced() inet6_csk_update_pmtu() ip6_rt_update_pmtu() __ip6_rt_update_pmtu() ip6_rt_cache_alloc() ip6_dst_alloc() dst_alloc() ip6_dst_gc() fib6_run_gc() spin_lock_bh() ... Some of our servers have been hit by malicious ICMPv6 packets trying to _increase_ the MTU/MSS of TCP flows. We believe these ICMPv6 packets are a result of a bug in one ISP stack, since they were blindly sent back for _every_ (small) packet sent to them. These packets are for one TCP flow: 09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240 09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240 09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240 09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240 09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240 TCP stack can filter some silly requests : 1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err() 2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS. This tests happen before the IPv6 routing stack is entered, thus removing the potential contention and route exhaustion. Note that IPv6 stack was performing these checks, but too late (ie : after the route has been added, and after the potential garbage collect war) v2: fix typo caught by Martin, thanks ! v3: exports tcp_mtu_to_mss(), caught by David, thanks ! Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28tcp: annotate data races around tp->mtu_infoEric Dumazet
commit 561022acb1ce62e50f7a8258687a21b84282a4cb upstream. While tp->mtu_info is read while socket is owned, the write sides happen from err handlers (tcp_v[46]_mtu_reduced) which only own the socket spinlock. Fixes: 563d34d05786 ("tcp: dont drop MTU reduction indications") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28net: ipv6: fix return value of ip6_skb_dst_mtuVadim Fedorenko
commit 40fc3054b45820c28ea3c65e2c86d041dc244a8a upstream. Commit 628a5c561890 ("[INET]: Add IP(V6)_PMTUDISC_RPOBE") introduced ip6_skb_dst_mtu with return value of signed int which is inconsistent with actually returned values. Also 2 users of this function actually assign its value to unsigned int variable and only __xfrm6_output assigns result of this function to signed variable but actually uses as unsigned in further comparisons and calls. Change this function to return unsigned int value. Fixes: 628a5c561890 ("[INET]: Add IP(V6)_PMTUDISC_RPOBE") Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20ipv6: use prandom_u32() for ID generationWilly Tarreau
[ Upstream commit 62f20e068ccc50d6ab66fdb72ba90da2b9418c99 ] This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash table for IP ID generation"), but focusing on some specific aspects of IPv6. Contary to IPv4, IPv6 only uses packet IDs with fragments, and with a minimum MTU of 1280, it's much less easy to force a remote peer to produce many fragments to explore its ID sequence. In addition packet IDs are 32-bit in IPv6, which further complicates their analysis. On the other hand, it is often easier to choose among plenty of possible source addresses and partially work around the bigger hash table the commit above permits, which leaves IPv6 partially exposed to some possibilities of remote analysis at the risk of weakening some protocols like DNS if some IDs can be predicted with a good enough probability. Given the wide range of permitted IDs, the risk of collision is extremely low so there's no need to rely on the positive increment algorithm that is shared with the IPv4 code via ip_idents_reserve(). We have a fast PRNG, so let's simply call prandom_u32() and be done with it. Performance measurements at 10 Gbps couldn't show any difference with the previous code, even when using a single core, because due to the large fragments, we're limited to only ~930 kpps at 10 Gbps and the cost of the random generation is completely offset by other operations and by the network transfer time. In addition, this change removes the need to update a shared entry in the idents table so it may even end up being slightly faster on large scale systems where this matters. The risk of at least one collision here is about 1/80 million among 10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs, which remains very low compared to IPv4 where all IDs are reused every 4 to 80ms on a 10 Gbps flow depending on packet sizes. Reported-by: Amit Klein <aksecurity@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210529110746.6796-1-w@1wt.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03mld: fix panic in mld_newpack()Taehee Yoo
[ Upstream commit 020ef930b826d21c5446fdc9db80fd72a791bc21 ] mld_newpack() doesn't allow to allocate high order page, only order-0 allocation is allowed. If headroom size is too large, a kernel panic could occur in skb_put(). Test commands: ip netns del A ip netns del B ip netns add A ip netns add B ip link add veth0 type veth peer name veth1 ip link set veth0 netns A ip link set veth1 netns B ip netns exec A ip link set lo up ip netns exec A ip link set veth0 up ip netns exec A ip -6 a a 2001:db8:0::1/64 dev veth0 ip netns exec B ip link set lo up ip netns exec B ip link set veth1 up ip netns exec B ip -6 a a 2001:db8:0::2/64 dev veth1 for i in {1..99} do let A=$i-1 ip netns exec A ip link add ip6gre$i type ip6gre \ local 2001:db8:$A::1 remote 2001:db8:$A::2 encaplimit 100 ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev ip6gre$i ip netns exec A ip link set ip6gre$i up ip netns exec B ip link add ip6gre$i type ip6gre \ local 2001:db8:$A::2 remote 2001:db8:$A::1 encaplimit 100 ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev ip6gre$i ip netns exec B ip link set ip6gre$i up done Splat looks like: kernel BUG at net/core/skbuff.c:110! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.12.0+ #891 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:skb_panic+0x15d/0x15f Code: 92 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 00 ae 79 83 41 57 41 56 41 55 48 8b 54 24 a6 26 f9 ff <0f> 0b 48 8b 6c 24 20 89 34 24 e8 4a 4e 92 fe 8b 34 24 48 c7 c1 20 RSP: 0018:ffff88810091f820 EFLAGS: 00010282 RAX: 0000000000000089 RBX: ffff8881086e9000 RCX: 0000000000000000 RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed1020123efb RBP: ffff888005f6eac0 R08: ffffed1022fc0031 R09: ffffed1022fc0031 R10: ffff888117e00187 R11: ffffed1022fc0030 R12: 0000000000000028 R13: ffff888008284eb0 R14: 0000000000000ed8 R15: 0000000000000ec0 FS: 0000000000000000(0000) GS:ffff888117c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8b801c5640 CR3: 0000000033c2c006 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 skb_put.cold.104+0x22/0x22 ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 ? rcu_read_lock_sched_held+0x91/0xc0 mld_newpack+0x398/0x8f0 ? ip6_mc_hdr.isra.26.constprop.46+0x600/0x600 ? lock_contended+0xc40/0xc40 add_grhead.isra.33+0x280/0x380 add_grec+0x5ca/0xff0 ? mld_sendpack+0xf40/0xf40 ? lock_downgrade+0x690/0x690 mld_send_initial_cr.part.34+0xb9/0x180 ipv6_mc_dad_complete+0x15d/0x1b0 addrconf_dad_completed+0x8d2/0xbb0 ? lock_downgrade+0x690/0x690 ? addrconf_rs_timer+0x660/0x660 ? addrconf_dad_work+0x73c/0x10e0 addrconf_dad_work+0x73c/0x10e0 Allowing high order page allocation could fix this problem. Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22ipv6: remove extra dev_hold() for fallback tunnelsEric Dumazet
commit 0d7a7b2014b1a499a0fe24c9f3063d7856b5aaaf upstream. My previous commits added a dev_hold() in tunnels ndo_init(), but forgot to remove it from special functions setting up fallback tunnels. Fallback tunnels do call their respective ndo_init() This leads to various reports like : unregister_netdevice: waiting for ip6gre0 to become free. Usage count = 2 Fixes: 48bb5697269a ("ip6_tunnel: sit: proper dev_{hold|put} in ndo_[un]init methods") Fixes: 6289a98f0817 ("sit: proper dev_{hold|put} in ndo_[un]init methods") Fixes: 40cb881b5aaa ("ip6_vti: proper dev_{hold|put} in ndo_[un]init methods") Fixes: 7f700334be9a ("ip6_gre: proper dev_{hold|put} in ndo_[un]init methods") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22ip6_tunnel: sit: proper dev_{hold|put} in ndo_[un]init methodsEric Dumazet
commit 48bb5697269a7cbe5194dbb044dc38c517e34c58 upstream. Same reasons than for the previous commits : 6289a98f0817 ("sit: proper dev_{hold|put} in ndo_[un]init methods") 40cb881b5aaa ("ip6_vti: proper dev_{hold|put} in ndo_[un]init methods") 7f700334be9a ("ip6_gre: proper dev_{hold|put} in ndo_[un]init methods") After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. [1] WARNING: CPU: 1 PID: 21059 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 1 PID: 21059 Comm: syz-executor.4 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc900025aefe8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815c51f5 RDI: fffff520004b5def RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff888023488568 R13: ffff8880254e9000 R14: 00000000dfd82cfd R15: ffff88802ee2d7c0 FS: 00007f13bc590700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0943e74000 CR3: 0000000025273000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] ip6_tnl_dev_uninit+0x370/0x3d0 net/ipv6/ip6_tunnel.c:387 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 ip6_tnl_create2+0x1b5/0x400 net/ipv6/ip6_tunnel.c:263 ip6_tnl_newlink+0x312/0x580 net/ipv6/ip6_tunnel.c:2052 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22sit: proper dev_{hold|put} in ndo_[un]init methodsEric Dumazet
commit 6289a98f0817a4a457750d6345e754838eae9439 upstream. After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22ip6_vti: proper dev_{hold|put} in ndo_[un]init methodsEric Dumazet
[ Upstream commit 40cb881b5aaa0b69a7d93dec8440d5c62dae299f ] After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. Therefore, we need to move dev_hold() call from vti6_tnl_create2() to vti6_dev_init_gen() [1] WARNING: CPU: 0 PID: 15951 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 0 PID: 15951 Comm: syz-executor.3 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc90001eaef28 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815c51f5 RDI: fffff520003d5dd7 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff88801bb1c568 R13: ffff88801f69e800 R14: 00000000ffffffff R15: ffff888050889d40 FS: 00007fc79314e700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1c1ff47108 CR3: 0000000020fd5000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] vti6_dev_uninit+0x31a/0x360 net/ipv6/ip6_vti.c:297 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 vti6_tnl_create2+0x1b5/0x400 net/ipv6/ip6_vti.c:190 vti6_newlink+0x9d/0xd0 net/ipv6/ip6_vti.c:1020 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x331/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmmsg+0x195/0x470 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2516 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-16netfilter: x_tables: fix compat match/target pad out-of-bound writeFlorian Westphal
commit b29c457a6511435960115c0f548c4360d5f4801d upstream. xt_compat_match/target_from_user doesn't check that zeroing the area to start of next rule won't write past end of allocated ruleset blob. Remove this code and zero the entire blob beforehand. Reported-by: syzbot+cfc0247ac173f597aaaa@syzkaller.appspotmail.com Reported-by: Andy Nguyen <theflow@google.com> Fixes: 9fa492cdc160c ("[NETFILTER]: x_tables: simplify compat API") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-16net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlhMuhammad Usama Anjum
commit 864db232dc7036aa2de19749c3d5be0143b24f8f upstream. nlh is being checked for validtity two times when it is dereferenced in this function. Check for validity again when updating the flags through nlh pointer to make the dereferencing safe. CC: <stable@vger.kernel.org> Addresses-Coverity: ("NULL pointer dereference") Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07ipv6: weaken the v4mapped source checkJakub Kicinski
[ Upstream commit dcc32f4f183ab8479041b23a1525d48233df1d43 ] This reverts commit 6af1799aaf3f1bc8defedddfa00df3192445bbf3. Commit 6af1799aaf3f ("ipv6: drop incoming packets having a v4mapped source address") introduced an input check against v4mapped addresses. Use of such addresses on the wire is indeed questionable and not allowed on public Internet. As the commit pointed out https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-02 lists potential issues. Unfortunately there are applications which use v4mapped addresses, and breaking them is a clear regression. For example v4mapped addresses (or any semi-valid addresses, really) may be used for uni-direction event streams or packet export. Since the issue which sparked the addition of the check was with TCP and request_socks in particular push the check down to TCPv6 and DCCP. This restores the ability to receive UDPv6 packets with v4mapped address as the source. Keep using the IPSTATS_MIB_INHDRERRORS statistic to minimize the user-visible changes. Fixes: 6af1799aaf3f ("ipv6: drop incoming packets having a v4mapped source address") Reported-by: Sunyi Shao <sunyishao@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-30ipv6: create multicast route with RTPROT_KERNELMatteo Croce
commit a826b04303a40d52439aa141035fca5654ccaccd upstream. The ff00::/8 multicast route is created without specifying the fc_protocol field, so the default RTPROT_BOOT value is used: $ ip -6 -d route unicast ::1 dev lo proto kernel scope global metric 256 pref medium unicast fe80::/64 dev eth0 proto kernel scope global metric 256 pref medium unicast ff00::/8 dev eth0 proto boot scope global metric 256 pref medium As the documentation says, this value identifies routes installed during boot, but the route is created when interface is set up. Change the value to RTPROT_KERNEL which is a better value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-23net: sit: unregister_netdevice on newlink's error pathJakub Kicinski
[ Upstream commit 47e4bb147a96f1c9b4e7691e7e994e53838bfff8 ] We need to unregister the netdevice if config failed. .ndo_uninit takes care of most of the heavy lifting. This was uncovered by recent commit c269a24ce057 ("net: make free_netdev() more lenient with unregistering devices"). Previously the partially-initialized device would be left in the system. Reported-and-tested-by: syzbot+2393580080a2da190f04@syzkaller.appspotmail.com Fixes: e2f1f072db8d ("sit: allow to configure 6rd tunnels via netlink") Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20210114012947.2515313-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-24ah6: fix error return code in ah6_input()Zhang Changzhong
[ Upstream commit a5ebcbdf34b65fcc07f38eaf2d60563b42619a59 ] Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1605581105-35295-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-18net: Update window_clamp if SOCK_RCVBUF is setMao Wenan
[ Upstream commit 909172a149749242990a6e64cb55d55460d4e417 ] When net.ipv4.tcp_syncookies=1 and syn flood is happened, cookie_v4_check or cookie_v6_check tries to redo what tcp_v4_send_synack or tcp_v6_send_synack did, rsk_window_clamp will be changed if SOCK_RCVBUF is set, which will make rcv_wscale is different, the client still operates with initial window scale and can overshot granted window, the client use the initial scale but local server use new scale to advertise window value, and session work abnormally. Fixes: e88c64f0a425 ("tcp: allow effective reduction of TCP's rcv-buffer via setsockopt") Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1604967391-123737-1-git-send-email-wenan.mao@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-18IPv6: Set SIT tunnel hard_header_len to zeroOliver Herms
[ Upstream commit 8ef9ba4d666614497a057d09b0a6eafc1e34eadf ] Due to the legacy usage of hard_header_len for SIT tunnels while already using infrastructure from net/ipv4/ip_tunnel.c the calculation of the path MTU in tnl_update_pmtu is incorrect. This leads to unnecessary creation of MTU exceptions for any flow going over a SIT tunnel. As SIT tunnels do not have a header themsevles other than their transport (L3, L2) headers we're leaving hard_header_len set to zero as tnl_update_pmtu is already taking care of the transport headers sizes. This will also help avoiding unnecessary IPv6 GC runs and spinlock contention seen when using SIT tunnels and for more than net.ipv6.route.gc_thresh flows. Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20201103104133.GA1573211@tws Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21ipv6: check skb->protocol before lookup for nexthopWANG Cong
commit 199ab00f3cdb6f154ea93fa76fd80192861a821d upstream. Andrey reported a out-of-bound access in ip6_tnl_xmit(), this is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4 neigh key as an IPv6 address: neigh = dst_neigh_lookup(skb_dst(skb), &ipv6_hdr(skb)->daddr); if (!neigh) goto tx_err_link_failure; addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE addr_type = ipv6_addr_type(addr6); if (addr_type == IPV6_ADDR_ANY) addr6 = &ipv6_hdr(skb)->daddr; memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr)); Also the network header of the skb at this point should be still IPv4 for 4in6 tunnels, we shold not just use it as IPv6 header. This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it is, we are safe to do the nexthop lookup using skb_dst() and ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which dest address we can pick here, we have to rely on callers to fill it from tunnel config, so just fall to ip6_route_output() to make the decision. Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alessio Balsini <balsini@android.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21udp: drop corrupt packets earlier to avoid data corruptionDexuan Cui
The v4.4 stable kernel lacks this bugfix: commit 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error"). As a result, the v4.4 kernel can deliver corrupt data to the application when a corrupt UDP packet is closely followed by a valid UDP packet: the same invocation of the recvmsg() syscall can deliver the corrupt packet's UDP payload to the application with the UDP payload length and the "from IP/Port" of the valid packet. Details: For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's include/linux/skbuff.h:3951), Linux delays the UDP checksum verification until the application invokes the syscall recvmsg(). In the recvmsg() syscall handler, while Linux is copying the UDP payload to the application's memory, it calculates the UDP checksum. If the calculated checksum doesn't match the received checksum, Linux drops the corrupt UDP packet, and then starts to process the next packet (if any), and if the next packet is valid (i.e. the checksum is correct), Linux will copy the valid UDP packet's payload to the application's receiver buffer. The bug is: before Linux starts to copy the valid UDP packet, the data structure used to track how many more bytes should be copied to the application memory is not reset to what it was when the application just entered the kernel by the syscall! Consequently, only a small portion or none of the valid packet's payload is copied to the application's receive buffer, and later when the application exits from the kernel, actually most of the application's receive buffer contains the payload of the corrupt packet while recvmsg() returns the length of the UDP payload of the valid packet. For the mainline kernel, the bug was fixed in commit 327868212381, but unluckily the bugfix is only backported to v4.9+. It turns out backporting 327868212381 to v4.4 means that some supporting patches must be backported first, so the overall changes seem too big, so the alternative is performs the csum validation earlier and drops the corrupt packets earlier. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21ipv6: fix memory leaks on IPV6_ADDRFORM pathCong Wang
[ Upstream commit 8c0de6e96c9794cb523a516c465991a70245da1c ] IPV6_ADDRFORM causes resource leaks when converting an IPv6 socket to IPv4, particularly struct ipv6_ac_socklist. Similar to struct ipv6_mc_socklist, we should just close it on this path. This bug can be easily reproduced with the following C program: #include <stdio.h> #include <string.h> #include <sys/types.h> #include <sys/socket.h> #include <arpa/inet.h> int main() { int s, value; struct sockaddr_in6 addr; struct ipv6_mreq m6; s = socket(AF_INET6, SOCK_DGRAM, 0); addr.sin6_family = AF_INET6; addr.sin6_port = htons(5000); inet_pton(AF_INET6, "::ffff:192.168.122.194", &addr.sin6_addr); connect(s, (struct sockaddr *)&addr, sizeof(addr)); inet_pton(AF_INET6, "fe80::AAAA", &m6.ipv6mr_multiaddr); m6.ipv6mr_interface = 5; setsockopt(s, SOL_IPV6, IPV6_JOIN_ANYCAST, &m6, sizeof(m6)); value = AF_INET; setsockopt(s, SOL_IPV6, IPV6_ADDRFORM, &value, sizeof(value)); close(s); return 0; } Reported-by: ch3332xr@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31ip6_gre: fix null-ptr-deref in ip6gre_init_net()Wei Yongjun
[ Upstream commit 46ef5b89ec0ecf290d74c4aee844f063933c4da4 ] KASAN report null-ptr-deref error when register_netdev() failed: KASAN: null-ptr-deref in range [0x00000000000003c0-0x00000000000003c7] CPU: 2 PID: 422 Comm: ip Not tainted 5.8.0-rc4+ #12 Call Trace: ip6gre_init_net+0x4ab/0x580 ? ip6gre_tunnel_uninit+0x3f0/0x3f0 ops_init+0xa8/0x3c0 setup_net+0x2de/0x7e0 ? rcu_read_lock_bh_held+0xb0/0xb0 ? ops_init+0x3c0/0x3c0 ? kasan_unpoison_shadow+0x33/0x40 ? __kasan_kmalloc.constprop.0+0xc2/0xd0 copy_net_ns+0x27d/0x530 create_new_namespaces+0x382/0xa30 unshare_nsproxy_namespaces+0xa1/0x1d0 ksys_unshare+0x39c/0x780 ? walk_process_tree+0x2a0/0x2a0 ? trace_hardirqs_on+0x4a/0x1b0 ? _raw_spin_unlock_irq+0x1f/0x30 ? syscall_trace_enter+0x1a7/0x330 ? do_syscall_64+0x1c/0xa0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x56/0xa0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ip6gre_tunnel_uninit() has set 'ign->fb_tunnel_dev' to NULL, later access to ign->fb_tunnel_dev cause null-ptr-deref. Fix it by saving 'ign->fb_tunnel_dev' to local variable ndev. Fixes: dafabb6590cb ("ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31net: udp: Fix wrong clean up for IS_UDPLITE macroMiaohe Lin
[ Upstream commit b0a422772fec29811e293c7c0e6f991c0fd9241d ] We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is checked. Fixes: b2bf1e2659b1 ("[UDP]: Clean up for IS_UDPLITE macro") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-29mld: fix memory leak in ipv6_mc_destroy_dev()Wang Hai
[ Upstream commit ea2fce88d2fd678ed9d45354ff49b73f1d5615dd ] Commit a84d01647989 ("mld: fix memory leak in mld_del_delrec()") fixed the memory leak of MLD, but missing the ipv6_mc_destroy_dev() path, in which mca_sources are leaked after ma_put(). Using ip6_mc_clear_src() to take care of the missing free. BUG: memory leak unreferenced object 0xffff8881113d3180 (size 64): comm "syz-executor071", pid 389, jiffies 4294887985 (age 17.943s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002cbc483c>] kmalloc include/linux/slab.h:555 [inline] [<000000002cbc483c>] kzalloc include/linux/slab.h:669 [inline] [<000000002cbc483c>] ip6_mc_add1_src net/ipv6/mcast.c:2237 [inline] [<000000002cbc483c>] ip6_mc_add_src+0x7f5/0xbb0 net/ipv6/mcast.c:2357 [<0000000058b8b1ff>] ip6_mc_source+0xe0c/0x1530 net/ipv6/mcast.c:449 [<000000000bfc4fb5>] do_ipv6_setsockopt.isra.12+0x1b2c/0x3b30 net/ipv6/ipv6_sockglue.c:754 [<00000000e4e7a722>] ipv6_setsockopt+0xda/0x150 net/ipv6/ipv6_sockglue.c:950 [<0000000029260d9a>] rawv6_setsockopt+0x45/0x100 net/ipv6/raw.c:1081 [<000000005c1b46f9>] __sys_setsockopt+0x131/0x210 net/socket.c:2132 [<000000008491f7db>] __do_sys_setsockopt net/socket.c:2148 [inline] [<000000008491f7db>] __se_sys_setsockopt net/socket.c:2145 [inline] [<000000008491f7db>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2145 [<00000000c7bc11c5>] do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295 [<000000005fb7a3f3>] entry_SYSCALL_64_after_hwframe+0x49/0xb3 Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-29ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()Taehee Yoo
[ Upstream commit dafabb6590cb15f300b77c095d50312e2c7c8e0f ] In the datapath, the ip6gre_tunnel_lookup() is used and it internally uses fallback tunnel device pointer, which is fb_tunnel_dev. This pointer variable should be set to NULL when a fb interface is deleted. But there is no routine to set fb_tunnel_dev pointer to NULL. So, this pointer will be still used after interface is deleted and it eventually results in the use-after-free problem. Test commands: ip netns add A ip netns add B ip link add eth0 type veth peer name eth1 ip link set eth0 netns A ip link set eth1 netns B ip netns exec A ip link set lo up ip netns exec A ip link set eth0 up ip netns exec A ip link add ip6gre1 type ip6gre local fc:0::1 \ remote fc:0::2 ip netns exec A ip -6 a a fc:100::1/64 dev ip6gre1 ip netns exec A ip link set ip6gre1 up ip netns exec A ip -6 a a fc:0::1/64 dev eth0 ip netns exec A ip link set ip6gre0 up ip netns exec B ip link set lo up ip netns exec B ip link set eth1 up ip netns exec B ip link add ip6gre1 type ip6gre local fc:0::2 \ remote fc:0::1 ip netns exec B ip -6 a a fc:100::2/64 dev ip6gre1 ip netns exec B ip link set ip6gre1 up ip netns exec B ip -6 a a fc:0::2/64 dev eth1 ip netns exec B ip link set ip6gre0 up ip netns exec A ping fc:100::2 -s 60000 & ip netns del B Splat looks like: [ 73.087285][ C1] BUG: KASAN: use-after-free in ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.088361][ C1] Read of size 4 at addr ffff888040559218 by task ping/1429 [ 73.089317][ C1] [ 73.089638][ C1] CPU: 1 PID: 1429 Comm: ping Not tainted 5.7.0+ #602 [ 73.090531][ C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 73.091725][ C1] Call Trace: [ 73.092160][ C1] <IRQ> [ 73.092556][ C1] dump_stack+0x96/0xdb [ 73.093122][ C1] print_address_description.constprop.6+0x2cc/0x450 [ 73.094016][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.094894][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.095767][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.096619][ C1] kasan_report+0x154/0x190 [ 73.097209][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.097989][ C1] ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.098750][ C1] ? gre_del_protocol+0x60/0x60 [gre] [ 73.099500][ C1] gre_rcv+0x1c5/0x1450 [ip6_gre] [ 73.100199][ C1] ? ip6gre_header+0xf00/0xf00 [ip6_gre] [ 73.100985][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.101830][ C1] ? ip6_input_finish+0x5/0xf0 [ 73.102483][ C1] ip6_protocol_deliver_rcu+0xcbb/0x1510 [ 73.103296][ C1] ip6_input_finish+0x5b/0xf0 [ 73.103920][ C1] ip6_input+0xcd/0x2c0 [ 73.104473][ C1] ? ip6_input_finish+0xf0/0xf0 [ 73.105115][ C1] ? rcu_read_lock_held+0x90/0xa0 [ 73.105783][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.106548][ C1] ipv6_rcv+0x1f1/0x300 [ ... ] Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-20ipv6: fix IPV6_ADDRFORM operation logicHangbin Liu
[ Upstream commit 79a1f0ccdbb4ad700590f61b00525b390cb53905 ] Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present. Previously the checking logic looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol != IPPROTO_TCP) break; After commit b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP was blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; break; else break; Then after commit 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") UDP/UDPLITE were blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; if (sk->sk_protocol == IPPROTO_TCP) do_some_check; if (sk->sk_protocol != IPPROTO_TCP) break; Fix it by using Eric's code and simply remove the break in TCP check, which looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; else break; Fixes: 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11esp6: fix memleak on error path in esp6_inputZhen Lei
commit 7284fdf39a912322ce97de2d30def3c6068a418c upstream. This ought to be an omission in e6194923237 ("esp: Fix memleaks on error paths."). The memleak on error path in esp6_input is similar to esp_input of esp4. Fixes: e6194923237 ("esp: Fix memleaks on error paths.") Fixes: 3f29770723f ("ipsec: check return value of skb_to_sgvec always") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27l2tp: lock socket before checking flags in connect()Guillaume Nault
commit 0382a25af3c771a8e4d5e417d1834cbe28c2aaac upstream. Socket flags aren't updated atomically, so the socket must be locked while reading the SOCK_ZAPPED flag. This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch also brings error handling for __ip6_datagram_connect() failures. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20net: handle no dst on skb in icmp6_sendDavid Ahern
commit 79dc7e3f1cd323be4c81aa1a94faa1b3ed987fb2 upstream. Andrey reported the following while fuzzing the kernel with syzkaller: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 0 PID: 3859 Comm: a.out Not tainted 4.9.0-rc6+ #429 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800666d4200 task.stack: ffff880067348000 RIP: 0010:[<ffffffff833617ec>] [<ffffffff833617ec>] icmp6_send+0x5fc/0x1e30 net/ipv6/icmp.c:451 RSP: 0018:ffff88006734f2c0 EFLAGS: 00010206 RAX: ffff8800666d4200 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018 RBP: ffff88006734f630 R08: ffff880064138418 R09: 0000000000000003 R10: dffffc0000000000 R11: 0000000000000005 R12: 0000000000000000 R13: ffffffff84e7e200 R14: ffff880064138484 R15: ffff8800641383c0 FS: 00007fb3887a07c0(0000) GS:ffff88006cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000006b040000 CR4: 00000000000006f0 Stack: ffff8800666d4200 ffff8800666d49f8 ffff8800666d4200 ffffffff84c02460 ffff8800666d4a1a 1ffff1000ccdaa2f ffff88006734f498 0000000000000046 ffff88006734f440 ffffffff832f4269 ffff880064ba7456 0000000000000000 Call Trace: [<ffffffff83364ddc>] icmpv6_param_prob+0x2c/0x40 net/ipv6/icmp.c:557 [< inline >] ip6_tlvopt_unknown net/ipv6/exthdrs.c:88 [<ffffffff83394405>] ip6_parse_tlv+0x555/0x670 net/ipv6/exthdrs.c:157 [<ffffffff8339a759>] ipv6_parse_hopopts+0x199/0x460 net/ipv6/exthdrs.c:663 [<ffffffff832ee773>] ipv6_rcv+0xfa3/0x1dc0 net/ipv6/ip6_input.c:191 ... icmp6_send / icmpv6_send is invoked for both rx and tx paths. In both cases the dst->dev should be preferred for determining the L3 domain if the dst has been set on the skb. Fallback to the skb->dev if it has not. This covers the case reported here where icmp6_send is invoked on Rx before the route lookup. Fixes: 5d41ce29e ("net: icmp6_send should use dst dev to determine L3 domain") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookupSabrina Dubroca
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2 upstream. ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.4: - Drop changes in lwt_bpf.c, mlx5, and rxe - Adjust filename, context, indentation] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20net: ipv6: add net argument to ip6_dst_lookup_flowSabrina Dubroca
commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e upstream. This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument"). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20ipv6: fix cleanup ordering for ip6_mr failureSabrina Dubroca
commit afe49de44c27a89e8e9631c44b5ffadf6ace65e2 upstream. Commit 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") moved the cleanup label for ipmr_fail, but should have changed the contents of the cleanup labels as well. Now we can end up cleaning up icmpv6 even though it hasn't been initialized (jump to icmp_fail or ipmr_fail). Simply undo things in the reverse order of their initialization. Example of panic (triggered by faking a failure of icmpv6_init): kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [...] RIP: 0010:__list_del_entry_valid+0x79/0x160 [...] Call Trace: ? lock_release+0x8a0/0x8a0 unregister_pernet_operations+0xd4/0x560 ? ops_free_list+0x480/0x480 ? down_write+0x91/0x130 ? unregister_pernet_subsys+0x15/0x30 ? down_read+0x1b0/0x1b0 ? up_read+0x110/0x110 ? kmem_cache_create_usercopy+0x1b4/0x240 unregister_pernet_subsys+0x1d/0x30 icmpv6_cleanup+0x1d/0x30 inet6_init+0x1b5/0x23f Fixes: 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net: icmp6_send should use dst dev to determine L3 domainDavid Ahern
commit 5d41ce29e3b91ef305f88d23f72b3359de329cec upstream. icmp6_send is called in response to some event. The skb may not have the device set (skb->dev is NULL), but it is expected to have a dst set. Update icmp6_send to use the dst on the skb to determine L3 domain. Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10vti6: fix input pathNicolas Dichtel
commit 63c43787d35e45562a6b5927e2edc8f4783d95b8 upstream. Since commit 1625f4529957, vti6 is broken, all input packets are dropped (LINUX_MIB_XFRMINNOSTATES is incremented). XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in xfrm6_rcv_spi(). A new function xfrm6_rcv_tnl() that enables to pass a value to xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function is used in several handlers). CC: Alexey Kodanev <alexey.kodanev@oracle.com> Fixes: 1625f4529957 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net: vrf: Fix dst reference countingDavid Ahern
commit 9ab179d83b4e31ea277a123492e419067c2f129a upstream. Vivek reported a kernel exception deleting a VRF with an active connection through it. The root cause is that the socket has a cached reference to a dst that is destroyed. Converting the dst_destroy to dst_release and letting proper reference counting kick in does not work as the dst has a reference to the device which needs to be released as well. I talked to Hannes about this at netdev and he pointed out the ipv4 and ipv6 dst handling has dst_ifdown for just this scenario. Rather than continuing with the reinvented dst wheel in VRF just remove it and leverage the ipv4 and ipv6 versions. Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver") Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10udp: restore UDPlite many-cast deliveryPablo Neira
commit 73e2d5e34b6cdd1080038daf3d6d6d744a9eefe6 upstream. Honor udptable parameter that is passed to __udp*_lib_mcast_deliver(), otherwise udplite broadcast/multicast use the wrong table and it breaks. Fixes: 2dc41cff7545 ("udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10netfilter: nft_dup: do not use sreg_dev if the user doesn't specify itLiping Zhang
commit b73b8a1ba598236296a46103d81c10d629d9a470 upstream. The NFTA_DUP_SREG_DEV attribute is not a must option, so we should use it in routing lookup only when the user specify it. Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net: ipv6: Fix processing of RAs in presence of VRFDavid Ahern
commit 830218c1add1da16519b71909e5cf21522b7d062 upstream. rt6_add_route_info and rt6_add_dflt_router were updated to pull the FIB table from the device index, but the corresponding rt6_get_route_info and rt6_get_dflt_router functions were not leading to the failure to process RA's: ICMPv6: RA: ndisc_router_discovery failed to add default route Fix the 'get' functions by using the table id associated with the device when applicable. Also, now that default routes can be added to tables other than the default table, rt6_purge_dflt_routers needs to be updated as well to look at all tables. To handle that efficiently, add a flag to the table denoting if it is has a default route via RA. Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10ipv6: add missing netconf notif when 'all' is updatedNicolas Dichtel
commit d26c638c16cb54f6fb1507e27df93ede692db572 upstream. The 'default' value was not advertised. Fixes: f3a1bfb11ccb ("rtnl/ipv6: use netconf msg to advertise forwarding status") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10ipv6: do not abuse GFP_ATOMIC in inet6_netconf_notify_devconf()Eric Dumazet
commit 927265bc6cd6374c9bafc43408ece4e92311b149 upstream. All inet6_netconf_notify_devconf() callers are in process context, so we can use GFP_KERNEL allocations if we take care of not holding a rwlock while not needed in ip6mr (we hold RTNL there) Fixes: d67b8c616b48 ("netconf: advertise mc_forwarding status") Fixes: f3a1bfb11ccb ("rtnl/ipv6: use netconf msg to advertise forwarding status") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10ipv6: fix checksum annotation in udp6_csum_initHannes Frederic Sowa
commit c148d16369ff0095eca950d17968ba1d56a47b53 upstream. Cc: Tom Herbert <tom@herbertland.com> Fixes: 4068579e1e098fa ("net: Implmement RFC 6936 (zero RX csums for UDP/IPv6") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net: vrf: Fix dev refcnt leak due to IPv6 prefix routeDavid Ahern
commit 4f7f34eaab9f68c9bcd45386b15c414c38b40587 upstream. ifupdown2 found a kernel bug with IPv6 routes and movement from the main table to the VRF table. Sequence of events: Create the interface and add addresses: ip link add dev eth4.105 link eth4 type vlan id 105 ip addr add dev eth4.105 8.105.105.10/24 ip -6 addr add dev eth4.105 2008:105:105::10/64 At this point IPv6 has inserted a prefix route in the main table even though the interface is 'down'. From there the VRF device is created: ip link add dev vrf105 type vrf table 105 ip addr add dev vrf105 9.9.105.10/32 ip -6 addr add dev vrf105 2000:9:105::10/128 ip link set vrf105 up Then the interface is enslaved, while still in the 'down' state: ip link set dev eth4.105 master vrf105 Since the device is down the VRF driver cycling the device does not send the NETDEV_UP and NETDEV_DOWN but rather the NETDEV_CHANGE event which does not flush the routes inserted prior. When the link is brought up ip link set dev eth4.105 up the prefix route is added in the VRF table, but does not remove the route from the main table. Fix by handling the NETDEV_CHANGEUPPER event similar what was implemented for IPv4 in 7f49e7a38b77 ("net: Flush local routes when device changes vrf association") Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_keyAlexey Kodanev
commit 1625f4529957738be7d87cf157e107b8fb9d23b9 upstream. Running LTP 'icmp-uni-basic.sh -6 -p ipcomp -m tunnel' test over openvswitch + veth can trigger kernel panic: BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 IP: [<ffffffff8169d1d2>] xfrm_input+0x82/0x750 ... [<ffffffff816d472e>] xfrm6_rcv_spi+0x1e/0x20 [<ffffffffa082c3c2>] xfrm6_tunnel_rcv+0x42/0x50 [xfrm6_tunnel] [<ffffffffa082727e>] tunnel6_rcv+0x3e/0x8c [tunnel6] [<ffffffff8169f365>] ip6_input_finish+0xd5/0x430 [<ffffffff8169fc53>] ip6_input+0x33/0x90 [<ffffffff8169f1d5>] ip6_rcv_finish+0xa5/0xb0 ... It seems that tunnel.ip6 can have garbage values and also dereferenced without a proper check, only tunnel.ip4 is being verified. Fix it by adding one more if block for AF_INET6 and initialize tunnel.ip6 with NULL inside xfrm6_rcv_spi() (which is similar to xfrm4_rcv_spi()). Fixes: 049f8e2 ("xfrm: Override skb->mark with tunnel->parm.i_key in xfrm_input") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-10net: ipv6: tcp reset, icmp need to consider L3 domainDavid Ahern
commit 1d2f7b2d956e242179aaf4a08f3545f99c81f9a3 upstream. Responses for packets to unused ports are getting lost with L3 domains. IPv4 has ip_send_unicast_reply for sending TCP responses which accounts for L3 domains; update the IPv6 counterpart tcp_v6_send_response. For icmp the L3 master check needs to be moved up in icmp6_send to properly respond to UDP packets to a port with no listener. Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05ipv6: use READ_ONCE() for inet->hdrincl as in ipv4Olivier Matz
commit 59e3e4b52663a9d97efbce7307f62e4bc5c9ce91 upstream. As it was done in commit 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg") and commit 20b50d79974e ("net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the value of inet->hdrincl in a local variable, to avoid introducing a race condition in the next commit. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-02xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finishDavid Ahern
[ Upstream commit 0c922a4850eba2e668f73a3f1153196e09abb251 ] IPSKB_XFRM_TRANSFORMED and IP6SKB_XFRM_TRANSFORMED are skb flags set by xfrm code to tell other skb handlers that the packet has been passed through the xfrm output functions. Simplify the code and just always set them rather than conditionally based on netfilter enabled thus making the flag available for other users. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-02ipv6: fix restrict IPV6_ADDRFORM operationJohn Haxby
[ Upstream commit 82c9ae440857840c56e05d4fb1427ee032531346 ] Commit b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation") fixed a problem found by syzbot an unfortunate logic error meant that it also broke IPV6_ADDRFORM. Rearrange the checks so that the earlier test is just one of the series of checks made before moving the socket from IPv6 to IPv4. Fixes: b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation") Signed-off-by: John Haxby <john.haxby@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02vti6: Fix memory leak of skb if input policy check failsTorsten Hilbrich
commit 2a9de3af21aa8c31cd68b0b39330d69f8c1e59df upstream. The vti6_rcv function performs some tests on the retrieved tunnel including checking the IP protocol, the XFRM input policy, the source and destination address. In all but one places the skb is released in the error case. When the input policy check fails the network packet is leaked. Using the same goto-label discard in this case to fix this problem. Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces") Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02vti[6]: fix packet tx through bpf_redirect() in XinY casesNicolas Dichtel
commit f1ed10264ed6b66b9cd5e8461cffce69be482356 upstream. I forgot the 4in6/6in4 cases in my previous patch. Let's fix them. Fixes: 95224166a903 ("vti[6]: fix packet tx through bpf_redirect()") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20ipv6: restrict IPV6_ADDRFORM operationEric Dumazet
commit b6f6118901d1e867ac9177bbff3b00b185bd4fdc upstream. IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one. While this operation sounds illogical, we have to support it. One of the things it does for TCP socket is to switch sk->sk_prot to tcp_prot. We now have other layers playing with sk->sk_prot, so we should make sure to not interfere with them. This patch makes sure sk_prot is the default pointer for TCP IPv6 socket. syzbot reported : BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD a0113067 P4D a0113067 PUD a8771067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427 __sock_release net/socket.c:605 [inline] sock_close+0xe1/0x260 net/socket.c:1283 __fput+0x2e4/0x740 fs/file_table.c:280 ____fput+0x15/0x20 fs/file_table.c:313 task_work_run+0x176/0x1b0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_usermode_loop arch/x86/entry/common.c:164 [inline] prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c429 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429 RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004 RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000 R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c Modules linked in: CR2: 0000000000000000 ---[ end trace 82567b5207e87bae ]--- RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>