Age | Commit message (Collapse) | Author |
|
Otherwise, we hit bogus ENOENT when removing elements.
Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Previous implementation was not usable with CONFIG_IPV6=m.
Fixes: a3419ce3356c ("netfilter: nf_conntrack_sip: add sip_external_media logic")
Signed-off-by: Alin Nastac <alin.nastac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree:
1) Fix list corruption in device notifier in the masquerade
infrastructure, from Florian Westphal.
2) Fix double-free of sets and use-after-free when deleting elements.
3) Don't bogusly return EBUSY when removing a set after flush command.
4) Use-after-free in dynamically allocate operations.
5) Don't report a new ruleset generation to userspace if transaction
list is empty, this invalidates the userspace cache innecessarily.
From Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case x25_connect() fails and frees the socket neighbour,
we also need to undo the change done to x25->state.
Before my last bug fix, we had use-after-free so this
patch fixes a latent bug.
syzbot report :
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 16137 Comm: syz-executor.1 Not tainted 5.0.0+ #117
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:x25_write_internal+0x1e8/0xdf0 net/x25/x25_subr.c:173
Code: 00 40 88 b5 e0 fe ff ff 0f 85 01 0b 00 00 48 8b 8b 80 04 00 00 48 ba 00 00 00 00 00 fc ff df 48 8d 79 1c 48 89 fe 48 c1 ee 03 <0f> b6 34 16 48 89 fa 83 e2 07 83 c2 03 40 38 f2 7c 09 40 84 f6 0f
RSP: 0018:ffff888076717a08 EFLAGS: 00010207
RAX: ffff88805f2f2292 RBX: ffff8880a0ae6000 RCX: 0000000000000000
kobject: 'loop5' (0000000018d0d0ee): kobject_uevent_env
RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 000000000000001c
RBP: ffff888076717b40 R08: ffff8880950e0580 R09: ffffed100be5e46d
R10: ffffed100be5e46c R11: ffff88805f2f2363 R12: ffff888065579840
kobject: 'loop5' (0000000018d0d0ee): fill_kobj_path: path = '/devices/virtual/block/loop5'
R13: 1ffff1100ece2f47 R14: 0000000000000013 R15: 0000000000000013
FS: 00007fb88cf43700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9a42a41028 CR3: 0000000087a67000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
x25_release+0xd0/0x340 net/x25/af_x25.c:658
__sock_release+0xd3/0x2b0 net/socket.c:579
sock_close+0x1b/0x30 net/socket.c:1162
__fput+0x2df/0x8d0 fs/file_table.c:278
____fput+0x16/0x20 fs/file_table.c:309
task_work_run+0x14a/0x1c0 kernel/task_work.c:113
get_signal+0x1961/0x1d50 kernel/signal.c:2388
do_signal+0x87/0x1940 arch/x86/kernel/signal.c:816
exit_to_usermode_loop+0x244/0x2c0 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_64+0x52d/0x610 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457f29
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb88cf42c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: fffffffffffffe00 RBX: 0000000000000003 RCX: 0000000000457f29
RDX: 0000000000000012 RSI: 0000000020000080 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb88cf436d4
R13: 00000000004be462 R14: 00000000004cec98 R15: 00000000ffffffff
Modules linked in:
Fixes: 95d6ebd53c79 ("net/x25: fix use-after-free in x25_device_event()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: andrew hendry <andrew.hendry@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit eeea10b83a13 ("tcp: add
tcp_v4_fill_cb()/tcp_v4_restore_cb()"), tcp_vX_fill_cb is only called
after tcp_filter(). That means, TCP_SKB_CB(skb)->end_seq still points to
the IP-part of the cb.
We thus should not mock with it, as this can trigger bugs (thanks
syzkaller):
[ 12.349396] ==================================================================
[ 12.350188] BUG: KASAN: slab-out-of-bounds in ip6_datagram_recv_specific_ctl+0x19b3/0x1a20
[ 12.351035] Read of size 1 at addr ffff88006adbc208 by task test_ip6_datagr/1799
Setting end_seq is actually no more necessary in tcp_filter as it gets
initialized later on in tcp_vX_fill_cb.
Cc: Eric Dumazet <edumazet@google.com>
Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When running 'nft flush ruleset' while no rules exist, we will increment
the generation counter and announce a new genid to userspace, yet
nothing had changed in the first place.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In func check_6rd,tunnel->ip6rd.relay_prefixlen may equal to
32,so UBSAN complain about it.
UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xca/0x13e lib/dump_stack.c:113
ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
__ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
try_6rd net/ipv6/sit.c:806 [inline]
ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
__netdev_start_xmit include/linux/netdevice.h:4300 [inline]
netdev_start_xmit include/linux/netdevice.h:4309 [inline]
xmit_one net/core/dev.c:3243 [inline]
dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
__dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
neigh_output include/net/neighbour.h:501 [inline]
ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xc8/0x110 net/socket.c:631
___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
__sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: linmiaohe <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Smatch reports:
net/netfilter/nf_tables_api.c:2167 nf_tables_expr_destroy()
error: dereferencing freed memory 'expr->ops'
net/netfilter/nf_tables_api.c
2162 static void nf_tables_expr_destroy(const struct nft_ctx *ctx,
2163 struct nft_expr *expr)
2164 {
2165 if (expr->ops->destroy)
2166 expr->ops->destroy(ctx, expr);
^^^^
--> 2167 module_put(expr->ops->type->owner);
^^^^^^^^^
2168 }
Smatch says there are three functions which free expr->ops.
Fixes: b8e204006340 ("netfilter: nft_compat: use .release_ops and remove list of extension")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Set deletion after flush coming in the same batch results in EBUSY. Add
set use counter to track the number of references to this set from
rules. We cannot rely on the list of bindings for this since such list
is still populated from the preparation phase.
Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We keep receiving syzbot reports [1] that show that tunnels do not play
the rcu/IFF_UP rules properly.
At device dismantle phase, gro_cells_destroy() will be called
only after a full rcu grace period is observed after IFF_UP
has been cleared.
This means that IFF_UP needs to be tested before queueing packets
into netif_rx() or gro_cells.
This patch implements the test in gro_cells_receive() because
too many callers do not seem to bother enough.
[1]
BUG: unable to handle kernel paging request at fffff4ca0b9ffffe
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.0.0+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <42> 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
kobject: 'loop2' (000000004bd7d84a): kobject_uevent_env
FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0
Call Trace:
kobject: 'loop2' (000000004bd7d84a): fill_kobj_path: path = '/devices/virtual/block/loop2'
ip_tunnel_dev_free+0x19/0x60 net/ipv4/ip_tunnel.c:1010
netdev_run_todo+0x51c/0x7d0 net/core/dev.c:8970
rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:116
ip_tunnel_delete_nets+0x423/0x5f0 net/ipv4/ip_tunnel.c:1124
vti_exit_batch_net+0x23/0x30 net/ipv4/ip_vti.c:495
ops_exit_list.isra.0+0x105/0x160 net/core/net_namespace.c:156
cleanup_net+0x3fb/0x960 net/core/net_namespace.c:551
process_one_work+0x98e/0x1790 kernel/workqueue.c:2173
worker_thread+0x98/0xe40 kernel/workqueue.c:2319
kthread+0x357/0x430 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Modules linked in:
CR2: fffff4ca0b9ffffe
[ end trace 513fc9c1338d1cb3 ]
RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <42> 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
kobject: 'loop3' (00000000e4ee57a6): kobject_uevent_env
R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0
Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case of failure x25_connect() does a x25_neigh_put(x25->neighbour)
but forgets to clear x25->neighbour pointer, thus triggering use-after-free.
Since the socket is visible in x25_list, we need to hold x25_list_lock
to protect the operation.
syzbot report :
BUG: KASAN: use-after-free in x25_kill_by_device net/x25/af_x25.c:217 [inline]
BUG: KASAN: use-after-free in x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
Read of size 8 at addr ffff8880a030edd0 by task syz-executor003/7854
CPU: 0 PID: 7854 Comm: syz-executor003 Not tainted 5.0.0+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
x25_kill_by_device net/x25/af_x25.c:217 [inline]
x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
call_netdevice_notifiers net/core/dev.c:1765 [inline]
__dev_notify_flags+0x1e9/0x2c0 net/core/dev.c:7607
dev_change_flags+0x10d/0x170 net/core/dev.c:7643
dev_ifsioc+0x2b0/0x940 net/core/dev_ioctl.c:237
dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:488
sock_do_ioctl+0x1bd/0x300 net/socket.c:995
sock_ioctl+0x32b/0x610 net/socket.c:1096
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4467c9
Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fdbea222d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000006dbc58 RCX: 00000000004467c9
RDX: 0000000020000340 RSI: 0000000000008914 RDI: 0000000000000003
RBP: 00000000006dbc50 R08: 00007fdbea223700 R09: 0000000000000000
R10: 00007fdbea223700 R11: 0000000000000246 R12: 00000000006dbc5c
R13: 6000030030626669 R14: 0000000000000000 R15: 0000000030626669
Allocated by task 7843:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_kmalloc mm/kasan/common.c:495 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:468
kasan_kmalloc+0x9/0x10 mm/kasan/common.c:509
kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3615
kmalloc include/linux/slab.h:545 [inline]
x25_link_device_up+0x46/0x3f0 net/x25/x25_link.c:249
x25_device_event+0x116/0x2b0 net/x25/af_x25.c:242
notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
call_netdevice_notifiers net/core/dev.c:1765 [inline]
__dev_notify_flags+0x121/0x2c0 net/core/dev.c:7605
dev_change_flags+0x10d/0x170 net/core/dev.c:7643
dev_ifsioc+0x2b0/0x940 net/core/dev_ioctl.c:237
dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:488
sock_do_ioctl+0x1bd/0x300 net/socket.c:995
sock_ioctl+0x32b/0x610 net/socket.c:1096
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 7865:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:457
kasan_slab_free+0xe/0x10 mm/kasan/common.c:465
__cache_free mm/slab.c:3494 [inline]
kfree+0xcf/0x230 mm/slab.c:3811
x25_neigh_put include/net/x25.h:253 [inline]
x25_connect+0x8d8/0xde0 net/x25/af_x25.c:824
__sys_connect+0x266/0x330 net/socket.c:1685
__do_sys_connect net/socket.c:1696 [inline]
__se_sys_connect net/socket.c:1693 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:1693
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8880a030edc0
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 16 bytes inside of
256-byte region [ffff8880a030edc0, ffff8880a030eec0)
The buggy address belongs to the page:
page:ffffea000280c380 count:1 mapcount:0 mapping:ffff88812c3f07c0 index:0x0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea0002806788 ffffea00027f0188 ffff88812c3f07c0
raw: 0000000000000000 ffff8880a030e000 000000010000000c 0000000000000000
page dumped because: kasan: bad access detected
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+04babcefcd396fabec37@syzkaller.appspotmail.com
Cc: andrew hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rxrpc_get_client_conn() adds a new call to the front of the waiting_calls
queue if the connection it's going to use already exists. This is bad as
it allows calls to get starved out.
Fix this by adding to the tail instead.
Also change the other enqueue point in the same function to put it on the
front (ie. when we have a new connection). This makes the point that in
the case of a new connection the new call goes at the front (though it
doesn't actually matter since the queue should be unoccupied).
Fixes: 45025bceef17 ("rxrpc: Improve management and caching of client connection objects")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2019-03-09
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a crash in AF_XDP's xsk_diag_put_ring() which was passing
wrong queue argument, from Eric.
2) Fix a regression due to wrong test for TCP GSO packets used in
various BPF helpers like NAT64, from Willem.
3) Fix a sk_msg strparser warning which asserts that strparser must
be stopped first, from Jakub.
4) Fix rejection of invalid options/bind flags in AF_XDP, from Björn.
5) Fix GSO in bpf_lwt_push_ip_encap() which must properly set inner
headers and inner protocol, from Peter.
6) Fix a libbpf leak when kernel does not support BTF, from Nikita.
7) Various BPF selftest and libbpf build fixes to make out-of-tree
compilation work and to properly resolve dependencies via fixdep
target, from Stanislav.
8) Fix rejection of invalid ldimm64 imm field, from Daniel.
9) Fix bpf stats sysctl compile warning of unused helper function
proc_dointvec_minmax_bpf_stats() under some configs, from Arnd.
10) Fix couple of warnings about using plain integer as NULL, from Bo.
11) Fix some BPF sample spelling mistakes, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 7716682cc58e ("tcp/dccp: fix another race at listener
dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
{tcp,dccp}_check_req() accordingly. However, TFO and syncookies
weren't modified, thus leaking allocated resources on error.
Contrary to tcp_check_req(), in both syncookies and TFO cases,
we need to drop the request socket. Also, since the child socket is
created with inet_csk_clone_lock(), we have to unlock it and drop an
extra reference (->sk_refcount is initially set to 2 and
inet_csk_reqsk_queue_add() drops only one ref).
For TFO, we also need to revert the work done by tcp_try_fastopen()
(with reqsk_fastopen_remove()).
Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
My prior commit missed the fact that these functions
were using udp_hdr() (aka skb_transport_header())
to get access to GUE header.
Since pskb_transport_may_pull() does not exist yet, we have to add
transport_offset to our pskb_may_pull() calls.
BUG: KMSAN: uninit-value in gue_err+0x514/0xfa0 net/ipv4/fou.c:1032
CPU: 1 PID: 10648 Comm: syz-executor.1 Not tainted 5.0.0+ #11
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
__msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
gue_err+0x514/0xfa0 net/ipv4/fou.c:1032
__udp4_lib_err_encap_no_sk net/ipv4/udp.c:571 [inline]
__udp4_lib_err_encap net/ipv4/udp.c:626 [inline]
__udp4_lib_err+0x12e6/0x1d40 net/ipv4/udp.c:665
udp_err+0x74/0x90 net/ipv4/udp.c:737
icmp_socket_deliver net/ipv4/icmp.c:767 [inline]
icmp_unreach+0xb65/0x1070 net/ipv4/icmp.c:884
icmp_rcv+0x11a1/0x1950 net/ipv4/icmp.c:1066
ip_protocol_deliver_rcu+0x584/0xbb0 net/ipv4/ip_input.c:208
ip_local_deliver_finish net/ipv4/ip_input.c:234 [inline]
NF_HOOK include/linux/netfilter.h:289 [inline]
ip_local_deliver+0x624/0x7b0 net/ipv4/ip_input.c:255
dst_input include/net/dst.h:450 [inline]
ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
NF_HOOK include/linux/netfilter.h:289 [inline]
ip_rcv+0x6bd/0x740 net/ipv4/ip_input.c:524
__netif_receive_skb_one_core net/core/dev.c:4973 [inline]
__netif_receive_skb net/core/dev.c:5083 [inline]
process_backlog+0x756/0x10e0 net/core/dev.c:5923
napi_poll net/core/dev.c:6346 [inline]
net_rx_action+0x78b/0x1a60 net/core/dev.c:6412
__do_softirq+0x53f/0x93a kernel/softirq.c:293
invoke_softirq kernel/softirq.c:375 [inline]
irq_exit+0x214/0x250 kernel/softirq.c:416
exiting_irq+0xe/0x10 arch/x86/include/asm/apic.h:536
smp_apic_timer_interrupt+0x48/0x70 arch/x86/kernel/apic/apic.c:1064
apic_timer_interrupt+0x2e/0x40 arch/x86/entry/entry_64.S:814
</IRQ>
RIP: 0010:finish_lock_switch+0x2b/0x40 kernel/sched/core.c:2597
Code: 48 89 e5 53 48 89 fb e8 63 e7 95 00 8b b8 88 0c 00 00 48 8b 00 48 85 c0 75 12 48 89 df e8 dd db 95 00 c6 00 00 c6 03 00 fb 5b <5d> c3 e8 4e e6 95 00 eb e7 66 90 66 2e 0f 1f 84 00 00 00 00 00 55
RSP: 0018:ffff888081a0fc80 EFLAGS: 00000296 ORIG_RAX: ffffffffffffff13
RAX: ffff88821fd6bd80 RBX: ffff888027898000 RCX: ccccccccccccd000
RDX: ffff88821fca8d80 RSI: ffff888000000000 RDI: 00000000000004a0
RBP: ffff888081a0fc80 R08: 0000000000000002 R09: ffff888081a0fb08
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff88811130e388 R14: ffff88811130da00 R15: ffff88812fdb7d80
finish_task_switch+0xfc/0x2d0 kernel/sched/core.c:2698
context_switch kernel/sched/core.c:2851 [inline]
__schedule+0x6cc/0x800 kernel/sched/core.c:3491
schedule+0x15b/0x240 kernel/sched/core.c:3535
freezable_schedule include/linux/freezer.h:172 [inline]
do_nanosleep+0x2ba/0x980 kernel/time/hrtimer.c:1679
hrtimer_nanosleep kernel/time/hrtimer.c:1733 [inline]
__do_sys_nanosleep kernel/time/hrtimer.c:1767 [inline]
__se_sys_nanosleep+0x746/0x960 kernel/time/hrtimer.c:1754
__x64_sys_nanosleep+0x3e/0x60 kernel/time/hrtimer.c:1754
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x4855a0
Code: 00 00 48 c7 c0 d4 ff ff ff 64 c7 00 16 00 00 00 31 c0 eb be 66 0f 1f 44 00 00 83 3d b1 11 5d 00 00 75 14 b8 23 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 04 e2 f8 ff c3 48 83 ec 08 e8 3a 55 fd ff
RSP: 002b:0000000000a4fd58 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
RAX: ffffffffffffffda RBX: 0000000000085780 RCX: 00000000004855a0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000a4fd60
RBP: 00000000000007ec R08: 0000000000000001 R09: 0000000000ceb940
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000a4fdb0 R14: 0000000000085711 R15: 0000000000a4fdc0
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:205 [inline]
kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:159
kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
slab_post_alloc_hook mm/slab.h:445 [inline]
slab_alloc_node mm/slub.c:2773 [inline]
__kmalloc_node_track_caller+0xe9e/0xff0 mm/slub.c:4398
__kmalloc_reserve net/core/skbuff.c:140 [inline]
__alloc_skb+0x309/0xa20 net/core/skbuff.c:208
alloc_skb include/linux/skbuff.h:1012 [inline]
alloc_skb_with_frags+0x186/0xa60 net/core/skbuff.c:5287
sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2091
sock_alloc_send_skb+0xca/0xe0 net/core/sock.c:2108
__ip_append_data+0x34cd/0x5000 net/ipv4/ip_output.c:998
ip_append_data+0x324/0x480 net/ipv4/ip_output.c:1220
icmp_push_reply+0x23d/0x7e0 net/ipv4/icmp.c:375
__icmp_send+0x2ea3/0x30f0 net/ipv4/icmp.c:737
icmp_send include/net/icmp.h:47 [inline]
ipv4_link_failure+0x6d/0x230 net/ipv4/route.c:1190
dst_link_failure include/net/dst.h:427 [inline]
arp_error_report+0x106/0x1a0 net/ipv4/arp.c:297
neigh_invalidate+0x359/0x8e0 net/core/neighbour.c:992
neigh_timer_handler+0xdf2/0x1280 net/core/neighbour.c:1078
call_timer_fn+0x285/0x600 kernel/time/timer.c:1325
expire_timers kernel/time/timer.c:1362 [inline]
__run_timers+0xdb4/0x11d0 kernel/time/timer.c:1681
run_timer_softirq+0x2e/0x50 kernel/time/timer.c:1694
__do_softirq+0x53f/0x93a kernel/softirq.c:293
Fixes: 26fc181e6cac ("fou, fou6: do not assume linear skbs")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When used with unlocked classifier that have filters attached to actions
with goto chain, __tcf_chain_put() for last non action reference can race
with calls to same function from action cleanup code that releases last
action reference. In this case action cleanup handler could free the chain
if it executes after all references to chain were released, but before all
concurrent users finished using it. Modify __tcf_chain_put() to only access
tcf_chain fields when holding block->lock. Remove local variables that were
used to cache some tcf_chain fields and are no longer needed because their
values can now be obtained directly from chain under block->lock
protection.
Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previous to commit 22b5c0b63f32 ("vsock/virtio: fix kernel panic
after device hot-unplug"), vsock_core_init() was called from
virtio_vsock_probe(). Now, virtio_transport_reset_no_sock() can be called
before vsock_core_init() has the chance to run.
[Wed Feb 27 14:17:09 2019] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
[Wed Feb 27 14:17:09 2019] #PF error: [normal kernel read fault]
[Wed Feb 27 14:17:09 2019] PGD 0 P4D 0
[Wed Feb 27 14:17:09 2019] Oops: 0000 [#1] SMP PTI
[Wed Feb 27 14:17:09 2019] CPU: 3 PID: 59 Comm: kworker/3:1 Not tainted 5.0.0-rc7-390-generic-hvi #390
[Wed Feb 27 14:17:09 2019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[Wed Feb 27 14:17:09 2019] Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport]
[Wed Feb 27 14:17:09 2019] RIP: 0010:virtio_transport_reset_no_sock+0x8c/0xc0 [vmw_vsock_virtio_transport_common]
[Wed Feb 27 14:17:09 2019] Code: 35 8b 4f 14 48 8b 57 08 31 f6 44 8b 4f 10 44 8b 07 48 8d 7d c8 e8 84 f8 ff ff 48 85 c0 48 89 c3 74 2a e8 f7 31 03 00 48 89 df <48> 8b 80 10 01 00 00 e8 68 fb 69 ed 48 8b 75 f0 65 48 33 34 25 28
[Wed Feb 27 14:17:09 2019] RSP: 0018:ffffb42701ab7d40 EFLAGS: 00010282
[Wed Feb 27 14:17:09 2019] RAX: 0000000000000000 RBX: ffff9d79637ee080 RCX: 0000000000000003
[Wed Feb 27 14:17:09 2019] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff9d79637ee080
[Wed Feb 27 14:17:09 2019] RBP: ffffb42701ab7d78 R08: ffff9d796fae70e0 R09: ffff9d796f403500
[Wed Feb 27 14:17:09 2019] R10: ffffb42701ab7d90 R11: 0000000000000000 R12: ffff9d7969d09240
[Wed Feb 27 14:17:09 2019] R13: ffff9d79624e6840 R14: ffff9d7969d09318 R15: ffff9d796d48ff80
[Wed Feb 27 14:17:09 2019] FS: 0000000000000000(0000) GS:ffff9d796fac0000(0000) knlGS:0000000000000000
[Wed Feb 27 14:17:09 2019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Feb 27 14:17:09 2019] CR2: 0000000000000110 CR3: 0000000427f22000 CR4: 00000000000006e0
[Wed Feb 27 14:17:09 2019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Feb 27 14:17:09 2019] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Wed Feb 27 14:17:09 2019] Call Trace:
[Wed Feb 27 14:17:09 2019] virtio_transport_recv_pkt+0x63/0x820 [vmw_vsock_virtio_transport_common]
[Wed Feb 27 14:17:09 2019] ? kfree+0x17e/0x190
[Wed Feb 27 14:17:09 2019] ? detach_buf_split+0x145/0x160
[Wed Feb 27 14:17:09 2019] ? __switch_to_asm+0x40/0x70
[Wed Feb 27 14:17:09 2019] virtio_transport_rx_work+0xa0/0x106 [vmw_vsock_virtio_transport]
[Wed Feb 27 14:17:09 2019] NET: Registered protocol family 40
[Wed Feb 27 14:17:09 2019] process_one_work+0x167/0x410
[Wed Feb 27 14:17:09 2019] worker_thread+0x4d/0x460
[Wed Feb 27 14:17:09 2019] kthread+0x105/0x140
[Wed Feb 27 14:17:09 2019] ? rescuer_thread+0x360/0x360
[Wed Feb 27 14:17:09 2019] ? kthread_destroy_worker+0x50/0x50
[Wed Feb 27 14:17:09 2019] ret_from_fork+0x35/0x40
[Wed Feb 27 14:17:09 2019] Modules linked in: vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common input_leds vsock serio_raw i2c_piix4 mac_hid qemu_fw_cfg autofs4 cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_net psmouse drm net_failover pata_acpi virtio_blk failover floppy
Fixes: 22b5c0b63f32 ("vsock/virtio: fix kernel panic after device hot-unplug")
Reported-by: Alexandru Herghelegiu <aherghelegiu@bitdefender.com>
Signed-off-by: Adalbert Lazăr <alazar@bitdefender.com>
Co-developed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Sparse warning below:
sudo make C=2 CF=-D__CHECK_ENDIAN__ M=net/bpf/
CHECK net/bpf//test_run.c
net/bpf//test_run.c:19:77: warning: Using plain integer as NULL pointer
./include/linux/bpf-cgroup.h:295:77: warning: Using plain integer as NULL pointer
Fixes: 8bad74f9840f ("bpf: extend cgroup bpf core to allow multiple cgroup storage types")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Passing a non-existing option in the options member of struct
xdp_desc was, incorrectly, silently ignored. This patch addresses
that behavior, and drops any Tx descriptor with non-existing options.
We have examined existing user space code, and to our best knowledge,
no one is relying on the current incorrect behavior. AF_XDP is still
in its infancy, so from our perspective, the risk of breakage is very
low, and addressing this problem now is important.
Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Passing a non-existing flag in the sxdp_flags member of struct
sockaddr_xdp was, incorrectly, silently ignored. This patch addresses
that behavior, and rejects any non-existing flags.
We have examined existing user space code, and to our best knowledge,
no one is relying on the current incorrect behavior. AF_XDP is still
in its infancy, so from our perspective, the risk of breakage is very
low, and addressing this problem now is important.
Fixes: 965a99098443 ("xsk: add support for bind for Rx")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
New ep's auth_hmacs should be set if old ep's is set, in case that
net->sctp.auth_enable has been changed to 0 by users and new ep's
auth_hmacs couldn't be set in sctp_endpoint_init().
It can even crash kernel by doing:
1. on server: sysctl -w net.sctp.auth_enable=1,
sysctl -w net.sctp.addip_enable=1,
sysctl -w net.sctp.addip_noauth_enable=0,
listen() on server,
sysctl -w net.sctp.auth_enable=0.
2. on client: connect() to server.
3. on server: accept() the asoc,
sysctl -w net.sctp.auth_enable=1.
4. on client: send() asconf packet to server.
The call trace:
[ 245.280251] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 245.286872] RIP: 0010:sctp_auth_calculate_hmac+0xa3/0x140 [sctp]
[ 245.304572] Call Trace:
[ 245.305091] <IRQ>
[ 245.311287] sctp_sf_authenticate+0x110/0x160 [sctp]
[ 245.312311] sctp_sf_eat_auth+0xf2/0x230 [sctp]
[ 245.313249] sctp_do_sm+0x9a/0x2d0 [sctp]
[ 245.321483] sctp_assoc_bh_rcv+0xed/0x1a0 [sctp]
[ 245.322495] sctp_rcv+0xa66/0xc70 [sctp]
It's because the old ep->auth_hmacs wasn't copied to the new ep while
ep->auth_hmacs is used in sctp_auth_calculate_hmac() when processing
the incoming auth chunks, and it should have been done when migrating
sock.
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sctp_auth_init_hmacs() is called only when ep->auth_enable is set.
It better to move up sctp_auth_init_hmacs() and remove auth_enable
check in it and check auth_enable only once in sctp_endpoint_init().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It should fail to create the new sk if sctp_bind_addr_dup() fails
when accepting or peeloff an association.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rxrpc_disconnect_client_call() reads the call's connection ID protocol
value (call->cid) as part of that function's variable declarations. This
is bad because it's not inside the locked section and so may race with
someone granting use of the channel to the call.
This manifests as an assertion failure (see below) where the call in the
presumed channel (0 because call->cid wasn't set when we read it) doesn't
match the call attached to the channel we were actually granted (if 1, 2 or
3).
Fix this by moving the read and dependent calculations inside of the
channel_lock section. Also, only set the channel number and pointer
variables if cid is not zero (ie. unset).
This problem can be induced by injecting an occasional error in
rxrpc_wait_for_channel() before the call to schedule().
Make two further changes also:
(1) Add a trace for wait failure in rxrpc_connect_call().
(2) Drop channel_lock before BUG'ing in the case of the assertion failure.
The failure causes a trace akin to the following:
rxrpc: Assertion failed - 18446612685268945920(0xffff8880beab8c00) == 18446612685268621312(0xffff8880bea69800) is false
------------[ cut here ]------------
kernel BUG at net/rxrpc/conn_client.c:824!
...
RIP: 0010:rxrpc_disconnect_client_call+0x2bf/0x99d
...
Call Trace:
rxrpc_connect_call+0x902/0x9b3
? wake_up_q+0x54/0x54
rxrpc_new_client_call+0x3a0/0x751
? rxrpc_kernel_begin_call+0x141/0x1bc
? afs_alloc_call+0x1b5/0x1b5
rxrpc_kernel_begin_call+0x141/0x1bc
afs_make_call+0x20c/0x525
? afs_alloc_call+0x1b5/0x1b5
? __lock_is_held+0x40/0x71
? lockdep_init_map+0xaf/0x193
? lockdep_init_map+0xaf/0x193
? __lock_is_held+0x40/0x71
? yfs_fs_fetch_data+0x33b/0x34a
yfs_fs_fetch_data+0x33b/0x34a
afs_fetch_data+0xdc/0x3b7
afs_read_dir+0x52d/0x97f
afs_dir_iterate+0xa0/0x661
? iterate_dir+0x63/0x141
iterate_dir+0xa2/0x141
ksys_getdents64+0x9f/0x11b
? filldir+0x111/0x111
? do_syscall_64+0x3e/0x1a0
__x64_sys_getdents64+0x16/0x19
do_syscall_64+0x7d/0x1a0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 45025bceef17 ("rxrpc: Improve management and caching of client connection objects")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot reported a NULL-ptr deref caused by that sched->init() in
sctp_stream_init() set stream->rr_next = NULL.
kasan: GPF could be caused by NULL-ptr deref or user memory access
RIP: 0010:sctp_sched_rr_dequeue+0xd3/0x170 net/sctp/stream_sched_rr.c:141
Call Trace:
sctp_outq_dequeue_data net/sctp/outqueue.c:90 [inline]
sctp_outq_flush_data net/sctp/outqueue.c:1079 [inline]
sctp_outq_flush+0xba2/0x2790 net/sctp/outqueue.c:1205
All sched info is saved in sout->ext now, in sctp_stream_init()
sctp_stream_alloc_out() will not change it, there's no need to
call sched->init() again, since sctp_outq_init() has already
done it.
Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations")
Reported-by: syzbot+4c9934f20522c0efd657@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The race occurs in __mkroute_output() when 2 threads lookup a dst:
CPU A CPU B
find_exception()
find_exception() [fnhe expires]
ip_del_fnhe() [fnhe is deleted]
rt_bind_exception()
In rt_bind_exception() it will bind a deleted fnhe with the new dst, and
this dst will get no chance to be freed. It causes a dev defcnt leak and
consecutive dmesg warnings:
unregister_netdevice: waiting for ethX to become free. Usage count = 1
Especially thanks Jon to identify the issue.
This patch fixes it by setting fnhe_daddr to 0 in ip_del_fnhe() to stop
binding the deleted fnhe with a new dst when checking fnhe's fnhe_daddr
and daddr in rt_bind_exception().
It works as both ip_del_fnhe() and rt_bind_exception() are protected by
fnhe_lock and the fhne is freed by kfree_rcu().
Fixes: deed49df7390 ("route: check and remove route cache when we get route")
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The abort path can cause a double-free of an anonymous set.
Added-and-to-be-aborted rule looks like this:
udp dport { 137, 138 } drop
The to-be-aborted transaction list looks like this:
newset
newsetelem
newsetelem
rule
This gets walked in reverse order, so first pass disables the rule, the
set elements, then the set.
After synchronize_rcu(), we then destroy those in same order: rule, set
element, set element, newset.
Problem is that the anonymous set has already been bound to the rule, so
the rule (lookup expression destructor) already frees the set, when then
cause use-after-free when trying to delete the elements from this set,
then try to free the set again when handling the newset expression.
Rule releases the bound set in first place from the abort path, this
causes the use-after-free on set element removal when undoing the new
element transactions. To handle this, skip new element transaction if
set is bound from the abort path.
This is still causes the use-after-free on set element removal. To
handle this, remove transaction from the list when the set is already
bound.
Joint work with Florian Westphal.
Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path")
Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Otherwise, we get notifier list corruption.
This is the most simple fix: remove the device notifier call chain
from the ipv6 masquerade register function and handle it only
in the ipv4 version.
The better fix is merge
nf_nat_masquerade_ipv4/6_(un)register_notifier
into a single
nf_nat_masquerade_(un)register_notifiers
but to do this its needed to first merge the two masquerade modules
into a single xt_MASQUERADE.
Furthermore, we need to use different refcounts for ipv4/ipv6
until we can merge MASQUERADE.
Fixes: d1aca8ab3104a ("netfilter: nat: merge ipv4 and ipv6 masquerade functionality")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
syzbot found another add_timer() issue, this time in net/hsr [1]
Let's use mod_timer() which is safe.
[1]
kernel BUG at kernel/time/timer.c:1136!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 15909 Comm: syz-executor.3 Not tainted 5.0.0+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
kobject: 'loop2' (00000000f5629718): kobject_uevent_env
RIP: 0010:add_timer kernel/time/timer.c:1136 [inline]
RIP: 0010:add_timer+0x654/0xbe0 kernel/time/timer.c:1134
Code: 0f 94 c5 31 ff 44 89 ee e8 09 61 0f 00 45 84 ed 0f 84 77 fd ff ff e8 bb 5f 0f 00 e8 07 10 a0 ff e9 68 fd ff ff e8 ac 5f 0f 00 <0f> 0b e8 a5 5f 0f 00 0f 0b e8 9e 5f 0f 00 4c 89 b5 58 ff ff ff e9
RSP: 0018:ffff8880656eeca0 EFLAGS: 00010246
kobject: 'loop2' (00000000f5629718): fill_kobj_path: path = '/devices/virtual/block/loop2'
RAX: 0000000000040000 RBX: 1ffff1100caddd9a RCX: ffffc9000c436000
RDX: 0000000000040000 RSI: ffffffff816056c4 RDI: ffff88806a2f6cc8
RBP: ffff8880656eed58 R08: ffff888067f4a300 R09: ffff888067f4abc8
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88806a2f6cc0
R13: dffffc0000000000 R14: 0000000000000001 R15: ffff8880656eed30
FS: 00007fc2019bf700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000738000 CR3: 0000000067e8e000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
hsr_check_announce net/hsr/hsr_device.c:99 [inline]
hsr_check_carrier_and_operstate+0x567/0x6f0 net/hsr/hsr_device.c:120
hsr_netdev_notify+0x297/0xa00 net/hsr/hsr_main.c:51
notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
call_netdevice_notifiers net/core/dev.c:1765 [inline]
dev_open net/core/dev.c:1436 [inline]
dev_open+0x143/0x160 net/core/dev.c:1424
team_port_add drivers/net/team/team.c:1203 [inline]
team_add_slave+0xa07/0x15d0 drivers/net/team/team.c:1933
do_set_master net/core/rtnetlink.c:2358 [inline]
do_set_master+0x1d4/0x230 net/core/rtnetlink.c:2332
do_setlink+0x966/0x3510 net/core/rtnetlink.c:2493
rtnl_setlink+0x271/0x3b0 net/core/rtnetlink.c:2747
rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192
netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210
netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:632
sock_write_iter+0x27c/0x3e0 net/socket.c:923
call_write_iter include/linux/fs.h:1869 [inline]
do_iter_readv_writev+0x5e0/0x8e0 fs/read_write.c:680
do_iter_write fs/read_write.c:956 [inline]
do_iter_write+0x184/0x610 fs/read_write.c:937
vfs_writev+0x1b3/0x2f0 fs/read_write.c:1001
do_writev+0xf6/0x290 fs/read_write.c:1036
__do_sys_writev fs/read_write.c:1109 [inline]
__se_sys_writev fs/read_write.c:1106 [inline]
__x64_sys_writev+0x75/0xb0 fs/read_write.c:1106
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457f29
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fc2019bec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457f29
RDX: 0000000000000001 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc2019bf6d4
R13: 00000000004c4a60 R14: 00000000004dd218 R15: 00000000ffffffff
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I removed compat's universal assignment to 0, which allows this if
statement to fall through when compat is passed with a value other
than 0.
Fixes: f9d19a7494e5 ("net: atm: Use IS_ENABLED in atm_dev_ioctl")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When building with -Wsometimes-uninitialized, Clang warns:
net/atm/resources.c:256:6: warning: variable 'number' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
net/atm/resources.c:212:7: warning: variable 'iobuf_len' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
Clang won't realize that compat is 0 when CONFIG_COMPAT is not set until
the constant folding stage, which happens after this semantic analysis.
Use IS_ENABLED instead so that the zero is present at the semantic
analysis stage, which eliminates this warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/386
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
clang inlines the dev_ethtool() more aggressively than gcc does, leading
to a larger amount of used stack space:
net/core/ethtool.c:2536:24: error: stack frame size of 1216 bytes in function 'dev_ethtool' [-Werror,-Wframe-larger-than=]
Marking the sub-functions that require the most stack space as
noinline_for_stack gives us reasonable behavior on all compilers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We might have never enabled (started) the psock's parser, in which case it
will not get stopped when destroying the psock. This leads to a warning
when trying to cancel parser's work from psock's deferred destructor:
[ 405.325769] WARNING: CPU: 1 PID: 3216 at net/strparser/strparser.c:526 strp_done+0x3c/0x40
[ 405.326712] Modules linked in: [last unloaded: test_bpf]
[ 405.327359] CPU: 1 PID: 3216 Comm: kworker/1:164 Tainted: G W 5.0.0 #42
[ 405.328294] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
[ 405.329712] Workqueue: events sk_psock_destroy_deferred
[ 405.330254] RIP: 0010:strp_done+0x3c/0x40
[ 405.330706] Code: 28 e8 b8 d5 6b ff 48 8d bb 80 00 00 00 e8 9c d5 6b ff 48 8b 7b 18 48 85 ff 74 0d e8 1e a5 e8 ff 48 c7 43 18 00 00 00 00 5b c3 <0f> 0b eb cf 66 66 66 66 90 55 89 f5 53 48 89 fb 48 83 c7 28 e8 0b
[ 405.332862] RSP: 0018:ffffc900026bbe50 EFLAGS: 00010246
[ 405.333482] RAX: ffffffff819323e0 RBX: ffff88812cb83640 RCX: ffff88812cb829e8
[ 405.334228] RDX: 0000000000000001 RSI: ffff88812cb837e8 RDI: ffff88812cb83640
[ 405.335366] RBP: ffff88813fd22680 R08: 0000000000000000 R09: 000073746e657665
[ 405.336472] R10: 8080808080808080 R11: 0000000000000001 R12: ffff88812cb83600
[ 405.337760] R13: 0000000000000000 R14: ffff88811f401780 R15: ffff88812cb837e8
[ 405.338777] FS: 0000000000000000(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
[ 405.339903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 405.340821] CR2: 00007fb11489a6b8 CR3: 000000012d4d6000 CR4: 00000000000406e0
[ 405.341981] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 405.343131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 405.344415] Call Trace:
[ 405.344821] sk_psock_destroy_deferred+0x23/0x1b0
[ 405.345585] process_one_work+0x1ae/0x3e0
[ 405.346110] worker_thread+0x3c/0x3b0
[ 405.346576] ? pwq_unbound_release_workfn+0xd0/0xd0
[ 405.347187] kthread+0x11d/0x140
[ 405.347601] ? __kthread_parkme+0x80/0x80
[ 405.348108] ret_from_fork+0x35/0x40
[ 405.348566] ---[ end trace a4a3af4026a327d4 ]---
Stop psock's parser just before canceling its work.
Fixes: 1d79895aef18 ("sk_msg: Always cancel strp work before freeing the psock")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
GSO needs inner headers and inner protocol set properly to work.
skb->inner_mac_header: skb_reset_inner_headers() assigns the current
mac header value to inner_mac_header; but it is not set at the point,
so we need to call skb_reset_inner_mac_header, otherwise gre_gso_segment
fails: it does
int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
...
if (unlikely(!pskb_may_pull(skb, tnl_hlen)))
...
skb->inner_protocol should also be correctly set.
Fixes: ca78801a81e0 ("bpf: handle GSO in bpf_lwt_push_encap")
Signed-off-by: Peter Oskolkov <posk@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Fixes two typos in xsk_diag_put_umem()
syzbot reported the following crash :
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 7641 Comm: syz-executor946 Not tainted 5.0.0-rc7+ #95
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:xsk_diag_put_umem net/xdp/xsk_diag.c:71 [inline]
RIP: 0010:xsk_diag_fill net/xdp/xsk_diag.c:113 [inline]
RIP: 0010:xsk_diag_dump+0xdcb/0x13a0 net/xdp/xsk_diag.c:143
Code: 8d be c0 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 39 04 00 00 49 8b 96 c0 04 00 00 48 8d 7a 14 48 89 f8 48 c1 e8 03 <42> 0f b6 0c 20 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
RSP: 0018:ffff888090bcf2d8 EFLAGS: 00010203
RAX: 0000000000000002 RBX: ffff8880a0aacbc0 RCX: ffffffff86ffdc3c
RDX: 0000000000000000 RSI: ffffffff86ffdc70 RDI: 0000000000000014
RBP: ffff888090bcf438 R08: ffff88808e04a700 R09: ffffed1011c74174
R10: ffffed1011c74173 R11: ffff88808e3a0b9f R12: dffffc0000000000
R13: ffff888093a6d818 R14: ffff88808e365240 R15: ffff88808e3a0b40
FS: 00000000011ea880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 000000008fa13000 CR4: 00000000001406e0
Call Trace:
netlink_dump+0x55d/0xfb0 net/netlink/af_netlink.c:2252
__netlink_dump_start+0x5b4/0x7e0 net/netlink/af_netlink.c:2360
netlink_dump_start include/linux/netlink.h:226 [inline]
xsk_diag_handler_dump+0x1b2/0x250 net/xdp/xsk_diag.c:170
__sock_diag_cmd net/core/sock_diag.c:232 [inline]
sock_diag_rcv_msg+0x322/0x410 net/core/sock_diag.c:263
netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
sock_diag_rcv+0x2b/0x40 net/core/sock_diag.c:274
netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:632
sock_write_iter+0x27c/0x3e0 net/socket.c:923
call_write_iter include/linux/fs.h:1863 [inline]
do_iter_readv_writev+0x5e0/0x8e0 fs/read_write.c:680
do_iter_write fs/read_write.c:956 [inline]
do_iter_write+0x184/0x610 fs/read_write.c:937
vfs_writev+0x1b3/0x2f0 fs/read_write.c:1001
do_writev+0xf6/0x290 fs/read_write.c:1036
__do_sys_writev fs/read_write.c:1109 [inline]
__se_sys_writev fs/read_write.c:1106 [inline]
__x64_sys_writev+0x75/0xb0 fs/read_write.c:1106
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440139
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffcc966cc18 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440139
RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000000000000004 R11: 0000000000000246 R12: 00000000004019c0
R13: 0000000000401a50 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace 460a3c24d0a656c9 ]---
RIP: 0010:xsk_diag_put_umem net/xdp/xsk_diag.c:71 [inline]
RIP: 0010:xsk_diag_fill net/xdp/xsk_diag.c:113 [inline]
RIP: 0010:xsk_diag_dump+0xdcb/0x13a0 net/xdp/xsk_diag.c:143
Code: 8d be c0 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 39 04 00 00 49 8b 96 c0 04 00 00 48 8d 7a 14 48 89 f8 48 c1 e8 03 <42> 0f b6 0c 20 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
RSP: 0018:ffff888090bcf2d8 EFLAGS: 00010203
RAX: 0000000000000002 RBX: ffff8880a0aacbc0 RCX: ffffffff86ffdc3c
RDX: 0000000000000000 RSI: ffffffff86ffdc70 RDI: 0000000000000014
RBP: ffff888090bcf438 R08: ffff88808e04a700 R09: ffffed1011c74174
R10: ffffed1011c74173 R11: ffff88808e3a0b9f R12: dffffc0000000000
R13: ffff888093a6d818 R14: ffff88808e365240 R15: ffff88808e3a0b40
FS: 00000000011ea880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000001d22000 CR3: 000000008fa13000 CR4: 00000000001406f0
Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
BPF can adjust gso only for tcp bytestreams. Fail on other gso types.
But only on gso packets. It does not touch this field if !gso_size.
Fixes: b90efd225874 ("bpf: only adjust gso_size on bytestream protocols")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Returning 0 as inq to userspace indicates there is no more data to
read, and the application needs to wait for EPOLLIN. For a connection
that has received FIN from the remote peer, however, the application
must continue reading until getting EOF (return value of 0
from tcp_recvmsg) or an error, if edge-triggered epoll (EPOLLET) is
being used. Otherwise, the application will never receive a new
EPOLLIN, since there is no epoll edge after the FIN.
Return 1 when there is no data left on the queue but the
connection has received FIN, so that the applications continue
reading.
Fixes: b75eba76d3d72 (tcp: send in-queue bytes in cmsg upon read)
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) failed to
add port, it directly returns res and forgets to free the node
that allocated in hsr_create_self_node(), and forgets to delete
the node->mac_list linked in hsr->self_node_db.
BUG: memory leak
unreferenced object 0xffff8881cfa0c780 (size 64):
comm "syz-executor.0", pid 2077, jiffies 4294717969 (age 2415.377s)
hex dump (first 32 bytes):
e0 c7 a0 cf 81 88 ff ff 00 02 00 00 00 00 ad de ................
00 e6 49 cd 81 88 ff ff c0 9b 87 d0 81 88 ff ff ..I.............
backtrace:
[<00000000e2ff5070>] hsr_dev_finalize+0x736/0x960 [hsr]
[<000000003ed2e597>] hsr_newlink+0x2b2/0x3e0 [hsr]
[<000000003fa8c6b6>] __rtnl_newlink+0xf1f/0x1600 net/core/rtnetlink.c:3182
[<000000001247a7ad>] rtnl_newlink+0x66/0x90 net/core/rtnetlink.c:3240
[<00000000e7d1b61d>] rtnetlink_rcv_msg+0x54e/0xb90 net/core/rtnetlink.c:5130
[<000000005556bd3a>] netlink_rcv_skb+0x129/0x340 net/netlink/af_netlink.c:2477
[<00000000741d5ee6>] netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
[<00000000741d5ee6>] netlink_unicast+0x49a/0x650 net/netlink/af_netlink.c:1336
[<000000009d56f9b7>] netlink_sendmsg+0x88b/0xdf0 net/netlink/af_netlink.c:1917
[<0000000046b35c59>] sock_sendmsg_nosec net/socket.c:621 [inline]
[<0000000046b35c59>] sock_sendmsg+0xc3/0x100 net/socket.c:631
[<00000000d208adc9>] __sys_sendto+0x33e/0x560 net/socket.c:1786
[<00000000b582837a>] __do_sys_sendto net/socket.c:1798 [inline]
[<00000000b582837a>] __se_sys_sendto net/socket.c:1794 [inline]
[<00000000b582837a>] __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1794
[<00000000c866801d>] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
[<00000000fea382d9>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<00000000e01dacb3>] 0xffffffffffffffff
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When adding new filter to flower classifier, fl_change() inserts it to
handle_idr before initializing filter extensions and assigning it a mask.
Normally this ordering doesn't matter because all flower classifier ops
callbacks assume rtnl lock protection. However, when filter has an action
that doesn't have its kernel module loaded, rtnl lock is released before
call to request_module(). During this time the filter can be accessed bu
concurrent task before its initialization is completed, which can lead to a
crash.
Example case of NULL pointer dereference in concurrent dump:
Task 1 Task 2
tc_new_tfilter()
fl_change()
idr_alloc_u32(fnew)
fl_set_parms()
tcf_exts_validate()
tcf_action_init()
tcf_action_init_1()
rtnl_unlock()
request_module()
... rtnl_lock()
tc_dump_tfilter()
tcf_chain_dump()
fl_walk()
idr_get_next_ul()
tcf_node_dump()
tcf_fill_node()
fl_dump()
mask = &f->mask->key; <- NULL ptr
rtnl_lock()
Extension initialization and mask assignment don't depend on fnew->handle
that is allocated by idr_alloc_u32(). Move idr allocation code after action
creation and mask assignment in fl_change() to prevent concurrent access
to not fully initialized filter when rtnl lock is released to load action
module.
Fixes: 01683a146999 ("net: sched: refactor flower walk to iterate over idr")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sendpage was not designed for processing of the Slab pages,
in some situations it can trigger BUG_ON on receiving side.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Naresh Kamboju noted the following oops during execution of selftest
tools/testing/selftests/bpf/test_tunnel.sh on x86_64:
[ 274.120445] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000000
[ 274.128285] #PF error: [INSTR]
[ 274.131351] PGD 8000000414a0e067 P4D 8000000414a0e067 PUD 3b6334067 PMD 0
[ 274.138241] Oops: 0010 [#1] SMP PTI
[ 274.141734] CPU: 1 PID: 11464 Comm: ping Not tainted
5.0.0-rc4-next-20190129 #1
[ 274.149046] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[ 274.156526] RIP: 0010: (null)
[ 274.160280] Code: Bad RIP value.
[ 274.163509] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
[ 274.168726] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
[ 274.175851] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
[ 274.182974] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
[ 274.190098] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
[ 274.197222] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
[ 274.204346] FS: 00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
knlGS:0000000000000000
[ 274.212424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 274.218162] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
[ 274.225292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 274.232416] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 274.239541] Call Trace:
[ 274.241988] ? tnl_update_pmtu+0x296/0x3b0
[ 274.246085] ip_md_tunnel_xmit+0x1bc/0x520
[ 274.250176] gre_fb_xmit+0x330/0x390
[ 274.253754] gre_tap_xmit+0x128/0x180
[ 274.257414] dev_hard_start_xmit+0xb7/0x300
[ 274.261598] sch_direct_xmit+0xf6/0x290
[ 274.265430] __qdisc_run+0x15d/0x5e0
[ 274.269007] __dev_queue_xmit+0x2c5/0xc00
[ 274.273011] ? dev_queue_xmit+0x10/0x20
[ 274.276842] ? eth_header+0x2b/0xc0
[ 274.280326] dev_queue_xmit+0x10/0x20
[ 274.283984] ? dev_queue_xmit+0x10/0x20
[ 274.287813] arp_xmit+0x1a/0xf0
[ 274.290952] arp_send_dst.part.19+0x46/0x60
[ 274.295138] arp_solicit+0x177/0x6b0
[ 274.298708] ? mod_timer+0x18e/0x440
[ 274.302281] neigh_probe+0x57/0x70
[ 274.305684] __neigh_event_send+0x197/0x2d0
[ 274.309862] neigh_resolve_output+0x18c/0x210
[ 274.314212] ip_finish_output2+0x257/0x690
[ 274.318304] ip_finish_output+0x219/0x340
[ 274.322314] ? ip_finish_output+0x219/0x340
[ 274.326493] ip_output+0x76/0x240
[ 274.329805] ? ip_fragment.constprop.53+0x80/0x80
[ 274.334510] ip_local_out+0x3f/0x70
[ 274.337992] ip_send_skb+0x19/0x40
[ 274.341391] ip_push_pending_frames+0x33/0x40
[ 274.345740] raw_sendmsg+0xc15/0x11d0
[ 274.349403] ? __might_fault+0x85/0x90
[ 274.353151] ? _copy_from_user+0x6b/0xa0
[ 274.357070] ? rw_copy_check_uvector+0x54/0x130
[ 274.361604] inet_sendmsg+0x42/0x1c0
[ 274.365179] ? inet_sendmsg+0x42/0x1c0
[ 274.368937] sock_sendmsg+0x3e/0x50
[ 274.372460] ___sys_sendmsg+0x26f/0x2d0
[ 274.376293] ? lock_acquire+0x95/0x190
[ 274.380043] ? __handle_mm_fault+0x7ce/0xb70
[ 274.384307] ? lock_acquire+0x95/0x190
[ 274.388053] ? __audit_syscall_entry+0xdd/0x130
[ 274.392586] ? ktime_get_coarse_real_ts64+0x64/0xc0
[ 274.397461] ? __audit_syscall_entry+0xdd/0x130
[ 274.401989] ? trace_hardirqs_on+0x4c/0x100
[ 274.406173] __sys_sendmsg+0x63/0xa0
[ 274.409744] ? __sys_sendmsg+0x63/0xa0
[ 274.413488] __x64_sys_sendmsg+0x1f/0x30
[ 274.417405] do_syscall_64+0x55/0x190
[ 274.421064] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 274.426113] RIP: 0033:0x7ff4ae0e6e87
[ 274.429686] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00
00 00 00 8b 05 ca d9 2b 00 48 63 d2 48 63 ff 85 c0 75 10 b8 2e 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 53 48 89 f3 48 83 ec 10 48 89 7c
24 08
[ 274.448422] RSP: 002b:00007ffcd9b76db8 EFLAGS: 00000246 ORIG_RAX:
000000000000002e
[ 274.455978] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007ff4ae0e6e87
[ 274.463104] RDX: 0000000000000000 RSI: 00000000006092e0 RDI: 0000000000000003
[ 274.470228] RBP: 0000000000000000 R08: 00007ffcd9bc40a0 R09: 00007ffcd9bc4080
[ 274.477349] R10: 000000000000060a R11: 0000000000000246 R12: 0000000000000003
[ 274.484475] R13: 0000000000000016 R14: 00007ffcd9b77fa0 R15: 00007ffcd9b78da4
[ 274.491602] Modules linked in: cls_bpf sch_ingress iptable_filter
ip_tables algif_hash af_alg x86_pkg_temp_thermal fuse [last unloaded:
test_bpf]
[ 274.504634] CR2: 0000000000000000
[ 274.507976] ---[ end trace 196d18386545eae1 ]---
[ 274.512588] RIP: 0010: (null)
[ 274.516334] Code: Bad RIP value.
[ 274.519557] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
[ 274.524775] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
[ 274.531921] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
[ 274.539082] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
[ 274.546205] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
[ 274.553329] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
[ 274.560456] FS: 00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
knlGS:0000000000000000
[ 274.568541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 274.574277] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
[ 274.581403] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 274.588535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 274.595658] Kernel panic - not syncing: Fatal exception in interrupt
[ 274.602046] Kernel Offset: 0x14400000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 274.612827] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
[ 274.620387] ------------[ cut here ]------------
I'm also seeing the same failure on x86_64, and it reproduces
consistently.
>From poking around it looks like the skb's dst entry is being used
to calculate the mtu in:
mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
...but because that dst_entry has an "ops" value set to md_dst_ops,
the various ops (including mtu) are not set:
crash> struct sk_buff._skb_refdst ffff928f87447700 -x
_skb_refdst = 0xffffcd6fbf5ea590
crash> struct dst_entry.ops 0xffffcd6fbf5ea590
ops = 0xffffffffa0193800
crash> struct dst_ops.mtu 0xffffffffa0193800
mtu = 0x0
crash>
I confirmed that the dst entry also has dst->input set to
dst_md_discard, so it looks like it's an entry that's been
initialized via __metadata_dst_init alright.
I think the fix here is to use skb_valid_dst(skb) - it checks
for DST_METADATA also, and with that fix in place, the
problem - which was previously 100% reproducible - disappears.
The below patch resolves the panic and all bpf tunnel tests pass
without incident.
Fixes: c8b34e680a09 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a non local multicast packet reaches ip_route_input_rcu() while
the ingress device IPv4 private data (in_dev) is NULL, we end up
doing a NULL pointer dereference in IN_DEV_MFORWARD().
Since the later call to ip_route_input_mc() is going to fail if
!in_dev, we can fail early in such scenario and avoid the dangerous
code path.
v1 -> v2:
- clarified the commit message, no code changes
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dst_cache_destroy will be called in dst_release
dst_release-->dst_destroy_rcu-->dst_destroy-->metadata_dst_free
-->dst_cache_destroy
It should not call dst_cache_destroy before dst_release
Fixes: 41411e2fd6b8 ("net/sched: act_tunnel_key: Add dst_cache support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix regression bug introduced in
commit 365ad353c256 ("tipc: reduce risk of user starvation during link
congestion")
Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
sockaddr.
Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
The label is only used from inside the #ifdef and should be
hidden the same way, to avoid this warning:
net/sched/act_tunnel_key.c: In function 'tunnel_key_init':
net/sched/act_tunnel_key.c:389:1: error: label 'release_tun_meta' defined but not used [-Werror=unused-label]
release_tun_meta:
Fixes: 41411e2fd6b8 ("net/sched: act_tunnel_key: Add dst_cache support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When CONFIG_SYSCTL is turned off, we get a link failure for
the newly introduced tuning knob.
net/ipv6/addrconf.o: In function `addrconf_init_net':
addrconf.c:(.text+0x31dc): undefined reference to `sysctl_devconf_inherit_init_net'
Add an IS_ENABLED() check to fall back to the default behavior
(sysctl_devconf_inherit_init_net=0) here.
Fixes: 856c395cfa63 ("net: introduce a knob to control whether to inherit devconf config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is possible that a reporter state will be updated due to a recover flow
which is not triggered by a devlink health related operation, but as a side
effect of some other operation in the system.
Expose devlink health API for a direct update of a reporter status.
Move devlink_health_reporter_state enum definition to devlink.h so it could
be used from drivers as a parameter of devlink_health_reporter_state_update.
In addition, add trace_devlink_health_reporter_state_update to provide user
notification for reporter state change.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If devlink_health_report() aborted the recover flow due to grace period checker,
it left the reporter status as DEVLINK_HEALTH_REPORTER_STATE_HEALTHY, which is
a bug. Fix that by always setting the reporter state to
DEVLINK_HEALTH_REPORTER_STATE_ERROR prior to running the checker mentioned above.
In addition, save the previous health_state in a temporary variable, then use
it in the abort check comparison instead of using reporter->health_state which
might be already changed.
Fixes: c8e1da0bf923 ("devlink: Add health report functionality")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The user msg is also copied to the abort packet when doing SCTP_ABORT in
sctp_sendmsg_check_sflags(). When SCTP_SENDALL is set, iov_iter_revert()
should have been called for sending abort on the next asoc with copying
this msg. Otherwise, memcpy_from_msg() in sctp_make_abort_user() will
fail and return error.
Fixes: 4910280503f3 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|