linux-toradex.git/kernel/bpf, branch v5.12-rc5

bpf: Fix umd memory leak in copy_process()

2021-03-19T21:23:19+00:00

The syzbot reported a memleak as follows:

BUG: memory leak
unreferenced object 0xffff888101b41d00 (size 120):
  comm "kworker/u4:0", pid 8, jiffies 4294944270 (age 12.780s)
  backtrace:
    [] alloc_pid+0x66/0x560
    [] copy_process+0x1465/0x25e0
    [] kernel_clone+0xf3/0x670
    [] kernel_thread+0x61/0x80
    [] call_usermodehelper_exec_work
    [] call_usermodehelper_exec_work+0xc4/0x120
    [] process_one_work+0x2c9/0x600
    [] worker_thread+0x59/0x5d0
    [] kthread+0x178/0x1b0
    [] ret_from_fork+0x1f/0x30

unreferenced object 0xffff888110ef5c00 (size 232):
  comm "kworker/u4:0", pid 8414, jiffies 4294944270 (age 12.780s)
  backtrace:
    [] kmem_cache_zalloc
    [] __alloc_file+0x1f/0xf0
    [] alloc_empty_file+0x69/0x120
    [] alloc_file+0x33/0x1b0
    [] alloc_file_pseudo+0xb2/0x140
    [] create_pipe_files+0x138/0x2e0
    [] umd_setup+0x33/0x220
    [] call_usermodehelper_exec_async+0xb4/0x1b0
    [] ret_from_fork+0x1f/0x30

After the UMD process exits, the pipe_to_umh/pipe_from_umh and
tgid need to be released.

Fixes: d71fa5c9763c ("bpf: Add kernel module with user mode driver that populates bpffs.")
Reported-by: syzbot+44908bb56d2bfe56b28e@syzkaller.appspotmail.com
Signed-off-by: Zqiang 
Signed-off-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20210317030915.2865-1-qiang.zhang@windriver.com

bpf: Fix fexit trampoline.

2021-03-17T23:22:51+00:00

The fexit/fmod_ret programs can be attached to kernel functions that can sleep.
The synchronize_rcu_tasks() will not wait for such tasks to complete.
In such case the trampoline image will be freed and when the task
wakes up the return IP will point to freed memory causing the crash.
Solve this by adding percpu_ref_get/put for the duration of trampoline
and separate trampoline vs its image life times.
The "half page" optimization has to be removed, since
first_half->second_half->first_half transition cannot be guaranteed to
complete in deterministic time. Every trampoline update becomes a new image.
The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and
call_rcu_tasks. Together they will wait for the original function and
trampoline asm to complete. The trampoline is patched from nop to jmp to skip
fexit progs. They are freed independently from the trampoline. The image with
fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which
will wait for both sleepable and non-sleepable progs to complete.

Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline")
Reported-by: Andrii Nakryiko 
Signed-off-by: Alexei Starovoitov 
Signed-off-by: Daniel Borkmann 
Acked-by: Paul E. McKenney   # for RCU
Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.com

bpf: Add sanity check for upper ptr_limit

2021-03-17T20:57:39+00:00

Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.

Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov

bpf: Simplify alu_limit masking for pointer arithmetic

2021-03-17T18:13:22+00:00

Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.

Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov

bpf: Fix off-by-one for area size in creating mask to left

2021-03-17T18:12:43+00:00

retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.

When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.

Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov

bpf: Prohibit alu ops for pointer types not defining ptr_limit

2021-03-17T18:12:02+00:00

The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.

The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.

Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c6967326267 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.

Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov

bpf: Change inode_storage's lookup_elem return value from NULL to -EBADF

2021-03-08T15:08:06+00:00

bpf_fd_inode_storage_lookup_elem() returned NULL when getting a bad FD,
which caused -ENOENT in bpf_map_copy_value. -EBADF error is better than
-ENOENT for a bad FD behaviour.

The patch was partially contributed by CyberArk Software, Inc.

Fixes: 8ea636848aca ("bpf: Implement bpf_local_storage for inodes")
Signed-off-by: Tal Lossos 
Signed-off-by: Daniel Borkmann 
Acked-by: Yonghong Song 
Acked-by: KP Singh 
Link: https://lore.kernel.org/bpf/20210307120948.61414-1-tallossos@gmail.com

bpf: Dont allow vmlinux BTF to be used in map_create and prog_load.

2021-03-08T12:32:46+00:00

The syzbot got FD of vmlinux BTF and passed it into map_create which caused
crash in btf_type_id_size() when it tried to access resolved_ids. The vmlinux
BTF doesn't have 'resolved_ids' and 'resolved_sizes' initialized to save
memory. To avoid such issues disallow using vmlinux BTF in prog_load and
map_create commands.

Fixes: 5329722057d4 ("bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO")
Reported-by: syzbot+8bab8ed346746e7540e8@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov 
Signed-off-by: Daniel Borkmann 
Acked-by: Yonghong Song 
Link: https://lore.kernel.org/bpf/20210307225248.79031-1-alexei.starovoitov@gmail.com

bpf: Explicitly zero-extend R0 after 32-bit cmpxchg

2021-03-05T03:06:03+00:00

As pointed out by Ilya and explained in the new comment, there's a
discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
the value from memory into r0, while x86 only does so when r0 and the
value in memory are different. The same issue affects s390.

At first this might sound like pure semantics, but it makes a real
difference when the comparison is 32-bit, since the load will
zero-extend r0/rax.

The fix is to explicitly zero-extend rax after doing such a
CMPXCHG. Since this problem affects multiple archs, this is done in
the verifier by patching in a BPF_ZEXT_REG instruction after every
32-bit cmpxchg. Any archs that don't need such manual zero-extension
can do a look-ahead with insn_is_zext to skip the unnecessary mov.

Note this still goes on top of Ilya's patch:

https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u

Differences v5->v6[1]:
 - Moved is_cmpxchg_insn and ensured it can be safely re-used. Also renamed it
   and removed 'inline' to match the style of the is_*_function helpers.
 - Fixed up comments in verifier test (thanks for the careful review, Martin!)

Differences v4->v5[1]:
 - Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
   for suggesting this.

Differences v3->v4[1]:
 - Moved the optimization against pointless zext into the correct place:
   opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.

Differences v2->v3[1]:
 - Moved patching into fixup_bpf_calls (patch incoming to rename this function)
 - Added extra commentary on bpf_jit_needs_zext
 - Added check to avoid adding a pointless zext(r0) if there's already one there.

Difference v1->v2[1]: Now solved centrally in the verifier instead of
  specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!

[1] v5: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
    v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
    v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
    v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
    v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t

Reported-by: Ilya Leoshkevich 
Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Brendan Jackman 
Acked-by: Martin KaFai Lau 
Acked-by: Ilya Leoshkevich 
Tested-by: Ilya Leoshkevich 
Signed-off-by: Alexei Starovoitov

bpf: Account for BPF_FETCH in insn_has_def32()

2021-03-04T15:02:31+00:00

insn_has_def32() returns false for 32-bit BPF_FETCH insns. This makes
adjust_insn_aux_data() incorrectly set zext_dst, as can be seen in [1].
This happens because insn_no_def() does not know about the BPF_FETCH
variants of BPF_STX.

Fix in two steps.

First, replace insn_no_def() with insn_def_regno(), which returns the
register an insn defines. Normally insn_no_def() calls are followed by
insn->dst_reg uses; replace those with the insn_def_regno() return
value.

Second, adjust the BPF_STX special case in is_reg64() to deal with
queries made from opt_subreg_zext_lo32_rnd_hi32(), where the state
information is no longer available. Add a comment, since the purpose
of this special case is not clear at first glance.

  [1] https://lore.kernel.org/bpf/20210223150845.1857620-1-jackmanb@google.com/

Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Ilya Leoshkevich 
Signed-off-by: Daniel Borkmann 
Acked-by: Martin KaFai Lau 
Acked-by: Brendan Jackman 
Link: https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com