linux-toradex.git/kernel/bpf, branch v5.12-rc7

bpf: program: Refuse non-O_RDWR flags in BPF_OBJ_GET

2021-04-01T21:33:48+00:00

As with bpf_link, refuse to create a non-O_RDWR fd. Since program fds
currently don't allow modifications, this is a precaution rather than a
straight-up bug fix.

Signed-off-by: Lorenz Bauer 
Signed-off-by: Alexei Starovoitov 
Acked-by: Andrii Nakryiko 
Acked-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20210326160501.46234-2-lmb@cloudflare.com

bpf: link: Refuse non-O_RDWR flags in BPF_OBJ_GET

2021-04-01T21:33:14+00:00

Invoking BPF_OBJ_GET on a pinned bpf_link checks the path access
permissions based on file_flags, but the returned fd ignores flags.
This means that any user can acquire a "read-write" fd for a pinned
link with mode 0664 by invoking BPF_OBJ_GET with BPF_F_RDONLY in
file_flags. The fd can be used to invoke BPF_LINK_DETACH, etc.

Fix this by refusing non-O_RDWR flags in BPF_OBJ_GET. This works
because OBJ_GET by default returns a read-write mapping and libbpf
doesn't expose a way to override this behaviour for programs
and links.
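
As a rough illustration of the check (a userspace sketch, not the kernel
code; the enum and the function name are hypothetical, and the handling
of maps is an assumption), programs and links must be opened O_RDWR,
and any other access mode is refused:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical stand-ins for the kinds of object that can be pinned. */
enum obj_type { OBJ_MAP, OBJ_PROG, OBJ_LINK };

/*
 * Decide whether an fd may be created for a pinned object.
 * f_flags is the open(2)-style access mode derived from file_flags.
 * Returns 0 if allowed, -EINVAL if the request must be refused.
 */
int check_obj_get_flags(enum obj_type type, int f_flags)
{
	/* This sketch only models the access mode bits. */
	assert((f_flags & ~O_ACCMODE) == 0);

	/* Assumption: maps can honour a restricted access mode. */
	if (type == OBJ_MAP)
		return 0;

	/* Program and link fds are always read-write, so refuse the rest. */
	return f_flags == O_RDWR ? 0 : -EINVAL;
}
```

With this shape, a BPF_F_RDONLY request on a pinned link fails up front
instead of yielding an fd that is more capable than the path check
implied.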

Fixes: 70ed506c3bbc ("bpf: Introduce pinnable bpf_link abstraction")
Signed-off-by: Lorenz Bauer 
Signed-off-by: Alexei Starovoitov 
Acked-by: Andrii Nakryiko 
Acked-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20210326160501.46234-1-lmb@cloudflare.com

bpf: Refcount task stack in bpf_get_task_stack

2021-04-01T20:58:07+00:00

On x86 the struct pt_regs * grabbed by task_pt_regs() points to an
offset of task->stack. The pt_regs are later dereferenced in
__bpf_get_stack (e.g. by user_mode() check). This can cause a fault if
the task in question exits while bpf_get_task_stack is executing, as
warned by task_stack_page's comment:

* When accessing the stack of a non-current task that might exit, use
* try_get_task_stack() instead.  task_stack_page will return a pointer
* that could get freed out from under you.

Take the comment's advice and use try_get_task_stack() and
put_task_stack() to hold the task->stack refcount, or bail early if it's
already 0. Incrementing stack_refcount ensures the task's stack
sticks around while we're using its data.
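
The get-unless-zero pattern can be sketched in userspace with C11
atomics. This is only a model under stated assumptions: struct
fake_task, the fake_* helpers and the -1 error value are hypothetical
and are not the kernel's try_get_task_stack()/put_task_stack().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task whose stack has its own refcount. */
struct fake_task {
	atomic_int stack_refcount;  /* 0 means the stack is going away */
	unsigned long stack_data;   /* stands in for data kept on the stack */
};

/* Take a reference unless the count has already dropped to zero. */
bool fake_try_get_stack(struct fake_task *t)
{
	int old = atomic_load(&t->stack_refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&t->stack_refcount,
						 &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by fake_try_get_stack(). */
void fake_put_stack(struct fake_task *t)
{
	int old = atomic_fetch_sub(&t->stack_refcount, 1);

	assert(old > 0); /* a put without a matching get is a caller bug */
}

/* Read stack data only while holding a reference; bail out otherwise. */
int read_stack_data(struct fake_task *t, unsigned long *out)
{
	if (!fake_try_get_stack(t))
		return -1; /* hypothetical error value: stack already gone */
	*out = t->stack_data;
	fake_put_stack(t);
	return 0;
}
```

The reader never touches stack_data unless the increment succeeded, so
a concurrent release cannot free the data in the middle of the read.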

I noticed this bug while testing a bpf task iter similar to
bpf_iter_task_stack in selftests, except that mine grabbed the user
stack. It crashed intermittently, producing dumps like:

  BUG: unable to handle page fault for address: 0000000000003fe0
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  RIP: 0010:__bpf_get_stack+0xd0/0x230
  
  Call Trace:
  bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec
  bpf_iter_run_prog+0x24/0x81
  __task_seq_show+0x58/0x80
  bpf_seq_read+0xf7/0x3d0
  vfs_read+0x91/0x140
  ksys_read+0x59/0xd0
  do_syscall_64+0x48/0x120
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()")
Signed-off-by: Dave Marchevsky 
Signed-off-by: Alexei Starovoitov 
Acked-by: Song Liu 
Link: https://lore.kernel.org/bpf/20210401000747.3648767-1-davemarchevsky@fb.com

bpf: Take module reference for trampoline in module

2021-03-27T02:30:11+00:00

Currently a module can be unloaded even if there's a trampoline
registered in it. This is easily reproduced by running in parallel:

  # while :; do ./test_progs -t module_attach; done
  # while :; do rmmod bpf_testmod; sleep 0.5; done

Take a module reference if the trampoline's ip is within the module
code, and release it when the trampoline's ip is unregistered.
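
A simplified userspace model of this idea follows. It is a sketch only:
struct fake_module and all function names are hypothetical, and a plain
integer stands in for the kernel's module reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a loaded module: a code range plus a refcount. */
struct fake_module {
	unsigned long base;  /* start address of the module's code */
	unsigned long size;  /* length of the code range in bytes */
	int refcnt;          /* references that block unloading */
};

/* True if ip falls inside the module's code range. */
bool ip_in_module(const struct fake_module *mod, unsigned long ip)
{
	return ip >= mod->base && ip - mod->base < mod->size;
}

/*
 * Register a trampoline at ip. If ip belongs to mod, pin the module and
 * return it as the owner; otherwise return NULL and take no reference.
 */
struct fake_module *tramp_register(struct fake_module *mod, unsigned long ip)
{
	if (mod == NULL || !ip_in_module(mod, ip))
		return NULL;
	mod->refcnt++;
	return mod;
}

/* Unregister the trampoline and drop the reference taken above. */
void tramp_unregister(struct fake_module *owner)
{
	if (owner == NULL)
		return;
	assert(owner->refcnt > 0);
	owner->refcnt--;
}

/* Unloading is only allowed once nothing holds a reference. */
bool module_can_unload(const struct fake_module *mod)
{
	return mod->refcnt == 0;
}
```

In this model an rmmod racing with an attached trampoline is refused
until the trampoline has been unregistered.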

Signed-off-by: Jiri Olsa 
Signed-off-by: Alexei Starovoitov 
Link: https://lore.kernel.org/bpf/20210326105900.151466-1-jolsa@kernel.org

bpf: Fix a spelling typo in bpf_atomic_alu_string disasm

2021-03-26T16:56:48+00:00

The name string for BPF_XOR is "xor", not "or". Fix it.
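
For illustration, a minimal sketch of a mnemonic table of this kind,
assuming it is indexed by the upper four bits of the opcode. The table
and function names below are illustrative rather than the kernel's; the
opcode values are the ones from the BPF UAPI header.

```c
#include <assert.h>
#include <stddef.h>

/* ALU opcode values as defined in the BPF UAPI header. */
#define BPF_ADD 0x00
#define BPF_OR  0x40
#define BPF_AND 0x50
#define BPF_XOR 0xa0

/* One slot per possible value of the opcode's upper nibble. */
static const char *const atomic_alu_name[16] = {
	[BPF_ADD >> 4] = "add",
	[BPF_AND >> 4] = "and",
	[BPF_OR  >> 4] = "or",
	[BPF_XOR >> 4] = "xor",
};

/* Return the mnemonic for an ALU opcode, or NULL if it has none. */
const char *atomic_alu_string(unsigned int op)
{
	/* The caller is expected to pass a bare operation code. */
	assert((op & ~0xf0u) == 0);
	return atomic_alu_name[op >> 4];
}
```

A copy-and-paste slip in one initializer of such a table is enough to
make two different instructions disassemble to the same name.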

Fixes: 981f94c3e921 ("bpf: Add bitwise atomic instructions")
Signed-off-by: Xu Kuohai 
Signed-off-by: Daniel Borkmann 
Acked-by: Brendan Jackman 
Link: https://lore.kernel.org/bpf/20210325134141.8533-1-xukuohai@huawei.com

bpf: Enforce that struct_ops programs be GPL-only

2021-03-26T16:50:39+00:00

With the introduction of the struct_ops program type, it became possible to
implement kernel functionality in BPF, making it viable to use BPF in place
of a regular kernel module for these particular operations.

Thus far, the only user of this mechanism is for implementing TCP
congestion control algorithms. These are clearly marked as GPL-only when
implemented as modules (as seen by the use of EXPORT_SYMBOL_GPL for
tcp_register_congestion_control()), so it seems like an oversight that this
was not carried over to BPF implementations. Since this is the only user
of the struct_ops mechanism, just enforcing GPL-only for the struct_ops
program type seems like the simplest way to fix this.

Fixes: 0baf26b0fcd7 ("bpf: tcp: Support tcp_congestion_ops in bpf")
Signed-off-by: Toke Høiland-Jørgensen 
Signed-off-by: Daniel Borkmann 
Acked-by: Martin KaFai Lau 
Link: https://lore.kernel.org/bpf/20210326100314.121853-1-toke@redhat.com

bpf: Fix umd memory leak in copy_process()

2021-03-19T21:23:19+00:00

syzbot reported a memory leak as follows:

BUG: memory leak
unreferenced object 0xffff888101b41d00 (size 120):
  comm "kworker/u4:0", pid 8, jiffies 4294944270 (age 12.780s)
  backtrace:
    alloc_pid+0x66/0x560
    copy_process+0x1465/0x25e0
    kernel_clone+0xf3/0x670
    kernel_thread+0x61/0x80
    call_usermodehelper_exec_work
    call_usermodehelper_exec_work+0xc4/0x120
    process_one_work+0x2c9/0x600
    worker_thread+0x59/0x5d0
    kthread+0x178/0x1b0
    ret_from_fork+0x1f/0x30

unreferenced object 0xffff888110ef5c00 (size 232):
  comm "kworker/u4:0", pid 8414, jiffies 4294944270 (age 12.780s)
  backtrace:
    kmem_cache_zalloc
    __alloc_file+0x1f/0xf0
    alloc_empty_file+0x69/0x120
    alloc_file+0x33/0x1b0
    alloc_file_pseudo+0xb2/0x140
    create_pipe_files+0x138/0x2e0
    umd_setup+0x33/0x220
    call_usermodehelper_exec_async+0xb4/0x1b0
    ret_from_fork+0x1f/0x30

After the UMD process exits, the pipe_to_umh/pipe_from_umh and
tgid need to be released.

Fixes: d71fa5c9763c ("bpf: Add kernel module with user mode driver that populates bpffs.")
Reported-by: syzbot+44908bb56d2bfe56b28e@syzkaller.appspotmail.com
Signed-off-by: Zqiang 
Signed-off-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20210317030915.2865-1-qiang.zhang@windriver.com

bpf: Fix fexit trampoline.

2021-03-17T23:22:51+00:00

The fexit/fmod_ret programs can be attached to kernel functions that can
sleep. synchronize_rcu_tasks() will not wait for such tasks to complete.
In such a case the trampoline image will be freed, and when the task
wakes up the return IP will point to freed memory, causing a crash.

Solve this by adding percpu_ref_get/put for the duration of the
trampoline and by separating the lifetime of the trampoline from that of
its image.

The "half page" optimization has to be removed, since the
first_half->second_half->first_half transition cannot be guaranteed to
complete in deterministic time. Every trampoline update becomes a new
image.

An image with fmod_ret or fexit progs will be freed via percpu_ref_kill
and call_rcu_tasks. Together they will wait for the original function and
trampoline asm to complete. The trampoline is patched from nop to jmp to
skip fexit progs. They are freed independently from the trampoline. An
image with fentry progs only will be freed via
call_rcu_tasks_trace+call_rcu_tasks, which will wait for both sleepable
and non-sleepable progs to complete.

Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline")
Reported-by: Andrii Nakryiko 
Signed-off-by: Alexei Starovoitov 
Signed-off-by: Daniel Borkmann 
Acked-by: Paul E. McKenney   # for RCU
Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.com

bpf: Add sanity check for upper ptr_limit

2021-03-17T20:57:39+00:00

Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.

Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov

bpf: Simplify alu_limit masking for pointer arithmetic

2021-03-17T18:13:22+00:00

Instead of emitting the mov32 with an aux->alu_limit - 1 immediate, move
the subtraction into retrieve_ptr_limit() to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids a future situation in which the verifier's masking rewrite
runs into an underflow that would not be sign-extended due to the nature
of the mov32 instruction.
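
The difference can be shown with a small userspace model of the two
immediate moves. The helper names are hypothetical; they only mirror
the rule that a 32-bit move zeroes the upper half of the destination
while a 64-bit move sign-extends its 32-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit immediate move: upper 32 bits of the result are 0. */
uint64_t mov32_imm(int32_t imm)
{
	return (uint64_t)(uint32_t)imm;
}

/* Model of a 64-bit immediate move: the immediate is sign-extended. */
uint64_t mov64_imm(int32_t imm)
{
	return (uint64_t)(int64_t)imm;
}

/* Self-check: "alu_limit - 1" with alu_limit == 0 underflows to -1. */
void check_underflow_is_not_sign_extended(void)
{
	int32_t alu_limit = 0;
	int32_t imm = alu_limit - 1;

	/* The 32-bit move yields a large positive value, not all ones. */
	assert(mov32_imm(imm) == UINT64_C(0x00000000ffffffff));
	/* Only the 64-bit move would produce the sign-extended value. */
	assert(mov64_imm(imm) == UINT64_MAX);
}
```

Doing the subtraction where the limit is computed means a bounds check
can catch the underflow before any immediate is emitted.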

Signed-off-by: Piotr Krysiuk 
Co-developed-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov