From 6715df8d5d24655b9fd368e904028112b54c7de1 Mon Sep 17 00:00:00 2001 From: Eduard Zingerman Date: Sun, 19 Feb 2023 22:04:26 +0200 Subject: bpf: Allow reads from uninit stack This commits updates the following functions to allow reads from uninitialized stack locations when env->allow_uninit_stack option is enabled: - check_stack_read_fixed_off() - check_stack_range_initialized(), called from: - check_stack_read_var_off() - check_helper_mem_access() Such change allows to relax logic in stacksafe() to treat STACK_MISC and STACK_INVALID in a same way and make the following stack slot configurations equivalent: | Cached state | Current state | | stack slot | stack slot | |------------------+------------------| | STACK_INVALID or | STACK_INVALID or | | STACK_MISC | STACK_SPILL or | | | STACK_MISC or | | | STACK_ZERO or | | | STACK_DYNPTR | This leads to significant verification speed gains (see below). The idea was suggested by Andrii Nakryiko [1] and initial patch was created by Alexei Starovoitov [2]. Currently the env->allow_uninit_stack is allowed for programs loaded by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities. A number of test cases from verifier/*.c were expecting uninitialized stack access to be an error. These test cases were updated to execute in unprivileged mode (thus preserving the tests). The test progs/test_global_func10.c expected "invalid indirect read from stack" error message because of the access to uninitialized memory region. This error is no longer possible in privileged mode. The test is updated to provoke an error "invalid indirect access to stack" because of access to invalid stack address (such error is not verified by progs/test_global_func*.c series of tests). The following tests had to be removed because these can't be made unprivileged: - verifier/sock.c: - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init stack_value" BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode. - verifier/var_off.c: - "indirect variable-offset stack access, max_off+size > max_initialized" - "indirect variable-offset stack access, uninitialized" These tests verify that access to uninitialized stack values is detected when stack offset is not a constant. However, variable stack access is prohibited in unprivileged mode, thus these tests are no longer valid. * * * Here is veristat log comparing this patch with current master on a set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg and cilium BPF binaries (see [3]): $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log File Program States (A) States (B) States (DIFF) -------------------------- -------------------------- ---------- ---------- ---------------- bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%) bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%) bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%) bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%) bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%) bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%) bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%) bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%) pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%) strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%) strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%) xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%) xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%) -------------------------- -------------------------- ---------- ---------- ---------------- Note: I limited selection by states_pct<-30%. Inspection of differences in pyperf600_bpf_loop behavior shows that the following patch for the test removes almost all differences: - a/tools/testing/selftests/bpf/progs/pyperf.h + b/tools/testing/selftests/bpf/progs/pyperf.h @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx) } if (event->pthread_match || !pidData->use_tls) { - void* frame_ptr; - FrameData frame; + void* frame_ptr = 0; + FrameData frame = {}; Symbol sym = {}; int cur_cpu = bpf_get_smp_processor_id(); W/o this patch the difference comes from the following pattern (for different variables): static bool get_frame_data(... FrameData *frame ...) { ... bpf_probe_read_user(&frame->f_code, ...); if (!frame->f_code) return false; ... bpf_probe_read_user(&frame->co_name, ...); if (frame->co_name) ...; } int __on_event(struct bpf_raw_tracepoint_args *ctx) { FrameData frame; ... get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback ... } SEC("raw_tracepoint/kfree_skb") int on_event(struct bpf_raw_tracepoint_args* ctx) { ... ret |= __on_event(ctx); ret |= __on_event(ctx); ... } With regards to value `frame->co_name` the following is important: - Because of the conditional `if (!frame->f_code)` each call to __on_event() produces two states, one with `frame->co_name` marked as STACK_MISC, another with it as is (and marked STACK_INVALID on a first call). - The call to bpf_probe_read_user() does not mark stack slots corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks these slots as BPF_MISC, this happens because of the following loop in the check_helper_call(): for (i = 0; i < meta.access_size; i++) { err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B, BPF_WRITE, -1, false); if (err) return err; } Note the size of the write, it is a one byte write for each byte touched by a helper. The BPF_B write does not lead to write marks for the target stack slot. - Which means that w/o this patch when second __on_event() call is verified `if (frame->co_name)` will propagate read marks first to a stack slot with STACK_MISC marks and second to a stack slot with STACK_INVALID marks and these states would be considered different. [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/ [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/ [3] git@github.com:anakryiko/cilium.git Suggested-by: Andrii Nakryiko Co-developed-by: Alexei Starovoitov Signed-off-by: Eduard Zingerman Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov --- .../selftests/bpf/progs/test_global_func10.c | 8 +- tools/testing/selftests/bpf/verifier/calls.c | 13 ++- .../selftests/bpf/verifier/helper_access_var_len.c | 104 ++++++++++++++------- tools/testing/selftests/bpf/verifier/int_ptr.c | 9 +- .../selftests/bpf/verifier/search_pruning.c | 13 ++- tools/testing/selftests/bpf/verifier/sock.c | 27 ------ tools/testing/selftests/bpf/verifier/spill_fill.c | 7 +- tools/testing/selftests/bpf/verifier/var_off.c | 52 ----------- 8 files changed, 98 insertions(+), 135 deletions(-) (limited to 'tools/testing') diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c b/tools/testing/selftests/bpf/progs/test_global_func10.c index 98327bdbbfd2..8fba3f3649e2 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func10.c +++ b/tools/testing/selftests/bpf/progs/test_global_func10.c @@ -5,12 +5,12 @@ #include "bpf_misc.h" struct Small { - int x; + long x; }; struct Big { - int x; - int y; + long x; + long y; }; __noinline int foo(const struct Big *big) @@ -22,7 +22,7 @@ __noinline int foo(const struct Big *big) } SEC("cgroup_skb/ingress") -__failure __msg("invalid indirect read from stack") +__failure __msg("invalid indirect access to stack") int global_func10(struct __sk_buff *skb) { const struct Small small = {.x = skb->len }; diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c index 9d993926bf0e..289ed202ec66 100644 --- a/tools/testing/selftests/bpf/verifier/calls.c +++ b/tools/testing/selftests/bpf/verifier/calls.c @@ -2221,19 +2221,22 @@ * that fp-8 stack slot was unused in the fall-through * branch and will accept the program incorrectly */ - BPF_JMP_IMM(BPF_JGT, BPF_REG_1, 2, 2), + BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), + BPF_JMP_IMM(BPF_JGT, BPF_REG_0, 2, 2), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_JMP_IMM(BPF_JA, 0, 0, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), + BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_48b = { 6 }, - .errstr = "invalid indirect read from stack R2 off -8+0 size 8", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_XDP, + .fixup_map_hash_48b = { 7 }, + .errstr_unpriv = "invalid indirect read from stack R2 off -8+0 size 8", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "calls: ctx read at start of subprog", diff --git a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c index a6c869a7319c..9c4885885aba 100644 --- a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c +++ b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c @@ -29,19 +29,30 @@ { "helper access to variable memory: stack, bitwise AND, zero included", .insns = { - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8), - BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), - BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64), - BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128), - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128), - BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 64), - BPF_MOV64_IMM(BPF_REG_3, 0), - BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel), + /* set max stack size */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0), + /* set r3 to a random value */ + BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), + BPF_MOV64_REG(BPF_REG_3, BPF_REG_0), + /* use bitwise AND to limit r3 range to [0, 64] */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 64), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64), + BPF_MOV64_IMM(BPF_REG_4, 0), + /* Call bpf_ringbuf_output(), it is one of a few helper functions with + * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode. + * For unpriv this should signal an error, because memory at &fp[-64] is + * not initialized. + */ + BPF_EMIT_CALL(BPF_FUNC_ringbuf_output), BPF_EXIT_INSN(), }, - .errstr = "invalid indirect read from stack R1 off -64+0 size 64", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_TRACEPOINT, + .fixup_map_ringbuf = { 4 }, + .errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "helper access to variable memory: stack, bitwise AND + JMP, wrong max", @@ -183,20 +194,31 @@ { "helper access to variable memory: stack, JMP, no min check", .insns = { - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8), - BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), - BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64), - BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128), - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128), - BPF_JMP_IMM(BPF_JGT, BPF_REG_2, 64, 3), - BPF_MOV64_IMM(BPF_REG_3, 0), - BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel), + /* set max stack size */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0), + /* set r3 to a random value */ + BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), + BPF_MOV64_REG(BPF_REG_3, BPF_REG_0), + /* use JMP to limit r3 range to [0, 64] */ + BPF_JMP_IMM(BPF_JGT, BPF_REG_3, 64, 6), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64), + BPF_MOV64_IMM(BPF_REG_4, 0), + /* Call bpf_ringbuf_output(), it is one of a few helper functions with + * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode. + * For unpriv this should signal an error, because memory at &fp[-64] is + * not initialized. + */ + BPF_EMIT_CALL(BPF_FUNC_ringbuf_output), BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr = "invalid indirect read from stack R1 off -64+0 size 64", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_TRACEPOINT, + .fixup_map_ringbuf = { 4 }, + .errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "helper access to variable memory: stack, JMP (signed), no min check", @@ -564,29 +586,41 @@ { "helper access to variable memory: 8 bytes leak", .insns = { - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8), - BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), - BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64), + /* set max stack size */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0), + /* set r3 to a random value */ + BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), + BPF_MOV64_REG(BPF_REG_3, BPF_REG_0), + BPF_LD_MAP_FD(BPF_REG_1, 0), + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64), BPF_MOV64_IMM(BPF_REG_0, 0), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -64), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -56), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -48), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -40), + /* Note: fp[-32] left uninitialized */ BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -24), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8), - BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_2, -128), - BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_10, -128), - BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 63), - BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 1), - BPF_MOV64_IMM(BPF_REG_3, 0), - BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel), - BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), + /* Limit r3 range to [1, 64] */ + BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 63), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 1), + BPF_MOV64_IMM(BPF_REG_4, 0), + /* Call bpf_ringbuf_output(), it is one of a few helper functions with + * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode. + * For unpriv this should signal an error, because memory region [1, 64] + * at &fp[-64] is not fully initialized. + */ + BPF_EMIT_CALL(BPF_FUNC_ringbuf_output), + BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr = "invalid indirect read from stack R1 off -64+32 size 64", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_TRACEPOINT, + .fixup_map_ringbuf = { 3 }, + .errstr_unpriv = "invalid indirect read from stack R2 off -64+32 size 64", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "helper access to variable memory: 8 bytes no leak (init memory)", diff --git a/tools/testing/selftests/bpf/verifier/int_ptr.c b/tools/testing/selftests/bpf/verifier/int_ptr.c index 070893fb2900..02d9e004260b 100644 --- a/tools/testing/selftests/bpf/verifier/int_ptr.c +++ b/tools/testing/selftests/bpf/verifier/int_ptr.c @@ -54,12 +54,13 @@ /* bpf_strtoul() */ BPF_EMIT_CALL(BPF_FUNC_strtoul), - BPF_MOV64_IMM(BPF_REG_0, 1), + BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .result = REJECT, - .prog_type = BPF_PROG_TYPE_CGROUP_SYSCTL, - .errstr = "invalid indirect read from stack R4 off -16+4 size 8", + .result_unpriv = REJECT, + .errstr_unpriv = "invalid indirect read from stack R4 off -16+4 size 8", + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "ARG_PTR_TO_LONG misaligned", diff --git a/tools/testing/selftests/bpf/verifier/search_pruning.c b/tools/testing/selftests/bpf/verifier/search_pruning.c index d63fd8991b03..745d6b5842fd 100644 --- a/tools/testing/selftests/bpf/verifier/search_pruning.c +++ b/tools/testing/selftests/bpf/verifier/search_pruning.c @@ -128,9 +128,10 @@ BPF_EXIT_INSN(), }, .fixup_map_hash_8b = { 3 }, - .errstr = "invalid read from stack off -16+0 size 8", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_TRACEPOINT, + .errstr_unpriv = "invalid read from stack off -16+0 size 8", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "precision tracking for u32 spill/fill", @@ -258,6 +259,8 @@ BPF_EXIT_INSN(), }, .flags = BPF_F_TEST_STATE_FREQ, - .errstr = "invalid read from stack off -8+1 size 8", - .result = REJECT, + .errstr_unpriv = "invalid read from stack off -8+1 size 8", + .result_unpriv = REJECT, + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c index d11d0b28be41..108dd3ee1edd 100644 --- a/tools/testing/selftests/bpf/verifier/sock.c +++ b/tools/testing/selftests/bpf/verifier/sock.c @@ -530,33 +530,6 @@ .prog_type = BPF_PROG_TYPE_SCHED_CLS, .result = ACCEPT, }, -{ - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init stack_value", - .insns = { - BPF_MOV64_IMM(BPF_REG_2, 0), - BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8), - BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)), - BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2), - BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_EXIT_INSN(), - BPF_EMIT_CALL(BPF_FUNC_sk_fullsock), - BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2), - BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_EXIT_INSN(), - BPF_MOV64_IMM(BPF_REG_4, 1), - BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), - BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -8), - BPF_MOV64_REG(BPF_REG_2, BPF_REG_0), - BPF_LD_MAP_FD(BPF_REG_1, 0), - BPF_EMIT_CALL(BPF_FUNC_sk_storage_get), - BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_EXIT_INSN(), - }, - .fixup_sk_storage_map = { 14 }, - .prog_type = BPF_PROG_TYPE_SCHED_CLS, - .result = REJECT, - .errstr = "invalid indirect read from stack", -}, { "bpf_map_lookup_elem(smap, &key)", .insns = { diff --git a/tools/testing/selftests/bpf/verifier/spill_fill.c b/tools/testing/selftests/bpf/verifier/spill_fill.c index 9bb302dade23..d1463bf4949a 100644 --- a/tools/testing/selftests/bpf/verifier/spill_fill.c +++ b/tools/testing/selftests/bpf/verifier/spill_fill.c @@ -171,9 +171,10 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .result = REJECT, - .errstr = "invalid read from stack off -4+0 size 4", - .prog_type = BPF_PROG_TYPE_SCHED_CLS, + .result_unpriv = REJECT, + .errstr_unpriv = "invalid read from stack off -4+0 size 4", + /* in privileged mode reads from uninitialized stack locations are permitted */ + .result = ACCEPT, }, { "Spill a u32 const scalar. Refill as u16. Offset to skb->data", diff --git a/tools/testing/selftests/bpf/verifier/var_off.c b/tools/testing/selftests/bpf/verifier/var_off.c index d37f512fad16..b183e26c03f1 100644 --- a/tools/testing/selftests/bpf/verifier/var_off.c +++ b/tools/testing/selftests/bpf/verifier/var_off.c @@ -212,31 +212,6 @@ .result = REJECT, .prog_type = BPF_PROG_TYPE_LWT_IN, }, -{ - "indirect variable-offset stack access, max_off+size > max_initialized", - .insns = { - /* Fill only the second from top 8 bytes of the stack. */ - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), - /* Get an unknown value. */ - BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0), - /* Make it small and 4-byte aligned. */ - BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4), - BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16), - /* Add it to fp. We now have either fp-12 or fp-16, but we don't know - * which. fp-12 size 8 is partially uninitialized stack. - */ - BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10), - /* Dereference it indirectly. */ - BPF_LD_MAP_FD(BPF_REG_1, 0), - BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), - BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_EXIT_INSN(), - }, - .fixup_map_hash_8b = { 5 }, - .errstr = "invalid indirect read from stack R2 var_off", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_LWT_IN, -}, { "indirect variable-offset stack access, min_off < min_initialized", .insns = { @@ -289,33 +264,6 @@ .result = ACCEPT, .prog_type = BPF_PROG_TYPE_CGROUP_SKB, }, -{ - "indirect variable-offset stack access, uninitialized", - .insns = { - BPF_MOV64_IMM(BPF_REG_2, 6), - BPF_MOV64_IMM(BPF_REG_3, 28), - /* Fill the top 16 bytes of the stack. */ - BPF_ST_MEM(BPF_W, BPF_REG_10, -16, 0), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), - /* Get an unknown value. */ - BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1, 0), - /* Make it small and 4-byte aligned. */ - BPF_ALU64_IMM(BPF_AND, BPF_REG_4, 4), - BPF_ALU64_IMM(BPF_SUB, BPF_REG_4, 16), - /* Add it to fp. We now have either fp-12 or fp-16, we don't know - * which, but either way it points to initialized stack. - */ - BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_10), - BPF_MOV64_IMM(BPF_REG_5, 8), - /* Dereference it indirectly. */ - BPF_EMIT_CALL(BPF_FUNC_getsockopt), - BPF_MOV64_IMM(BPF_REG_0, 0), - BPF_EXIT_INSN(), - }, - .errstr = "invalid indirect read from stack R4 var_off", - .result = REJECT, - .prog_type = BPF_PROG_TYPE_SOCK_OPS, -}, { "indirect variable-offset stack access, ok", .insns = { -- cgit v1.2.3 From 6338a94d5ab42a94e96ea36edc5f7df1fe73e68e Mon Sep 17 00:00:00 2001 From: Eduard Zingerman Date: Sun, 19 Feb 2023 22:04:27 +0200 Subject: selftests/bpf: Tests for uninitialized stack reads Three testcases to make sure that stack reads from uninitialized locations are accepted by verifier when executed in privileged mode: - read from a fixed offset; - read from a variable offset; - passing a pointer to stack to a helper converts STACK_INVALID to STACK_MISC. Signed-off-by: Eduard Zingerman Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/r/20230219200427.606541-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov --- .../selftests/bpf/prog_tests/uninit_stack.c | 9 +++ tools/testing/selftests/bpf/progs/uninit_stack.c | 87 ++++++++++++++++++++++ 2 files changed, 96 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/uninit_stack.c create mode 100644 tools/testing/selftests/bpf/progs/uninit_stack.c (limited to 'tools/testing') diff --git a/tools/testing/selftests/bpf/prog_tests/uninit_stack.c b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c new file mode 100644 index 000000000000..e64c71948491 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/uninit_stack.c @@ -0,0 +1,9 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "uninit_stack.skel.h" + +void test_uninit_stack(void) +{ + RUN_TESTS(uninit_stack); +} diff --git a/tools/testing/selftests/bpf/progs/uninit_stack.c b/tools/testing/selftests/bpf/progs/uninit_stack.c new file mode 100644 index 000000000000..8a403470e557 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/uninit_stack.c @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include "bpf_misc.h" + +/* Read an uninitialized value from stack at a fixed offset */ +SEC("socket") +__naked int read_uninit_stack_fixed_off(void *ctx) +{ + asm volatile (" \ + r0 = 0; \ + /* force stack depth to be 128 */ \ + *(u64*)(r10 - 128) = r1; \ + r1 = *(u8 *)(r10 - 8 ); \ + r0 += r1; \ + r1 = *(u8 *)(r10 - 11); \ + r1 = *(u8 *)(r10 - 13); \ + r1 = *(u8 *)(r10 - 15); \ + r1 = *(u16*)(r10 - 16); \ + r1 = *(u32*)(r10 - 32); \ + r1 = *(u64*)(r10 - 64); \ + /* read from a spill of a wrong size, it is a separate \ + * branch in check_stack_read_fixed_off() \ + */ \ + *(u32*)(r10 - 72) = r1; \ + r1 = *(u64*)(r10 - 72); \ + r0 = 0; \ + exit; \ +" + ::: __clobber_all); +} + +/* Read an uninitialized value from stack at a variable offset */ +SEC("socket") +__naked int read_uninit_stack_var_off(void *ctx) +{ + asm volatile (" \ + call %[bpf_get_prandom_u32]; \ + /* force stack depth to be 64 */ \ + *(u64*)(r10 - 64) = r0; \ + r0 = -r0; \ + /* give r0 a range [-31, -1] */ \ + if r0 s<= -32 goto exit_%=; \ + if r0 s>= 0 goto exit_%=; \ + /* access stack using r0 */ \ + r1 = r10; \ + r1 += r0; \ + r2 = *(u8*)(r1 + 0); \ +exit_%=: r0 = 0; \ + exit; \ +" + : + : __imm(bpf_get_prandom_u32) + : __clobber_all); +} + +static __noinline void dummy(void) {} + +/* Pass a pointer to uninitialized stack memory to a helper. + * Passed memory block should be marked as STACK_MISC after helper call. + */ +SEC("socket") +__log_level(7) __msg("fp-104=mmmmmmmm") +__naked int helper_uninit_to_misc(void *ctx) +{ + asm volatile (" \ + /* force stack depth to be 128 */ \ + *(u64*)(r10 - 128) = r1; \ + r1 = r10; \ + r1 += -128; \ + r2 = 32; \ + call %[bpf_trace_printk]; \ + /* Call to dummy() forces print_verifier_state(..., true), \ + * thus showing the stack state, matched by __msg(). \ + */ \ + call %[dummy]; \ + r0 = 0; \ + exit; \ +" + : + : __imm(bpf_trace_printk), + __imm(dummy) + : __clobber_all); +} + +char _license[] SEC("license") = "GPL"; -- cgit v1.2.3 From 32513d40d908b267508d37994753d9bd1600914b Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Fri, 10 Mar 2023 12:41:18 -0800 Subject: selftests/bpf: Fix progs/find_vma_fail1.c build error. The commit 11e456cae91e ("selftests/bpf: Fix compilation errors: Assign a value to a constant") fixed the issue cleanly in bpf-next. This is an alternative fix in bpf tree to avoid merge conflict between bpf and bpf-next. Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/progs/find_vma_fail1.c | 1 + 1 file changed, 1 insertion(+) (limited to 'tools/testing') diff --git a/tools/testing/selftests/bpf/progs/find_vma_fail1.c b/tools/testing/selftests/bpf/progs/find_vma_fail1.c index b3b326b8e2d1..6dab9cffda13 100644 --- a/tools/testing/selftests/bpf/progs/find_vma_fail1.c +++ b/tools/testing/selftests/bpf/progs/find_vma_fail1.c @@ -2,6 +2,7 @@ /* Copyright (c) 2021 Facebook */ #include "vmlinux.h" #include +#define vm_flags vm_start char _license[] SEC("license") = "GPL"; -- cgit v1.2.3 From e8c8361cfdbf450f760e8a2bdbd4222d1947366b Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Fri, 10 Mar 2023 12:47:51 -0800 Subject: selftests/bpf: Fix progs/test_deny_namespace.c issues. The following build error can be seen: progs/test_deny_namespace.c:22:19: error: call to undeclared function 'BIT_LL'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] __u64 cap_mask = BIT_LL(CAP_SYS_ADMIN); The struct kernel_cap_struct no longer exists in the kernel as well. Adjust bpf prog to fix both issues. Fixes: f122a08b197d ("capability: just use a 'u64' instead of a 'u32[2]' array") Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/progs/test_deny_namespace.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) (limited to 'tools/testing') diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c index 591104e79812..e96b901a733c 100644 --- a/tools/testing/selftests/bpf/progs/test_deny_namespace.c +++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c @@ -5,12 +5,10 @@ #include #include -struct kernel_cap_struct { - __u64 val; -} __attribute__((preserve_access_index)); +typedef struct { unsigned long long val; } kernel_cap_t; struct cred { - struct kernel_cap_struct cap_effective; + kernel_cap_t cap_effective; } __attribute__((preserve_access_index)); char _license[] SEC("license") = "GPL"; @@ -18,8 +16,8 @@ char _license[] SEC("license") = "GPL"; SEC("lsm.s/userns_create") int BPF_PROG(test_userns_create, const struct cred *cred, int ret) { - struct kernel_cap_struct caps = cred->cap_effective; - __u64 cap_mask = BIT_LL(CAP_SYS_ADMIN); + kernel_cap_t caps = cred->cap_effective; + __u64 cap_mask = 1ULL << CAP_SYS_ADMIN; if (ret) return 0; -- cgit v1.2.3 From 05107edc910135d27fe557267dc45be9630bf3dd Mon Sep 17 00:00:00 2001 From: Nick Desaulniers Date: Wed, 8 Mar 2023 11:59:33 -0800 Subject: selftests: sigaltstack: fix -Wuninitialized Building sigaltstack with clang via: $ ARCH=x86 make LLVM=1 -C tools/testing/selftests/sigaltstack/ produces the following warning: warning: variable 'sp' is uninitialized when used here [-Wuninitialized] if (sp < (unsigned long)sstack || ^~ Clang expects these to be declared at global scope; we've fixed this in the kernel proper by using the macro `current_stack_pointer`. This is defined in different headers for different target architectures, so just create a new header that defines the arch-specific register names for the stack pointer register, and define it for more targets (at least the ones that support current_stack_pointer/ARCH_HAS_CURRENT_STACK_POINTER). Reported-by: Linux Kernel Functional Testing Link: https://lore.kernel.org/lkml/CA+G9fYsi3OOu7yCsMutpzKDnBMAzJBCPimBp86LhGBa0eCnEpA@mail.gmail.com/ Signed-off-by: Nick Desaulniers Reviewed-by: Kees Cook Tested-by: Linux Kernel Functional Testing Tested-by: Anders Roxell Signed-off-by: Shuah Khan --- .../selftests/sigaltstack/current_stack_pointer.h | 23 ++++++++++++++++++++++ tools/testing/selftests/sigaltstack/sas.c | 7 +------ 2 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 tools/testing/selftests/sigaltstack/current_stack_pointer.h (limited to 'tools/testing') diff --git a/tools/testing/selftests/sigaltstack/current_stack_pointer.h b/tools/testing/selftests/sigaltstack/current_stack_pointer.h new file mode 100644 index 000000000000..ea9bdf3a90b1 --- /dev/null +++ b/tools/testing/selftests/sigaltstack/current_stack_pointer.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#if __alpha__ +register unsigned long sp asm("$30"); +#elif __arm__ || __aarch64__ || __csky__ || __m68k__ || __mips__ || __riscv +register unsigned long sp asm("sp"); +#elif __i386__ +register unsigned long sp asm("esp"); +#elif __loongarch64 +register unsigned long sp asm("$sp"); +#elif __ppc__ +register unsigned long sp asm("r1"); +#elif __s390x__ +register unsigned long sp asm("%15"); +#elif __sh__ +register unsigned long sp asm("r15"); +#elif __x86_64__ +register unsigned long sp asm("rsp"); +#elif __XTENSA__ +register unsigned long sp asm("a1"); +#else +#error "implement current_stack_pointer equivalent" +#endif diff --git a/tools/testing/selftests/sigaltstack/sas.c b/tools/testing/selftests/sigaltstack/sas.c index c53b070755b6..98d37cb744fb 100644 --- a/tools/testing/selftests/sigaltstack/sas.c +++ b/tools/testing/selftests/sigaltstack/sas.c @@ -20,6 +20,7 @@ #include #include "../kselftest.h" +#include "current_stack_pointer.h" #ifndef SS_AUTODISARM #define SS_AUTODISARM (1U << 31) @@ -46,12 +47,6 @@ void my_usr1(int sig, siginfo_t *si, void *u) stack_t stk; struct stk_data *p; -#if __s390x__ - register unsigned long sp asm("%15"); -#else - register unsigned long sp asm("sp"); -#endif - if (sp < (unsigned long)sstack || sp >= (unsigned long)sstack + stack_size) { ksft_exit_fail_msg("SP is not on sigaltstack\n"); -- cgit v1.2.3 From 62faca1ca10cc84e99ae7f38aa28df2bc945369b Mon Sep 17 00:00:00 2001 From: "Chang S. Bae" Date: Mon, 27 Feb 2023 13:05:04 -0800 Subject: selftests/x86/amx: Add a ptrace test Include a test case to validate the XTILEDATA injection to the target. Also, it ensures the kernel's ability to copy states between different XSAVE formats. Refactor the memcmp() code to be usable for the state validation. Signed-off-by: Chang S. Bae Signed-off-by: Dave Hansen Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230227210504.18520-3-chang.seok.bae%40intel.com --- tools/testing/selftests/x86/amx.c | 108 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 105 insertions(+), 3 deletions(-) (limited to 'tools/testing') diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c index 625e42901237..d884fd69dd51 100644 --- a/tools/testing/selftests/x86/amx.c +++ b/tools/testing/selftests/x86/amx.c @@ -14,8 +14,10 @@ #include #include #include +#include #include #include +#include #include "../kselftest.h" /* For __cpuid_count() */ @@ -583,6 +585,13 @@ static void test_dynamic_state(void) _exit(0); } +static inline int __compare_tiledata_state(struct xsave_buffer *xbuf1, struct xsave_buffer *xbuf2) +{ + return memcmp(&xbuf1->bytes[xtiledata.xbuf_offset], + &xbuf2->bytes[xtiledata.xbuf_offset], + xtiledata.size); +} + /* * Save current register state and compare it to @xbuf1.' * @@ -599,9 +608,7 @@ static inline bool __validate_tiledata_regs(struct xsave_buffer *xbuf1) fatal_error("failed to allocate XSAVE buffer\n"); xsave(xbuf2, XFEATURE_MASK_XTILEDATA); - ret = memcmp(&xbuf1->bytes[xtiledata.xbuf_offset], - &xbuf2->bytes[xtiledata.xbuf_offset], - xtiledata.size); + ret = __compare_tiledata_state(xbuf1, xbuf2); free(xbuf2); @@ -826,6 +833,99 @@ static void test_context_switch(void) free(finfo); } +/* Ptrace test */ + +/* + * Make sure the ptracee has the expanded kernel buffer on the first + * use. Then, initialize the state before performing the state + * injection from the ptracer. + */ +static inline void ptracee_firstuse_tiledata(void) +{ + load_rand_tiledata(stashed_xsave); + init_xtiledata(); +} + +/* + * Ptracer injects the randomized tile data state. It also reads + * before and after that, which will execute the kernel's state copy + * functions. So, the tester is advised to double-check any emitted + * kernel messages. + */ +static void ptracer_inject_tiledata(pid_t target) +{ + struct xsave_buffer *xbuf; + struct iovec iov; + + xbuf = alloc_xbuf(); + if (!xbuf) + fatal_error("unable to allocate XSAVE buffer"); + + printf("\tRead the init'ed tiledata via ptrace().\n"); + + iov.iov_base = xbuf; + iov.iov_len = xbuf_size; + + memset(stashed_xsave, 0, xbuf_size); + + if (ptrace(PTRACE_GETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_GETREGSET"); + + if (!__compare_tiledata_state(stashed_xsave, xbuf)) + printf("[OK]\tThe init'ed tiledata was read from ptracee.\n"); + else + printf("[FAIL]\tThe init'ed tiledata was not read from ptracee.\n"); + + printf("\tInject tiledata via ptrace().\n"); + + load_rand_tiledata(xbuf); + + memcpy(&stashed_xsave->bytes[xtiledata.xbuf_offset], + &xbuf->bytes[xtiledata.xbuf_offset], + xtiledata.size); + + if (ptrace(PTRACE_SETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_SETREGSET"); + + if (ptrace(PTRACE_GETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_GETREGSET"); + + if (!__compare_tiledata_state(stashed_xsave, xbuf)) + printf("[OK]\tTiledata was correctly written to ptracee.\n"); + else + printf("[FAIL]\tTiledata was not correctly written to ptracee.\n"); +} + +static void test_ptrace(void) +{ + pid_t child; + int status; + + child = fork(); + if (child < 0) { + err(1, "fork"); + } else if (!child) { + if (ptrace(PTRACE_TRACEME, 0, NULL, NULL)) + err(1, "PTRACE_TRACEME"); + + ptracee_firstuse_tiledata(); + + raise(SIGTRAP); + _exit(0); + } + + do { + wait(&status); + } while (WSTOPSIG(status) != SIGTRAP); + + ptracer_inject_tiledata(child); + + ptrace(PTRACE_DETACH, child, NULL, NULL); + wait(&status); + if (!WIFEXITED(status) || WEXITSTATUS(status)) + err(1, "ptrace test"); +} + int main(void) { /* Check hardware availability at first */ @@ -846,6 +946,8 @@ int main(void) ctxtswtest_config.num_threads = 5; test_context_switch(); + test_ptrace(); + clearhandler(SIGILL); free_stashed_xsave(); -- cgit v1.2.3 From d035230ec9937a9138921d2a0eeb99496ea7eac0 Mon Sep 17 00:00:00 2001 From: Peter Xu Date: Wed, 8 Mar 2023 19:04:22 +0000 Subject: kselftest: vm: fix unused variable warning Remove unused variable from the MDWE test. [joey.gouly@arm.com: add commit message] Link: https://lkml.kernel.org/r/20230308190423.46491-4-joey.gouly@arm.com Fixes: 4cf1fe34fd18 ("kselftest: vm: add tests for memory-deny-write-execute") Signed-off-by: Peter Xu Signed-off-by: Joey Gouly Acked-by: Catalin Marinas Cc: Alexey Izbyshev Cc: Arnaldo Carvalho de Melo Cc: Kees Cook Cc: nd Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/mm/mdwe_test.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'tools/testing') diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c index f466a099f1bf..bc91bef5d254 100644 --- a/tools/testing/selftests/mm/mdwe_test.c +++ b/tools/testing/selftests/mm/mdwe_test.c @@ -163,9 +163,8 @@ TEST_F(mdwe, mprotect_WRITE_EXEC) TEST_F(mdwe, mmap_FIXED) { - void *p, *p2; + void *p; - p2 = mmap(NULL, self->size, PROT_READ | PROT_EXEC, self->flags, 0, 0); self->p = mmap(NULL, self->size, PROT_READ, self->flags, 0, 0); ASSERT_NE(self->p, MAP_FAILED); -- cgit v1.2.3 From 25209a3209ecc44f93300b7ee5287f451be1d6ff Mon Sep 17 00:00:00 2001 From: Arseniy Krasnov Date: Tue, 28 Mar 2023 14:33:07 +0300 Subject: test/vsock: new skbuff appending test This adds test which checks case when data of newly received skbuff is appended to the last skbuff in the socket's queue. It looks like simple test with 'send()' and 'recv()', but internally it triggers logic which appends one received skbuff to another. Test checks that this feature works correctly. This test is actual only for virtio transport. Signed-off-by: Arseniy Krasnov Reviewed-by: Stefano Garzarella Signed-off-by: Paolo Abeni --- tools/testing/vsock/vsock_test.c | 90 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 90 insertions(+) (limited to 'tools/testing') diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index 3de10dbb50f5..12b97c92fbb2 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -968,6 +968,91 @@ static void test_seqpacket_inv_buf_server(const struct test_opts *opts) test_inv_buf_server(opts, false); } +#define HELLO_STR "HELLO" +#define WORLD_STR "WORLD" + +static void test_stream_virtio_skb_merge_client(const struct test_opts *opts) +{ + ssize_t res; + int fd; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + /* Send first skbuff. */ + res = send(fd, HELLO_STR, strlen(HELLO_STR), 0); + if (res != strlen(HELLO_STR)) { + fprintf(stderr, "unexpected send(2) result %zi\n", res); + exit(EXIT_FAILURE); + } + + control_writeln("SEND0"); + /* Peer reads part of first skbuff. */ + control_expectln("REPLY0"); + + /* Send second skbuff, it will be appended to the first. */ + res = send(fd, WORLD_STR, strlen(WORLD_STR), 0); + if (res != strlen(WORLD_STR)) { + fprintf(stderr, "unexpected send(2) result %zi\n", res); + exit(EXIT_FAILURE); + } + + control_writeln("SEND1"); + /* Peer reads merged skbuff packet. */ + control_expectln("REPLY1"); + + close(fd); +} + +static void test_stream_virtio_skb_merge_server(const struct test_opts *opts) +{ + unsigned char buf[64]; + ssize_t res; + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + control_expectln("SEND0"); + + /* Read skbuff partially. */ + res = recv(fd, buf, 2, 0); + if (res != 2) { + fprintf(stderr, "expected recv(2) returns 2 bytes, got %zi\n", res); + exit(EXIT_FAILURE); + } + + control_writeln("REPLY0"); + control_expectln("SEND1"); + + res = recv(fd, buf + 2, sizeof(buf) - 2, 0); + if (res != 8) { + fprintf(stderr, "expected recv(2) returns 8 bytes, got %zi\n", res); + exit(EXIT_FAILURE); + } + + res = recv(fd, buf, sizeof(buf) - 8 - 2, MSG_DONTWAIT); + if (res != -1) { + fprintf(stderr, "expected recv(2) failure, got %zi\n", res); + exit(EXIT_FAILURE); + } + + if (memcmp(buf, HELLO_STR WORLD_STR, strlen(HELLO_STR WORLD_STR))) { + fprintf(stderr, "pattern mismatch\n"); + exit(EXIT_FAILURE); + } + + control_writeln("REPLY1"); + + close(fd); +} + static struct test_case test_cases[] = { { .name = "SOCK_STREAM connection reset", @@ -1038,6 +1123,11 @@ static struct test_case test_cases[] = { .run_client = test_seqpacket_inv_buf_client, .run_server = test_seqpacket_inv_buf_server, }, + { + .name = "SOCK_STREAM virtio skb merge", + .run_client = test_stream_virtio_skb_merge_client, + .run_server = test_stream_virtio_skb_merge_server, + }, {}, }; -- cgit v1.2.3 From f1594bc676579133a3cd906d7d27733289edfb86 Mon Sep 17 00:00:00 2001 From: Anh Tuan Phan Date: Fri, 24 Mar 2023 09:14:15 +0700 Subject: selftests mount: Fix mount_setattr_test builds failed MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When compiling selftests with target mount_setattr I encountered some errors with the below messages: mount_setattr_test.c: In function ‘mount_setattr_thread’: mount_setattr_test.c:343:16: error: variable ‘attr’ has initializer but incomplete type 343 | struct mount_attr attr = { | ^~~~~~~~~~ These errors might be because of linux/mount.h is not included. This patch resolves that issue. Signed-off-by: Anh Tuan Phan Acked-by: Christian Brauner Signed-off-by: Shuah Khan --- tools/testing/selftests/mount_setattr/mount_setattr_test.c | 1 + 1 file changed, 1 insertion(+) (limited to 'tools/testing') diff --git a/tools/testing/selftests/mount_setattr/mount_setattr_test.c b/tools/testing/selftests/mount_setattr/mount_setattr_test.c index 582669ca38e9..c6a8c732b802 100644 --- a/tools/testing/selftests/mount_setattr/mount_setattr_test.c +++ b/tools/testing/selftests/mount_setattr/mount_setattr_test.c @@ -18,6 +18,7 @@ #include #include #include +#include #include "../kselftest_harness.h" -- cgit v1.2.3 From c13af03de46ba27674dd9fb31a17c0d480081139 Mon Sep 17 00:00:00 2001 From: "Liam R. Howlett" Date: Mon, 27 Feb 2023 09:36:04 -0800 Subject: maple_tree: fix write memory barrier of nodes once dead for RCU mode During the development of the maple tree, the strategy of freeing multiple nodes changed and, in the process, the pivots were reused to store pointers to dead nodes. To ensure the readers see accurate pivots, the writers need to mark the nodes as dead and call smp_wmb() to ensure any readers can identify the node as dead before using the pivot values. There were two places where the old method of marking the node as dead without smp_wmb() were being used, which resulted in RCU readers seeing the wrong pivot value before seeing the node was dead. Fix this race condition by using mte_set_node_dead() which has the smp_wmb() call to ensure the race is closed. Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed are marked as dead to ensure there are no other call paths besides the two updated paths. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett Signed-off-by: Suren Baghdasaryan Cc: Signed-off-by: Andrew Morton --- tools/testing/radix-tree/maple.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) (limited to 'tools/testing') diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index 958ee9bdb316..4c89ff333f6f 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -108,6 +108,7 @@ static noinline void check_new_node(struct maple_tree *mt) MT_BUG_ON(mt, mn->slot[1] != NULL); MT_BUG_ON(mt, mas_allocated(&mas) != 0); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); mas.node = MAS_START; mas_nomem(&mas, GFP_KERNEL); @@ -160,6 +161,7 @@ static noinline void check_new_node(struct maple_tree *mt) MT_BUG_ON(mt, mas_allocated(&mas) != i); MT_BUG_ON(mt, !mn); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } @@ -192,6 +194,7 @@ static noinline void check_new_node(struct maple_tree *mt) MT_BUG_ON(mt, not_empty(mn)); MT_BUG_ON(mt, mas_allocated(&mas) != i - 1); MT_BUG_ON(mt, !mn); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } @@ -210,6 +213,7 @@ static noinline void check_new_node(struct maple_tree *mt) mn = mas_pop_node(&mas); MT_BUG_ON(mt, not_empty(mn)); MT_BUG_ON(mt, mas_allocated(&mas) != j - 1); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } MT_BUG_ON(mt, mas_allocated(&mas) != 0); @@ -233,6 +237,7 @@ static noinline void check_new_node(struct maple_tree *mt) MT_BUG_ON(mt, mas_allocated(&mas) != i - j); mn = mas_pop_node(&mas); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); MT_BUG_ON(mt, mas_allocated(&mas) != i - j - 1); } @@ -269,6 +274,7 @@ static noinline void check_new_node(struct maple_tree *mt) mn = mas_pop_node(&mas); /* get the next node. */ MT_BUG_ON(mt, mn == NULL); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } MT_BUG_ON(mt, mas_allocated(&mas) != 0); @@ -294,6 +300,7 @@ static noinline void check_new_node(struct maple_tree *mt) mn = mas_pop_node(&mas2); /* get the next node. */ MT_BUG_ON(mt, mn == NULL); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } MT_BUG_ON(mt, mas_allocated(&mas2) != 0); @@ -334,10 +341,12 @@ static noinline void check_new_node(struct maple_tree *mt) MT_BUG_ON(mt, mas_allocated(&mas) != MAPLE_ALLOC_SLOTS + 2); mn = mas_pop_node(&mas); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); for (i = 1; i <= MAPLE_ALLOC_SLOTS + 1; i++) { mn = mas_pop_node(&mas); MT_BUG_ON(mt, not_empty(mn)); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); } MT_BUG_ON(mt, mas_allocated(&mas) != 0); @@ -375,6 +384,7 @@ static noinline void check_new_node(struct maple_tree *mt) mas_node_count(&mas, i); /* Request */ mas_nomem(&mas, GFP_KERNEL); /* Fill request */ mn = mas_pop_node(&mas); /* get the next node. */ + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); mas_destroy(&mas); @@ -382,10 +392,13 @@ static noinline void check_new_node(struct maple_tree *mt) mas_node_count(&mas, i); /* Request */ mas_nomem(&mas, GFP_KERNEL); /* Fill request */ mn = mas_pop_node(&mas); /* get the next node. */ + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); mn = mas_pop_node(&mas); /* get the next node. */ + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); mn = mas_pop_node(&mas); /* get the next node. */ + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); mas_destroy(&mas); } @@ -35369,6 +35382,7 @@ static noinline void check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, allocated != 1 + height * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); MT_BUG_ON(mt, mas_preallocate(&mas, GFP_KERNEL) != 0); mas_destroy(&mas); @@ -35386,6 +35400,7 @@ static noinline void check_prealloc(struct maple_tree *mt) mas_destroy(&mas); allocated = mas_allocated(&mas); MT_BUG_ON(mt, allocated != 0); + mn->parent = ma_parent_ptr(mn); ma_free_rcu(mn); MT_BUG_ON(mt, mas_preallocate(&mas, GFP_KERNEL) != 0); @@ -35756,6 +35771,7 @@ void farmer_tests(void) tree.ma_root = mt_mk_node(node, maple_leaf_64); mt_dump(&tree); + node->parent = ma_parent_ptr(node); ma_free_rcu(node); /* Check things that will make lockdep angry */ -- cgit v1.2.3 From 38e058cc7d245dc8034426415bee8fec16ace1bd Mon Sep 17 00:00:00 2001 From: Hangbin Liu Date: Tue, 4 Apr 2023 15:24:11 +0800 Subject: selftests: net: rps_default_mask.sh: delete veth link specifically When deleting the netns and recreating a new one while re-adding the veth interface, there is a small window of time during which the old veth interface has not yet been removed. This can cause the new addition to fail. To resolve this issue, we can either wait for a short while to ensure that the old veth interface is deleted, or we can specifically remove the veth interface. Before this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] RTNETLINK answers: File exists changing rps_default_mask in child ns don't affect the main one[ ok ] cat: /sys/class/net/vethC11an1/queues/rx-0/rps_cpus: No such file or directory changing rps_default_mask in child ns affects new childns devices./rps_default_mask.sh: line 36: [: -eq: unary operator expected [fail] expected 1 found changing rps_default_mask in child ns don't affect existing devices[ ok ] After this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] changing rps_default_mask in child ns don't affect the main one[ ok ] changing rps_default_mask in child ns affects new childns devices[ ok ] changing rps_default_mask in child ns don't affect existing devices[ ok ] Fixes: 3a7d84eae03b ("self-tests: more rps self tests") Signed-off-by: Hangbin Liu Acked-by: Paolo Abeni Link: https://lore.kernel.org/r/20230404072411.879476-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/rps_default_mask.sh | 1 + 1 file changed, 1 insertion(+) (limited to 'tools/testing') diff --git a/tools/testing/selftests/net/rps_default_mask.sh b/tools/testing/selftests/net/rps_default_mask.sh index 0fd0d2db3abc..a26c5624429f 100755 --- a/tools/testing/selftests/net/rps_default_mask.sh +++ b/tools/testing/selftests/net/rps_default_mask.sh @@ -60,6 +60,7 @@ ip link set dev $VETH up ip -n $NETNS link set dev $VETH up chk_rps "changing rps_default_mask affect newly created devices" "" $VETH 3 chk_rps "changing rps_default_mask don't affect newly child netns[II]" $NETNS $VETH 0 +ip link del dev $VETH ip netns del $NETNS setup -- cgit v1.2.3