diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-09-30 17:58:11 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-09-30 17:58:11 -0700 |
commit | ae28ed4578e6d5a481e39c5a9827f27048661fdd (patch) | |
tree | fd29a311fe5f4ab052c4973fca50bca55e82bf94 /arch | |
parent | 4b81e2eb9e4db8f6094c077d0c8b27c264901c1b (diff) | |
parent | 4ef77dd584cfd915526328f516fec59e3a54d66e (diff) |
Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc
(Amery Hung)
Applied as a stable branch in bpf-next and net-next trees.
- Support reading skb metadata via bpf_dynptr (Jakub Sitnicki)
Also a stable branch in bpf-next and net-next trees.
- Enforce expected_attach_type for tailcall compatibility (Daniel
Borkmann)
- Replace path-sensitive with path-insensitive live stack analysis in
the verifier (Eduard Zingerman)
This is a significant change in the verification logic. More details,
motivation, long term plans are in the cover letter/merge commit.
- Support signed BPF programs (KP Singh)
This is another major feature that took years to materialize.
Algorithm details are in the cover letter/marge commit
- Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich)
- Add support for may_goto instruction to arm64 JIT (Puranjay Mohan)
- Fix USDT SIB argument handling in libbpf (Jiawei Zhao)
- Allow uprobe-bpf program to change context registers (Jiri Olsa)
- Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and
Puranjay Mohan)
- Allow access to union arguments in tracing programs (Leon Hwang)
- Optimize rcu_read_lock() + migrate_disable() combination where it's
used in BPF subsystem (Menglong Dong)
- Introduce bpf_task_work_schedule*() kfuncs to schedule deferred
execution of BPF callback in the context of a specific task using the
kernel’s task_work infrastructure (Mykyta Yatsenko)
- Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya
Dwivedi)
- Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi)
- Improve the precision of tnum multiplier verifier operation
(Nandakumar Edamana)
- Use tnums to improve is_branch_taken() logic (Paul Chaignon)
- Add support for atomic operations in arena in riscv JIT (Pu Lehui)
- Report arena faults to BPF error stream (Puranjay Mohan)
- Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin
Monnet)
- Add bpf_strcasecmp() kfunc (Rong Tao)
- Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao
Chen)
* tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits)
libbpf: Replace AF_ALG with open coded SHA-256
selftests/bpf: Add stress test for rqspinlock in NMI
selftests/bpf: Add test case for different expected_attach_type
bpf: Enforce expected_attach_type for tailcall compatibility
bpftool: Remove duplicate string.h header
bpf: Remove duplicate crypto/sha2.h header
libbpf: Fix error when st-prefix_ops and ops from differ btf
selftests/bpf: Test changing packet data from kfunc
selftests/bpf: Add stacktrace map lookup_and_delete_elem test case
selftests/bpf: Refactor stacktrace_map case with skeleton
bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE
selftests/bpf: Fix flaky bpf_cookie selftest
selftests/bpf: Test changing packet data from global functions with a kfunc
bpf: Emit struct bpf_xdp_sock type in vmlinux BTF
selftests/bpf: Task_work selftest cleanup fixes
MAINTAINERS: Delete inactive maintainers from AF_XDP
bpf: Mark kfuncs as __noclone
selftests/bpf: Add kprobe multi write ctx attach test
selftests/bpf: Add kprobe write ctx attach test
selftests/bpf: Add uprobe context ip register change test
...
Diffstat (limited to 'arch')
-rw-r--r-- | arch/arm64/net/Makefile | 2 | ||||
-rw-r--r-- | arch/arm64/net/bpf_jit_comp.c | 127 | ||||
-rw-r--r-- | arch/arm64/net/bpf_timed_may_goto.S | 40 | ||||
-rw-r--r-- | arch/riscv/Kconfig | 1 | ||||
-rw-r--r-- | arch/riscv/include/asm/cmpxchg.h | 6 | ||||
-rw-r--r-- | arch/riscv/kernel/setup.c | 1 | ||||
-rw-r--r-- | arch/riscv/net/bpf_jit.h | 70 | ||||
-rw-r--r-- | arch/riscv/net/bpf_jit_comp64.c | 569 | ||||
-rw-r--r-- | arch/s390/net/Makefile | 2 | ||||
-rw-r--r-- | arch/s390/net/bpf_jit_comp.c | 148 | ||||
-rw-r--r-- | arch/s390/net/bpf_timed_may_goto.S | 45 | ||||
-rw-r--r-- | arch/x86/net/bpf_jit_comp.c | 125 |
12 files changed, 640 insertions, 496 deletions
diff --git a/arch/arm64/net/Makefile b/arch/arm64/net/Makefile index 5c540efb7d9b..3ae382bfca87 100644 --- a/arch/arm64/net/Makefile +++ b/arch/arm64/net/Makefile @@ -2,4 +2,4 @@ # # ARM64 networking code # -obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o +obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o bpf_timed_may_goto.o diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 28996e0a9b00..ab83089c3d8f 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1066,19 +1066,53 @@ static void build_epilogue(struct jit_ctx *ctx, bool was_classic) emit(A64_RET(A64_LR), ctx); } -#define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0) +/* + * Metadata encoding for exception handling in JITed code. + * + * Format of `fixup` field in `struct exception_table_entry`: + * + * Bit layout of `fixup` (32-bit): + * + * +-----------+--------+-----------+-----------+----------+ + * | 31-27 | 26-22 | 21 | 20-16 | 15-0 | + * | | | | | | + * | FIXUP_REG | Unused | ARENA_ACC | ARENA_REG | OFFSET | + * +-----------+--------+-----------+-----------+----------+ + * + * - OFFSET (16 bits): Offset used to compute address for Load/Store instruction. + * - ARENA_REG (5 bits): Register that is used to calculate the address for load/store when + * accessing the arena region. + * - ARENA_ACCESS (1 bit): This bit is set when the faulting instruction accessed the arena region. + * - FIXUP_REG (5 bits): Destination register for the load instruction (cleared on fault) or set to + * DONT_CLEAR if it is a store instruction. + */ + +#define BPF_FIXUP_OFFSET_MASK GENMASK(15, 0) +#define BPF_FIXUP_ARENA_REG_MASK GENMASK(20, 16) +#define BPF_ARENA_ACCESS BIT(21) #define BPF_FIXUP_REG_MASK GENMASK(31, 27) #define DONT_CLEAR 5 /* Unused ARM64 register from BPF's POV */ bool ex_handler_bpf(const struct exception_table_entry *ex, struct pt_regs *regs) { - off_t offset = FIELD_GET(BPF_FIXUP_OFFSET_MASK, ex->fixup); int dst_reg = FIELD_GET(BPF_FIXUP_REG_MASK, ex->fixup); + s16 off = FIELD_GET(BPF_FIXUP_OFFSET_MASK, ex->fixup); + int arena_reg = FIELD_GET(BPF_FIXUP_ARENA_REG_MASK, ex->fixup); + bool is_arena = !!(ex->fixup & BPF_ARENA_ACCESS); + bool is_write = (dst_reg == DONT_CLEAR); + unsigned long addr; + + if (is_arena) { + addr = regs->regs[arena_reg] + off; + bpf_prog_report_arena_violation(is_write, addr, regs->pc); + } if (dst_reg != DONT_CLEAR) regs->regs[dst_reg] = 0; - regs->pc = (unsigned long)&ex->fixup - offset; + /* Skip the faulting instruction */ + regs->pc += AARCH64_INSN_SIZE; + return true; } @@ -1088,7 +1122,9 @@ static int add_exception_handler(const struct bpf_insn *insn, int dst_reg) { off_t ins_offset; - off_t fixup_offset; + s16 off = insn->off; + bool is_arena; + int arena_reg; unsigned long pc; struct exception_table_entry *ex; @@ -1097,11 +1133,16 @@ static int add_exception_handler(const struct bpf_insn *insn, return 0; if (BPF_MODE(insn->code) != BPF_PROBE_MEM && - BPF_MODE(insn->code) != BPF_PROBE_MEMSX && - BPF_MODE(insn->code) != BPF_PROBE_MEM32 && - BPF_MODE(insn->code) != BPF_PROBE_ATOMIC) + BPF_MODE(insn->code) != BPF_PROBE_MEMSX && + BPF_MODE(insn->code) != BPF_PROBE_MEM32 && + BPF_MODE(insn->code) != BPF_PROBE_MEM32SX && + BPF_MODE(insn->code) != BPF_PROBE_ATOMIC) return 0; + is_arena = (BPF_MODE(insn->code) == BPF_PROBE_MEM32) || + (BPF_MODE(insn->code) == BPF_PROBE_MEM32SX) || + (BPF_MODE(insn->code) == BPF_PROBE_ATOMIC); + if (!ctx->prog->aux->extable || WARN_ON_ONCE(ctx->exentry_idx >= ctx->prog->aux->num_exentries)) return -EINVAL; @@ -1120,22 +1161,6 @@ static int add_exception_handler(const struct bpf_insn *insn, return -ERANGE; /* - * Since the extable follows the program, the fixup offset is always - * negative and limited to BPF_JIT_REGION_SIZE. Store a positive value - * to keep things simple, and put the destination register in the upper - * bits. We don't need to worry about buildtime or runtime sort - * modifying the upper bits because the table is already sorted, and - * isn't part of the main exception table. - * - * The fixup_offset is set to the next instruction from the instruction - * that may fault. The execution will jump to this after handling the - * fault. - */ - fixup_offset = (long)&ex->fixup - (pc + AARCH64_INSN_SIZE); - if (!FIELD_FIT(BPF_FIXUP_OFFSET_MASK, fixup_offset)) - return -ERANGE; - - /* * The offsets above have been calculated using the RO buffer but we * need to use the R/W buffer for writes. * switch ex to rw buffer for writing. @@ -1147,8 +1172,26 @@ static int add_exception_handler(const struct bpf_insn *insn, if (BPF_CLASS(insn->code) != BPF_LDX) dst_reg = DONT_CLEAR; - ex->fixup = FIELD_PREP(BPF_FIXUP_OFFSET_MASK, fixup_offset) | - FIELD_PREP(BPF_FIXUP_REG_MASK, dst_reg); + ex->fixup = FIELD_PREP(BPF_FIXUP_REG_MASK, dst_reg); + + if (is_arena) { + ex->fixup |= BPF_ARENA_ACCESS; + /* + * insn->src_reg/dst_reg holds the address in the arena region with upper 32-bits + * being zero because of a preceding addr_space_cast(r<n>, 0x0, 0x1) instruction. + * This address is adjusted with the addition of arena_vm_start (see the + * implementation of BPF_PROBE_MEM32 and BPF_PROBE_ATOMIC) before being used for the + * memory access. Pass the reg holding the unmodified 32-bit address to + * ex_handler_bpf. + */ + if (BPF_CLASS(insn->code) == BPF_LDX) + arena_reg = bpf2a64[insn->src_reg]; + else + arena_reg = bpf2a64[insn->dst_reg]; + + ex->fixup |= FIELD_PREP(BPF_FIXUP_OFFSET_MASK, off) | + FIELD_PREP(BPF_FIXUP_ARENA_REG_MASK, arena_reg); + } ex->type = EX_TYPE_BPF; @@ -1558,7 +1601,13 @@ emit_cond_jmp: if (ret < 0) return ret; emit_call(func_addr, ctx); - emit(A64_MOV(1, r0, A64_R(0)), ctx); + /* + * Call to arch_bpf_timed_may_goto() is emitted by the + * verifier and called with custom calling convention with + * first argument and return value in BPF_REG_AX (x9). + */ + if (func_addr != (u64)arch_bpf_timed_may_goto) + emit(A64_MOV(1, r0, A64_R(0)), ctx); break; } /* tail call */ @@ -1612,7 +1661,11 @@ emit_cond_jmp: case BPF_LDX | BPF_PROBE_MEM32 | BPF_H: case BPF_LDX | BPF_PROBE_MEM32 | BPF_W: case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW: - if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) { + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_B: + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_H: + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_W: + if (BPF_MODE(insn->code) == BPF_PROBE_MEM32 || + BPF_MODE(insn->code) == BPF_PROBE_MEM32SX) { emit(A64_ADD(1, tmp2, src, arena_vm_base), ctx); src = tmp2; } @@ -1624,7 +1677,8 @@ emit_cond_jmp: off_adj = off; } sign_extend = (BPF_MODE(insn->code) == BPF_MEMSX || - BPF_MODE(insn->code) == BPF_PROBE_MEMSX); + BPF_MODE(insn->code) == BPF_PROBE_MEMSX || + BPF_MODE(insn->code) == BPF_PROBE_MEM32SX); switch (BPF_SIZE(code)) { case BPF_W: if (is_lsi_offset(off_adj, 2)) { @@ -1832,9 +1886,11 @@ emit_cond_jmp: if (ret) return ret; - ret = add_exception_handler(insn, ctx, dst); - if (ret) - return ret; + if (BPF_MODE(insn->code) == BPF_PROBE_ATOMIC) { + ret = add_exception_handler(insn, ctx, dst); + if (ret) + return ret; + } break; default: @@ -2767,7 +2823,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, goto out; } - bpf_flush_icache(ro_image, ro_image + size); out: kvfree(image); return ret; @@ -3038,6 +3093,11 @@ bool bpf_jit_bypass_spec_v4(void) return true; } +bool bpf_jit_supports_timed_may_goto(void) +{ + return true; +} + bool bpf_jit_inlines_helper_call(s32 imm) { switch (imm) { @@ -3064,8 +3124,7 @@ void bpf_jit_free(struct bpf_prog *prog) * before freeing it. */ if (jit_data) { - bpf_arch_text_copy(&jit_data->ro_header->size, &jit_data->header->size, - sizeof(jit_data->header->size)); + bpf_jit_binary_pack_finalize(jit_data->ro_header, jit_data->header); kfree(jit_data); } prog->bpf_func -= cfi_get_offset(); diff --git a/arch/arm64/net/bpf_timed_may_goto.S b/arch/arm64/net/bpf_timed_may_goto.S new file mode 100644 index 000000000000..894cfcd7b241 --- /dev/null +++ b/arch/arm64/net/bpf_timed_may_goto.S @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2025 Puranjay Mohan <puranjay@kernel.org> */ + +#include <linux/linkage.h> + +SYM_FUNC_START(arch_bpf_timed_may_goto) + /* Allocate stack space and emit frame record */ + stp x29, x30, [sp, #-64]! + mov x29, sp + + /* Save BPF registers R0 - R5 (x7, x0-x4)*/ + stp x7, x0, [sp, #16] + stp x1, x2, [sp, #32] + stp x3, x4, [sp, #48] + + /* + * Stack depth was passed in BPF_REG_AX (x9), add it to the BPF_FP + * (x25) to get the pointer to count and timestamp and pass it as the + * first argument in x0. + * + * Before generating the call to arch_bpf_timed_may_goto, the verifier + * generates a load instruction using FP, i.e. REG_AX = *(u64 *)(FP - + * stack_off_cnt), so BPF_REG_FP (x25) is always set up by the arm64 + * jit in this case. + */ + add x0, x9, x25 + bl bpf_check_timed_may_goto + /* BPF_REG_AX(x9) will be stored into count, so move return value to it. */ + mov x9, x0 + + /* Restore BPF registers R0 - R5 (x7, x0-x4) */ + ldp x7, x0, [sp, #16] + ldp x1, x2, [sp, #32] + ldp x3, x4, [sp, #48] + + /* Restore FP and LR */ + ldp x29, x30, [sp], #64 + + ret +SYM_FUNC_END(arch_bpf_timed_may_goto) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index a9ef4eeb72ab..2181dde50d6e 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -708,7 +708,6 @@ config TOOLCHAIN_HAS_ZACAS config RISCV_ISA_ZACAS bool "Zacas extension support for atomic CAS" - depends on TOOLCHAIN_HAS_ZACAS depends on RISCV_ALTERNATIVE default y help diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index 80bd52363c68..122e1485d39a 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -134,6 +134,7 @@ ({ \ if (IS_ENABLED(CONFIG_RISCV_ISA_ZABHA) && \ IS_ENABLED(CONFIG_RISCV_ISA_ZACAS) && \ + IS_ENABLED(CONFIG_TOOLCHAIN_HAS_ZACAS) && \ riscv_has_extension_unlikely(RISCV_ISA_EXT_ZABHA) && \ riscv_has_extension_unlikely(RISCV_ISA_EXT_ZACAS)) { \ r = o; \ @@ -181,6 +182,7 @@ r, p, co, o, n) \ ({ \ if (IS_ENABLED(CONFIG_RISCV_ISA_ZACAS) && \ + IS_ENABLED(CONFIG_TOOLCHAIN_HAS_ZACAS) && \ riscv_has_extension_unlikely(RISCV_ISA_EXT_ZACAS)) { \ r = o; \ \ @@ -316,7 +318,7 @@ arch_cmpxchg_release((ptr), (o), (n)); \ }) -#if defined(CONFIG_64BIT) && defined(CONFIG_RISCV_ISA_ZACAS) +#if defined(CONFIG_64BIT) && defined(CONFIG_RISCV_ISA_ZACAS) && defined(CONFIG_TOOLCHAIN_HAS_ZACAS) #define system_has_cmpxchg128() riscv_has_extension_unlikely(RISCV_ISA_EXT_ZACAS) @@ -352,7 +354,7 @@ union __u128_halves { #define arch_cmpxchg128_local(ptr, o, n) \ __arch_cmpxchg128((ptr), (o), (n), "") -#endif /* CONFIG_64BIT && CONFIG_RISCV_ISA_ZACAS */ +#endif /* CONFIG_64BIT && CONFIG_RISCV_ISA_ZACAS && CONFIG_TOOLCHAIN_HAS_ZACAS */ #ifdef CONFIG_RISCV_ISA_ZAWRS /* diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index f90cce7a3ace..14235e58c539 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -290,6 +290,7 @@ static void __init riscv_spinlock_init(void) if (IS_ENABLED(CONFIG_RISCV_ISA_ZABHA) && IS_ENABLED(CONFIG_RISCV_ISA_ZACAS) && + IS_ENABLED(CONFIG_TOOLCHAIN_HAS_ZACAS) && riscv_isa_extension_available(NULL, ZABHA) && riscv_isa_extension_available(NULL, ZACAS)) { using_ext = "using Zabha"; diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h index e7b032dfd17f..632ced07bca4 100644 --- a/arch/riscv/net/bpf_jit.h +++ b/arch/riscv/net/bpf_jit.h @@ -13,21 +13,15 @@ #include <linux/filter.h> #include <asm/cacheflush.h> +/* verify runtime detection extension status */ +#define rv_ext_enabled(ext) \ + (IS_ENABLED(CONFIG_RISCV_ISA_##ext) && riscv_has_extension_likely(RISCV_ISA_EXT_##ext)) + static inline bool rvc_enabled(void) { return IS_ENABLED(CONFIG_RISCV_ISA_C); } -static inline bool rvzba_enabled(void) -{ - return IS_ENABLED(CONFIG_RISCV_ISA_ZBA) && riscv_has_extension_likely(RISCV_ISA_EXT_ZBA); -} - -static inline bool rvzbb_enabled(void) -{ - return IS_ENABLED(CONFIG_RISCV_ISA_ZBB) && riscv_has_extension_likely(RISCV_ISA_EXT_ZBB); -} - enum { RV_REG_ZERO = 0, /* The constant value 0 */ RV_REG_RA = 1, /* Return address */ @@ -84,6 +78,8 @@ struct rv_jit_context { int epilogue_offset; int *offset; /* BPF to RV */ int nexentries; + int ex_insn_off; + int ex_jmp_off; unsigned long flags; int stack_size; u64 arena_vm_start; @@ -757,6 +753,17 @@ static inline u16 rvc_swsp(u32 imm8, u8 rs2) return rv_css_insn(0x6, imm, rs2, 0x2); } +/* RVZACAS instructions. */ +static inline u32 rvzacas_amocas_w(u8 rd, u8 rs2, u8 rs1, u8 aq, u8 rl) +{ + return rv_amo_insn(0x5, aq, rl, rs2, rs1, 2, rd, 0x2f); +} + +static inline u32 rvzacas_amocas_d(u8 rd, u8 rs2, u8 rs1, u8 aq, u8 rl) +{ + return rv_amo_insn(0x5, aq, rl, rs2, rs1, 3, rd, 0x2f); +} + /* RVZBA instructions. */ static inline u32 rvzba_sh2add(u8 rd, u8 rs1, u8 rs2) { @@ -1123,7 +1130,7 @@ static inline void emit_sw(u8 rs1, s32 off, u8 rs2, struct rv_jit_context *ctx) static inline void emit_sh2add(u8 rd, u8 rs1, u8 rs2, struct rv_jit_context *ctx) { - if (rvzba_enabled()) { + if (rv_ext_enabled(ZBA)) { emit(rvzba_sh2add(rd, rs1, rs2), ctx); return; } @@ -1134,7 +1141,7 @@ static inline void emit_sh2add(u8 rd, u8 rs1, u8 rs2, struct rv_jit_context *ctx static inline void emit_sh3add(u8 rd, u8 rs1, u8 rs2, struct rv_jit_context *ctx) { - if (rvzba_enabled()) { + if (rv_ext_enabled(ZBA)) { emit(rvzba_sh3add(rd, rs1, rs2), ctx); return; } @@ -1184,7 +1191,7 @@ static inline void emit_subw(u8 rd, u8 rs1, u8 rs2, struct rv_jit_context *ctx) static inline void emit_sextb(u8 rd, u8 rs, struct rv_jit_context *ctx) { - if (rvzbb_enabled()) { + if (rv_ext_enabled(ZBB)) { emit(rvzbb_sextb(rd, rs), ctx); return; } @@ -1195,7 +1202,7 @@ static inline void emit_sextb(u8 rd, u8 rs, struct rv_jit_context *ctx) static inline void emit_sexth(u8 rd, u8 rs, struct rv_jit_context *ctx) { - if (rvzbb_enabled()) { + if (rv_ext_enabled(ZBB)) { emit(rvzbb_sexth(rd, rs), ctx); return; } @@ -1211,7 +1218,7 @@ static inline void emit_sextw(u8 rd, u8 rs, struct rv_jit_context *ctx) static inline void emit_zexth(u8 rd, u8 rs, struct rv_jit_context *ctx) { - if (rvzbb_enabled()) { + if (rv_ext_enabled(ZBB)) { emit(rvzbb_zexth(rd, rs), ctx); return; } @@ -1222,7 +1229,7 @@ static inline void emit_zexth(u8 rd, u8 rs, struct rv_jit_context *ctx) static inline void emit_zextw(u8 rd, u8 rs, struct rv_jit_context *ctx) { - if (rvzba_enabled()) { + if (rv_ext_enabled(ZBA)) { emit(rvzba_zextw(rd, rs), ctx); return; } @@ -1233,7 +1240,7 @@ static inline void emit_zextw(u8 rd, u8 rs, struct rv_jit_context *ctx) static inline void emit_bswap(u8 rd, s32 imm, struct rv_jit_context *ctx) { - if (rvzbb_enabled()) { + if (rv_ext_enabled(ZBB)) { int bits = 64 - imm; emit(rvzbb_rev8(rd, rd), ctx); @@ -1289,6 +1296,35 @@ out_be: emit_mv(rd, RV_REG_T2, ctx); } +static inline void emit_cmpxchg(u8 rd, u8 rs, u8 r0, bool is64, struct rv_jit_context *ctx) +{ + int jmp_offset; + + if (rv_ext_enabled(ZACAS)) { + ctx->ex_insn_off = ctx->ninsns; + emit(is64 ? rvzacas_amocas_d(r0, rs, rd, 1, 1) : + rvzacas_amocas_w(r0, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; + if (!is64) + emit_zextw(r0, r0, ctx); + return; + } + + if (is64) + emit_mv(RV_REG_T2, r0, ctx); + else + emit_addiw(RV_REG_T2, r0, 0, ctx); + emit(is64 ? rv_lr_d(r0, 0, rd, 0, 0) : + rv_lr_w(r0, 0, rd, 0, 0), ctx); + jmp_offset = ninsns_rvoff(8); + emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx); + emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) : + rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx); + jmp_offset = ninsns_rvoff(-6); + emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx); + emit_fence_rw_rw(ctx); +} + #endif /* __riscv_xlen == 64 */ void bpf_jit_build_prologue(struct rv_jit_context *ctx, bool is_subprog); diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index 206b2b8552a7..45cbc7c6fe49 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -473,138 +473,92 @@ static inline void emit_kcfi(u32 hash, struct rv_jit_context *ctx) emit(hash, ctx); } -static int emit_load_8(bool sign_ext, u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) +static void emit_ldx_insn(u8 rd, s16 off, u8 rs, u8 size, bool sign_ext, + struct rv_jit_context *ctx) { - int insns_start; - - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lb(rd, off, rs), ctx); - else - emit(rv_lbu(rd, off, rs), ctx); - return ctx->ninsns - insns_start; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rs, ctx); - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lb(rd, 0, RV_REG_T1), ctx); - else - emit(rv_lbu(rd, 0, RV_REG_T1), ctx); - return ctx->ninsns - insns_start; -} - -static int emit_load_16(bool sign_ext, u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) -{ - int insns_start; - - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lh(rd, off, rs), ctx); - else - emit(rv_lhu(rd, off, rs), ctx); - return ctx->ninsns - insns_start; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rs, ctx); - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lh(rd, 0, RV_REG_T1), ctx); - else - emit(rv_lhu(rd, 0, RV_REG_T1), ctx); - return ctx->ninsns - insns_start; -} - -static int emit_load_32(bool sign_ext, u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) -{ - int insns_start; - - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lw(rd, off, rs), ctx); - else - emit(rv_lwu(rd, off, rs), ctx); - return ctx->ninsns - insns_start; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rs, ctx); - insns_start = ctx->ninsns; - if (sign_ext) - emit(rv_lw(rd, 0, RV_REG_T1), ctx); - else - emit(rv_lwu(rd, 0, RV_REG_T1), ctx); - return ctx->ninsns - insns_start; -} - -static int emit_load_64(bool sign_ext, u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) -{ - int insns_start; - - if (is_12b_int(off)) { - insns_start = ctx->ninsns; + switch (size) { + case BPF_B: + emit(sign_ext ? rv_lb(rd, off, rs) : rv_lbu(rd, off, rs), ctx); + break; + case BPF_H: + emit(sign_ext ? rv_lh(rd, off, rs) : rv_lhu(rd, off, rs), ctx); + break; + case BPF_W: + emit(sign_ext ? rv_lw(rd, off, rs) : rv_lwu(rd, off, rs), ctx); + break; + case BPF_DW: emit_ld(rd, off, rs, ctx); - return ctx->ninsns - insns_start; + break; } - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rs, ctx); - insns_start = ctx->ninsns; - emit_ld(rd, 0, RV_REG_T1, ctx); - return ctx->ninsns - insns_start; } -static void emit_store_8(u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) +static void emit_stx_insn(u8 rd, s16 off, u8 rs, u8 size, struct rv_jit_context *ctx) { - if (is_12b_int(off)) { + switch (size) { + case BPF_B: emit(rv_sb(rd, off, rs), ctx); - return; + break; + case BPF_H: + emit(rv_sh(rd, off, rs), ctx); + break; + case BPF_W: + emit_sw(rd, off, rs, ctx); + break; + case BPF_DW: + emit_sd(rd, off, rs, ctx); + break; } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - emit(rv_sb(RV_REG_T1, 0, rs), ctx); } -static void emit_store_16(u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) +static void emit_ldx(u8 rd, s16 off, u8 rs, u8 size, bool sign_ext, + struct rv_jit_context *ctx) { if (is_12b_int(off)) { - emit(rv_sh(rd, off, rs), ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_ldx_insn(rd, off, rs, size, sign_ext, ctx); + ctx->ex_jmp_off = ctx->ninsns; return; } emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - emit(rv_sh(RV_REG_T1, 0, rs), ctx); + emit_add(RV_REG_T1, RV_REG_T1, rs, ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_ldx_insn(rd, 0, RV_REG_T1, size, sign_ext, ctx); + ctx->ex_jmp_off = ctx->ninsns; } -static void emit_store_32(u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) +static void emit_st(u8 rd, s16 off, s32 imm, u8 size, struct rv_jit_context *ctx) { + emit_imm(RV_REG_T1, imm, ctx); if (is_12b_int(off)) { - emit_sw(rd, off, rs, ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_stx_insn(rd, off, RV_REG_T1, size, ctx); + ctx->ex_jmp_off = ctx->ninsns; return; } - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - emit_sw(RV_REG_T1, 0, rs, ctx); + emit_imm(RV_REG_T2, off, ctx); + emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_stx_insn(RV_REG_T2, 0, RV_REG_T1, size, ctx); + ctx->ex_jmp_off = ctx->ninsns; } -static void emit_store_64(u8 rd, s32 off, u8 rs, struct rv_jit_context *ctx) +static void emit_stx(u8 rd, s16 off, u8 rs, u8 size, struct rv_jit_context *ctx) { if (is_12b_int(off)) { - emit_sd(rd, off, rs, ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_stx_insn(rd, off, rs, size, ctx); + ctx->ex_jmp_off = ctx->ninsns; return; } emit_imm(RV_REG_T1, off, ctx); emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - emit_sd(RV_REG_T1, 0, rs, ctx); + ctx->ex_insn_off = ctx->ninsns; + emit_stx_insn(RV_REG_T1, 0, rs, size, ctx); + ctx->ex_jmp_off = ctx->ninsns; } static int emit_atomic_ld_st(u8 rd, u8 rs, const struct bpf_insn *insn, @@ -617,20 +571,12 @@ static int emit_atomic_ld_st(u8 rd, u8 rs, const struct bpf_insn *insn, switch (imm) { /* dst_reg = load_acquire(src_reg + off16) */ case BPF_LOAD_ACQ: - switch (BPF_SIZE(code)) { - case BPF_B: - emit_load_8(false, rd, off, rs, ctx); - break; - case BPF_H: - emit_load_16(false, rd, off, rs, ctx); - break; - case BPF_W: - emit_load_32(false, rd, off, rs, ctx); - break; - case BPF_DW: - emit_load_64(false, rd, off, rs, ctx); - break; + if (BPF_MODE(code) == BPF_PROBE_ATOMIC) { + emit_add(RV_REG_T2, rs, RV_REG_ARENA, ctx); + rs = RV_REG_T2; } + + emit_ldx(rd, off, rs, BPF_SIZE(code), false, ctx); emit_fence_r_rw(ctx); /* If our next insn is a redundant zext, return 1 to tell @@ -641,21 +587,13 @@ static int emit_atomic_ld_st(u8 rd, u8 rs, const struct bpf_insn *insn, break; /* store_release(dst_reg + off16, src_reg) */ case BPF_STORE_REL: - emit_fence_rw_w(ctx); - switch (BPF_SIZE(code)) { - case BPF_B: - emit_store_8(rd, off, rs, ctx); - break; - case BPF_H: - emit_store_16(rd, off, rs, ctx); - break; - case BPF_W: - emit_store_32(rd, off, rs, ctx); - break; - case BPF_DW: - emit_store_64(rd, off, rs, ctx); - break; + if (BPF_MODE(code) == BPF_PROBE_ATOMIC) { + emit_add(RV_REG_T2, rd, RV_REG_ARENA, ctx); + rd = RV_REG_T2; } + + emit_fence_rw_w(ctx); + emit_stx(rd, off, rs, BPF_SIZE(code), ctx); break; default: pr_err_once("bpf-jit: invalid atomic load/store opcode %02x\n", imm); @@ -668,17 +606,15 @@ static int emit_atomic_ld_st(u8 rd, u8 rs, const struct bpf_insn *insn, static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn, struct rv_jit_context *ctx) { - u8 r0, code = insn->code; + u8 code = insn->code; s16 off = insn->off; s32 imm = insn->imm; - int jmp_offset; - bool is64; + bool is64 = BPF_SIZE(code) == BPF_DW; if (BPF_SIZE(code) != BPF_W && BPF_SIZE(code) != BPF_DW) { pr_err_once("bpf-jit: 1- and 2-byte RMW atomics are not supported\n"); return -EINVAL; } - is64 = BPF_SIZE(code) == BPF_DW; if (off) { if (is_12b_int(off)) { @@ -690,72 +626,82 @@ static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn, rd = RV_REG_T1; } + if (BPF_MODE(code) == BPF_PROBE_ATOMIC) { + emit_add(RV_REG_T1, rd, RV_REG_ARENA, ctx); + rd = RV_REG_T1; + } + switch (imm) { /* lock *(u32/u64 *)(dst_reg + off16) <op>= src_reg */ case BPF_ADD: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoadd_d(RV_REG_ZERO, rs, rd, 0, 0) : rv_amoadd_w(RV_REG_ZERO, rs, rd, 0, 0), ctx); + ctx->ex_jmp_off = ctx->ninsns; break; case BPF_AND: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoand_d(RV_REG_ZERO, rs, rd, 0, 0) : rv_amoand_w(RV_REG_ZERO, rs, rd, 0, 0), ctx); + ctx->ex_jmp_off = ctx->ninsns; break; case BPF_OR: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoor_d(RV_REG_ZERO, rs, rd, 0, 0) : rv_amoor_w(RV_REG_ZERO, rs, rd, 0, 0), ctx); + ctx->ex_jmp_off = ctx->ninsns; break; case BPF_XOR: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoxor_d(RV_REG_ZERO, rs, rd, 0, 0) : rv_amoxor_w(RV_REG_ZERO, rs, rd, 0, 0), ctx); + ctx->ex_jmp_off = ctx->ninsns; break; /* src_reg = atomic_fetch_<op>(dst_reg + off16, src_reg) */ case BPF_ADD | BPF_FETCH: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoadd_d(rs, rs, rd, 1, 1) : rv_amoadd_w(rs, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; if (!is64) emit_zextw(rs, rs, ctx); break; case BPF_AND | BPF_FETCH: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoand_d(rs, rs, rd, 1, 1) : rv_amoand_w(rs, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; if (!is64) emit_zextw(rs, rs, ctx); break; case BPF_OR | BPF_FETCH: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoor_d(rs, rs, rd, 1, 1) : rv_amoor_w(rs, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; if (!is64) emit_zextw(rs, rs, ctx); break; case BPF_XOR | BPF_FETCH: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoxor_d(rs, rs, rd, 1, 1) : rv_amoxor_w(rs, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; if (!is64) emit_zextw(rs, rs, ctx); break; /* src_reg = atomic_xchg(dst_reg + off16, src_reg); */ case BPF_XCHG: + ctx->ex_insn_off = ctx->ninsns; emit(is64 ? rv_amoswap_d(rs, rs, rd, 1, 1) : rv_amoswap_w(rs, rs, rd, 1, 1), ctx); + ctx->ex_jmp_off = ctx->ninsns; if (!is64) emit_zextw(rs, rs, ctx); break; /* r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg); */ case BPF_CMPXCHG: - r0 = bpf_to_rv_reg(BPF_REG_0, ctx); - if (is64) - emit_mv(RV_REG_T2, r0, ctx); - else - emit_addiw(RV_REG_T2, r0, 0, ctx); - emit(is64 ? rv_lr_d(r0, 0, rd, 0, 0) : - rv_lr_w(r0, 0, rd, 0, 0), ctx); - jmp_offset = ninsns_rvoff(8); - emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx); - emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) : - rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx); - jmp_offset = ninsns_rvoff(-6); - emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx); - emit_fence_rw_rw(ctx); + emit_cmpxchg(rd, rs, regmap[BPF_REG_0], is64, ctx); break; default: pr_err_once("bpf-jit: invalid atomic RMW opcode %02x\n", imm); @@ -765,6 +711,39 @@ static int emit_atomic_rmw(u8 rd, u8 rs, const struct bpf_insn *insn, return 0; } +/* + * Sign-extend the register if necessary + */ +static int sign_extend(u8 rd, u8 rs, u8 sz, bool sign, struct rv_jit_context *ctx) +{ + if (!sign && (sz == 1 || sz == 2)) { + if (rd != rs) + emit_mv(rd, rs, ctx); + return 0; + } + + switch (sz) { + case 1: + emit_sextb(rd, rs, ctx); + break; + case 2: + emit_sexth(rd, rs, ctx); + break; + case 4: + emit_sextw(rd, rs, ctx); + break; + case 8: + if (rd != rs) + emit_mv(rd, rs, ctx); + break; + default: + pr_err("bpf-jit: invalid size %d for sign_extend\n", sz); + return -EINVAL; + } + + return 0; +} + #define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0) #define BPF_FIXUP_REG_MASK GENMASK(31, 27) #define REG_DONT_CLEAR_MARKER 0 /* RV_REG_ZERO unused in pt_regmap */ @@ -783,9 +762,8 @@ bool ex_handler_bpf(const struct exception_table_entry *ex, } /* For accesses to BTF pointers, add an entry to the exception table */ -static int add_exception_handler(const struct bpf_insn *insn, - struct rv_jit_context *ctx, - int dst_reg, int insn_len) +static int add_exception_handler(const struct bpf_insn *insn, int dst_reg, + struct rv_jit_context *ctx) { struct exception_table_entry *ex; unsigned long pc; @@ -793,21 +771,23 @@ static int add_exception_handler(const struct bpf_insn *insn, off_t fixup_offset; if (!ctx->insns || !ctx->ro_insns || !ctx->prog->aux->extable || - (BPF_MODE(insn->code) != BPF_PROBE_MEM && BPF_MODE(insn->code) != BPF_PROBE_MEMSX && - BPF_MODE(insn->code) != BPF_PROBE_MEM32)) + ctx->ex_insn_off <= 0 || ctx->ex_jmp_off <= 0) return 0; - if (WARN_ON_ONCE(ctx->nexentries >= ctx->prog->aux->num_exentries)) - return -EINVAL; + if (BPF_MODE(insn->code) != BPF_PROBE_MEM && + BPF_MODE(insn->code) != BPF_PROBE_MEMSX && + BPF_MODE(insn->code) != BPF_PROBE_MEM32 && + BPF_MODE(insn->code) != BPF_PROBE_ATOMIC) + return 0; - if (WARN_ON_ONCE(insn_len > ctx->ninsns)) + if (WARN_ON_ONCE(ctx->nexentries >= ctx->prog->aux->num_exentries)) return -EINVAL; - if (WARN_ON_ONCE(!rvc_enabled() && insn_len == 1)) + if (WARN_ON_ONCE(ctx->ex_insn_off > ctx->ninsns || ctx->ex_jmp_off > ctx->ninsns)) return -EINVAL; ex = &ctx->prog->aux->extable[ctx->nexentries]; - pc = (unsigned long)&ctx->ro_insns[ctx->ninsns - insn_len]; + pc = (unsigned long)&ctx->ro_insns[ctx->ex_insn_off]; /* * This is the relative offset of the instruction that may fault from @@ -831,7 +811,7 @@ static int add_exception_handler(const struct bpf_insn *insn, * that may fault. The execution will jump to this after handling the * fault. */ - fixup_offset = (long)&ex->fixup - (pc + insn_len * sizeof(u16)); + fixup_offset = (long)&ex->fixup - (long)&ctx->ro_insns[ctx->ex_jmp_off]; if (!FIELD_FIT(BPF_FIXUP_OFFSET_MASK, fixup_offset)) return -ERANGE; @@ -848,6 +828,8 @@ static int add_exception_handler(const struct bpf_insn *insn, FIELD_PREP(BPF_FIXUP_REG_MASK, dst_reg); ex->type = EX_TYPE_BPF; + ctx->ex_insn_off = 0; + ctx->ex_jmp_off = 0; ctx->nexentries++; return 0; } @@ -1079,10 +1061,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, stack_size += 16; save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET); - if (save_ret) { + if (save_ret) stack_size += 16; /* Save both A5 (BPF R0) and A0 */ - retval_off = stack_size; - } + retval_off = stack_size; stack_size += nr_arg_slots * 8; args_off = stack_size; @@ -1226,8 +1207,15 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, restore_args(min_t(int, nr_arg_slots, RV_MAX_REG_ARGS), args_off, ctx); if (save_ret) { - emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); emit_ld(regmap[BPF_REG_0], -(retval_off - 8), RV_REG_FP, ctx); + if (is_struct_ops) { + ret = sign_extend(RV_REG_A0, regmap[BPF_REG_0], m->ret_size, + m->ret_flags & BTF_FMODEL_SIGNED_ARG, ctx); + if (ret) + goto out; + } else { + emit_ld(RV_REG_A0, -retval_off, RV_REG_FP, ctx); + } } emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx); @@ -1320,7 +1308,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, goto out; } - bpf_flush_icache(ro_image, ro_image_end); out: kvfree(image); return ret < 0 ? ret : size; @@ -1857,7 +1844,6 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW: { bool sign_ext; - int insn_len; sign_ext = BPF_MODE(insn->code) == BPF_MEMSX || BPF_MODE(insn->code) == BPF_PROBE_MEMSX; @@ -1867,22 +1853,9 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, rs = RV_REG_T2; } - switch (BPF_SIZE(code)) { - case BPF_B: - insn_len = emit_load_8(sign_ext, rd, off, rs, ctx); - break; - case BPF_H: - insn_len = emit_load_16(sign_ext, rd, off, rs, ctx); - break; - case BPF_W: - insn_len = emit_load_32(sign_ext, rd, off, rs, ctx); - break; - case BPF_DW: - insn_len = emit_load_64(sign_ext, rd, off, rs, ctx); - break; - } + emit_ldx(rd, off, rs, BPF_SIZE(code), sign_ext, ctx); - ret = add_exception_handler(insn, ctx, rd, insn_len); + ret = add_exception_handler(insn, rd, ctx); if (ret) return ret; @@ -1890,238 +1863,73 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx, return 1; break; } + /* speculation barrier */ case BPF_ST | BPF_NOSPEC: break; /* ST: *(size *)(dst + off) = imm */ case BPF_ST | BPF_MEM | BPF_B: - emit_imm(RV_REG_T1, imm, ctx); - if (is_12b_int(off)) { - emit(rv_sb(rd, off, RV_REG_T1), ctx); - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - emit(rv_sb(RV_REG_T2, 0, RV_REG_T1), ctx); - break; - case BPF_ST | BPF_MEM | BPF_H: - emit_imm(RV_REG_T1, imm, ctx); - if (is_12b_int(off)) { - emit(rv_sh(rd, off, RV_REG_T1), ctx); - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - emit(rv_sh(RV_REG_T2, 0, RV_REG_T1), ctx); - break; case BPF_ST | BPF_MEM | BPF_W: - emit_imm(RV_REG_T1, imm, ctx); - if (is_12b_int(off)) { - emit_sw(rd, off, RV_REG_T1, ctx); - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - emit_sw(RV_REG_T2, 0, RV_REG_T1, ctx); - break; case BPF_ST | BPF_MEM | BPF_DW: - emit_imm(RV_REG_T1, imm, ctx); - if (is_12b_int(off)) { - emit_sd(rd, off, RV_REG_T1, ctx); - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - emit_sd(RV_REG_T2, 0, RV_REG_T1, ctx); - break; - + /* ST | PROBE_MEM32: *(size *)(dst + RV_REG_ARENA + off) = imm */ case BPF_ST | BPF_PROBE_MEM32 | BPF_B: case BPF_ST | BPF_PROBE_MEM32 | BPF_H: case BPF_ST | BPF_PROBE_MEM32 | BPF_W: case BPF_ST | BPF_PROBE_MEM32 | BPF_DW: - { - int insn_len, insns_start; - - emit_add(RV_REG_T3, rd, RV_REG_ARENA, ctx); - rd = RV_REG_T3; - - /* Load imm to a register then store it */ - emit_imm(RV_REG_T1, imm, ctx); - - switch (BPF_SIZE(code)) { - case BPF_B: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit(rv_sb(rd, off, RV_REG_T1), ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - insns_start = ctx->ninsns; - emit(rv_sb(RV_REG_T2, 0, RV_REG_T1), ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_H: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit(rv_sh(rd, off, RV_REG_T1), ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - insns_start = ctx->ninsns; - emit(rv_sh(RV_REG_T2, 0, RV_REG_T1), ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_W: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit_sw(rd, off, RV_REG_T1, ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - insns_start = ctx->ninsns; - emit_sw(RV_REG_T2, 0, RV_REG_T1, ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_DW: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit_sd(rd, off, RV_REG_T1, ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T2, off, ctx); - emit_add(RV_REG_T2, RV_REG_T2, rd, ctx); - insns_start = ctx->ninsns; - emit_sd(RV_REG_T2, 0, RV_REG_T1, ctx); - insn_len = ctx->ninsns - insns_start; - break; + if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) { + emit_add(RV_REG_T3, rd, RV_REG_ARENA, ctx); + rd = RV_REG_T3; } - ret = add_exception_handler(insn, ctx, REG_DONT_CLEAR_MARKER, - insn_len); + emit_st(rd, off, imm, BPF_SIZE(code), ctx); + + ret = add_exception_handler(insn, REG_DONT_CLEAR_MARKER, ctx); if (ret) return ret; - break; - } /* STX: *(size *)(dst + off) = src */ case BPF_STX | BPF_MEM | BPF_B: - emit_store_8(rd, off, rs, ctx); - break; case BPF_STX | BPF_MEM | BPF_H: - emit_store_16(rd, off, rs, ctx); - break; case BPF_STX | BPF_MEM | BPF_W: - emit_store_32(rd, off, rs, ctx); - break; case BPF_STX | BPF_MEM | BPF_DW: - emit_store_64(rd, off, rs, ctx); + /* STX | PROBE_MEM32: *(size *)(dst + RV_REG_ARENA + off) = src */ + case BPF_STX | BPF_PROBE_MEM32 | BPF_B: + case BPF_STX | BPF_PROBE_MEM32 | BPF_H: + case BPF_STX | BPF_PROBE_MEM32 | BPF_W: + case BPF_STX | BPF_PROBE_MEM32 | BPF_DW: + if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) { + emit_add(RV_REG_T2, rd, RV_REG_ARENA, ctx); + rd = RV_REG_T2; + } + + emit_stx(rd, off, rs, BPF_SIZE(code), ctx); + + ret = add_exception_handler(insn, REG_DONT_CLEAR_MARKER, ctx); + if (ret) + return ret; break; + + /* Atomics */ case BPF_STX | BPF_ATOMIC | BPF_B: case BPF_STX | BPF_ATOMIC | BPF_H: case BPF_STX | BPF_ATOMIC | BPF_W: case BPF_STX | BPF_ATOMIC | BPF_DW: + case BPF_STX | BPF_PROBE_ATOMIC | BPF_B: + case BPF_STX | BPF_PROBE_ATOMIC | BPF_H: + case BPF_STX | BPF_PROBE_ATOMIC | BPF_W: + case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW: if (bpf_atomic_is_load_store(insn)) ret = emit_atomic_ld_st(rd, rs, insn, ctx); else ret = emit_atomic_rmw(rd, rs, insn, ctx); - if (ret) - return ret; - break; - case BPF_STX | BPF_PROBE_MEM32 | BPF_B: - case BPF_STX | BPF_PROBE_MEM32 | BPF_H: - case BPF_STX | BPF_PROBE_MEM32 | BPF_W: - case BPF_STX | BPF_PROBE_MEM32 | BPF_DW: - { - int insn_len, insns_start; - - emit_add(RV_REG_T2, rd, RV_REG_ARENA, ctx); - rd = RV_REG_T2; - - switch (BPF_SIZE(code)) { - case BPF_B: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit(rv_sb(rd, off, rs), ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - insns_start = ctx->ninsns; - emit(rv_sb(RV_REG_T1, 0, rs), ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_H: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit(rv_sh(rd, off, rs), ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - insns_start = ctx->ninsns; - emit(rv_sh(RV_REG_T1, 0, rs), ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_W: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit_sw(rd, off, rs, ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - insns_start = ctx->ninsns; - emit_sw(RV_REG_T1, 0, rs, ctx); - insn_len = ctx->ninsns - insns_start; - break; - case BPF_DW: - if (is_12b_int(off)) { - insns_start = ctx->ninsns; - emit_sd(rd, off, rs, ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - emit_imm(RV_REG_T1, off, ctx); - emit_add(RV_REG_T1, RV_REG_T1, rd, ctx); - insns_start = ctx->ninsns; - emit_sd(RV_REG_T1, 0, rs, ctx); - insn_len = ctx->ninsns - insns_start; - break; - } - - ret = add_exception_handler(insn, ctx, REG_DONT_CLEAR_MARKER, - insn_len); + ret = ret ?: add_exception_handler(insn, REG_DONT_CLEAR_MARKER, ctx); if (ret) return ret; - break; - } default: pr_err("bpf-jit: unknown opcode %02x\n", code); @@ -2249,6 +2057,25 @@ bool bpf_jit_supports_arena(void) return true; } +bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena) +{ + if (in_arena) { + switch (insn->code) { + case BPF_STX | BPF_ATOMIC | BPF_W: + case BPF_STX | BPF_ATOMIC | BPF_DW: + if (insn->imm == BPF_CMPXCHG) + return rv_ext_enabled(ZACAS); + break; + case BPF_LDX | BPF_MEMSX | BPF_B: + case BPF_LDX | BPF_MEMSX | BPF_H: + case BPF_LDX | BPF_MEMSX | BPF_W: + return false; + } + } + + return true; +} + bool bpf_jit_supports_percpu_insn(void) { return true; diff --git a/arch/s390/net/Makefile b/arch/s390/net/Makefile index 8cab6deb0403..9275cf63192a 100644 --- a/arch/s390/net/Makefile +++ b/arch/s390/net/Makefile @@ -2,5 +2,5 @@ # # Arch-specific network modules # -obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o +obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o bpf_timed_may_goto.o obj-$(CONFIG_HAVE_PNETID) += pnet.o diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index bb17efe29d65..cf461d76e9da 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -675,20 +675,6 @@ static void bpf_jit_prologue(struct bpf_jit *jit, struct bpf_prog *fp) } while (0) /* - * Call r1 either directly or via __s390_indirect_jump_r1 thunk - */ -static void call_r1(struct bpf_jit *jit) -{ - if (nospec_uses_trampoline()) - /* brasl %r14,__s390_indirect_jump_r1 */ - EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, - __s390_indirect_jump_r1); - else - /* basr %r14,%r1 */ - EMIT2(0x0d00, REG_14, REG_1); -} - -/* * Function epilogue */ static void bpf_jit_epilogue(struct bpf_jit *jit) @@ -1790,20 +1776,21 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, REG_SET_SEEN(BPF_REG_5); jit->seen |= SEEN_FUNC; + /* * Copy the tail call counter to where the callee expects it. - * - * Note 1: The callee can increment the tail call counter, but - * we do not load it back, since the x86 JIT does not do this - * either. - * - * Note 2: We assume that the verifier does not let us call the - * main program, which clears the tail call counter on entry. */ - /* mvc tail_call_cnt(4,%r15),frame_off+tail_call_cnt(%r15) */ - _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt), - 0xf000 | (jit->frame_off + - offsetof(struct prog_frame, tail_call_cnt))); + + if (insn->src_reg == BPF_PSEUDO_CALL) + /* + * mvc tail_call_cnt(4,%r15), + * frame_off+tail_call_cnt(%r15) + */ + _EMIT6(0xd203f000 | offsetof(struct prog_frame, + tail_call_cnt), + 0xf000 | (jit->frame_off + + offsetof(struct prog_frame, + tail_call_cnt))); /* Sign-extend the kfunc arguments. */ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { @@ -1819,12 +1806,38 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, } } - /* lgrl %w1,func */ - EMIT6_PCREL_RILB(0xc4080000, REG_W1, _EMIT_CONST_U64(func)); - /* %r1() */ - call_r1(jit); - /* lgr %b0,%r2: load return value into %b0 */ - EMIT4(0xb9040000, BPF_REG_0, REG_2); + if ((void *)func == arch_bpf_timed_may_goto) { + /* + * arch_bpf_timed_may_goto() has a special ABI: the + * parameters are in BPF_REG_AX and BPF_REG_10; the + * return value is in BPF_REG_AX; and all GPRs except + * REG_W0, REG_W1, and BPF_REG_AX are callee-saved. + */ + + /* brasl %r0,func */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_0, (void *)func); + } else { + /* brasl %r14,func */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, (void *)func); + /* lgr %b0,%r2: load return value into %b0 */ + EMIT4(0xb9040000, BPF_REG_0, REG_2); + } + + /* + * Copy the potentially updated tail call counter back. + */ + + if (insn->src_reg == BPF_PSEUDO_CALL) + /* + * mvc frame_off+tail_call_cnt(%r15), + * tail_call_cnt(4,%r15) + */ + _EMIT6(0xd203f000 | (jit->frame_off + + offsetof(struct prog_frame, + tail_call_cnt)), + 0xf000 | offsetof(struct prog_frame, + tail_call_cnt)); + break; } case BPF_JMP | BPF_TAIL_CALL: { @@ -2517,14 +2530,12 @@ static int invoke_bpf_prog(struct bpf_tramp_jit *tjit, * goto skip; */ - /* %r1 = __bpf_prog_enter */ - load_imm64(jit, REG_1, (u64)bpf_trampoline_enter(p)); /* %r2 = p */ load_imm64(jit, REG_2, (u64)p); /* la %r3,run_ctx_off(%r15) */ EMIT4_DISP(0x41000000, REG_3, REG_15, tjit->run_ctx_off); - /* %r1() */ - call_r1(jit); + /* brasl %r14,__bpf_prog_enter */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, bpf_trampoline_enter(p)); /* ltgr %r7,%r2 */ EMIT4(0xb9020000, REG_7, REG_2); /* brcl 8,skip */ @@ -2535,15 +2546,13 @@ static int invoke_bpf_prog(struct bpf_tramp_jit *tjit, * retval = bpf_func(args, p->insnsi); */ - /* %r1 = p->bpf_func */ - load_imm64(jit, REG_1, (u64)p->bpf_func); /* la %r2,bpf_args_off(%r15) */ EMIT4_DISP(0x41000000, REG_2, REG_15, tjit->bpf_args_off); /* %r3 = p->insnsi */ if (!p->jited) load_imm64(jit, REG_3, (u64)p->insnsi); - /* %r1() */ - call_r1(jit); + /* brasl %r14,p->bpf_func */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, p->bpf_func); /* stg %r2,retval_off(%r15) */ if (save_ret) { if (sign_extend(jit, REG_2, m->ret_size, m->ret_flags)) @@ -2560,16 +2569,14 @@ static int invoke_bpf_prog(struct bpf_tramp_jit *tjit, * __bpf_prog_exit(p, start, &run_ctx); */ - /* %r1 = __bpf_prog_exit */ - load_imm64(jit, REG_1, (u64)bpf_trampoline_exit(p)); /* %r2 = p */ load_imm64(jit, REG_2, (u64)p); /* lgr %r3,%r7 */ EMIT4(0xb9040000, REG_3, REG_7); /* la %r4,run_ctx_off(%r15) */ EMIT4_DISP(0x41000000, REG_4, REG_15, tjit->run_ctx_off); - /* %r1() */ - call_r1(jit); + /* brasl %r14,__bpf_prog_exit */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, bpf_trampoline_exit(p)); return 0; } @@ -2729,9 +2736,6 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, /* lgr %r8,%r0 */ EMIT4(0xb9040000, REG_8, REG_0); - } else { - /* %r8 = func_addr + S390X_PATCH_SIZE */ - load_imm64(jit, REG_8, (u64)func_addr + S390X_PATCH_SIZE); } /* @@ -2757,12 +2761,10 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, * __bpf_tramp_enter(im); */ - /* %r1 = __bpf_tramp_enter */ - load_imm64(jit, REG_1, (u64)__bpf_tramp_enter); /* %r2 = im */ load_imm64(jit, REG_2, (u64)im); - /* %r1() */ - call_r1(jit); + /* brasl %r14,__bpf_tramp_enter */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, __bpf_tramp_enter); } for (i = 0; i < fentry->nr_links; i++) @@ -2815,13 +2817,25 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, /* mvc tail_call_cnt(4,%r15),tccnt_off(%r15) */ _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt), 0xf000 | tjit->tccnt_off); - /* lgr %r1,%r8 */ - EMIT4(0xb9040000, REG_1, REG_8); - /* %r1() */ - call_r1(jit); + if (flags & BPF_TRAMP_F_ORIG_STACK) { + if (nospec_uses_trampoline()) + /* brasl %r14,__s390_indirect_jump_r8 */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, + __s390_indirect_jump_r8); + else + /* basr %r14,%r8 */ + EMIT2(0x0d00, REG_14, REG_8); + } else { + /* brasl %r14,func_addr+S390X_PATCH_SIZE */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, + func_addr + S390X_PATCH_SIZE); + } /* stg %r2,retval_off(%r15) */ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15, tjit->retval_off); + /* mvc tccnt_off(%r15),tail_call_cnt(4,%r15) */ + _EMIT6(0xd203f000 | tjit->tccnt_off, + 0xf000 | offsetof(struct prog_frame, tail_call_cnt)); im->ip_after_call = jit->prg_buf + jit->prg; @@ -2846,12 +2860,10 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, * __bpf_tramp_exit(im); */ - /* %r1 = __bpf_tramp_exit */ - load_imm64(jit, REG_1, (u64)__bpf_tramp_exit); /* %r2 = im */ load_imm64(jit, REG_2, (u64)im); - /* %r1() */ - call_r1(jit); + /* brasl %r14,__bpf_tramp_exit */ + EMIT6_PCREL_RILB_PTR(0xc0050000, REG_14, __bpf_tramp_exit); } /* lmg %r2,%rN,reg_args_off(%r15) */ @@ -2860,7 +2872,8 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, REG_2 + (nr_reg_args - 1), REG_15, tjit->reg_args_off); /* lgr %r1,%r8 */ - if (!(flags & BPF_TRAMP_F_SKIP_FRAME)) + if (!(flags & BPF_TRAMP_F_SKIP_FRAME) && + (flags & BPF_TRAMP_F_ORIG_STACK)) EMIT4(0xb9040000, REG_1, REG_8); /* lmg %r7,%r8,r7_r8_off(%r15) */ EMIT6_DISP_LH(0xeb000000, 0x0004, REG_7, REG_8, REG_15, @@ -2879,9 +2892,12 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, EMIT4_IMM(0xa70b0000, REG_15, tjit->stack_size); if (flags & BPF_TRAMP_F_SKIP_FRAME) EMIT_JUMP_REG(14); - else + else if (flags & BPF_TRAMP_F_ORIG_STACK) EMIT_JUMP_REG(1); - + else + /* brcl 0xf,func_addr+S390X_PATCH_SIZE */ + EMIT6_PCREL_RILC_PTR(0xc0040000, 0xf, + func_addr + S390X_PATCH_SIZE); return 0; } @@ -2951,6 +2967,11 @@ bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena) case BPF_STX | BPF_ATOMIC | BPF_DW: if (bpf_atomic_is_load_store(insn)) return false; + break; + case BPF_LDX | BPF_MEMSX | BPF_B: + case BPF_LDX | BPF_MEMSX | BPF_H: + case BPF_LDX | BPF_MEMSX | BPF_W: + return false; } return true; } @@ -2989,3 +3010,8 @@ void arch_bpf_stack_walk(bool (*consume_fn)(void *, u64, u64, u64), prev_addr = addr; } } + +bool bpf_jit_supports_timed_may_goto(void) +{ + return true; +} diff --git a/arch/s390/net/bpf_timed_may_goto.S b/arch/s390/net/bpf_timed_may_goto.S new file mode 100644 index 000000000000..06f567a460d7 --- /dev/null +++ b/arch/s390/net/bpf_timed_may_goto.S @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include <linux/export.h> +#include <linux/linkage.h> +#include <asm/asm-offsets.h> +#include <asm/nospec-insn.h> + +#define R2_OFF 0 +#define R5_OFF (R2_OFF + (5 - 2 + 1) * 8) +#define R14_OFF (R5_OFF + 8) +#define RETADDR_OFF (R14_OFF + 8) +#define R15_OFF (RETADDR_OFF + 8) +#define BACKCHAIN_OFF (R15_OFF + 8) +#define FRAME_SIZE (BACKCHAIN_OFF + 8) +#define FRAME_OFF (STACK_FRAME_OVERHEAD - FRAME_SIZE) +#if (FRAME_OFF + BACKCHAIN_OFF) != __SF_BACKCHAIN +#error Stack frame layout calculation is broken +#endif + + GEN_BR_THUNK %r1 + +SYM_FUNC_START(arch_bpf_timed_may_goto) + /* + * This function has a special ABI: the parameters are in %r12 and + * %r13; the return value is in %r12; all GPRs except %r0, %r1, and + * %r12 are callee-saved; and the return address is in %r0. + */ + stmg %r2,%r5,FRAME_OFF+R2_OFF(%r15) + stg %r14,FRAME_OFF+R14_OFF(%r15) + stg %r0,FRAME_OFF+RETADDR_OFF(%r15) + stg %r15,FRAME_OFF+R15_OFF(%r15) + lgr %r1,%r15 + lay %r15,-FRAME_SIZE(%r15) + stg %r1,__SF_BACKCHAIN(%r15) + + lay %r2,0(%r12,%r13) + brasl %r14,bpf_check_timed_may_goto + lgr %r12,%r2 + + lg %r15,FRAME_SIZE+FRAME_OFF+R15_OFF(%r15) + lmg %r2,%r5,FRAME_OFF+R2_OFF(%r15) + lg %r14,FRAME_OFF+R14_OFF(%r15) + lg %r1,FRAME_OFF+RETADDR_OFF(%r15) + BR_EX %r1 +SYM_FUNC_END(arch_bpf_timed_may_goto) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 7e3fca164620..fc13306af15f 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -8,6 +8,7 @@ #include <linux/netdevice.h> #include <linux/filter.h> #include <linux/if_vlan.h> +#include <linux/bitfield.h> #include <linux/bpf.h> #include <linux/memory.h> #include <linux/sort.h> @@ -1151,11 +1152,38 @@ static void emit_ldx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 i *pprog = prog; } +static void emit_ldsx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* movsx rax, byte ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x0F, 0xBE); + break; + case BPF_H: + /* movsx rax, word ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x0F, 0xBF); + break; + case BPF_W: + /* movsx rax, dword ptr [rax + r12 + off] */ + EMIT2(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x63); + break; + } + emit_insn_suffix_SIB(&prog, src_reg, dst_reg, index_reg, off); + *pprog = prog; +} + static void emit_ldx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { emit_ldx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off); } +static void emit_ldsx_r12(u8 **prog, u32 size, u32 dst_reg, u32 src_reg, int off) +{ + emit_ldsx_index(prog, size, dst_reg, src_reg, X86_REG_R12, off); +} + /* STX: *(u8*)(dst_reg + off) = src_reg */ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { @@ -1388,16 +1416,67 @@ static int emit_atomic_ld_st_index(u8 **pprog, u32 atomic_op, u32 size, return 0; } +/* + * Metadata encoding for exception handling in JITed code. + * + * Format of `fixup` and `data` fields in `struct exception_table_entry`: + * + * Bit layout of `fixup` (32-bit): + * + * +-----------+--------+-----------+---------+----------+ + * | 31 | 30-24 | 23-16 | 15-8 | 7-0 | + * | | | | | | + * | ARENA_ACC | Unused | ARENA_REG | DST_REG | INSN_LEN | + * +-----------+--------+-----------+---------+----------+ + * + * - INSN_LEN (8 bits): Length of faulting insn (max x86 insn = 15 bytes (fits in 8 bits)). + * - DST_REG (8 bits): Offset of dst_reg from reg2pt_regs[] (max offset = 112 (fits in 8 bits)). + * This is set to DONT_CLEAR if the insn is a store. + * - ARENA_REG (8 bits): Offset of the register that is used to calculate the + * address for load/store when accessing the arena region. + * - ARENA_ACCESS (1 bit): This bit is set when the faulting instruction accessed the arena region. + * + * Bit layout of `data` (32-bit): + * + * +--------------+--------+--------------+ + * | 31-16 | 15-8 | 7-0 | + * | | | | + * | ARENA_OFFSET | Unused | EX_TYPE_BPF | + * +--------------+--------+--------------+ + * + * - ARENA_OFFSET (16 bits): Offset used to calculate the address for load/store when + * accessing the arena region. + */ + #define DONT_CLEAR 1 +#define FIXUP_INSN_LEN_MASK GENMASK(7, 0) +#define FIXUP_REG_MASK GENMASK(15, 8) +#define FIXUP_ARENA_REG_MASK GENMASK(23, 16) +#define FIXUP_ARENA_ACCESS BIT(31) +#define DATA_ARENA_OFFSET_MASK GENMASK(31, 16) bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs) { - u32 reg = x->fixup >> 8; + u32 reg = FIELD_GET(FIXUP_REG_MASK, x->fixup); + u32 insn_len = FIELD_GET(FIXUP_INSN_LEN_MASK, x->fixup); + bool is_arena = !!(x->fixup & FIXUP_ARENA_ACCESS); + bool is_write = (reg == DONT_CLEAR); + unsigned long addr; + s16 off; + u32 arena_reg; + + if (is_arena) { + arena_reg = FIELD_GET(FIXUP_ARENA_REG_MASK, x->fixup); + off = FIELD_GET(DATA_ARENA_OFFSET_MASK, x->data); + addr = *(unsigned long *)((void *)regs + arena_reg) + off; + bpf_prog_report_arena_violation(is_write, addr, regs->ip); + } /* jump over faulting load and clear dest register */ if (reg != DONT_CLEAR) *(unsigned long *)((void *)regs + reg) = 0; - regs->ip += x->fixup & 0xff; + regs->ip += insn_len; + return true; } @@ -2057,19 +2136,27 @@ st: if (is_imm8(insn->off)) case BPF_LDX | BPF_PROBE_MEM32 | BPF_H: case BPF_LDX | BPF_PROBE_MEM32 | BPF_W: case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW: + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_B: + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_H: + case BPF_LDX | BPF_PROBE_MEM32SX | BPF_W: case BPF_STX | BPF_PROBE_MEM32 | BPF_B: case BPF_STX | BPF_PROBE_MEM32 | BPF_H: case BPF_STX | BPF_PROBE_MEM32 | BPF_W: case BPF_STX | BPF_PROBE_MEM32 | BPF_DW: start_of_ldx = prog; - if (BPF_CLASS(insn->code) == BPF_LDX) - emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); - else + if (BPF_CLASS(insn->code) == BPF_LDX) { + if (BPF_MODE(insn->code) == BPF_PROBE_MEM32SX) + emit_ldsx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); + else + emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); + } else { emit_stx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); + } populate_extable: { struct exception_table_entry *ex; u8 *_insn = image + proglen + (start_of_ldx - temp); + u32 arena_reg, fixup_reg; s64 delta; if (!bpf_prog->aux->extable) @@ -2089,8 +2176,29 @@ populate_extable: ex->data = EX_TYPE_BPF; - ex->fixup = (prog - start_of_ldx) | - ((BPF_CLASS(insn->code) == BPF_LDX ? reg2pt_regs[dst_reg] : DONT_CLEAR) << 8); + /* + * src_reg/dst_reg holds the address in the arena region with upper + * 32-bits being zero because of a preceding addr_space_cast(r<n>, + * 0x0, 0x1) instruction. This address is adjusted with the addition + * of arena_vm_start (see the implementation of BPF_PROBE_MEM32 and + * BPF_PROBE_ATOMIC) before being used for the memory access. Pass + * the reg holding the unmodified 32-bit address to + * ex_handler_bpf(). + */ + if (BPF_CLASS(insn->code) == BPF_LDX) { + arena_reg = reg2pt_regs[src_reg]; + fixup_reg = reg2pt_regs[dst_reg]; + } else { + arena_reg = reg2pt_regs[dst_reg]; + fixup_reg = DONT_CLEAR; + } + + ex->fixup = FIELD_PREP(FIXUP_INSN_LEN_MASK, prog - start_of_ldx) | + FIELD_PREP(FIXUP_ARENA_REG_MASK, arena_reg) | + FIELD_PREP(FIXUP_REG_MASK, fixup_reg); + ex->fixup |= FIXUP_ARENA_ACCESS; + + ex->data |= FIELD_PREP(DATA_ARENA_OFFSET_MASK, insn->off); } break; @@ -2208,7 +2316,8 @@ populate_extable: * End result: x86 insn "mov rbx, qword ptr [rax+0x14]" * of 4 bytes will be ignored and rbx will be zero inited. */ - ex->fixup = (prog - start_of_ldx) | (reg2pt_regs[dst_reg] << 8); + ex->fixup = FIELD_PREP(FIXUP_INSN_LEN_MASK, prog - start_of_ldx) | + FIELD_PREP(FIXUP_REG_MASK, reg2pt_regs[dst_reg]); } break; |