linux-toradex.git/arch/arm/net, branch v4.1.10

ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.

2015-05-10T23:21:49+00:00

The ARM JIT code emits "ldr rX, [pc, #offset]" to access the literal
pool. #offset maximum value is 4095 and if the generated code is too
large, the #offset value can overflow and not point to the expected
slot in the literal pool. Additionally, when overflow occurs, bits of
the overflow can end up changing the destination register of the ldr
instruction.

Fix that by detecting the overflow in imm_offset() and setting a flag
that is checked for each BPF instructions converted in
build_body(). As of now it can only be detected in the second pass. As
a result the second build_body() call can now fail, so add the
corresponding cleanup code in that case.

Using multiple literal pools in the JITed code is going to require
lots of intrusive changes to the JIT code (which would better be done
as a feature instead of fix), just delegating to the kernel BPF
interpreter in that case is a more straight forward, minimal fix and
easy to backport.

Fixes: ddecdfcea0ae ("ARM: 7259/3: net: JIT compiler for packet filters")
Signed-off-by: Nicolas Schichan 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller

ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.

2015-05-10T23:20:32+00:00

In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch)
and loading rm first into ARM_R0 will result in jit_udiv() function
being called the same dividend and divisor. Fix that by loading rn
first into ARM_R1 and then rm into ARM_R0.

Signed-off-by: Nicolas Schichan 
Cc:  # v3.13+
Fixes: aee636c4809f (bpf: do not use reciprocal divide)
Acked-by: Mircea Gherzan 
Signed-off-by: David S. Miller

net: bpf: arm: make hole-faulting more robust

2014-09-23T16:40:22+00:00

Will Deacon pointed out, that the currently used opcode for filling holes,
that is 0xe7ffffff, seems not robust enough ...

  $ echo 0xffffffe7 | xxd -r > test.bin
  $ arm-linux-gnueabihf-objdump -m arm -D -b binary test.bin
  ...
  0: e7ffffff     udf    #65535  ; 0xffff

... while for Thumb, it ends up as ...

  0: ffff e7ff    vqshl.u64  q15, , #63

... which is a bit fragile. The ARM specification defines some *permanently*
guaranteed undefined instruction (UDF) space, for example for ARM in ARMv7-AR,
section A5.4 and for Thumb in ARMv7-M, section A5.2.6.

Similarly, ptrace, kprobes, kgdb, bug and uprobes make use of such instruction
as well to trap. Given mentioned section from the specification, we can find
such a universe as (where 'x' denotes 'don't care'):

  ARM:    xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
  Thumb:  1101 1110 xxxx xxxx

We therefore should use a more robust opcode that fits both. Russell King
suggested that we can even reuse a single 32-bit word, that is, 0xe7fddef1
which will fault if executed in ARM *or* Thumb mode as done in f928d4f2a86f
("ARM: poison the vectors page"). That will still hold our requirements:

  $ echo 0xf1defde7 | xxd -r > test.bin
  $ arm-unknown-linux-gnueabi-objdump -m arm -D -b binary test.bin
  ...
  0: e7fddef1     udf    #56801 ; 0xdde1
  $ echo 0xf1defde7f1defde7f1defde7 | xxd -r > test.bin
  $ arm-unknown-linux-gnueabi-objdump -marm -Mforce-thumb -D -b binary test.bin
  ...
  0: def1         udf    #241 ; 0xf1
  2: e7fd         b.n    0x0
  4: def1         udf    #241 ; 0xf1
  6: e7fd         b.n    0x4
  8: def1         udf    #241 ; 0xf1
  a: e7fd         b.n    0x8

So on ARM 0xe7fddef1 conforms to the above UDF pattern, and the low 16 bit
likewise correspond to UDF in Thumb case. The 0xe7fd part is an unconditional
branch back to the UDF instruction.

Signed-off-by: Daniel Borkmann 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mircea Gherzan 
Cc: Alexei Starovoitov 
Signed-off-by: David S. Miller

net: bpf: be friendly to kmemcheck

2014-09-09T23:58:56+00:00

Reported by Mikulas Patocka, kmemcheck currently barks out a
false positive since we don't have special kmemcheck annotation
for bitfields used in bpf_prog structure.

We currently have jited:1, len:31 and thus when accessing len
while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that
we're reading uninitialized memory.

As we don't need the whole bit universe for pages member, we
can just split it to u16 and use a bool flag for jited instead
of a bitfield.

Signed-off-by: Mikulas Patocka 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

net: bpf: arm: address randomize and write protect JIT code

2014-09-09T23:58:56+00:00

This is the ARM variant for 314beb9bcab ("x86: bpf_jit_comp: secure bpf
jit against spraying attacks").

It is now possible to implement it due to commits 75374ad47c64 ("ARM: mm:
Define set_memory_* functions for ARM") and dca9aa92fc7c ("ARM: add
DEBUG_SET_MODULE_RONX option to Kconfig") which added infrastructure for
this facility.

Thus, this patch makes sure the BPF generated JIT code is marked RO, as
other kernel text sections, and also lets the generated JIT code start
at a pseudo random offset instead on a page boundary. The holes are filled
with illegal instructions.

JIT tested on armv7hl with BPF test suite.

Reference: http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
Signed-off-by: Daniel Borkmann 
Signed-off-by: Alexei Starovoitov 
Acked-by: Mircea Gherzan 
Signed-off-by: David S. Miller

net: bpf: make eBPF interpreter images read-only

2014-09-05T19:02:48+00:00

With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler 
Signed-off-by: Daniel Borkmann 
Signed-off-by: Hannes Frederic Sowa 
Cc: Alexei Starovoitov 
Cc: Kees Cook 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

net: filter: split 'struct sk_filter' into socket and bpf parts

2014-08-02T22:03:58+00:00

clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix

split 'struct sk_filter' into
struct sk_filter {
	atomic_t        refcnt;
	struct rcu_head rcu;
	struct bpf_prog *prog;
};
and
struct bpf_prog {
        u32                     jited:1,
                                len:31;
        struct sock_fprog_kern  *orig_prog;
        unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                            const struct bpf_insn *filter);
        union {
                struct sock_filter      insns[0];
                struct bpf_insn         insnsi[0];
                struct work_struct      work;
        };
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases

split SK_RUN_FILTER macro into:
    SK_RUN_FILTER to be used with 'struct sk_filter *' and
    BPF_PROG_RUN to be used with 'struct bpf_prog *'

__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function

also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:

sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter

API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet

API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf

Signed-off-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

net: filter: get rid of BPF_S_* enum

2014-06-02T05:16:58+00:00

This patch finally allows us to get rid of the BPF_S_* enum.
Currently, the code performs unnecessary encode and decode
workarounds in seccomp and filter migration itself when a filter
is being attached in order to overcome BPF_S_* encoding which
is not used anymore by the new interpreter resp. JIT compilers.

Keeping it around would mean that also in future we would need
to extend and maintain this enum and related encoders/decoders.
We can get rid of all that and save us these operations during
filter attaching. Naturally, also JIT compilers need to be updated
by this.

Before JIT conversion is being done, each compiler checks if A
is being loaded at startup to obtain information if it needs to
emit instructions to clear A first. Since BPF extensions are a
subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements
for extensions can be removed at that point. To ease and minimalize
code changes in the classic JITs, we have introduced bpf_anc_helper().

Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int),
arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we
unfortunately didn't have access, but changes are analogous to
the rest.

Joint work with Alexei Starovoitov.

Signed-off-by: Daniel Borkmann 
Signed-off-by: Alexei Starovoitov 
Cc: Benjamin Herrenschmidt 
Cc: Martin Schwidefsky 
Cc: Mircea Gherzan 
Cc: Kees Cook 
Acked-by: Chema Gonzalez 
Signed-off-by: David S. Miller

net: filter: add jited flag to indicate jit compiled filters

2014-03-31T04:45:08+00:00

This patch adds a jited flag into sk_filter struct in order to indicate
whether a filter is currently jited or not. The size of sk_filter is
not being expanded as the 32 bit 'len' member allows upper bits to be
reused since a filter can currently only grow as large as BPF_MAXINSNS.

Therefore, there's enough room also for other in future needed flags to
reuse 'len' field if necessary. The jited flag also allows for having
alternative interpreter functions running as currently, we can only
detect jit compiled filters by testing fp->bpf_func to not equal the
address of sk_run_filter().

Joint work with Alexei Starovoitov.

Signed-off-by: Alexei Starovoitov 
Signed-off-by: Daniel Borkmann 
Cc: Pablo Neira Ayuso 
Signed-off-by: David S. Miller

net: Rename skb->rxhash to skb->hash

2014-03-26T19:58:20+00:00

The packet hash can be considered a property of the packet, not just
on RX path.

This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.

Signed-off-by: Tom Herbert 
Signed-off-by: Eric Dumazet 
Cc: Mahesh Bandewar 
Signed-off-by: David S. Miller