linux-toradex.git/kernel/bpf, branch v4.14-rc7

bpf: rename sk_actions to align with bpf infrastructure

2017-10-29T02:18:48+00:00

Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.

To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.

This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, drop
SK_ABORTED to eliminate any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and would allow programs to break existing policies.
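
As a rough illustration of the rule, and not the kernel's actual
implementation, the sketch below models the two remaining return codes
and folds several program verdicts together. Only SK_DROP and SK_PASS
come from the message above; combine_verdicts() is a hypothetical
helper introduced here for illustration.

```c
#include <stddef.h>

/* Return codes as described above: the 'no' verdict must be 0. */
enum sk_action {
	SK_DROP = 0,
	SK_PASS,
};

/*
 * Hypothetical helper (not a kernel function): combine the verdicts of
 * several programs so that "all yes is yes, any no is no". Because
 * SK_DROP is 0 and SK_PASS is 1, a bitwise AND is sufficient.
 */
static enum sk_action combine_verdicts(const enum sk_action *v, size_t n)
{
	unsigned int ret = 1;
	size_t i;

	for (i = 0; i < n; i++)
		ret &= (unsigned int)v[i];

	return ret ? SK_PASS : SK_DROP;
}
```

With 'SK_DROP = 0', a single dropping program forces the combined
result to SK_DROP regardless of what the other programs return.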

Signed-off-by: John Fastabend 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: bpf_compute_data uses incorrect cb structure

2017-10-29T02:18:48+00:00

SK_SKB program types use bpf_compute_data to store the end of the
packet data. However, bpf_compute_data assumes the cb is stored in the
qdisc layer format, which is the wrong layer of the stack for SK_SKB
programs.

It happens to work (sort of!) because in most cases nothing happens
to be overwritten today. This is very fragile and error-prone.
Fortunately, we have another hole in tcp_skb_cb we can use, so let's
put the data_end value there.

Note, SK_SKB program types do not use data_meta; such accesses are
rejected by sk_skb_is_valid_access().
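
The idea of keeping data_end in the TCP-layer view of the control
block can be sketched in userspace as follows. This is a simplified
model only: struct model_skb, struct model_tcp_cb, the field layout
and model_compute_data_end() are assumptions made for illustration and
do not reflect the real sk_buff or tcp_skb_cb definitions.

```c
#include <stddef.h>

#define MODEL_CB_SIZE 48	/* assumed size of the per-skb scratch area */

/* Illustrative TCP-layer view of the control block (not the real layout). */
struct model_tcp_cb {
	unsigned int seq;
	unsigned int end_seq;
	void *data_end;		/* the spare slot reused for BPF */
};

struct model_skb {
	unsigned char *data;
	unsigned int len;
	union {
		unsigned char raw[MODEL_CB_SIZE];
		struct model_tcp_cb tcp;
	} cb;
};

/* The TCP view must fit into the scratch area. */
_Static_assert(sizeof(struct model_tcp_cb) <= MODEL_CB_SIZE,
	       "model_tcp_cb does not fit into cb");

/* Store the end of the packet data in the TCP-layer view of cb. */
static void model_compute_data_end(struct model_skb *skb)
{
	skb->cb.tcp.data_end = skb->data + skb->len;
}
```

Because the value lives in a slot that belongs to the layer actually
running the program, it does not overlap fields that layer still
needs.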

Signed-off-by: John Fastabend 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: fix pattern matches for direct packet access

2017-10-21T23:56:09+00:00

Alexander had a test program with direct packet access, where
the access test was in the form of data + X > data_end. In an
unrelated change to the program, LLVM decided to swap the branches
and emitted code for the test in the form of data + X <= data_end.
We hadn't seen these being generated previously, thus the verifier
would reject the program. Therefore, fix up the verifier to
detect all test cases, so we don't run into such issues in the
future.
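
For illustration, the two equivalent shapes of the bounds test look
roughly like this in plain C. This is a userspace sketch, not a BPF
program; the function names are made up here, and callers are assumed
to pass an x that keeps data + x inside the underlying buffer so the
pointer arithmetic stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Form 1: bail out early if the access would run past data_end. */
static int sum_first_gt(const uint8_t *data, const uint8_t *data_end,
			size_t x)
{
	int sum = 0;
	size_t i;

	if (data + x > data_end)
		return -1;

	for (i = 0; i < x; i++)
		sum += data[i];
	return sum;
}

/* Form 2: the same test with the branches swapped. */
static int sum_first_le(const uint8_t *data, const uint8_t *data_end,
			size_t x)
{
	int sum = 0;
	size_t i;

	if (data + x <= data_end) {
		for (i = 0; i < x; i++)
			sum += data[i];
		return sum;
	}
	return -1;
}
```

Both functions guard the same X bytes; only the direction of the
comparison and the branch layout differ, which is the kind of variation
a compiler may introduce.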

Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier")
Reported-by: Alexander Alemayhu 
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Acked-by: John Fastabend 
Signed-off-by: David S. Miller

bpf: fix off by one for range markings with L{T, E} patterns

2017-10-21T23:56:09+00:00

During review I noticed that the current logic for direct packet
access marking in check_cond_jmp_op() has an off by one for the
upper right range border when marking in find_good_pkt_pointers()
with BPF_JLT and BPF_JLE. It's not really harmful given access
up to pkt_end is always safe, but we should nevertheless correct
the range marking before it becomes ABI. If pkt_data' denotes a
pkt_data derived pointer (pkt_data + X), then for pkt_data' < pkt_end
in the true branch as well as for pkt_end <= pkt_data' in the false
branch we mark the range with X although it should really be X - 1
in these cases. For example, X could be pkt_end - pkt_data, then
when testing for pkt_data' < pkt_end the verifier simulation cannot
deduce that a byte load of pkt_data' - 1 would succeed in this
branch.
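
The marking rule described above can be summarised in a small model.
This is only a sketch of the rule as stated in this message:
marked_range() and its right_open parameter are hypothetical names,
and treating a non-positive result as "no range" is an assumption made
here.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the range marking for pkt_data' = pkt_data + X:
 *   - closed test  (pkt_data' <= pkt_end): mark X
 *   - open test    (pkt_data' <  pkt_end): mark X - 1
 * Returns 0 when no useful range can be marked (assumption).
 */
static int marked_range(int x, bool right_open)
{
	if (x < 0 || (x == 0 && right_open))
		return 0;

	return right_open ? x - 1 : x;
}
```

For X = 8, the closed comparison yields a marked range of 8 and the
open comparison yields 7.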

Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier")
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Acked-by: John Fastabend 
Signed-off-by: David S. Miller

bpf: devmap fix arithmetic overflow in bitmap_size calculation

2017-10-21T23:54:09+00:00

An integer overflow is possible in dev_map_bitmap_size() when
evaluating BITS_TO_LONGS(), which becomes, after macro
replacement,

	(((n) + (d) - 1)/ (d))

where 'n' is a __u32 and 'd' is (8 * sizeof(long)). To avoid the
overflow, cast to u64 before the arithmetic.
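
The wraparound is easy to reproduce in userspace. The sketch below
assumes a 64-bit long (d = 64) so that the numbers are fixed; the
function names are made up for this example and are not the kernel's.

```c
#include <stdint.h>

#define MODEL_BITS_PER_LONG 64u	/* assumption: 8 * sizeof(long) == 64 */

/* Rounds up in 32-bit arithmetic: n + d - 1 can wrap around. */
static uint32_t longs_for_bits_u32(uint32_t n)
{
	const uint32_t d = MODEL_BITS_PER_LONG;

	return (n + d - 1) / d;
}

/* Widens n to 64 bits first, so the addition cannot wrap. */
static uint64_t longs_for_bits_u64(uint32_t n)
{
	const uint64_t d = MODEL_BITS_PER_LONG;

	return ((uint64_t)n + d - 1) / d;
}
```

For n = UINT32_MAX, the 32-bit variant wraps to 62 before the division
and returns 0, while the widened variant returns 67108864.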

Reported-by: Richard Weinberger 
Acked-by: Daniel Borkmann 
Signed-off-by: John Fastabend 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: require CAP_NET_ADMIN when using devmap

2017-10-20T12:01:29+00:00

Devmap is used with XDP, which requires CAP_NET_ADMIN, so let's also
make CAP_NET_ADMIN required to use the map.
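
The shape of such a check can be modelled in userspace as below. This
is a sketch only: struct model_task and model_dev_map_alloc() are
stand-ins invented for this example, and returning -EPERM is an
assumption about how the failure is reported.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the calling task's credentials (hypothetical). */
struct model_task {
	bool cap_net_admin;
};

/*
 * Hypothetical map allocation entry point: refuse to create the map
 * unless the caller holds the modelled CAP_NET_ADMIN capability.
 */
static int model_dev_map_alloc(const struct model_task *task)
{
	if (!task->cap_net_admin)
		return -EPERM;	/* assumed error code */

	/* ... allocate and initialise the map here ... */
	return 0;
}
```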

Signed-off-by: John Fastabend 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: require CAP_NET_ADMIN when using sockmap maps

2017-10-20T12:01:29+00:00

Restrict sockmap to CAP_NET_ADMIN.

Signed-off-by: John Fastabend 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb region

2017-10-20T12:01:29+00:00

SK_SKB BPF programs are run from the socket/tcp context but early in
the stack before much of the TCP metadata is needed in tcp_skb_cb. So
we can use some unused fields to place BPF metadata needed for SK_SKB
programs when implementing the redirect function.

This allows us to drop the preempt disable logic. It does however
require an API change, so sk_redirect_map() has been updated to
additionally take a ctx_ptr to the skb. Note, we do however continue
to disable/enable preemption around the actual BPF program run to
account for map updates.
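
A userspace model of the idea, with the redirect target recorded in
the skb's own scratch area rather than in shared state, might look like
the following. All types, field names and the return convention here
are assumptions for illustration; this is not the kernel's
tcp_skb_cb layout or the real helper.

```c
#include <stddef.h>
#include <stdint.h>

struct model_map {
	int id;
};

/* Illustrative BPF metadata kept in otherwise unused cb space. */
struct model_bpf_cb {
	uint32_t key;
	uint32_t flags;
	struct model_map *map;
};

struct model_skb {
	union {
		unsigned char raw[48];	/* assumed scratch area size */
		struct model_bpf_cb bpf;
	} cb;
};

/*
 * Hypothetical redirect helper: takes the skb context as its first
 * argument and records the target in the skb itself.
 * Returns 0 on success, -1 if unsupported flags are passed (assumption).
 */
static int model_sk_redirect_map(struct model_skb *skb,
				 struct model_map *map,
				 uint32_t key, uint64_t flags)
{
	if (flags)
		return -1;

	skb->cb.bpf.key = key;
	skb->cb.bpf.flags = 0;
	skb->cb.bpf.map = map;
	return 0;
}
```

Since each skb carries its own redirect metadata, the program run and
the later lookup no longer need to happen on the same CPU without
preemption in between.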

Signed-off-by: John Fastabend 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: enforce TCP only support for sockmap

2017-10-20T12:01:29+00:00

Only TCP sockets have been tested and at the moment the state change
callback only handles TCP sockets. This adds a check to ensure that
sockets actually being added are TCP sockets.
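
A check of this kind can be sketched in userspace using the standard
socket constants. The function name and the choice of -EOPNOTSUPP as
the error code are assumptions made for this example.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical validation step: accept only TCP stream sockets.
 * sk_type and sk_protocol mirror the socket's type and protocol.
 */
static int model_sockmap_check(int sk_type, int sk_protocol)
{
	if (sk_type != SOCK_STREAM || sk_protocol != IPPROTO_TCP)
		return -EOPNOTSUPP;	/* assumed error code */

	return 0;
}
```

A UDP socket (SOCK_DGRAM, IPPROTO_UDP) is rejected by this check,
while a TCP socket passes.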

For net-next we can consider UDP support.

Signed-off-by: John Fastabend 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations

2017-10-19T12:13:50+00:00

PCPU_MIN_UNIT_SIZE is an implementation detail of the percpu
allocator. Given we support __GFP_NOWARN now, let's just let
the allocation request fail naturally instead. The two call
sites from BPF mistakenly assumed __GFP_NOWARN would work, so
no changes are needed to their actual __alloc_percpu_gfp() calls,
which already use the flag.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Acked-by: John Fastabend 
Signed-off-by: David S. Miller