linux-toradex.git/kernel/trace/bpf_trace.c, branch v4.9.62

bpf: fix bpf_trace_printk on 32 bit archs

2017-08-30T08:21:39+00:00

[ Upstream commit 88a5c690b66110ad255380d8f629c629cf6ca559 ]

James reported that on MIPS32 bpf_trace_printk() is currently
broken while MIPS64 works fine:

  bpf_trace_printk() uses conditional operators to attempt to
  pass different types to __trace_printk() depending on the
  format operators. This doesn't work as intended on 32-bit
  architectures where u32 and long are passed differently to
  u64, since the result of C conditional operators follows the
  "usual arithmetic conversions" rules, such that the values
  passed to __trace_printk() will always be u64 [causing issues
  later in the va_list handling for vscnprintf()].

  For example the samples/bpf/tracex5 test printed lines like
  below on MIPS32, where the fd and buf have come from the u64
  fd argument, and the size from the buf argument:

    [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)

  Instead of this:

    [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)

One way to get it working is to expand various combinations
of argument types into 8 different combinations for 32 bit
and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64
as well that it resolves the issue.

Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: James Hogan 
Tested-by: James Hogan 
Signed-off-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

bpf: add BPF_CALL_x macros for declaring helpers

2016-09-10T02:36:04+00:00

This work adds BPF_CALL_() macros and converts all the eBPF helper functions
to use them, in a similar fashion like we do with SYSCALL_DEFINE() macros
that are used today. Motivation for this is to hide all the register handling
and all necessary casts from the user, so that it is done automatically in the
background when adding a BPF_CALL_() call.

This makes current helpers easier to review, eases to write future helpers,
avoids getting the casting mess wrong, and allows for extending all helpers at
once (f.e. build time checks, etc). It also helps detecting more easily in
code reviews that unused registers are not instrumented in the code by accident,
breaking compatibility with existing programs.

BPF_CALL_() internals are quite similar to SYSCALL_DEFINE() ones with some
fundamental differences, for example, for generating the actual helper function
that carries all u64 regs, we need to fill unused regs, so that we always end up
with 5 u64 regs as an argument.

I reviewed several 0-5 generated BPF_CALL_() variants of the .i results and
they look all as expected. No sparse issue spotted. We let this also sit for a
few days with Fengguang's kbuild test robot, and there were no issues seen. On
s390, it barked on the "uses dynamic stack allocation" notice, which is an old
one from bpf_perf_event_output{,_tp}() reappearing here due to the conversion
to the call wrapper, just telling that the perf raw record/frag sits on stack
(gcc with s390's -mwarn-dynamicstack), but that's all. Did various runtime tests
and they were fine as well. All eBPF helpers are now converted to use these
macros, getting rid of a good chunk of all the raw castings.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: add BPF_SIZEOF and BPF_FIELD_SIZEOF macros

2016-09-10T02:36:04+00:00

Add BPF_SIZEOF() and BPF_FIELD_SIZEOF() macros to improve the code a bit
which otherwise often result in overly long bytes_to_bpf_size(sizeof())
and bytes_to_bpf_size(FIELD_SIZEOF()) lines. So place them into a macro
helper instead. Moreover, we currently have a BUILD_BUG_ON(BPF_FIELD_SIZEOF())
check in convert_bpf_extensions(), but we should rather make that generic
as well and add a BUILD_BUG_ON() test in all BPF_SIZEOF()/BPF_FIELD_SIZEOF()
users to detect any rewriter size issues at compile time. Note, there are
currently none, but we want to assert that it stays this way.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type

2016-09-02T17:46:44+00:00

Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)

The program visible context meta structure is
struct bpf_perf_event_data {
    struct pt_regs regs;
     __u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
  ... ctx->sample_period ...
  ... ctx->regs.ip ...
}

The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller

bpf: allow bpf_get_prandom_u32() to be used in tracing

2016-08-13T04:57:05+00:00

bpf_get_prandom_u32() was initially introduced for socket filters
and later requested numberous times to be added to tracing bpf programs
for the same reason as in socket filters: to be able to randomly
select incoming events.

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller

bpf: Add bpf_current_task_under_cgroup helper

2016-08-13T04:49:41+00:00

This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on membership test
for a cgroup arraymap. It is invalid to call this in an interrupt, and
it'll return an error. The helper is primarily to be used in debugging
activities for containers, where you may have multiple programs running in
a given top-level "container".

Signed-off-by: Sargun Dhillon 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Tejun Heo 
Acked-by: Tejun Heo 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller

bpf: Add bpf_probe_write_user BPF helper to be called in tracers

2016-07-26T01:07:48+00:00

This allows user memory to be written to during the course of a kprobe.
It shouldn't be used to implement any kind of security mechanism
because of TOC-TOU attacks, but rather to debug, divert, and
manipulate execution of semi-cooperative processes.

Although it uses probe_kernel_write, we limit the address space
the probe can write into by checking the space with access_ok.
We do this as opposed to calling copy_to_user directly, in order
to avoid sleeping. In addition we ensure the threads's current fs
/ segment is USER_DS and the thread isn't exiting nor a kernel thread.

Given this feature is meant for experiments, and it has a risk of
crashing the system, and running programs, we print a warning on
when a proglet that attempts to use this helper is installed,
along with the pid and process name.

Signed-off-by: Sargun Dhillon 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

kernel/trace/bpf_trace.c: work around gcc-4.4.4 anon union initialization bug

2016-07-20T02:27:01+00:00

kernel/trace/bpf_trace.c: In function 'bpf_event_output':
kernel/trace/bpf_trace.c:312: error: unknown field 'next' specified in initializer
kernel/trace/bpf_trace.c:312: warning: missing braces around initializer
kernel/trace/bpf_trace.c:312: warning: (near initialization for 'raw.frag.')

Fixes: 555c8a8623a3a87 ("bpf: avoid stack copy and use skb ctx for event output")
Acked-by: Daniel Borkmann 
Cc: Alexei Starovoitov 
Cc: David S. Miller 
Signed-off-by: Andrew Morton 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf: avoid stack copy and use skb ctx for event output

2016-07-15T21:23:56+00:00

This work addresses a couple of issues bpf_skb_event_output()
helper currently has: i) We need two copies instead of just a
single one for the skb data when it should be part of a sample.
The data can be non-linear and thus needs to be extracted via
bpf_skb_load_bytes() helper first, and then copied once again
into the ring buffer slot. ii) Since bpf_skb_load_bytes()
currently needs to be used first, the helper needs to see a
constant size on the passed stack buffer to make sure BPF
verifier can do sanity checks on it during verification time.
Thus, just passing skb->len (or any other non-constant value)
wouldn't work, but changing bpf_skb_load_bytes() is also not
the proper solution, since the two copies are generally still
needed. iii) bpf_skb_load_bytes() is just for rather small
buffers like headers, since they need to sit on the limited
BPF stack anyway. Instead of working around in bpf_skb_load_bytes(),
this work improves the bpf_skb_event_output() helper to address
all 3 at once.

We can make use of the passed in skb context that we have in
the helper anyway, and use some of the reserved flag bits as
a length argument. The helper will use the new __output_custom()
facility from perf side with bpf_skb_copy() as callback helper
to walk and extract the data. It will pass the data for setup
to bpf_event_output(), which generates and pushes the raw record
with an additional frag part. The linear data used in the first
frag of the record serves as programmatically defined meta data
passed along with the appended sample.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

bpf, perf: split bpf_perf_event_output

2016-07-15T21:23:56+00:00

Split the bpf_perf_event_output() helper as a preparation into
two parts. The new bpf_perf_event_output() will prepare the raw
record itself and test for unknown flags from BPF trace context,
where the __bpf_perf_event_output() does the core work. The
latter will be reused later on from bpf_event_output() directly.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller