diff options
| author | David S. Miller <davem@davemloft.net> | 2016-05-06 16:01:55 -0400 |
|---|---|---|
| committer | David S. Miller <davem@davemloft.net> | 2016-05-06 16:01:55 -0400 |
| commit | 4b307a8edb6b6f59b6f2bfe9f36fcec6e43ec911 (patch) | |
| tree | 5c6929abadfb41b794e04c47dbd5aaf0b8af47f7 /include | |
| parent | 95aef7cecbc229e7d6dc26780a7d39e864dc1ed8 (diff) | |
| parent | 883e44e4de71c023d3d74e02f35ca462c67d07dc (diff) | |
Merge branch 'bpf-direct-pkt-access'
Alexei Starovoitov says:
====================
bpf: introduce direct packet access
This set of patches introduce 'direct packet access' from
cls_bpf and act_bpf programs (which are root only).
Current bpf programs use LD_ABS, LD_INS instructions which have
to do 'if (off < skb_headlen)' for every packet access.
It's ok for socket filters, but too slow for XDP, since single
LD_ABS insn consumes 3% of cpu. Therefore we have to amortize the cost
of length check over multiple packet accesses via direct access
to skb->data, data_end pointers.
The existing packet parser typically look like:
if (load_half(skb, offsetof(struct ethhdr, h_proto)) != ETH_P_IP)
return 0;
if (load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)) != IPPROTO_UDP ||
load_byte(skb, ETH_HLEN) != 0x45)
return 0;
...
with 'direct packet access' the bpf program becomes:
void *data = (void *)(long)skb->data;
void *data_end = (void *)(long)skb->data_end;
struct eth_hdr *eth = data;
struct iphdr *iph = data + sizeof(*eth);
if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
return 0;
if (eth->h_proto != htons(ETH_P_IP))
return 0;
if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
return 0;
...
which is more natural to write and significantly faster.
See patch 6 for performance tests:
21Mpps(old) vs 24Mpps(new) with just 5 loads.
For more complex parsers the performance gain is higher.
The other approach implemented in [1] was adding two new instructions
to interpreter and JITs and was too hard to use from llvm side.
The approach presented here doesn't need any instruction changes,
but the verifier has to work harder to check safety of the packet access.
Patch 1 prepares the code and Patch 2 adds new checks for direct
packet access and all of them are gated with 'env->allow_ptr_leaks'
which is true for root only.
Patch 3 improves search pruning for large programs.
Patch 4 wires in verifier's changes with net/core/filter side.
Patch 5 updates docs
Patches 6 and 7 add tests.
[1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include')
| -rw-r--r-- | include/linux/filter.h | 16 | ||||
| -rw-r--r-- | include/uapi/linux/bpf.h | 2 |
2 files changed, 18 insertions, 0 deletions
diff --git a/include/linux/filter.h b/include/linux/filter.h index 43aa1f8855c7..ec1411c89105 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -352,6 +352,22 @@ struct sk_filter { #define BPF_SKB_CB_LEN QDISC_CB_PRIV_LEN +struct bpf_skb_data_end { + struct qdisc_skb_cb qdisc_cb; + void *data_end; +}; + +/* compute the linear packet data range [data, data_end) which + * will be accessed by cls_bpf and act_bpf programs + */ +static inline void bpf_compute_data_end(struct sk_buff *skb) +{ + struct bpf_skb_data_end *cb = (struct bpf_skb_data_end *)skb->cb; + + BUILD_BUG_ON(sizeof(*cb) > FIELD_SIZEOF(struct sk_buff, cb)); + cb->data_end = skb->data + skb_headlen(skb); +} + static inline u8 *bpf_skb_cb(struct sk_buff *skb) { /* eBPF programs may read/write skb->cb[] area to transfer meta diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b7b0fb1292e7..406459b935a2 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -370,6 +370,8 @@ struct __sk_buff { __u32 cb[5]; __u32 hash; __u32 tc_classid; + __u32 data; + __u32 data_end; }; struct bpf_tunnel_key { |
