| author | Daniel Borkmann <daniel@iogearbox.net> | 2019-06-29 01:31:10 +0200 |
|---|---|---|
| committer | Daniel Borkmann <daniel@iogearbox.net> | 2019-06-29 01:31:11 +0200 |
| commit | 8daed7677a1da676332e0294db8a09cad030e693 (patch) | |
| tree | 5e29d398bd0fcaac393d22b98ce3f0bc48f49f05 /include/linux | |
| parent | 2d6dbb9a65f4001f2878512078394c11301994f3 (diff) | |
| parent | 0cdbb4b09a0658b72c563638d476113aadd91afb (diff) | |
Merge branch 'bpf-lookup-devmap'
Toke Høiland-Jørgensen says:
====================
When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
program cannot currently know whether the redirect will succeed, which makes it
impossible to gracefully handle errors. To properly fix this will probably
require deeper changes to the way TX resources are allocated, but one thing that
is fairly straightforward to fix is to allow lookups into devmaps, so programs
can at least know when a redirect is *guaranteed* to fail because there is no
entry in the map. Currently, programs work around this by keeping a shadow map
of another type which indicates whether a map index is valid.
This series contains two changes that are complementary ways to fix this issue:
- Moving the map lookup into the bpf_redirect_map() helper (and caching the
  result), so the helper can return an error if no value is found in the map.
  This includes a refactoring of the devmap and cpumap code to not care about
  the index on enqueue.
- Allowing regular lookups into devmaps from eBPF programs, using the read-only
  flag to make sure they don't change the values.
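The first change can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the `model_*` names, the fixed-size array standing in for a devmap, and the bounds check are hypothetical. The `tgt_index` and `tgt_value` fields come from the `struct bpf_redirect_info` hunk in this merge, and the handling of `flags` follows the v3 and v4 changelog entries (the flags value doubles as the return code on lookup failure and is compared directly to XDP_TX).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XDP action codes, with the same values as enum xdp_action in the UAPI. */
enum model_xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

#define MODEL_MAP_SIZE 4	/* hypothetical map size */

/* Hypothetical stand-in for a devmap: a NULL slot means "no entry". */
struct model_devmap {
	void *slots[MODEL_MAP_SIZE];
};

/* Field names follow the struct bpf_redirect_info hunk in this merge. */
struct model_redirect_info {
	uint32_t flags;
	uint32_t tgt_index;
	void *tgt_value;
	struct model_devmap *map;
};

/* Model of a redirect helper that does the lookup itself and caches it. */
static int model_redirect_map(struct model_redirect_info *ri,
			      struct model_devmap *map,
			      uint32_t index, uint64_t flags)
{
	assert(ri && map);

	/* Only action codes up to XDP_TX are accepted as a fallback. */
	if (flags > XDP_TX)
		return XDP_ABORTED;

	ri->tgt_value = index < MODEL_MAP_SIZE ? map->slots[index] : NULL;
	if (!ri->tgt_value) {
		/* Clear the cached map so a failed lookup leaves no stale target. */
		ri->map = NULL;
		return (int)flags;
	}

	/* Cache the result for the later redirect step. */
	ri->flags = (uint32_t)flags;
	ri->tgt_index = index;
	ri->map = map;
	return XDP_REDIRECT;
}
```

In this model, a caller that passes XDP_PASS as the flags value gets XDP_PASS back when the index has no entry, so the program can fall back to the regular stack instead of issuing a redirect that is guaranteed to fail.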
The performance impact of the series is negligible, in the sense that I cannot
measure it because the variance between test runs is higher than the difference
pre/post series.
Changelog:
v6:
- Factor out list handling in maps to a helper in list.h (new patch 1)
- Rename variables in struct bpf_redirect_info (new patch 3 + patch 4)
- Explain why we are clearing out the map in the info struct on lookup failure
- Remove unneeded check for forwarding target in tracepoint macro
v5:
- Rebase on latest bpf-next.
- Update documentation for bpf_redirect_map() with the new meaning of flags.
v4:
- Fix a few nits from Andrii
- Lose the #defines in bpf.h and just compare the flags argument directly to
  XDP_TX in bpf_xdp_redirect_map().
v3:
- Adopt Jonathan's idea of using the lower two bits of the flag value as the
  return code.
- Always do the lookup, and cache the result for use in xdp_do_redirect(); to
  achieve this, refactor the devmap and cpumap code to get rid of the bitmap for
  selecting which devices to flush.
v2:
- For patch 1, make it clear that the change works for any map type.
- For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return
  value read-only.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/filter.h | 3 |
| -rw-r--r-- | include/linux/list.h | 14 |
2 files changed, 16 insertions, 1 deletions
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 340f7d648974..1fe53e78c7e3 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -578,8 +578,9 @@ struct bpf_skb_data_end {
 };
 
 struct bpf_redirect_info {
-	u32 ifindex;
 	u32 flags;
+	u32 tgt_index;
+	void *tgt_value;
 	struct bpf_map *map;
 	struct bpf_map *map_to_flush;
 	u32 kern_flags;
diff --git a/include/linux/list.h b/include/linux/list.h
index e951228db4b2..85c92555e31f 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -106,6 +106,20 @@ static inline void __list_del(struct list_head * prev, struct list_head * next)
 	WRITE_ONCE(prev->next, next);
 }
 
+/*
+ * Delete a list entry and clear the 'prev' pointer.
+ *
+ * This is a special-purpose list clearing method used in the networking code
+ * for lists allocated as per-cpu, where we don't want to incur the extra
+ * WRITE_ONCE() overhead of a regular list_del_init(). The code that uses this
+ * needs to check the node 'prev' pointer instead of calling list_empty().
+ */
+static inline void __list_del_clearprev(struct list_head *entry)
+{
+	__list_del(entry->prev, entry->next);
+	entry->prev = NULL;
+}
+
 /**
  * list_del - deletes entry from list.
  * @entry: the element to delete from the list.
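The comment on `__list_del_clearprev()` says callers must test the node's 'prev' pointer rather than call list_empty(). The userspace sketch below shows that calling convention. It is a simplified model, not the kernel header: `struct model_list_head` and all `model_*` helpers are hypothetical stand-ins, and the kernel's WRITE_ONCE() is replaced by a plain store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct model_list_head {
	struct model_list_head *next, *prev;
};

/* An empty list head points at itself in both directions. */
static void model_list_init(struct model_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' right after 'head'. */
static void model_list_add(struct model_list_head *entry,
			   struct model_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Same steps as __list_del_clearprev() in the diff: unlink, then clear 'prev'. */
static void model_list_del_clearprev(struct model_list_head *entry)
{
	assert(entry->prev && entry->next);	/* entry must be on a list */
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->prev = NULL;
}

/* Hypothetical membership test: a node is queued iff 'prev' is non-NULL. */
static bool model_node_queued(const struct model_list_head *entry)
{
	return entry->prev != NULL;
}
```

After `model_list_del_clearprev()`, the node's 'next' pointer is left stale, which is why a list_empty() check on the node would not be reliable and only the 'prev' test is meaningful.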
