From 37fdffb217a45609edccbb8b407d031143f551c0 Mon Sep 17 00:00:00 2001 From: Tariq Toukan Date: Tue, 21 Aug 2018 14:41:41 +0300 Subject: net/mlx5: WQ, fixes for fragmented WQ buffers API mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API. Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue. This obsoletes the following API functions, and their buggy usage in EN driver: * mlx5_wq_cyc_get_frag_size() * mlx5_wq_cyc_ctr2fragix() The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers. New calculation is also more efficient, and improves performance as follows: Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size. Before: 16,477,619 pps After: 17,085,793 pps 3.7% improvement Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan Reviewed-by: Eran Ben Elisha Signed-off-by: Saeed Mahameed --- include/linux/mlx5/driver.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include/linux') diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 66d94b4557cf..88a041b73abf 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1032,6 +1032,14 @@ static inline void *mlx5_frag_buf_get_wqe(struct mlx5_frag_buf_ctrl *fbc, ((fbc->frag_sz_m1 & ix) << fbc->log_stride); } +static inline u32 +mlx5_frag_buf_get_idx_last_contig_stride(struct mlx5_frag_buf_ctrl *fbc, u32 ix) +{ + u32 last_frag_stride_idx = (ix + fbc->strides_offset) | fbc->frag_sz_m1; + + return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1); +} + int mlx5_cmd_init(struct mlx5_core_dev *dev); void mlx5_cmd_cleanup(struct mlx5_core_dev *dev); void mlx5_cmd_use_events(struct mlx5_core_dev *dev); -- cgit v1.2.3 From ca2b497253ad01c80061a1f3ee9eb91b5d54a849 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Fri, 5 Oct 2018 13:24:36 +0100 Subject: arm64: perf: Reject stand-alone CHAIN events for PMUv3 It doesn't make sense for a perf event to be configured as a CHAIN event in isolation, so extend the arm_pmu structure with a ->filter_match() function to allow the backend PMU implementation to reject CHAIN events early. Cc: Reviewed-by: Suzuki K Poulose Signed-off-by: Will Deacon --- include/linux/perf/arm_pmu.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 10f92e1d8e7b..bf309ff6f244 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -99,6 +99,7 @@ struct arm_pmu { void (*stop)(struct arm_pmu *); void (*reset)(void *); int (*map_event)(struct perf_event *event); + int (*filter_match)(struct perf_event *event); int num_events; bool secure_access; /* 32-bit ARM only */ #define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40 -- cgit v1.2.3 From 9c0be3f6b5d776dfe3ed249862c244a4486414dc Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Sat, 13 Oct 2018 15:10:50 -0400 Subject: tracepoint: Fix tracepoint array element size mismatch commit 46e0c9be206f ("kernel: tracepoints: add support for relative references") changes the layout of the __tracepoint_ptrs section on architectures supporting relative references. However, it does so without turning struct tracepoint * const into const int elsewhere in the tracepoint code, which has the following side-effect: Setting mod->num_tracepoints is done in by module.c: mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs", sizeof(*mod->tracepoints_ptrs), &mod->num_tracepoints); Basically, since sizeof(*mod->tracepoints_ptrs) is a pointer size (rather than sizeof(int)), num_tracepoints is erroneously set to half the size it should be on 64-bit arch. So a module with an odd number of tracepoints misses the last tracepoint due to effect of integer division. So in the module going notifier: for_each_tracepoint_range(mod->tracepoints_ptrs, mod->tracepoints_ptrs + mod->num_tracepoints, tp_module_going_check_quiescent, NULL); the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually evaluates to something within the bounds of the array, but miss the last tracepoint if the number of tracepoints is odd on 64-bit arch. Fix this by introducing a new typedef: tracepoint_ptr_t, which is either "const int" on architectures that have PREL32 relocations, or "struct tracepoint * const" on architectures that does not have this feature. Also provide a new tracepoint_ptr_defer() static inline to encapsulate deferencing this type rather than duplicate code and ugly idefs within the for_each_tracepoint_range() implementation. This issue appears in 4.19-rc kernels, and should ideally be fixed before the end of the rc cycle. Acked-by: Ard Biesheuvel Acked-by: Jessica Yu Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org Cc: Michael Ellerman Cc: Ingo Molnar Cc: Ard Biesheuvel Cc: Arnd Bergmann Cc: Benjamin Herrenschmidt Cc: Bjorn Helgaas Cc: Catalin Marinas Cc: James Morris Cc: James Morris Cc: Josh Poimboeuf Cc: Kees Cook Cc: Nicolas Pitre Cc: Paul Mackerras Cc: Petr Mladek Cc: Russell King Cc: "Serge E. Hallyn" Cc: Sergey Senozhatsky Cc: Thomas Garnier Cc: Thomas Gleixner Cc: Will Deacon Cc: Andrew Morton Cc: Linus Torvalds Cc: Greg Kroah-Hartman Signed-off-by: Mathieu Desnoyers Signed-off-by: Steven Rostedt (VMware) --- include/linux/module.h | 3 ++- include/linux/tracepoint-defs.h | 6 ++++++ include/linux/tracepoint.h | 36 +++++++++++++++++++++++------------- 3 files changed, 31 insertions(+), 14 deletions(-) (limited to 'include/linux') diff --git a/include/linux/module.h b/include/linux/module.h index f807f15bebbe..e19ae08c7fb8 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -430,7 +431,7 @@ struct module { #ifdef CONFIG_TRACEPOINTS unsigned int num_tracepoints; - struct tracepoint * const *tracepoints_ptrs; + tracepoint_ptr_t *tracepoints_ptrs; #endif #ifdef HAVE_JUMP_LABEL struct jump_entry *jump_entries; diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h index 22c5a46e9693..49ba9cde7e4b 100644 --- a/include/linux/tracepoint-defs.h +++ b/include/linux/tracepoint-defs.h @@ -35,6 +35,12 @@ struct tracepoint { struct tracepoint_func __rcu *funcs; }; +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +typedef const int tracepoint_ptr_t; +#else +typedef struct tracepoint * const tracepoint_ptr_t; +#endif + struct bpf_raw_event_map { struct tracepoint *tp; void *bpf_func; diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 041f7e56a289..538ba1a58f5b 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -99,6 +99,29 @@ extern void syscall_unregfunc(void); #define TRACE_DEFINE_ENUM(x) #define TRACE_DEFINE_SIZEOF(x) +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) +{ + return offset_to_ptr(p); +} + +#define __TRACEPOINT_ENTRY(name) \ + asm(" .section \"__tracepoints_ptrs\", \"a\" \n" \ + " .balign 4 \n" \ + " .long __tracepoint_" #name " - . \n" \ + " .previous \n") +#else +static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) +{ + return *p; +} + +#define __TRACEPOINT_ENTRY(name) \ + static tracepoint_ptr_t __tracepoint_ptr_##name __used \ + __attribute__((section("__tracepoints_ptrs"))) = \ + &__tracepoint_##name +#endif + #endif /* _LINUX_TRACEPOINT_H */ /* @@ -253,19 +276,6 @@ extern void syscall_unregfunc(void); return static_key_false(&__tracepoint_##name.key); \ } -#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS -#define __TRACEPOINT_ENTRY(name) \ - asm(" .section \"__tracepoints_ptrs\", \"a\" \n" \ - " .balign 4 \n" \ - " .long __tracepoint_" #name " - . \n" \ - " .previous \n") -#else -#define __TRACEPOINT_ENTRY(name) \ - static struct tracepoint * const __tracepoint_ptr_##name __used \ - __attribute__((section("__tracepoints_ptrs"))) = \ - &__tracepoint_##name -#endif - /* * We have no guarantee that gcc and the linker won't up-align the tracepoint * structures, so we create an array of pointers that will be used for iteration -- cgit v1.2.3 From eb66ae030829605d61fbef1909ce310e29f78821 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Fri, 12 Oct 2018 15:22:59 -0700 Subject: mremap: properly flush TLB before releasing the page Jann Horn points out that our TLB flushing was subtly wrong for the mremap() case. What makes mremap() special is that we don't follow the usual "add page to list of pages to be freed, then flush tlb, and then free pages". No, mremap() obviously just _moves_ the page from one page table location to another. That matters, because mremap() thus doesn't directly control the lifetime of the moved page with a freelist: instead, the lifetime of the page is controlled by the page table locking, that serializes access to the entry. As a result, we need to flush the TLB not just before releasing the lock for the source location (to avoid any concurrent accesses to the entry), but also before we release the destination page table lock (to avoid the TLB being flushed after somebody else has already done something to that page). This also makes the whole "need_flush" logic unnecessary, since we now always end up flushing the TLB for every valid entry. Reported-and-tested-by: Jann Horn Acked-by: Will Deacon Tested-by: Ingo Molnar Acked-by: Peter Zijlstra (Intel) Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- include/linux/huge_mm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/linux') diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 99c19b06d9a4..fdcb45999b26 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -43,7 +43,7 @@ extern int mincore_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned char *vec); extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, unsigned long old_end, - pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush); + pmd_t *old_pmd, pmd_t *new_pmd); extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, pgprot_t newprot, int prot_numa); -- cgit v1.2.3