From 69b27baf00fa9b7b14b3263c105390d1683425b2 Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Fri, 25 Mar 2016 14:20:21 -0700 Subject: sched: add schedule_timeout_idle() This will be needed in the patch "mm, oom: introduce oom reaper". Acked-by: Michal Hocko Cc: Ingo Molnar Cc: Peter Zijlstra Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/sched.h b/include/linux/sched.h index 589c4780b077..478b41de7f7d 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -426,6 +426,7 @@ extern signed long schedule_timeout(signed long timeout); extern signed long schedule_timeout_interruptible(signed long timeout); extern signed long schedule_timeout_killable(signed long timeout); extern signed long schedule_timeout_uninterruptible(signed long timeout); +extern signed long schedule_timeout_idle(signed long timeout); asmlinkage void schedule(void); extern void schedule_preempt_disabled(void); -- cgit v1.2.3 From aac453635549699c13a84ea1456d5b0e574ef855 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Fri, 25 Mar 2016 14:20:24 -0700 Subject: mm, oom: introduce oom reaper This patch (of 5): This is based on the idea from Mel Gorman discussed during LSFMM 2015 and independently brought up by Oleg Nesterov. The OOM killer currently allows to kill only a single task in a good hope that the task will terminate in a reasonable time and frees up its memory. Such a task (oom victim) will get an access to memory reserves via mark_oom_victim to allow a forward progress should there be a need for additional memory during exit path. It has been shown (e.g. by Tetsuo Handa) that it is not that hard to construct workloads which break the core assumption mentioned above and the OOM victim might take unbounded amount of time to exit because it might be blocked in the uninterruptible state waiting for an event (e.g. lock) which is blocked by another task looping in the page allocator. This patch reduces the probability of such a lockup by introducing a specialized kernel thread (oom_reaper) which tries to reclaim additional memory by preemptively reaping the anonymous or swapped out memory owned by the oom victim under an assumption that such a memory won't be needed when its owner is killed and kicked from the userspace anyway. There is one notable exception to this, though, if the OOM victim was in the process of coredumping the result would be incomplete. This is considered a reasonable constrain because the overall system health is more important than debugability of a particular application. A kernel thread has been chosen because we need a reliable way of invocation so workqueue context is not appropriate because all the workers might be busy (e.g. allocating memory). Kswapd which sounds like another good fit is not appropriate as well because it might get blocked on locks during reclaim as well. oom_reaper has to take mmap_sem on the target task for reading so the solution is not 100% because the semaphore might be held or blocked for write but the probability is reduced considerably wrt. basically any lock blocking forward progress as described above. In order to prevent from blocking on the lock without any forward progress we are using only a trylock and retry 10 times with a short sleep in between. Users of mmap_sem which need it for write should be carefully reviewed to use _killable waiting as much as possible and reduce allocations requests done with the lock held to absolute minimum to reduce the risk even further. The API between oom killer and oom reaper is quite trivial. wake_oom_reaper updates mm_to_reap with cmpxchg to guarantee only NULL->mm transition and oom_reaper clear this atomically once it is done with the work. This means that only a single mm_struct can be reaped at the time. As the operation is potentially disruptive we are trying to limit it to the ncessary minimum and the reaper blocks any updates while it operates on an mm. mm_struct is pinned by mm_count to allow parallel exit_mmap and a race is detected by atomic_inc_not_zero(mm_users). Signed-off-by: Michal Hocko Suggested-by: Oleg Nesterov Suggested-by: Mel Gorman Acked-by: Mel Gorman Acked-by: David Rientjes Cc: Mel Gorman Cc: Tetsuo Handa Cc: Oleg Nesterov Cc: Hugh Dickins Cc: Andrea Argangeli Cc: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/mm.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/linux') diff --git a/include/linux/mm.h b/include/linux/mm.h index 450fc977ed02..ed6407d1b7b5 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1132,6 +1132,8 @@ struct zap_details { struct address_space *check_mapping; /* Check page->mapping if set */ pgoff_t first_index; /* Lowest page->index to unmap */ pgoff_t last_index; /* Highest page->index to unmap */ + bool ignore_dirty; /* Ignore dirty pages */ + bool check_swap_entries; /* Check also swap entries */ }; struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, -- cgit v1.2.3 From 36324a990cf578b57828c04cd85ac62cd25cf5a4 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Fri, 25 Mar 2016 14:20:27 -0700 Subject: oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space When oom_reaper manages to unmap all the eligible vmas there shouldn't be much of the freable memory held by the oom victim left anymore so it makes sense to clear the TIF_MEMDIE flag for the victim and allow the OOM killer to select another task. The lack of TIF_MEMDIE also means that the victim cannot access memory reserves anymore but that shouldn't be a problem because it would get the access again if it needs to allocate and hits the OOM killer again due to the fatal_signal_pending resp. PF_EXITING check. We can safely hide the task from the OOM killer because it is clearly not a good candidate anymore as everyhing reclaimable has been torn down already. This patch will allow to cap the time an OOM victim can keep TIF_MEMDIE and thus hold off further global OOM killer actions granted the oom reaper is able to take mmap_sem for the associated mm struct. This is not guaranteed now but further steps should make sure that mmap_sem for write should be blocked killable which will help to reduce such a lock contention. This is not done by this patch. Note that exit_oom_victim might be called on a remote task from __oom_reap_task now so we have to check and clear the flag atomically otherwise we might race and underflow oom_victims or wake up waiters too early. Signed-off-by: Michal Hocko Suggested-by: Johannes Weiner Suggested-by: Tetsuo Handa Cc: Andrea Argangeli Cc: David Rientjes Cc: Hugh Dickins Cc: Mel Gorman Cc: Oleg Nesterov Cc: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/oom.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/linux') diff --git a/include/linux/oom.h b/include/linux/oom.h index 03e6257321f0..45993b840ed6 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -91,7 +91,7 @@ extern enum oom_scan_t oom_scan_process_thread(struct oom_control *oc, extern bool out_of_memory(struct oom_control *oc); -extern void exit_oom_victim(void); +extern void exit_oom_victim(struct task_struct *tsk); extern int register_oom_notifier(struct notifier_block *nb); extern int unregister_oom_notifier(struct notifier_block *nb); -- cgit v1.2.3 From 03049269de433cb5fe2859be9ae4469ceb1163ed Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Fri, 25 Mar 2016 14:20:33 -0700 Subject: mm, oom_reaper: implement OOM victims queuing wake_oom_reaper has allowed only 1 oom victim to be queued. The main reason for that was the simplicity as other solutions would require some way of queuing. The current approach is racy and that was deemed sufficient as the oom_reaper is considered a best effort approach to help with oom handling when the OOM victim cannot terminate in a reasonable time. The race could lead to missing an oom victim which can get stuck out_of_memory wake_oom_reaper cmpxchg // OK oom_reaper oom_reap_task __oom_reap_task oom_victim terminates atomic_inc_not_zero // fail out_of_memory wake_oom_reaper cmpxchg // fails task_to_reap = NULL This race requires 2 OOM invocations in a short time period which is not very likely but certainly not impossible. E.g. the original victim might have not released a lot of memory for some reason. The situation would improve considerably if wake_oom_reaper used a more robust queuing. This is what this patch implements. This means adding oom_reaper_list list_head into task_struct (eat a hole before embeded thread_struct for that purpose) and a oom_reaper_lock spinlock for queuing synchronization. wake_oom_reaper will then add the task on the queue and oom_reaper will dequeue it. Signed-off-by: Michal Hocko Cc: Vladimir Davydov Cc: Andrea Argangeli Cc: David Rientjes Cc: Hugh Dickins Cc: Johannes Weiner Cc: Mel Gorman Cc: Oleg Nesterov Cc: Rik van Riel Cc: Tetsuo Handa Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'include/linux') diff --git a/include/linux/sched.h b/include/linux/sched.h index 478b41de7f7d..788f223f8f8f 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1849,6 +1849,9 @@ struct task_struct { unsigned long task_state_change; #endif int pagefault_disabled; +#ifdef CONFIG_MMU + struct list_head oom_reaper_list; +#endif /* CPU-specific state of this task */ struct thread_struct thread; /* -- cgit v1.2.3 From 855b018325737f7691f9b7d86339df40aa4e47c3 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Fri, 25 Mar 2016 14:20:36 -0700 Subject: oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task Tetsuo has reported that oom_kill_allocating_task=1 will cause oom_reaper_list corruption because oom_kill_process doesn't follow standard OOM exclusion (aka ignores TIF_MEMDIE) and allows to enqueue the same task multiple times - e.g. by sacrificing the same child multiple times. This patch fixes the issue by introducing a new MMF_OOM_KILLED mm flag which is set in oom_kill_process atomically and oom reaper is disabled if the flag was already set. Signed-off-by: Michal Hocko Reported-by: Tetsuo Handa Cc: David Rientjes Cc: Mel Gorman Cc: Oleg Nesterov Cc: Hugh Dickins Cc: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/linux') diff --git a/include/linux/sched.h b/include/linux/sched.h index 788f223f8f8f..c2d2d7c5d463 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -512,6 +512,8 @@ static inline int get_dumpable(struct mm_struct *mm) #define MMF_HAS_UPROBES 19 /* has uprobes */ #define MMF_RECALC_UPROBES 20 /* MMF_HAS_UPROBES can be wrong */ +#define MMF_OOM_KILLED 21 /* OOM killer has chosen this mm */ + #define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK) struct sighand_struct { -- cgit v1.2.3 From 29c696e1c6eceb5db6b21f0c89495fcfcd40c0eb Mon Sep 17 00:00:00 2001 From: Vladimir Davydov Date: Fri, 25 Mar 2016 14:20:39 -0700 Subject: oom: make oom_reaper_list single linked Entries are only added/removed from oom_reaper_list at head so we can use a single linked list and hence save a word in task_struct. Signed-off-by: Vladimir Davydov Signed-off-by: Michal Hocko Cc: Tetsuo Handa Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/linux') diff --git a/include/linux/sched.h b/include/linux/sched.h index c2d2d7c5d463..49b1febcf7c3 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1852,7 +1852,7 @@ struct task_struct { #endif int pagefault_disabled; #ifdef CONFIG_MMU - struct list_head oom_reaper_list; + struct task_struct *oom_reaper_list; #endif /* CPU-specific state of this task */ struct thread_struct thread; -- cgit v1.2.3 From bb29902a7515208846114b3b36a4281a9bbf766a Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Fri, 25 Mar 2016 14:20:44 -0700 Subject: oom, oom_reaper: protect oom_reaper_list using simpler way "oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task" tried to protect oom_reaper_list using MMF_OOM_KILLED flag. But we can do it by simply checking tsk->oom_reaper_list != NULL. Signed-off-by: Tetsuo Handa Signed-off-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 2 -- 1 file changed, 2 deletions(-) (limited to 'include/linux') diff --git a/include/linux/sched.h b/include/linux/sched.h index 49b1febcf7c3..60bba7e032dc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -512,8 +512,6 @@ static inline int get_dumpable(struct mm_struct *mm) #define MMF_HAS_UPROBES 19 /* has uprobes */ #define MMF_RECALC_UPROBES 20 /* MMF_HAS_UPROBES can be wrong */ -#define MMF_OOM_KILLED 21 /* OOM killer has chosen this mm */ - #define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK) struct sighand_struct { -- cgit v1.2.3 From aaf4fb712b8311d8b950e89937479d61e9c25ba8 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Fri, 25 Mar 2016 14:21:53 -0700 Subject: include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill() A leftover from commit c32b3cbe0d06 ("oom, PM: make OOM detection in the freezer path raceless"). Signed-off-by: Tetsuo Handa Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/oom.h | 2 -- 1 file changed, 2 deletions(-) (limited to 'include/linux') diff --git a/include/linux/oom.h b/include/linux/oom.h index 45993b840ed6..628a43242a34 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -76,8 +76,6 @@ extern unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg, const nodemask_t *nodemask, unsigned long totalpages); -extern int oom_kills_count(void); -extern void note_oom_kill(void); extern void oom_kill_process(struct oom_control *oc, struct task_struct *p, unsigned int points, unsigned long totalpages, struct mem_cgroup *memcg, const char *message); -- cgit v1.2.3 From 7ed2f9e663854db313f177a511145630e398b402 Mon Sep 17 00:00:00 2001 From: Alexander Potapenko Date: Fri, 25 Mar 2016 14:21:59 -0700 Subject: mm, kasan: SLAB support Add KASAN hooks to SLAB allocator. This patch is based on the "mm: kasan: unified support for SLUB and SLAB allocators" patch originally prepared by Dmitry Chernenkov. Signed-off-by: Alexander Potapenko Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrey Konovalov Cc: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Steven Rostedt Cc: Konstantin Serebryany Cc: Dmitry Chernenkov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/kasan.h | 12 ++++++++++++ include/linux/slab.h | 6 ++++++ include/linux/slab_def.h | 14 ++++++++++++++ include/linux/slub_def.h | 11 +++++++++++ 4 files changed, 43 insertions(+) (limited to 'include/linux') diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 0fdc798e3ff7..839f2007a0f9 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -48,6 +48,9 @@ void kasan_unpoison_task_stack(struct task_struct *task); void kasan_alloc_pages(struct page *page, unsigned int order); void kasan_free_pages(struct page *page, unsigned int order); +void kasan_cache_create(struct kmem_cache *cache, size_t *size, + unsigned long *flags); + void kasan_poison_slab(struct page *page); void kasan_unpoison_object_data(struct kmem_cache *cache, void *object); void kasan_poison_object_data(struct kmem_cache *cache, void *object); @@ -61,6 +64,11 @@ void kasan_krealloc(const void *object, size_t new_size); void kasan_slab_alloc(struct kmem_cache *s, void *object); void kasan_slab_free(struct kmem_cache *s, void *object); +struct kasan_cache { + int alloc_meta_offset; + int free_meta_offset; +}; + int kasan_module_alloc(void *addr, size_t size); void kasan_free_shadow(const struct vm_struct *vm); @@ -76,6 +84,10 @@ static inline void kasan_disable_current(void) {} static inline void kasan_alloc_pages(struct page *page, unsigned int order) {} static inline void kasan_free_pages(struct page *page, unsigned int order) {} +static inline void kasan_cache_create(struct kmem_cache *cache, + size_t *size, + unsigned long *flags) {} + static inline void kasan_poison_slab(struct page *page) {} static inline void kasan_unpoison_object_data(struct kmem_cache *cache, void *object) {} diff --git a/include/linux/slab.h b/include/linux/slab.h index e4b568738ca3..aa61595a1482 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -92,6 +92,12 @@ # define SLAB_ACCOUNT 0x00000000UL #endif +#ifdef CONFIG_KASAN +#define SLAB_KASAN 0x08000000UL +#else +#define SLAB_KASAN 0x00000000UL +#endif + /* The following flags affect the page allocator grouping pages by mobility */ #define SLAB_RECLAIM_ACCOUNT 0x00020000UL /* Objects are reclaimable */ #define SLAB_TEMPORARY SLAB_RECLAIM_ACCOUNT /* Objects are short-lived */ diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h index e878ba35ae91..9edbbf352340 100644 --- a/include/linux/slab_def.h +++ b/include/linux/slab_def.h @@ -76,8 +76,22 @@ struct kmem_cache { #ifdef CONFIG_MEMCG struct memcg_cache_params memcg_params; #endif +#ifdef CONFIG_KASAN + struct kasan_cache kasan_info; +#endif struct kmem_cache_node *node[MAX_NUMNODES]; }; +static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, + void *x) { + void *object = x - (x - page->s_mem) % cache->size; + void *last_object = page->s_mem + (cache->num - 1) * cache->size; + + if (unlikely(object > last_object)) + return last_object; + else + return object; +} + #endif /* _LINUX_SLAB_DEF_H */ diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index ac5143f95ee6..665cd0cd18b8 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -130,4 +130,15 @@ static inline void *virt_to_obj(struct kmem_cache *s, void object_err(struct kmem_cache *s, struct page *page, u8 *object, char *reason); +static inline void *nearest_obj(struct kmem_cache *cache, struct page *page, + void *x) { + void *object = x - (x - page_address(page)) % cache->size; + void *last_object = page_address(page) + + (page->objects - 1) * cache->size; + if (unlikely(object > last_object)) + return last_object; + else + return object; +} + #endif /* _LINUX_SLUB_DEF_H */ -- cgit v1.2.3 From 505f5dcb1c419e55a9621a01f83eb5745d8d7398 Mon Sep 17 00:00:00 2001 From: Alexander Potapenko Date: Fri, 25 Mar 2016 14:22:02 -0700 Subject: mm, kasan: add GFP flags to KASAN API Add GFP flags to KASAN hooks for future patches to use. This patch is based on the "mm: kasan: unified support for SLUB and SLAB allocators" patch originally prepared by Dmitry Chernenkov. Signed-off-by: Alexander Potapenko Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrey Konovalov Cc: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Steven Rostedt Cc: Konstantin Serebryany Cc: Dmitry Chernenkov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/kasan.h | 19 +++++++++++-------- include/linux/slab.h | 4 ++-- 2 files changed, 13 insertions(+), 10 deletions(-) (limited to 'include/linux') diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 839f2007a0f9..737371b56044 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -55,13 +55,14 @@ void kasan_poison_slab(struct page *page); void kasan_unpoison_object_data(struct kmem_cache *cache, void *object); void kasan_poison_object_data(struct kmem_cache *cache, void *object); -void kasan_kmalloc_large(const void *ptr, size_t size); +void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags); void kasan_kfree_large(const void *ptr); void kasan_kfree(void *ptr); -void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size); -void kasan_krealloc(const void *object, size_t new_size); +void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, + gfp_t flags); +void kasan_krealloc(const void *object, size_t new_size, gfp_t flags); -void kasan_slab_alloc(struct kmem_cache *s, void *object); +void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags); void kasan_slab_free(struct kmem_cache *s, void *object); struct kasan_cache { @@ -94,14 +95,16 @@ static inline void kasan_unpoison_object_data(struct kmem_cache *cache, static inline void kasan_poison_object_data(struct kmem_cache *cache, void *object) {} -static inline void kasan_kmalloc_large(void *ptr, size_t size) {} +static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {} static inline void kasan_kfree_large(const void *ptr) {} static inline void kasan_kfree(void *ptr) {} static inline void kasan_kmalloc(struct kmem_cache *s, const void *object, - size_t size) {} -static inline void kasan_krealloc(const void *object, size_t new_size) {} + size_t size, gfp_t flags) {} +static inline void kasan_krealloc(const void *object, size_t new_size, + gfp_t flags) {} -static inline void kasan_slab_alloc(struct kmem_cache *s, void *object) {} +static inline void kasan_slab_alloc(struct kmem_cache *s, void *object, + gfp_t flags) {} static inline void kasan_slab_free(struct kmem_cache *s, void *object) {} static inline int kasan_module_alloc(void *addr, size_t size) { return 0; } diff --git a/include/linux/slab.h b/include/linux/slab.h index aa61595a1482..508bd827e6dc 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -376,7 +376,7 @@ static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, { void *ret = kmem_cache_alloc(s, flags); - kasan_kmalloc(s, ret, size); + kasan_kmalloc(s, ret, size, flags); return ret; } @@ -387,7 +387,7 @@ kmem_cache_alloc_node_trace(struct kmem_cache *s, { void *ret = kmem_cache_alloc_node(s, gfpflags, node); - kasan_kmalloc(s, ret, size); + kasan_kmalloc(s, ret, size, gfpflags); return ret; } #endif /* CONFIG_TRACING */ -- cgit v1.2.3 From be7635e7287e0e8013af3c89a6354a9e0182594c Mon Sep 17 00:00:00 2001 From: Alexander Potapenko Date: Fri, 25 Mar 2016 14:22:05 -0700 Subject: arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to so that the users don't need to pull in . Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko Acked-by: Steven Rostedt Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrey Konovalov Cc: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Konstantin Serebryany Cc: Dmitry Chernenkov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/ftrace.h | 11 ----------- include/linux/interrupt.h | 20 ++++++++++++++++++++ 2 files changed, 20 insertions(+), 11 deletions(-) (limited to 'include/linux') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 6d9df3f7e334..dea12a6e413b 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -811,16 +811,6 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth, */ #define __notrace_funcgraph notrace -/* - * We want to which function is an entrypoint of a hardirq. - * That will help us to put a signal on output. - */ -#define __irq_entry __attribute__((__section__(".irqentry.text"))) - -/* Limits of hardirq entrypoints */ -extern char __irqentry_text_start[]; -extern char __irqentry_text_end[]; - #define FTRACE_NOTRACE_DEPTH 65536 #define FTRACE_RETFUNC_DEPTH 50 #define FTRACE_RETSTACK_ALLOC_SIZE 32 @@ -857,7 +847,6 @@ static inline void unpause_graph_tracing(void) #else /* !CONFIG_FUNCTION_GRAPH_TRACER */ #define __notrace_funcgraph -#define __irq_entry #define INIT_FTRACE_GRAPH static inline void ftrace_graph_init_task(struct task_struct *t) { } diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index 358076eda364..9fcabeb07787 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -683,4 +683,24 @@ extern int early_irq_init(void); extern int arch_probe_nr_irqs(void); extern int arch_early_irq_init(void); +#if defined(CONFIG_FUNCTION_GRAPH_TRACER) || defined(CONFIG_KASAN) +/* + * We want to know which function is an entrypoint of a hardirq or a softirq. + */ +#define __irq_entry __attribute__((__section__(".irqentry.text"))) +#define __softirq_entry \ + __attribute__((__section__(".softirqentry.text"))) + +/* Limits of hardirq entrypoints */ +extern char __irqentry_text_start[]; +extern char __irqentry_text_end[]; +/* Limits of softirq entrypoints */ +extern char __softirqentry_text_start[]; +extern char __softirqentry_text_end[]; + +#else +#define __irq_entry +#define __softirq_entry +#endif + #endif -- cgit v1.2.3 From cd11016e5f5212c13c0cec7384a525edc93b4921 Mon Sep 17 00:00:00 2001 From: Alexander Potapenko Date: Fri, 25 Mar 2016 14:22:08 -0700 Subject: mm, kasan: stackdepot implementation. Enable stackdepot for SLAB Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot will allow KASAN store allocation/deallocation stack traces for memory chunks. The stack traces are stored in a hash table and referenced by handles which reside in the kasan_alloc_meta and kasan_free_meta structures in the allocated memory chunks. IRQ stack traces are cut below the IRQ entry point to avoid unnecessary duplication. Right now stackdepot support is only enabled in SLAB allocator. Once KASAN features in SLAB are on par with those in SLUB we can switch SLUB to stackdepot as well, thus removing the dependency on SLUB stack bookkeeping, which wastes a lot of memory. This patch is based on the "mm: kasan: stack depots" patch originally prepared by Dmitry Chernenkov. Joonsoo has said that he plans to reuse the stackdepot code for the mm/page_owner.c debugging facility. [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t] [aryabinin@virtuozzo.com: comment style fixes] Signed-off-by: Alexander Potapenko Signed-off-by: Andrey Ryabinin Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrey Konovalov Cc: Dmitry Vyukov Cc: Steven Rostedt Cc: Konstantin Serebryany Cc: Dmitry Chernenkov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/stackdepot.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) create mode 100644 include/linux/stackdepot.h (limited to 'include/linux') diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h new file mode 100644 index 000000000000..7978b3e2c1e1 --- /dev/null +++ b/include/linux/stackdepot.h @@ -0,0 +1,32 @@ +/* + * A generic stack depot implementation + * + * Author: Alexander Potapenko + * Copyright (C) 2016 Google, Inc. + * + * Based on code by Dmitry Chernenkov. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_STACKDEPOT_H +#define _LINUX_STACKDEPOT_H + +typedef u32 depot_stack_handle_t; + +struct stack_trace; + +depot_stack_handle_t depot_save_stack(struct stack_trace *trace, gfp_t flags); + +void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace); + +#endif -- cgit v1.2.3