From 6e399cd144d8500ffb5d40fa6848890e2580a80a Mon Sep 17 00:00:00 2001
From: Davidlohr Bueso
Date: Thu, 16 Apr 2015 12:47:59 -0700
Subject: prctl: avoid using mmap_sem for exe_file serialization

Oleg cleverly suggested using xchg() to set the new mm->exe_file instead
of calling set_mm_exe_file(), which requires some form of serialization --
mmap_sem in this case. For archs that do not have atomic rmw instructions
we still fall back to a spinlock alternative, so this should always be
safe. As such, we only need the mmap_sem for looking up the backing
vm_file, which can be done while sharing the lock.

Naturally, this means we need to manually deal with both the new and the
old file reference counting, and we need not worry about the
MMF_EXE_FILE_CHANGED bits, which can probably be deleted in the future
anyway.

Signed-off-by: Davidlohr Bueso
Suggested-by: Oleg Nesterov
Acked-by: Oleg Nesterov
Reviewed-by: Konstantin Khlebnikov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 fs/exec.c | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'fs/exec.c')

diff --git a/fs/exec.c b/fs/exec.c
index c7f9b733406d..a5fef835ebc5 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1078,7 +1078,13 @@ int flush_old_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;
 
+	/*
+	 * Must be called _before_ exec_mmap() as bprm->mm is
+	 * not visible until then. This also enables the update
+	 * to be lockless.
+	 */
 	set_mm_exe_file(bprm->mm, bprm->file);
+
 	/*
 	 * Release all of the old mmap stuff
 	 */
--
cgit v1.2.3


From dfcce791fb0ad06f3f0b745a23160b9d8858fe39 Mon Sep 17 00:00:00 2001
From: Kirill Tkhai
Date: Thu, 16 Apr 2015 12:48:01 -0700
Subject: fs/exec.c:de_thread: move notify_count write under lock

We set sig->notify_count = -1 between RELEASE and ACQUIRE operations:

	spin_unlock_irq(lock);
	...
	if (!thread_group_leader(tsk)) {
		...
		for (;;) {
			sig->notify_count = -1;
			write_lock_irq(&tasklist_lock);

There are no ordering restrictions on this store, so other processors may
see it mixed with other STOREs in either of the regions delimited by the
spinlocks. In particular, it may be reordered with the earlier

	sig->group_exit_task = tsk;
	sig->notify_count = zap_other_threads(tsk);

in some way.

Set it with tasklist_lock held to be sure nothing will be reordered.

Signed-off-by: Kirill Tkhai
Acked-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 fs/exec.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

(limited to 'fs/exec.c')

diff --git a/fs/exec.c b/fs/exec.c
index a5fef835ebc5..02bfd980a40c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -926,10 +926,14 @@ static int de_thread(struct task_struct *tsk)
 	if (!thread_group_leader(tsk)) {
 		struct task_struct *leader = tsk->group_leader;
 
-		sig->notify_count = -1;	/* for exit_notify() */
 		for (;;) {
 			threadgroup_change_begin(tsk);
 			write_lock_irq(&tasklist_lock);
+			/*
+			 * Do this under tasklist_lock to ensure that
+			 * exit_notify() can't miss ->group_exit_task
+			 */
+			sig->notify_count = -1;
 			if (likely(leader->exit_state))
 				break;
 			__set_current_state(TASK_KILLABLE);
--
cgit v1.2.3
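
For context on the first patch: the prctl side of the change lives in
kernel/sys.c and is not shown above because the listing is limited to
fs/exec.c. The core idea is to publish the new mm->exe_file with an
atomic xchg() and handle the displaced reference manually. A minimal
sketch of that idea, where swap_exe_file() is a hypothetical helper name
(not the verbatim kernel/sys.c code), looks roughly like this:

	/*
	 * Hypothetical sketch of the lockless update: take a reference
	 * on the new file, publish it atomically, then drop whatever
	 * reference was displaced. On archs without atomic rmw
	 * instructions, xchg() falls back to a spinlocked variant, so
	 * no mmap_sem is needed for the store itself.
	 */
	static void swap_exe_file(struct mm_struct *mm, struct file *new_exe)
	{
		struct file *old_exe;

		get_file(new_exe);	/* caller passes a valid file; hold our ref first */
		old_exe = xchg(&mm->exe_file, new_exe);	/* atomic publish */
		if (old_exe)
			fput(old_exe);	/* drop the reference we displaced */
	}

mmap_sem is then only taken shared, to walk the vmas and look up the
backing vm_file that validates the descriptor being installed.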
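
The second patch's reasoning, illustrated (this is a simplified picture,
not kernel code): spin_unlock_irq() is only a RELEASE, so it orders the
accesses that came before it but lets later ones float up; write_lock_irq()
is only an ACQUIRE, so it orders the accesses that come after it but lets
earlier ones sink down. A plain store issued between the two is therefore
free to migrate into either critical section on a weakly ordered CPU:

	spin_unlock_irq(lock);		/* RELEASE: constrains only prior accesses   */
	sig->notify_count = -1;		/* unconstrained: may be observed early/late */
	write_lock_irq(&tasklist_lock);	/* ACQUIRE: constrains only later accesses   */

Moving the store after write_lock_irq() pins it inside the tasklist_lock
critical section, so exit_notify(), which runs under the same lock, sees
notify_count and group_exit_task change together rather than in some
arbitrary interleaving.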