From 4585fc59c0e813188d6a4c5de1f6976fce461fc2 Mon Sep 17 00:00:00 2001 From: Masayoshi Mizuma Date: Mon, 30 Sep 2019 16:56:00 -0400 Subject: arm64/sve: Fix wrong free for task->thread.sve_state The system which has SVE feature crashed because of the memory pointed by task->thread.sve_state was destroyed by someone. That is because sve_state is freed while the forking the child process. The child process has the pointer of sve_state which is same as the parent's because the child's task_struct is copied from the parent's one. If the copy_process() fails as an error on somewhere, for example, copy_creds(), then the sve_state is freed even if the parent is alive. The flow is as follows. copy_process p = dup_task_struct => arch_dup_task_struct *dst = *src; // copy the entire region. : retval = copy_creds if (retval < 0) goto bad_fork_free; : bad_fork_free: ... delayed_free_task(p); => free_task => arch_release_task_struct => fpsimd_release_task => __sve_free => kfree(task->thread.sve_state); // free the parent's sve_state Move child's sve_state = NULL and clearing TIF_SVE flag to arch_dup_task_struct() so that the child doesn't free the parent's one. There is no need to wait until copy_process() to clear TIF_SVE for dst, because the thread flags for dst are initialized already by copying the src task_struct. This change simplifies the code, so get rid of comments that are no longer needed. As a note, arm64 used to have thread_info on the stack. So it would not be possible to clear TIF_SVE until the stack is initialized. From commit c02433dd6de3 ("arm64: split thread_info from task stack"), the thread_info is part of the task, so it should be valid to modify the flag from arch_dup_task_struct(). Cc: stable@vger.kernel.org # 4.15.x- Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Signed-off-by: Masayoshi Mizuma Reported-by: Hidetoshi Seto Suggested-by: Dave Martin Reviewed-by: Dave Martin Tested-by: Julien Grall Signed-off-by: Will Deacon --- arch/arm64/kernel/process.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) (limited to 'arch/arm64/kernel/process.c') diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index a47462def04b..1fb2819fc048 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -332,22 +332,27 @@ void arch_release_task_struct(struct task_struct *tsk) fpsimd_release_task(tsk); } -/* - * src and dst may temporarily have aliased sve_state after task_struct - * is copied. We cannot fix this properly here, because src may have - * live SVE state and dst's thread_info may not exist yet, so tweaking - * either src's or dst's TIF_SVE is not safe. - * - * The unaliasing is done in copy_thread() instead. This works because - * dst is not schedulable or traceable until both of these functions - * have been called. - */ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) { if (current->mm) fpsimd_preserve_current_state(); *dst = *src; + /* We rely on the above assignment to initialize dst's thread_flags: */ + BUILD_BUG_ON(!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK)); + + /* + * Detach src's sve_state (if any) from dst so that it does not + * get erroneously used or freed prematurely. dst's sve_state + * will be allocated on demand later on if dst uses SVE. + * For consistency, also clear TIF_SVE here: this could be done + * later in copy_process(), but to avoid tripping up future + * maintainers it is best not to leave TIF_SVE and sve_state in + * an inconsistent state, even temporarily. + */ + dst->thread.sve_state = NULL; + clear_tsk_thread_flag(dst, TIF_SVE); + return 0; } @@ -360,13 +365,6 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start, memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context)); - /* - * Unalias p->thread.sve_state (if any) from the parent task - * and disable discard SVE state for p: - */ - clear_tsk_thread_flag(p, TIF_SVE); - p->thread.sve_state = NULL; - /* * In case p was allocated the same task_struct pointer as some * other recently-exited task, make sure p is disassociated from -- cgit v1.2.3 From 19c95f261c6558d4c2cbbfacd2d8bb6501384601 Mon Sep 17 00:00:00 2001 From: Julien Thierry Date: Tue, 15 Oct 2019 18:25:44 +0100 Subject: arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled Preempting from IRQ-return means that the task has its PSTATE saved on the stack, which will get restored when the task is resumed and does the actual IRQ return. However, enabling some CPU features requires modifying the PSTATE. This means that, if a task was scheduled out during an IRQ-return before all CPU features are enabled, the task might restore a PSTATE that does not include the feature enablement changes once scheduled back in. * Task 1: PAN == 0 ---| |--------------- | |<- return from IRQ, PSTATE.PAN = 0 | <- IRQ | +--------+ <- preempt() +-- ^ | reschedule Task 1, PSTATE.PAN == 1 * Init: --------------------+------------------------ ^ | enable_cpu_features set PSTATE.PAN on all CPUs Worse than this, since PSTATE is untouched when task switching is done, a task missing the new bits in PSTATE might affect another task, if both do direct calls to schedule() (outside of IRQ/exception contexts). Fix this by preventing preemption on IRQ-return until features are enabled on all CPUs. This way the only PSTATE values that are saved on the stack are from synchronous exceptions. These are expected to be fatal this early, the exception is BRK for WARN_ON(), but as this uses do_debug_exception() which keeps IRQs masked, it shouldn't call schedule(). Signed-off-by: Julien Thierry [james: Replaced a really cool hack, with an even simpler static key in C. expanded commit message with Julien's cover-letter ascii art] Signed-off-by: James Morse Signed-off-by: Will Deacon --- arch/arm64/kernel/process.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'arch/arm64/kernel/process.c') diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 1fb2819fc048..71f788cd2b18 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -44,6 +45,7 @@ #include #include #include +#include #include #include #include @@ -631,3 +633,19 @@ static int __init tagged_addr_init(void) core_initcall(tagged_addr_init); #endif /* CONFIG_ARM64_TAGGED_ADDR_ABI */ + +asmlinkage void __sched arm64_preempt_schedule_irq(void) +{ + lockdep_assert_irqs_disabled(); + + /* + * Preempting a task from an IRQ means we leave copies of PSTATE + * on the stack. cpufeature's enable calls may modify PSTATE, but + * resuming one of these preempted tasks would undo those changes. + * + * Only allow a task to be preempted once cpufeatures have been + * enabled. + */ + if (static_branch_likely(&arm64_const_caps_ready)) + preempt_schedule_irq(); +} -- cgit v1.2.3