diff options
| author | Liam R. Howlett <Liam.Howlett@oracle.com> | 2026-01-21 11:49:36 -0500 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2026-02-12 15:42:53 -0800 |
| commit | bed76bec3111c956a9756643d759d5c7a2193b37 (patch) | |
| tree | 795642e07a7f7a23f4450d11dea02f76203397ed /include/linux/pgtable.h | |
| parent | 3f54ef56fd4540142d2d9a7e7cf0b70dd221c1e1 (diff) | |
mm: relocate the page table ceiling and floor definitions
Patch series " Remove XA_ZERO from error recovery of dup_mmap()", v3.
It is possible that the dup_mmap() call fails on allocating or setting up
a vma after the maple tree of the oldmm is copied. Today, that failure
point is marked by inserting an XA_ZERO entry over the failure point so
that the exact location does not need to be communicated through to
exit_mmap().
However, a race exists in the tear down process because the dup_mmap()
drops the mmap lock before exit_mmap() can remove the partially set up vma
tree. This means that other tasks may get to the mm tree and find the
invalid vma pointer (since it's an XA_ZERO entry), even though the mm is
marked as MMF_OOM_SKIP and MMF_UNSTABLE.
To remove the race fully, the tree must be cleaned up before dropping the
lock. This is accomplished by extracting the vma cleanup in exit_mmap()
and changing the required functions to pass through the vma search limit.
Any other tree modifications would require extra cycles which should be
spent on freeing memory.
This does run the risk of increasing the possibility of finding no vmas
(which is already possible!) in code that isn't careful.
The final four patches are to address the excessive argument lists being
passed between the functions. Using the struct unmap_desc also allows
some special-case code to be removed in favour of the struct setup
differences.
This patch (of 11):
pgtables.h defines a fallback for ceiling and floor of the page tables
within the CONFIG_MMU section. Moving the definitions to outside the
CONFIG_MMU allows for using them in generic code.
[akpm@linux-foundation.org: remove stray newline, per SeongJae]
Link: https://lkml.kernel.org/r/20260121164946.2093480-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20260121164946.2093480-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: SeongJae Park <sj@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'include/linux/pgtable.h')
| -rw-r--r-- | include/linux/pgtable.h | 38 |
1 files changed, 19 insertions, 19 deletions
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 827dca25c0bc..21b67d937555 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -23,25 +23,6 @@ #endif /* - * On almost all architectures and configurations, 0 can be used as the - * upper ceiling to free_pgtables(): on many architectures it has the same - * effect as using TASK_SIZE. However, there is one configuration which - * must impose a more careful limit, to avoid freeing kernel pgtables. - */ -#ifndef USER_PGTABLES_CEILING -#define USER_PGTABLES_CEILING 0UL -#endif - -/* - * This defines the first usable user address. Platforms - * can override its value with custom FIRST_USER_ADDRESS - * defined in their respective <asm/pgtable.h>. - */ -#ifndef FIRST_USER_ADDRESS -#define FIRST_USER_ADDRESS 0UL -#endif - -/* * This defines the generic helper for accessing PMD page * table page. Although platforms can still override this * via their respective <asm/pgtable.h>. @@ -1630,6 +1611,25 @@ void arch_sync_kernel_mappings(unsigned long start, unsigned long end); #endif /* CONFIG_MMU */ /* + * On almost all architectures and configurations, 0 can be used as the + * upper ceiling to free_pgtables(): on many architectures it has the same + * effect as using TASK_SIZE. However, there is one configuration which + * must impose a more careful limit, to avoid freeing kernel pgtables. + */ +#ifndef USER_PGTABLES_CEILING +#define USER_PGTABLES_CEILING 0UL +#endif + +/* + * This defines the first usable user address. Platforms + * can override its value with custom FIRST_USER_ADDRESS + * defined in their respective <asm/pgtable.h>. + */ +#ifndef FIRST_USER_ADDRESS +#define FIRST_USER_ADDRESS 0UL +#endif + +/* * No-op macros that just return the current protection value. Defined here * because these macros can be used even if CONFIG_MMU is not defined. */ |
