| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-12 11:32:37 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-12 11:32:37 -0800 |
| commit | 4cff5c05e076d2ee4e34122aa956b84a2eaac587 (patch) | |
| tree | 6717207240b3881d1b48ff7cd86b193506756e6c /arch/powerpc | |
| parent | 541c43310e85dbf35368b43b720c6724bc8ad8ec (diff) | |
| parent | fb4ddf2085115ed28dedc427d9491707b476bbfe (diff) | |
Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes
arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process (a sketch of the new API
follows this list)
- "zram: introduce compressed data writeback" implements data
compression for zram writeback (Richard Chang and Sergey Senozhatsky)
- "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
page ranges for hugepages. Large improvements during demand faulting
are demonstrated (David Hildenbrand)
- "memcg cleanups" tidies up some memcg code (Chen Ridong)
- "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
stats" improves DAMOS stat's provided information, deterministic
control, and readability (SeongJae Park)
- "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
issues in the hugetlb cgroup charging selftests (Li Wang)
- "Fix va_high_addr_switch.sh test failure - again" addresses several
issues in the va_high_addr_switch test (Chunyu Hu)
- "mm/damon/tests/core-kunit: extend existing test scenarios" improves
the KUnit test coverage for DAMON (Shu Anzai)
- "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
transiently return -EAGAIN (Shivank Garg)
- "arch, mm: consolidate hugetlb early reservation" reworks and
consolidates a pile of straggly code related to reservation of
hugetlb memory from bootmem and creation of CMA areas for hugetlb
(Mike Rapoport)
- "mm: clean up anon_vma implementation" cleans up the anon_vma
implementation in various ways (Lorenzo Stoakes)
- "tweaks for __alloc_pages_slowpath()" does a little streamlining of
the page allocator's slowpath code (Vlastimil Babka)
- "memcg: separate private and public ID namespaces" cleans up the
memcg ID code and prevents the internal-only private IDs from being
exposed to userspace (Shakeel Butt)
- "mm: hugetlb: allocate frozen gigantic folio" cleans up the
allocation of frozen folios and avoids some atomic refcount
operations (Kefeng Wang)
- "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
of memory between the active and inactive LRUs and adds auto-tuning
of the ratio-based quotas and of monitoring intervals (SeongJae Park)
- "Support page table check on PowerPC" makes
CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)
- "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
nodes_and() and nodes_andnot() propagate the return values from the
underlying bit operations, enabling some cleanup in calling code
(Yury Norov)
- "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
some DAMON internal interfaces (SeongJae Park)
- "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
in khugepaged and fixes a scan limit accounting issue (Shivank Garg)
- "mm: balloon infrastructure cleanups" goes to town on the balloon
infrastructure and its page migration function. Mainly cleanups, also
some locking simplification (David Hildenbrand)
- "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
additional tracepoints to the page reclaim code (Jiayuan Chen)
- "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
part of Marco's kernel-wide migration from the legacy workqueue APIs
over to the preferred unbound workqueues (Marco Crivellari)
- "Various mm kselftests improvements/fixes" provides various unrelated
improvements/fixes for the mm kselftests (Kevin Brodsky)
- "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
folio allocation, mainly by avoiding unnecessary work in
pfn_range_valid_contig() (Kefeng Wang)
- "selftests/damon: improve leak detection and wss estimation
reliability" improves the reliability of two of the DAMON selftests
(SeongJae Park)
- "mm/damon: cleanup kdamond, damon_call(), damos filter and
DAMON_MIN_REGION" does some cleanup work in the core DAMON code
(SeongJae Park)
- "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
performs maintenance work on the DAMON documentation (SeongJae Park)
- "mm: add and use vma_assert_stabilised() helper" refactors and cleans
up the core VMA code. The main aim here is to be able to use the mmap
write lock's lockdep state to perform various assertions regarding
the locking which the VMA code requires (Lorenzo Stoakes)
- "mm, swap: swap table phase II: unify swapin use" removes some old
swap code (swap cache bypassing and swap synchronization) which
wasn't working very well. Various other cleanups and simplifications
were made. The end result is a 20% speedup in one benchmark (Kairui
Song)
- "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
available on 64-bit alpha, loongarch, mips, parisc, and um. Various
cleanups were performed along the way (Qi Zheng)
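As a rough sketch of the nested lazy-MMU API referenced in the first item
above (the lazy_mmu_mode_enable()/lazy_mmu_mode_disable() names come from
the powerpc diff below; the walker itself is hypothetical):

```c
/*
 * Hypothetical page-table walker batching PTE updates under the generic
 * lazy-MMU layer.  The enable/disable pair now nests, so this helper
 * keeps working when its caller has already entered lazy MMU mode; the
 * deferred TLB flush happens at the outermost disable.
 */
static void example_wrprotect_range(struct mm_struct *mm, pmd_t *pmd,
				    unsigned long addr, int npages)
{
	spinlock_t *ptl;
	pte_t *pte = pte_offset_map_lock(mm, pmd, addr, &ptl);

	if (!pte)
		return;
	lazy_mmu_mode_enable();
	for (; npages > 0; --npages, ++pte, addr += PAGE_SIZE)
		ptep_set_wrprotect(mm, addr, pte);
	lazy_mmu_mode_disable();
	pte_unmap_unlock(pte - 1, ptl);
}
```

On hash-MMU powerpc the arch hooks behind these calls batch hash-PTE
invalidations and flush them in arch_flush_lazy_mmu_mode(), as the
tlbflush-hash.h hunk below shows; the old per-CPU "active" flag and the
_TLF_LAZY_MMU context-switch dance are gone.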
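The page table check enablement similarly comes down to a small amount of
arch glue. The following is condensed from the powerpc hunks below (the
nohash flavour), not a complete port:

```c
#include <linux/page_table_check.h>

/* Tell the checker which entries map user-accessible memory. */
static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr)
{
	return pte_present(pte) && !is_kernel_addr(addr);
}

/* Report PTE tear-down so the checker can keep its per-page counts. */
#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
				       unsigned long addr, pte_t *ptep)
{
	pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));

	page_table_check_pte_clear(mm, addr, old_pte);
	return old_pte;
}
```

set_ptes() gains a matching page_table_check_ptes_set() call, a new
set_pte_at_unchecked() is used where the kernel deliberately overwrites
entries it already owns, and the architecture selects
ARCH_SUPPORTS_PAGE_TABLE_CHECK (on powerpc only when HUGETLB_PAGE is
disabled).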
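And a minimal sketch of the calling-code cleanup the nodemask change
enables, assuming nodes_and() now returns whether the resulting mask is
non-empty, mirroring bitmap_and(); the function around it is invented for
illustration:

```c
#include <linux/nodemask.h>

static int example_restrict_nodes(nodemask_t *out, const nodemask_t *task,
				  const nodemask_t *policy)
{
	/* Previously this took nodes_and() plus a separate nodes_empty() check. */
	if (!nodes_and(*out, *task, *policy))
		return -EINVAL;		/* empty intersection */
	return 0;
}
```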
* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
mm/memory: handle non-split locks correctly in zap_empty_pte_table()
mm: move pte table reclaim code to memory.c
mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
um: mm: enable MMU_GATHER_RCU_TABLE_FREE
parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
zsmalloc: make common caches global
mm: add SPDX id lines to some mm source files
mm/zswap: use %pe to print error pointers
mm/vmscan: use %pe to print error pointers
mm/readahead: fix typo in comment
mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
mm: refactor vma_map_pages to use vm_insert_pages
mm/damon: unify address range representation with damon_addr_range
mm/cma: replace snprintf with strscpy in cma_new_area
...
Diffstat (limited to 'arch/powerpc')
25 files changed, 175 insertions, 147 deletions
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index b8d36a261009..ad7a2fe63a2a 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -172,6 +172,7 @@ config PPC select ARCH_STACKWALK select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC_BOOK3S || PPC_8xx + select ARCH_SUPPORTS_PAGE_TABLE_CHECK if !HUGETLB_PAGE select ARCH_SUPPORTS_SCHED_MC if SMP select ARCH_SUPPORTS_SCHED_SMT if PPC64 && SMP select SCHED_MC if ARCH_SUPPORTS_SCHED_MC @@ -304,6 +305,7 @@ config PPC select LOCK_MM_AND_FIND_VMA select MMU_GATHER_PAGE_SIZE select MMU_GATHER_RCU_TABLE_FREE + select HAVE_ARCH_TLB_REMOVE_TABLE select MMU_GATHER_MERGE_VMAS select MMU_LAZY_TLB_SHOOTDOWN if PPC_BOOK3S_64 select MODULES_USE_ELF_RELA diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 41ae404d0b7a..001e28f9eabc 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -198,6 +198,7 @@ void unmap_kernel_page(unsigned long va); #ifndef __ASSEMBLER__ #include <linux/sched.h> #include <linux/threads.h> +#include <linux/page_table_check.h> /* Bits to mask out from a PGD to get to the PUD page */ #define PGD_MASKED_BITS 0 @@ -311,7 +312,11 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0)); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_SET_WRPROTECT @@ -433,6 +438,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return true; } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && !is_kernel_addr(addr); +} + /* Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. * diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index aac8ce30cd3b..1a91762b455d 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -144,6 +144,8 @@ #define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX) #ifndef __ASSEMBLER__ +#include <linux/page_table_check.h> + /* * page table defines */ @@ -416,8 +418,11 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0); - return __pte(old); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL @@ -426,11 +431,16 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, pte_t *ptep, int full) { if (full && radix_enabled()) { + pte_t old_pte; + /* * We know that this is a full mm pte clear and * hence can be sure there is no parallel set_pte. 
*/ - return radix__ptep_get_and_clear_full(mm, addr, ptep, full); + old_pte = radix__ptep_get_and_clear_full(mm, addr, ptep, full); + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } return ptep_get_and_clear(mm, addr, ptep); } @@ -539,6 +549,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return arch_pte_access_permitted(pte_val(pte), write, 0); } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && pte_user(pte); +} + /* * Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. @@ -909,6 +924,12 @@ static inline bool pud_access_permitted(pud_t pud, bool write) return pte_access_permitted(pud_pte(pud), write); } +#define pud_user_accessible_page pud_user_accessible_page +static inline bool pud_user_accessible_page(pud_t pud, unsigned long addr) +{ + return pud_leaf(pud) && pte_user_accessible_page(pud_pte(pud), addr); +} + #define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) }) static inline __be64 p4d_raw(p4d_t x) { @@ -1074,6 +1095,12 @@ static inline bool pmd_access_permitted(pmd_t pmd, bool write) return pte_access_permitted(pmd_pte(pmd), write); } +#define pmd_user_accessible_page pmd_user_accessible_page +static inline bool pmd_user_accessible_page(pmd_t pmd, unsigned long addr) +{ + return pmd_leaf(pmd) && pte_user_accessible_page(pmd_pte(pmd), addr); +} + #ifdef CONFIG_TRANSPARENT_HUGEPAGE extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot); extern pud_t pfn_pud(unsigned long pfn, pgprot_t pgprot); @@ -1284,19 +1311,34 @@ extern int pudp_test_and_clear_young(struct vm_area_struct *vma, static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { - if (radix_enabled()) - return radix__pmdp_huge_get_and_clear(mm, addr, pmdp); - return hash__pmdp_huge_get_and_clear(mm, addr, pmdp); + pmd_t old_pmd; + + if (radix_enabled()) { + old_pmd = radix__pmdp_huge_get_and_clear(mm, addr, pmdp); + } else { + old_pmd = hash__pmdp_huge_get_and_clear(mm, addr, pmdp); + } + + page_table_check_pmd_clear(mm, addr, old_pmd); + + return old_pmd; } #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pud_t *pudp) { - if (radix_enabled()) - return radix__pudp_huge_get_and_clear(mm, addr, pudp); - BUG(); - return *pudp; + pud_t old_pud; + + if (radix_enabled()) { + old_pud = radix__pudp_huge_get_and_clear(mm, addr, pudp); + } else { + BUG(); + } + + page_table_check_pud_clear(mm, addr, old_pud); + + return old_pud; } static inline pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h index 146287d9580f..6cc9abcd7b3d 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h @@ -12,7 +12,6 @@ #define PPC64_TLB_BATCH_NR 192 struct ppc64_tlb_batch { - int active; unsigned long index; struct mm_struct *mm; real_pte_t pte[PPC64_TLB_BATCH_NR]; @@ -24,12 +23,8 @@ DECLARE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch); extern void __flush_tlb_pending(struct ppc64_tlb_batch *batch); -#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE - static inline void arch_enter_lazy_mmu_mode(void) { - struct ppc64_tlb_batch *batch; - if (radix_enabled()) return; /* @@ -37,11 +32,9 @@ static inline void arch_enter_lazy_mmu_mode(void) * operating on kernel page tables. 
*/ preempt_disable(); - batch = this_cpu_ptr(&ppc64_tlb_batch); - batch->active = 1; } -static inline void arch_leave_lazy_mmu_mode(void) +static inline void arch_flush_lazy_mmu_mode(void) { struct ppc64_tlb_batch *batch; @@ -51,11 +44,16 @@ static inline void arch_leave_lazy_mmu_mode(void) if (batch->index) __flush_tlb_pending(batch); - batch->active = 0; - preempt_enable(); } -#define arch_flush_lazy_mmu_mode() do {} while (0) +static inline void arch_leave_lazy_mmu_mode(void) +{ + if (radix_enabled()) + return; + + arch_flush_lazy_mmu_mode(); + preempt_enable(); +} extern void hash__tlbiel_all(unsigned int action); diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index 86326587e58d..6d32a4299445 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -68,7 +68,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t pte, int dirty); -void gigantic_hugetlb_cma_reserve(void) __init; #include <asm-generic/hugetlb.h> #else /* ! CONFIG_HUGETLB_PAGE */ @@ -77,10 +76,6 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma, { } -static inline void __init gigantic_hugetlb_cma_reserve(void) -{ -} - static inline void __init hugetlbpage_init_defaultsize(void) { } diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 5af168b7f292..e6da5eaccff6 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -29,6 +29,8 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p #ifndef __ASSEMBLER__ +#include <linux/page_table_check.h> + extern int icache_44x_need_flush; #ifndef pte_huge_size @@ -122,7 +124,11 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0)); + pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0)); + + page_table_check_pte_clear(mm, addr, old_pte); + + return old_pte; } #define __HAVE_ARCH_PTEP_GET_AND_CLEAR @@ -243,6 +249,11 @@ static inline bool pte_access_permitted(pte_t pte, bool write) return true; } +static inline bool pte_user_accessible_page(pte_t pte, unsigned long addr) +{ + return pte_present(pte) && !is_kernel_addr(addr); +} + /* Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. 
* diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index b28fbb1d57eb..f2bb1f98eebe 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -271,6 +271,7 @@ static inline const void *pfn_to_kaddr(unsigned long pfn) struct page; extern void clear_user_page(void *page, unsigned long vaddr, struct page *pg); +#define clear_user_page clear_user_page extern void copy_user_page(void *to, void *from, unsigned long vaddr, struct page *p); extern int devmem_is_allowed(unsigned long pfn); diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 17fd7ff6e535..dcd3a88caaf6 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -34,6 +34,8 @@ struct mm_struct; void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr); #define set_ptes set_ptes +void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); #define update_mmu_cache(vma, addr, ptep) \ update_mmu_cache_range(NULL, vma, addr, ptep, 1) @@ -202,6 +204,14 @@ static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size) #endif /* CONFIG_PPC64 */ +#ifndef pmd_user_accessible_page +#define pmd_user_accessible_page(pmd, addr) false +#endif + +#ifndef pud_user_accessible_page +#define pud_user_accessible_page(pud, addr) false +#endif + #endif /* __ASSEMBLER__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h index 50a92b24628d..6d60ea4868ab 100644 --- a/arch/powerpc/include/asm/setup.h +++ b/arch/powerpc/include/asm/setup.h @@ -20,7 +20,11 @@ extern void reloc_got2(unsigned long); void check_for_initrd(void); void mem_topology_setup(void); +#ifdef CONFIG_NUMA void initmem_init(void); +#else +static inline void initmem_init(void) {} +#endif void setup_panic(void); #define ARCH_PANIC_TIMEOUT 180 diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index b0f200aba2b3..97f35f9b1a96 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -154,12 +154,10 @@ void arch_setup_new_exec(void); /* Don't move TLF_NAPPING without adjusting the code in entry_32.S */ #define TLF_NAPPING 0 /* idle thread enabled NAP mode */ #define TLF_SLEEPING 1 /* suspend code enabled SLEEP mode */ -#define TLF_LAZY_MMU 3 /* tlb_batch is active */ #define TLF_RUNLATCH 4 /* Is the runlatch enabled? */ #define _TLF_NAPPING (1 << TLF_NAPPING) #define _TLF_SLEEPING (1 << TLF_SLEEPING) -#define _TLF_LAZY_MMU (1 << TLF_LAZY_MMU) #define _TLF_RUNLATCH (1 << TLF_RUNLATCH) #ifndef __ASSEMBLER__ diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index 2058e8d3e013..1ca7d4c4b90d 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -37,7 +37,6 @@ extern void tlb_flush(struct mmu_gather *tlb); */ #define tlb_needs_table_invalidate() radix_enabled() -#define __HAVE_ARCH_TLB_REMOVE_TABLE /* Get the generic bits... 
*/ #include <asm-generic/tlb.h> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index a45fe147868b..a15d0b619b1f 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1281,9 +1281,6 @@ struct task_struct *__switch_to(struct task_struct *prev, { struct thread_struct *new_thread, *old_thread; struct task_struct *last; -#ifdef CONFIG_PPC_64S_HASH_MMU - struct ppc64_tlb_batch *batch; -#endif new_thread = &new->thread; old_thread = ¤t->thread; @@ -1291,14 +1288,6 @@ struct task_struct *__switch_to(struct task_struct *prev, WARN_ON(!irqs_disabled()); #ifdef CONFIG_PPC_64S_HASH_MMU - batch = this_cpu_ptr(&ppc64_tlb_batch); - if (batch->active) { - current_thread_info()->local_flags |= _TLF_LAZY_MMU; - if (batch->index) - __flush_tlb_pending(batch); - batch->active = 0; - } - /* * On POWER9 the copy-paste buffer can only paste into * foreign real addresses, so unprivileged processes can not @@ -1369,20 +1358,6 @@ struct task_struct *__switch_to(struct task_struct *prev, */ #ifdef CONFIG_PPC_BOOK3S_64 -#ifdef CONFIG_PPC_64S_HASH_MMU - /* - * This applies to a process that was context switched while inside - * arch_enter_lazy_mmu_mode(), to re-activate the batch that was - * deactivated above, before _switch(). This will never be the case - * for new tasks. - */ - if (current_thread_info()->local_flags & _TLF_LAZY_MMU) { - current_thread_info()->local_flags &= ~_TLF_LAZY_MMU; - batch = this_cpu_ptr(&ppc64_tlb_batch); - batch->active = 1; - } -#endif - /* * Math facilities are masked out of the child MSR in copy_thread. * A new task does not need to restore_math because it will diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index c8c42b419742..cb5b73adc250 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -1003,7 +1003,6 @@ void __init setup_arch(char **cmdline_p) fadump_cma_init(); kdump_cma_reserve(); kvm_cma_reserve(); - gigantic_hugetlb_cma_reserve(); early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn << PAGE_SHIFT); diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c index 82d31177630b..ac2a24d15d2e 100644 --- a/arch/powerpc/mm/book3s64/hash_pgtable.c +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c @@ -8,6 +8,7 @@ #include <linux/sched.h> #include <linux/mm_types.h> #include <linux/mm.h> +#include <linux/page_table_check.h> #include <linux/stop_machine.h> #include <asm/sections.h> @@ -230,6 +231,9 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres pmd = *pmdp; pmd_clear(pmdp); + + page_table_check_pmd_clear(vma->vm_mm, address, pmd); + /* * Wait for all pending hash_page to finish. This is needed * in case of subpage collapse. When we collapse normal pages diff --git a/arch/powerpc/mm/book3s64/hash_tlb.c b/arch/powerpc/mm/book3s64/hash_tlb.c index 21fcad97ae80..ec2941cec815 100644 --- a/arch/powerpc/mm/book3s64/hash_tlb.c +++ b/arch/powerpc/mm/book3s64/hash_tlb.c @@ -25,11 +25,12 @@ #include <asm/tlb.h> #include <asm/bug.h> #include <asm/pte-walk.h> - +#include <kunit/visibility.h> #include <trace/events/thp.h> DEFINE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch); +EXPORT_SYMBOL_IF_KUNIT(ppc64_tlb_batch); /* * A linux PTE was changed and the corresponding hash table entry @@ -100,7 +101,7 @@ void hpte_need_flush(struct mm_struct *mm, unsigned long addr, * Check if we have an active batch on this CPU. If not, just * flush now and return. 
*/ - if (!batch->active) { + if (!is_lazy_mmu_mode_active()) { flush_hash_page(vpn, rpte, psize, ssize, mm_is_thread_local(mm)); put_cpu_var(ppc64_tlb_batch); return; @@ -154,6 +155,7 @@ void __flush_tlb_pending(struct ppc64_tlb_batch *batch) flush_hash_range(i, local); batch->index = 0; } +EXPORT_SYMBOL_IF_KUNIT(__flush_tlb_pending); void hash__tlb_flush(struct mmu_gather *tlb) { @@ -205,7 +207,7 @@ void __flush_hash_table_range(unsigned long start, unsigned long end) * way to do things but is fine for our needs here. */ local_irq_save(flags); - arch_enter_lazy_mmu_mode(); + lazy_mmu_mode_enable(); for (; start < end; start += PAGE_SIZE) { pte_t *ptep = find_init_mm_pte(start, &hugepage_shift); unsigned long pte; @@ -217,7 +219,7 @@ void __flush_hash_table_range(unsigned long start, unsigned long end) continue; hpte_need_flush(&init_mm, start, ptep, pte, hugepage_shift); } - arch_leave_lazy_mmu_mode(); + lazy_mmu_mode_disable(); local_irq_restore(flags); } @@ -237,7 +239,7 @@ void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long * way to do things but is fine for our needs here. */ local_irq_save(flags); - arch_enter_lazy_mmu_mode(); + lazy_mmu_mode_enable(); start_pte = pte_offset_map(pmd, addr); if (!start_pte) goto out; @@ -249,6 +251,6 @@ void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long } pte_unmap(start_pte); out: - arch_leave_lazy_mmu_mode(); + lazy_mmu_mode_disable(); local_irq_restore(flags); } diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index e3485db7de02..4b09c04654a8 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -10,6 +10,7 @@ #include <linux/pkeys.h> #include <linux/debugfs.h> #include <linux/proc_fs.h> +#include <linux/page_table_check.h> #include <asm/pgalloc.h> #include <asm/tlb.h> @@ -127,7 +128,8 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pmd_leaf(pmd))); #endif trace_hugepage_set_pmd(addr, pmd_val(pmd)); - return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd)); + page_table_check_pmd_set(mm, addr, pmdp, pmd); + return set_pte_at_unchecked(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd)); } void set_pud_at(struct mm_struct *mm, unsigned long addr, @@ -144,7 +146,8 @@ void set_pud_at(struct mm_struct *mm, unsigned long addr, WARN_ON(!(pud_leaf(pud))); #endif trace_hugepage_set_pud(addr, pud_val(pud)); - return set_pte_at(mm, addr, pudp_ptep(pudp), pud_pte(pud)); + page_table_check_pud_set(mm, addr, pudp, pud); + return set_pte_at_unchecked(mm, addr, pudp_ptep(pudp), pud_pte(pud)); } static void do_serialize(void *arg) @@ -179,23 +182,27 @@ void serialize_against_pte_lookup(struct mm_struct *mm) pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp) { - unsigned long old_pmd; + pmd_t old_pmd; VM_WARN_ON_ONCE(!pmd_present(*pmdp)); - old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID); + old_pmd = __pmd(pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID)); flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE); - return __pmd(old_pmd); + page_table_check_pmd_clear(vma->vm_mm, address, old_pmd); + + return old_pmd; } pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, pud_t *pudp) { - unsigned long old_pud; + pud_t old_pud; VM_WARN_ON_ONCE(!pud_present(*pudp)); - old_pud = pud_hugepage_update(vma->vm_mm, address, pudp, _PAGE_PRESENT, _PAGE_INVALID); + old_pud = 
__pud(pud_hugepage_update(vma->vm_mm, address, pudp, _PAGE_PRESENT, _PAGE_INVALID)); flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE); - return __pud(old_pud); + page_table_check_pud_clear(vma->vm_mm, address, old_pud); + + return old_pud; } pmd_t pmdp_huge_get_and_clear_full(struct vm_area_struct *vma, @@ -550,7 +557,7 @@ void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, if (radix_enabled()) return radix__ptep_modify_prot_commit(vma, addr, ptep, old_pte, pte); - set_pte_at(vma->vm_mm, addr, ptep, pte); + set_pte_at_unchecked(vma->vm_mm, addr, ptep, pte); } #ifdef CONFIG_TRANSPARENT_HUGEPAGE diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index 73977dbabcf2..10aced261cff 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -14,6 +14,7 @@ #include <linux/of.h> #include <linux/of_fdt.h> #include <linux/mm.h> +#include <linux/page_table_check.h> #include <linux/hugetlb.h> #include <linux/string_helpers.h> #include <linux/memory.h> @@ -1474,6 +1475,8 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre pmd = *pmdp; pmd_clear(pmdp); + page_table_check_pmd_clear(vma->vm_mm, address, pmd); + radix__flush_tlb_collapsed_pmd(vma->vm_mm, address); return pmd; @@ -1606,7 +1609,7 @@ void radix__ptep_modify_prot_commit(struct vm_area_struct *vma, (atomic_read(&mm->context.copros) > 0)) radix__flush_tlb_page(vma, addr); - set_pte_at(mm, addr, ptep, pte); + set_pte_at_unchecked(mm, addr, ptep, pte); } int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) @@ -1617,7 +1620,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) if (!radix_enabled()) return 0; - set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pud); + set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pud); return 1; } @@ -1664,7 +1667,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) if (!radix_enabled()) return 0; - set_pte_at(&init_mm, 0 /* radix unused */, ptep, new_pmd); + set_pte_at_unchecked(&init_mm, 0 /* radix unused */, ptep, new_pmd); return 1; } diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c index ec98e526167e..07c47673bba2 100644 --- a/arch/powerpc/mm/book3s64/subpage_prot.c +++ b/arch/powerpc/mm/book3s64/subpage_prot.c @@ -73,13 +73,13 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr, pte = pte_offset_map_lock(mm, pmd, addr, &ptl); if (!pte) return; - arch_enter_lazy_mmu_mode(); + lazy_mmu_mode_enable(); for (; npages > 0; --npages) { pte_update(mm, addr, pte, 0, 0, 0); addr += PAGE_SIZE; ++pte; } - arch_leave_lazy_mmu_mode(); + lazy_mmu_mode_disable(); pte_unmap_unlock(pte - 1, ptl); } diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index d3c1b749dcfc..558fafb82b8a 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -200,18 +200,15 @@ static int __init hugetlbpage_init(void) arch_initcall(hugetlbpage_init); -void __init gigantic_hugetlb_cma_reserve(void) +unsigned int __init arch_hugetlb_cma_order(void) { - unsigned long order = 0; - if (radix_enabled()) - order = PUD_SHIFT - PAGE_SHIFT; + return PUD_SHIFT - PAGE_SHIFT; else if (!firmware_has_feature(FW_FEATURE_LPAR) && mmu_psize_defs[MMU_PAGE_16G].shift) /* * For pseries we do use ibm,expected#pages for reserving 16G pages. 
*/ - order = mmu_psize_to_shift(MMU_PAGE_16G) - PAGE_SHIFT; + return mmu_psize_to_shift(MMU_PAGE_16G) - PAGE_SHIFT; - if (order) - hugetlb_cma_reserve(order); + return 0; } diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index bc0f1a9eb0bc..29bf347f6012 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -182,11 +182,6 @@ void __init mem_topology_setup(void) memblock_set_node(0, PHYS_ADDR_MAX, &memblock.memory, 0); } -void __init initmem_init(void) -{ - sparse_init(); -} - /* mark pages that don't exist as nosave */ static int __init mark_nonram_nosave(void) { @@ -221,7 +216,16 @@ static int __init mark_nonram_nosave(void) * anyway) will take a first dip into ZONE_NORMAL and get otherwise served by * ZONE_DMA. */ -static unsigned long max_zone_pfns[MAX_NR_ZONES]; +void __init arch_zone_limits_init(unsigned long *max_zone_pfns) +{ +#ifdef CONFIG_ZONE_DMA + max_zone_pfns[ZONE_DMA] = min((zone_dma_limit >> PAGE_SHIFT) + 1, max_low_pfn); +#endif + max_zone_pfns[ZONE_NORMAL] = max_low_pfn; +#ifdef CONFIG_HIGHMEM + max_zone_pfns[ZONE_HIGHMEM] = max_pfn; +#endif +} /* * paging_init() sets up the page tables - in fact we've already done this. @@ -259,17 +263,6 @@ void __init paging_init(void) zone_dma_limit = DMA_BIT_MASK(zone_dma_bits); -#ifdef CONFIG_ZONE_DMA - max_zone_pfns[ZONE_DMA] = min(max_low_pfn, - 1UL << (zone_dma_bits - PAGE_SHIFT)); -#endif - max_zone_pfns[ZONE_NORMAL] = max_low_pfn; -#ifdef CONFIG_HIGHMEM - max_zone_pfns[ZONE_HIGHMEM] = max_pfn; -#endif - - free_area_init(max_zone_pfns); - mark_nonram_nosave(); } diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 603a0f652ba6..f4cf3ae036de 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1213,8 +1213,6 @@ void __init initmem_init(void) setup_node_data(nid, start_pfn, end_pfn); } - sparse_init(); - /* * We need the numa_cpu_lookup_table to be accurate for all CPUs, * even before we online them, so that we can use cpu_to_{node,mem} diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 56d7e8960e77..a9be337be3e4 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -22,6 +22,7 @@ #include <linux/mm.h> #include <linux/percpu.h> #include <linux/hardirq.h> +#include <linux/page_table_check.h> #include <linux/hugetlb.h> #include <asm/tlbflush.h> #include <asm/tlb.h> @@ -206,6 +207,9 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, * and not hw_valid ptes. Hence there is no translation cache flush * involved that need to be batched. 
*/ + + page_table_check_ptes_set(mm, addr, ptep, pte, nr); + for (;;) { /* @@ -224,6 +228,14 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, } } +void set_pte_at_unchecked(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep)); + pte = set_pte_filter(pte, addr); + __set_pte_at(mm, addr, ptep, pte, 0); +} + void unmap_kernel_page(unsigned long va) { pmd_t *pmdp = pmd_off_k(va); diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index 4c321a8ea896..f399917c17bd 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -93,6 +93,7 @@ config PPC_BOOK3S_64 select IRQ_WORK select PPC_64S_HASH_MMU if !PPC_RADIX_MMU select KASAN_VMALLOC if KASAN + select ARCH_HAS_LAZY_MMU_MODE config PPC_BOOK3E_64 bool "Embedded processors" diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 3e042218d6cd..f7052b131a4c 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -120,7 +120,7 @@ config PPC_SMLPAR config CMM tristate "Collaborative memory management" depends on PPC_SMLPAR - select MEMORY_BALLOON + select BALLOON default y help Select this option, if you want to enable the kernel interface diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c index 4cbbe2ee58ab..8d83df12430f 100644 --- a/arch/powerpc/platforms/pseries/cmm.c +++ b/arch/powerpc/platforms/pseries/cmm.c @@ -19,7 +19,7 @@ #include <linux/stringify.h> #include <linux/swap.h> #include <linux/device.h> -#include <linux/balloon_compaction.h> +#include <linux/balloon.h> #include <asm/firmware.h> #include <asm/hvcall.h> #include <asm/mmu.h> @@ -165,7 +165,6 @@ static long cmm_alloc_pages(long nr) balloon_page_enqueue(&b_dev_info, page); atomic_long_inc(&loaned_pages); - adjust_managed_page_count(page, -1); nr--; } @@ -190,7 +189,6 @@ static long cmm_free_pages(long nr) if (!page) break; plpar_page_set_active(page); - adjust_managed_page_count(page, 1); __free_page(page); atomic_long_dec(&loaned_pages); nr--; @@ -496,13 +494,11 @@ static struct notifier_block cmm_mem_nb = { .priority = CMM_MEM_HOTPLUG_PRI }; -#ifdef CONFIG_BALLOON_COMPACTION +#ifdef CONFIG_BALLOON_MIGRATION static int cmm_migratepage(struct balloon_dev_info *b_dev_info, struct page *newpage, struct page *page, enum migrate_mode mode) { - unsigned long flags; - /* * loan/"inflate" the newpage first. * @@ -517,47 +513,17 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info, return -EBUSY; } - /* balloon page list reference */ - get_page(newpage); - - /* - * When we migrate a page to a different zone, we have to fixup the - * count of both involved zones as we adjusted the managed page count - * when inflating. - */ - if (page_zone(page) != page_zone(newpage)) { - adjust_managed_page_count(page, 1); - adjust_managed_page_count(newpage, -1); - } - - spin_lock_irqsave(&b_dev_info->pages_lock, flags); - balloon_page_insert(b_dev_info, newpage); - __count_vm_event(BALLOON_MIGRATE); - b_dev_info->isolated_pages--; - spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); - /* * activate/"deflate" the old page. We ignore any errors just like the * other callers. 
*/ plpar_page_set_active(page); - - balloon_page_finalize(page); - /* balloon page list reference */ - put_page(page); - return 0; } - -static void cmm_balloon_compaction_init(void) -{ - b_dev_info.migratepage = cmm_migratepage; -} -#else /* CONFIG_BALLOON_COMPACTION */ -static void cmm_balloon_compaction_init(void) -{ -} -#endif /* CONFIG_BALLOON_COMPACTION */ +#else /* CONFIG_BALLOON_MIGRATION */ +int cmm_migratepage(struct balloon_dev_info *b_dev_info, struct page *newpage, + struct page *page, enum migrate_mode mode); +#endif /* CONFIG_BALLOON_MIGRATION */ /** * cmm_init - Module initialization @@ -573,11 +539,13 @@ static int cmm_init(void) return -EOPNOTSUPP; balloon_devinfo_init(&b_dev_info); - cmm_balloon_compaction_init(); + b_dev_info.adjust_managed_page_count = true; + if (IS_ENABLED(CONFIG_BALLOON_MIGRATION)) + b_dev_info.migratepage = cmm_migratepage; rc = register_oom_notifier(&cmm_oom_nb); if (rc < 0) - goto out_balloon_compaction; + return rc; if ((rc = register_reboot_notifier(&cmm_reboot_nb))) goto out_oom_notifier; @@ -606,7 +574,6 @@ out_reboot_notifier: unregister_reboot_notifier(&cmm_reboot_nb); out_oom_notifier: unregister_oom_notifier(&cmm_oom_nb); -out_balloon_compaction: return rc; } |
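The cmm.c hunk above is a good illustration of the balloon rework: the
driver no longer adjusts managed page counts or maintains the balloon page
list in its migration callback. A condensed sketch of the new registration
(field and config names as they appear in the hunk; cmm_migratepage is the
driver callback shown above and the init function is abridged):

```c
#include <linux/balloon.h>

static struct balloon_dev_info b_dev_info;

static int __init example_cmm_init(void)
{
	balloon_devinfo_init(&b_dev_info);
	/* The balloon core now does the managed-page accounting. */
	b_dev_info.adjust_managed_page_count = true;
	/* Only the hypervisor-specific inflate/deflate step stays in the driver. */
	if (IS_ENABLED(CONFIG_BALLOON_MIGRATION))
		b_dev_info.migratepage = cmm_migratepage;
	return 0;
}
```

The migration callback itself shrinks to the plpar_page_* calls because the
core now handles the page-list bookkeeping, the BALLOON_MIGRATE event count
and the cross-zone managed-page fixups that the removed lines used to do.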
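The hugetlb consolidation likewise replaces gigantic_hugetlb_cma_reserve()
with an arch_hugetlb_cma_order() hook (hugetlbpage.c hunk above). A sketch
of how a generic-side caller might consume it; only the hook itself is in
this diff, the caller here is an assumption:

```c
/* Hypothetical generic-side consumer of the new arch hook. */
static void __init example_hugetlb_cma_setup(void)
{
	unsigned int order = arch_hugetlb_cma_order();

	/* An order of 0 means the architecture has no gigantic-page CMA size. */
	if (order)
		hugetlb_cma_reserve(order);
}
```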
