summaryrefslogtreecommitdiff
path: root/mm/pagewalk.c
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-09-29 18:48:39 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-09-29 18:48:39 -0700
commitfeafee284579d29537a5a56ba8f23894f0463f3d (patch)
tree6677a27bc63d72af245430186b40da0c3be28482 /mm/pagewalk.c
parentfe68bb2861808ed5c48d399bd7e670ab76829d55 (diff)
parentea0b39168d3a2313eabd145fb3440c946ccff4d1 (diff)
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon: "There's good stuff across the board, including some nice mm improvements for CPUs with the 'noabort' BBML2 feature and a clever patch to allow ptdump to play nicely with block mappings in the vmalloc area. Confidential computing: - Add support for accepting secrets from firmware (e.g. ACPI CCEL) and mapping them with appropriate attributes. CPU features: - Advertise atomic floating-point instructions to userspace - Extend Spectre workarounds to cover additional Arm CPU variants - Extend list of CPUs that support break-before-make level 2 and guarantee not to generate TLB conflict aborts for changes of mapping granularity (BBML2_NOABORT) - Add GCS support to our uprobes implementation. Documentation: - Remove bogus SME documentation concerning register state when entering/exiting streaming mode. Entry code: - Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY) - Micro-optimise syscall entry path with a compiler branch hint. Memory management: - Enable huge mappings in vmalloc space even when kernel page-table dumping is enabled - Tidy up the types used in our early MMU setup code - Rework rodata= for closer parity with the behaviour on x86 - For CPUs implementing BBML2_NOABORT, utilise block mappings in the linear map even when rodata= applies to virtual aliases - Don't re-allocate the virtual region between '_text' and '_stext', as doing so confused tools parsing /proc/vmcore. Miscellaneous: - Clean-up Kconfig menuconfig text for architecture features - Avoid redundant bitmap_empty() during determination of supported SME vector lengths - Re-enable warnings when building the 32-bit vDSO object - Avoid breaking our eggs at the wrong end. Perf and PMUs: - Support for v3 of the Hisilicon L3C PMU - Support for Hisilicon's MN and NoC PMUs - Support for Fujitsu's Uncore PMU - Support for SPE's extended event filtering feature - Preparatory work to enable data source filtering in SPE - Support for multiple lanes in the DWC PCIe PMU - Support for i.MX94 in the IMX DDR PMU driver - MAINTAINERS update (Thank you, Yicong) - Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets). Selftests: - Add basic LSFE check to the existing hwcaps test - Support nolibc in GCS tests - Extend SVE ptrace test to pass unsupported regsets and invalid vector lengths - Minor cleanups (typos, cosmetic changes). System registers: - Fix ID_PFR1_EL1 definition - Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1 - Sync TCR_EL1 definition with the latest Arm ARM (L.b) - Be stricter about the input fed into our AWK sysreg generator script - Typo fixes and removal of redundant definitions. ACPI, EFI and PSCI: - Decouple Arm's "Software Delegated Exception Interface" (SDEI) support from the ACPI GHES code so that it can be used by platforms booted with device-tree - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI runtime calls - Fix a node refcount imbalance in the PSCI device-tree code. CPU Features: - Ensure register sanitisation is applied to fields in ID_AA64MMFR4 - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests can reliably query the underlying CPU types from the VMM - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes to our context-switching, signal handling and ptrace code" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits) arm64: cpufeature: Remove duplicate asm/mmu.h header arm64: Kconfig: Make CPU_BIG_ENDIAN depend on BROKEN perf/dwc_pcie: Fix use of uninitialized variable arm/syscalls: mark syscall invocation as likely in invoke_syscall Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU Documentation: hisi-pmu: Fix of minor format error drivers/perf: hisi: Add support for L3C PMU v3 drivers/perf: hisi: Refactor the event configuration of L3C PMU drivers/perf: hisi: Extend the field of tt_core drivers/perf: hisi: Extract the event filter check of L3C PMU drivers/perf: hisi: Simplify the probe process of each L3C PMU version drivers/perf: hisi: Export hisi_uncore_pmu_isr() drivers/perf: hisi: Relax the event ID check in the framework perf: Fujitsu: Add the Uncore PMU driver arm64: map [_text, _stext) virtual address range non-executable+read-only arm64/sysreg: Update TCR_EL1 register arm64: Enable vmalloc-huge with ptdump arm64: cpufeature: add Neoverse-V3AE to BBML2 allow list arm64: errata: Apply workarounds for Neoverse-V3AE arm64: cputype: Add Neoverse-V3AE definitions ...
Diffstat (limited to 'mm/pagewalk.c')
-rw-r--r--mm/pagewalk.c36
1 files changed, 24 insertions, 12 deletions
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 648038247a8d..936689d8bcac 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -606,10 +606,32 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
int walk_kernel_page_table_range(unsigned long start, unsigned long end,
const struct mm_walk_ops *ops, pgd_t *pgd, void *private)
{
- struct mm_struct *mm = &init_mm;
+ /*
+ * Kernel intermediate page tables are usually not freed, so the mmap
+ * read lock is sufficient. But there are some exceptions.
+ * E.g. memory hot-remove. In which case, the mmap lock is insufficient
+ * to prevent the intermediate kernel pages tables belonging to the
+ * specified address range from being freed. The caller should take
+ * other actions to prevent this race.
+ */
+ mmap_assert_locked(&init_mm);
+
+ return walk_kernel_page_table_range_lockless(start, end, ops, pgd,
+ private);
+}
+
+/*
+ * Use this function to walk the kernel page tables locklessly. It should be
+ * guaranteed that the caller has exclusive access over the range they are
+ * operating on - that there should be no concurrent access, for example,
+ * changing permissions for vmalloc objects.
+ */
+int walk_kernel_page_table_range_lockless(unsigned long start, unsigned long end,
+ const struct mm_walk_ops *ops, pgd_t *pgd, void *private)
+{
struct mm_walk walk = {
.ops = ops,
- .mm = mm,
+ .mm = &init_mm,
.pgd = pgd,
.private = private,
.no_vma = true
@@ -620,16 +642,6 @@ int walk_kernel_page_table_range(unsigned long start, unsigned long end,
if (!check_ops_valid(ops))
return -EINVAL;
- /*
- * Kernel intermediate page tables are usually not freed, so the mmap
- * read lock is sufficient. But there are some exceptions.
- * E.g. memory hot-remove. In which case, the mmap lock is insufficient
- * to prevent the intermediate kernel pages tables belonging to the
- * specified address range from being freed. The caller should take
- * other actions to prevent this race.
- */
- mmap_assert_locked(mm);
-
return walk_pgd_range(start, end, &walk);
}