linux-toradex.git/arch/um/include/asm, branch master

Merge tag 'locking-core-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip

2026-06-15T08:51:14+00:00

Pull locking updates from Ingo Molnar:
 "Futex updates:

   - Optimize futex hash bucket access patterns (Peter Zijlstra)

   - Large series to address the robust futex unlock race for real, by
     Thomas Gleixner:

      "The robust futex unlock mechanism is racy in respect to the
       clearing of the robust_list_head::list_op_pending pointer because
       unlock and clearing the pointer are not atomic.

       The race window is between the unlock and clearing the pending op
       pointer. If the task is forced to exit in this window, exit will
       access a potentially invalid pending op pointer when cleaning up
       the robust list.

       That happens if another task manages to unmap the object
       containing the lock before the cleanup, which results in an UAF.

       In the worst case this UAF can lead to memory corruption when
       unrelated content has been mapped to the same address by the time
       the access happens.

       User space can't solve this problem without help from the kernel.
       This series provides the kernel side infrastructure to help it
       along:

        1) Combined unlock, pointer clearing, wake-up for the
           contended case

        2) VDSO based unlock and pointer clearing helpers with a
           fix-up function in the kernel when user space was interrupted
           within the critical section.

      ... with help by André Almeida:

        - Add a note about robust list race condition (André Almeida)
        - Add self-tests for robust release operations (André Almeida)

  Context analysis updates:

   - Implement context analysis for 'struct rt_mutex'. (Bart Van Assche)
   - Bump required Clang version to 23 (Marco Elver)

  Guard infrastructure updates:

   - Series to remove NULL check from unconditional guards (Dmitry
     Ilvokhin)

  Lockdep updates:

   - Restore self-test migrate_disable() and sched_rt_mutex state on
     PREEMPT_RT (Karl Mehltretter)

  Membarriers updates:

   - Use per-CPU mutexes for targeted commands (Aniket Gattani)
   - Modernize membarrier_global_expedited with cleanup guards (Aniket
     Gattani)
   - Add rseq stress test for CFS throttle interactions (Aniket Gattani)

  percpu-rwsems updates:

   - Extract __percpu_up_read() to optimize inlining overhead (Dmitry
     Ilvokhin)

  Seqlocks updates:

   - Allow UBSAN_ALIGNMENT to fail optimizing (Heiko Carstens)

  Lock tracing:

   - Add contended_release tracepoint to sleepable locks such as
     mutexes, percpu-rwsems, rtmutexes, rwsems and semaphores (Dmitry
     Ilvokhin)

  MAINTAINERS updates:

   - MAINTAINERS: Add RUST [SYNC] entry (Boqun Feng)

  Misc updates and fixes by Randy Dunlap, YE WEI-HONG, Fabricio Parra,
  Dmitry Ilvokhin and Peter Zijlstra"

* tag 'locking-core-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (36 commits)
  locking: Add contended_release tracepoint to sleepable locks
  locking/percpu-rwsem: Extract __percpu_up_read()
  tracing/lock: Remove unnecessary linux/sched.h include
  futex: Optimize futex hash bucket access patterns
  rust: sync: completion: Mark inline complete_all and wait_for_completion
  MAINTAINERS: Add RUST [SYNC] entry
  cleanup: Specify nonnull argument index
  selftests: futex: Add tests for robust release operations
  Documentation: futex: Add a note about robust list race condition
  x86/vdso: Implement __vdso_futex_robust_try_unlock()
  x86/vdso: Prepare for robust futex unlock support
  futex: Provide infrastructure to plug the non contended robust futex unlock race
  futex: Add robust futex unlock IP range
  futex: Add support for unlocking robust futexes
  futex: Cleanup UAPI defines
  x86: Select ARCH_MEMORY_ORDER_TSO
  uaccess: Provide unsafe_atomic_store_release_user()
  futex: Provide UABI defines for robust list entry modifiers
  futex: Move futex related mm_struct data into a struct
  futex: Make futex_mm_init() void
  ...

percpu: Sanitize __percpu_qual include hell

2026-06-03T09:38:48+00:00

Slapping __percpu_qual into the next available header is sloppy at best.

It's required by __percpu which is defined in compiler_types.h and that is
meant to be included without requiring a boatload of other headers so that
a struct or function declaration can contain a __percpu qualifier w/o
further prerequisites.

This implicit dependency on linux/percpu.h makes that impossible and causes
a major problem when trying to separate headers.

Create asm/percpu_types.h and move it there. Include that from
compiler_types.h and the whole recursion problem goes away.

Fix up UM so it uses the generic header and includes it in the UM_HOST
build, which pulls in compiler_types.h. The USER_CFLAGS fix was suggested
by Richard.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://patch.msgid.link/20260602090535.254874125@kernel.org

ring-buffer: Flush and stop persistent ring buffer on panic

2026-05-21T12:20:58+00:00

On real hardware, panic and machine reboot may not flush hardware cache
to memory. This means the persistent ring buffer, which relies on a
coherent state of memory, may not have its events written to the buffer
and they may be lost. Moreover, there may be inconsistency with the
counters which are used for validation of the integrity of the
persistent ring buffer which may cause all data to be discarded.

To avoid this issue, stop recording of the ring buffer on panic and
flush the cache of the ring buffer's memory.

Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance")
Cc: stable@vger.kernel.org
Cc: Will Deacon 
Cc: Mathieu Desnoyers 
Cc: Ian Rogers 
Link: https://patch.msgid.link/177751969602.2136606.12031934362587643488.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) 
Acked-by: Catalin Marinas 
Acked-by: Geert Uytterhoeven 
Signed-off-by: Steven Rostedt

Merge tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

2026-04-20T23:36:46+00:00

Pull uml updates from Johannes Berg:
 "Mostly cleanups and small things, notably:

   - musl libc compatibility

   - vDSO installation fix

   - TLB sync race fix for recent SMP support

   - build fix for 32-bit with Clang 20/21"

* tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21
  um: drivers: call kernel_strrchr() explicitly in cow_user.c
  um: Replace strncpy() with strnlen()+memcpy_and_pad() in strncpy_chunk_from_user()
  x86/um: fix vDSO installation
  um: Remove CONFIG_FRAME_WARN from x86_64_defconfig
  um: Fix pte_read() and pte_exec() for kernel mappings
  um: Fix potential race condition in TLB sync
  um: time-travel: clean up kernel-doc warnings
  um: avoid struct sigcontext redefinition with musl
  um: fix address-of CMSG_DATA() rvalue in stub

Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-04-17T03:11:56+00:00

Pull non-MM updates from Andrew Morton:

 - "pid: make sub-init creation retryable" (Oleg Nesterov)

   Make creation of init in a new namespace more robust by clearing away
   some historical cruft which is no longer needed. Also some
   documentation fixups

 - "selftests/fchmodat2: Error handling and general" (Mark Brown)

   Fix and a cleanup for the fchmodat2() syscall selftest

 - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)

 - "hung_task: Provide runtime reset interface for hung task detector"
   (Aaron Tomlin)

   Give administrators the ability to zero out
   /proc/sys/kernel/hung_task_detect_count

 - "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" (Thomas Weißschuh)

   Teach getdelays to use the in-kernel UAPI headers rather than the
   system-provided ones

 - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)

   Several cleanups and fixups to the hardlockup detector code and its
   documentation

 - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)

   A couple of small/theoretical fixes in the bch code

 - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)

 - "cleanup the RAID5 XOR library" (Christoph Hellwig)

   A quite far-reaching cleanup to this code. I can't do better than to
   quote Christoph:

     "The XOR library used for the RAID5 parity is a bit of a mess right
      now. The main file sits in crypto/ despite not being cryptography
      and not using the crypto API, with the generic implementations
      sitting in include/asm-generic and the arch implementations
      sitting in an asm/ header in theory. The latter doesn't work for
      many cases, so architectures often build the code directly into
      the core kernel, or create another module for the architecture
      code.

      Change this to a single module in lib/ that also contains the
      architecture optimizations, similar to the library work Eric
      Biggers has done for the CRC and crypto libraries later. After
      that it changes to better calling conventions that allow for
      smarter architecture implementations (although none is contained
      here yet), and uses static_call to avoid indirection function call
      overhead"

 - "lib/list_sort: Clean up list_sort() scheduling workarounds"
   (Kuan-Wei Chiu)

   Clean up this library code by removing a hacky thing which was added
   for UBIFS, which UBIFS doesn't actually need

 - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)

   Fix a few bugs in the scatterlist code, add in-kernel tests for the
   now-fixed bugs and fix a leak in the test itself

 - "kdump: Enable LUKS-encrypted dump target support in ARM64 and
   PowerPC" (Coiby Xu)

   Enable support of the LUKS-encrypted device dump target on arm64 and
   powerpc

 - "ocfs2: consolidate extent list validation into block read callbacks"
   (Joseph Qi)

   Cleanup, simplify, and make more robust ocfs2's validation of extent
   list fields (Kernel test robot loves mounting corrupted fs images!)

* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
  ocfs2: validate group add input before caching
  ocfs2: validate bg_bits during freefrag scan
  ocfs2: fix listxattr handling when the buffer is full
  doc: watchdog: fix typos etc
  update Sean's email address
  ocfs2: use get_random_u32() where appropriate
  ocfs2: split transactions in dio completion to avoid credit exhaustion
  ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
  ocfs2: validate extent block list fields during block read
  ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
  ocfs2: validate dx_root extent list fields during block read
  ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
  ocfs2: handle invalid dinode in ocfs2_group_extend
  .get_maintainer.ignore: add Askar
  ocfs2: validate bg_list extent bounds in discontig groups
  checkpatch: exclude forward declarations of const structs
  tools/accounting: handle truncated taskstats netlink messages
  taskstats: set version in TGID exit notifications
  ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
  arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
  ...

arch, mm: consolidate empty_zero_page

2026-04-05T20:53:01+00:00

Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.

Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).

Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).

ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.

For the most part the change is mechanical, with these being noteworthy:

* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
  parameters. Switching to a generic empty_zero_page removes the aliasing
  and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
  ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
  although they do not support zero page coloring and having it in BSS
  will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
  can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
  but instead of allocating it, make mem_map_zero point to
  empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
  Rename the parameters page to boot_params_page and let sh use the generic
  empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page

	/* A handy thing to have if one has the RAM. Declared in head.S */

  that unfortunately had to go :)

Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) 
Acked-by: Helge Deller 		[parisc]
Tested-by: Helge Deller 		[parisc]
Reviewed-by: Christophe Leroy (CS GROUP) 
Acked-by: Dave Hansen 
Acked-by: Catalin Marinas 
Acked-by: Magnus Lindholm 	[alpha]
Acked-by: Dinh Nguyen 	[nios2]
Acked-by: Andreas Larsson 	[sparc]
Acked-by: David Hildenbrand (Arm) 
Acked-by: Liam R. Howlett 
Cc: "Borislav Petkov (AMD)" 
Cc: David S. Miller 
Cc: Geert Uytterhoeven 
Cc: Guo Ren 
Cc: Huacai Chen 
Cc: Ingo Molnar 
Cc: Johannes Berg 
Cc: John Paul Adrian Glaubitz 
Cc: Lorenzo Stoakes 
Cc: Madhavan Srinivasan 
Cc: Matt Turner 
Cc: Max Filippov 
Cc: Michael Ellerman 
Cc: Michal Hocko 
Cc: Michal Simek 
Cc: Palmer Dabbelt 
Cc: Richard Weinberger 
Cc: Russell King 
Cc: Stafford Horne 
Cc: Suren Baghdasaryan 
Cc: Vineet Gupta 
Cc: Vlastimil Babka 
Cc: Will Deacon 
Signed-off-by: Andrew Morton

xor: make xor.ko self-contained in lib/raid/

2026-04-03T06:36:20+00:00

Move the asm/xor.h headers to lib/raid/xor/$(SRCARCH)/xor_arch.h and
include/linux/raid/xor_impl.h to lib/raid/xor/xor_impl.h so that the
xor.ko module implementation is self-contained in lib/raid/.

As this remove the asm-generic mechanism a new kconfig symbol is added to
indicate that a architecture-specific implementations exists, and
xor_arch.h should be included.

Link: https://lkml.kernel.org/r/20260327061704.3707577-22-hch@lst.de
Signed-off-by: Christoph Hellwig 
Reviewed-by: Eric Biggers 
Tested-by: Eric Biggers 
Cc: Albert Ou 
Cc: Alexander Gordeev 
Cc: Alexandre Ghiti 
Cc: Andreas Larsson 
Cc: Anton Ivanov 
Cc: Ard Biesheuvel 
Cc: Arnd Bergmann 
Cc: "Borislav Petkov (AMD)" 
Cc: Catalin Marinas 
Cc: Chris Mason 
Cc: Christian Borntraeger 
Cc: Dan Williams 
Cc: David S. Miller 
Cc: David Sterba 
Cc: Heiko Carstens 
Cc: Herbert Xu 
Cc: "H. Peter Anvin" 
Cc: Huacai Chen 
Cc: Ingo Molnar 
Cc: Jason A. Donenfeld 
Cc: Johannes Berg 
Cc: Li Nan 
Cc: Madhavan Srinivasan 
Cc: Magnus Lindholm 
Cc: Matt Turner 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Palmer Dabbelt 
Cc: Richard Henderson 
Cc: Richard Weinberger 
Cc: Russell King 
Cc: Song Liu 
Cc: Sven Schnelle 
Cc: Ted Ts'o 
Cc: Vasily Gorbik 
Cc: WANG Xuerui 
Cc: Will Deacon 
Signed-off-by: Andrew Morton

um/xor: cleanup xor.h

2026-04-03T06:36:16+00:00

Since commit c055e3eae0f1 ("crypto: xor - use ktime for template
benchmarking") the benchmarking works just fine even for TT_MODE_INFCPU,
so drop the workarounds.  Note that for CPUs supporting AVX2, which
includes almost everything built in the last 10 years, the AVX2
implementation is forced anyway.

CONFIG_X86_32 is always correctly set for UM in arch/x86/um/Kconfig, so
don't override it either.

Link: https://lkml.kernel.org/r/20260327061704.3707577-5-hch@lst.de
Signed-off-by: Christoph Hellwig 
Acked-by: Richard Weinberger 
Reviewed-by: Eric Biggers 
Tested-by: Eric Biggers 
Cc: Albert Ou 
Cc: Alexander Gordeev 
Cc: Alexandre Ghiti 
Cc: Andreas Larsson 
Cc: Anton Ivanov 
Cc: Ard Biesheuvel 
Cc: Arnd Bergmann 
Cc: "Borislav Petkov (AMD)" 
Cc: Catalin Marinas 
Cc: Chris Mason 
Cc: Christian Borntraeger 
Cc: Dan Williams 
Cc: David S. Miller 
Cc: David Sterba 
Cc: Heiko Carstens 
Cc: Herbert Xu 
Cc: "H. Peter Anvin" 
Cc: Huacai Chen 
Cc: Ingo Molnar 
Cc: Jason A. Donenfeld 
Cc: Johannes Berg 
Cc: Li Nan 
Cc: Madhavan Srinivasan 
Cc: Magnus Lindholm 
Cc: Matt Turner 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Palmer Dabbelt 
Cc: Richard Henderson 
Cc: Russell King 
Cc: Song Liu 
Cc: Sven Schnelle 
Cc: Ted Ts'o 
Cc: Vasily Gorbik 
Cc: WANG Xuerui 
Cc: Will Deacon 
Signed-off-by: Andrew Morton

um: Fix pte_read() and pte_exec() for kernel mappings

2026-03-21T09:42:24+00:00

The pte_read() and pte_exec() helpers are only used during the TLB
sync to determine the read/exec permissions for mmap. However, for
kernel mappings, they will always return 0. This leads to kern_map()
having to unconditionally set the exec flag to 1 and the read flag
unexpectedly always being 0. Remove the unnecessary check for the
_PAGE_USER bit in these helpers to ensure that the kernel mapping
permissions can be correctly determined.

Signed-off-by: Tiwei Bie 
Link: https://patch.msgid.link/20260302235224.1915380-3-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg

treewide: provide a generic clear_user_page() variant

2026-01-21T03:24:39+00:00

Patch series "mm: folio_zero_user: clear page ranges", v11.

This series adds clearing of contiguous page ranges for hugepages.

The series improves on the current discontiguous clearing approach in two
ways:

  - clear pages in a contiguous fashion.
  - use batched clearing via clear_pages() wherever exposed.

The first is useful because it allows us to make much better use of
hardware prefetchers.

The second, enables advertising the real extent to the processor.  Where
specific instructions support it (ex.  string instructions on x86; "mops"
on arm64 etc), a processor can optimize based on this because, instead of
seeing a sequence of 8-byte stores, or a sequence of 4KB pages, it sees a
larger unit being operated on.

For instance, AMD Zen uarchs (for extents larger than LLC-size) switch to
a mode where they start eliding cacheline allocation.  This is helpful not
just because it results in higher bandwidth, but also because now the
cache is not evicting useful cachelines and replacing them with zeroes.

Demand faulting a 64GB region shows performance improvement:

 $ perf bench mem mmap -p $pg-sz -f demand -s 64GB -l 5

                       baseline              +series
                   (GBps +- %stdev)      (GBps +- %stdev)

   pg-sz=2MB       11.76 +- 1.10%        25.34 +- 1.18% [*]   +115.47%  	preempt=*

   pg-sz=1GB       24.85 +- 2.41%        39.22 +- 2.32%       + 57.82%  	preempt=none|voluntary
   pg-sz=1GB         (similar)           52.73 +- 0.20% [#]   +112.19%  	preempt=full|lazy

 [*] This improvement is because switching to sequential clearing
  allows the hardware prefetchers to do a much better job.

 [#] For pg-sz=1GB a large part of the improvement is because of the
  cacheline elision mentioned above. preempt=full|lazy improves upon
  that because, not needing explicit invocations of cond_resched() to
  ensure reasonable preemption latency, it can clear the full extent
  as a single unit. In comparison the maximum extent used for
  preempt=none|voluntary is PROCESS_PAGES_NON_PREEMPT_BATCH (32MB).

  When provided the full extent the processor forgoes allocating
  cachelines on this path almost entirely.

  (The hope is that eventually, in the fullness of time, the lazy
   preemption model will be able to do the same job that none or
   voluntary models are used for, allowing us to do away with
   cond_resched().)

Raghavendra also tested previous version of the series on AMD Genoa and
sees similar improvement [1] with preempt=lazy.

  $ perf bench mem map -p $page-size -f populate -s 64GB -l 10

                    base               patched              change
   pg-sz=2MB       12.731939 GB/sec    26.304263 GB/sec     106.6%
   pg-sz=1GB       26.232423 GB/sec    61.174836 GB/sec     133.2%


This patch (of 8):

Let's drop all variants that effectively map to clear_page() and provide
it in a generic variant instead.

We'll use the macro clear_user_page to indicate whether an architecture
provides it's own variant.

Also, clear_user_page() is only called from the generic variant of
clear_user_highpage(), so define it only if the architecture does not
provide a clear_user_highpage().  And, for simplicity define it in
linux/highmem.h.

Note that for parisc, clear_page() and clear_user_page() map to
clear_page_asm(), so we can just get rid of the custom clear_user_page()
implementation.  There is a clear_user_page_asm() function on parisc, that
seems to be unused.  Not sure what's up with that.

Link: https://lkml.kernel.org/r/20260107072009.1615991-1-ankur.a.arora@oracle.com
Link: https://lkml.kernel.org/r/20260107072009.1615991-2-ankur.a.arora@oracle.com
Signed-off-by: David Hildenbrand 
Co-developed-by: Ankur Arora 
Signed-off-by: Ankur Arora 
Cc: Andy Lutomirski 
Cc: Ankur Arora 
Cc: "Borislav Petkov (AMD)" 
Cc: Boris Ostrovsky 
Cc: David Hildenbrand 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Konrad Rzessutek Wilk 
Cc: Lance Yang 
Cc: "Liam R. Howlett" 
Cc: Li Zhe 
Cc: Lorenzo Stoakes 
Cc: Mateusz Guzik 
Cc: Matthew Wilcox (Oracle) 
Cc: Michal Hocko 
Cc: Mike Rapoport 
Cc: Peter Zijlstra 
Cc: Raghavendra K T 
Cc: Suren Baghdasaryan 
Cc: Thomas Gleixner 
Cc: Vlastimil Babka 
Signed-off-by: Andrew Morton