linux-toradex.git/kernel/kexec_core.c, branch v7.0-rc7

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using

    git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook

kernel/kexec: fix IMA when allocation happens in CMA area

2025-12-23T19:23:14+00:00

*** Bug description ***

When I tested kexec with the latest kernel, I ran into the following warning:

[   40.712410] ------------[ cut here ]------------
[   40.712576] WARNING: CPU: 2 PID: 1562 at kernel/kexec_core.c:1001 kimage_map_segment+0x144/0x198
[...]
[   40.816047] Call trace:
[   40.818498]  kimage_map_segment+0x144/0x198 (P)
[   40.823221]  ima_kexec_post_load+0x58/0xc0
[   40.827246]  __do_sys_kexec_file_load+0x29c/0x368
[...]
[   40.855423] ---[ end trace 0000000000000000 ]---

*** How to reproduce ***

This bug is only triggered when the kexec target address is allocated in
the CMA area. If no CMA area is reserved in the kernel, use the "cma="
option in the kernel command line to reserve one.

*** Root cause ***
The commit 07d24902977e ("kexec: enable CMA based contiguous
allocation") allocates the kexec target address directly on the CMA area
to avoid copying during the jump. In this case, there is no IND_SOURCE
for the kexec segment.  But the current implementation of
kimage_map_segment() assumes that IND_SOURCE pages exist and map them
into a contiguous virtual address by vmap().

*** Solution ***
If IMA segment is allocated in the CMA area, use its page_address()
directly.

Link: https://lkml.kernel.org/r/20251216014852.8737-2-piliu@redhat.com
Fixes: 07d24902977e ("kexec: enable CMA based contiguous allocation")
Signed-off-by: Pingfan Liu 
Acked-by: Baoquan He 
Cc: Alexander Graf 
Cc: Steven Chen 
Cc: Mimi Zohar 
Cc: Roberto Sassu 
Cc: 
Signed-off-by: Andrew Morton

kernel/kexec: change the prototype of kimage_map_segment()

2025-12-23T19:23:13+00:00

The kexec segment index will be required to extract the corresponding
information for that segment in kimage_map_segment().  Additionally,
kexec_segment already holds the kexec relocation destination address and
size.  Therefore, the prototype of kimage_map_segment() can be changed.

Link: https://lkml.kernel.org/r/20251216014852.8737-1-piliu@redhat.com
Fixes: 07d24902977e ("kexec: enable CMA based contiguous allocation")
Signed-off-by: Pingfan Liu 
Acked-by: Baoquan He 
Cc: Mimi Zohar 
Cc: Roberto Sassu 
Cc: Alexander Graf 
Cc: Steven Chen 
Cc: 
Signed-off-by: Andrew Morton

kexec: move sysfs entries to /sys/kernel/kexec

2025-11-27T22:24:42+00:00

Patch series "kexec: reorganize kexec and kdump sysfs", v6.

All existing kexec and kdump sysfs entries are moved to a new location,
/sys/kernel/kexec, to keep /sys/kernel/ clean and better organized.
Symlinks are created at the old locations for backward compatibility and
can be removed in the future [01/03].

While doing this cleanup, the old kexec and kdump sysfs entries are
marked as deprecated in the existing ABI documentation [02/03]. This
makes it clear that these older interfaces should no longer be used.
New ABI documentation is added to describe the reorganized interfaces
[03/03], so users and tools can rely on the updated sysfs interfaces
going forward.


This patch (of 3):

Several kexec and kdump sysfs entries are currently placed directly under
/sys/kernel/, which clutters the directory and makes it harder to identify
unrelated entries.  To improve organization and readability, these entries
are now moved under a dedicated directory, /sys/kernel/kexec.

The following sysfs moved under new kexec sysfs node
+------------------------------------+------------------+
|    Old sysfs name         |     New sysfs name        |
|  (under /sys/kernel)      | (under /sys/kernel/kexec) |
+---------------------------+---------------------------+
| kexec_loaded              | loaded                    |
+---------------------------+---------------------------+
| kexec_crash_loaded        | crash_loaded              |
+---------------------------+---------------------------+
| kexec_crash_size          | crash_size                |
+---------------------------+---------------------------+
| crash_elfcorehdr_size     | crash_elfcorehdr_size     |
+---------------------------+---------------------------+
| kexec_crash_cma_ranges    | crash_cma_ranges          |
+---------------------------+---------------------------+

For backward compatibility, symlinks are created at the old locations so
that existing tools and scripts continue to work.  These symlinks can be
removed in the future once users have switched to the new path.

While creating symlinks, entries are added in /sys/kernel/ that point to
their new locations under /sys/kernel/kexec/.  If an error occurs while
adding a symlink, it is logged but does not stop initialization of the
remaining kexec sysfs symlinks.

The /sys/kernel/
entry is now controlled by CONFIG_CRASH_DUMP instead of
CONFIG_VMCORE_INFO, as CONFIG_CRASH_DUMP also enables CONFIG_VMCORE_INFO.

Link: https://lkml.kernel.org/r/20251118114507.1769455-1-sourabhjain@linux.ibm.com
Link: https://lkml.kernel.org/r/20251118114507.1769455-2-sourabhjain@linux.ibm.com
Signed-off-by: Sourabh Jain 
Acked-by: Baoquan He 
Cc: Aditya Gupta 
Cc: Dave Young 
Cc: Hari Bathini 
Cc: Jiri Bohac 
Cc: Madhavan Srinivasan 
Cc: Mahesh J Salgaonkar 
Cc: Pingfan Liu 
Cc: Ritesh Harjani (IBM) 
Cc: Shivang Upadhyay 
Cc: Vivek Goyal 
Signed-off-by: Andrew Morton

kexec: call liveupdate_reboot() before kexec

2025-11-27T22:24:38+00:00

Modify the kernel_kexec() to call liveupdate_reboot().

This ensures that the Live Update Orchestrator is notified just before the
kernel executes the kexec jump.  The liveupdate_reboot() function triggers
the final freeze event, allowing participating FDs perform last-minute
check or state saving within the blackout window.

If liveupdate_reboot() returns an error (indicating a failure during LUO
finalization), the kexec operation is aborted to prevent proceeding with
an inconsistent state.  An error is returned to user.

Link: https://lkml.kernel.org/r/20251125165850.3389713-4-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin 
Reviewed-by: Mike Rapoport (Microsoft) 
Reviewed-by: Pratyush Yadav 
Tested-by: David Matlack 
Cc: Aleksander Lobakin 
Cc: Alexander Graf 
Cc: Alice Ryhl 
Cc: Andriy Shevchenko 
Cc: anish kumar 
Cc: Anna Schumaker 
Cc: Bartosz Golaszewski 
Cc: Bjorn Helgaas 
Cc: Borislav Betkov 
Cc: Chanwoo Choi 
Cc: Chen Ridong 
Cc: Chris Li 
Cc: Christian Brauner 
Cc: Daniel Wagner 
Cc: Danilo Krummrich 
Cc: Dan Williams 
Cc: David Hildenbrand 
Cc: David Jeffery 
Cc: David Rientjes 
Cc: Greg Kroah-Hartman 
Cc: Guixin Liu 
Cc: "H. Peter Anvin" 
Cc: Hugh Dickins 
Cc: Ilpo Järvinen 
Cc: Ingo Molnar 
Cc: Ira Weiny 
Cc: Jann Horn 
Cc: Jason Gunthorpe 
Cc: Jens Axboe 
Cc: Joanthan Cameron 
Cc: Joel Granados 
Cc: Johannes Weiner 
Cc: Jonathan Corbet 
Cc: Lennart Poettering 
Cc: Leon Romanovsky 
Cc: Leon Romanovsky 
Cc: Lukas Wunner 
Cc: Marc Rutland 
Cc: Masahiro Yamada 
Cc: Matthew Maurer 
Cc: Miguel Ojeda 
Cc: Myugnjoo Ham 
Cc: Parav Pandit 
Cc: Pratyush Yadav 
Cc: Randy Dunlap 
Cc: Roman Gushchin 
Cc: Saeed Mahameed 
Cc: Samiullah Khawaja 
Cc: Song Liu 
Cc: Steven Rostedt 
Cc: Stuart Hayes 
Cc: Tejun Heo 
Cc: Thomas Gleinxer 
Cc: Thomas Weißschuh 
Cc: Vincent Guittot 
Cc: William Tu 
Cc: Yoann Congal 
Cc: Zhu Yanjun 
Cc: Zijun Hu 
Signed-off-by: Andrew Morton

kexec_core: remove superfluous page offset handling in segment loading

2025-11-12T18:00:13+00:00

During kexec_segment loading, when copying the content of the segment
(i.e.  kexec_segment::kbuf or kexec_segment::buf) to its associated pages,
kimage_load_{cma,normal,crash}_segment handle the case where the physical
address of the segment is not page aligned, e.g.  in
kimage_load_normal_segment:

	page = kimage_alloc_page(image, GFP_HIGHUSER, maddr);
	// ...
	ptr = kmap_local_page(page);
	// ...
	ptr += maddr & ~PAGE_MASK;
	mchunk = min_t(size_t, mbytes,
		PAGE_SIZE - (maddr & ~PAGE_MASK));
	// ^^^^ Non page-aligned segments handled here ^^^
	// ...
	if (image->file_mode)
		memcpy(ptr, kbuf, uchunk);
	else
		result = copy_from_user(ptr, buf, uchunk);

(similar logic is present in kimage_load_{cma,crash}_segment).

This is actually not needed because, prior to their loading, all
kexec_segments first go through a vetting step in
`sanity_check_segment_list`, which rejects any segment that is not
page-aligned:

	for (i = 0; i < nr_segments; i++) {
		unsigned long mstart, mend;
		mstart = image->segment[i].mem;
		mend   = mstart + image->segment[i].memsz;
		// ...
		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
			return -EADDRNOTAVAIL;
		// ...
	}

In case `sanity_check_segment_list` finds a non-page aligned the whole
kexec load is aborted and no segment is loaded.

This means that `kimage_load_{cma,normal,crash}_segment` never actually
have to handle non page-aligned segments and `(maddr & ~PAGE_MASK) == 0`
is always true no matter if the segment is coming from a file (i.e. 
`kexec_file_load` syscall), from a user-space buffer (i.e.  `kexec_load`
syscall) or created by the kernel through `kexec_add_buffer`.  In the
latter case, `kexec_add_buffer` actually enforces the page alignment:

	/* Ensure minimum alignment needed for segments. */
	kbuf->memsz = ALIGN(kbuf->memsz, PAGE_SIZE);
	kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE);

[jbouron@amazon.com: v3]
  Link: https://lkml.kernel.org/r/20251024155009.39502-1-jbouron@amazon.com
Link: https://lkml.kernel.org/r/20250929160220.47616-1-jbouron@amazon.com
Signed-off-by: Justinien Bouron 
Reviewed-by: Gunnar Kudrjavets 
Reviewed-by: Andy Shevchenko 
Acked-by: Baoquan He 
Cc: Alexander Graf 
Cc: Marcos Paulo de Souza 
Cc: Mario Limonciello 
Cc: Petr Mladek 
Cc: Yan Zhao 
Signed-off-by: Andrew Morton

kexec_core: remove redundant 0 value initialization

2025-09-14T00:32:49+00:00

The kimage struct is already zeroed by kzalloc(). It's redundant to
initialize image->head to 0.

Link: https://lkml.kernel.org/r/20250825123307.306634-1-liaoyuanhong@vivo.com
Signed-off-by: Liao Yuanhong 
Signed-off-by: Andrew Morton

Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2025-08-03T23:23:09+00:00

Pull non-MM updates from Andrew Morton:
 "Significant patch series in this pull request:

   - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
     us closer to being able to remove page->mapping

   - "relayfs: misc changes" (Jason Xing) does some maintenance and
     minor feature addition work in relayfs

   - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
     us from static preallocation of the kdump crashkernel's working
     memory over to dynamic allocation. So the difficulty of a-priori
     estimation of the second kernel's needs is removed and the first
     kernel obtains extra memory

   - "generalize panic_print's dump function to be used by other
     kernel parts" (Feng Tang) implements some consolidation and
     rationalization of the various ways in which a failing kernel
     splats information at the operator

* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
  tools/getdelays: add backward compatibility for taskstats version
  kho: add test for kexec handover
  delaytop: enhance error logging and add PSI feature description
  samples: Kconfig: fix spelling mistake "instancess" -> "instances"
  fat: fix too many log in fat_chain_add()
  scripts/spelling.txt: add notifer||notifier to spelling.txt
  xen/xenbus: fix typo "notifer"
  net: mvneta: fix typo "notifer"
  drm/xe: fix typo "notifer"
  cxl: mce: fix typo "notifer"
  KVM: x86: fix typo "notifer"
  MAINTAINERS: add maintainers for delaytop
  ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
  ucount: fix atomic_long_inc_below() argument type
  kexec: enable CMA based contiguous allocation
  stackdepot: make max number of pools boot-time configurable
  lib/xxhash: remove unused functions
  init/Kconfig: restore CONFIG_BROKEN help text
  lib/raid6: update recov_rvv.c zero page usage
  docs: update docs after introducing delaytop
  ...

kexec: enable CMA based contiguous allocation

2025-08-02T19:01:38+00:00

When booting a new kernel with kexec_file, the kernel picks a target
location that the kernel should live at, then allocates random pages,
checks whether any of those patches magically happens to coincide with a
target address range and if so, uses them for that range.

For every page allocated this way, it then creates a page list that the
relocation code - code that executes while all CPUs are off and we are
just about to jump into the new kernel - copies to their final memory
location.  We can not put them there before, because chances are pretty
good that at least some page in the target range is already in use by the
currently running Linux environment.  Copying is happening from a single
CPU at RAM rate, which takes around 4-50 ms per 100 MiB.

All of this is inefficient and error prone.

To successfully kexec, we need to quiesce all devices of the outgoing
kernel so they don't scribble over the new kernel's memory.  We have seen
cases where that does not happen properly (*cough* GIC *cough*) and hence
the new kernel was corrupted.  This started a month long journey to root
cause failing kexecs to eventually see memory corruption, because the new
kernel was corrupted severely enough that it could not emit output to tell
us about the fact that it was corrupted.  By allocating memory for the
next kernel from a memory range that is guaranteed scribbling free, we can
boot the next kernel up to a point where it is at least able to detect
corruption and maybe even stop it before it becomes severe.  This
increases the chance for successful kexecs.

Since kexec got introduced, Linux has gained the CMA framework which can
perform physically contiguous memory mappings, while keeping that memory
available for movable memory when it is not needed for contiguous
allocations.  The default CMA allocator is for DMA allocations.

This patch adds logic to the kexec file loader to attempt to place the
target payload at a location allocated from CMA.  If successful, it uses
that memory range directly instead of creating copy instructions during
the hot phase.  To ensure that there is a safety net in case anything goes
wrong with the CMA allocation, it also adds a flag for user space to force
disable CMA allocations.

Using CMA allocations has two advantages:

  1) Faster by 4-50 ms per 100 MiB. There is no more need to copy in the
     hot phase.
  2) More robust. Even if by accident some page is still in use for DMA,
     the new kernel image will be safe from that access because it resides
     in a memory region that is considered allocated in the old kernel and
     has a chance to reinitialize that component.

Link: https://lkml.kernel.org/r/20250610085327.51817-1-graf@amazon.com
Signed-off-by: Alexander Graf 
Acked-by: Baoquan He 
Reviewed-by: Pasha Tatashin 
Cc: Zhongkun He 
Signed-off-by: Andrew Morton