summaryrefslogtreecommitdiff
path: root/init/Kconfig
AgeCommit message (Collapse)Author
2025-01-31Merge tag 'kbuild-v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Support multiple hook locations for maint scripts of Debian package - Remove 'cpio' from the build tool requirement - Introduce gendwarfksyms tool, which computes CRCs for export symbols based on the DWARF information - Support CONFIG_MODVERSIONS for Rust - Resolve all conflicts in the genksyms parser - Fix several syntax errors in genksyms * tag 'kbuild-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (64 commits) kbuild: fix Clang LTO with CONFIG_OBJTOOL=n kbuild: Strip runtime const RELA sections correctly kconfig: fix memory leak in sym_warn_unmet_dep() kconfig: fix file name in warnings when loading KCONFIG_DEFCONFIG_LIST genksyms: fix syntax error for attribute before init-declarator genksyms: fix syntax error for builtin (u)int*x*_t types genksyms: fix syntax error for attribute after 'union' genksyms: fix syntax error for attribute after 'struct' genksyms: fix syntax error for attribute after abstact_declarator genksyms: fix syntax error for attribute before nested_declarator genksyms: fix syntax error for attribute before abstract_declarator genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier genksyms: record attributes consistently for init-declarator genksyms: restrict direct-declarator to take one parameter-type-list genksyms: restrict direct-abstract-declarator to take one parameter-type-list genksyms: remove Makefile hack genksyms: fix last 3 shift/reduce conflicts genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts genksyms: reduce type_qualifier directly to decl_specifier genksyms: rename cvar_qualifier to type_qualifier ...
2025-01-27Merge tag 'drm-next-2025-01-27' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Simona Vetter: "cgroup: - fix Koncfig fallout from new dmem controller Driver Changes: - v3d NULL pointer regression fix in fence signalling race - virtio: uaf in dma_buf free path - xlnx: fix kerneldoc - bochs: fix double-free on driver removal - zynqmp: add missing locking to DP bridge driver - amdgpu fixes all over: - documentation, display, sriov, various hw block drivers - use drm/sched helper - mark some debug module options as unsafe - amdkfd: mark some debug module options as unsafe, trap handler updates, fix partial migration handling DRM core: - fix fbdev Kconfig select rules, improve tiled-based display support" * tag 'drm-next-2025-01-27' of https://gitlab.freedesktop.org/drm/kernel: (40 commits) drm/amd/display: Optimize cursor position updates drm/amd/display: Add hubp cache reset when powergating drm/amd/amdgpu: Enable scratch data dump for mes 12 drm/amd: Clarify kdoc for amdgpu.gttsize drm/amd/amdgpu: Prevent null pointer dereference in GPU bandwidth calculation drm/amd/display: Fix error pointers in amdgpu_dm_crtc_mem_type_changed drm/amdgpu: fix ring timeout issue in gfx10 sr-iov environment drm/amd/pm: Fix smu v13.0.6 caps initialization drm/amd/pm: Refactor SMU 13.0.6 SDMA reset firmware version checks revert "drm/amdgpu/pm: add definition PPSMC_MSG_ResetSDMA2" revert "drm/amdgpu/pm: Implement SDMA queue reset for different asic" drm/amd/pm: Add capability flags for SMU v13.0.6 drm/amd/display: fix SUBVP DC_DEBUG_MASK documentation drm/amd/display: fix CEC DC_DEBUG_MASK documentation drm/amdgpu: fix the PCIe lanes reporting in the INFO IOCTL drm/amdgpu: cache gpu pcie link width drm/amd/display: mark static functions noinline_for_stack drm/amdkfd: Clear MODE.VSKIP in gfx9 trap handler drm/amdgpu: Refine ip detection log message drm/amdgpu: Add handler for SDMA context empty ...
2025-01-21Merge tag 'rust-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Finish the move to custom FFI integer types started in the previous cycle and finally map 'long' to 'isize' and 'char' to 'u8'. Do a few cleanups on top thanks to that. - Start to use 'derive(CoercePointee)' on Rust >= 1.84.0. This is a major milestone on the path to build the kernel using only stable Rust features. In particular, previously we were using the unstable features 'coerce_unsized', 'dispatch_from_dyn' and 'unsize', and now we will use the new 'derive_coerce_pointee' one, which is on track to stabilization. This new feature is a macro that essentially expands into code that internally uses the unstable features that we were using before, without having to expose those. With it, stable Rust users, including the kernel, will be able to build custom smart pointers that work with trait objects, e.g.: fn f(p: &Arc<dyn Display>) { pr_info!("{p}\n"); } let a: Arc<dyn Display> = Arc::new(42i32, GFP_KERNEL)?; let b: Arc<dyn Display> = Arc::new("hello there", GFP_KERNEL)?; f(&a); // Prints "42". f(&b); // Prints "hello there". Together with the 'arbitrary_self_types' feature that we started using in the previous cycle, using our custom smart pointers like 'Arc' will eventually only rely in stable Rust. - Introduce 'PROCMACROLDFLAGS' environment variable to allow to link Rust proc macros using different flags than those used for linking Rust host programs (e.g. when 'rustc' uses a different C library than the host programs' one), which Android needs. - Help kernel builds under macOS with Rust enabled by accomodating other naming conventions for dynamic libraries (i.e. '.so' vs. '.dylib') which are used for Rust procedural macros. The actual support for macOS (i.e. the rest of the pieces needed) is provided out-of-tree by others, following the policy used for other parts of the kernel by Kbuild. - Run Clippy for 'rusttest' code too and clean the bits it spotted. - Provide Clippy with the minimum supported Rust version to improve the suggestions it gives. - Document 'bindgen' 0.71.0 regression. 'kernel' crate: - 'build_error!': move users of the hidden function to the documented macro, prevent such uses in the future by moving the function elsewhere and add the macro to the prelude. - 'types' module: add improved version of 'ForeignOwnable::borrow_mut' (which was removed in the past since it was problematic); change 'ForeignOwnable' pointer type to '*mut'. - 'alloc' module: implement 'Display' for 'Box' and align the 'Debug' implementation to it; add example (doctest) for 'ArrayLayout::new()' - 'sync' module: document 'PhantomData' in 'Arc'; use 'NonNull::new_unchecked' in 'ForeignOwnable for Arc' impl. - 'uaccess' module: accept 'Vec's with different allocators in 'UserSliceReader::read_all'. - 'workqueue' module: enable run-testing a couple more doctests. - 'error' module: simplify 'from_errno()'. - 'block' module: fix formatting in code documentation (a lint to catch these is being implemented). - Avoid 'unwrap()'s in doctests, which also improves the examples by showing how kernel code is supposed to be written. - Avoid 'as' casts with 'cast{,_mut}' calls which are a bit safer. And a few other cleanups" * tag 'rust-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (32 commits) kbuild: rust: add PROCMACROLDFLAGS rust: uaccess: generalize userSliceReader to support any Vec rust: kernel: add improved version of `ForeignOwnable::borrow_mut` rust: kernel: reorder `ForeignOwnable` items rust: kernel: change `ForeignOwnable` pointer to mut rust: arc: split unsafe block, add missing comment rust: types: avoid `as` casts rust: arc: use `NonNull::new_unchecked` rust: use derive(CoercePointee) on rustc >= 1.84.0 rust: alloc: add doctest for `ArrayLayout::new()` rust: init: update `stack_try_pin_init` examples rust: error: import `kernel`'s `LayoutError` instead of `core`'s rust: str: replace unwraps with question mark operators rust: page: remove unnecessary helper function from doctest rust: rbtree: remove unwrap in asserts rust: init: replace unwraps with question mark operators rust: use host dylib naming convention to support macOS rust: add `build_error!` to the prelude rust: kernel: move `build_error` hidden function to prevent mistakes rust: use the `build_error!` macro, not the hidden function ...
2025-01-18rust: Use gendwarfksyms + extended modversions for CONFIG_MODVERSIONSSami Tolvanen
Previously, two things stopped Rust from using MODVERSIONS: 1. Rust symbols are occasionally too long to be represented in the original versions table 2. Rust types cannot be properly hashed by the existing genksyms approach because: * Looking up type definitions in Rust is more complex than C * Type layout is potentially dependent on the compiler in Rust, not just the source type declaration. CONFIG_EXTENDED_MODVERSIONS addresses the first point, and CONFIG_GENDWARFKSYMS the second. If Rust wants to use MODVERSIONS, allow it to do so by selecting both features. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Co-developed-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-01-17cgroup/rdma: Drop bogus PAGE_COUNTER selectGeert Uytterhoeven
When adding the Device memory controller (DMEM), "select PAGE_COUNTER" was added to CGROUP_RDMA, presumably instead of CGROUP_DMEM. While commit e33b51499a0a6bca ("cgroup/dmem: Select PAGE_COUNTER") added the missing select to CGROUP_DMEM, the bogus select is still there. Remove it. Fixes: b168ed458ddecc17 ("kernel/cgroup: Add "dmem" memory accounting cgroup") Closes: https://lore.kernel.org/CAMuHMdUmPfahsnZwx2iB5yfh8rjjW25LNcnYujNBgcKotUXBNg@mail.gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Tejun Heo <tj@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/b4d462f038a2f895f30ae759928397c8183f6f7e.1737020925.git.geert+renesas@glider.be Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-15cgroup/dmem: Select PAGE_COUNTERMaxime Ripard
The dmem cgroup the page counting API implemented behing the PAGE_COUNTER kconfig option. However, it doesn't select it, resulting in potential build breakages. Select PAGE_COUNTER. Fixes: b168ed458dde ("kernel/cgroup: Add "dmem" memory accounting cgroup") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501111330.3VuUx8vf-lkp@intel.com/ Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250113092608.1349287-1-mripard@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-13rust: use derive(CoercePointee) on rustc >= 1.84.0Xiangfei Ding
The `kernel` crate relies on both `coerce_unsized` and `dispatch_from_dyn` unstable features. Alice Ryhl has proposed [1] the introduction of the unstable macro `SmartPointer` to reduce such dependence, along with a RFC patch [2]. Since Rust 1.81.0 this macro, later renamed to `CoercePointee` in Rust 1.84.0 [3], has been fully implemented with the naming discussion resolved. This feature is now on track to stabilization in the language. In order to do so, we shall start using this macro in the `kernel` crate to prove the functionality and utility of the macro as the justification of its stabilization. This patch makes this switch in such a way that the crate remains backward compatible with older Rust compiler versions, via the new Kconfig option `RUSTC_HAS_COERCE_POINTEE`. A minimal demonstration example is added to the `samples/rust/rust_print_main.rs` module. Link: https://rust-lang.github.io/rfcs/3621-derive-smart-pointer.html [1] Link: https://lore.kernel.org/all/20240823-derive-smart-pointer-v1-1-53769cd37239@google.com/ [2] Link: https://github.com/rust-lang/rust/pull/131284 [3] Signed-off-by: Xiangfei Ding <dingxiangfei2009@gmail.com> Reviewed-by: Fiona Behrens <me@kloenk.dev> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20241203205050.679106-2-dingxiangfei2009@gmail.com [ Fixed version to 1.84. Renamed option to `RUSTC_HAS_COERCE_POINTEE` to match `CC_HAS_*` ones. Moved up new config option, closer to the `CC_HAS_*` ones. Simplified Kconfig line. Fixed typos and slightly reworded example and commit. Added Link to PR. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-01-10rust: document `bindgen` 0.71.0 regressionMiguel Ojeda
`bindgen` 0.71.0 regressed [1] on the "`--version` requires header" issue which appeared in 0.69.0 first [2] and was fixed in 0.69.1. It has been fixed again in 0.71.1 [3]. Thus document it so that, when we upgrade the minimum past 0.69.0 in the future, we do not forget that we cannot remove the workaround until we arrive at 0.71.1 at least. Link: https://github.com/rust-lang/rust-bindgen/issues/3039 [1] Link: https://github.com/rust-lang/rust-bindgen/issues/2677 [2] Link: https://github.com/rust-lang/rust-bindgen/blob/main/CHANGELOG.md#v0711-2024-12-09 [3] Reviewed-by: Fiona Behrens <me@kloenk.dev> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20241209212544.1977065-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-01-06kernel/cgroup: Add "dmem" memory accounting cgroupMaarten Lankhorst
This code is based on the RDMA and misc cgroup initially, but now uses page_counter. It uses the same min/low/max semantics as the memory cgroup as a result. There's a small mismatch as TTM uses u64, and page_counter long pages. In practice it's not a problem. 32-bits systems don't really come with >=4GB cards and as long as we're consistently wrong with units, it's fine. The device page size may not be in the same units as kernel page size, and each region might also have a different page size (VRAM vs GART for example). The interface is simple: - Call dmem_cgroup_register_region() - Use dmem_cgroup_try_charge to check if you can allocate a chunk of memory, use dmem_cgroup__uncharge when freeing it. This may return an error code, or -EAGAIN when the cgroup limit is reached. In that case a reference to the limiting pool is returned. - The limiting cs can be used as compare function for dmem_cgroup_state_evict_valuable. - After having evicted enough, drop reference to limiting cs with dmem_cgroup_pool_state_put. This API allows you to limit device resources with cgroups. You can see the supported cards in /sys/fs/cgroup/dmem.capacity You need to echo +dmem to cgroup.subtree_control, and then you can partition device memory. Co-developed-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Co-developed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20241204143112.1250983-1-dev@lankhorst.se Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-11-25Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - The series "resource: A couple of cleanups" from Andy Shevchenko performs some cleanups in the resource management code - The series "Improve the copy of task comm" from Yafang Shao addresses possible race-induced overflows in the management of task_struct.comm[] - The series "Remove unnecessary header includes from {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a small fix to the list_sort library code and to its selftest - The series "Enhance min heap API with non-inline functions and optimizations" also from Kuan-Wei Chiu optimizes and cleans up the min_heap library code - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi finishes off nilfs2's folioification - The series "add detect count for hung tasks" from Lance Yang adds more userspace visibility into the hung-task detector's activity - Apart from that, singelton patches in many places - please see the individual changelogs for details * tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) gdb: lx-symbols: do not error out on monolithic build kernel/reboot: replace sprintf() with sysfs_emit() lib: util_macros_kunit: add kunit test for util_macros.h util_macros.h: fix/rework find_closest() macros Improve consistency of '#error' directive messages ocfs2: fix uninitialized value in ocfs2_file_read_iter() hung_task: add docs for hung_task_detect_count hung_task: add detect count for hung tasks dma-buf: use atomic64_inc_return() in dma_buf_getfile() fs/proc/kcore.c: fix coccinelle reported ERROR instances resource: avoid unnecessary resource tree walking in __region_intersects() ocfs2: remove unused errmsg function and table ocfs2: cluster: fix a typo lib/scatterlist: use sg_phys() helper checkpatch: always parse orig_commit in fixes tag nilfs2: convert metadata aops from writepage to writepages nilfs2: convert nilfs_recovery_copy_block() to take a folio nilfs2: convert nilfs_page_count_clean_buffers() to take a folio nilfs2: remove nilfs_writepage nilfs2: convert checkpoint file to be folio-based ...
2024-11-25Merge tag 'hardening-v6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - Disable __counted_by in Clang < 19.1.3 (Jan Hendrik Farr) - string_helpers: Silence output truncation warning (Bartosz Golaszewski) - compiler.h: Avoid needing BUILD_BUG_ON_ZERO() (Philipp Reisner) - MAINTAINERS: Add kernel hardening keywords __counted_by{_le|_be} (Thorsten Blum) * tag 'hardening-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: Compiler Attributes: disable __counted_by for clang < 19.1.3 compiler.h: Fix undefined BUILD_BUG_ON_ZERO() lib: string_helpers: silence snprintf() output truncation warning MAINTAINERS: Add kernel hardening keywords __counted_by{_le|_be}
2024-11-22Merge tag 'trace-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Addition of faultable tracepoints There's a tracepoint attached to both a system call entry and exit. This location is known to allow page faults. The tracepoints are called under an rcu_read_lock() which does not allow faults that can sleep. This limits the ability of tracepoint handlers to page fault in user space system call parameters. Now these tracepoints have been made "faultable", allowing the callbacks to fault in user space parameters and record them. Note, only the infrastructure has been implemented. The consumers (perf, ftrace, BPF) now need to have their code modified to allow faults. - Fix up of BPF code for the tracepoint faultable logic - Update tracepoints to use the new static branch API - Remove trace_*_rcuidle() variants and the SRCU protection they used - Remove unused TRACE_EVENT_FL_FILTERED logic - Replace strncpy() with strscpy() and memcpy() - Use replace per_cpu_ptr(smp_processor_id()) with this_cpu_ptr() - Fix perf events to not duplicate samples when tracing is enabled - Replace atomic64_add_return(1, counter) with atomic64_inc_return(counter) - Make stack trace buffer 4K instead of PAGE_SIZE - Remove TRACE_FLAG_IRQS_NOSUPPORT flag as it was never used - Get the true return address for function tracer when function graph tracer is also running. When function_graph trace is running along with function tracer, the parent function of the function tracer sometimes is "return_to_handler", which is the function graph trampoline to record the exit of the function. Use existing logic that calls into the fgraph infrastructure to find the real return address. - Remove (un)regfunc pointers out of tracepoint structure - Added last minute bug fix for setting pending modules in stack function filter. echo "write*:mod:ext3" > /sys/kernel/tracing/stack_trace_filter Would cause a kernel NULL dereference. - Minor clean ups * tag 'trace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (31 commits) ftrace: Fix regression with module command in stack_trace_filter tracing: Fix function name for trampoline ftrace: Get the true parent ip for function tracer tracing: Remove redundant check on field->field in histograms bpf: ensure RCU Tasks Trace GP for sleepable raw tracepoint BPF links bpf: decouple BPF link/attach hook and BPF program sleepable semantics bpf: put bpf_link's program when link is safe to be deallocated tracing: Replace strncpy() with strscpy() when copying comm tracing: Add might_fault() check in __DECLARE_TRACE_SYSCALL tracing: Fix syscall tracepoint use-after-free tracing: Introduce tracepoint_is_faultable() tracing: Introduce tracepoint extended structure tracing: Remove TRACE_FLAG_IRQS_NOSUPPORT tracing: Replace multiple deprecated strncpy with memcpy tracing: Make percpu stack trace buffer invariant to PAGE_SIZE tracing: Use atomic64_inc_return() in trace_clock_counter() trace/trace_event_perf: remove duplicate samples on the first tracepoint event tracing/bpf: Add might_fault check to syscall probes tracing/perf: Add might_fault check to syscall probes tracing/ftrace: Add might_fault check to syscall probes ...
2024-11-19Compiler Attributes: disable __counted_by for clang < 19.1.3Jan Hendrik Farr
This patch disables __counted_by for clang versions < 19.1.3 because of the two issues listed below. It does this by introducing CONFIG_CC_HAS_COUNTED_BY. 1. clang < 19.1.2 has a bug that can lead to __bdos returning 0: https://github.com/llvm/llvm-project/pull/110497 2. clang < 19.1.3 has a bug that can lead to __bdos being off by 4: https://github.com/llvm/llvm-project/pull/112636 Fixes: c8248faf3ca2 ("Compiler Attributes: counted_by: Adjust name and identifier expansion") Cc: stable@vger.kernel.org # 6.6.x: 16c31dd7fdf6: Compiler Attributes: counted_by: bump min gcc version Cc: stable@vger.kernel.org # 6.6.x: 2993eb7a8d34: Compiler Attributes: counted_by: fixup clang URL Cc: stable@vger.kernel.org # 6.6.x: 231dc3f0c936: lkdtm/bugs: Improve warning message for compilers without counted_by support Cc: stable@vger.kernel.org # 6.6.x Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/ Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20241029140036.577804-2-kernel@jfarr.cc Signed-off-by: Kees Cook <kees@kernel.org>
2024-11-05lib/Makefile: make union-find compilation conditional on CONFIG_CPUSETSKuan-Wei Chiu
Currently, cpuset is the only user of the union-find implementation. Compiling union-find in all configurations unnecessarily increases the code size when building the kernel without cgroup support. Modify the build system to compile union-find only when CONFIG_CPUSETS is enabled. Link: https://lore.kernel.org/lkml/1ccd6411-5002-4574-bb8e-3e64bba6a757@redhat.com/ Link: https://lkml.kernel.org/r/20241011141214.87096-1-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Suggested-by: Waiman Long <llong@redhat.com> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Xavier <xavier_qy@163.com> Cc: Zefan Li <lizefan.x@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-13cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERSAlice Ryhl
The HAVE_CFI_ICALL_NORMALIZE_INTEGERS option has some tricky conditions when KASAN or GCOV are turned on, as in that case we need some clang and rustc fixes [1][2] to avoid boot failures. The intent with the current setup is that you should be able to override the check and turn on the option if your clang/rustc has the fix. However, this override does not work in practice. Thus, use the new RUSTC_LLVM_VERSION to correctly implement the check for whether the fix is available. Additionally, remove KASAN_HW_TAGS from the list of incompatible options. The CFI_ICALL_NORMALIZE_INTEGERS option is incompatible with KASAN because LLVM will emit some constructors when using KASAN that are assigned incorrect CFI tags. These constructors are emitted due to use of -fsanitize=kernel-address or -fsanitize=kernel-hwaddress that are respectively passed when KASAN_GENERIC or KASAN_SW_TAGS are enabled. However, the KASAN_HW_TAGS option relies on hardware support for MTE instead and does not pass either flag. (Note also that KASAN_HW_TAGS does not `select CONSTRUCTORS`.) Link: https://github.com/llvm/llvm-project/pull/104826 [1] Link: https://github.com/rust-lang/rust/pull/129373 [2] Fixes: 4c66f8307ac0 ("cfi: encode cfi normalized integers + kasan/gcov bug in Kconfig") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20241010-icall-detect-vers-v1-2-8f114956aa88@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-13kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`Gary Guo
Each version of Rust supports a range of LLVM versions. There are cases where we want to gate a config on the LLVM version instead of the Rust version. Normalized cfi integer tags are one example [1]. The invocation of rustc-version is being moved from init/Kconfig to scripts/Kconfig.include for consistency with cc-version. Link: https://lore.kernel.org/all/20240925-cfi-norm-kasan-fix-v1-1-0328985cdf33@google.com/ [1] Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20241011114040.3900487-1-gary@garyguo.net [ Added missing `-llvm` to the Usage documentation. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-09tracing: Allow system call tracepoints to handle page faultsMathieu Desnoyers
Use Tasks Trace RCU to protect iteration of system call enter/exit tracepoint probes to allow those probes to handle page faults. In preparation for this change, all tracers registering to system call enter/exit tracepoints should expect those to be called with preemption enabled. This allows tracers to fault-in userspace system call arguments such as path strings within their probe callbacks. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-6-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-26cfi: encode cfi normalized integers + kasan/gcov bug in KconfigAlice Ryhl
There is a bug in the LLVM implementation of KASAN and GCOV that makes these options incompatible with the CFI_ICALL_NORMALIZE_INTEGERS option. The bug has already been fixed in llvm/clang [1] and rustc [2]. However, Kconfig currently has no way to gate features on the LLVM version inside rustc, so we cannot write down a precise `depends on` clause in this case. Instead, a `def_bool` option is defined for whether CFI_ICALL_NORMALIZE_INTEGERS is available, and its default value is set to false when GCOV or KASAN are turned on. End users using a patched clang/rustc can turn on the HAVE_CFI_ICALL_NORMALIZE_INTEGERS option directly to override this. An alternative solution is to inspect a binary created by clang or rustc to see whether the faulty CFI tags are in the binary. This would be a precise check, but it would involve hard-coding the *hashed* version of the CFI tag. This is because there's no way to get clang or rustc to output the unhased version of the CFI tag. Relying on the precise hashing algorithm using by CFI seems too fragile, so I have not pursued this option. Besides, this kind of hack is exactly what lead to the LLVM bug in the first place. If the CFI_ICALL_NORMALIZE_INTEGERS option is used without CONFIG_RUST, then we actually can perform a precise check today: just compare the clang version number. This works since clang and llvm are always updated in lockstep. However, encoding this in Kconfig would give the HAVE_CFI_ICALL_NORMALIZE_INTEGERS option a dependency on CONFIG_RUST, which is not possible as the reverse dependency already exists. HAVE_CFI_ICALL_NORMALIZE_INTEGERS is defined to be a `def_bool` instead of `bool` to avoid asking end users whether they want to turn on the option. Turning it on explicitly is something only experts should do, so making it hard to do so is not an issue. I added a `depends on CFI_CLANG` clause to the new Kconfig option. I'm not sure whether that makes sense or not, but it doesn't seem to make a big difference. In a future kernel release, I would like to add a Kconfig option similar to CLANG_VERSION/RUSTC_VERSION for inspecting the version of the LLVM inside rustc. Once that feature lands, this logic will be replaced with a precise version check. This check is not being introduced here to avoid introducing a new _VERSION constant in a fix. Link: https://github.com/llvm/llvm-project/pull/104826 [1] Link: https://github.com/rust-lang/rust/pull/129373 [2] Fixes: ce4a2620985c ("cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202409231044.4f064459-oliver.sang@intel.com Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20240925-cfi-norm-kasan-fix-v1-1-0328985cdf33@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26rust: KASAN+RETHUNK requires rustc 1.83.0Alice Ryhl
When enabling both KASAN and RETHUNK, objtool emits the following warnings: rust/core.o: warning: objtool: asan.module_ctor+0x13: 'naked' return found in MITIGATION_RETHUNK build rust/core.o: warning: objtool: asan.module_dtor+0x13: 'naked' return found in MITIGATION_RETHUNK build This is caused by the -Zfunction-return=thunk-extern flag in rustc not informing LLVM about the mitigation at the module level (it does so at the function level only currently, which covers most cases, but both are required), which means that the KASAN functions asan.module_ctor and asan.module_dtor are generated without the rethunk mitigation. The other mitigations that we enabled for Rust (SLS, RETPOLINE) do not have the same bug, as they're being applied through the target-feature functionality instead. This is being fixed for rustc 1.83.0, so update Kconfig to reject this configuration on older compilers. Link: https://github.com/rust-lang/rust/pull/130824 Fixes: d7868550d573 ("x86/rust: support MITIGATION_RETHUNK") Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://lore.kernel.org/all/CANiq72myZL4_poCMuNFevtpYYc0V0embjSuKb7y=C+m3vVA_8g@mail.gmail.com/ Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240926093849.1192264-1-aliceryhl@google.com [ Reworded to add the details mentioned in the list. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-26rust: cfi: fix `patchable-function-entry` starting versionMiguel Ojeda
The `-Zpatchable-function-entry` flag is available since Rust 1.81.0, not Rust 1.80.0, i.e. commit ac7595fdb1ee ("Support for -Z patchable-function-entry") in upstream Rust. Fixes: ca627e636551 ("rust: cfi: add support for CFI_CLANG with Rust") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Fiona Behrens <me@kloenk.dev> Link: https://lore.kernel.org/r/20240925141944.277936-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-25Merge tag 'rust-6.12' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up objtool warnings), teach objtool about 'noreturn' Rust symbols and mimic '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we should be objtool-warning-free, so enable it to run for all Rust object files. - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support. - Support 'RUSTC_VERSION', including re-config and re-build on change. - Split helpers file into several files in a folder, to avoid conflicts in it. Eventually those files will be moved to the right places with the new build system. In addition, remove the need to manually export the symbols defined there, reusing existing machinery for that. - Relax restriction on configurations with Rust + GCC plugins to just the RANDSTRUCT plugin. 'kernel' crate: - New 'list' module: doubly-linked linked list for use with reference counted values, which is heavily used by the upcoming Rust Binder. This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed unique for the given ID), 'AtomicTracker' (tracks whether a 'ListArc' exists using an atomic), 'ListLinks' (the prev/next pointers for an item in a linked list), 'List' (the linked list itself), 'Iter' (an iterator over a 'List'), 'Cursor' (a cursor into a 'List' that allows to remove elements), 'ListArcField' (a field exclusively owned by a 'ListArc'), as well as support for heterogeneous lists. - New 'rbtree' module: red-black tree abstractions used by the upcoming Rust Binder. This includes 'RBTree' (the red-black tree itself), 'RBTreeNode' (a node), 'RBTreeNodeReservation' (a memory reservation for a node), 'Iter' and 'IterMut' (immutable and mutable iterators), 'Cursor' (bidirectional cursor that allows to remove elements), as well as an entry API similar to the Rust standard library one. - 'init' module: add 'write_[pin_]init' methods and the 'InPlaceWrite' trait. Add the 'assert_pinned!' macro. - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by introducing an associated type in the trait. - 'alloc' module: add 'drop_contents' method to 'BoxExt'. - 'types' module: implement the 'ForeignOwnable' trait for 'Pin<Box<T>>' and improve the trait's documentation. In addition, add the 'into_raw' method to the 'ARef' type. - 'error' module: in preparation for the upcoming Rust support for 32-bit architectures, like arm, locally allow Clippy lint for those. Documentation: - https://rust.docs.kernel.org has been announced, so link to it. - Enable rustdoc's "jump to definition" feature, making its output a bit closer to the experience in a cross-referencer. - Debian Testing now also provides recent Rust releases (outside of the freeze period), so add it to the list. MAINTAINERS: - Trevor is joining as reviewer of the "RUST" entry. And a few other small bits" * tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux: (54 commits) kasan: rust: Add KASAN smoke test via UAF kbuild: rust: Enable KASAN support rust: kasan: Rust does not support KHWASAN kbuild: rust: Define probing macros for rustc kasan: simplify and clarify Makefile rust: cfi: add support for CFI_CLANG with Rust cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS rust: support for shadow call stack sanitizer docs: rust: include other expressions in conditional compilation section kbuild: rust: replace proc macros dependency on `core.o` with the version text kbuild: rust: rebuild if the version text changes kbuild: rust: re-run Kconfig if the version text changes kbuild: rust: add `CONFIG_RUSTC_VERSION` rust: avoid `box_uninit_write` feature MAINTAINERS: add Trevor Gross as Rust reviewer rust: rbtree: add `RBTree::entry` rust: rbtree: add cursor rust: rbtree: add mutable iterator rust: rbtree: add iterator rust: rbtree: add red-black tree implementation backed by the C version ...
2024-09-21Merge tag 'sched_ext-for-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext support from Tejun Heo: "This implements a new scheduler class called ‘ext_sched_class’, or sched_ext, which allows scheduling policies to be implemented as BPF programs. The goals of this are: - Ease of experimentation and exploration: Enabling rapid iteration of new scheduling policies. - Customization: Building application-specific schedulers which implement policies that are not applicable to general-purpose schedulers. - Rapid scheduler deployments: Non-disruptive swap outs of scheduling policies in production environments" See individual commits for more documentation, but also the cover letter for the latest series: Link: https://lore.kernel.org/all/20240618212056.2833381-1-tj@kernel.org/ * tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (110 commits) sched: Move update_other_load_avgs() to kernel/sched/pelt.c sched_ext: Don't trigger ops.quiescent/runnable() on migrations sched_ext: Synchronize bypass state changes with rq lock scx_qmap: Implement highpri boosting sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq() sched_ext: Compact struct bpf_iter_scx_dsq_kern sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq() sched_ext: Move consume_local_task() upward sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq() sched_ext: Reorder args for consume_local/remote_task() sched_ext: Restructure dispatch_to_local_dsq() sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON sched_ext: Refactor consume_remote_task() sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate sched_ext: Add missing static to scx_dump_data sched_ext: Add missing static to scx_has_op[] sched_ext: Temporarily work around pick_task_scx() being called without balance_scx() sched_ext: Add a cgroup scheduler which uses flattened hierarchy sched_ext: Add cgroup support ...
2024-09-21Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Many singleton patches - please see the various changelogs for details. Quite a lot of nilfs2 work this time around. Notable patch series in this pull request are: - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64() to provide (much) more accurate results. The current implementation was causing Uwe some issues in the PWM drivers. - "xz: Updates to license, filters, and compression options" from Lasse Collin. Miscellaneous maintenance and kinor feature work to the xz decompressor. - "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee. Fixes and enhancements to the gdb scripts. - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this. - "nilfs2: add support for some common ioctls" from Ryusuke Konishi. Adds various commonly-available ioctls to nilfs2. - "This series fixes a number of formatting issues in kernel doc comments" from Ryusuke Konishi does that. - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi. Fix issues where -ENOENT was being unintentionally and inappropriately returned to userspace. - "nilfs2: assorted cleanups" from Huang Xiaojia. - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke Konishi fixes some issues which can occur on corrupted nilfs2 filesystems. - "scripts/decode_stacktrace.sh: improve error reporting and usability" from Luca Ceresoli does those things" * tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits) list: test: increase coverage of list_test_list_replace*() list: test: fix tests for list_cut_position() proc: use __auto_type more treewide: correct the typo 'retun' ocfs2: cleanup return value and mlog in ocfs2_global_read_info() nilfs2: remove duplicate 'unlikely()' usage nilfs2: fix potential oob read in nilfs_btree_check_delete() nilfs2: determine empty node blocks as corrupted nilfs2: fix potential null-ptr-deref in nilfs_btree_insert() user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation tools/mm: rm thp_swap_allocator_test when make clean squashfs: fix percpu address space issues in decompressor_multi_percpu.c lib: glob.c: added null check for character class nilfs2: refactor nilfs_segctor_thread() nilfs2: use kthread_create and kthread_stop for the log writer thread nilfs2: remove sc_timer_task nilfs2: do not repair reserved inode bitmap in nilfs_new_inode() nilfs2: eliminate the shared counter and spinlock for i_generation nilfs2: separate inode type information from i_state field nilfs2: use the BITS_PER_LONG macro ...
2024-09-21Merge tag 'mm-stable-2024-09-20-02-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. - "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. - "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ...
2024-09-18Merge tag 'cgroup-for-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cpuset isolation improvements - cpuset cgroup1 support is split into its own file behind the new config option CONFIG_CPUSET_V1. This makes it the second controller which makes cgroup1 support optional after memcg - Handling of unavailable v1 controller handling improved during cgroup1 mount operations - union_find applied to cpuset. It makes code simpler and more efficient - Reduce spurious events in pids.events - Cleanups and other misc changes - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes that further changes build upon * tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (34 commits) cgroup: Do not report unavailable v1 controllers in /proc/cgroups cgroup: Disallow mounting v1 hierarchies without controller implementation cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only cgroup/cpuset: Move cpu.h include to cpuset-internal.h cgroup/cpuset: add sefltest for cpuset v1 cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1 cgroup/cpuset: rename functions shared between v1 and v2 cgroup/cpuset: move v1 interfaces to cpuset-v1.c cgroup/cpuset: move validate_change_legacy to cpuset-v1.c cgroup/cpuset: move legacy hotplug update to cpuset-v1.c cgroup/cpuset: add callback_lock helper cgroup/cpuset: move memory_spread to cpuset-v1.c cgroup/cpuset: move relax_domain_level to cpuset-v1.c cgroup/cpuset: move memory_pressure to cpuset-v1.c cgroup/cpuset: move common code to cpuset-internal.h cgroup/cpuset: introduce cpuset-v1.c selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs cgroup/cpuset: Account for boot time isolated CPUs cgroup/cpuset: remove use_parent_ecpus of cpuset cgroup/cpuset: remove fetch_xcpus ...
2024-09-16rust: kasan: Rust does not support KHWASANMatthew Maurer
Rust does not yet have support for software tags. Prevent RUST from being selected if KASAN_SW_TAGS is enabled. Signed-off-by: Matthew Maurer <mmaurer@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20240820194910.187826-3-mmaurer@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-16rust: cfi: add support for CFI_CLANG with RustMatthew Maurer
Make it possible to use the Control Flow Integrity (CFI) sanitizer when Rust is enabled. Enabling CFI with Rust requires that CFI is configured to normalize integer types so that all integer types of the same size and signedness are compatible under CFI. Rust and C use the same LLVM backend for code generation, so Rust KCFI is compatible with the KCFI used in the kernel for C. In the case of FineIBT, CFI also depends on -Zpatchable-function-entry for rewriting the function prologue, so we set that flag for Rust as well. The flag for FineIBT requires rustc 1.80.0 or later, so include a Kconfig requirement for that. Enabling Rust will select CFI_ICALL_NORMALIZE_INTEGERS because the flag is required to use Rust with CFI. Using select rather than `depends on` avoids the case where Rust is not visible in menuconfig due to CFI_ICALL_NORMALIZE_INTEGERS not being enabled. One disadvantage of select is that RUST must `depends on` all of the things that CFI_ICALL_NORMALIZE_INTEGERS depends on to avoid invalid configurations. Alice has been using KCFI on her phone for several months, so it is reasonably well tested on arm64. Signed-off-by: Matthew Maurer <mmaurer@google.com> Co-developed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Gatlin Newhouse <gatlin.newhouse@gmail.com> Acked-by: Kees Cook <kees@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240801-kcfi-v2-2-c93caed3d121@google.com [ Replaced `!FINEIBT` requirement with `!CALL_PADDING` to prevent a build error on older Rust compilers. Fixed typo. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-13rust: support for shadow call stack sanitizerAlice Ryhl
Add all of the flags that are needed to support the shadow call stack (SCS) sanitizer with Rust, and updates Kconfig to allow only configurations that work. The -Zfixed-x18 flag is required to use SCS on arm64, and requires rustc version 1.80.0 or greater. This restriction is reflected in Kconfig. When CONFIG_DYNAMIC_SCS is enabled, the build will be configured to include unwind tables in the build artifacts. Dynamic SCS uses the unwind tables at boot to find all places that need to be patched. The -Cforce-unwind-tables=y flag ensures that unwind tables are available for Rust code. In non-dynamic mode, the -Zsanitizer=shadow-call-stack flag is what enables the SCS sanitizer. Using this flag requires rustc version 1.82.0 or greater on the targets used by Rust in the kernel. This restriction is reflected in Kconfig. It is possible to avoid the requirement of rustc 1.80.0 by using -Ctarget-feature=+reserve-x18 instead of -Zfixed-x18. However, this flag emits a warning during the build, so this patch does not add support for using it and instead requires 1.80.0 or greater. The dependency is placed on `select HAVE_RUST` to avoid a situation where enabling Rust silently turns off the sanitizer. Instead, turning on the sanitizer results in Rust being disabled. We generally do not want changes to CONFIG_RUST to result in any mitigations being changed or turned off. At the time of writing, rustc 1.82.0 only exists via the nightly release channel. There is a chance that the -Zsanitizer=shadow-call-stack flag will end up needing 1.83.0 instead, but I think it is small. Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240829-shadow-call-stack-v7-1-2f62a4432abf@google.com [ Fixed indentation using spaces. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-05kbuild: rust: re-run Kconfig if the version text changesMiguel Ojeda
Re-run Kconfig if we detect the Rust compiler has changed via the version text, like it is done for C. Unlike C, and unlike `RUSTC_VERSION`, the `RUSTC_VERSION_TEXT` is kept under `depends on RUST`, since it should not be needed unless `RUST` is enabled. Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240902165535.1101978-3-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-05kbuild: rust: add `CONFIG_RUSTC_VERSION`Miguel Ojeda
Now that we support several Rust versions, introduce `CONFIG_RUSTC_VERSION` so that it can be used in Kconfig to enable and disable configuration options based on the `rustc` version. The approach taken resembles `pahole`'s -- see commit 613fe1692377 ("kbuild: Add CONFIG_PAHOLE_VERSION"), i.e. a simple version parsing without trying to identify several kinds of compilers, since so far there is only one (`rustc`). However, unlike `pahole`'s, we also print a zero if executing failed for any reason, rather than checking if the command is found and executable (which still leaves things like a file that exists and is executable, but e.g. is built for another platform [1]). An equivalent approach to the one here was also submitted for `pahole` [2]. Link: https://lore.kernel.org/rust-for-linux/CANiq72=4vX_tJMJLE6e+bg7ZECHkS-AQpm8GBzuK75G1EB7+Nw@mail.gmail.com/ [1] Link: https://lore.kernel.org/linux-kbuild/20240728125527.690726-1-ojeda@kernel.org/ [2] Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240902165535.1101978-2-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-04Merge branch 'bpf/master' into for-6.12Tejun Heo
Pull bpf/master to receive baebe9aaba1e ("bpf: allow passing struct bpf_iter_<type> as kfunc arguments") and related changes in preparation for the DSQ iterator patchset. Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-04sched_ext: Add cgroup supportTejun Heo
Add sched_ext_ops operations to init/exit cgroups, and track task migrations and config changes. A BPF scheduler may not implement or implement only subset of cgroup features. The implemented features can be indicated using %SCX_OPS_HAS_CGOUP_* flags. If cgroup configuration makes use of features that are not implemented, a warning is triggered. While a BPF scheduler is being enabled and disabled, relevant cgroup operations are locked out using scx_cgroup_rwsem. This avoids situations like task prep taking place while the task is being moved across cgroups, making things easier for BPF schedulers. v7: - cgroup interface file visibility toggling is dropped in favor just warning messages. Dynamically changing interface visiblity caused more confusion than helping. v6: - Updated to reflect the removal of SCX_KF_SLEEPABLE. - Updated to use CONFIG_GROUP_SCHED_WEIGHT and fixes for !CONFIG_FAIR_GROUP_SCHED && CONFIG_EXT_GROUP_SCHED. v5: - Flipped the locking order between scx_cgroup_rwsem and cpus_read_lock() to avoid locking order conflict w/ cpuset. Better documentation around locking. - sched_move_task() takes an early exit if the source and destination are identical. This triggered the warning in scx_cgroup_can_attach() as it left p->scx.cgrp_moving_from uncleared. Updated the cgroup migration path so that ops.cgroup_prep_move() is skipped for identity migrations so that its invocations always match ops.cgroup_move() one-to-one. v4: - Example schedulers moved into their own patches. - Fix build failure when !CONFIG_CGROUP_SCHED, reported by Andrea Righi. v3: - Make scx_example_pair switch all tasks by default. - Convert to BPF inline iterators. - scx_bpf_task_cgroup() is added to determine the current cgroup from CPU controller's POV. This allows BPF schedulers to accurately track CPU cgroup membership. - scx_example_flatcg added. This demonstrates flattened hierarchy implementation of CPU cgroup control and shows significant performance improvement when cgroups which are nested multiple levels are under competition. v2: - Build fixes for different CONFIG combinations. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: David Vernet <dvernet@meta.com> Acked-by: Josh Don <joshdon@google.com> Acked-by: Hao Luo <haoluo@google.com> Acked-by: Barret Rhoden <brho@google.com> Reported-by: kernel test robot <lkp@intel.com> Cc: Andrea Righi <andrea.righi@canonical.com>
2024-09-04sched: Introduce CONFIG_GROUP_SCHED_WEIGHTTejun Heo
sched_ext will soon add cgroup cpu.weigh support. The cgroup interface code is currently gated behind CONFIG_FAIR_GROUP_SCHED. As the fair class and/or SCX may implement the feature, put the interface code behind the new CONFIG_CGROUP_SCHED_WEIGHT which is selected by CONFIG_FAIR_GROUP_SCHED. This allows either sched class to enable the itnerface code without ading more complex CONFIG tests. When !CONFIG_FAIR_GROUP_SCHED, a dummy version of sched_group_set_shares() is added to support later CONFIG_CGROUP_SCHED_WEIGHT && !CONFIG_FAIR_GROUP_SCHED builds. No functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-01xz: adjust arch-specific options for better kernel compressionLasse Collin
Use LZMA2 options that match the arch-specific alignment of instructions. This change reduces compressed kernel size 0-2 % depending on the arch. On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs it helps the most. Use the ARM-Thumb filter for ARM-Thumb2 kernels. This reduces compressed kernel size about 5 %.[1] Previously such kernels were compressed using the ARM filter which didn't do anything useful with ARM-Thumb2 code. Add BCJ filter support for ARM64 and RISC-V. Compared to unfiltered XZ or plain LZMA, the compressed kernel size is reduced about 5 % on ARM64 and 7 % on RISC-V. A new enough version of the xz tool is required: 5.4.0 for ARM64 and 5.6.0 for RISC-V. With an old xz version, a message is printed to standard error and the kernel is compressed without the filter. Update lib/decompress_unxz.c to match the changes to xz_wrap.sh. Update the CONFIG_KERNEL_XZ help text in init/Kconfig: - Add the RISC-V and ARM64 filters. - Clarify that the PowerPC filter is for big endian only. - Omit IA-64. Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1] Link: https://lkml.kernel.org/r/20240721133633.47721-15-lasse.collin@tukaani.org Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Reviewed-by: Sam James <sam@gentoo.org> Cc: Simon Glass <sjg@chromium.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Jubin Zhong <zhongjubin@huawei.com> Cc: Jules Maselbas <jmaselbas@zdiv.net> Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Joel Stanley <joel@jms.id.au> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rui Li <me@lirui.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm: fix typo in KconfigValdis Kletnieks
Fix typo in Kconfig help Link: https://lkml.kernel.org/r/78656.1720853990@turing-police Fixes: e93d4166b40a ("mm: memcg: put cgroup v1-specific code under a config option") Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-30cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1Chen Ridong
This patch introduces CONFIG_CPUSETS_V1 and guard cpuset-v1 code under CONFIG_CPUSETS_V1. The default value of CONFIG_CPUSETS_V1 is N, so that user who adopted v2 don't have 'pay' for cpuset v1. Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30io_uring: add GCOV_PROFILE_URING Kconfig optionJens Axboe
If GCOV is enabled and this option is set, it enables code coverage profiling of the io_uring subsystem. Only use this for test purposes, as it will impact the runtime performance. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-18init/Kconfig: Only block on RANDSTRUCT for RUSTNeal Gompa
When enabling Rust in the kernel, we only need to block on the RANDSTRUCT feature and GCC plugin. The rest of the GCC plugins are reasonably safe to enable. [ Originally (years ago) we only had this restriction, but we ended up restricting also the rest of the GCC plugins 1) to be on the safe side, 2) since compiler plugin support could be going away in the kernel and 3) since mixed builds are best effort so far; so I asked Neal about his experience enabling the other plugins -- Neal says: When I originally wrote this patch two years ago to get things working, Fedora used all the GCC plugins, so I was trying to get GCC + Rust to work while minimizing the delta on build differences. This was the combination that worked. We've been carrying this patch in the Asahi tree for a year now. And while Fedora does not currently have GCC plugins enabled because it caused issues with some third-party modules (I think it was the NVIDIA driver, but I'm not sure), it was around long enough for me to know with some confidence that it was fine this way. - Miguel ] Signed-off-by: Neal Gompa <neal@gompa.dev> Link: https://lore.kernel.org/r/20240731125615.3368813-1-neal@gompa.dev Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-08-16Merge tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull rust fixes from Miguel Ojeda: - Fix '-Os' Rust 1.80.0+ builds adding more intrinsics (also tweaked in upstream Rust for the upcoming 1.82.0). - Fix support for the latest version of rust-analyzer due to a change on rust-analyzer config file semantics (considered a fix since most developers use the latest version of the tool, which is the only one actually supported by upstream). I am discussing stability of the config file with upstream -- they may be able to start versioning it. - Fix GCC 14 builds due to '-fmin-function-alignment' not skipped for libclang (bindgen). - A couple Kconfig fixes around '{RUSTC,BINDGEN}_VERSION_TEXT' to suppress error messages in a foreign architecture chroot and to use a proper default format. - Clean 'rust-analyzer' target warning due to missing recursive make invocation mark. - Clean Clippy warning due to missing indentation in docs. - Clean LLVM 19 build warning due to removed 3dnow feature upstream. * tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linux: rust: x86: remove `-3dnow{,a}` from target features kbuild: rust-analyzer: mark `rust_is_available.sh` invocation as recursive rust: add intrinsics to fix `-Os` builds kbuild: rust: skip -fmin-function-alignment in bindgen flags rust: Support latest version of `rust-analyzer` rust: macros: indent list item in `module!`'s docs rust: fix the default format for CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT rust: suppress error messages from CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT
2024-08-01rust: SHADOW_CALL_STACK is incompatible with RustAlice Ryhl
When using the shadow call stack sanitizer, all code must be compiled with the -ffixed-x18 flag, but this flag is not currently being passed to Rust. This results in crashes that are extremely difficult to debug. To ensure that nobody else has to go through the same debugging session that I had to, prevent configurations that enable both SHADOW_CALL_STACK and RUST. It is rather common for people to backport 724a75ac9542 ("arm64: rust: Enable Rust support for AArch64"), so I recommend applying this fix all the way back to 6.1. Cc: stable@vger.kernel.org # 6.1 and later Fixes: 724a75ac9542 ("arm64: rust: Enable Rust support for AArch64") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240729-shadow-call-stack-v4-1-2a664b082ea4@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-29rust: fix the default format for CONFIG_{RUSTC,BINDGEN}_VERSION_TEXTMasahiro Yamada
Another oddity in these config entries is their default value can fall back to 'n', which is a value for bool or tristate symbols. The '|| echo n' is an incorrect workaround to avoid the syntax error. This is not a big deal, as the entry is hidden by 'depends on RUST' in situations where '$(RUSTC) --version' or '$(BINDGEN) --version' fails. Anyway, it looks odd. The default of a string type symbol should be a double-quoted string literal. Turn it into an empty string when the version command fails. Fixes: 2f7ab1267dc9 ("Kbuild: add Rust support") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240727140302.1806011-2-masahiroy@kernel.org [ Rebased on top of v6.11-rc1. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-07-29rust: suppress error messages from CONFIG_{RUSTC,BINDGEN}_VERSION_TEXTMasahiro Yamada
While this is a somewhat unusual case, I encountered odd error messages when I ran Kconfig in a foreign architecture chroot. $ make allmodconfig sh: 1: rustc: not found sh: 1: bindgen: not found # # configuration written to .config # The successful execution of 'command -v rustc' does not necessarily mean that 'rustc --version' will succeed. $ sh -c 'command -v rustc' /home/masahiro/.cargo/bin/rustc $ sh -c 'rustc --version' sh: 1: rustc: not found Here, 'rustc' is built for x86, and I ran it in an arm64 system. The current code: command -v $(RUSTC) >/dev/null 2>&1 && $(RUSTC) --version || echo n can be turned into: command -v $(RUSTC) >/dev/null 2>&1 && $(RUSTC) --version 2>/dev/null || echo n However, I did not understand the necessity of 'command -v $(RUSTC)'. I simplified it to: $(RUSTC) --version 2>/dev/null || echo n Fixes: 2f7ab1267dc9 ("Kbuild: add Rust support") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240727140302.1806011-1-masahiroy@kernel.org [ Rebased on top of v6.11-rc1. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-07-27Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull Rust updates from Miguel Ojeda: "The highlight is the establishment of a minimum version for the Rust toolchain, including 'rustc' (and bundled tools) and 'bindgen'. The initial minimum will be the pinned version we currently have, i.e. we are just widening the allowed versions. That covers three stable Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow), plus beta, plus nightly. This should already be enough for kernel developers in distributions that provide recent Rust compiler versions routinely, such as Arch Linux, Debian Unstable (outside the freeze period), Fedora Linux, Gentoo Linux (especially the testing channel), Nix (unstable) and openSUSE Slowroll and Tumbleweed. In addition, the kernel is now being built-tested by Rust's pre-merge CI. That is, every change that is attempting to land into the Rust compiler is tested against the kernel, and it is merged only if it passes. Similarly, the bindgen tool has agreed to build the kernel in their CI too. Thus, with the pre-merge CI in place, both projects hope to avoid unintentional changes to Rust that break the kernel. This means that, in general, apart from intentional changes on their side (that we will need to workaround conditionally on our side), the upcoming Rust compiler versions should generally work. In addition, the Rust project has proposed getting the kernel into stable Rust (at least solving the main blockers) as one of its three flagship goals for 2024H2 [1]. I would like to thank Niko, Sid, Emilio et al. for their help promoting the collaboration between Rust and the kernel. Toolchain and infrastructure: - Support several Rust toolchain versions. - Support several bindgen versions. - Remove 'cargo' requirement and simplify 'rusttest', thanks to 'alloc' having been dropped last cycle. - Provide proper error reporting for the 'rust-analyzer' target. 'kernel' crate: - Add 'uaccess' module with a safe userspace pointers abstraction. - Add 'page' module with a 'struct page' abstraction. - Support more complex generics in workqueue's 'impl_has_work!' macro. 'macros' crate: - Add 'firmware' field support to the 'module!' macro. - Improve 'module!' macro documentation. Documentation: - Provide instructions on what packages should be installed to build the kernel in some popular Linux distributions. - Introduce the new kernel.org LLVM+Rust toolchains. - Explain '#[no_std]'. And a few other small bits" Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1] * tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits) docs: rust: quick-start: add section on Linux distributions rust: warn about `bindgen` versions 0.66.0 and 0.66.1 rust: start supporting several `bindgen` versions rust: work around `bindgen` 0.69.0 issue rust: avoid assuming a particular `bindgen` build rust: start supporting several compiler versions rust: simplify Clippy warning flags set rust: relax most deny-level lints to warnings rust: allow `dead_code` for never constructed bindings rust: init: simplify from `map_err` to `inspect_err` rust: macros: indent list item in `paste!`'s docs rust: add abstraction for `struct page` rust: uaccess: add typed accessors for userspace pointers uaccess: always export _copy_[from|to]_user with CONFIG_RUST rust: uaccess: add userspace pointers kbuild: rust-analyzer: improve comment documentation kbuild: rust-analyzer: better error handling docs: rust: no_std is used rust: alloc: add __GFP_HIGHMEM flag rust: alloc: fix typo in docs for GFP_NOWAIT ...
2024-07-23Merge tag 'kbuild-v6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove tristate choice support from Kconfig - Stop using the PROVIDE() directive in the linker script - Reduce the number of links for the combination of CONFIG_KALLSYMS and CONFIG_DEBUG_INFO_BTF - Enable the warning for symbol reference to .exit.* sections by default - Fix warnings in RPM package builds - Improve scripts/make_fit.py to generate a FIT image with separate base DTB and overlays - Improve choice value calculation in Kconfig - Fix conditional prompt behavior in choice in Kconfig - Remove support for the uncommon EMAIL environment variable in Debian package builds - Remove support for the uncommon "name <email>" form for the DEBEMAIL environment variable - Raise the minimum supported GNU Make version to 4.0 - Remove stale code for the absolute kallsyms - Move header files commonly used for host programs to scripts/include/ - Introduce the pacman-pkg target to generate a pacman package used in Arch Linux - Clean up Kconfig * tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits) kbuild: doc: gcc to CC change kallsyms: change sym_entry::percpu_absolute to bool type kallsyms: unify seq and start_pos fields of struct sym_entry kallsyms: add more original symbol type/name in comment lines kallsyms: use \t instead of a tab in printf() kallsyms: avoid repeated calculation of array size for markers kbuild: add script and target to generate pacman package modpost: use generic macros for hash table implementation kbuild: move some helper headers from scripts/kconfig/ to scripts/include/ Makefile: add comment to discourage tools/* addition for kernel builds kbuild: clean up scripts/remove-stale-files kconfig: recursive checks drop file/lineno kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec kallsyms: get rid of code for absolute kallsyms kbuild: Create INSTALL_PATH directory if it does not exist kbuild: Abort make on install failures kconfig: remove 'e1' and 'e2' macros from expression deduplication kconfig: remove SYMBOL_CHOICEVAL flag kconfig: add const qualifiers to several function arguments kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups() ...
2024-07-21Merge tag 'mm-stable-2024-07-21-14-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ...
2024-07-20kallsyms: get rid of code for absolute kallsymsJann Horn
Commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture") removed the last use of the absolute kallsyms. Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/all/20240221202655.2423854-1-jannh@google.com/ [masahiroy@kernel.org: rebase the code and reword the commit description] Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-18init/Kconfig: remove CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUNDMark Rutland
Several versions of GCC mis-compile asm goto with outputs. We try to workaround this, but our workaround is demonstrably incomplete and liable to result in subtle bugs, especially on arm64 where get_user() has recently been moved over to using asm goto with outputs. From discussion(s) with Linus at: https://lore.kernel.org/linux-arm-kernel/Zpfv2tnlQ-gOLGac@J2N7QTR9R3.cambridge.arm.com/ https://lore.kernel.org/linux-arm-kernel/ZpfxLrJAOF2YNqCk@J2N7QTR9R3.cambridge.arm.com/ ... it sounds like the best thing to do for now is to remove the workaround and make CC_HAS_ASM_GOTO_OUTPUT depend on working compiler versions. The issue was originally reported to GCC by Sean Christopherson: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 ... and Jakub Jelinek fixed this for GCC 14, with the fix backported to 13.3.0, 12.4.0, and 11.5.0. In the kernel, we tried to workaround broken compilers in commits: 4356e9f841f7 ("work around gcc bugs with 'asm goto' with outputs") 68fb3ca0e408 ("update workarounds for gcc "asm goto" issue") ... but the workaround of adding an empty asm("") after the asm volatile goto(...) demonstrably does not always avoid the problem, as can be seen in the following test case: | #define asm_goto_output(x...) \ | do { asm volatile goto(x); asm (""); } while (0) | | #define __good_or_bad(__val, __key) \ | do { \ | __label__ __failed; \ | unsigned long __tmp; \ | asm_goto_output( \ | " cbnz %[key], %l[__failed]\n" \ | " mov %[val], #0x900d\n" \ | : [val] "=r" (__tmp) \ | : [key] "r" (__key) \ | : \ | : __failed); \ | (__val) = __tmp; \ | break; \ | __failed: \ | (__val) = 0xbad; \ | } while (0) | | unsigned long get_val(unsigned long key); | unsigned long get_val(unsigned long key) | { | unsigned long val = 0xbad; | | __good_or_bad(val, key); | | return val; | } GCC 13.2.0 (at -O2) compiles this to: | cbnz x0, .Lfailed | mov x0, #0x900d | .Lfailed: | ret GCC 14.1.0 (at -O2) compiles this to: | cbnz x0, .Lfailed | mov x0, #0x900d | ret | .Lfailed: | mov x0, #0xbad | ret Note that GCC 13.2.0 erroneously omits the assignment to 'val' in the error path (even though this does not depend on an output of the asm goto). GCC 14.1.0 correctly retains the assignment. This problem can be seen within the kernel with the following test case: | #include <linux/uaccess.h> | #include <linux/types.h> | | noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr); | noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr) | { | unsigned long val; | | unsafe_get_user(val, ptr, Efault); | return val; | | Efault: | val = 0x900d; | return val; | } GCC 13.2.0 (arm64 defconfig) compiles this to: | and x0, x0, #0xff7fffffffffffff | ldtr x0, [x0] | .Lextable_fixup: | ret GCC 13.2.0 (x86_64 defconfig + MITIGATION_RETPOLINE=n) compiles this to: | endbr64 | mov (%rdi),%rax | .Lextable_fixup: | ret ... omitting the assignment to 'val' in the error path, and leaving garbage in the result register returned by the function (which happens to contain the faulting address in the generated code). GCC 14.1.0 (arm64 defconfig) compiles this to: | and x0, x0, #0xff7fffffffffffff | ldtr x0, [x0] | ret | .Lextable_fixup: | mov x0, #0x900d // #36877 | ret GCC 14.1.0 (x86_64 defconfig + MITIGATION_RETPOLINE=n) compiles this to: | endbr64 | mov (%rdi),%rax | ret | .Lextable_fixup: | mov $0x900d,%eax | ret ... retaining the expected assignment to 'val' in the error path. We don't have a complete and reasonable workaround. While placing empty asm("") blocks after each goto label *might* be sufficient, we don't know for certain, this is tedious and error-prone, and there doesn't seem to be a neat way to wrap this up (which is especially painful for cases with multiple goto labels). Avoid this issue by disabling CONFIG_CC_HAS_ASM_GOTO_OUTPUT for known-broken compiler versions and removing the workaround (along with the CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND config option). For the moment I've left the default implementation of asm_goto_output() unchanged. This should now be redundant since any compiler with the fix for the clobbering issue whould also have a fix for the (earlier) volatile issue, but it's far less churny to leave it around, which makes it easier to backport this patch if necessary. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alex Coplan <alex.coplan@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jakub Jelinek <jakub@gcc.gnu.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-10mm: remove CONFIG_MEMCG_KMEMJohannes Weiner
CONFIG_MEMCG_KMEM used to be a user-visible option for whether slab tracking is enabled. It has been default-enabled and equivalent to CONFIG_MEMCG for almost a decade. We've only grown more kernel memory accounting sites since, and there is no imaginable cgroup usecase going forward that wants to track user pages but not the multitude of user-drivable kernel allocations. Link: https://lkml.kernel.org/r/20240701153148.452230-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10rust: work around `bindgen` 0.69.0 issueMiguel Ojeda
`bindgen` 0.69.0 contains a bug: `--version` does not work without providing a header [1]: error: the following required arguments were not provided: <HEADER> Usage: bindgen <FLAGS> <OPTIONS> <HEADER> -- <CLANG_ARGS>... Thus, in preparation for supporting several `bindgen` versions, work around the issue by passing a dummy argument. Include a comment so that we can remove the workaround in the future. Link: https://github.com/rust-lang/rust-bindgen/pull/2678 [1] Reviewed-by: Finn Behrens <me@kloenk.dev> Tested-by: Benno Lossin <benno.lossin@proton.me> Tested-by: Andreas Hindborg <a.hindborg@samsung.com> Link: https://lore.kernel.org/r/20240709160615.998336-9-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-07-04mm: memcg: put cgroup v1-specific code under a config optionRoman Gushchin
Put legacy cgroup v1 memory controller code under a new CONFIG_MEMCG_V1 config option. The option is turned off by default. Nobody except those who are still using cgroup v1 should turn it on. If the option is not set, memory controller can still be mounted under cgroup v1, but none of memcg-specific control files are present. Please note, that not all cgroup v1's memory controller code is guarded yet (but most of it), it's a subject for some follow-up work. Thanks to Michal Hocko for providing a better Kconfig option description. [roman.gushchin@linux.dev: better config option description provided by Michal] Link: https://lkml.kernel.org/r/ZnxXNtvqllc9CDoo@google.com Link: https://lkml.kernel.org/r/20240625005906.106920-14-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>