summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2026-01-14fs: export may_delete() as may_delete_dentry()Filipe Manana
For many years btrfs as been using a copy of may_delete() in fs/btrfs/ioctl.c:btrfs_may_delete(). Everytime may_delete() is updated we need to update the btrfs copy, and this is a maintenance burden. Currently there are minor differences between both because the btrfs side lacks updates done in may_delete(). Export may_delete() so that btrfs can use it and with the less generic name may_delete_dentry(). While at it change the calls in vfs_rmdir() to pass a boolean literal instead of 1 and 0 as the last argument since the argument has a bool type. Signed-off-by: Filipe Manana <fdmanana@suse.com> Link: https://patch.msgid.link/e09128fd53f01b19d0a58f0e7d24739f79f47f6d.1768307858.git.fdmanana@suse.com Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14ACPI: extlog: Trace CPER CXL Protocol Error SectionFabio M. De Francesco
When Firmware First is enabled, BIOS handles errors first and then it makes them available to the kernel via the Common Platform Error Record (CPER) sections (UEFI 2.11 Appendix N.2.13). Linux parses the CPER sections via one of two similar paths, either ELOG or GHES. The errors managed by ELOG are signaled to the BIOS by the I/O Machine Check Architecture (I/O MCA). Currently, ELOG and GHES show some inconsistencies in how they report to userspace via trace events. Therefore, make the two mentioned paths act similarly by tracing the CPER CXL Protocol Error Section. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Link: https://patch.msgid.link/20260114101543.85926-6-fabio.m.de.francesco@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work structFabio M. De Francesco
Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent type and copy the CPER CXL protocol errors information to a work data structure. Export the new symbol for reuse by ELOG. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> [ rjw: Subject tweak ] Link: https://patch.msgid.link/20260114101543.85926-5-fabio.m.de.francesco@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checksFabio M. De Francesco
Move the CPER CXL protocol errors validity check out of cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit the serial number check only to CXL agents that are CXL devices (UEFI v2.10, Appendix N.2.13). Export the new symbol for reuse by ELOG. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> [ rjw: Subject tweak ] Link: https://patch.msgid.link/20260114101543.85926-4-fabio.m.de.francesco@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14ACPI: APEI: GHES: Improve ghes_notify_nmi() status checkTony Luck
ghes_notify_nmi() is called for every NMI and must check whether the NMI was generated because an error was signalled by platform firmware. This check is very expensive as for each registered GHES NMI source it reads from the acpi generic address attached to this error source to get the physical address of the acpi_hest_generic_status block. It then checks the "block_status" to see if an error was logged. The ACPI/APEI code must create virtual mappings for each of those physical addresses, and tear them down afterwards. On an Icelake system this takes around 15,000 TSC cycles. Enough to disturb efforts to profile system performance. If that were not bad enough, there are some atomic accesses in the code path that will cause cache line bounces between CPUs. A problem that gets worse as the core count increases. But BIOS changes neither the acpi generic address nor the physical address of the acpi_hest_generic_status block. So this walk can be done once when the NMI is registered to save the virtual address (unmapping if the NMI is ever unregistered). The "block_status" can be checked directly in the NMI handler. This can be done without any atomic accesses. Resulting time to check that there is not an error record is around 900 cycles. Reported-by: Andi Kleen <andi.kleen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://patch.msgid.link/20260112032239.30023-2-xueshuai@linux.alibaba.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14APEI/GHES: ensure that won't go past CPER allocated recordMauro Carvalho Chehab
The logic at ghes_new() prevents allocating too large records, by checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB). Yet, the allocation is done with the actual number of pages from the CPER bios table location, which can be smaller. Yet, a bad firmware could send data with a different size, which might be bigger than the allocated memory, causing an OOPS: Unable to handle kernel paging request at virtual address fff00000f9b40000 Mem abort info: ESR = 0x0000000096000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000 [fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000 Internal error: Oops: 0000000096000007 [#1] SMP Modules linked in: CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022 Workqueue: kacpi_notify acpi_os_execute_deferred pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : hex_dump_to_buffer+0x30c/0x4a0 lr : hex_dump_to_buffer+0x328/0x4a0 sp : ffff800080e13880 x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083 x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004 x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083 x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010 x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020 x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008 x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020 x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000 x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008 Call trace: hex_dump_to_buffer+0x30c/0x4a0 (P) print_hex_dump+0xac/0x170 cper_estatus_print_section+0x90c/0x968 cper_estatus_print+0xf0/0x158 __ghes_print_estatus+0xa0/0x148 ghes_proc+0x1bc/0x220 ghes_notify_hed+0x5c/0xb8 notifier_call_chain+0x78/0x148 blocking_notifier_call_chain+0x4c/0x80 acpi_hed_notify+0x28/0x40 acpi_ev_notify_dispatch+0x50/0x80 acpi_os_execute_deferred+0x24/0x48 process_one_work+0x15c/0x3b0 worker_thread+0x2d0/0x400 kthread+0x148/0x228 ret_from_fork+0x10/0x20 Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44) ---[ end trace 0000000000000000 ]--- Prevent that by taking the actual allocated are into account when checking for CPER length. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> [ rjw: Subject tweaks ] Link: https://patch.msgid.link/4e70310a816577fabf37d94ed36cde4ad62b1e0a.1767871950.git.mchehab+huawei@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14EFI/CPER: don't go past the ARM processor CPER record bufferMauro Carvalho Chehab
There's a logic inside GHES/CPER to detect if the section_length is too small, but it doesn't detect if it is too big. Currently, if the firmware receives an ARM processor CPER record stating that a section length is big, kernel will blindly trust section_length, producing a very long dump. For instance, a 67 bytes record with ERR_INFO_NUM set 46198 and section length set to 854918320 would dump a lot of data going a way past the firmware memory-mapped area. Fix it by adding a logic to prevent it to go past the buffer if ERR_INFO_NUM is too big, making it report instead: [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [Hardware Error]: event severity: recoverable [Hardware Error]: Error 0, type: recoverable [Hardware Error]: section_type: ARM processor error [Hardware Error]: MIDR: 0xff304b2f8476870a [Hardware Error]: section length: 854918320, CPER size: 67 [Hardware Error]: section length is too big [Hardware Error]: firmware-generated error record is incorrect [Hardware Error]: ERR_INFO_NUM is 46198 Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> [ rjw: Subject and changelog tweaks ] Link: https://patch.msgid.link/41cd9f6b3ace3cdff7a5e864890849e4b1c58b63.1767871950.git.mchehab+huawei@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-14exportfs: Complete kernel-doc for struct export_operationsAndré Almeida
Write down the missing members definitions for struct export_operations, using as a reference the commit messages that created the members. Signed-off-by: André Almeida <andrealmeid@igalia.com> Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-3-acc1889de772@igalia.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14exportfs: Mark struct export_operations functions at kernel-docAndré Almeida
Adding a `@` before the function names make then recognizable as kernel-docs, so they get correctly rendered in the documentation. Even if they are already marked with `@` in the short one-line summary, the kernel-docs will correctly favor the more detailed definition here. Signed-off-by: André Almeida <andrealmeid@igalia.com> Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-2-acc1889de772@igalia.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14exportfs: Fix kernel-doc output for get_name()André Almeida
Without a space between %NAME_MAX and the plus sign, kernel-doc will output ``NAME_MAX``+1, which scapes the last backtick and make Sphinx format a much larger string as monospaced text. Signed-off-by: André Almeida <andrealmeid@igalia.com> Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-1-acc1889de772@igalia.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14device_cgroup: remove branch hint after code refactorBreno Leitao
commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()") reordered the checks in devcgroup_inode_permission() to check the inode mode before checking i_rdev, for better cache behavior. However, the likely() annotation on the i_rdev check was not updated to reflect the new code flow. Originally, when i_rdev was checked first, likely(!inode->i_rdev) made sense because most inodes were(?) regular files/directories, thus i_rdev == 0. After the reorder, by the time we reach the i_rdev check, we have already confirmed the inode IS a block or character device. Block and character special files are precisely defined by having a device number (i_rdev), so !inode->i_rdev is now the rare edge case, not the common case. Branch profiling confirmed this is 100% mispredicted: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 2631904 100 devcgroup_inode_permission device_cgroup.h 24 Remove likely() to avoid giving the wrong hint to the CPU. Fixes: 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260107-likely_device-v1-1-0c55f83a7e47@debian.org Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14iomap: stash iomap read ctx in the private field of iomap_iterHongbo Li
It's useful to get filesystem-specific information using the existing private field in the @iomap_iter passed to iomap_{begin,end} for advanced usage for iomap buffered reads, which is much like the current iomap DIO. For example, EROFS needs it to: - implement an efficient page cache sharing feature, since iomap needs to apply to anon inode page cache but we'd like to get the backing inode/fs instead, so filesystem-specific private data is needed to keep such information; - pass in both struct page * and void * for inline data to avoid kmap_to_page() usage (which is bogus). Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://patch.msgid.link/20260109102856.598531-2-lihongbo22@huawei.com Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-14i3c: mipi-i3c-hci: Allow for Multi-Bus InstancesAdrian Hunter
Add support for MIPI I3C Host Controllers with the Multi-Bus Instance capability. These controllers can host multiple I3C buses (up to 15) within a single hardware function (e.g., PCIe B/D/F), providing one indepedent HCI register set and corresponding I3C bus controller logic per bus. A separate platform device will represent each instance, but it is necessary to allow for shared resources. Multi-bus instances share the same MMIO address space, but the ranges are not guaranteed to be contiguous. To avoid overlapping mappings, pass base_regs from the parent mapping to child devices. Allow the IRQ to be shared among instances. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260106164416.67074-8-adrian.hunter@intel.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2026-01-14i3c: Add stub functions when I3C support is disabledFrank Li
When I3C is disabled, unused functions are removed by the linker because the driver relies on regmap and no I3C devices are registered, so normal I3C paths are never called. However, some drivers may still call low-level I3C transfer helpers. Provide stub implementations to avoid adding conditional ifdefs everywhere. Add stubs for i3c_device_do_xfers() and i3c_device_get_supported_xfer_mode() only. Other stubs will be introduced when they are actually needed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512230418.nu3V6Yua-lkp@intel.com/ Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251230145718.4088694-1-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2026-01-14i3c: drop i3c_priv_xfer and i3c_device_do_priv_xfers()Frank Li
Drop i3c_priv_xfer and i3c_device_do_priv_xfers() after all driver switch to use new API. Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251215172405.2982801-1-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2026-01-14Merge tag 'iio-fixes-for-6.19a' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: IIO: 1st set of fixes for the 6.19 cycle The usual mixed bag of fixes for ancient problems plus some more recent ones. adi,ad7280a - Check for errors from spi_setup(). adi,ad3552r - Fix potential buffer overflow when setting to use the internal ramp. adi,ax5695r - Fill in the data for this device in the chip info table. adi,ad7606 - Don't store a negative error in an unsigned int. adi,ad9467 - Fix incorrect register mask value. adi,adxl380 - Fix inverted condition for whether INT1 interrupt present in dt. atmel,at91-sama5d2 - Cancel work on remove to avoid a potential use-after-free invensense,icm45600 - Fix temperature scaling. samsung,eynos_adc - Use of_platform_depolulate() to correctly clear up such that child devices are created correctly if the driver is rebound. sensiron,scd4x - Fix incorrect endianness reported to user-space. st,accel - Fix gain reported for the iis329dq. st,lsm6dsx - Hide event related interfaces on parts that don't support events. ti,pac1934 - Ensure output of clamp() is used rather than unclamped value. * tag 'iio-fixes-for-6.19a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source iio: accel: iis328dq: fix gain values iio: core: add separate lockdep class for info_exist_lock iio: chemical: scd4x: fix reported channel endianness iio: imu: inv_icm45600: fix temperature offset reporting iio: adc: exynos_adc: fix OF populate on driver rebind iio: dac: ad5686: add AD5695R to ad5686_chip_info_tbl iio: accel: adxl380: fix handling of unavailable "INT1" interrupt iio: imu: st_lsm6dsx: fix iio_chan_spec for sensors without event detection iio: adc: pac1934: Fix clamped value in pac1934_reg_snapshot iio: adc: ad9467: fix ad9434 vref mask iio: adc: ad7606: Fix incorrect type for error return variable iio: adc: ad7280a: handle spi_setup() errors in probe() iio: adc: at91-sama5d2_adc: Fix potential use-after-free in sama5d2_adc driver
2026-01-14tee: add revision sysfs attributeAristo Chen
Add a generic TEE revision sysfs attribute backed by a new optional get_tee_revision() callback. The revision string is diagnostic-only and must not be used to infer feature support. Signed-off-by: Aristo Chen <aristo.chen@canonical.com> Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2026-01-14ARM: s3c: remove a leftover hwmon-s3c.h header fileVladimir Zapolskiy
The last user of defined structures s3c_hwmon_pdata and s3c_hwmon_chcfg was removed in commit 0d297df03890 ("ARM: s3c: simplify platform code"), thus the platform data header file itself can be removed also. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Link: https://patch.msgid.link/20260112211554.3755188-1-vz@mleia.com Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2026-01-14mm_zone: Generalise has_managed_dma()Robin Murphy
It would be useful to be able to check for potential DMA pages beyond just ZONE_DMA - generalise the existing has_managed_dma() function to allow checking other zones too. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: Vladimir Kondratiev <vladimir.kondratiev@mobileye.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/bd002d2351074e57be1ca08f03f333debac658fb.1768230104.git.robin.murphy@arm.com
2026-01-14vdso: Switch get/put_unaligned() from packed struct to memcpy()Ian Rogers
Type punning is necessary for get/put_unaligned() but the use of a packed struct violates strict aliasing rules, requiring -fno-strict-aliasing to be passed to the C compiler. Switch to using memcpy() so that -fno-strict-aliasing isn't necessary. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251016205126.2882625-3-irogers@google.com
2026-01-14vdso: Remove struct getcpu_cacheThomas Weißschuh
The cache parameter of getcpu() is useless nowadays for various reasons. * It is never passed by userspace for either the vDSO or syscalls. * It is never used by the kernel. * It could not be made to work on the current vDSO architecture. * The structure definition is not part of the UAPI headers. * vdso_getcpu() is superseded by restartable sequences in any case. Remove the struct and its header. As a side-effect this gets rid of an unwanted inclusion of the linux/ header namespace from vDSO code. [ tglx: Adapt to s390 upstream changes */ Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Link: https://patch.msgid.link/20251230-getcpu_cache-v3-1-fb9c5f880ebe@linutronix.de
2026-01-13Merge branch '20260105-kvmrprocv10-v10-0-022e96815380@oss.qualcomm.com' into ↵Bjorn Andersson
drivers-for-6.20 Merge the support for loading and managing the TrustZone-based remote processors found in the Glymur platform through a topic branch, as it's a mix of qcom-soc and remoteproc patches.
2026-01-13btf: Optimize type lookup with binary searchDonglin Peng
Improve btf_find_by_name_kind() performance by adding binary search support for sorted types. Falls back to linear search for compatibility. Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260109130003.3313716-7-dolinux.peng@gmail.com
2026-01-13struct filename ->refcnt doesn't need to be atomicAl Viro
... or visible outside of audit, really. Note that references held in delayed_filename always have refcount 1, and from the moment of complete_getname() or equivalent point in getname...() there won't be any references to struct filename instance left in places visible to other threads. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13allow incomplete imports of filenamesAl Viro
There are two filename-related problems in io_uring and its interplay with audit. Filenames are imported when request is submitted and used when it is processed. Unfortunately, the latter may very well happen in a different thread. In that case the reference to filename is put into the wrong audit_context - that of submitting thread, not the processing one. Audit logics is called by the latter, and it really wants to be able to find the names in audit_context current (== processing) thread. Another related problem is the headache with refcounts - normally all references to given struct filename are visible only to one thread (the one that uses that struct filename). io_uring violates that - an extra reference is stashed in audit_context of submitter. It gets dropped when submitter returns to userland, which can happen simultaneously with processing thread deciding to drop the reference it got. We paper over that by making refcount atomic, but that means pointless headache for everyone. Solution: the notion of partially imported filenames. Namely, already copied from userland, but *not* exposed to audit yet. io_uring can create that in submitter thread, and complete the import (obtaining the usual reference to struct filename) in processing thread. Object: struct delayed_filename. Primitives for working with it: delayed_getname(&delayed_filename, user_string) - copies the name from userland, returning 0 and stashing the address of (still incomplete) struct filename in delayed_filename on success and returning -E... on error. delayed_getname_uflags(&delayed_filename, user_string, atflags) - similar, in the same relation to delayed_getname() as getname_uflags() is to getname() complete_getname(&delayed_filename) - completes the import of filename stashed in delayed_filename and returns struct filename to caller, emptying delayed_filename. CLASS(filename_complete_delayed, name)(&delayed_filename) - variant of CLASS(filename) with complete_getname() for constructor. dismiss_delayed_filename(&delayed_filename) - destructor; drops whatever might be stashed in delayed_filename, emptying it. putname_to_delayed(&delayed_filename, name) - if name is shared, stashes its copy into delayed_filename and drops the reference to name, otherwise stashes the name itself in there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13fs: hide names_cache behind runtime const machineryMateusz Guzik
s/names_cachep/names_cache/ for consistency with dentry cache. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13struct filename: saner handling of long namesAl Viro
Always allocate struct filename from names_cachep, long name or short; short names would be embedded into struct filename. Longer ones do not cannibalize the original struct filename - put them into kmalloc'ed buffers (PATH_MAX-sized for import from userland, strlen() + 1 - for ones originating kernel-side, where we know the length beforehand). Cutoff length for short names is chosen so that struct filename would be 192 bytes long - that's both a multiple of 64 and large enough to cover the majority of real-world uses. Simplifies logics in getname()/putname() and friends. [fixed an embarrassing braino in EMBEDDED_NAME_MAX, first reported by Dan Carpenter] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13struct filename: use names_cachep only for getname() and friendsAl Viro
Instances of struct filename come from names_cachep (via __getname()). That is done by getname_flags() and getname_kernel() and these two are the main callers of __getname(). However, there are other callers that simply want to allocate PATH_MAX bytes for uses that have nothing to do with struct filename. We want saner allocation rules for long pathnames, so that struct filename would *always* come from names_cachep, with the out-of-line pathname getting kmalloc'ed. For that we need to be able to change the size of objects allocated by getname_flags()/getname_kernel(). That requires the rest of __getname() users to stop using names_cachep; we could explicitly switch all of those to kmalloc(), but that would cause quite a bit of noise. So the plan is to switch getname_...() to new helpers and turn __getname() into a wrapper for kmalloc(). Remaining __getname() users could be converted to explicit kmalloc() at leisure, hopefully along with figuring out what size do they really want - PATH_MAX is an overkill for some of them, used out of laziness ("we have a convenient helper that does 4K allocations and that's large enough, let's use it"). As a side benefit, names_cachep is no longer used outside of fs/namei.c, so we can move it there and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13get rid of audit_reusename()Al Viro
Originally we tried to avoid multiple insertions into audit names array during retry loop by a cute hack - memorize the userland pointer and if there already is a match, just grab an extra reference to it. Cute as it had been, it had problems - two identical pointers had audit aux entries merged, two identical strings did not. Having different behaviour for syscalls that differ only by addresses of otherwise identical string arguments is obviously wrong - if nothing else, compiler can decide to merge identical string literals. Besides, this hack does nothing for non-audited processes - they get a fresh copy for retry. It's not time-critical, but having behaviour subtly differ that way is bogus. These days we have very few places that import filename more than once (9 functions total) and it's easy to massage them so we get rid of all re-imports. With that done, we don't need audit_reusename() anymore. There's no need to memorize userland pointer either. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13allow to use CLASS() for struct filename *Al Viro
Not all users match that model, but most of them do. By the end of the series we'll be left with very few irregular ones... Added: CLASS(filename, name)(user_path) => getname(user_path) CLASS(filename_kernel, name)(string) => getname_kernel(string) CLASS(filename_flags, name)(user_path, flags) => getname_flags(user_path, flags) CLASS(filename_uflags, name)(user_path, flags) => getname_uflags(user_path, flags) CLASS(filename_maybe_null, name)(user_path, flags) => getname_maybe_null(user_path, flags) all with putname() as destructor. "flags" in filename_flags is in LOOKUP_... space, only LOOKUP_EMPTY matters. "flags" in filename_uflags and filename_maybe_null is in AT_...... space, and only AT_EMPTY_PATH matters. filename_flags conventions might be worth reconsidering later (it might or might not be better off with boolean instead) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13firmware: qcom_scm: Add qcom_scm_pas_get_rsc_table() to get resource tableMukesh Ojha
Qualcomm remote processor may rely on Static and Dynamic resources for it to be functional. Static resources are fixed like for example, memory-mapped addresses required by the subsystem and dynamic resources, such as shared memory in DDR etc., are determined at runtime during the boot process. For most of the Qualcomm SoCs, when run with Gunyah or older QHEE hypervisor, all the resources whether it is static or dynamic, is managed by the hypervisor. Dynamic resources if it is present for a remote processor will always be coming from secure world via SMC call while static resources may be present in remote processor firmware binary or it may be coming qcom_scm_pas_get_rsc_table() SMC call along with dynamic resources. Some of the remote processor drivers, such as video, GPU, IPA, etc., do not check whether resources are present in their remote processor firmware binary. In such cases, the caller of this function should set input_rt and input_rt_size as NULL and zero respectively. Remoteproc framework has method to check whether firmware binary contain resources or not and they should be pass resource table pointer to input_rt and resource table size to input_rt_size and this will be forwarded to TrustZone for authentication. TrustZone will then append the dynamic resources and return the complete resource table in the passed output buffer. More about documentation on resource table format can be found in include/linux/remoteproc.h Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-11-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13firmware: qcom_scm: Add a prep version of auth_and_reset functionMukesh Ojha
For memory passed to TrustZone (TZ), it must either be part of a pool registered with TZ or explicitly registered via SHMbridge SMC calls. When Gunyah hypervisor is present, PAS SMC calls from Linux running at EL1 are trapped by Gunyah running @ EL2, which handles SHMbridge creation for both metadata and remoteproc carveout memory before invoking the calls to TZ. On SoCs running with a non-Gunyah-based hypervisor, Linux must take responsibility for creating the SHM bridge before invoking PAS SMC calls. For the auth_and_reset() call, the remoteproc carveout memory must first be registered with TZ via a SHMbridge SMC call and once authentication and reset are complete, the SHMbridge memory can be deregistered. Introduce qcom_scm_pas_prepare_and_auth_reset(), which sets up the SHM bridge over the remoteproc carveout memory when Linux operates at EL2. This behavior is indicated by a new field added to the PAS context data structure. The function then invokes the auth_and_reset SMC call. Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-8-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13soc: qcom: mdtloader: Remove qcom_mdt_pas_init() from exported symbolsMukesh Ojha
qcom_mdt_pas_init() was previously used only by the remoteproc driver (drivers/remoteproc/qcom_q6v5_pas.c). Since that driver has now transitioned to using PAS context-based qcom_mdt_pas_load() function, making qcom_mdt_pas_init() obsolete for external use. Removes qcom_mdt_pas_init() from the list of exported symbols and make it static to limit its scope to internal use within mdtloader. Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-7-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13soc: qcom: mdtloader: Add PAS context aware qcom_mdt_pas_load() functionMukesh Ojha
Introduce a new PAS context-aware function, qcom_mdt_pas_load(), for remote processor drivers. This function utilizes the PAS context pointer returned from qcom_scm_pas_ctx_init() to perform firmware metadata verification and memory setup via SMC calls. The qcom_mdt_pas_load() and qcom_mdt_load() functions are largely similar, but the former is designed for clients using the PAS context-based data structure. Over time, all users of qcom_mdt_load() can be migrated to use qcom_mdt_pas_load() for consistency and improved abstraction. As the remoteproc PAS driver (qcom_q6v5_pas) has already adopted the PAS context-based approach, update it to use qcom_mdt_pas_load(). Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-6-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13remoteproc: pas: Replace metadata context with PAS context structureMukesh Ojha
As a superset of the existing metadata context, the PAS context structure enables both remoteproc and non-remoteproc subsystems to better support scenarios where the SoC runs with or without the Gunyah hypervisor. To reflect this, relevant SCM and metadata functions are updated to incorporate PAS context awareness and remove metadata context data structure completely. Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-5-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13firmware: qcom_scm: Introduce PAS context allocator helper functionMukesh Ojha
When the Peripheral Authentication Service (PAS) method runs on a SoC where Linux operates at EL2 (i.e., without the Gunyah hypervisor), the reset sequences are handled by TrustZone. In such cases, Linux must perform additional steps before invoking PAS SMC calls, such as creating a SHM bridge. Therefore, PAS SMC calls require awareness and handling of these additional steps when Linux runs at EL2. To support this, there is a need for a data structure that can be initialized prior to invoking any SMC or MDT functions. This structure allows those functions to determine whether they are operating in the presence or absence of the Gunyah hypervisor and behave accordingly. Currently, remoteproc and non-remoteproc subsystems use different variants of the MDT loader helper API, primarily due to differences in metadata context handling. Remoteproc subsystems retain the metadata context until authentication and reset are completed, while non-remoteproc subsystems (e.g., video, graphics, IPA, etc.) do not retain the metadata context and can free it within the qcom_scm_pas_init() call by passing a NULL context parameter and due to these differences, it is not possible to extend metadata context handling to support remoteproc and non remoteproc subsystem use PAS operations, when Linux operates at EL2. Add PAS context data structure allocator helper function. Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-4-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13firmware: qcom_scm: Rename peripheral as pas_idMukesh Ojha
Peripheral and pas_id refers to unique id for a subsystem and used only when peripheral authentication service from secure world is utilized. Lets rename peripheral to pas_id to reflect closer to its meaning. Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-3-022e96815380@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-13Merge tag 'hyperv-fixes-signed-20260112' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Minor fixes and cleanups for the MSHV driver * tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: mshv: release mutex on region invalidation failure hyperv: Avoid -Wflex-array-member-not-at-end warning mshv: hide x86-specific functions on arm64 mshv: Initialize local variables early upon region invalidation mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions
2026-01-13io_uring: track restrictions separately for IORING_OP and IORING_REGISTERJens Axboe
It's quite likely that only register opcode restrictions exists, in which case we'd never need to check the normal opcodes. Split ctx->restricted into two separate fields, one for I/O opcodes, and one for register opcodes. Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-13i3c: add sysfs entry and attribute for Device NACK Retry countAdrian Ng Ho Yin
Document sysfs attribute dev_nack_retry_cnt that controls the number of automatic retries performed by the I3C controller when a target device returns a NACK Add a `dev_nack_retry_count` sysfs attribute to allow reading and updating the device NACK retry count. A new `dev_nack_retry_count` field and an optional `set_dev_nack_retry()` callback are added to i3c_master_controller. The attribute is created only when the callback is implemented. Updates are applied under the I3C bus maintenance lock to ensure safe hardware reconfiguration. Signed-off-by: Adrian Ng Ho Yin <adrianhoyin.ng@altera.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/3c4b5082bde64024fc383c44bebeef89ad3c7ed3.1765529948.git.adrianhoyin.ng@altera.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2026-01-13block, nvme: remove unused dma_iova_state function parameterNitesh Shetty
DMA IOVA state is not used inside blk_rq_dma_map_iter_next, get rid of the argument. Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-13vdso: Add prototype for __vdso_clock_getres_time64()Thomas Weißschuh
For consistency with __vdso_clock_gettime64() there should also be a 64-bit variant of clock_getres(). This will allow the extension of CONFIG_COMPAT_32BIT_TIME to the vDSO and finally the removal of 32-bit time types from the kernel and UAPI. The generic vDSO library already provides nearly all necessary building blocks for architectures to provide this function. Only a prototype is missing. Add the prototype to the generic header so architectures can start providing this function. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-1-97ea7a06a543@linutronix.de
2026-01-13sched: Provide idle_rq() helperPeter Zijlstra
A fix for the dl_server 'requires' idle_cpu() usage, which made me note that it and available_idle_cpu() are extern function calls. And while idle_cpu() is used outside of kernel/sched/, available_idle_cpu() is not. This makes it hard to make idle_cpu() an inline helper, so provide idle_rq() and implement idle_cpu() and available_idle_cpu() using that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2026-01-13hrtimer: Drop _tv64() helpersThomas Weißschuh
Since ktime_t has become an alias to s64, these helpers are unnecessary. Migrate the few remaining users to the regular helpers and remove the now dead code. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-3-1a698ef0ddae@linutronix.de
2026-01-13hrtimer: Remove public definition of HIGH_RES_NSECThomas Weißschuh
This constant is only used in a single place and is has a very generic name polluting the global namespace. Move the constant closer to its only user. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-2-1a698ef0ddae@linutronix.de
2026-01-13hrtimer: Remove unused resolution constantsThomas Weißschuh
These constants are never used, remove them. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-1-1a698ef0ddae@linutronix.de
2026-01-13net: add net.core.qdisc_max_burstEric Dumazet
In blamed commit, I added a check against the temporary queue built in __dev_xmit_skb(). Idea was to drop packets early, before any spinlock was acquired. if (unlikely(defer_count > READ_ONCE(q->limit))) { kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP); return NET_XMIT_DROP; } It turned out that HTB Qdisc has a zero q->limit. HTB limits packets on a per-class basis. Some of our tests became flaky. Add a new sysctl : net.core.qdisc_max_burst to control how many packets can be stored in the temporary lockless queue. Also add a new QDISC_BURST_DROP drop reason to better diagnose future issues. Thanks Neal ! Fixes: 100dfa74cad9 ("net: dev_queue_xmit() llist adoption") Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-13fs: report filesystem and file I/O errors to fsnotifyDarrick J. Wong
Create some wrapper code around struct super_block so that filesystems have a standard way to queue filesystem metadata and file I/O error reports to have them sent to fsnotify. If a filesystem wants to provide an error number, it must supply only negative error numbers. These are stored internally as negative numbers, but they are converted to positive error numbers before being passed to fanotify, per the fanotify(7) manpage. Implementations of super_operations::report_error are passed the raw internal event data. Note that we have to play some shenanigans with mempools and queue_work so that the error handling doesn't happen outside of process context, and the event handler functions (both ->report_error and fsnotify) can handle file I/O error messages without having to worry about whatever locks might be held. This asynchronicity requires that unmount wait for pending events to clear. Add a new callback to the superblock operations structure so that filesystem drivers can themselves respond to file I/O errors if they so desire. This will be used for an upcoming self-healing patchset for XFS. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/176826402610.3490369.4378391061533403171.stgit@frogsfrogsfrogs Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-13uapi: promote EFSCORRUPTED and EUCLEAN to errno.hDarrick J. Wong
Stop definining these privately and instead move them to the uapi errno.h so that they become canonical instead of copy pasta. Cc: linux-api@vger.kernel.org Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/176826402587.3490369.17659117524205214600.stgit@frogsfrogsfrogs Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definitionLorenzo Bianconi
Fix Typo in airoha_ppe_dev_setup_tc_block_cb routine definition when CONFIG_NET_AIROHA is not enabled. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601090517.Fj6v501r-lkp@intel.com/ Fixes: f45fc18b6de04 ("net: airoha: Add airoha_ppe_dev struct definition") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260109-airoha_ppe_dev_setup_tc_block_cb-typo-v1-1-282e8834a9f9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>