|
For many years btrfs has been using a copy of may_delete() in
fs/btrfs/ioctl.c:btrfs_may_delete(). Every time may_delete() is updated we
need to update the btrfs copy, and this is a maintenance burden. Currently
there are minor differences between both because the btrfs side lacks
updates done in may_delete().
Export may_delete() so that btrfs can use it, under the less generic
name may_delete_dentry(). While at it, change the calls in vfs_rmdir() to
pass a boolean literal instead of 1 and 0 as the last argument since the
argument has a bool type.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Link: https://patch.msgid.link/e09128fd53f01b19d0a58f0e7d24739f79f47f6d.1768307858.git.fdmanana@suse.com
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
When Firmware First is enabled, BIOS handles errors first and then it
makes them available to the kernel via the Common Platform Error Record
(CPER) sections (UEFI 2.11 Appendix N.2.13). Linux parses the CPER
sections via one of two similar paths, either ELOG or GHES. The errors
managed by ELOG are signaled to the BIOS by the I/O Machine Check
Architecture (I/O MCA).
Currently, ELOG and GHES show some inconsistencies in how they report to
userspace via trace events.
Therefore, make the two mentioned paths act similarly by tracing the CPER
CXL Protocol Error Section.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Link: https://patch.msgid.link/20260114101543.85926-6-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent
type and copies the CPER CXL protocol error information to a work data
structure.
Export the new symbol for reuse by ELOG.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260114101543.85926-5-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Move the CPER CXL protocol errors validity check out of
cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit
the serial number check only to CXL agents that are CXL devices (UEFI
v2.10, Appendix N.2.13).
Export the new symbol for reuse by ELOG.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260114101543.85926-4-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ghes_notify_nmi() is called for every NMI and must check whether the NMI was
generated because an error was signalled by platform firmware.
This check is very expensive as for each registered GHES NMI source it reads
from the acpi generic address attached to this error source to get the physical
address of the acpi_hest_generic_status block. It then checks the "block_status"
to see if an error was logged.
The ACPI/APEI code must create virtual mappings for each of those physical
addresses, and tear them down afterwards. On an Icelake system this takes around
15,000 TSC cycles. Enough to disturb efforts to profile system performance.
If that were not bad enough, there are some atomic accesses in the code path
that will cause cache line bounces between CPUs. A problem that gets worse as
the core count increases.
But BIOS changes neither the acpi generic address nor the physical address of
the acpi_hest_generic_status block. So this walk can be done once when the NMI is
registered to save the virtual address (unmapping if the NMI is ever unregistered).
The "block_status" can be checked directly in the NMI handler. This can be done
without any atomic accesses.
Resulting time to check that there is not an error record is around 900 cycles.
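A minimal user-space sketch of the idea, not the actual driver code: the status block address is resolved once at registration, and the handler only reads through the cached pointer. All names here (struct nmi_source, nmi_source_register(), nmi_source_pending(), demo_map()) are hypothetical, and the map callback stands in for the expensive address walk and mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware-owned status block. */
struct status_block {
	volatile uint32_t block_status;
};

/* Hypothetical per-source state; the kernel's own structures differ. */
struct nmi_source {
	struct status_block *cached;	/* resolved once at registration */
};

/* Fake firmware block and lookup, for demonstration only. */
static struct status_block demo_block;

static struct status_block *demo_map(void)
{
	return &demo_block;
}

/* Registration: pay for the expensive lookup/mapping exactly once. */
static int nmi_source_register(struct nmi_source *src,
			       struct status_block *(*map)(void))
{
	src->cached = map();
	return src->cached ? 0 : -1;
}

/* Unregistration: forget the cached mapping. */
static void nmi_source_unregister(struct nmi_source *src)
{
	src->cached = NULL;
}

/* Hot path: a plain read, with no mapping work and no atomics. */
static bool nmi_source_pending(const struct nmi_source *src)
{
	return src->cached && src->cached->block_status != 0;
}
```

The handler cost then reduces to one pointer test and one load per registered source.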
Reported-by: Andi Kleen <andi.kleen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20260112032239.30023-2-xueshuai@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The logic at ghes_new() prevents allocating too large records, by
checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB).
Yet, the allocation is done with the actual number of pages from the
CPER bios table location, which can be smaller.
Moreover, bad firmware could send data with a different size, which might
be bigger than the allocated memory, causing an OOPS:
Unable to handle kernel paging request at virtual address fff00000f9b40000
Mem abort info:
ESR = 0x0000000096000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000
[fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000
Internal error: Oops: 0000000096000007 [#1] SMP
Modules linked in:
CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT
Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
Workqueue: kacpi_notify acpi_os_execute_deferred
pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : hex_dump_to_buffer+0x30c/0x4a0
lr : hex_dump_to_buffer+0x328/0x4a0
sp : ffff800080e13880
x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083
x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004
x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083
x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010
x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020
x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008
x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020
x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000
x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008
Call trace:
hex_dump_to_buffer+0x30c/0x4a0 (P)
print_hex_dump+0xac/0x170
cper_estatus_print_section+0x90c/0x968
cper_estatus_print+0xf0/0x158
__ghes_print_estatus+0xa0/0x148
ghes_proc+0x1bc/0x220
ghes_notify_hed+0x5c/0xb8
notifier_call_chain+0x78/0x148
blocking_notifier_call_chain+0x4c/0x80
acpi_hed_notify+0x28/0x40
acpi_ev_notify_dispatch+0x50/0x80
acpi_os_execute_deferred+0x24/0x48
process_one_work+0x15c/0x3b0
worker_thread+0x2d0/0x400
kthread+0x148/0x228
ret_from_fork+0x10/0x20
Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44)
---[ end trace 0000000000000000 ]---
Prevent that by taking the actual allocated area into account when
checking for CPER length.
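As an illustration only, the check can be sketched as below. estatus_len_ok() and ESTATUS_MAX_SIZE are hypothetical names that mirror the 64KB cap mentioned above; the real check in the driver validates more than this.

```c
#include <stdbool.h>
#include <stddef.h>

#define ESTATUS_MAX_SIZE 65536u	/* mirrors the 64KB cap mentioned above */

/*
 * Hypothetical check: a record is acceptable only if it fits both the
 * global cap and the buffer that was really allocated for this source.
 */
static bool estatus_len_ok(size_t reported_len, size_t allocated_len)
{
	size_t limit = allocated_len < ESTATUS_MAX_SIZE ?
		       allocated_len : ESTATUS_MAX_SIZE;

	return reported_len != 0 && reported_len <= limit;
}
```

With a 4096-byte allocation, a firmware-reported length of 8192 is rejected even though it is well under the global cap.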
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/4e70310a816577fabf37d94ed36cde4ad62b1e0a.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is logic inside GHES/CPER to detect if the section_length
is too small, but it doesn't detect if it is too big.
Currently, if the firmware receives an ARM processor CPER record
stating that a section length is big, the kernel will blindly trust
section_length, producing a very long dump. For instance, a 67-byte
record with ERR_INFO_NUM set to 46198 and section length
set to 854918320 would dump a lot of data, going way past the
firmware memory-mapped area.
Fix it by adding logic that prevents the dump from going past the buffer
if ERR_INFO_NUM is too big, making it report instead:
[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[Hardware Error]: event severity: recoverable
[Hardware Error]: Error 0, type: recoverable
[Hardware Error]: section_type: ARM processor error
[Hardware Error]: MIDR: 0xff304b2f8476870a
[Hardware Error]: section length: 854918320, CPER size: 67
[Hardware Error]: section length is too big
[Hardware Error]: firmware-generated error record is incorrect
[Hardware Error]: ERR_INFO_NUM is 46198
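A rough sketch of this kind of bounds check, under stated assumptions: struct arm_err_hdr is a simplified, hypothetical header holding only the two fields discussed here, and ERR_INFO_SIZE is an assumed per-entry size, not a value taken from the specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header: only the fields used here. */
struct arm_err_hdr {
	uint16_t err_info_num;
	uint32_t section_length;
};

#define ERR_INFO_SIZE 32u	/* assumed size of one error-info entry */

static bool arm_section_fits(const struct arm_err_hdr *hdr, size_t cper_size)
{
	/* Claimed section length must not exceed the record we were given. */
	if (hdr->section_length > cper_size)
		return false;

	/* The error-info array must fit inside the claimed section. */
	size_t need = sizeof(*hdr) + (size_t)hdr->err_info_num * ERR_INFO_SIZE;

	return need <= hdr->section_length;
}
```

For the record quoted above (section length 854918320, CPER size 67), the first test already fails, so nothing past the buffer is ever dumped.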
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject and changelog tweaks ]
Link: https://patch.msgid.link/41cd9f6b3ace3cdff7a5e864890849e4b1c58b63.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Write down the missing members definitions for struct export_operations,
using as a reference the commit messages that created the members.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-3-acc1889de772@igalia.com
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Adding a `@` before the function names makes them recognizable as
kernel-docs, so they get correctly rendered in the documentation.
Even if they are already marked with `@` in the short one-line summary,
the kernel-docs will correctly favor the more detailed definition here.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-2-acc1889de772@igalia.com
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Without a space between %NAME_MAX and the plus sign, kernel-doc will
output ``NAME_MAX``+1, which escapes the last backtick and makes Sphinx
format a much larger string as monospaced text.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Link: https://patch.msgid.link/20260112-tonyk-fs_uuid-v1-1-acc1889de772@igalia.com
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the
common case in devcgroup_inode_permission()") reordered the checks in
devcgroup_inode_permission() to check the inode mode before checking
i_rdev, for better cache behavior.
However, the likely() annotation on the i_rdev check was not updated
to reflect the new code flow. Originally, when i_rdev was checked
first, likely(!inode->i_rdev) made sense because most inodes were(?)
regular files/directories, thus i_rdev == 0.
After the reorder, by the time we reach the i_rdev check, we have
already confirmed the inode IS a block or character device. Block and
character special files are precisely defined by having a device number
(i_rdev), so !inode->i_rdev is now the rare edge case, not the common
case.
Branch profiling confirmed this is 100% mispredicted:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 2631904 100 devcgroup_inode_permission device_cgroup.h 24
Remove likely() to avoid giving the wrong hint to the CPU.
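A simplified user-space model of the resulting flow, assuming GCC or Clang for __builtin_expect(). enum kind, struct node and needs_device_check() are hypothetical and only mirror the order of the two tests.

```c
#include <stdbool.h>

#define likely(x)	__builtin_expect(!!(x), 1)

enum kind { KIND_REGULAR, KIND_BLOCK, KIND_CHAR };	/* hypothetical */

struct node {
	enum kind kind;
	unsigned int rdev;	/* device number, 0 if none */
};

/* Returns true when the device-specific permission check must run. */
static bool needs_device_check(const struct node *n)
{
	/* Common case first: most nodes are not device nodes. */
	if (likely(n->kind == KIND_REGULAR))
		return false;

	/*
	 * Only device nodes get here, so a zero rdev is the rare case.
	 * No likely() on this test: the hint would point the wrong way.
	 */
	if (!n->rdev)
		return false;

	return true;
}
```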
Fixes: 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260107-likely_device-v1-1-0c55f83a7e47@debian.org
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
It's useful to get filesystem-specific information using the
existing private field in the @iomap_iter passed to iomap_{begin,end}
for advanced usage for iomap buffered reads, which is much like the
current iomap DIO.
For example, EROFS needs it to:
- implement an efficient page cache sharing feature, since iomap
needs to apply to anon inode page cache but we'd like to get the
backing inode/fs instead, so filesystem-specific private data is
needed to keep such information;
- pass in both struct page * and void * for inline data to avoid
kmap_to_page() usage (which is bogus).
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20260109102856.598531-2-lihongbo22@huawei.com
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add support for MIPI I3C Host Controllers with the Multi-Bus Instance
capability. These controllers can host multiple I3C buses (up to 15)
within a single hardware function (e.g., PCIe B/D/F), providing one
independent HCI register set and corresponding I3C bus controller logic
per bus.
A separate platform device will represent each instance, but it is
necessary to allow for shared resources.
Multi-bus instances share the same MMIO address space, but the ranges are
not guaranteed to be contiguous. To avoid overlapping mappings, pass
base_regs from the parent mapping to child devices.
Allow the IRQ to be shared among instances.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260106164416.67074-8-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
When I3C is disabled, unused functions are removed by the linker because
the driver relies on regmap and no I3C devices are registered, so normal
I3C paths are never called.
However, some drivers may still call low-level I3C transfer helpers.
Provide stub implementations to avoid adding conditional ifdefs everywhere.
Add stubs for i3c_device_do_xfers() and
i3c_device_get_supported_xfer_mode() only. Other stubs will be introduced
when they are actually needed.
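The stub pattern in general looks roughly like the sketch below. FEATURE_BUS, bus_do_xfers() and driver_read() are hypothetical names, and the -EOPNOTSUPP return value is an assumption rather than what the real stubs necessarily return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical config switch; the real implementation lives elsewhere. */
#ifdef FEATURE_BUS
int bus_do_xfers(void *dev, const void *xfers, int nxfers);
#else
/* Stub: lets callers compile and fail cleanly at run time. */
static inline int bus_do_xfers(void *dev, const void *xfers, int nxfers)
{
	(void)dev;
	(void)xfers;
	(void)nxfers;
	return -EOPNOTSUPP;
}
#endif

/* A caller needs no #ifdef of its own. */
static int driver_read(void *dev)
{
	return bus_do_xfers(dev, NULL, 0);
}
```

Keeping the conditional in one header means each driver call site stays free of preprocessor guards.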
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512230418.nu3V6Yua-lkp@intel.com/
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251230145718.4088694-1-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Drop i3c_priv_xfer and i3c_device_do_priv_xfers() now that all drivers
have switched to the new API.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251215172405.2982801-1-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:
IIO: 1st set of fixes for the 6.19 cycle
The usual mixed bag of fixes for ancient problems plus some more
recent ones.
adi,ad7280a
- Check for errors from spi_setup().
adi,ad3552r
- Fix potential buffer overflow when setting to use the internal ramp.
adi,ad5695r
- Fill in the data for this device in the chip info table.
adi,ad7606
- Don't store a negative error in an unsigned int.
adi,ad9467
- Fix incorrect register mask value.
adi,adxl380
- Fix inverted condition for whether INT1 interrupt present in dt.
atmel,at91-sama5d2
- Cancel work on remove to avoid a potential use-after-free
invensense,icm45600
- Fix temperature scaling.
samsung,exynos_adc
- Use of_platform_depopulate() to correctly clear up such that child
devices are created correctly if the driver is rebound.
sensirion,scd4x
- Fix incorrect endianness reported to user-space.
st,accel
- Fix gain reported for the iis328dq.
st,lsm6dsx
- Hide event related interfaces on parts that don't support events.
ti,pac1934
- Ensure output of clamp() is used rather than unclamped value.
* tag 'iio-fixes-for-6.19a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source
iio: accel: iis328dq: fix gain values
iio: core: add separate lockdep class for info_exist_lock
iio: chemical: scd4x: fix reported channel endianness
iio: imu: inv_icm45600: fix temperature offset reporting
iio: adc: exynos_adc: fix OF populate on driver rebind
iio: dac: ad5686: add AD5695R to ad5686_chip_info_tbl
iio: accel: adxl380: fix handling of unavailable "INT1" interrupt
iio: imu: st_lsm6dsx: fix iio_chan_spec for sensors without event detection
iio: adc: pac1934: Fix clamped value in pac1934_reg_snapshot
iio: adc: ad9467: fix ad9434 vref mask
iio: adc: ad7606: Fix incorrect type for error return variable
iio: adc: ad7280a: handle spi_setup() errors in probe()
iio: adc: at91-sama5d2_adc: Fix potential use-after-free in sama5d2_adc driver
|
|
Add a generic TEE revision sysfs attribute backed by a new
optional get_tee_revision() callback. The revision string is
diagnostic-only and must not be used to infer feature support.
Signed-off-by: Aristo Chen <aristo.chen@canonical.com>
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
The last user of defined structures s3c_hwmon_pdata and s3c_hwmon_chcfg
was removed in commit 0d297df03890 ("ARM: s3c: simplify platform code"),
thus the platform data header file itself can be removed also.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Link: https://patch.msgid.link/20260112211554.3755188-1-vz@mleia.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
It would be useful to be able to check for potential DMA pages beyond
just ZONE_DMA - generalise the existing has_managed_dma() function to
allow checking other zones too.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Vladimir Kondratiev <vladimir.kondratiev@mobileye.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/bd002d2351074e57be1ca08f03f333debac658fb.1768230104.git.robin.murphy@arm.com
|
|
Type punning is necessary for get/put_unaligned() but the use of a packed
struct violates strict aliasing rules, requiring -fno-strict-aliasing to be
passed to the C compiler.
Switch to using memcpy() so that -fno-strict-aliasing isn't necessary.
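The memcpy() approach can be sketched as follows. get_unaligned_u32() and put_unaligned_u32() are illustrative names, not the helpers touched by this change; the technique itself is standard C and does not depend on any aliasing-related compiler flag.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address. */
static inline uint32_t get_unaligned_u32(const void *p)
{
	uint32_t v;

	/* Typically compiled down to a single load where the CPU allows it. */
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Write a 32-bit value to a possibly unaligned address. */
static inline void put_unaligned_u32(uint32_t v, void *p)
{
	memcpy(p, &v, sizeof(v));
}
```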
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251016205126.2882625-3-irogers@google.com
|
|
The cache parameter of getcpu() is useless nowadays for various reasons.
* It is never passed by userspace for either the vDSO or syscalls.
* It is never used by the kernel.
* It could not be made to work on the current vDSO architecture.
* The structure definition is not part of the UAPI headers.
* vdso_getcpu() is superseded by restartable sequences in any case.
Remove the struct and its header.
As a side-effect this gets rid of an unwanted inclusion of the linux/
header namespace from vDSO code.
[ tglx: Adapt to s390 upstream changes ]
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Link: https://patch.msgid.link/20251230-getcpu_cache-v3-1-fb9c5f880ebe@linutronix.de
|
|
drivers-for-6.20
Merge the support for loading and managing the TrustZone-based remote
processors found in the Glymur platform through a topic branch, as it's
a mix of qcom-soc and remoteproc patches.
|
|
Improve btf_find_by_name_kind() performance by adding binary search
support for sorted types. Falls back to linear search for compatibility.
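A self-contained sketch of the search strategy, not the BTF code itself: struct named_type and find_by_name_kind() are hypothetical, and the caller is assumed to know whether the array is sorted by name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct named_type {		/* hypothetical stand-in for a BTF type */
	const char *name;
	int kind;
};

/* Returns the index of the match, or -1 if there is none. */
static int find_by_name_kind(const struct named_type *t, size_t n,
			     bool sorted, const char *name, int kind)
{
	if (sorted) {
		size_t lo = 0, hi = n;

		/* Lower bound: first entry whose name is >= the key. */
		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;

			if (strcmp(t[mid].name, name) < 0)
				lo = mid + 1;
			else
				hi = mid;
		}
		/* Several kinds may share a name; scan the equal run. */
		for (; lo < n && !strcmp(t[lo].name, name); lo++)
			if (t[lo].kind == kind)
				return (int)lo;
		return -1;
	}

	/* Fallback for unsorted input. */
	for (size_t i = 0; i < n; i++)
		if (!strcmp(t[i].name, name) && t[i].kind == kind)
			return (int)i;
	return -1;
}
```

The lower-bound form matters because a name may appear with more than one kind; a plain "any match" binary search could land in the middle of such a run.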
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-7-dolinux.peng@gmail.com
|
|
... or visible outside of audit, really. Note that references
held in delayed_filename always have refcount 1, and from the
moment of complete_getname() or equivalent point in getname...()
there won't be any references to struct filename instance left
in places visible to other threads.
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
There are two filename-related problems in io_uring and its
interplay with audit.
Filenames are imported when request is submitted and used when
it is processed. Unfortunately, the latter may very well
happen in a different thread. In that case the reference to
filename is put into the wrong audit_context - that of submitting
thread, not the processing one. Audit logic is called by
the latter, and it really wants to be able to find the names
in audit_context of the current (== processing) thread.
Another related problem is the headache with refcounts -
normally all references to given struct filename are visible
only to one thread (the one that uses that struct filename).
io_uring violates that - an extra reference is stashed in
audit_context of submitter. It gets dropped when submitter
returns to userland, which can happen simultaneously with
processing thread deciding to drop the reference it got.
We paper over that by making refcount atomic, but that means
pointless headache for everyone.
Solution: the notion of partially imported filenames. Namely,
already copied from userland, but *not* exposed to audit yet.
io_uring can create that in submitter thread, and complete the
import (obtaining the usual reference to struct filename) in
processing thread.
Object: struct delayed_filename.
Primitives for working with it:
delayed_getname(&delayed_filename, user_string) - copies the name from
userland, returning 0 and stashing the address of (still incomplete)
struct filename in delayed_filename on success and returning -E... on
error.
delayed_getname_uflags(&delayed_filename, user_string, atflags) -
similar, in the same relation to delayed_getname() as getname_uflags()
is to getname()
complete_getname(&delayed_filename) - completes the import of filename
stashed in delayed_filename and returns struct filename to caller,
emptying delayed_filename.
CLASS(filename_complete_delayed, name)(&delayed_filename) - variant of
CLASS(filename) with complete_getname() for constructor.
dismiss_delayed_filename(&delayed_filename) - destructor; drops whatever
might be stashed in delayed_filename, emptying it.
putname_to_delayed(&delayed_filename, name) - if name is shared, stashes
its copy into delayed_filename and drops the reference to name, otherwise
stashes the name itself in there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
s/names_cachep/names_cache/ for consistency with dentry cache.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Always allocate struct filename from names_cachep, long name or short;
short names would be embedded into struct filename. Longer ones do
not cannibalize the original struct filename - put them into kmalloc'ed
buffers (PATH_MAX-sized for import from userland, strlen() + 1 - for
ones originating kernel-side, where we know the length beforehand).
Cutoff length for short names is chosen so that struct filename would be
192 bytes long - that's both a multiple of 64 and large enough to cover
the majority of real-world uses.
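The size arithmetic can be modelled in user space as below. The layout is hypothetical (the kernel's struct filename has different fields), and only the idea of deriving the embedded-name capacity from a fixed 192-byte object size is taken from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

#define FILENAME_SIZE 192	/* target object size from the text above */

/* Hypothetical header; stands in for the non-name members. */
struct filename_hdr {
	const char *name;	/* points at iname or at a separate buffer */
	int refcnt;
};

/* Whatever is left after the header holds the embedded short name. */
#define EMBEDDED_NAME_MAX (FILENAME_SIZE - sizeof(struct filename_hdr))

struct filename_model {
	struct filename_hdr hdr;
	char iname[EMBEDDED_NAME_MAX];
};

/* Fail the build if the layout ever drifts away from the target size. */
_Static_assert(sizeof(struct filename_model) == FILENAME_SIZE,
	       "filename_model must stay 192 bytes");

/* A name of len characters is embedded if it fits with its trailing NUL. */
static bool name_fits_inline(size_t len)
{
	return len + 1 <= EMBEDDED_NAME_MAX;
}
```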
Simplifies the logic in getname()/putname() and friends.
[fixed an embarrassing braino in EMBEDDED_NAME_MAX, first reported by
Dan Carpenter]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Instances of struct filename come from names_cachep (via
__getname()). That is done by getname_flags() and getname_kernel()
and these two are the main callers of __getname(). However, there are
other callers that simply want to allocate PATH_MAX bytes for uses that
have nothing to do with struct filename.
We want saner allocation rules for long pathnames, so that struct
filename would *always* come from names_cachep, with the out-of-line
pathname getting kmalloc'ed. For that we need to be able to change the
size of objects allocated by getname_flags()/getname_kernel().
That requires the rest of __getname() users to stop using
names_cachep; we could explicitly switch all of those to kmalloc(),
but that would cause quite a bit of noise. So the plan is to switch
getname_...() to new helpers and turn __getname() into a wrapper for
kmalloc(). Remaining __getname() users could be converted to explicit
kmalloc() at leisure, hopefully along with figuring out what size
they really want - PATH_MAX is overkill for some of them, used out
of laziness ("we have a convenient helper that does 4K allocations and
that's large enough, let's use it").
As a side benefit, names_cachep is no longer used outside
of fs/namei.c, so we can move it there and be done with that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Originally we tried to avoid multiple insertions into audit names array
during retry loop by a cute hack - memorize the userland pointer and
if there already is a match, just grab an extra reference to it.
Cute as it had been, it had problems - two identical pointers had
audit aux entries merged, two identical strings did not. Having
different behaviour for syscalls that differ only by addresses of
otherwise identical string arguments is obviously wrong - if nothing
else, compiler can decide to merge identical string literals.
Besides, this hack does nothing for non-audited processes - they get
a fresh copy for retry. It's not time-critical, but having behaviour
subtly differ that way is bogus.
These days we have very few places that import filename more than once
(9 functions total) and it's easy to massage them so we get rid of all
re-imports. With that done, we don't need audit_reusename() anymore.
There's no need to memorize userland pointer either.
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Not all users match that model, but most of them do. By the end of
the series we'll be left with very few irregular ones...
Added:
CLASS(filename, name)(user_path) =>
getname(user_path)
CLASS(filename_kernel, name)(string) =>
getname_kernel(string)
CLASS(filename_flags, name)(user_path, flags) =>
getname_flags(user_path, flags)
CLASS(filename_uflags, name)(user_path, flags) =>
getname_uflags(user_path, flags)
CLASS(filename_maybe_null, name)(user_path, flags) =>
getname_maybe_null(user_path, flags)
all with putname() as destructor.
"flags" in filename_flags is in LOOKUP_... space, only LOOKUP_EMPTY matters.
"flags" in filename_uflags and filename_maybe_null is in AT_...... space,
and only AT_EMPTY_PATH matters.
filename_flags conventions might be worth reconsidering later (it might or
might not be better off with boolean instead)
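The constructor/destructor pairing can be illustrated in user space with the GCC/Clang cleanup attribute, the same compiler feature the kernel's scope-based cleanup helpers build on. Everything named here (name_get(), name_put(), SCOPED_NAME(), name_len()) is hypothetical and only mimics the shape of the pattern.

```c
#include <stdlib.h>
#include <string.h>

static int live_names;	/* counts outstanding allocations for the demo */

static char *name_get(const char *src)	/* hypothetical constructor */
{
	size_t len = strlen(src) + 1;
	char *p = malloc(len);

	if (p) {
		memcpy(p, src, len);
		live_names++;
	}
	return p;
}

static void name_put(char **p)		/* hypothetical destructor */
{
	if (*p) {
		free(*p);
		live_names--;
	}
}

/* Declare a variable that is released when it goes out of scope. */
#define SCOPED_NAME(var, src) \
	char *var __attribute__((cleanup(name_put))) = name_get(src)

static size_t name_len(const char *src)
{
	SCOPED_NAME(name, src);

	if (!name)
		return 0;
	return strlen(name);	/* name_put() runs on every return path */
}
```

The point of the pattern is that no return path can forget the release call, which is what makes retry loops and early exits safe to write.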
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
A Qualcomm remote processor may rely on both static and dynamic resources
to be functional. Static resources are fixed, for example the
memory-mapped addresses required by the subsystem, while dynamic
resources, such as shared memory in DDR, are determined at
runtime during the boot process.
On most Qualcomm SoCs running the Gunyah or the older QHEE
hypervisor, all resources, whether static or dynamic, are
managed by the hypervisor. Dynamic resources, if present for a
remote processor, always come from the secure world via an SMC call,
while static resources may be present in the remote processor firmware
binary or may come from the qcom_scm_pas_get_rsc_table() SMC call along
with the dynamic resources.
Some remote processor drivers, such as video, GPU, and IPA, do
not check whether resources are present in their remote processor
firmware binary. In such cases, the caller of this function should set
input_rt and input_rt_size to NULL and zero respectively. The remoteproc
framework has a method to check whether the firmware binary contains
resources; its users should pass the resource table pointer as input_rt
and the resource table size as input_rt_size, and these are forwarded to
TrustZone for authentication. TrustZone then appends the dynamic
resources and returns the complete resource table in the passed output
buffer.
More documentation on the resource table format can be found in
include/linux/remoteproc.h
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-11-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
For memory passed to TrustZone (TZ), it must either be part of a pool
registered with TZ or explicitly registered via SHMbridge SMC calls.
When Gunyah hypervisor is present, PAS SMC calls from Linux running at
EL1 are trapped by Gunyah running at EL2, which handles SHMbridge
creation for both metadata and remoteproc carveout memory before
invoking the calls to TZ.
On SoCs running with a non-Gunyah-based hypervisor, Linux must take
responsibility for creating the SHM bridge before invoking PAS SMC
calls. For the auth_and_reset() call, the remoteproc carveout memory
must first be registered with TZ via a SHMbridge SMC call and once
authentication and reset are complete, the SHMbridge memory can be
deregistered.
Introduce qcom_scm_pas_prepare_and_auth_reset(), which sets up the SHM
bridge over the remoteproc carveout memory when Linux operates at EL2.
This behavior is indicated by a new field added to the PAS context data
structure. The function then invokes the auth_and_reset SMC call.
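The ordering described above can be sketched with stand-in functions. struct pas_ctx_model, its needs_bridge flag, and the three helpers are hypothetical stubs that only count calls; they are not the SCM API.

```c
#include <stdbool.h>

struct pas_ctx_model {
	bool needs_bridge;	/* hypothetical flag: Linux runs at EL2 */
	int bridges_live;	/* demo bookkeeping */
	int resets_done;	/* demo bookkeeping */
};

/* Stand-ins for the SHM bridge and auth_and_reset calls. */
static int bridge_create(struct pas_ctx_model *c)
{
	c->bridges_live++;
	return 0;
}

static void bridge_delete(struct pas_ctx_model *c)
{
	c->bridges_live--;
}

static int auth_and_reset(struct pas_ctx_model *c)
{
	c->resets_done++;
	return 0;
}

static int prepare_and_auth_reset(struct pas_ctx_model *c)
{
	int ret;

	/* A hypervisor handles the bridge: just make the call. */
	if (!c->needs_bridge)
		return auth_and_reset(c);

	ret = bridge_create(c);
	if (ret)
		return ret;

	ret = auth_and_reset(c);
	bridge_delete(c);	/* deregister whether or not the call succeeded */
	return ret;
}
```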
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-8-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
qcom_mdt_pas_init() was previously used only by the remoteproc driver
(drivers/remoteproc/qcom_q6v5_pas.c). That driver has now
transitioned to the PAS context-based qcom_mdt_pas_load() function,
making qcom_mdt_pas_init() obsolete for external use.
Remove qcom_mdt_pas_init() from the list of exported symbols and make
it static to limit its scope to internal use within mdtloader.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-7-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Introduce a new PAS context-aware function, qcom_mdt_pas_load(), for
remote processor drivers. This function utilizes the PAS context
pointer returned from qcom_scm_pas_ctx_init() to perform firmware
metadata verification and memory setup via SMC calls.
The qcom_mdt_pas_load() and qcom_mdt_load() functions are largely
similar, but the former is designed for clients using the PAS
context-based data structure. Over time, all users of qcom_mdt_load()
can be migrated to use qcom_mdt_pas_load() for consistency and
improved abstraction.
As the remoteproc PAS driver (qcom_q6v5_pas) has already adopted the
PAS context-based approach, update it to use qcom_mdt_pas_load().
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-6-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
As a superset of the existing metadata context, the PAS context
structure enables both remoteproc and non-remoteproc subsystems to
better support scenarios where the SoC runs with or without the Gunyah
hypervisor. To reflect this, relevant SCM and metadata functions are
updated to incorporate PAS context awareness, and the metadata context
data structure is removed completely.
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-5-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
When the Peripheral Authentication Service (PAS) method runs on a SoC
where Linux operates at EL2 (i.e., without the Gunyah hypervisor), the
reset sequences are handled by TrustZone. In such cases, Linux must
perform additional steps before invoking PAS SMC calls, such as creating
a SHM bridge. Therefore, PAS SMC calls require awareness and handling of
these additional steps when Linux runs at EL2.
To support this, there is a need for a data structure that can be
initialized prior to invoking any SMC or MDT functions. This structure
allows those functions to determine whether they are operating in the
presence or absence of the Gunyah hypervisor and behave accordingly.
Currently, remoteproc and non-remoteproc subsystems use different
variants of the MDT loader helper API, primarily due to differences in
metadata context handling. Remoteproc subsystems retain the metadata
context until authentication and reset are completed, while
non-remoteproc subsystems (e.g., video, graphics, IPA) do not
retain the metadata context and can free it within the
qcom_scm_pas_init() call by passing a NULL context parameter. Because of
these differences, it is not possible to extend metadata context
handling so that both remoteproc and non-remoteproc subsystems can use
PAS operations when Linux operates at EL2.
Add PAS context data structure allocator helper function.
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-4-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Both peripheral and pas_id refer to the unique ID of a subsystem, which is
used only when the peripheral authentication service from the secure world
is utilized. Rename peripheral to pas_id to better reflect its meaning.
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260105-kvmrprocv10-v10-3-022e96815380@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Minor fixes and cleanups for the MSHV driver
* tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
mshv: release mutex on region invalidation failure
hyperv: Avoid -Wflex-array-member-not-at-end warning
mshv: hide x86-specific functions on arm64
mshv: Initialize local variables early upon region invalidation
mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions
|
|
It's quite likely that only register opcode restrictions exist, in
which case we'd never need to check the normal opcodes. Split
ctx->restricted into two separate fields, one for I/O opcodes, and one
for register opcodes.
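As a rough userspace sketch of the split described above (the struct, the
field names, the bitmaps and the opcode counts are all hypothetical and do
not reflect the real io_uring context layout), each path tests only its
own flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode counts, small enough for one 64-bit bitmap each. */
#define SKETCH_OP_LAST       64
#define SKETCH_REGISTER_LAST 64

static_assert(SKETCH_OP_LAST <= 64, "I/O opcode bitmap too small");
static_assert(SKETCH_REGISTER_LAST <= 64, "register opcode bitmap too small");

/* Hypothetical stand-in for the ring context. */
struct sketch_ctx {
	bool op_restricted;   /* set when I/O opcode restrictions exist */
	bool reg_restricted;  /* set when register opcode restrictions exist */
	uint64_t op_allowed;  /* bitmap of permitted I/O opcodes */
	uint64_t reg_allowed; /* bitmap of permitted register opcodes */
};

/* I/O path: the bitmap is consulted only if I/O opcodes are restricted. */
static inline bool sketch_op_allowed(const struct sketch_ctx *ctx,
				     unsigned int op)
{
	if (op >= SKETCH_OP_LAST)
		return false;
	if (!ctx->op_restricted)
		return true;
	return (ctx->op_allowed & (UINT64_C(1) << op)) != 0;
}

/* Register path: independent of the I/O restriction flag. */
static inline bool sketch_reg_allowed(const struct sketch_ctx *ctx,
				      unsigned int op)
{
	if (op >= SKETCH_REGISTER_LAST)
		return false;
	if (!ctx->reg_restricted)
		return true;
	return (ctx->reg_allowed & (UINT64_C(1) << op)) != 0;
}
```

With only register restrictions installed, the I/O check in this sketch
returns right after reading one flag, which is the case the split aims at.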
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Document sysfs attribute dev_nack_retry_cnt that controls the number of
automatic retries performed by the I3C controller when a target device
returns a NACK.
Add a `dev_nack_retry_count` sysfs attribute to allow reading and updating
the device NACK retry count. A new `dev_nack_retry_count` field and an
optional `set_dev_nack_retry()` callback are added to
i3c_master_controller. The attribute is created only when the callback is
implemented.
Updates are applied under the I3C bus maintenance lock to ensure safe
hardware reconfiguration.
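The pattern above can be sketched in userspace roughly as follows. This is
not the I3C core code: the sketch_* names, the pthread mutex standing in
for the bus maintenance lock, the demo callback and its limit of 7 retries
are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_master;

struct sketch_master_ops {
	/* Optional: left NULL when the hardware has no retry setting. */
	int (*set_dev_nack_retry)(struct sketch_master *master,
				  unsigned int count);
};

struct sketch_master {
	const struct sketch_master_ops *ops;
	pthread_mutex_t maintenance_lock; /* stand-in for the bus lock */
	unsigned int dev_nack_retry_count;
};

/* The attribute is exposed only when the callback is implemented. */
static inline bool sketch_attr_visible(const struct sketch_master *master)
{
	return master->ops != NULL && master->ops->set_dev_nack_retry != NULL;
}

/* Store handler: reprogram the hardware while holding the lock. */
static inline int sketch_store_retry_count(struct sketch_master *master,
					   unsigned int count)
{
	int ret;

	if (!sketch_attr_visible(master))
		return -EOPNOTSUPP;

	pthread_mutex_lock(&master->maintenance_lock);
	ret = master->ops->set_dev_nack_retry(master, count);
	if (ret == 0)
		master->dev_nack_retry_count = count;
	pthread_mutex_unlock(&master->maintenance_lock);

	return ret;
}

/* Demo callback with a made-up hardware limit of 7 retries. */
static inline int sketch_hw_set_retry(struct sketch_master *master,
				      unsigned int count)
{
	(void)master;
	return count > 7 ? -EINVAL : 0;
}
```

The cached count is updated only after the callback succeeds, so a
rejected value never becomes visible to readers in this sketch.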
Signed-off-by: Adrian Ng Ho Yin <adrianhoyin.ng@altera.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/3c4b5082bde64024fc383c44bebeef89ad3c7ed3.1765529948.git.adrianhoyin.ng@altera.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
DMA IOVA state is not used inside blk_rq_dma_map_iter_next, get
rid of the argument.
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For consistency with __vdso_clock_gettime64() there should also be a
64-bit variant of clock_getres(). This will allow the extension of
CONFIG_COMPAT_32BIT_TIME to the vDSO and finally the removal of 32-bit
time types from the kernel and UAPI. The generic vDSO library already
provides nearly all necessary building blocks for architectures to
provide this function. Only a prototype is missing.
Add the prototype to the generic header so architectures can start
providing this function.
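To give a feel for what such a variant looks like from the caller's side,
here is a userspace sketch. It is not the vDSO implementation: the
sketch_timespec64 type and the sketch_clock_getres_time64() name are
hypothetical, and the body simply wraps the libc clock_getres().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical timespec with 64-bit fields on every architecture. */
struct sketch_timespec64 {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Same contract as clock_getres(), but the result is written with
 * 64-bit fields. A NULL res only validates the clock ID.
 */
static inline int sketch_clock_getres_time64(clockid_t clock,
					     struct sketch_timespec64 *res)
{
	struct timespec ts;
	int ret = clock_getres(clock, &ts);

	if (ret != 0)
		return ret;

	if (res != NULL) {
		res->tv_sec = ts.tv_sec;
		res->tv_nsec = ts.tv_nsec;
	}
	return 0;
}
```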
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-1-97ea7a06a543@linutronix.de
|
|
A fix for the dl_server 'requires' idle_cpu() usage, which made me
note that it and available_idle_cpu() are extern function calls.
And while idle_cpu() is used outside of kernel/sched/,
available_idle_cpu() is not.
This makes it hard to make idle_cpu() an inline helper, so provide
idle_rq() and implement idle_cpu() and available_idle_cpu() using
that.
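The layering can be sketched in userspace like this. Only the three
function names come from the text above; the runqueue struct, its fields,
the per-CPU array and the vcpu_preempted flag are simplified assumptions,
and the real idleness checks may test more state than shown here.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

struct sketch_task {
	int pid;
};

/* Hypothetical, heavily simplified runqueue. */
struct sketch_rq {
	struct sketch_task *curr; /* task currently on the CPU */
	struct sketch_task *idle; /* this CPU's idle task */
	unsigned int nr_running;  /* runnable tasks queued */
	bool vcpu_preempted;      /* made-up flag: host preempted this vCPU */
};

static struct sketch_rq sketch_runqueues[SKETCH_NR_CPUS];

static inline struct sketch_rq *sketch_cpu_rq(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return &sketch_runqueues[cpu];
}

/* Core test on a runqueue pointer; cheap to inline at every call site. */
static inline bool idle_rq(const struct sketch_rq *rq)
{
	return rq->curr == rq->idle && rq->nr_running == 0;
}

/* Both CPU-number helpers are thin wrappers around idle_rq(). */
static inline bool idle_cpu(int cpu)
{
	return idle_rq(sketch_cpu_rq(cpu));
}

static inline bool available_idle_cpu(int cpu)
{
	const struct sketch_rq *rq = sketch_cpu_rq(cpu);

	return idle_rq(rq) && !rq->vcpu_preempted;
}
```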
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Since ktime_t has become an alias to s64, these helpers are unnecessary.
Migrate the few remaining users to the regular helpers and remove the
now dead code.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-3-1a698ef0ddae@linutronix.de
|
|
This constant is only used in a single place and has a very generic
name polluting the global namespace.
Move the constant closer to its only user.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-2-1a698ef0ddae@linutronix.de
|
|
These constants are never used, remove them.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260107-hrtimer-header-cleanup-v1-1-1a698ef0ddae@linutronix.de
|
|
In the blamed commit, I added a check against the temporary queue
built in __dev_xmit_skb(). The idea was to drop packets early,
before any spinlock was acquired.
    if (unlikely(defer_count > READ_ONCE(q->limit))) {
        kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
        return NET_XMIT_DROP;
    }
It turned out that HTB Qdisc has a zero q->limit.
HTB limits packets on a per-class basis.
Some of our tests became flaky.
Add a new sysctl, net.core.qdisc_max_burst, to control
how many packets can be stored in the temporary lockless queue.
Also add a new QDISC_BURST_DROP drop reason to better diagnose
future issues.
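A userspace sketch of why the old bound misfires for a qdisc with a zero
limit, and how a separate burst bound avoids it. The replacement check is
not quoted in this message, so its shape here is an assumption; the
sketch_* names and the value 1000 are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the net.core.qdisc_max_burst knob; value is made up. */
static unsigned int sketch_qdisc_max_burst = 1000;

/* Hypothetical qdisc carrying only the field relevant here. */
struct sketch_qdisc {
	unsigned int limit; /* zero for a qdisc that limits per class */
};

/* Old bound: compares the deferred packet count with q->limit. */
static inline bool sketch_drop_by_limit(const struct sketch_qdisc *q,
					unsigned int defer_count)
{
	return defer_count > q->limit;
}

/* Assumed new bound: independent of the qdisc's own limit. */
static inline bool sketch_drop_by_burst(unsigned int defer_count)
{
	return defer_count > sketch_qdisc_max_burst;
}
```

With limit set to zero, the old bound in this sketch drops as soon as one
packet is deferred, while the burst bound only trips once the temporary
queue exceeds the configured value.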
Thanks Neal!
Fixes: 100dfa74cad9 ("net: dev_queue_xmit() llist adoption")
Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Create some wrapper code around struct super_block so that filesystems
have a standard way to queue filesystem metadata and file I/O error
reports to have them sent to fsnotify.
If a filesystem wants to provide an error number, it must supply only
negative error numbers. These are stored internally as negative
numbers, but they are converted to positive error numbers before being
passed to fanotify, per the fanotify(7) manpage. Implementations of
super_operations::report_error are passed the raw internal event data.
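The sign convention can be sketched as follows. The event struct and both
helper names are hypothetical, and treating zero as "no error number
supplied" is an assumption of this sketch rather than something stated
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical internal event record. */
struct sketch_fs_error_event {
	int error; /* negative errno as supplied, or 0 if none was given */
};

/* Accept only negative error numbers (or 0) from the filesystem. */
static inline int sketch_record_error(struct sketch_fs_error_event *ev,
				      int error)
{
	assert(ev != NULL);
	if (error > 0)
		return -EINVAL;
	ev->error = error;
	return 0;
}

/* Flip the sign at the boundary where the value leaves for fanotify. */
static inline int sketch_error_for_fanotify(const struct sketch_fs_error_event *ev)
{
	return -ev->error;
}
```

Keeping the conversion in one helper means the raw internal value stays
negative for any consumer that receives the event data directly.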
Note that we have to play some shenanigans with mempools and queue_work
so that the error handling doesn't happen outside of process context,
and the event handler functions (both ->report_error and fsnotify) can
handle file I/O error messages without having to worry about whatever
locks might be held. This asynchronicity requires that unmount wait for
pending events to clear.
Add a new callback to the superblock operations structure so that
filesystem drivers can themselves respond to file I/O errors if they so
desire. This will be used for an upcoming self-healing patchset for
XFS.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://patch.msgid.link/176826402610.3490369.4378391061533403171.stgit@frogsfrogsfrogs
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Stop defining these privately and instead move them to the uapi
errno.h so that they become canonical instead of copy pasta.
Cc: linux-api@vger.kernel.org
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://patch.msgid.link/176826402587.3490369.17659117524205214600.stgit@frogsfrogsfrogs
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Fix a typo in the airoha_ppe_dev_setup_tc_block_cb routine definition
used when CONFIG_NET_AIROHA is not enabled.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601090517.Fj6v501r-lkp@intel.com/
Fixes: f45fc18b6de04 ("net: airoha: Add airoha_ppe_dev struct definition")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260109-airoha_ppe_dev_setup_tc_block_cb-typo-v1-1-282e8834a9f9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|