| Age | Commit message (Collapse) | Author |
|
Follow an observation that (n ^ (n - 1)) will only ever retain the most
significant bit set in the word operated on if that is the only bit set in
the first place, and use it to determine whether a number is a whole power
of 2, avoiding the need for an explicit check for nonzero.
This reduces the sequence produced to 3 instructions only across Alpha,
MIPS, and RISC-V targets, down from 4, 5, and 4 respectively, removing a
branch in the two latter cases. And it's 5 instructions on POWER and
x86-64 vs 8 and 9 respectively. There are no branches now emitted here
for targets that have a suitable conditional set operation, although an
inline expansion will often end with one, depending on what code a call to
this function is used in.
Credit goes to GCC authors for coming up with this optimisation used as
the fallback for (__builtin_popcountl(n) == 1), equivalent to this code,
for targets where the hardware population count operation is considered
expensive.
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.2601111836250.30566@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Garry <john.g.garry@oracle.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Su Hui <suhui@nfschina.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Introduce KHO ABI header describing preservation ABI for memblock's
reserve_mem regions and link the relevant documentation to KHO docs.
[lukas.bulwahn@redhat.com: MAINTAINERS: adjust file entry in MEMBLOCK AND MEMORY MANAGEMENT INITIALIZATION]
Link: https://lkml.kernel.org/r/20260107090438.22901-1-lukas.bulwahn@redhat.com
[rppt@kernel.org: update reserved_mem node description, per Pratyush]
Link: https://lkml.kernel.org/r/aW_M-HYZzx5SkbnZ@kernel.org
Link: https://lkml.kernel.org/r/20260105165839.285270-7-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jason Miu <jasonmiu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The `struct kho_vmalloc` defines the in-memory layout for preserving
vmalloc regions across kexec. This layout is a contract between kernels
and part of the KHO ABI.
To reflect this relationship, the related structs and helper macros are
relocated to the ABI header, `include/linux/kho/abi/kexec_handover.h`.
This move places the structure's definition under the protection of the
KHO_FDT_COMPATIBLE version string.
The structure and its components are now also documented within the ABI
header to describe the contract and prevent ABI breaks.
[rppt@kernel.org: update comment, per Pratyush]
Link: https://lkml.kernel.org/r/aW_Mqp6HcqLwQImS@kernel.org
Link: https://lkml.kernel.org/r/20260105165839.285270-6-rppt@kernel.org
Signed-off-by: Jason Miu <jasonmiu@google.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Introduce the `include/linux/kho/abi/kexec_handover.h` header file, which
defines the stable ABI for the KHO mechanism. This header specifies how
preserved data is passed between kernels using an FDT.
The ABI contract includes the FDT structure, node properties, and the
"kho-v1" compatible string. By centralizing these definitions, this
header serves as the foundational agreement for inter-kernel communication
of preserved states, ensuring forward compatibility and preventing
misinterpretation of data across kexec transitions.
Since the ABI definitions are now centralized in the header files, the
YAML files that previously described the FDT interfaces are redundant.
These redundant files have therefore been removed.
Link: https://lkml.kernel.org/r/20260105165839.285270-5-rppt@kernel.org
Signed-off-by: Jason Miu <jasonmiu@google.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
memfd preservation ABI description starts with "This header defines" which
is fine in the header but reads weird in the generated html documentation.
Update it to make the generated documentation coherent.
Link: https://lkml.kernel.org/r/20260105165839.285270-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jason Miu <jasonmiu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "kho: ABI headers and Documentation updates".
LUO started adding KHO ABI headers to include/linux/kho/abi, but the core
parts of KHO and memblock are still using the old way for descriptions on
their ABIs.
Let's consolidate all things KHO in include/linux/kho/abi.
And while on that, make some documentation updates to have more coherent
KHO docs.
This patch (of 6):
LUO ABI description starts with "This header defines" which is fine in the
header but reads weird in the generated html documentation.
Update it to make the generated documentation coherent.
Link: https://lkml.kernel.org/r/20260105165839.285270-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20260105165839.285270-2-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Jason Miu <jasonmiu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Address page fault in ima_restore_measurement_list()", v3.
When the second-stage kernel is booted via kexec with a limiting command
line such as "mem=<size>" we observe a pafe fault that happens.
BUG: unable to handle page fault for address: ffff97793ff47000
RIP: ima_restore_measurement_list+0xdc/0x45a
#PF: error_code(0x0000) not-present page
This happens on x86_64 only, as this is already fixed in aarch64 in
commit: cbf9c4b9617b ("of: check previous kernel's ima-kexec-buffer
against memory bounds")
This patch (of 3):
When the second-stage kernel is booted with a limiting command line (e.g.
"mem=<size>"), the IMA measurement buffer handed over from the previous
kernel may fall outside the addressable RAM of the new kernel. Accessing
such a buffer can fault during early restore.
Introduce a small generic helper, ima_validate_range(), which verifies
that a physical [start, end] range for the previous-kernel IMA buffer lies
within addressable memory:
- On x86, use pfn_range_is_mapped().
- On OF based architectures, use page_is_ram().
Link: https://lkml.kernel.org/r/20251231061609.907170-1-harshit.m.mogalapalli@oracle.com
Link: https://lkml.kernel.org/r/20251231061609.907170-2-harshit.m.mogalapalli@oracle.com
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Betkov <bp@alien8.de>
Cc: guoweikang <guoweikang.kernel@gmail.com>
Cc: Henry Willard <henry.willard@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Jonathan McDowell <noodles@fb.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Webb <paul.x.webb@oracle.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Yifei Liu <yifei.l.liu@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This definition disarms the warning in uapi/linux/types.h about including
kernel headers from user space. However the warning is already disarmed
due to the fact that kernel code is built with -D__KERNEL__.
Drop the pointless definition.
Link: https://lkml.kernel.org/r/20251230-exported-headers-types-h-v1-1-947fc606f3d8@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Provide a variant of DEFINE_RES that takes 0 arguments to initialize an
"unset" resource descriptor.
This should be used for the improper case of
struct resource res = {};
where DEFINE_RES() should be used.
With this new helper variant, it would result in:
struct resource res = DEFINE_RES();
instead of having to define the full 3 arguments:
struct resource res = DEFINE_RES(0, 0, IORESOURCE_UNSET);
DEFINE_RES() with no args, will set the flags to IORESOURCE_UNSET
signaling the resource descriptor is UNSET and doesn't reflect an actual
resource currently.
Link: https://lkml.kernel.org/r/20251213115314.16700-1-ansuelsmth@gmail.com
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio. This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio. As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.
Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.
Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: memfd_luo hotfixes".
This series contains a couple of fixes for memfd preservation using LUO.
This patch (of 3):
The Live Update Orchestrator's (LUO) memfd preservation works by
preserving all the folios of a memfd, re-creating an empty memfd on the
next boot, and then inserting back the preserved folios.
Currently it creates the file by directly calling shmem_file_setup().
This leaves out other work done by alloc_file() like setting up the file
mode, flags, or calling the security hooks.
Export alloc_file() to let memfd_luo use it. Rename it to
memfd_alloc_file() since it is no longer private and thus needs a
subsystem prefix.
Link: https://lkml.kernel.org/r/20260122151842.4069702-1-pratyush@kernel.org
Link: https://lkml.kernel.org/r/20260122151842.4069702-2-pratyush@kernel.org
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A KASAN warning can be triggered when vrealloc() changes the requested
size to a value that is not aligned to KASAN_GRANULE_SIZE.
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at mm/kasan/shadow.c:174 kasan_unpoison+0x40/0x48
...
pc : kasan_unpoison+0x40/0x48
lr : __kasan_unpoison_vmalloc+0x40/0x68
Call trace:
kasan_unpoison+0x40/0x48 (P)
vrealloc_node_align_noprof+0x200/0x320
bpf_patch_insn_data+0x90/0x2f0
convert_ctx_accesses+0x8c0/0x1158
bpf_check+0x1488/0x1900
bpf_prog_load+0xd20/0x1258
__sys_bpf+0x96c/0xdf0
__arm64_sys_bpf+0x50/0xa0
invoke_syscall+0x90/0x160
Introduce a dedicated kasan_vrealloc() helper that centralizes KASAN
handling for vmalloc reallocations. The helper accounts for KASAN granule
alignment when growing or shrinking an allocation and ensures that partial
granules are handled correctly.
Use this helper from vrealloc_node_align_noprof() to fix poisoning logic.
[ryabinin.a.a@gmail.com: move kasan_enabled() check, fix build]
Link: https://lkml.kernel.org/r/20260119144509.32767-1-ryabinin.a.a@gmail.com
Link: https://lkml.kernel.org/r/20260113191516.31015-1-ryabinin.a.a@gmail.com
Fixes: d699440f58ce ("mm: fix vrealloc()'s KASAN poisoning logic")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: <joonki.min@samsung-slsi.corp-partner.google.com>
Closes: https://lkml.kernel.org/r/CANP3RGeuRW53vukDy7WDO3FiVgu34-xVJYkfpm08oLO3odYFrA@mail.gmail.com
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add support for displaying bitmasks in human-readable list format (e.g.,
0,2-5,7) in addition to the default hexadecimal bitmap representation.
This is particularly useful when tracing CPU masks and other large
bitmasks where individual bit positions are more meaningful than their
hexadecimal encoding.
When the "bitmask-list" option is enabled, the printk "%*pbl" format
specifier is used to render bitmasks as comma-separated ranges, making
trace output easier to interpret for complex CPU configurations and
large bitmask values.
Link: https://patch.msgid.link/20251226160724.2246493-2-atomlin@atomlin.com
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix the the buggy conversion of fuse_reverse_inval_entry() introduced
during the creation rework
- Disallow nfs delegation requests for directories by setting
simple_nosetlease()
- Require an opt-in for getting readdir flag bits outside of S_DT_MASK
set in d_type
- Fix scheduling delayed writeback work by only scheduling when the
dirty time expiry interval is non-zero and cancel the delayed work if
the interval is set to zero
- Use rounded_jiffies_interval for dirty time work
- Check the return value of sb_set_blocksize() for romfs
- Wait for batched folios to be stable in __iomap_get_folio()
- Use private naming for fuse hash size
- Fix the stale dentry cleanup to prevent a race that causes a UAF
* tag 'vfs-6.19-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
vfs: document d_dispose_if_unused()
fuse: shrink once after all buckets have been scanned
fuse: clean up fuse_dentry_tree_work()
fuse: add need_resched() before unlocking bucket
fuse: make sure dentry is evicted if stale
fuse: fix race when disposing stale dentries
fuse: use private naming for fuse hash size
writeback: use round_jiffies_relative for dirtytime_work
iomap: wait for batched folios to be stable in __iomap_get_folio
romfs: check sb_set_blocksize() return value
docs: clarify that dirtytime_expire_seconds=0 disables writeback
writeback: fix 100% CPU usage when dirtytime_expire_interval is 0
readdir: require opt-in for d_type flags
vboxsf: don't allow delegations to be set on directories
ceph: don't allow delegations to be set on directories
gfs2: don't allow delegations to be set on directories
9p: don't allow delegations to be set on directories
smb/client: properly disallow delegations on directories
nfs: properly disallow delegation requests on directories
fuse: fix conversion of fuse_reverse_inval_entry() to start_removing()
|
|
XDR enum decoders generated by xdrgen do not verify that incoming
values are valid members of the enum. Incoming out-of-range values
from malicious or buggy peers propagate through the system
unchecked.
Add validation logic to generated enum decoders using a switch
statement that explicitly lists valid enumerator values. The
compiler optimizes this to a simple range check when enum values
are dense (contiguous), while correctly rejecting invalid values
for sparse enums with gaps in their value ranges.
The --no-enum-validation option on the source subcommand disables
this validation when not needed.
The minimum and maximum fields in _XdrEnum, which were previously
unused placeholders for a range-based validation approach, have
been removed since the switch-based validation handles both dense
and sparse enums correctly.
Because the new mechanism results in substantive changes to
generated code, existing .x files are regenerated. Unrelated white
space and semicolon changes in the generated code are due to recent
commit 1c873a2fd110 ("xdrgen: Don't generate unnecessary semicolon")
and commit 38c4df91242b ("xdrgen: Address some checkpatch whitespace
complaints").
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The xdrgen decoders for strings and opaque data had an
optimization that skipped calling xdr_inline_decode() when the
item length was zero. This left the data pointer uninitialized,
which could lead to unpredictable behavior when callers access
it.
Remove the zero-length check and always call xdr_inline_decode().
When passed a length of zero, xdr_inline_decode() returns the
current buffer position, which is valid and matches the behavior
of hand-coded XDR decoders throughout the kernel.
Fixes: 4b132aacb076 ("tools: Add xdrgen")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
"short" and "unsigned short" types are not defined in RFC 4506, but
are supported by the rpcgen program. An upcoming protocol
specification includes at least one "unsigned short" field, so xdrgen
needs to implement support for these types.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
|
|
We need the driver-core fixes in here as well to build on top of.
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
One of SAI interfaces is connected to AUDMIX in the i.MX952 chip, but
AUDMIX can be bypassed or not bypassed on the i.MX952 platform.
There are three use cases:
1) SAI -> Codec (No AUDMIX between SAI and Codec)
2) SAI -> Codec (Has AUDMIX, but AUDMIX is bypassed)
3) SAI -> AUDMIX -> Codec (Has AUDMIX and used)
So add 'fsl,sai-amix-mode' property for this feature
fsl,sai-amix-mode = "none": is for case 1)
fsl,sai-amix-mode = "bypass": is for case 2)
fsl,sai-amix-mode = "audmix": is for case 3)
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260123082501.4050296-5-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The error path for bus initialization is missing a call to put_device().
Add that call. This error path will probably never actually execute,
but any kernel source code may be subject to static checking or re-use.
Cc: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Cc: Daniel Palmer <daniel@0x0f.com>
Suggested-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://patch.msgid.link/478d5f080d74b6688c9e3f9132e3fe251e997ad7.1765610469.git.fthain@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next
Suzuki writes:
coresight: Updates for Linux v6.20/v7.0
This batch of CoreSight hwtracing updates contains :
- Fine grained control of Timestamp generation in ETM4 trace, retaining backward
compatibility
- Feature updates for Qualcomm TPDA driver
- Support Qualcomm Interconnect TNOC
- Miscellaneous fixes to TMC-ETR driver
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
* tag 'coresight-next-v7.0' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/coresight/linux: (27 commits)
coresight: tmc: Decouple the perf buffer allocation from sysfs mode
coresight: tmc-etr: Fix race condition between sysfs and perf mode
coresight: tmc: Add missing doc including reading and etr_mode of struct tmc_drvdata
coresight-tnoc: Add runtime PM support for Interconnect TNOC
coresight-tnoc: add platform driver to support Interconnect TNOC
dt-bindings: arm: qcom: Add Coresight Interconnect TNOC
coresight: etm3x: Fix cpulocked warning on cpuhp
coresight: tpda: Fix intendation for sysfs interface documentation
coresight: tpda: add sysfs node to flush specific port
coresight: tpda: add logic to configure TPDA_SYNCR register
coresight: tpda: add global_flush_req sysfs node
coresight: tpda: add sysfs nodes for tpda cross-trigger configuration
coresight: docs: Document etm4x timestamp interval option
coresight: Extend width of timestamp format attribute
coresight: Prepare to allow setting the timestamp interval
coresight: Remove misleading definitions
coresight: Interpret ETMv4 config with ATTR_CFG_GET_FLD()
coresight: Interpret perf config with ATTR_CFG_GET_FLD()
coresight: Don't reject unrecognized ETMv3 format attributes
coresight: Interpret ETMv3 config with ATTR_CFG_GET_FLD()
...
|
|
We need the char/misc/iio fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
VM_HIGH_ARCH_5 is used for riscv.
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6
Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com>
Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-1-b55691eacf4f@rivosinc.com
[pjw@kernel.org: clarify subject; update to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
idev->mr_maxdelay is read and written locklessly,
add READ_ONCE()/WRITE_ONCE() annotations.
While we are at it, make this field an u32.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260122172247.2429403-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
NETDEV_RSS_KEY_LEN has been set to 52 bytes in 2014, until now.
Jakub suggested we bump the size to 128 bytes or more.
Some drivers (like idpf) were already working around the core limit.
Since this change might cause some issues in admin scripts,
bump it directly to 256 in one go.
tjbp26:~# cat /proc/sys/net/core/netdev_rss_key | wc -c
768
tjbp26:~# ethtool -x eth1
RX flow hash indirection table for eth1 with 32 RX ring(s):
...
RSS hash key:
fe:16:5b:2f:93:85:c2:c9:c1:ef:bd:60:c6:e0:2b:99:4d:bf:b7:14:c8:1e:8d:cb:31:17:51:da:55:eb:91:d9:9e:f9:89:9b:44:a1:dc:08:72:3a:b3:d6:31:86:9a:fe:02:3a:0d:eb:a1:7c:f5:a3:51:3b:08:56:c9:3f:71:69:01:ba:70:38
RSS hash function:
toeplitz: on
xor: off
crc32: off
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/netdev/20260122075206.504ec591@kernel.org/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260122190349.2771064-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
clang is not inlining skb_frag_unref() and __skb_frag_unref()
in gro fast path.
It also does not inline gro_try_pull_from_frag0().
Using __always_inline fixes this issue, makes the
kernel faster _and_ smaller.
Also change __skb_frag_ref(), skb_frag_ref() and skb_page_unref()
to let them inlined for the last patch in this series.
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.1
add/remove: 2/6 grow/shrink: 1/2 up/down: 218/-511 (-293)
Function old new delta
gro_pull_from_frag0 - 188 +188
__pfx_gro_pull_from_frag0 - 16 +16
skb_shift 1125 1139 +14
__pfx_skb_frag_unref 16 - -16
__pfx_gro_try_pull_from_frag0 16 - -16
__pfx___skb_frag_unref 16 - -16
__skb_frag_unref 36 - -36
skb_frag_unref 59 - -59
dev_gro_receive 1608 1514 -94
napi_gro_frags 892 771 -121
gro_try_pull_from_frag0 153 - -153
Total: Before=22566192, After=22565899, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260122045720.1221017-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following (anti-)pattern was observed in the code tree:
do {
start = u64_stats_fetch_begin(&pstats->syncp);
memcpy(&temp, &pstats->stats, sizeof(temp));
} while (u64_stats_fetch_retry(&pstats->syncp, start));
On 64bit arches, struct u64_stats_sync is empty and provides no help
against load/store tearing, especially for memcpy(), for which arches may
provide their highly-optimized implements.
In theory the affected code should convert to u64_stats_t, or use
READ_ONCE()/WRITE_ONCE() properly.
However since there are needs to copy chunks of statistics, instead of
writing loops at random places, we provide a safe memcpy() variant for
u64_stats.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260120092137.2161162-2-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/iio driver fixes from Greg KH:
"Here are some small char/misc/iio and some other minor driver
subsystem fixes for 6.19-rc7. Nothing huge here, just some fixes for
reported issues including:
- lots of little iio driver fixes
- comedi driver fixes
- mux driver fix
- w1 driver fixes
- uio driver fix
- slimbus driver fixes
- hwtracing bugfix
- other tiny bugfixes
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (36 commits)
comedi: dmm32at: serialize use of paged registers
mei: trace: treat reg parameter as string
uio: pci_sva: correct '-ENODEV' check logic
uacce: ensure safe queue release with state management
uacce: implement mremap in uacce_vm_ops to return -EPERM
uacce: fix isolate sysfs check condition
uacce: fix cdev handling in the cleanup path
slimbus: core: clean up of_slim_get_device()
slimbus: core: fix of_slim_get_device() kernel doc
slimbus: core: amend slim_get_device() kernel doc
slimbus: core: fix device reference leak on report present
slimbus: core: fix runtime PM imbalance on report present
slimbus: core: fix OF node leak on registration failure
intel_th: rename error label
intel_th: fix device leak on output open()
comedi: Fix getting range information for subdevices 16 to 255
mux: mmio: Fix IS_ERR() vs NULL check in probe()
interconnect: debugfs: initialize src_node and dst_node to empty strings
iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source
iio: accel: iis328dq: fix gain values
...
|
|
Implement session cookie for fsession. The session cookies will be stored
in the stack, and the layout of the stack will look like this:
return value -> 8 bytes
argN -> 8 bytes
...
arg1 -> 8 bytes
nr_args -> 8 bytes
ip (optional) -> 8 bytes
cookie2 -> 8 bytes
cookie1 -> 8 bytes
The offset of the cookie for the current bpf program, which is in 8-byte
units, is stored in the
"(((u64 *)ctx)[-1] >> BPF_TRAMP_COOKIE_INDEX_SHIFT) & 0xFF". Therefore, we
can get the session cookie with ((u64 *)ctx)[-offset].
Implement and inline the bpf_session_cookie() for the fsession in the
verifier.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20260124062008.8657-6-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
If fsession exists, we will use the bit (1 << BPF_TRAMP_IS_RETURN_SHIFT)
in ((u64 *)ctx)[-1] to store the "is_return" flag.
The logic of bpf_session_is_return() for fsession is implemented in the
verifier by inline following code:
bool bpf_session_is_return(void *ctx)
{
return (((u64 *)ctx)[-1] >> BPF_TRAMP_IS_RETURN_SHIFT) & 1;
}
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260124062008.8657-5-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The fsession is something that similar to kprobe session. It allow to
attach a single BPF program to both the entry and the exit of the target
functions.
Introduce the struct bpf_fsession_link, which allows to add the link to
both the fentry and fexit progs_hlist of the trampoline.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260124062008.8657-2-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Always inline I/O and IRQ methods using build_assert!() to avoid
false positive build errors
- Do not free the driver's device private data in I2C shutdown()
avoiding race conditions that can lead to UAF bugs
- Drop the driver's device private data after the driver has been
fully unbound from its device to avoid UAF bugs from &Device<Bound>
scopes, such as IRQ callbacks
* tag 'driver-core-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
rust: driver: drop device private data post unbind
rust: driver: add DriverData type to the DriverLayout trait
rust: driver: add DEVICE_DRIVER_OFFSET to the DriverLayout trait
rust: driver: introduce a DriverLayout trait
rust: auxiliary: add Driver::unbind() callback
rust: i2c: do not drop device private data on shutdown()
rust: irq: always inline functions using build_assert with arguments
rust: io: always inline functions using build_assert with arguments
|
|
In clang version 21.1 and later the -Wimplicit-enum-enum-cast warning
option has been introduced. This warning is enabled by default and can
be used to catch .queuecommand() implementations that return another
value than 0 or one of the SCSI_MLQUEUE_* constants. Hence this patch
that changes the return type of the .queuecommand() implementations from
'int' into 'enum scsi_qc_status'. No functionality has been changed.
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260115210357.2501991-6-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Merge in patches to support several patch series such as Soft Reserve
handling, type2 accelerator enabling, and LSA 2.1 labeling support.
Mainly addition of cxl_memdev_attach() to allow the memdev probe
to make a decision of proceed/fail depending success of CXL topology
enumeration.
dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
cxl/mem: Drop @host argument to devm_cxl_add_memdev()
cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
cxl/port: Arrange for always synchronous endpoint attach
cxl/mem: Arrange for always-synchronous memdev attach
cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
|
|
Some/most devices implementing gso_partial need to disable the GSO partial
features when the IP ID can't be mangled; to that extend each of them
implements something alike the following[1]:
if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
features &= ~NETIF_F_TSO;
in the ndo_features_check() op, which leads to a bit of duplicate code.
Later patch in the series will implement GSO partial support for virtual
devices, and the current status quo will require more duplicate code and
a new indirect call in the TX path for them.
Introduce the mangleid_features mask, allowing the core to disable NIC
features based on/requiring MANGLEID, without any further intervention
from the driver.
The same functionality could be alternatively implemented adding a single
boolean flag to the struct net_device, but would require an additional
checks in ndo_features_check().
Also note that [1] is incorrect if the NIC additionally implements
NETIF_F_GSO_UDP_L4, mangleid_features transparently handle even such a
case.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/5a7cdaeea40b0a29b88e525b6c942d73ed3b8ce7.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On the embedded platform, certain critical data, such as IMU data, is
transmitted through UART. The tty_flip_buffer_push() interface in the TTY
layer uses system_dfl_wq to handle the flipping of the TTY buffer.
Although the unbound workqueue can create new threads on demand and wake
up the kworker thread on an idle CPU, it may be preempted by real-time
tasks or other high-prio tasks.
flush_to_ldisc() needs to wake up the relevant data handle thread. When
executing __wake_up_common_lock(), it calls spin_lock_irqsave(), which
does not disable preemption but disables migration in RT-Linux. This
prevents the kworker thread from being migrated to other cores by CPU's
balancing logic, resulting in long delays. The call trace is as follows:
__wake_up_common_lock
__wake_up
ep_poll_callback
__wake_up_common
__wake_up_common_lock
__wake_up
n_tty_receive_buf_common
n_tty_receive_buf2
tty_ldisc_receive_buf
tty_port_default_receive_buf
flush_to_ldisc
In our system, the processing interval for each frame of IMU data
transmitted via UART can experience significant jitter due to this issue.
Instead of the expected 10 to 15 ms frame processing interval, we see
spikes up to 30 to 35 ms. Moreover, in just one or two hours, there can
be 2 to 3 occurrences of such high jitter, which is quite frequent. This
jitter exceeds the software's tolerable limit of 20 ms.
Introduce flip_wq in tty_port which can be set by tty_port_link_wq() or as
default linked to default workqueue allocated when tty_register_driver().
The default workqueue is allocated with flag WQ_SYSFS, so that cpumask and
nice can be set dynamically. The execution timing of tty_port_link_wq() is
not clearly restricted. The newly added function tty_port_link_driver_wq()
checks whether the flip_wq of the tty_port has already been assigned when
linking the default tty_driver's workqueue to the port. After the user has
set a custom workqueue for a certain tty_port using tty_port_link_wq(), the
system will only use this custom workqueue, even if tty_driver does not
have %TTY_DRIVER_CUSTOM_WORKQUEUE flag.
Introduce %TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
default single tty_driver workqueue. Two reasons why need to introduce the
%TTY_DRIVER_CUSTOM_WORKQUEUE flag:
1. If the WQ_SYSFS parameter is enabled, workqueue_sysfs_register() will
fail when trying to create a workqueue with the same name. The pty is an
example of this; if both CONFIG_LEGACY_PTYS and CONFIG_UNIX98_PTYS are
enabled, the call to tty_register_driver() in unix98_pty_init() will fail.
2. Different tty ports may be used for different tasks, which may require
separate core binding control via workqueues. In this case, the workqueue
created by default in the tty driver is unnecessary. Enabling this flag
prevents the creation of this redundant workqueue.
After applying this patch, we can set the related UART TTY flip buffer
workqueue by sysfs. We set the cpumask to CPU cores associated with the
IMU tasks, and set the nice to -20. Testing has shown significant
improvement in the previously described issue, with almost no stuttering
occurring anymore.
Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://patch.msgid.link/20251223034836.2625547-1-jackzxcui1989@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for HSIC USB mode, which can be set for second USB controller
and PHY on Tegra SoC along with already supported UTMI or ULPI.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Link: https://patch.msgid.link/20260122151125.7367-3-clamor95@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Refactor to directly use enum usb_phy_interface to determine the PHY mode.
This change is in preparation for adding support for HSIC mode.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Link: https://patch.msgid.link/20260122151125.7367-2-clamor95@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The port driver sets this bit for an alternate mode description to indicate
support for the mode selection feature. Once set, individual Alt Mode
drivers will no longer attempt to activate their respective modes within
their probe functions. This prevents race conditions and non-prioritized
activation.
The bit is not set by default. If left unset, the system retains the
current behavior where Alt Mode drivers manage their own activation logic.
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/20260119131824.2529334-6-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The mode selection process is controlled by the following API functions,
which allow to initiate and complete mode entry based on the priority of
each mode:
`typec_mode_selection_start` function compiles a priority list of supported
Alternate Modes.
`typec_altmode_state_update` function is invoked by the port driver to
communicate the current mode of the Type-C connector.
`typec_mode_selection_delete` function stops the currently running mode
selection process and releases all associated system resources.
`mode_selection_work_fn` task attempts to activate modes. The process stops
on success; otherwise, it proceeds to the next mode after a timeout or
error.
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/20260119131824.2529334-5-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch introduces a priority sysfs attribute to the USB Type-C
alternate mode port interface. This new attribute allows user-space to
configure the numeric priority of alternate modes managing their preferred
order of operation. If a new priority value conflicts with an existing
mode's priority, the priorities of the conflicting mode and all subsequent
modes are automatically incremented to ensure uniqueness.
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Benson Leung <bleung@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/20260119131824.2529334-4-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This new field in the port properties dictates whether the Platform Policy
Manager (PPM) allows the OS Policy Manager (OPM) to change the currently
active, negotiated alternate mode.
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://patch.msgid.link/20260119131824.2529334-2-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Introduce a linked list tracking all inode connectors for a superblock.
We will use this list when the superblock is getting shutdown to
properly clean up all the inode marks instead of relying on scanning all
inodes in the superblock which can get rather slow.
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Add aes-gcm crypto API's for AMD/Xilinx Versal device.
Signed-off-by: Harsh Jain <h.jain@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add API to get SoC version and family info.
Signed-off-by: Harsh Jain <h.jain@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
For better maintainability move crypto related API's to
new zynqmp-crypto.c file.
Signed-off-by: Harsh Jain <h.jain@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Cross-merge networking fixes after downstream PR (net-6.19-rc7).
Conflicts:
drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
b35a6fd37a00 ("hinic3: Add adaptive IRQ coalescing with DIM")
fb2bb2a1ebf7 ("hinic3: Fix netif_queue_set_napi queue_index input parameter error")
https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com
drivers/net/wireless/ath/ath12k/mac.c
drivers/net/wireless/ath/ath12k/wifi7/hw.c
31707572108d ("wifi: ath12k: Fix wrong P2P device link id issue")
c26f294fef2a ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module")
https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au
Adjacent changes:
drivers/net/wireless/ath/ath12k/mac.c
8b8d6ee53dfd ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel")
914c890d3b90 ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for 6.20:
Core Changes:
- buddy: Fix free_trees memory leak, prevent a BUG_ON
- dma-buf: Start to introduce cgroup memory accounting in heaps, Remove
sysfs stats, add new tracepoints
- hdmi: Limit infoframes exposure to userspace based on driver
capabilities
- property: Account for property blobs in memcg
Driver Changes:
- atmel-hlcdc: Switch to drmm resources, Support nomodeset parameter,
various patches to use newish helpers and fix memory safety bugs
- hisilicon: Fix various DisplayPort related bugs
- imagination: Introduce hardware version checks
- renesas: Fix kernel panic on reboot
- rockchip: Fix RK3576 HPD interrupt handling, Improve RK3588 HPD
interrupt handling
- v3d: Convert to drm logging helpers
- bridge:
- Continuation of the refcounting effort
- new bridge: Algoltek AG6311
- panel:
- new panel: Anbernic RG-DS
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260122-antique-sexy-junglefowl-1bc5a8@houat
|
|
The stmmac implementation used by NXP for the i.MX8MP SoC is subject to
errata ERR050694. According to this errata, when no preamble byte is
transferred before the SFD from the PHY to the MAC, the MAC will discard
the frame.
Setting the PHY_F_KEEP_PREAMBLE_BEFORE_SFD flag instructs PHYs that
support it to keep the preamble byte before the SFD. This ensures that
the MAC successfully receives frames.
As this is an issue in the MAC implementation, only enable the flag for
the i.MX8MP SoC where the errata applies but not for other SoCs using a
working stmmac implementation.
The exact wording of the errata ERR050694 from NXP:
The IEEE 802.3 standard states that, in MII/GMII modes, the byte
preceding the SFD (0xD5), SMD-S (0xE6,0x4C, 0x7F, or 0xB3), or SMD-C
(0x61, 0x52, 0x9E, or 0x2A) byte can be a non-PREAMBLE byte or there can
be no preceding preamble byte. The MAC receiver must successfully
receive a packet without any preamble(0x55) byte preceding the SFD,
SMD-S, or SMD-C byte.
However due to the defect, in configurations where frame preemption is
enabled, when preamble byte does not precede the SFD, SMD-S, or SMD-C
byte, the received packet is discarded by the MAC receiver. This is
because, the start-of-packet detection logic of the MAC receiver
incorrectly checks for a preamble byte.
NXP refers to IEEE 802.3 where in clause 35.2.3.2.2 Receive case (GMII)
they show two tables one where the preamble is preceding the SFD and one
where it is not. The text says:
The operation of 1000 Mb/s PHYs can result in shrinkage of the preamble
between transmission at the source GMII and reception at the destination
GMII. Table 35-3 depicts the case where no preamble bytes are conveyed
across the GMII. This case may not be possible with a specific PHY, but
illustrates the minimum preamble with which MAC shall be able to
operate. Table 35-4 depicts the case where the entire preamble is
conveyed across the GMII.
This workaround was tested on a Verdin iMX8MP by enforcing 10 MBit/s:
ethtool -s end0 speed 10
Without keeping the preamble, no packet were received. With keeping the
preamble, everything worked as expected.
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260120203905.23805-4-eichest@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|