| Age | Commit message (Collapse) | Author |
|
The ieee80211.h file has gotten very long, continue splitting
it by putting HE definitions into a separate file.
Link: https://patch.msgid.link/20251105153843.6998c0802104.I3dd7cfea6abbd118b999ecdedd48437d39cb0533@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The ieee80211.h file has gotten very long, continue splitting
it by putting VHT definitions into a separate file.
Link: https://patch.msgid.link/20251105153843.c31cb771a250.I787a13064db7d80440101de3445be17881daf1b6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The ieee80211.h file has gotten very long, continue splitting
it by putting HT definitions into a separate file.
Link: https://patch.msgid.link/20251105153843.7532471178d0.Id956a5433ad8658e4e5c0272dbcbb59587206142@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The ieee80211.h file has gotten very long, start splitting it
by putting mesh definitions into a separate file.
Link: https://patch.msgid.link/20251105153843.489713ca8b34.I3befb4bf6ace0315758a1794224ddd18c4652e32@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add a few more assert to detect active reference count underflows.
Link: https://patch.msgid.link/20251109-namespace-6-19-fixes-v1-6-ae8a4ad5a3b3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The setns() system call supports:
(1) namespace file descriptors (nsfd)
(2) process file descriptors (pidfd)
When using nsfds the namespaces will remain active because they are
pinned by the vfs. However, when pidfds are used things are more
complicated.
When the target task exits and passes through exit_nsproxy_namespaces()
or is reaped and thus also passes through exit_cred_namespaces() after
the setns()'ing task has called prepare_nsset() but before the active
reference count of the set of namespaces it wants to setns() to might
have been dropped already:
P1 P2
pid_p1 = clone(CLONE_NEWUSER | CLONE_NEWNET | CLONE_NEWNS)
pidfd = pidfd_open(pid_p1)
setns(pidfd, CLONE_NEWUSER | CLONE_NEWNET | CLONE_NEWNS)
prepare_nsset()
exit(0)
// ns->__ns_active_ref == 1
// parent_ns->__ns_active_ref == 1
-> exit_nsproxy_namespaces()
-> exit_cred_namespaces()
// ns_active_ref_put() will also put
// the reference on the owner of the
// namespace. If the only reason the
// owning namespace was alive was
// because it was a parent of @ns
// it's active reference count now goes
// to zero... --------------------------------
// |
// ns->__ns_active_ref == 0 |
// parent_ns->__ns_active_ref == 0 |
| commit_nsset()
-----------------> // If setns()
// now manages to install the namespaces
// it will call ns_active_ref_get()
// on them thus bumping the active reference
// count from zero again but without also
// taking the required reference on the owner.
// Thus we get:
//
// ns->__ns_active_ref == 1
// parent_ns->__ns_active_ref == 0
When later someone does ns_active_ref_put() on @ns it will underflow
parent_ns->__ns_active_ref leading to a splat from our asserts
thinking there are still active references when in fact the counter
just underflowed.
So resurrect the ownership chain if necessary as well. If the caller
succeeded to grab passive references to the set of namespaces the
setns() should simply succeed even if the target task exists or gets
reaped in the meantime and thus has dropped all active references to its
namespaces.
The race is rare and can only be triggered when using pidfs to setns()
to namespaces. Also note that active reference on initial namespaces are
nops.
Since we now always handle parent references directly we can drop
ns_ref_active_get_owner() when adding a namespace to a namespace tree.
This is now all handled uniformly in the places where the new namespaces
actually become active.
Link: https://patch.msgid.link/20251109-namespace-6-19-fixes-v1-5-ae8a4ad5a3b3@kernel.org
Fixes: 3c9820d5c64a ("ns: add active reference count")
Reported-by: syzbot+1957b26299cf3ff7890c@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
There's no need to bump the active reference counts of initial
namespaces as they're always active and can simply remain at 1.
Link: https://patch.msgid.link/20251109-namespace-6-19-fixes-v1-2-ae8a4ad5a3b3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
KHO allocates metadata for its preserved memory map using the slab
allocator via kzalloc(). This metadata is temporary and is used by the
next kernel during early boot to find preserved memory.
A problem arises when KFENCE is enabled. kzalloc() calls can be randomly
intercepted by kfence_alloc(), which services the allocation from a
dedicated KFENCE memory pool. This pool is allocated early in boot via
memblock.
When booting via KHO, the memblock allocator is restricted to a "scratch
area", forcing the KFENCE pool to be allocated within it. This creates a
conflict, as the scratch area is expected to be ephemeral and
overwriteable by a subsequent kexec. If KHO metadata is placed in this
KFENCE pool, it leads to memory corruption when the next kernel is loaded.
To fix this, modify KHO to allocate its metadata directly from the buddy
allocator instead of slab.
Link: https://lkml.kernel.org/r/20251021000852.2924827-4-pasha.tatashin@soleen.com
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: David Matlack <dmatlack@google.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Page cache folios from a file system that support large block size (LBS)
can have minimal folio order greater than 0, thus a high order folio might
not be able to be split down to order-0. Commit e220917fa507 ("mm: split
a folio in minimum folio order chunks") bumps the target order of
split_huge_page*() to the minimum allowed order when splitting a LBS
folio. This causes confusion for some split_huge_page*() callers like
memory failure handling code, since they expect after-split folios all
have order-0 when split succeeds but in reality get min_order_for_split()
order folios and give warnings.
Fix it by failing a split if the folio cannot be split to the target
order. Rename try_folio_split() to try_folio_split_to_order() to reflect
the added new_order parameter. Remove its unused list parameter.
[The test poisons LBS folios, which cannot be split to order-0 folios, and
also tries to poison all memory. The non split LBS folios take more
memory than the test anticipated, leading to OOM. The patch fixed the
kernel warning and the test needs some change to avoid OOM.]
Link: https://lkml.kernel.org/r/20251017013630.139907-1-ziy@nvidia.com
Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: syzbot+e6367ea2fdab6ed46056@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Strip trailing padding bytes from modules.builtin.modinfo to fix
error during modules_install with certain versions of kmod
- Drop unused static inline function warning in .c files with clang
from W=1 to W=2
- Ensure kernel-doc.py invocations use the PYTHON3 make variable to
ensure user's choice of Python interpreter is always respected
* tag 'kbuild-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Let kernel-doc.py use PYTHON3 override
compiler_types: Move unused static inline functions warning to W=2
kbuild: Strip trailing padding bytes from modules.builtin.modinfo
|
|
Correct and add to adis.h to resolve all kernel-doc warnings:
- add a missing struct member description
- change one non-kernel-doc comment to use /* instead of /**
- correct function parameter @value to @val (7 locations)
- add function return value comments (13 locations)
Warning: include/linux/iio/imu/adis.h:97 struct member 'has_fifo'
not described in 'adis_data'
Warning: include/linux/iio/imu/adis.h:139 Incorrect use of kernel-doc
format: * The state_lock is meant to be used during operations that
require
Warning: include/linux/iio/imu/adis.h:158 struct member '"__adis_"'
not described in 'adis'
Warning: include/linux/iio/imu/adis.h:264 function parameter 'val'
not described in 'adis_write_reg'
Warning: include/linux/iio/imu/adis.h:371 No description found for
return value of 'adis_update_bits_base'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Add mlx5_fs_set_root_dev() function which swaps the root namespace
core device with another one for a given table_type.
It is intended for usage only by RDMA_TRANSPORT tables in case of LAG
configuration, to allow the creation of tables during LAG always
through the LAG master device, which is valid since during LAG the
master is allowed to manage the RDMA_TRANSPORT tables of its slaves.
In addition move the table_type enum to global include to allow its use
in a downstream patch in the RDMA driver.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-3-98bb707b5d57@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Add other_eswitch support which allows flow tables creation above vports
that reside on different esw managers.
The new flag MLX5_FLOW_TABLE_OTHER_ESWITCH indicates if the
esw_owner_vhca_id attribute is supported.
Note that this is only supported if the Advanced-RDMA cap-
rdma_transport_manager_other_eswitch is set.
And it is the caller responsibility to check that.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-2-98bb707b5d57@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Add OTHER_ESWITCH capabilities which includes other_eswitch and
eswitch_owner_vhca_id to all steering objects.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-1-98bb707b5d57@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Expose pcie_tph_get_st_table_loc() to be used by drivers as will be done
in the next patch from the series.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20251027-st-direct-mode-v1-1-e0ad953866b6@nvidia.com
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Move the VCHIQ headers from drivers/staging/vc04_services/include to
include/linux/raspberrypi
This is done so that they can be shared between the VCHIQ interface
(which is going to be de-staged in a subsequent commit from staging) and
the VCHIQ drivers left in the staging/vc04_services (namely
bcm2835-audio, bcm2835-camera).
The include/linux/raspberrypi/ provides a central location to serve both of
these areas.
Co-developed-by: Umang Jain <umang.jain@ideasonboard.com>
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jai Luthra <jai.luthra@ideasonboard.com>
Link: https://patch.msgid.link/20251029-vchiq-destage-v3-4-da8d6c83c2c5@ideasonboard.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-11-06 (i40, ice, iavf)
Mohammad Heib introduces a new devlink parameter, max_mac_per_vf, for
controlling the maximum number of MAC address filters allowed by a VF. This
allows administrators to control the VF behavior in a more nuanced manner.
Aleksandr and Przemek add support for Receive Side Scaling of GTP to iAVF
for VFs running on E800 series ice hardware. This improves performance and
scalability for virtualized network functions in 5G and LTE deployments.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: add RSS support for GTP protocol via ethtool
ice: Extend PTYPE bitmap coverage for GTP encapsulated flows
ice: improve TCAM priority handling for RSS profiles
ice: implement GTP RSS context tracking and configuration
ice: add virtchnl definitions and static data for GTP RSS
ice: add flow parsing for GTP and new protocol field support
i40e: support generic devlink param "max_mac_per_vf"
devlink: Add new "max_mac_per_vf" generic device param
====================
Link: https://patch.msgid.link/20251106225321.1609605-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All three members are effectively of type bool, so make this explicit
and shrink size of struct fixed_phy_status.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/9eca3d7e-fa64-4724-8fdc-f2c1a8f2ae8f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for Open Alliance TC14 (OATC14) 10BASE-T1S PHYs cable
diagnostic feature.
This patch implements:
- genphy_c45_oatc14_cable_test_start() to initiate a cable test
- genphy_c45_oatc14_cable_test_get_status() to retrieve test results
- Helper function to map PHY cable test status to ethtool result codes
- Function declarations and exports for use by PHY drivers
This enables ethtool to report ok, open, short, and undetectable cable
conditions on OATC14 10Base-T1S PHYs.
Open Alliance TC14 10BASE-T1S Advanced Diagnostic PHY Features
Specification ref:
https://opensig.org/wp-content/uploads/2025/06/OPEN_Alliance_10BASE-T1S_Advanced_PHY_features_for-automotive_Ethernet_V2.1b.pdf
Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20251105051213.50443-2-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Per Nathan, clang catches unused "static inline" functions in C files
since commit 6863f5643dd7 ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
Linus said:
> So I entirely ignore W=1 issues, because I think so many of the extra
> warnings are bogus.
>
> But if this one in particular is causing more problems than most -
> some teams do seem to use W=1 as part of their test builds - it's fine
> to send me a patch that just moves bad warnings to W=2.
>
> And if anybody uses W=2 for their test builds, that's THEIR problem..
Here is the change to bump the warning from W=1 to W=2.
Fixes: 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build")
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251106105000.2103276-1-andriy.shevchenko@linux.intel.com
[nathan: Adjust comment as well]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
|
|
Introduce the function bdev_zone_start() as a more explicit (and clear)
replacement for ALIGN_DOWN() to get the start sector of a zone
containing a particular sector of a zoned block device.
Use this new helper in blkdev_get_zone_info() and
blkdev_report_zones_cached().
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
!CONFIG_GENERIC_ARCH_TOPOLOGY
The arm_pmu driver is using topology_core_has_smt() for retrieving
the SMT implementation which depends on CONFIG_GENERIC_ARCH_TOPOLOGY.
The config is optional on arm platforms so provide a
!CONFIG_GENERIC_ARCH_TOPOLOGY stub for topology_core_has_smt().
Fixes: c3d78c34ad00 ("perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511041757.vuCGOmFc-lkp@intel.com/
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Yicong Yang <yangyccccc@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There may be console drivers that have not yet figured out a way
to implement safe atomic printing (->write_atomic() callback).
These drivers could choose to only implement threaded printing
(->write_thread() callback), but then it is guaranteed that _no_
output will be printed during panic. Not even attempted.
As a result, developers may be tempted to implement unsafe
->write_atomic() callbacks and/or implement some sort of custom
deferred printing trickery to try to make it work. This goes
against the principle intention of the nbcon API as well as
endangers other nbcon drivers that are doing things correctly
(safely).
As a compromise, allow nbcon drivers to implement unsafe
->write_atomic() callbacks by providing a new console flag
CON_NBCON_ATOMIC_UNSAFE. When specified, the ->write_atomic()
callback for that console will _only_ be called during the
final "hope and pray" flush attempt at the end of a panic:
nbcon_atomic_flush_unsafe().
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/lkml/b2qps3uywhmjaym4mht2wpxul4yqtuuayeoq4iv4k3zf5wdgh3@tocu6c7mj4lt
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/all/swdpckuwwlv3uiessmtnf2jwlx3jusw6u7fpk5iggqo4t2vdws@7rpjso4gr7qp/ [1]
Link: https://lore.kernel.org/all/20251103-fix_netpoll_aa-v4-1-4cfecdf6da7c@debian.org/ [2]
Link: https://patch.msgid.link/20251027161212.334219-2-john.ogness@linutronix.de
[pmladek@suse.com: Fix build with rework/nbcon-in-kdb branch.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
This commit adds the SRCU_READ_FLAVOR_FAST_UPDOWN=0x8 macro
and adjusts rcutorture to make use of it. In this commit, both
SRCU_READ_FLAVOR_FAST=0x4 and the new SRCU_READ_FLAVOR_FAST_UPDOWN
test SRCU-fast. When the SRCU-fast-updown is added, the new
SRCU_READ_FLAVOR_FAST_UPDOWN macro will test it when passed to the
rcutorture.reader_flavor module parameter.
The old SRCU_READ_FLAVOR_FAST macro's value changed from 0x8 to 0x4.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
The upcoming Rust abstraction layer for the PWM subsystem uses a custom
`dev->release` handler to safely manage the lifetime of its driver
data.
To prevent leaking the memory of the `struct pwm_chip` (allocated by
`pwmchip_alloc`), this custom handler must also call the original
`pwmchip_release` function to complete the cleanup.
Make `pwmchip_release` a global, exported function so that it can be
called from the Rust FFI bridge. This involves removing the `static`
keyword, adding a prototype to the public header, and exporting the
symbol.
Reviewed-by: Elle Rhumsaa <elle@weathered-steel.dev>
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Link: https://patch.msgid.link/20251016-rust-next-pwm-working-fan-for-sending-v16-1-a5df2405d2bd@samsung.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
We want to expand usage of sheaves to all non-boot caches, including
kmalloc caches. Since sheaves themselves are also allocated by
kmalloc(), we need to prevent excessive or infinite recursion -
depending on sheaf size, the sheaf can be allocated from smaller, same
or larger kmalloc size bucket, there's no particular constraint.
This is similar to allocating the objext arrays so let's just reuse the
existing mechanisms for those. __GFP_NO_OBJ_EXT in alloc_empty_sheaf()
will prevent a nested kmalloc() from allocating a sheaf itself - it will
either have sheaves already, or fallback to a non-sheaf-cached
allocation (so bootstrap of sheaves in a kmalloc cache that allocates
sheaves from its own size bucket is possible). Additionally, reuse
OBJCGS_CLEAR_MASK to clear unwanted gfp flags from the nested
allocation.
Link: https://patch.msgid.link/20251105-sheaves-cleanups-v1-5-b8218e1ac7ef@suse.cz
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
The blk-mq dma iterator has an optimization for requests that align to
the device's iommu merge boundary. This boundary may be larger than the
device's virtual boundary, but the code had been depending on that queue
limit to know ahead of time if the request is guaranteed to align to
that optimization.
Rather than rely on that queue limit, which many devices may not report,
save the lowest set bit of any boundary gap between each segment in the
bio while checking the segments. The request stores the value for
merging and quickly checking per io if the request can use iova
optimizations.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No zone plugs are allocated when a zone is opened by calling Zone Append
on it. This makes the cached zone reporting report incorrectly empty
zones if the file system is unmounted and report zones is called after
that, e.g. by xfstests test cases using the scratch device.
Fix this by recording if zone append was used on a device, and disable
cached reporting for the device until a ZONE_RESET_ALL happens that
guarantees all zones are empty.
We could probably do even better using a per-zone flag, but the practical
use cache for zone reporting after the initial mount are rather limited,
so let's keep things simple for now.
Fixes: 31f0656a4ab7 ("block: introduce blkdev_report_zones_cached()")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
cgroup_task_dead() is called from finish_task_switch() which runs with
preemption disabled and doesn't allow scheduling even on PREEMPT_RT. The
function needs to acquire css_set_lock which is a regular spinlock that can
sleep on RT kernels, leading to "sleeping function called from invalid
context" warnings.
css_set_lock is too large in scope to convert to a raw_spinlock. However,
the unlinking operations don't need to run synchronously - they just need
to complete after the task is done running.
On PREEMPT_RT, defer the work through irq_work. While the work doesn't need
to happen immediately, it can't be delayed indefinitely either as the dead
task pins the cgroup and task_struct can be pinned indefinitely. Use the
lazy version of irq_work to allow batching and lower impact while ensuring
timely completion.
v2: Use IRQ_WORK_INIT_LAZY instead of immediate irq_work and add explanation
for why the work can't be delayed indefinitely (Sebastian Andrzej Siewior).
Fixes: d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
Reported-by: Calvin Owens <calvin@wbinvd.org>
Link: https://lore.kernel.org/r/20251104181114.489391-1-calvin@wbinvd.org
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add virtchnl protocol header and field definitions for advanced RSS
configuration including GTPC, GTPU, L2TPv2, ECPRI, PPP, GRE, and IP
fragment headers.
- Define new virtchnl protocol header types
- Add RSS field selectors for tunnel protocols
- Extend static mapping arrays for protocol field matching
- Add L2TPv2 session ID and length+session ID field support
This provides the foundational definitions needed for VF RSS
configuration of tunnel protocols.
Co-developed-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Co-developed-by: Jie Wang <jie1x.wang@intel.com>
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Co-developed-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Co-developed-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Co-developed-by: Ting Xu <ting.xu@intel.com>
Signed-off-by: Ting Xu <ting.xu@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Some of the new field added to socinfo structure with version 21, 22
and 23 which is only used by boot firmware and it is of no use for
Linux.Add reserve field in socinfo so that the structure remain
updated and prepared if we get any new field in future which could
be used by Linux. While at it, also updates switch case for backward
compatibility if the SoC runs with boot firmware which has these
new version added.
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251104130906.167666-2-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Add support for socinfo version 20. Version 20 adds a new field
package id and its zeroth bit contain information that can be
can be used to tune temperature thresholds on devices which might
be able to withstand higher temperatures. Zeroth bit value 1 means
that its heat dissipation is better and more relaxed thermal
scheme can be put in place and 0 means a more aggressive scheme
may be needed.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251104130906.167666-1-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
"This is a work-around for a (now fixed) corner case in the arm32 build
with Clang KCFI enabled.
- Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)"
* tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk()
ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS
compiler_types: Introduce __nocfi_generic
|
|
Cross-merge networking fixes after downstream PR (net-6.18-rc5).
Conflicts:
drivers/net/wireless/ath/ath12k/mac.c
9222582ec524 ("Revert "wifi: ath12k: Fix missing station power save configuration"")
6917e268c433 ("wifi: ath12k: Defer vdev bring-up until CSA finalize to avoid stale beacon")
https://lore.kernel.org/11cece9f7e36c12efd732baa5718239b1bf8c950.camel@sipsolutions.net
Adjacent changes:
drivers/net/ethernet/intel/Kconfig
b1d16f7c0063 ("libie: depend on DEBUG_FS when building LIBIE_FWLOG")
93f53db9f9dc ("ice: switch to Page Pool")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
Including fixes from bluetooth and wireless.
Current release - new code bugs:
- ptp: expose raw cycles only for clocks with free-running counter
- bonding: fix null-deref in actor_port_prio setting
- mdio: ERR_PTR-check regmap pointer returned by
device_node_to_regmap()
- eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG
Previous releases - regressions:
- virtio_net: fix perf regression due to bad alignment of
virtio_net_hdr_v1_hash
- Revert "wifi: ath10k: avoid unnecessary wait for service ready
message" caused regressions for QCA988x and QCA9984
- Revert "wifi: ath12k: Fix missing station power save configuration"
caused regressions for WCN7850
- eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
corruptions after kexec
Previous releases - always broken:
- virtio-net: fix received packet length check for big packets
- sctp: fix races in socket diag handling
- wifi: add an hrtimer-based delayed work item to avoid low
granularity of timers set relatively far in the future, and use it
where it matters (e.g. when performing AP-scheduled channel switch)
- eth: mlx5e:
- correctly propagate error in case of module EEPROM read failure
- fix HW-GRO on systems with PAGE_SIZE == 64kB
- dsa: b53: fixes for tagging, link configuration / RMII, FDB,
multicast
- phy: lan8842: implement latest errata"
* tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
selftests/vsock: avoid false-positives when checking dmesg
net: bridge: fix MST static key usage
net: bridge: fix use-after-free due to MST port state bypass
lan966x: Fix sleeping in atomic context
bonding: fix NULL pointer dereference in actor_port_prio setting
net: dsa: microchip: Fix reserved multicast address table programming
net: wan: framer: pef2256: Switch to devm_mfd_add_devices()
net: libwx: fix device bus LAN ID
net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages
net/mlx5e: SHAMPO, Fix skb size check for 64K pages
net/mlx5e: SHAMPO, Fix header mapping for 64K pages
net: ti: icssg-prueth: Fix fdb hash size configuration
net/mlx5e: Fix return value in case of module EEPROM read error
net: gro_cells: Reduce lock scope in gro_cell_poll
libie: depend on DEBUG_FS when building LIBIE_FWLOG
wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
netpoll: Fix deadlock in memory allocation under spinlock
net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
virtio-net: fix received length check in big packets
bnxt_en: Fix warning in bnxt_dl_reload_down()
...
|
|
The wl1273 FM radio is on Arnd's unused driver list:
https://lore.kernel.org/lkml/a15bb180-401d-49ad-a212-0c81d613fbc8@app.fastmail.com/
Other patches have removed the core, the ASoC code and the Radio code.
With all those in, remove the header.
Also, tidy the ref in the docs.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Mark the write buffer arguments in apple_smc_write(), apple_smc_rw(),
and apple_smc_write_atomic() as const. These functions do not modify
the data provided by the caller, so the parameters should be const
qualified.
Signed-off-by: Atharva Tiwari <atharvatiwarilinuxdev@gmail.com>
Reviewed-by: Sven Peter <sven@kernel.org>
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
into ibs-for-mfd-merged
|
|
When using the _SMC_KEY macro in switch/case statements, GCC 15.2.1 errors
out with 'case label does not reduce to an integer constant'. Introduce
a new __SMC_KEY macro that can be used instead.
Signed-off-by: James Calligeros <jcalligeros99@gmail.com>
Link: https://patch.msgid.link/20251025-macsmc-subdevs-v4-5-374d5c9eba0e@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Merge series from AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>:
This series adds support for three new MediaTek PMICs: MT6316, MT6363
and MT6373 and their variants - used in board designs featuring the
MediaTek MT8196 Chromebook SoC, or the MT6991 Dimensity 9400 Smartphone
SoC.
In particular, MT6316 is a regulator, but the MT6363 and MT6373 PMICs
are multi-function devices, as they have and expose multiple sub-devices;
moreover, some of those also contain an interrupt controller, managing
internal IPs interrupts: for those, a chained interrupt handler is
registered, which parent is the SPMI controller itself.
This series adds support for all of the MT6316 regulator variants and
for MT6363, MT6373 SPMI PMICs and their interrupt controller.
|
|
Merge series from Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>:
This patchset has 4 fixes and some enhancements to the Elite DSP driver
support.
Fixes includes
- setting correct flags for expected behaviour of appl_ptr
- fix closing of copp instances
- fix buffer alignment.
- fix state checks before closing asm stream
Enhancements include:
- adding q6asm_get_hw_pointer and ack callback support
- simplify code via __free(kfree) mechanism.
- use spinlock guards
- few cleanups discovered during doing above 2.
There is another set of updates comming soon, which will add support
for early memory mapping and few more modules support in audioreach.
|
|
Add support for a new instruction
BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0
which does an indirect jump to a location stored in Rx. The register
Rx should have type PTR_TO_INSN. This new type assures that the Rx
register contains a value (or a range of values) loaded from a
correct jump table – map of type instruction array.
For example, for a C switch LLVM will generate the following code:
0: r3 = r1 # "switch (r3)"
1: if r3 > 0x13 goto +0x666 # check r3 boundaries
2: r3 <<= 0x3 # adjust to an index in array of addresses
3: r1 = 0xbeef ll # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
5: r1 += r3 # r1 inherits boundaries from r3
6: r1 = *(u64 *)(r1 + 0x0) # r1 now has type INSN_TO_PTR
7: gotox r1 # jit will generate proper code
Here the gotox instruction corresponds to one particular map. This is
possible however to have a gotox instruction which can be loaded from
different maps, e.g.
0: r1 &= 0x1
1: r2 <<= 0x3
2: r3 = 0x0 ll # load from map M_1
4: r3 += r2
5: if r1 == 0x0 goto +0x4
6: r1 <<= 0x3
7: r3 = 0x0 ll # load from map M_2
9: r3 += r1
A: r1 = *(u64 *)(r3 + 0x0)
B: gotox r1 # jump to target loaded from M_1 or M_2
During check_cfg stage the verifier will collect all the maps which
point to inside the subprog being verified. When building the config,
the high 16 bytes of the insn_state are used, so this patch
(theoretically) supports jump tables of up to 2^16 slots.
During the later stage, in check_indirect_jump, it is checked that
the register Rx was loaded from a particular instruction array.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251105090410.1250500-9-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
LIBIE_FWLOG is unusable without DEBUG_FS. Mark it in Kconfig.
Fix build error on ixgbe when DEBUG_FS is not set. To not add another
layer of #if IS_ENABLED(LIBIE_FWLOG) in ixgbe fwlog code define debugfs
dentry even when DEBUG_FS isn't enabled. In this case the dummy
functions of LIBIE_FWLOG will be used, so not initialized dentry isn't a
problem.
Fixes: 641585bc978e ("ixgbe: fwlog support for e610")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/lkml/f594c621-f9e1-49f2-af31-23fbcb176058@roeck-us.net/
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251104172333.752445-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On bpf(BPF_PROG_LOAD) syscall user-supplied BPF programs are
translated by the verifier into "xlated" BPF programs. During this
process the original instructions offsets might be adjusted and/or
individual instructions might be replaced by new sets of instructions,
or deleted.
Add a new BPF map type which is aimed to keep track of how, for a
given program, the original instructions were relocated during the
verification. Also, besides keeping track of the original -> xlated
mapping, make x86 JIT to build the xlated -> jitted mapping for every
instruction listed in an instruction array. This is required for every
future application of instruction arrays: static keys, indirect jumps
and indirect calls.
A map of the BPF_MAP_TYPE_INSN_ARRAY type must be created with a u32
keys and value of size 8. The values have different semantics for
userspace and for BPF space. For userspace a value consists of two
u32 values – xlated and jitted offsets. For BPF side the value is
a real pointer to a jitted instruction.
On map creation/initialization, before loading the program, each
element of the map should be initialized to point to an instruction
offset within the program. Before the program load such maps should
be made frozen. After the program verification xlated and jitted
offsets can be read via the bpf(2) syscall.
If a tracked instruction is removed by the verifier, then the xlated
offset is set to (u32)-1 which is considered to be too big for a valid
BPF program offset.
One such a map can, obviously, be used to track one and only one BPF
program. If the verification process was unsuccessful, then the same
map can be re-used to verify the program with a different log level.
However, if the program was loaded fine, then such a map, being
frozen in any case, can't be reused by other programs even after the
program release.
Example. Consider the following original and xlated programs:
Original prog: Xlated prog:
0: r1 = 0x0 0: r1 = 0
1: *(u32 *)(r10 - 0x4) = r1 1: *(u32 *)(r10 -4) = r1
2: r2 = r10 2: r2 = r10
3: r2 += -0x4 3: r2 += -4
4: r1 = 0x0 ll 4: r1 = map[id:88]
6: call 0x1 6: r1 += 272
7: r0 = *(u32 *)(r2 +0)
8: if r0 >= 0x1 goto pc+3
9: r0 <<= 3
10: r0 += r1
11: goto pc+1
12: r0 = 0
7: r6 = r0 13: r6 = r0
8: if r6 == 0x0 goto +0x2 14: if r6 == 0x0 goto pc+4
9: call 0x76 15: r0 = 0xffffffff8d2079c0
17: r0 = *(u64 *)(r0 +0)
10: *(u64 *)(r6 + 0x0) = r0 18: *(u64 *)(r6 +0) = r0
11: r0 = 0x0 19: r0 = 0x0
12: exit 20: exit
An instruction array map, containing, e.g., instructions [0,4,7,12]
will be translated by the verifier to [0,4,13,20]. A map with
index 5 (the middle of 16-byte instruction) or indexes greater than 12
(outside the program boundaries) would be rejected.
The functionality provided by this patch will be extended in consequent
patches to implement BPF Static Keys, indirect jumps, and indirect calls.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251105090410.1250500-2-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently we don't get stack trace via ORC unwinder on top of fgraph exit
handler. We can see that when generating stacktrace from kretprobe_multi
bpf program which is based on fprobe/fgraph.
The reason is that the ORC unwind code won't get pass the return_to_handler
callback installed by fgraph return probe machinery.
Solving this by creating stack frame in return_to_handler expected by
ftrace_graph_ret_addr function to recover original return address and
continue with the unwind.
Also updating the pt_regs data with cs/flags/rsp which are needed for
successful stack retrieval from ebpf bpf_get_stackid helper.
- in get_perf_callchain we check user_mode(regs) so CS has to be set
- in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED
has to be unset
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
In rculist_nulls.h we can still see ordinary assignments to ->pprev and
->next of hlist_nulls.
As noted in the two patches below:
commit efd04f8a8b45 ("rcu: Use WRITE_ONCE() for assignments to ->next for
rculist_nulls")
commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev for
hlist_nulls")
We should use WRITE_ONCE().
Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit makes CONFIG_PROVE_RCU=y kernels enforce the new rule
that srcu_struct structures that are passed to srcu_read_lock_fast()
and other SRCU-fast read-side markers be either initialized with
init_srcu_struct_fast() on the one hand or defined with DEFINE_SRCU_FAST()
or DEFINE_STATIC_SRCU_FAST() on the other.
This eliminates the read-side test that was formerly included in
srcu_read_lock_fast() and friends, speeding these primitives up by
about 25% (admittedly only about half of a nanosecond, but when tracing
on fastpaths...)
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit adds CONFIG_PROVE_RCU=y checking to enforce the new rule that
srcu_struct structures passed to srcu_read_lock_fast() and other SRCU-fast
read-side markers be either initialized with init_srcu_struct_fast()
on the one hand or defined using either DEFINE_SRCU_FAST() or
DEFINE_STATIC_SRCU_FAST(). This will enable removal of the non-debug
read-side checks from srcu_read_lock_fast() and friends, which on my
laptop provides a 25% speedup (which admittedly amounts to about half
a nanosecond, but when tracing fastpaths...)
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit creates DEFINE_SRCU_FAST() and DEFINE_STATIC_SRCU_FAST()
macros that are similar to DEFINE_SRCU() and DEFINE_STATIC_SRCU(),
but which create srcu_struct structures that are usable only by readers
initiated by srcu_read_lock_fast() and friends.
This commit does make DEFINE_SRCU_FAST() available to modules, in which
case the per-CPU srcu_data structures are not created at compile time, but
rather at module-load time. This means that the >srcu_reader_flavor field
of the srcu_data structure is not available. Therefore,
this commit instead creates an ->srcu_reader_flavor field in the
srcu_struct structure, adds arguments to the DEFINE_SRCU()-related
macros to initialize this new field, and extends the checks in the
__srcu_check_read_flavor() function to include this new field.
This commit also allows dynamically allocated srcu_struct structure
to be marked for SRCU-fast readers. It does so by defining a new
init_srcu_struct_fast() function that marks the specified srcu_struct
structure for use by srcu_read_lock_fast() and friends.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit creates an srcu_expedite_current() function that expedites
the current (and possibly the next) SRCU grace period for the specified
srcu_struct structure. This functionality will be inherited by RCU
Tasks Trace courtesy of its mapping to SRCU fast.
If the current SRCU grace period is already waiting, that wait will
complete before the expediting takes effect. If there is no SRCU grace
period in flight, this function might well create one.
[ paulmck: Apply Zqiang feedback for PREEMPT_RT use. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|