path: root/include
Age | Commit message | Author
2025-11-12 | Expose definition for 1600Gbps link mode | Leon Romanovsky
Single patch to expose new link mode for 1600Gbps, utilizing 8 lanes at 200Gbps per lane. Signed-off-by: Leon Romanovsky <leon@kernel.org> * mlx5-next: net/mlx5: Expose definition for 1600Gbps link mode
2025-11-12 | vfs: expose delegation support to userland | Jeff Layton
Now that support for recallable directory delegations is available, expose this functionality to userland with new F_SETDELEG and F_GETDELEG commands for fcntl(). Note that this also allows userland to request a FL_DELEG type lease on files too. Userland applications that do will get signalled when there are metadata changes in addition to just data changes (which is a limitation of FL_LEASE leases). These commands accept a new "struct delegation" argument that contains a flags field for future expansion. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-17-52f3feebb2f2@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: make vfs_symlink break delegations on parent dir | Jeff Layton
In order to add directory delegation support, we must break delegations on the parent on any change to the directory. Add a delegated_inode parameter to vfs_symlink() and have it break the delegation. do_symlinkat() can then wait on the delegation break before proceeding. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-12-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: make vfs_mknod break delegations on parent directory | Jeff Layton
In order to add directory delegation support, we need to break delegations on the parent whenever there is going to be a change in the directory. Add a new delegated_inode pointer to vfs_mknod() and have the appropriate callers wait when there is an outstanding delegation. All other callers just set the pointer to NULL. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-11-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: make vfs_create break delegations on parent directory | Jeff Layton
In order to add directory delegation support, we need to break delegations on the parent whenever there is going to be a change in the directory. Add a delegated_inode parameter to vfs_create. Most callers are converted to pass in NULL, but do_mknodat() is changed to wait for a delegation break if there is one. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-10-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: clean up argument list for vfs_create() | Jeff Layton
As Neil points out: "I would be in favour of dropping the "dir" arg because it is always d_inode(dentry->d_parent) which is stable." ...and... "Also *every* caller of vfs_create() passes ".excl = true". So maybe we don't need that arg at all." Drop both arguments from vfs_create() and fix up the callers. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-9-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: allow rmdir to wait for delegation break on parent | Jeff Layton
In order to add directory delegation support, we need to break delegations on the parent whenever there is going to be a change in the directory. Add a delegated_inode struct to vfs_rmdir() and populate that pointer with the parent inode if it's non-NULL. Most existing in-kernel callers pass in a NULL pointer. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-7-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | vfs: allow mkdir to wait for delegation break on parent | Jeff Layton
In order to add directory delegation support, we need to break delegations on the parent whenever there is going to be a change in the directory. Add a new delegated_inode parameter to vfs_mkdir. All of the existing callers set that to NULL for now, except for do_mkdirat which will properly block until the lease is gone. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-6-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | filelock: add struct delegated_inode | Jeff Layton
The current API requires a pointer to an inode pointer. It's easy for callers to get this wrong. Add a new delegated_inode structure and use that to pass back any inode that needs to be waited on. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-3-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-12 | filelock: rework the __break_lease API to use flags | Jeff Layton
Currently __break_lease takes both a type and an openmode. With the addition of directory leases, that makes less sense. Declare a set of LEASE_BREAK_* flags that can be used to control how lease breaks work instead of requiring a type and an openmode. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251111-dir-deleg-ro-v6-2-52f3feebb2f2@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
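The flags-based shape described in this commit can be modelled loosely in Python. This is only a sketch: the commit message says a set of LEASE_BREAK_* flags is declared but does not list them, so every flag name and meaning below is a hypothetical stand-in, not the kernel's actual definition.

```python
from enum import IntFlag


class LeaseBreak(IntFlag):
    """Hypothetical stand-ins for the kernel's LEASE_BREAK_* flags."""
    LEASE = 1 << 0        # assumed: break ordinary leases
    DELEG = 1 << 1        # assumed: break delegations
    NONBLOCK = 1 << 2     # assumed: do not wait for the break to finish
    OPEN_RDONLY = 1 << 3  # assumed: the conflicting open is read-only


def describe_break(flags: LeaseBreak) -> str:
    """Summarise what a single flags word asks for."""
    kinds = [f.name for f in (LeaseBreak.LEASE, LeaseBreak.DELEG) if f in flags]
    wait = "non-blocking" if LeaseBreak.NONBLOCK in flags else "blocking"
    access = "read-only" if LeaseBreak.OPEN_RDONLY in flags else "read-write"
    return f"break {'+'.join(kinds) or 'nothing'} ({wait}, {access} open)"
```

A single flags word replaces the separate type and openmode arguments, so a new kind of lease break would only need a new bit rather than a new parameter.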
2025-11-12 | net/mlx5: Expose definition for 1600Gbps link mode | Tariq Toukan
This patch exposes new link mode for 1600Gbps, utilizing 8 lanes at 200Gbps per lane. Co-developed-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1762863888-1092798-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-11-11 | Merge tag 'for-net-2025-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth | Jakub Kicinski
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
 - hci_conn: Fix not cleaning up PA_LINK connections
 - hci_event: Fix not handling PA Sync Lost event
 - MGMT: cancel mesh send timer when hdev removed
 - 6lowpan: reset link-local header on ipv6 recv path
 - 6lowpan: fix BDADDR_LE vs ADDR_LE_DEV address type confusion
 - L2CAP: export l2cap_chan_hold for modules
 - 6lowpan: Don't hold spin lock over sleeping functions
 - 6lowpan: add missing l2cap_chan_lock()
 - btusb: reorder cleanup in btusb_disconnect to avoid UAF
 - btrtl: Avoid loading the config file on security chips
* tag 'for-net-2025-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: btrtl: Avoid loading the config file on security chips
  Bluetooth: hci_event: Fix not handling PA Sync Lost event
  Bluetooth: hci_conn: Fix not cleaning up PA_LINK connections
  Bluetooth: 6lowpan: add missing l2cap_chan_lock()
  Bluetooth: 6lowpan: Don't hold spin lock over sleeping functions
  Bluetooth: L2CAP: export l2cap_chan_hold for modules
  Bluetooth: 6lowpan: fix BDADDR_LE vs ADDR_LE_DEV address type confusion
  Bluetooth: 6lowpan: reset link-local header on ipv6 recv path
  Bluetooth: btusb: reorder cleanup in btusb_disconnect to avoid UAF
  Bluetooth: MGMT: cancel mesh send timer when hdev removed
====================
Link: https://patch.msgid.link/20251111141357.1983153-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 | ethtool: fix incorrect kernel-doc style comment in ethtool.h | Kriish Sharma
Building documentation produced the following warning:
  WARNING: ./include/linux/ethtool.h:495 This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
   * IEEE 802.3ck/df defines 16 bins for FEC histogram plus one more for
This comment was not intended to be parsed as kernel-doc, so replace the '/**' with '/*' to silence the warning and align with normal comment style in header files. No functional changes.
Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com>
Link: https://patch.msgid.link/20251110182545.2112596-1-kriish.sharma2006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 | coresight: Change device mode to atomic type | Leo Yan
The device mode is defined as local type. This type cannot promise SMP-safe access. Change to atomic type and impose relax ordering, which ensures the SMP-safe synchronisation and the ordering between the mode setting and relevant operations. Fixes: 22fd532eaa0c ("coresight: etm3x: adding operation mode for etm_enable()") Reviewed-by: Mike Leach <mike.leach@linaro.org> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20251111-arm_coresight_power_management_fix-v6-1-f55553b6c8b3@arm.com
2025-11-11 | lib/crypto: x86/polyval: Migrate optimized code into library | Eric Biggers
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it up to the POLYVAL library interface. This makes the POLYVAL library be properly optimized on x86_64. This drops the x86_64 optimizations of polyval in the crypto_shash API. That's fine, since polyval will be removed from crypto_shash entirely since it is unneeded there. But even if it comes back, the crypto_shash API could just be implemented on top of the library API, as usual. Adjust the names and prototypes of the assembly functions to align more closely with the rest of the library code. Also replace a movaps instruction with movups to remove the assumption that the key struct is 16-byte aligned. Users can still align the key if they want (and at least in this case, movups is just as fast as movaps), but it's inconvenient to require it. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 | lib/crypto: arm64/polyval: Migrate optimized code into library | Eric Biggers
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it up to the POLYVAL library interface. This makes the POLYVAL library be properly optimized on arm64. This drops the arm64 optimizations of polyval in the crypto_shash API. That's fine, since polyval will be removed from crypto_shash entirely since it is unneeded there. But even if it comes back, the crypto_shash API could just be implemented on top of the library API, as usual. Adjust the names and prototypes of the assembly functions to align more closely with the rest of the library code. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 | lib/crypto: polyval: Add POLYVAL library | Eric Biggers
Add support for POLYVAL to lib/crypto/. This will replace the polyval crypto_shash algorithm and its use in the hctr2 template, simplifying the code and reducing overhead. Specifically, this commit introduces the POLYVAL library API and a generic implementation of it. Later commits will migrate the existing architecture-optimized implementations of POLYVAL into lib/crypto/ and add a KUnit test suite. I've also rewritten the generic implementation completely, using a more modern approach instead of the traditional table-based approach. It's now constant-time, requires no precomputation or dynamic memory allocations, decreases the per-key memory usage from 4096 bytes to 16 bytes, and is faster than the old polyval-generic even on bulk data reusing the same key (at least on x86_64, where I measured 15% faster). We should do this for GHASH too, but for now just do it for POLYVAL. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
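For readers unfamiliar with the function itself, here is a minimal bit-serial sketch of POLYVAL as defined in RFC 8452. It is not the kernel's generic implementation and, unlike what the commit describes, it is not constant-time; it only illustrates the arithmetic. The function names `clmul`, `dot` and `polyval` are chosen for this example.

```python
# Field polynomial from RFC 8452: x^128 + x^127 + x^126 + x^121 + 1.
POLY = (1 << 128) | (1 << 127) | (1 << 126) | (1 << 121) | 1


def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial) multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def dot(a: int, b: int) -> int:
    """Return a * b * x^-128 in the POLYVAL field."""
    product = clmul(a, b)
    # Divide by x 128 times: clear the constant term by adding POLY
    # (whose constant term is 1), then shift right by one bit.
    for _ in range(128):
        if product & 1:
            product ^= POLY
        product >>= 1
    return product


def polyval(key: bytes, data: bytes) -> bytes:
    """POLYVAL over whole 16-byte blocks; elements are little-endian."""
    if len(key) != 16 or len(data) % 16:
        raise ValueError("key must be 16 bytes and data a multiple of 16 bytes")
    h = int.from_bytes(key, "little")
    acc = 0
    for i in range(0, len(data), 16):
        block = int.from_bytes(data[i:i + 16], "little")
        acc = dot(acc ^ block, h)
    return acc.to_bytes(16, "little")
```

The only per-key state in this sketch is the 16-byte key, which matches the per-key footprint the commit reports for the new generic code. The data-dependent branches here are exactly what a constant-time implementation has to avoid.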
2025-11-11 | efi/runtime-wrappers: Keep track of the efi_runtime_lock owner | Ard Biesheuvel
The EFI runtime wrappers use a file local semaphore to serialize access to the EFI runtime services. This means that any calls to the arch wrappers around the runtime services will also be serialized, removing the need for redundant locking. For robustness, add a facility that allows those arch wrappers to assert that the semaphore was taken by the current task. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-11 | efi/memattr: Convert efi_memattr_init() return type to void | Breno Leitao
The efi_memattr_init() function's return values (0 and -ENOMEM) are never checked by callers. Convert the function to return void since the return status is unused. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-11 | reset: mpfs: add non-auxiliary bus probing | Conor Dooley
While the auxiliary bus was a nice bandaid, and meant that re-writing the representation of the clock regions in devicetree was not required, it has run its course. The "mss_top_sysreg" region that contains the clock and reset regions also contains pinctrl and an interrupt controller, so the time has come to rewrite the devicetree and probe the reset controller from an mfd devicetree node, rather than implement those drivers using the auxiliary bus. Wanting to avoid propagating this naive/incorrect description of the hardware to the new pic64gx SoC is a major motivating factor here.
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-11 | dt-bindings: clock: document 8ULP's SIM LPAV | Laurentiu Mihalcea
Add documentation for i.MX8ULP's SIM LPAV module. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Laurentiu Mihalcea <laurentiu.mihalcea@nxp.com> Link: https://lore.kernel.org/r/20251104120301.913-3-laurentiumihalcea111@gmail.com Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
2025-11-11 | net: export netdev_get_by_index_lock() | David Wei
Need to call netdev_get_by_index_lock() from io_uring/zcrx.c, but it is currently private to net. Export the function in linux/netdevice.h. Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-11 | mlx5: Fix default values in create CQ | Akiva Goldberger
Currently, CQs without a completion function are assigned the mlx5_add_cq_to_tasklet function by default. This is problematic since only user CQs created through the mlx5_ib driver are intended to use this function. Additionally, all CQs that will use doorbells instead of polling for completions must call mlx5_cq_arm. However, the default CQ creation flow leaves a valid value in the CQ's arm_db field, allowing FW to send interrupts to polling-only CQs in certain corner cases. These two factors would allow a polling-only kernel CQ to be triggered by an EQ interrupt and call a completion function intended only for user CQs, causing a null pointer exception. Some areas in the driver have prevented this issue with one-off fixes but did not address the root cause. This patch fixes the described issue by adding defaults to the create CQ flow. It adds a default dummy completion function to protect against null pointer exceptions, and it sets an invalid command sequence number by default in kernel CQs to prevent the FW from sending an interrupt to the CQ until it is armed. User CQs are responsible for their own initialization values. Callers of mlx5_core_create_cq are responsible for changing the completion function and arming the CQ per their needs. Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://patch.msgid.link/1762681743-1084694-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 | Bluetooth: hci_event: Fix not handling PA Sync Lost event | Luiz Augusto von Dentz
Handle the PA Sync Lost event, which was previously assumed to be covered by BIG Sync Lost. Their lifetimes are not the same, which is why there are two different events to report when each sync is lost.
Fixes: b2a5f2e1c127 ("Bluetooth: hci_event: Add support for handling LE BIG Sync Lost event")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-11-11 | ASoC: cs35l56: Allow restoring factory calibration through ALSA control | Richard Fitzgerald
Add an ALSA control (CAL_DATA) that can be used to restore amp calibration, instead of using debugfs. A readback control (CAL_DATA_RB) is also added for factory testing. On ChromeOS the process that restores amp calibration from NVRAM has limited permissions and cannot access debugfs. It requires an ALSA control that it can write the calibration blob into. ChromeOS also restricts access to ALSA controls, which avoids the risk of accidental or malicious overwriting of good calibration data with bad data. As this control is not needed for normal Linux-based distros it is a Kconfig option. A separate control, CAL_DATA_RB, provides a readback of the current calibration data, which could be either from a write to CAL_DATA or the result of factory production-line calibration. The write and read are intentionally separate controls to defeat "dumb" save-and-restore tools like alsa-restore that assume it is safe to save all control values and write them back in any order at some undefined future time. Such behavior carries the risk of restoring stale or bad data over the top of good data. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20251111130850.513969-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-11 | ASoC: cs35l56: Add control to read CAL_SET_STATUS | Richard Fitzgerald
Create an ALSA control to read the value of the firmware CAL_SET_STATUS control. This reports whether the firmware is using a calibration blob or the default calibration from the .bin file. The firmware only reports a valid value in this register while audio is actually playing and the internal PLL is locked to the audio clock. Otherwise it returns a status of "unknown". Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20251111130850.513969-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-11 | rv: Make rv_reacting_on() static | Thomas Weißschuh
There are no external users left. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20251014-rv-lockdep-v1-2-0b9e51919ea8@linutronix.de Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2025-11-11 | rv: Pass va_list to reactors | Thomas Weißschuh
The only thing the reactors can do with the passed in varargs is to convert it into a va_list. Do that in a central helper instead. It simplifies the reactors, removes some hairy macro-generated code and introduces a convenient hook point to modify reactor behavior. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20251014-rv-lockdep-v1-1-0b9e51919ea8@linutronix.de Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2025-11-11 | net/mlx5: E-Switch, support eswitch inactive mode | Saeed Mahameed
Add support for eswitch switchdev inactive mode.
Inactive mode: Drop all traffic going to FDB, remove mpfs l2 rules and disconnect adjacent vports.
Active mode: Traffic flows through FDB, mpfs table populated, and adjacent vports are connected.
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20251108070404.1551708-4-saeed@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 | devlink: Introduce switchdev_inactive eswitch mode | Saeed Mahameed
Adds DEVLINK_ESWITCH_MODE_SWITCHDEV_INACTIVE attribute to UAPI and documentation.
Before having traffic flow through an eswitch, a user may want to have the ability to block traffic towards the FDB until FDB is fully programmed and the user is ready to send traffic to it. For example: when two eswitches are present for vports in a multi-PF setup, one eswitch may take over the traffic from the other when the user chooses. Before this takeover, a user may want to first program the inactive eswitch and then, once ready, redirect traffic to this new eswitch.
switchdev modes transition semantics:
  legacy->switchdev_inactive: Create switchdev mode normally, traffic not allowed to flow yet.
  switchdev_inactive->switchdev: Enable traffic to flow.
  switchdev->switchdev_inactive: Block traffic on the FDB; FDB and representors state and content is preserved.
When eswitch is configured to this mode, traffic is ignored/dropped on this eswitch FDB, while current configuration is kept, e.g. FDB rules and netdev representors are kept available, and FDB programming is allowed.
Example:
  # start inactive switchdev
  devlink dev eswitch set pci/0000:08:00.1 mode switchdev_inactive
  # setup TC rules, representors etc ..
  # activate
  devlink dev eswitch set pci/0000:08:00.1 mode switchdev
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20251108070404.1551708-2-saeed@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
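The transition semantics listed in this commit can be written down as a small lookup table. This is a sketch only: it encodes just the three transitions the message spells out (it is not an exhaustive list of what devlink permits), and the helper name `staged_bringup` and its callback are hypothetical.

```python
# Transitions spelled out in the commit message, with the described effect.
DESCRIBED_TRANSITIONS = {
    ("legacy", "switchdev_inactive"):
        "switchdev created, traffic not allowed to flow yet",
    ("switchdev_inactive", "switchdev"):
        "traffic enabled",
    ("switchdev", "switchdev_inactive"):
        "FDB traffic blocked, FDB and representor state preserved",
}


def staged_bringup(program_rules):
    """Follow the commit's example: start inactive, program, then activate."""
    mode = "legacy"
    log = []
    for target in ("switchdev_inactive", "switchdev"):
        if target == "switchdev":
            # FDB programming is allowed while the eswitch is still inactive.
            program_rules()
        log.append(f"{mode} -> {target}: {DESCRIBED_TRANSITIONS[(mode, target)]}")
        mode = target
    return mode, log
```

Calling `staged_bringup` with a function that installs TC rules mirrors the two `devlink dev eswitch set` commands in the example, with the programming step placed between them.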
2025-11-11 | Merge tag 'v6.18-rc5' into media-next | Mauro Carvalho Chehab
Linux 6.18-rc5
* tag 'v6.18-rc5': (1016 commits)
  Linux 6.18-rc5
  kbuild: Let kernel-doc.py use PYTHON3 override
  rtc: rx8025: fix incorrect register reference
  Revert "drm/nouveau: set DMA mask before creating the flush page"
  io_uring: fix regbuf vector size truncation
  compiler_types: Move unused static inline functions warning to W=2
  smb: client: validate change notify buffer before copy
  tracing/tools: Fix incorrcet short option in usage text for --threads
  drm/xe: Enforce correct user fence signaling order using
  x86/microcode/AMD: Add more known models to entry sign checking
  drm/xe: Do clean shutdown also when using flr
  drm/xe: Move declarations under conditional branch
  drm/xe/guc: Synchronize Dead CT worker with unbind
  tracing: Fix memory leaks in create_field_var()
  ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up
  tracing: tprobe-events: Fix to put tracepoint_user when disable the tprobe
  tracing: tprobe-events: Fix to register tracepoint correctly
  gpio: tb10x: Drop unused tb10x_set_bits() function
  drm/amd/display: Enable mst when it's detected but yet to be initialized
  drm/amdgpu: Fix wait after reset sequence in S3
  ...
2025-11-11 | sched/deadline: Fix dl_server stop condition | Peter Zijlstra
Gabriele reported that the dl_server doesn't stop as expected. The problem was found to be the fact that idle time and fair runtime are treated equally. Both will count towards dl_server runtime and push the activation forwards when it is in the zero-laxity wait state. Notably:
  dl_server_update_idle()
    update_curr_dl_se()
      if (dl_defer && dl_throttled && dl_runtime_exceeded())
        hrtimer_try_to_cancel();   // stop timer
        replenish_dl_new_period()
          deadline = now + dl_deadline; // fwd period
          runtime = dl_runtime;
        start_dl_timer();          // restart timer
And while we do want idle time accounted towards the *current* activation of the dl_server -- after all, a fair task could've run if we had any -- we don't necessarily want idle time to cause or push forward an activation.
Introduce dl_defer_idle to make this distinction. It is set once idle time has pushed the activation forward; once set, idle time is only allowed to consume runtime, not to push the activation. This will then cause dl_server_timer() to fire, which will stop the dl_server.
Any non-idle time accounting during this phase will clear dl_defer_idle, so only a full period of idle will cause the dl_server to stop.
Reported-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251101000057.GA2184199@noisy.programming.kicks-ass.net
2025-11-11 | wifi: cfg80211/mac80211: Add fallback mechanism for INDOOR_SP connection | Pagadala Yesu Anjaneyulu
Implement fallback to LPI mode when SP mode is not permitted by regulatory constraints for INDOOR_SP connections. Limit fallback mechanism to client mode. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20251110140806.8b43201a34ae.I37fc7bb5892eb9d044d619802e8f2095fde6b296@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-11 | wifi: cfg80211/mac80211: clean up duplicate ap_power handling | Pagadala Yesu Anjaneyulu
Move duplicated ap_power type handling code to an inline function in cfg80211. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20251110140806.959948da1cb5.I893b5168329fb3232f249c182a35c99804112da6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-11 | fs: move inode fields used during fast path lookup closer together | Mateusz Guzik
This should avoid *some* cache misses.
Successful path lookup is guaranteed to load at least ->i_mode, ->i_opflags and ->i_acl. At the same time the common case will avoid looking at more fields.
struct inode is not guaranteed to have any particular alignment, notably ext4 has it only aligned to 8 bytes meaning nearby fields might happen to be on the same or only adjacent cache lines depending on luck (or no luck).
According to pahole:
        umode_t                    i_mode;               /*     0     2 */
        short unsigned int         i_opflags;            /*     2     2 */
        kuid_t                     i_uid;                /*     4     4 */
        kgid_t                     i_gid;                /*     8     4 */
        unsigned int               i_flags;              /*    12     4 */
        struct posix_acl *         i_acl;                /*    16     8 */
        struct posix_acl *         i_default_acl;        /*    24     8 */
->i_acl is unnecessarily separated by 8 bytes from the other fields. With struct inode being offset 48 bytes into the cacheline this means an avoidable miss. Note it will still be there for the 56 byte case.
New layout:
        umode_t                    i_mode;               /*     0     2 */
        short unsigned int         i_opflags;            /*     2     2 */
        unsigned int               i_flags;              /*     4     4 */
        struct posix_acl *         i_acl;                /*     8     8 */
        struct posix_acl *         i_default_acl;        /*    16     8 */
        kuid_t                     i_uid;                /*    24     4 */
        kgid_t                     i_gid;                /*    28     4 */
I verified with pahole there are no size or hole changes.
This is a stopgap until someone(tm) sanitizes the layout in the first place, allocation methods aside.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20251109121931.1285366-1-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
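The cache-line argument in this commit can be checked with a short Python sketch that rebuilds both field orders with `ctypes` and reports which cache lines the three hot fields touch. The 64-byte cache line size, the use of `c_uint64` as a stand-in for the 64-bit pointers, and the helper name `lines_touched` are assumptions of this example; only the first 32 bytes of struct inode are modelled.

```python
import ctypes

CACHE_LINE = 64  # assumed cache line size in bytes


class OldLayout(ctypes.Structure):
    _fields_ = [
        ("i_mode", ctypes.c_uint16),
        ("i_opflags", ctypes.c_uint16),
        ("i_uid", ctypes.c_uint32),
        ("i_gid", ctypes.c_uint32),
        ("i_flags", ctypes.c_uint32),
        ("i_acl", ctypes.c_uint64),          # stand-in for a 64-bit pointer
        ("i_default_acl", ctypes.c_uint64),  # stand-in for a 64-bit pointer
    ]


class NewLayout(ctypes.Structure):
    _fields_ = [
        ("i_mode", ctypes.c_uint16),
        ("i_opflags", ctypes.c_uint16),
        ("i_flags", ctypes.c_uint32),
        ("i_acl", ctypes.c_uint64),
        ("i_default_acl", ctypes.c_uint64),
        ("i_uid", ctypes.c_uint32),
        ("i_gid", ctypes.c_uint32),
    ]


# Fields the commit says a successful path lookup always loads.
HOT_FIELDS = ("i_mode", "i_opflags", "i_acl")


def lines_touched(layout, start):
    """Cache lines covered by the hot fields when the struct begins
    `start` bytes into a cache line."""
    lines = set()
    for name in HOT_FIELDS:
        field = getattr(layout, name)
        first = start + field.offset
        last = first + field.size - 1
        lines.update(range(first // CACHE_LINE, last // CACHE_LINE + 1))
    return lines
```

With a start offset of 48, the old order spreads the hot fields over two lines while the new order keeps them in one; at 56 both orders still straddle the boundary, which matches the commit's note about the 56 byte case.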
2025-11-11 | xsk: add indirect call for xsk_destruct_skb | Jason Xing
Eric proposed adding indirect call wrappers for UDP and saw a huge improvement[1]; the same approach applies to the xsk scenario. This patch adds an indirect call for xsk and improves copy-mode performance by a stable 1% or so, as observed with IXGBE loaded at 10Gb/sec. If the throughput grows, the positive effect will be magnified. I applied this patch on top of the batch xmit series[2] and saw an improvement of <5% from our internal application, though that result is a little unstable.
Use INDIRECT wrappers to keep xsk_destruct_skb static, as it used to be, when the mitigation config is off.
Be aware that the freeing path can be very hot, since the frequency can reach around 2,000,000 times per second with the xdpsock test.
[1]: https://lore.kernel.org/netdev/20251006193103.2684156-2-edumazet@google.com/
[2]: https://lore.kernel.org/all/20251021131209.41491-1-kerneljasonxing@gmail.com/
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251031103328.95468-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 | ns: drop custom reference count initialization for initial namespaces | Christian Brauner
Initial namespaces don't modify their reference count anymore. They remain fixed at one so drop the custom refcount initializations. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-16-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | pid: rely on common reference count behavior | Christian Brauner
Now that we changed the generic reference counting mechanism for all namespaces to never manipulate reference counts of initial namespaces we can drop the special handling for pid namespaces. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-15-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: add asserts for initial namespace active reference counts | Christian Brauner
They always remain fixed at one. Notice when that assumption is broken.
Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-14-e8a9264e0fb9@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: add asserts for initial namespace reference counts | Christian Brauner
They always remain fixed at one. Notice when that assumption is broken.
Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-13-e8a9264e0fb9@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: make all reference counts on initial namespace a nop | Christian Brauner
They are always active so no need to needlessly cacheline ping-pong. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-12-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: rename is_initial_namespace() | Christian Brauner
Rename is_initial_namespace() to ns_init_inum() and make it symmetrical with the ns id variant. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-9-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: make is_initial_namespace() argument const | Christian Brauner
We don't modify the data structure at all so pass it as const. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-8-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | nstree: switch to new structures | Christian Brauner
Switch the nstree management to the new combined structures. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-5-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | nstree: add helper to operate on struct ns_tree_{node,root} | Christian Brauner
Add helpers that work on the combined rbtree and rculist. This will make the code a lot more manageable and legible.
Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-4-e8a9264e0fb9@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | nstree: move nstree types into separate header | Christian Brauner
Introduce two new fundamental data structures for namespace tree management in a separate header file. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-3-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | nstree: decouple from ns_common header | Christian Brauner
Forward declare struct ns_common and remove the include of ns_common.h. We want ns_common.h to possibly include nstree structures but not the other way around.
Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-2-e8a9264e0fb9@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | ns: move namespace types into separate header | Christian Brauner
Add a dedicated header for namespace types. Link: https://patch.msgid.link/20251110-work-namespace-nstree-fixes-v1-1-e8a9264e0fb9@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | Merge branch 'kbuild-6.19.fms.extension' | Christian Brauner
Bring in the shared branch with the kbuild tree to enable '-fms-extensions' for 6.19. Further namespace cleanup work requires this extension. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-11 | md: allow configuring logical block size | Li Nan
Previously, raid array used the maximum logical block size (LBS) of all member disks. Adding a larger LBS disk at runtime could unexpectedly increase RAID's LBS, risking corruption of existing partitions. This can be reproduced by:
```
# LBS of sd[de] is 512 bytes, sdf is 4096 bytes.
mdadm -CRq /dev/md0 -l1 -n3 /dev/sd[de] missing --assume-clean
# LBS is 512
cat /sys/block/md0/queue/logical_block_size
# create partition md0p1
parted -s /dev/md0 mklabel gpt mkpart primary 1MiB 100%
lsblk | grep md0p1
# LBS becomes 4096 after adding sdf
mdadm --add -q /dev/md0 /dev/sdf
cat /sys/block/md0/queue/logical_block_size
# partition lost
partprobe /dev/md0
lsblk | grep md0p1
```
Simply restricting larger-LBS disks is inflexible. In some scenarios, only disks with 512 bytes LBS are available currently, but later, disks with 4KB LBS may be added to the array.
Making LBS configurable is the best way to solve this scenario. After this patch, the raid will:
  - store LBS in disk metadata
  - add a read-write sysfs 'mdX/logical_block_size'
Future mdadm should support setting LBS via metadata field during RAID creation and the new sysfs. Though the kernel allows runtime LBS changes, users should avoid modifying it after creating partitions or filesystems to prevent compatibility issues.
Only 1.x metadata supports configurable LBS. 0.90 metadata inits all fields to default values at auto-detect. Supporting 0.90 would require more extensive changes and no such use case has been observed.
Note that many RAID paths rely on PAGE_SIZE alignment, including for metadata I/O. A larger LBS than PAGE_SIZE will result in metadata read/write failures. So this config should be prevented.
Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-6-linan666@huaweicloud.com
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
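The behaviour change described for md can be modelled in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the 4096-byte page size is assumed, and the check covers only the PAGE_SIZE limit the commit mentions, not whatever other validation the driver performs.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def legacy_array_lbs(member_lbs):
    """Old behaviour: the array LBS follows the largest member LBS."""
    return max(member_lbs)


def check_configured_lbs(lbs, page_size=PAGE_SIZE):
    """Reject an LBS above the page size, which the commit says
    would make metadata reads and writes fail."""
    if lbs > page_size:
        raise ValueError(f"LBS {lbs} exceeds page size {page_size}")
    return lbs
```

With members `[512, 512]` the legacy rule gives 512, and adding a 4096-byte disk silently raises it to 4096, which is the jump the reproduction steps show. A configured value stays fixed regardless of which disks are added later.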