path: root/include/linux
Age | Commit message | Author
2025-11-27 | kho: introduce high-level memory allocation API | Pasha Tatashin
Currently, clients of KHO must manually allocate memory (e.g., via alloc_pages), calculate the page order, and explicitly call kho_preserve_folio(). Similarly, cleanup requires separate calls to unpreserve and free the memory.

Introduce a high-level API to streamline this common pattern:

- kho_alloc_preserve(size): Allocates physically contiguous, zeroed memory and immediately marks it for preservation.
- kho_unpreserve_free(ptr): Unpreserves and frees the memory in the current kernel.
- kho_restore_free(ptr): Restores the struct page state of preserved memory in the new kernel and immediately frees it to the page allocator.

[pasha.tatashin@soleen.com: build fixes]
Link: https://lkml.kernel.org/r/CA+CK2bBgXDhrHwTVgxrw7YTQ-0=LgW0t66CwPCgG=C85ftz4zw@mail.gmail.com
Link: https://lkml.kernel.org/r/20251114190002.3311679-4-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Coiby Xu <coxu@redhat.com>
Cc: Dave Vasilevsky <dave@vasilevsky.ca>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Kees Cook <kees@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
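The "calculate the page order" step that this commit moves out of callers can be illustrated in userspace C. This is only a sketch under stated assumptions: a 4 KiB page size is assumed, and `SKETCH_PAGE_SHIFT`, `SKETCH_PAGE_SIZE` and `sketch_get_order()` are hypothetical names for illustration, not the kernel's own helpers.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; a real kernel takes this from the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size. A size of 0 yields order 0.
 * Sizes close to SIZE_MAX are not handled in this sketch.
 */
static unsigned int sketch_get_order(size_t size)
{
	/* Round the byte count up to whole pages. */
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	/* Grow the power-of-two block until it covers all pages. */
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}
```

Under these assumptions, a three-page request rounds up to a four-page block (order 2), which is the kind of bookkeeping a size-based allocation call can hide from its callers.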
2025-11-27 | kho: add interfaces to unpreserve folios, page ranges, and vmalloc | Pasha Tatashin
Allow users of KHO to cancel the previous preservation by adding the necessary interfaces to unpreserve folio, pages, and vmallocs. Link: https://lkml.kernel.org/r/20251101142325.1326536-4-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Simon Horman <horms@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27 | kho: drop notifiers | Mike Rapoport (Microsoft)
The KHO framework uses a notifier chain as the mechanism for clients to participate in the finalization process. While this works for a single, central state machine, it is too restrictive for kernel-internal components like pstore/reserve_mem or IMA. These components need a simpler, direct way to register their state for preservation (e.g., during their initcall) without being part of a complex, shutdown-time notifier sequence. The notifier model forces all participants into a single finalization flow and makes direct preservation from an arbitrary context difficult. This patch refactors the client participation model by removing the notifier chain and introducing a direct API for managing FDT subtrees. The core kho_finalize() and kho_abort() state machine remains, but clients now register their data with KHO beforehand. Link: https://lkml.kernel.org/r/20251101142325.1326536-3-pasha.tatashin@soleen.com Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Simon Horman <horms@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27 | rbtree: inline rb_last() | Eric Dumazet
This is a very small function; inlining it saves cpu cycles in TCP by reducing register pressure and removing call/ret overhead.

It also reduces vmlinux text size by 122 bytes on a typical x86_64 build.

Before:
size vmlinux
    text     data     bss      dec     hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux

After:
size vmlinux
    text     data     bss      dec     hex filename
34811659 22177365 5685248 62674272 3bc5560 vmlinux

[ojeda@kernel.org: fix rust build]
Link: https://lkml.kernel.org/r/20251120085518.1463498-1-ojeda@kernel.org
Link: https://lkml.kernel.org/r/20251114140646.3817319-3-edumazet@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27 | rbtree: inline rb_first() | Eric Dumazet
Patch series "rbtree: inline rb_first() and rb_last()".

Inline these two small helpers, heavily used in TCP and FQ packet scheduler, and in many other places. This reduces kernel text size, and brings a 1.5% improvement on a network TCP stress test.

This patch (of 2):

This is a very small function; inlining it saves cpu cycles by reducing register pressure and removing call/ret overhead.

It also reduces vmlinux text size by 744 bytes on a typical x86_64 build.

Before:
size vmlinux
    text     data     bss      dec     hex filename
34812525 22177365 5685248 62675138 3bc58c2 vmlinux

After:
size vmlinux
    text     data     bss      dec     hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux

[ojeda@kernel.org: fix rust build]
Link: https://lkml.kernel.org/r/20251120085518.1463498-1-ojeda@kernel.org
Link: https://lkml.kernel.org/r/20251114140646.3817319-1-edumazet@google.com
Link: https://lkml.kernel.org/r/20251114140646.3817319-2-edumazet@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
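For readers unfamiliar with why these helpers are small enough to inline, the following userspace sketch shows the general shape of a "first node" and "last node" lookup in a binary search tree. It is a simplified illustration, not the kernel code: `struct sketch_node`, `struct sketch_root`, `sketch_first()` and `sketch_last()` are hypothetical names, and the parent/colour fields of a real red-black tree node are omitted.

```c
#include <stddef.h>

/* Simplified stand-ins for a tree node and root (no parent or colour bits). */
struct sketch_node {
	struct sketch_node *left;
	struct sketch_node *right;
};

struct sketch_root {
	struct sketch_node *node;
};

/* Leftmost node, i.e. first in sort order; NULL for an empty tree. */
static inline struct sketch_node *sketch_first(const struct sketch_root *root)
{
	struct sketch_node *n = root->node;

	if (!n)
		return NULL;
	while (n->left)
		n = n->left;
	return n;
}

/* Rightmost node, i.e. last in sort order; NULL for an empty tree. */
static inline struct sketch_node *sketch_last(const struct sketch_root *root)
{
	struct sketch_node *n = root->node;

	if (!n)
		return NULL;
	while (n->right)
		n = n->right;
	return n;
}
```

Each helper is a single pointer-chasing loop, so declaring it `static inline` in a header lets the compiler fold it into the caller instead of emitting a call and return.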
2025-11-27 | Merge branch 'mm-hotfixes-stable' into mm-nonmm-stable in order to be able to merge "kho: make debugfs interface optional" into mm-nonmm-stable. | Andrew Morton
2025-11-27 | Merge tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers-late | Arnd Bergmann
standalone cache drivers for v6.19

ccache: Add a compatible for the pic64gx SoC. No driver change needed, as it falls back to the PolarFire SoC.

hisi hha/generic cpu cache maintenance: Add support for a non-architectural mechanism for invalidating memory regions, needed for some cxl implementations on arm64 (and probably elsewhere in the future). The HiSilicon Hydra Home Agent is the first driver to provide this support.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
  cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
  cache: Make top level Kconfig menu a boolean dependent on RISCV
  MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
  arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
  lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
  memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
  memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
  dt-bindings: cache: sifive,ccache0: add a pic64gx compatible

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-27 | keys: Fix grammar and formatting in 'struct key_type' comments | Thorsten Blum
s/it/if/ and s/revokation/revocation/, capitalize "clear", and add a period after the sentence. Fix the comment formatting. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-11-27 | Merge tag 'reset-gpio-for-v6.19' of https://git.pengutronix.de/git/pza/linux into soc/drivers-late | Arnd Bergmann
Reset/GPIO/swnode changes for v6.19

* Extend software node implementation, allowing its properties to reference existing firmware nodes.
* Update the GPIO property interface to use reworked swnode macros.
* Rework reset-gpio code to use GPIO lookup via swnode.
* Fix spi-cs42l43 driver to work with swnode changes.

* tag 'reset-gpio-for-v6.19' of https://git.pengutronix.de/git/pza/linux:
  reset: gpio: use software nodes to setup the GPIO lookup
  reset: gpio: convert the driver to using the auxiliary bus
  reset: make the provider of reset-gpios the parent of the reset device
  reset: order includes alphabetically in reset/core.c
  gpio: swnode: allow referencing GPIO chips by firmware nodes
  spi: cs42l43: Use actual ACPI firmware node for chip selects
  software node: allow referencing firmware nodes
  software node: increase the reference of the swnode by its fwnode
  software node: read the reference args via the fwnode API

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Conflicts:

net/xdp/xsk.c
  0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
  8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb")
  30ed05adca4a ("xsk: use a smaller new lock for shared pool case")
https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au
https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27 | Merge tag 'ceph-for-6.18-rc8' of https://github.com/ceph/ceph-client | Linus Torvalds
Pull ceph fixes from Ilya Dryomov:
"A patch to make sparse read handling work in msgr2 secure mode from Slava and a couple of fixes from Ziming and myself to avoid operating on potentially invalid memory, all marked for stable"

* tag 'ceph-for-6.18-rc8' of https://github.com/ceph/ceph-client:
  libceph: prevent potential out-of-bounds writes in handle_auth_session_key()
  libceph: replace BUG_ON with bounds check for map->max_osd
  ceph: fix crash in process_v2_sparse_read() for encrypted directories
  libceph: drop started parameter of __ceph_open_session()
  libceph: fix potential use-after-free in have_mon_and_osd_map()
2025-11-27 | Merge tag 'net-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Linus Torvalds
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and CAN. No known outstanding regressions.

Current release - regressions:

- mptcp: initialize rcv_mss before calling tcp_send_active_reset()
- eth: mlx5e: fix validation logic in rate limiting

Previous releases - regressions:

- xsk: avoid data corruption on cq descriptor number
- bluetooth:
  - prevent race in socket write iter and sock bind
  - fix not generating mackey and ltk when repairing
- can:
  - kvaser_usb: fix potential infinite loop in command parsers
  - rcar_canfd: fix CAN-FD mode as default
- eth:
  - veth: reduce XDP no_direct return section to fix race
  - virtio-net: avoid unnecessary checksum calculation on guest RX

Previous releases - always broken:

- sched: fix TCF_LAYER_TRANSPORT handling in tcf_get_base_ptr()
- bluetooth: mediatek: fix kernel crash when releasing iso interface
- vhost: rewind next_avail_head while discarding descriptors
- eth:
  - r8169: fix RTL8127 hang on suspend/shutdown
  - aquantia: add missing descriptor cache invalidation on ATL2
- dsa: microchip: fix resource releases in error path"

* tag 'net-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in mptcp_do_fastclose().
  net: fec: do not register PPS event for PEROUT
  net: fec: do not allow enabling PPS and PEROUT simultaneously
  net: fec: do not update PEROUT if it is enabled
  net: fec: cancel perout_timer when PEROUT is disabled
  net: mctp: unconditionally set skb->dev on dst output
  net: atlantic: fix fragment overflow handling in RX path
  MAINTAINERS: separate VIRTIO NET DRIVER and add netdev
  virtio-net: avoid unnecessary checksum calculation on guest RX
  eth: fbnic: Fix counter roll-over issue
  mptcp: clear scheduled subflows on retransmit
  net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic
  s390/net: list Aswin Karuvally as maintainer
  net: wwan: mhi: Keep modem name match with Foxconn T99W640
  vhost: rewind next_avail_head while discarding descriptors
  net/sched: em_canid: fix uninit-value in em_canid_match
  can: rcar_canfd: Fix CAN-FD mode as default
  xsk: avoid data corruption on cq descriptor number
  r8169: fix RTL8127 hang on suspend/shutdown
  net: sxgbe: fix potential NULL dereference in sxgbe_rx()
  ...
2025-11-27 | Merge tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux | Rafael J. Wysocki
Pull devfreq changes for v6.19 from Chanwoo Choi:

"- Move governor.h under include/linux/ and rename to devfreq-governor.h in order to allow devfreq governor definitions outside of drivers/devfreq/.
- Fix potential use-after-free issue of OPP handling on hisi_uncore_freq.c
- Use min() to improve the readability on tegra30-devfreq.c
- Fix typo in DFSO_DOWNDIFFERENTIAL macro name on governor_simpleondemand.c"

* tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
  PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
  PM / devfreq: hisi: Fix potential UAF in OPP handling
  PM / devfreq: Move governor.h to a public header location
2025-11-27 | sysctl: Wrap do_proc_douintvec with the public function proc_douintvec_conv | Joel Granados
Make do_proc_douintvec static and export proc_douintvec_conv wrapper function for external use. This is to keep with the design in sysctl.c. Update fs/pipe.c to use the new public API. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Create pipe-max-size converter using sysctl UINT macros | Joel Granados
Create a converter for the pipe-max-size proc_handler using the SYSCTL_UINT_CONV_CUSTOM. Move SYSCTL_CONV_IDENTITY macro to the sysctl header to make it available for pipe size validation. Keep returning -EINVAL when (val == 0) by using a range checking converter and setting the minimal valid value (extern1) to SYSCTL_ONE. Keep round_pipe_size by passing it as the operation for SYSCTL_USER_TO_KERN_INT_CONV. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Move proc_doulongvec_ms_jiffies_minmax to kernel/time/jiffies.c | Joel Granados
Move proc_doulongvec_ms_jiffies_minmax to kernel/time/jiffies.c. Create a non static wrapper function proc_doulongvec_minmax_conv that forwards the custom convmul and convdiv argument values to the internal do_proc_doulongvec_minmax. Remove unused linux/times.h include from kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Move jiffies converters to kernel/time/jiffies.c | Joel Granados
Move integer jiffies converters (proc_dointvec{_,_ms_,_userhz_}jiffies and proc_dointvec_ms_jiffies_minmax) to kernel/time/jiffies.c. Error stubs for when CONFIG_PROC_SYSCTL is not defined are not reproduced because all the jiffies converters go through proc_dointvec_conv which is already stubbed. This is part of the greater effort to move sysctl logic out of kernel/sysctl.c thereby reducing merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Move UINT converter macros to sysctl header | Joel Granados
Move SYSCTL_USER_TO_KERN_UINT_CONV and SYSCTL_UINT_CONV_CUSTOM macros to include/linux/sysctl.h. No need to embed sysctl_kern_to_user_uint_conv in a macro as it will not need a custom kernel pointer operation. This is a preparation commit to enable jiffies converter creation outside kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Move INT converter macros to sysctl header | Joel Granados
Move direction macros (SYSCTL_{USER_TO_KERN,KERN_TO_USER}) and the integer converter macros (SYSCTL_{USER_TO_KERN,KERN_TO_USER}_INT_CONV, SYSCTL_INT_CONV_CUSTOM) into include/linux/sysctl.h. This is a preparation commit to enable jiffies converter creation outside kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | sysctl: Allow custom converters from outside sysctl | Joel Granados
The new non-static proc_dointvec_conv forwards a custom converter function to do_proc_dointvec from outside the sysctl scope. Rename the do_proc_dointvec call points so any future changes to proc_dointvec_conv are propagated in sysctl.c. This is a preparation commit that allows the integer jiffies converter functions to move out of kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | Merge tag 'linux-can-next-for-6.19-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next | Paolo Abeni
Marc Kleine-Budde says:

====================
pull-request: can-next 2025-11-26

this is a pull request of 27 patches for net-next/main.

The first 17 patches are by Vincent Mailhol and Oliver Hartkopp and add CAN XL support to the CAN netlink interface.

Geert Uytterhoeven and Biju Das provide 7 patches for the rcar_canfd driver to add suspend/resume support.

The next 2 patches are by Markus Schneider-Pargmann and add them as the m_can maintainer.

Conor Dooley's patch updates the mpfs-can DT bindings.

linux-can-next-for-6.19-20251126

* tag 'linux-can-next-for-6.19-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (27 commits)
  dt-bindings: can: mpfs: document resets
  MAINTAINERS: Simplify m_can section
  MAINTAINERS: Add myself as m_can maintainer
  can: rcar_canfd: Add suspend/resume support
  can: rcar_canfd: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
  can: rcar_canfd: Invert CAN clock and close_candev() order
  can: rcar_canfd: Extract rcar_canfd_global_{,de}init()
  can: rcar_canfd: Use devm_clk_get_optional() for RAM clk
  can: rcar_canfd: Invert global vs. channel teardown
  can: rcar_canfd: Invert reset assert order
  can: dev: print bitrate error with two decimal digits
  can: raw: instantly reject unsupported CAN frames
  can: add dummy_can driver
  can: calc_bittiming: add can_calc_sample_point_pwm()
  can: calc_bittiming: add can_calc_sample_point_nrz()
  can: calc_bittiming: replace misleading "nominal" by "reference"
  can: netlink: add PWM netlink interface
  can: calc_bittiming: add PWM calculation
  can: bittiming: add PWM validation
  can: bittiming: add PWM parameters
  ...
====================

Link: https://patch.msgid.link/20251126120106.154635-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27 | sysctl: Replace void pointer with const pointer to ctl_table | Joel Granados
* Replace void* data in the converter functions with a const struct ctl_table* table as it was only forwarding values from ctl_table->extra{1,2}.
* Remove the void* data in the do_proc_* functions as they already had a pointer to the ctl_table.
* Remove min/max structures do_proc_do{uint,int}vec_minmax_conv_param; the min/max values get passed directly in ctl_table.
* Keep min/max initialization in extra{1,2} in proc_dou8vec_minmax.
* The do_proc_douintvec was adjusted outside sysctl.c as it is exported to fs/pipe.c.

Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-11-27 | srcu: Create an SRCU-fast-updown API | Paul E. McKenney
This commit creates an SRCU-fast-updown API, including DEFINE_SRCU_FAST_UPDOWN(), DEFINE_STATIC_SRCU_FAST_UPDOWN(), __init_srcu_struct_fast_updown(), init_srcu_struct_fast_updown(), srcu_read_lock_fast_updown(), srcu_read_unlock_fast_updown(), __srcu_read_lock_fast_updown(), and __srcu_read_unlock_fast_updown(). These are initially identical to their SRCU-fast counterparts, but both SRCU-fast and SRCU-fast-updown will be optimized in different directions by later commits. SRCU-fast will lack any sort of srcu_down_read() and srcu_up_read() APIs, which will enable extremely efficient NMI safety. For its part, SRCU-fast-updown will not be NMI safe, which will enable reasonably efficient implementations of srcu_down_read_fast() and srcu_up_read_fast(). This API fork happens to meet two different future use cases. * SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast must be as well. * SRCU-fast-updown will be needed for uretprobes code in order to get rid of the read-side memory barriers while still allowing entering the reader at task level while exiting it in a timer handler. This commit also adds rcutorture tests for the new APIs. This (annoyingly) needs to be in the same commit for bisectability. With this commit, the 0x8 value tests SRCU-fast-updown. However, most SRCU-fast testing will be via the RCU Tasks Trace wrappers. [ paulmck: Apply s/0x8/0x4/ missing change per Boqun Feng feedback. ] [ paulmck: Apply Akira Yokosawa feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <bpf@vger.kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-11-27 | configfs: Constify ct_item_ops in struct config_item_type | Christophe JAILLET
Make 'ct_item_ops' const in struct config_item_type. This allows constification of many structures which hold some function pointers. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/f43cb57418a7f59e883be8eedc7d6abe802a2094.1761390472.git.christophe.jaillet@wanadoo.fr Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
2025-11-27 | configfs: Constify ct_group_ops in struct config_item_type | Christophe JAILLET
Make 'ct_group_ops' const in struct config_item_type. This allows constification of many structures which hold some function pointers. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/6b720cf407e8a6d30f35beb72e031b2553d1ab7e.1761390472.git.christophe.jaillet@wanadoo.fr Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
2025-11-27 | net: pcs: xpcs: Add support for FBNIC 25G, 50G, 100G PMD | Alexander Duyck
The fbnic driver is planning to make use of the XPCS driver to enable support for PCS and better integration with phylink. To do this though we will need to enable several workarounds since the PMD interface for fbnic is likely to be unique since it is a mix of two different vendor products with a unique wrapper around the IP.

I have generated a PHY identifier based on IEEE 802.3-2022 22.2.4.3.1 using an OUI belonging to Meta Platforms and used with our NICs. Using this we will provide it as the PMD ID via the SW based MDIO interface so that the fbnic device can be identified and necessary workarounds enabled in the XPCS driver.

As an initial workaround this change adds an exception so that soft_reset is not set when the driver is initially bound to the PCS.

In addition I have added logic to integrate the PMD Rx signal detect state into the link state for the PCS. With this, the link no longer comes up too soon while the FBNIC PMD is still in the training state, which avoids link flaps.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374321695.959489.6648161125012056619.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27 | net: pcs: xpcs: Fix PMA identifier handling in XPCS | Alexander Duyck
The XPCS driver was mangling the PMA identifier as the original code appears to have been focused on just capturing the OUI. Rather than store a mangled ID, it is better to work with the actual PMA ID and just mask out the values that don't apply. Shifting and reordering them does not help, as you still don't get the original OUI for the NIC without having to bitswap the values as per the definition of the layout in IEEE 802.3-2022 22.2.4.3.1.

By laying it out as it was in the hardware it is also less likely for us to have an unintentional collision as the enum values will occupy the revision number area while the OUI occupies the upper 22 bits.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374320920.959489.17267159479370601070.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
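The "mask rather than shift" approach described above can be sketched as follows. This is an illustration only: it assumes the common 32-bit layout formed by joining the two 16-bit identifier registers, with the OUI portion in the upper 22 bits (as the commit message states), a 6-bit model number below it, and a 4-bit revision in the lowest bits. The macro and function names are hypothetical and are not taken from the XPCS driver.

```c
#include <stdint.h>

/*
 * Assumed layout of the 32-bit identifier:
 *   bits 31..10  OUI portion (kept in hardware bit order)
 *   bits  9..4   model number
 *   bits  3..0   revision number
 */
#define SKETCH_ID_OUI_MASK   UINT32_C(0xfffffc00)
#define SKETCH_ID_MODEL_MASK UINT32_C(0x000003f0)
#define SKETCH_ID_REV_MASK   UINT32_C(0x0000000f)

/* Keep the OUI bits where the hardware put them; only mask the rest away. */
static inline uint32_t sketch_id_oui_bits(uint32_t id)
{
	return id & SKETCH_ID_OUI_MASK;
}

/* Model number, shifted down to a small integer. */
static inline uint32_t sketch_id_model(uint32_t id)
{
	return (id & SKETCH_ID_MODEL_MASK) >> 4;
}

/* Revision number from the lowest four bits. */
static inline uint32_t sketch_id_rev(uint32_t id)
{
	return id & SKETCH_ID_REV_MASK;
}
```

Comparing `sketch_id_oui_bits()` of two identifiers matches devices from the same vendor without reconstructing the OUI itself, which is the property the commit message relies on.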
2025-11-27 | virtio: clean up features qword/dword terms | Michael S. Tsirkin
virtio pci uses word to mean "16 bits". mmio uses it to mean "32 bits". To avoid confusion, let's avoid the term in core virtio altogether. Just say U64 to mean "64 bit". Fixes: e7d4c1c5a546 ("virtio: introduce extended features") Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-ID: <ad53b7b6be87fc524f45abaeca0bb05fb3633397.1764225384.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix map ops comment | Michael S. Tsirkin
@free will free the map handle not sync it. Fix the doc to match. Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core") Message-Id: <f6ff1c7aff8401900bf362007d7fb52dfdb6a15b.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix virtqueue_set_affinity() docs | Michael S. Tsirkin
Rewrite the comment for better grammar and clarity. Fixes: 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue") Message-Id: <e317e91bd43b070e5eaec0ebbe60c5749d02e2dd.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: standardize Returns documentation style | Michael S. Tsirkin
Remove colons after "Returns" in virtio_map_ops function documentation - both to avoid triggering an htmldoc warning and for consistency with virtio_config_ops. This affects map_page, alloc, need_sync, and max_mapping_size. Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core") Message-Id: <c262893fa21f4b1265147ef864574a9bd173348f.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix grammar in virtio_map_ops docs | Michael S. Tsirkin
Fix grammar issues in the virtio_map_ops docs: - missing article before "transport" - "implements" -> "implement" to match subject Fixes: bee8c7c24b73 ("virtio: introduce map ops in virtio core") Message-Id: <3f7bcae5a984f14b72e67e82572b110acb06fa7e.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix grammar in virtio_queue_info docs | Michael S. Tsirkin
Fix grammar in the description of @ctx Fixes: c502eb85c34e ("virtio: introduce virtio_queue_info struct and find_vqs_info() config op") Message-Id: <a5cf2b92573200bdb1c1927e559d3930d61a4af2.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix whitespace in virtio_config_ops | Michael S. Tsirkin
The finalize_features documentation uses a tab between words. Use space instead. Fixes: d16c0cd27331 ("docs: driver-api: virtio: virtio on Linux") Message-Id: <39d7685c82848dc6a876d175e33a1407f6ab3fc1.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-27 | virtio: fix typo in virtio_device_ready() comment | Michael S. Tsirkin
"coherenct" -> "coherent" Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") Message-Id: <db286e9a65449347f6584e68c9960fd5ded2b4b0.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-11-26 | virtio-net: avoid unnecessary checksum calculation on guest RX | Jon Kohler
Commit a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") inadvertently altered checksum offload behavior for guests not using UDP GSO tunneling. Before, tun_put_user called tun_vnet_hdr_from_skb, which passed has_data_valid = true to virtio_net_hdr_from_skb. After, tun_put_user began calling tun_vnet_hdr_tnl_from_skb instead, which passes has_data_valid = false into both call sites. This caused virtio hdr flags to not include VIRTIO_NET_HDR_F_DATA_VALID for SKBs where skb->ip_summed == CHECKSUM_UNNECESSARY. As a result, guests are forced to recalculate checksums unnecessarily. Restore the previous behavior by ensuring has_data_valid = true is passed in the !tnl_gso_type case, but only from tun side, as virtio_net_hdr_tnl_from_skb() is used also by the virtio_net driver, which in turn must not use VIRTIO_NET_HDR_F_DATA_VALID on tx. cc: stable@vger.kernel.org Fixes: a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") Signed-off-by: Jon Kohler <jon@nutanix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20251125222754.1737443-1-jon@nutanix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
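The effect of the has_data_valid argument described above can be modelled with a small userspace sketch. This is not the kernel helper: the enum, the macro names and `sketch_hdr_flags()` are hypothetical, the two flag values are assumed to match the virtio specification (NEEDS_CSUM = 1, DATA_VALID = 2), and the CHECKSUM_PARTIAL branch is an assumption about the surrounding logic that the commit message does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, mirroring VIRTIO_NET_HDR_F_NEEDS_CSUM / _DATA_VALID. */
#define SKETCH_HDR_F_NEEDS_CSUM 1
#define SKETCH_HDR_F_DATA_VALID 2

/* Hypothetical stand-in for the skb->ip_summed states discussed above. */
enum sketch_ip_summed {
	SKETCH_CSUM_NONE,
	SKETCH_CSUM_UNNECESSARY,
	SKETCH_CSUM_PARTIAL,
};

/* Compute the header flags for one packet. */
static uint8_t sketch_hdr_flags(enum sketch_ip_summed ip_summed,
				bool has_data_valid)
{
	/* Assumption: a partial checksum asks the receiver to finish it. */
	if (ip_summed == SKETCH_CSUM_PARTIAL)
		return SKETCH_HDR_F_NEEDS_CSUM;

	/* Only callers passing has_data_valid may vouch for the checksum. */
	if (has_data_valid && ip_summed == SKETCH_CSUM_UNNECESSARY)
		return SKETCH_HDR_F_DATA_VALID;

	return 0;
}
```

With has_data_valid false, an already-verified packet carries no DATA_VALID flag, so the guest has to verify the checksum again; that is the regression the commit describes for the tun path.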
2025-11-26 | of: Add wrappers to match root node with OF device ID tables | Krzysztof Kozlowski
Several drivers duplicate the same code for getting a reference to the root node, matching it against a 'struct of_device_id' table and getting out the match data from the table entry.

There is a of_machine_compatible_match() wrapper but it takes an array of strings, which is not suitable for many drivers since they want the driver data associated with each compatible.

Add two wrappers, similar to existing of_device_get_match_data():
1. of_machine_device_match() doing only matching against 'struct of_device_id' and returning bool.
2. of_machine_get_match_data() doing the matching and returning associated driver data for found compatible.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251112-b4-of-match-matchine-data-v2-1-d46b72003fd6@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
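The difference between the two wrappers can be illustrated with a userspace model. Everything here is hypothetical: `struct sketch_device_id` and the `sketch_*` functions only mimic the idea of a match table keyed by compatible strings, and the matching is simplified to "first table entry that equals any of the root node's compatibles", which is not necessarily how the kernel ranks matches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match-table entry: a compatible string plus driver data. */
struct sketch_device_id {
	const char *compatible;	/* NULL terminates the table */
	const void *data;
};

/* Find the first table entry matching any of the machine's compatibles. */
static const struct sketch_device_id *
sketch_match(const struct sketch_device_id *table,
	     const char *const *root_compat, size_t n)
{
	for (; table->compatible; table++)
		for (size_t i = 0; i < n; i++)
			if (strcmp(table->compatible, root_compat[i]) == 0)
				return table;
	return NULL;
}

/* Counterpart of the "match only, return bool" wrapper. */
static bool sketch_machine_device_match(const struct sketch_device_id *table,
					const char *const *root_compat, size_t n)
{
	return sketch_match(table, root_compat, n) != NULL;
}

/* Counterpart of the "match and return driver data" wrapper. */
static const void *
sketch_machine_get_match_data(const struct sketch_device_id *table,
			      const char *const *root_compat, size_t n)
{
	const struct sketch_device_id *m = sketch_match(table, root_compat, n);

	return m ? m->data : NULL;
}
```

Both wrappers share one lookup, so a driver that needs per-board data no longer has to open-code the root-node lookup and table walk.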
2025-11-26 | phy: add hwtstamp_get callback to phy drivers | Vadim Fedorenko
PHY devices lacked a hwtstamp_get callback even though most of them are tracking configuration info. Introduce a new callback to mii_timestamper. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-3-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26 | phy: rename hwtstamp callback to hwtstamp_set | Vadim Fedorenko
PHY devices have a hwtstamp callback which actually performs a set operation. Rename it to better reflect the action. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-2-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26 | ASoC: stm32: sai: fix device and OF node leaks on | Mark Brown
Merge series from Johan Hovold <johan@kernel.org>: This series fixes device and OF node reference leaks during probe and a clock prepare imbalance on probe failures. Included is a related cleanup of an error path.
2025-11-26libceph: drop started parameter of __ceph_open_session()Ilya Dryomov
With the previous commit revamping the timeout handling, started isn't used anymore. It could be taken into account by adjusting the initial value of the timeout, but there is little point as both callers capture the timestamp shortly before calling __ceph_open_session() -- the only thing of note that happens in the interim is taking client->mount_mutex and that isn't expected to take multiple seconds. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2025-11-26regulator: Use container_of_const() when all types areMark Brown
Merge series from Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>: Use container_of_const(), which is preferred over container_of(), when the argument 'ptr' and returned pointer are already const, for better code safety and readability. Some drivers already have const everywhere, so container_of_const() can be used directly. In a few other drivers, the final pointer can be constified that way.
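A userspace sketch of why a const-preserving container_of helps. The demo_* macros below are simplified, hypothetical stand-ins for illustration, and they assume a compiler that supports C11 _Generic and the GNU __typeof__ extension. The kernel's own macros carry extra type checks that are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Const-preserving variant: if 'ptr' points to const, the result is a
 * pointer to const 'type'; otherwise it is a plain 'type' pointer.
 */
#define demo_container_of_const(ptr, type, member)                        \
	_Generic((ptr),                                                     \
		const __typeof__(*(ptr)) *:                                 \
			((const type *)demo_container_of(ptr, type, member)), \
		default:                                                    \
			((type *)demo_container_of(ptr, type, member)))

/* Hypothetical nested structures for the demonstration. */
struct demo_inner {
	int id;
};

struct demo_outer {
	int flags;
	struct demo_inner inner;
};

/* The const on 'in' carries through to 'out', so writes via 'out' fail to compile. */
static int demo_flags_of(const struct demo_inner *in)
{
	const struct demo_outer *out =
		demo_container_of_const(in, struct demo_outer, inner);

	return out->flags;
}
```

With the plain variant, the cast would silently drop the const qualifier. With the _Generic variant, assigning the result for a const 'ptr' to a non-const pointer draws a compiler diagnostic.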
2025-11-26socket: Split out a getsockname helper for io_uringGabriel Krisman Bertazi
Similar to getsockopt, split out a helper to check security and issue the operation from the main handler that can be used by io_uring. Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-26socket: Unify getsockname and getpeername implementationGabriel Krisman Bertazi
They are already implemented by the same get_name hook at the protocol level. Bring the unification one level up to reduce code duplication in preparation for supporting these as io_uring operations. Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-26function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneouslypengdonglin
Currently, the funcgraph-args and funcgraph-retaddr features are mutually exclusive. This patch resolves this limitation by allowing funcgraph-retaddr to have an args array. To verify the change, use perf to trace vfs_write with both options enabled:
Before:
  # perf ftrace -G vfs_write --graph-opts args,retaddr
  ......
  down_read() { /* <-n_tty_write+0xa3/0x540 */
    __cond_resched(); /* <-down_read+0x12/0x160 */
    preempt_count_add(); /* <-down_read+0x3b/0x160 */
    preempt_count_sub(); /* <-down_read+0x8b/0x160 */
  }
After:
  # perf ftrace -G vfs_write --graph-opts args,retaddr
  ......
  down_read(sem=0xffff8880100bea78) { /* <-n_tty_write+0xa3/0x540 */
    __cond_resched(); /* <-down_read+0x12/0x160 */
    preempt_count_add(val=1); /* <-down_read+0x3b/0x160 */
    preempt_count_sub(val=1); /* <-down_read+0x8b/0x160 */
  }
Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Xiaoqin Zhang <zhangxiaoqin@xiaomi.com> Link: https://patch.msgid.link/20251125093425.2563849-1-dolinux.peng@gmail.com Signed-off-by: pengdonglin <pengdonglin@xiaomi.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-11-26Merge branch 'iommufd_dmabuf' into k.o-iommufd/for-nextJason Gunthorpe
Jason Gunthorpe says: ==================== This series is the start of adding full DMABUF support to iommufd. Currently it is limited to only work with VFIO's DMABUF exporter. It sits on top of Leon's series to add a DMABUF exporter to VFIO: https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.com/ The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fd's, but otherwise works the same as it does today for a memfd. The user can select a slice of the FD to map into the ioas and if the underlying alignment requirements are met it will be placed in the iommu_domain. Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR memory from VFIO to an iommu_domain controlled by iommufd. This is used for PCI Peer to Peer support in VMs, and is the last feature that the VFIO type 1 container has that iommufd couldn't do. The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime control and is a use-after-free security problem. Instead iommufd relies on revokable DMABUFs. Whenever VFIO thinks there should be no access to the MMIO it can shoot down the mapping in iommufd which will unmap it from the iommu_domain. There is no automatic remap, this is a safety protocol so the kernel doesn't get stuck. Userspace is expected to know it is doing something that will revoke the dmabuf and map/unmap it around the activity. E.g., when QEMU goes to issue FLR it should do the map/unmap to iommufd. Since DMABUF is missing some key general features for this use case it relies on a "private interconnect" between VFIO and iommufd via the vfio_pci_dma_buf_iommufd_map() call. The call confirms the DMABUF has revoke semantics and delivers a phys_addr for the memory suitable for use with iommu_map().
Medium term there is a desire to expand the supported DMABUFs to include GPU drivers to support DPDK/SPDK type use cases so future series will work to add a general concept of revoke and a general negotiation of interconnect to remove vfio_pci_dma_buf_iommufd_map(). I also plan another series to modify iommufd's vfio_compat to transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI of type1. The latest series for interconnect negotiation to exchange a phys_addr is: https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com And the discussion for design of revoke is here: https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/ ==================== Based on a shared branch with vfio. * iommufd_dmabuf: iommufd/selftest: Add some tests for the dmabuf flow iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE iommufd: Have iopt_map_file_pages convert the fd to a file iommufd: Have pfn_reader process DMABUF iopt_pages iommufd: Allow MMIO pages in a batch iommufd: Allow a DMABUF to be revoked iommufd: Do not map/unmap revoked DMABUFs iommufd: Add DMABUF to iopt_pages vfio/pci: Add vfio_pci_dma_buf_iommufd_map() vfio/nvgrace: Support get_dmabuf_phys vfio/pci: Add dma-buf export support for MMIO regions vfio/pci: Enable peer-to-peer DMA transactions by default vfio/pci: Share the core device pointer while invoking feature functions vfio: Export vfio device get and put registration helpers dma-buf: provide phys_vec to scatter-gather mapping routine PCI/P2PDMA: Document DMABUF model PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation PCI/P2PDMA: Simplify bus address mapping API PCI/P2PDMA: Separate the mmap() support from the core logic Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-11-26drivers: hid: renegotiate resolution multipliers with device after resetBenedek Kupper
The scroll resolution multipliers are set in the context of hidinput_connect(), which is only called at probe time: when the host changes the value on the device with a SET_REPORT(FEATURE), and the device accepts it, these multipliers are stored on the host side, and used to calculate the final scroll event values sent to userspace. After a USB suspend, the resume operation on many hubs and chipsets involves a USB reset signal as well. A reset on the device side clears all previous state information, including the value of the multiplier report. This reset is not handled by the multiplier handling logic, so what ends up happening is the host is still expecting high-resolution scroll events, but the device is reset to default resolution, making the effective, user-perceived scroll speed incredibly slow. The solution is to renegotiate the multiplier selection after each reset. This is not the only bug related to the high-resolution scrolling implementation in the kernel (the other one is https://bugzilla.kernel.org/show_bug.cgi?id=220144), but for this one there is no device-side workaround, leading to poor user experience with our product: https://github.com/UltimateHackingKeyboard/firmware/issues/1155 https://github.com/UltimateHackingKeyboard/firmware/issues/1261 https://github.com/UltimateHackingKeyboard/firmware/pull/1355 This patch was tested by an affected user and has been reported to fix the issue (see discussion in 1355). Signed-off-by: Benedek Kupper <kupper.benedek@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-11-26mod_devicetable: Bump auxiliary_device_id name sizeRaag Jadav
We have an upcoming driver named "intel_ehl_pse_io". This creates an auxiliary child device for its GPIO sub-functionality, which matches against "intel_ehl_pse_io.gpio-elkhartlake" and overshoots the current maximum limit of 32 bytes for the auxiliary device id string. Bump the size to 40 bytes to satisfy such cases. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251106052838.433673-1-raag.jadav@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
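The size arithmetic behind this bump can be checked at compile time. The sketch below is plain userspace C; the DEMO_* names are hypothetical labels for the old and new limits quoted in the commit message, and it assumes the limit has to cover the terminating NUL as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the limits quoted in the commit message. */
#define DEMO_OLD_NAME_SIZE 32
#define DEMO_NEW_NAME_SIZE 40

/* The match string from the commit message. */
static const char demo_id[] = "intel_ehl_pse_io.gpio-elkhartlake";

/* 16 ("intel_ehl_pse_io") + 1 (".") + 16 ("gpio-elkhartlake") + 1 (NUL) = 34. */
_Static_assert(sizeof(demo_id) == 34, "33 characters plus the terminator");
_Static_assert(sizeof(demo_id) > DEMO_OLD_NAME_SIZE, "overshoots 32 bytes");
_Static_assert(sizeof(demo_id) <= DEMO_NEW_NAME_SIZE, "fits in 40 bytes");

/* Runtime form of the same check: does 'name' plus its NUL fit in the field? */
static int demo_fits(const char *name, size_t field_size)
{
	return strlen(name) + 1 <= field_size;
}
```

The string needs 34 bytes including the terminator, so it exceeds the 32-byte field by two bytes and leaves six bytes spare in the 40-byte one.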
2025-11-26sysfs: simplify attribute definition macrosThomas Weißschuh
Define the macros in terms of each other. This makes them easier to understand and also will make it easier to implement the transition machinery for 'const struct attribute'. __ATTR_RO_MODE() can't be implemented in terms of __ATTR() as not all attributes have a .store callback. The same issue theoretically exists for __ATTR_WO(), but practically that does not occur today. Reorder __ATTR_RO() below __ATTR_RO_MODE() to keep the order of the macro definition consistent with respect to each other. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251029-sysfs-const-attr-prep-v5-7-ea7d745acff4@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
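The layering idea can be illustrated with a userspace model. Everything below is a hypothetical sketch: struct demo_attr, the DEMO_ATTR* macros, and the way they chain are assumptions chosen to mirror the description above, not the kernel's actual definitions in include/linux/sysfs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute type with both callbacks. */
struct demo_attr {
	const char *name;
	unsigned int mode;
	int (*show)(char *buf);
	int (*store)(const char *buf);
};

/* Base macro: every field spelled out once. */
#define DEMO_ATTR(_name, _mode, _show, _store) \
	{ .name = #_name, .mode = (_mode), .show = (_show), .store = (_store) }

/*
 * Kept standalone in this model, mirroring the commit's note that not
 * all attributes have a .store callback, so .store is never named here.
 */
#define DEMO_ATTR_RO_MODE(_name, _mode) \
	{ .name = #_name, .mode = (_mode), .show = _name##_show }

/* The remaining variants are thin layers over the two macros above. */
#define DEMO_ATTR_RO(_name) DEMO_ATTR_RO_MODE(_name, 0444)
#define DEMO_ATTR_RW_MODE(_name, _mode) \
	DEMO_ATTR(_name, _mode, _name##_show, _name##_store)
#define DEMO_ATTR_RW(_name) DEMO_ATTR_RW_MODE(_name, 0644)
#define DEMO_ATTR_WO(_name) DEMO_ATTR(_name, 0200, NULL, _name##_store)

/* Hypothetical callbacks; the names follow the <attr>_show/<attr>_store pattern. */
static int level_show(char *buf) { strcpy(buf, "3\n"); return 2; }
static int level_store(const char *buf) { return (int)strlen(buf); }
static int version_show(char *buf) { strcpy(buf, "1\n"); return 2; }

static const struct demo_attr demo_level = DEMO_ATTR_RW(level);
static const struct demo_attr demo_version = DEMO_ATTR_RO(version);
```

Because each convenience macro expands to exactly one lower-level macro, a later change to the initializer (such as a const transition) only has to touch the base definitions.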
2025-11-26sysfs: attribute_group: enable const variants of is_visible()Thomas Weißschuh
When constifying instances of struct attribute, for consistency the corresponding .is_visible() callback should be adapted, too. Introduce a temporary transition mechanism until all callbacks are converted. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251029-sysfs-const-attr-prep-v5-4-ea7d745acff4@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>