summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-02-17net: phy: add PHY_EEE_CAP2_FEATURESHeiner Kallweit
As a prerequisite for adding EEE CAP2 register support, complement PHY_EEE_CAP1_FEATURES with PHY_EEE_CAP2_FEATURES. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-17net: mdio: add helpers for accessing the EEE CAP2 registersHeiner Kallweit
This adds helpers for accessing the EEE CAP2 registers. For now only 2500baseT and 5000baseT modes are supported. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()") 723de3ebef03 ("net: free altname using an RCU callback") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) c2da9408579d ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c bdd70eb68913 ("mptcp: drop the push_pending field") 28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-15Merge tag 'net-6.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from can, wireless and netfilter. Current release - regressions: - af_unix: fix task hung while purging oob_skb in GC - pds_core: do not try to run health-thread in VF path Current release - new code bugs: - sched: act_mirred: don't zero blockid when net device is being deleted Previous releases - regressions: - netfilter: - nat: restore default DNAT behavior - nf_tables: fix bidirectional offload, broken when unidirectional offload support was added - openvswitch: limit the number of recursions from action sets - eth: i40e: do not allow untrusted VF to remove administratively set MAC address Previous releases - always broken: - tls: fix races and bugs in use of async crypto - mptcp: prevent data races on some of the main socket fields, fix races in fastopen handling - dpll: fix possible deadlock during netlink dump operation - dsa: lan966x: fix crash when adding interface under a lag when some of the ports are disabled - can: j1939: prevent deadlock by changing j1939_socks_lock to rwlock Misc: - a handful of fixes and reliability improvements for selftests - fix sysfs documentation missing net/ in paths - finish the work of squashing the missing MODULE_DESCRIPTION() warnings in networking" * tag 'net-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits) net: fill in MODULE_DESCRIPTION()s for missing arcnet net: fill in MODULE_DESCRIPTION()s for mdio_devres net: fill in MODULE_DESCRIPTION()s for ppp net: fill in MODULE_DESCRIPTION()s for fddik/skfp net: fill in MODULE_DESCRIPTION()s for plip net: fill in MODULE_DESCRIPTION()s for ieee802154/fakelb net: fill in MODULE_DESCRIPTION()s for xen-netback net: ravb: Count packets instead of descriptors in GbEth RX path pppoe: Fix memory leak in pppoe_sendmsg() net: sctp: fix skb leak in sctp_inq_free() net: bcmasp: Handle RX buffer allocation failure net-timestamp: make sk_tskey more predictable in error path selftests: tls: increase the wait in poll_partial_rec_async ice: Add check for lport extraction to LAG init netfilter: nf_tables: fix bidirectional offload regression netfilter: nat: restore default DNAT behavior netfilter: nft_set_pipapo: fix missing : in kdoc igc: Remove temporary workaround igb: Fix string truncation warnings in igb_set_fw_version can: netlink: Fix TDCO calculation using the old data bittiming ...
2024-02-15update workarounds for gcc "asm goto" issueLinus Torvalds
In commit 4356e9f841f7 ("work around gcc bugs with 'asm goto' with outputs") I did the gcc workaround unconditionally, because the cause of the bad code generation wasn't entirely clear. In the meantime, Jakub Jelinek debugged the issue, and has come up with a fix in gcc [2], which also got backported to the still maintained branches of gcc-11, gcc-12 and gcc-13. Note that while the fix technically wasn't in the original gcc-14 branch, Jakub says: "while it is true that no GCC 14 snapshots until today (or whenever the fix will be committed) have the fix, for GCC trunk it is up to the distros to use the latest snapshot if they use it at all and would allow better testing of the kernel code without the workaround, so that if there are other issues they won't be discovered years later. Most userland code doesn't actually use asm goto with outputs..." so we will consider gcc-14 to be fixed - if somebody is using gcc snapshots of the gcc-14 before the fix, they should upgrade. Note that while the bug goes back to gcc-11, in practice other gcc changes seem to have effectively hidden it since gcc-12.1 as per a bisect by Jakub. So even a gcc-14 snapshot without the fix likely doesn't show actual problems. Also, make the default 'asm_goto_output()' macro mark the asm as volatile by hand, because of an unrelated gcc issue [1] where it doesn't match the documented behavior ("asm goto is always volatile"). Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979 [1] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 [2] Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Requested-by: Jakub Jelinek <jakub@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-15net: ipv6/addrconf: introduce a regen_min_advance sysctlAlex Henrie
In RFC 8981, REGEN_ADVANCE cannot be less than 2 seconds, and the RFC does not permit the creation of temporary addresses with lifetimes shorter than that: > When processing a Router Advertisement with a > Prefix Information option carrying a prefix for the purposes of > address autoconfiguration (i.e., the A bit is set), the host MUST > perform the following steps: > 5. A temporary address is created only if this calculated preferred > lifetime is greater than REGEN_ADVANCE time units. However, some users want to change their IPv6 address as frequently as possible regardless of the RFC's arbitrary minimum lifetime. For the benefit of those users, add a regen_min_advance sysctl parameter that can be set to below or above 2 seconds. Link: https://datatracker.ietf.org/doc/html/rfc8981 Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-15net: mdio_bus: make mdio_bus_type constRicardo B. Marliere
Since commit d492cc2573a0 ("driver core: device.h: make struct bus_type a const *"), the driver core can properly handle constant struct bus_type, move the mdio_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240213-bus_cleanup-mdio-v1-1-f9e799da7fda@marliere.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-14Merge tag 'mips-fixes_6.8_2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - Fix for broken ipv6 checksums - Fix handling of exceptions in delay slots * tag 'mips-fixes_6.8_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mm/memory: Use exception ip to search exception tables MIPS: Clear Cause.BD in instruction_pointer_set ptrace: Introduce exception_ip arch hook MIPS: Add 'memory' clobber to csum_ipv6_magic() inline assembler
2024-02-14net: remove dev_base_lockEric Dumazet
dev_base_lock is not needed anymore, all remaining users also hold RTNL. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-14net: add netdev_set_operstate() helperEric Dumazet
dev_base_lock is going away, add netdev_set_operstate() helper so that hsr does not have to know core internals. Remove dev_base_lock acquisition from rfc2863_policy() v3: use an "unsigned int" for dev->operstate, so that try_cmpxchg() can work on all arches. ( https://lore.kernel.org/oe-kbuild-all/202402081918.OLyGaea3-lkp@intel.com/ ) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-14net: convert dev->reg_state to u8Eric Dumazet
Prepares things so that dev->reg_state reads can be lockless, by adding WRITE_ONCE() on write side. READ_ONCE()/WRITE_ONCE() do not support bitfields. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-13veth: rely on skb_pp_cow_data utility routineLorenzo Bianconi
Rely on skb_pp_cow_data utility routine and remove duplicated code. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/029cc14cce41cb242ee7efdcf32acc81f1ce4e9f.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-13xdp: add multi-buff support for xdp running in generic modeLorenzo Bianconi
Similar to native xdp, do not always linearize the skb in netif_receive_generic_xdp routine but create a non-linear xdp_buff to be processed by the eBPF program. This allow to add multi-buffer support for xdp running in generic mode. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/1044d6412b1c3e95b40d34993fd5f37cd2f319fd.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-13xdp: rely on skb pointer reference in do_xdp_generic and ↵Lorenzo Bianconi
netif_receive_generic_xdp Rely on skb pointer reference instead of the skb pointer in do_xdp_generic and netif_receive_generic_xdp routine signatures. This is a preliminary patch to add multi-buff support for xdp running in generic mode where we will need to reallocate the skb to avoid linearization and we will need to make it visible to do_xdp_generic() caller. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c09415b1f48c8620ef4d76deed35050a7bddf7c2.1707729884.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-12ptrace: Introduce exception_ip arch hookJiaxun Yang
On architectures with delay slot, architecture level instruction pointer (or program counter) in pt_regs may differ from where exception was triggered. Introduce exception_ip hook to invoke architecture code and determine actual instruction pointer to the exception. Link: https://lore.kernel.org/lkml/00d1b813-c55f-4365-8d81-d70258e10b16@app.fastmail.com/ Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-02-12Merge tag 'vfs-6.8-rc5.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix performance regression introduced by moving the security permission hook out of do_clone_file_range() and into its caller vfs_clone_file_range(). This causes the security hook to be called in situation were it wasn't called before as the fast permission checks were left in do_clone_file_range(). Fix this by merging the two implementations back together and restoring the old ordering: fast permission checks first, expensive ones later. - Tweak mount_setattr() permission checking so that mount properties on the real rootfs can be changed. When we added mount_setattr() we added additional checks compared to legacy mount(2). If the mount had a parent then verify that the caller and the mount namespace the mount is attached to match and if not make sure that it's an anonymous mount. But the real rootfs falls into neither category. It is neither an anoymous mount because it is obviously attached to the initial mount namespace but it also obviously doesn't have a parent mount. So that means legacy mount(2) allows changing mount properties on the real rootfs but mount_setattr(2) blocks this. This causes regressions (See the commit for details). Fix this by relaxing the check. If the mount has a parent or if it isn't a detached mount, verify that the mount namespaces of the caller and the mount are the same. Technically, we could probably write this even simpler and check that the mount namespaces match if it isn't a detached mount. But the slightly longer check makes it clearer what conditions one needs to think about. * tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: relax mount_setattr() permission checks remap_range: merge do_clone_file_range() into vfs_clone_file_range()
2024-02-12net-device: move lstats in net_device_read_txrxEric Dumazet
dev->lstats is notably used from loopback ndo_start_xmit() and other virtual drivers. Per cpu stats updates are dirtying per-cpu data, but the pointer itself is read-only. Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-12tcp: move tp->tcp_usec_ts to tcp_sock_read_txrx groupEric Dumazet
tp->tcp_usec_ts is a read mostly field, used in rx and tx fast paths. Fixes: d5fed5addb2b ("tcp: reorganize tcp_sock fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Wei Wang <weiwan@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-12tcp: move tp->scaling_ratio to tcp_sock_read_txrx groupEric Dumazet
tp->scaling_ratio is a read mostly field, used in rx and tx fast paths. Fixes: d5fed5addb2b ("tcp: reorganize tcp_sock fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Wei Wang <weiwan@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-11Merge tag 'timers_urgent_for_v6.8_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Borislav Petkov: - Make sure a warning is issued when a hrtimer gets queued after the timers have been migrated on the CPU down path and thus said timer will get ignored * tag 'timers_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Report offline hrtimer enqueue
2024-02-10Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Update a potentially stale firmware attribute (Maurizio) - Fixes for the recent verbose error logging (Keith, Chaitanya) - Protection information payload size fix for passthrough (Francis) - Fix for a queue freezing issue in virtblk (Yi) - blk-iocost underflow fix (Tejun) - blk-wbt task detection fix (Jan) * tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux: virtio-blk: Ensure no requests in virtqueues before deleting vqs. blk-iocost: Fix an UBSAN shift-out-of-bounds warning nvme: use ns->head->pi_size instead of t10_pi_tuple structure size nvme-core: fix comment to reflect right functions nvme: move passthrough logging attribute to head blk-wbt: Fix detection of dirty-throttled tasks nvme-host: fix the updating of the firmware version
2024-02-10net: phy: provide whether link has changed in c37_read_statusChristian Marangi
Some PHY driver might require additional regs call after genphy_c37_read_status() is called. Expand genphy_c37_read_status to provide a bool wheather the link has changed or not to permit PHY driver to skip additional regs call if nothing has changed. Every user of genphy_c37_read_status() is updated with the new additional bool. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-10net: phy: add devm/of_phy_package_join helperChristian Marangi
Add devm/of_phy_package_join helper to join PHYs in a PHY package. These are variant of the manual phy_package_join with the difference that these will use DT nodes to derive the base_addr instead of manually passing an hardcoded value. An additional value is added in phy_package_shared, "np" to reference the PHY package node pointer in specific PHY driver probe_once and config_init_once functions to make use of additional specific properties defined in the PHY package node in DT. The np value is filled only with of_phy_package_join if a valid PHY package node is found. A valid PHY package node must have the node name set to "ethernet-phy-package". Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-09Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Some fscrypt-related fixups (sparse reads are used only for encrypted files) and two cap handling fixes from Xiubo and Rishabh" * tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client: ceph: always check dir caps asynchronously ceph: prevent use-after-free in encode_cap_msg() ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT libceph: just wait for more data to be available on the socket libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*() libceph: fail sparse-read if the data length doesn't match
2024-02-09work around gcc bugs with 'asm goto' with outputsLinus Torvalds
We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09Merge tag 'efi-fixes-for-v6.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "The only notable change here is the patch that changes the way we deal with spurious errors from the EFI memory attribute protocol. This will be backported to v6.6, and is intended to ensure that we will not paint ourselves into a corner when we tighten this further in order to comply with MS requirements on signed EFI code. Note that this protocol does not currently exist in x86 production systems in the field, only in Microsoft's fork of OVMF, but it will be mandatory for Windows logo certification for x86 PCs in the future. - Tighten ELF relocation checks on the RISC-V EFI stub - Give up if the new EFI memory attributes protocol fails spuriously on x86 - Take care not to place the kernel in the lowest 16 MB of DRAM on x86 - Omit special purpose EFI memory from memblock - Some fixes for the CXL CPER reporting code - Make the PE/COFF layout of mixed-mode capable images comply with a strict interpretation of the spec" * tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section cxl/trace: Remove unnecessary memcpy's cxl/cper: Fix errant CPER prints for CXL events efi: Don't add memblocks for soft-reserved memory efi: runtime: Fix potential overflow of soft-reserved region size efi/libstub: Add one kernel-doc comment x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR x86/efistub: Give up if memory attribute protocol returns an error riscv/efistub: Tighten ELF relocation check riscv/efistub: Ensure GP-relative addressing is not used
2024-02-09wwan: core: Add WWAN fastboot port typeJinjian Song
Add a new WWAN port that connects to the device fastboot protocol interface. Signed-off-by: Jinjian Song <jinjian.song@fibocom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/stmicro/stmmac/common.h 38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters") fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF") c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio") drivers/net/wireless/microchip/wilc1000/netdev.c c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000") 328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-07Merge tag 'mlx5-updates-2024-02-01' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2024-02-01 1) IPSec global stats for xfrm and mlx5 2) XSK memory improvements for non-linear SKBs 3) Software steering debug dump to use seq_file ops 4) Various code clean-ups * tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: XDP, Exclude headroom and tailroom from memory calculations net/mlx5e: XSK, Exclude tailroom from non-linear SKBs memory calculations net/mlx5: DR, Change SWS usage to debug fs seq_file interface net/mlx5: Change missing SyncE capability print to debug net/mlx5: Remove initial segmentation duplicate definitions net/mlx5: Return specific error code for timeout on wait_fw_init net/mlx5: SF, Stop waiting for FW as teardown was called net/mlx5: remove fw reporter dump option for non PF net/mlx5: remove fw_fatal reporter dump option for non PF net/mlx5: Rename mlx5_sf_dev_remove Documentation: Fix counter name of mlx5 vnic reporter net/mlx5e: Delete obsolete IPsec code net/mlx5e: Connect mlx5 IPsec statistics with XFRM core xfrm: get global statistics from the offloaded device xfrm: generalize xdo_dev_state_update_curlft to allow statistics update ==================== Link: https://lore.kernel.org/r/20240206005527.1353368-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-07net: Do not return value from init_dummy_netdev()Amit Cohen
init_dummy_netdev() always returns zero and all the callers do not check the returned value. Set the function to not return value, as it is not really used today. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240205103022.440946-1-amcohen@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-07libceph: just wait for more data to be available on the socketXiubo Li
A short read may occur while reading the message footer from the socket. Later, when the socket is ready for another read, the messenger invokes all read_partial_*() handlers, including read_partial_sparse_msg_data(). The expectation is that read_partial_sparse_msg_data() would bail, allowing the messenger to invoke read_partial() for the footer and pick up where it left off. However read_partial_sparse_msg_data() violates that and ends up calling into the state machine in the OSD client. The sparse-read state machine assumes that it's a new op and interprets some piece of the footer as the sparse-read header and returns bogus extents/data length, etc. To determine whether read_partial_sparse_msg_data() should bail, let's reuse cursor->total_resid. Because once it reaches to zero that means all the extents and data have been successfully received in last read, else it could break out when partially reading any of the extents and data. And then osd_sparse_read() could continue where it left off. [ idryomov: changelog ] Link: https://tracker.ceph.com/issues/63586 Fixes: d396f89db39a ("libceph: add sparse read support to msgr1") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07libceph: fail sparse-read if the data length doesn't matchXiubo Li
Once this happens that means there have bugs. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-06net: phy: add helper phy_advertise_eee_allHeiner Kallweit
Per default phylib preserves the EEE advertising at the time of phy probing. The EEE advertising can be changed from user space, in addition this helper allows to set the EEE advertising to all supported modes from drivers in kernel space. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20bfc471-aeeb-4ae4-ba09-7d6d4be6b86a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-06blk-wbt: Fix detection of dirty-throttled tasksJan Kara
The detection of dirty-throttled tasks in blk-wbt has been subtly broken since its beginning in 2016. Namely if we are doing cgroup writeback and the throttled task is not in the root cgroup, balance_dirty_pages() will set dirty_sleep for the non-root bdi_writeback structure. However blk-wbt checks dirty_sleep only in the root cgroup bdi_writeback structure. Thus detection of recently throttled tasks is not working in this case (we noticed this when we switched to cgroup v2 and suddently writeback was slow). Since blk-wbt has no easy way to get to proper bdi_writeback and furthermore its intention has always been to work on the whole device rather than on individual cgroups, just move the dirty_sleep timestamp from bdi_writeback to backing_dev_info. That fixes the checking for recently throttled task and saves memory for everybody as a bonus. CC: stable@vger.kernel.org Fixes: b57d74aff9ab ("writeback: track if we're sleeping on progress in balance_dirty_pages()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240123175826.21452-1-jack@suse.cz [axboe: fixup indentation errors] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-06remap_range: merge do_clone_file_range() into vfs_clone_file_range()Amir Goldstein
commit dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()") moved the permission hooks from do_clone_file_range() out to its caller vfs_clone_file_range(), but left all the fast sanity checks in do_clone_file_range(). This makes the expensive security hooks be called in situations that they would not have been called before (e.g. fs does not support clone). The only reason for the do_clone_file_range() helper was that overlayfs did not use to be able to call vfs_clone_file_range() from copy up context with sb_writers lock held. However, since commit c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held"), overlayfs just uses an open coded version of vfs_clone_file_range(). Merge_clone_file_range() into vfs_clone_file_range(), restoring the original order of checks as it was before the regressing commit and adapt the overlayfs code to call vfs_clone_file_range() before the permission hooks that were added by commit ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()"). Note that in the merge of do_clone_file_range(), the file_start_write() context was reduced to cover ->remap_file_range() without holding it over the permission hooks, which was the reason for doing the regressing commit in the first place. Reported-and-tested-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com Fixes: dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20240202102258.1582671-1-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-06net: phy: constify phydev->drvRussell King (Oracle)
Device driver structures are shared between all devices that they match, and thus nothing should never write to the device driver structure through the phydev->drv pointer. Let's make this pointer const to catch code that attempts to do so. Suggested-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1rVxXt-002YqY-9G@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-06hrtimer: Report offline hrtimer enqueueFrederic Weisbecker
The hrtimers migration on CPU-down hotplug process has been moved earlier, before the CPU actually goes to die. This leaves a small window of opportunity to queue an hrtimer in a blind spot, leaving it ignored. For example a practical case has been reported with RCU waking up a SCHED_FIFO task right before the CPUHP_AP_IDLE_DEAD stage, queuing that way a sched/rt timer to the local offline CPU. Make sure such situations never go unnoticed and warn when that happens. Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240129235646.3171983-4-boqun.feng@gmail.com
2024-02-05net/mlx5: Remove initial segmentation duplicate definitionsGal Pressman
Device definitions belong in mlx5_ifc, remove the duplicates in mlx5_core.h. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-02-05xfrm: generalize xdo_dev_state_update_curlft to allow statistics updateLeon Romanovsky
In order to allow drivers to fill all statistics, change the name of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats. Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-02-04net: make dev_unreg_count globalEric Dumazet
We can use a global dev_unreg_count counter instead of a per netns one. As a bonus we can factorize the changes done on it for bulk device removals. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-04tun: Fix code style issues in <linux/if_tun.h>Yunjian Wang
This fixes the following code style problem: - WARNING: please, no spaces at the start of a line - CHECK: Please use a blank line after function/struct/union/enum declarations Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-04Merge tag 'dmaengine-fix-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Core: - fix return value of is_slave_direction() for D2D dma Driver fixes for: - Documentaion fixes to resolve warnings for at_hdmac driver - bunch of fsl driver fixes for memory leaks, and useless kfree - TI edma and k3 fixes for packet error and null pointer checks" * tag 'dmaengine-fix-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: at_hdmac: add missing kernel-doc style description dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV dmaengine: fsl-qdma: Remove a useless devm_kfree() dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA dmaengine: ti: k3-udma: Report short packet errors dmaengine: ti: edma: Add some null pointer checks to the edma_probe dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools dmaengine: at_hdmac: fix some kernel-doc warnings
2024-02-03cxl/cper: Fix errant CPER prints for CXL eventsIra Weiny
Jonathan reports that CXL CPER events dump an extra generic error message. {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 {1}[Hardware Error]: event severity: recoverable {1}[Hardware Error]: Error 0, type: recoverable {1}[Hardware Error]: section type: unknown, fbcd0a77-c260-417f-85a9-088b1621eba6 {1}[Hardware Error]: section length: 0x90 {1}[Hardware Error]: 00000000: 00000090 00000007 00000000 0d938086 ................ {1}[Hardware Error]: 00000010: 00100000 00000000 00040000 00000000 ................ ... CXL events were rerouted though the CXL subsystem for additional processing. However, when that work was done it was missed that cper_estatus_print_section() continued with a generic error message which is confusing. Teach CPER print code to ignore printing details of some section types. Assign the CXL event GUIDs to this set to prevent confusing unknown prints. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-02-02Merge tag 'pci-v6.8-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Fix a potential deadlock that was reintroduced by an ASPM revert merged for v6.8 (Johan Hovold) - Add Manivannan Sadhasivam as PCI Endpoint maintainer (Lorenzo Pieralisi) * tag 'pci-v6.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Manivannan Sadhasivam as PCI Endpoint maintainer PCI/ASPM: Fix deadlock when enabling ASPM
2024-02-02Merge tag 'block-6.8-2024-02-01' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Remove duplicated enums (Guixen) - Use appropriate controller state accessors (Keith) - Retryable authentication (Hannes) - Add missing module descriptions (Chaitanya) - Fibre-channel fixes for blktests (Daniel) - Various type correctness updates (Caleb) - Improve fabrics connection debugging prints (Nitin) - Passthrough command verbose error logging (Adam) - Fix for where we set IO priority in the bio for drivers that use fops->submit_bio() to queue IO, like md/dm etc. * tag 'block-6.8-2024-02-01' of git://git.kernel.dk/linux: (32 commits) block: Fix where bio IO priority gets set nvme: allow passthru cmd error logging nvme-fc: show hostnqn when connecting to fc target nvme-rdma: show hostnqn when connecting to rdma target nvme-tcp: show hostnqn when connecting to tcp target nvmet-fc: use RCU list iterator for assoc_list nvmet-fc: take ref count on tgtport before delete assoc nvmet-fc: avoid deadlock on delete association path nvmet-fc: abort command when there is no binding nvmet-fc: do not tack refs on tgtports from assoc nvmet-fc: remove null hostport pointer check nvmet-fc: hold reference on hostport match nvmet-fc: free queue and assoc directly nvmet-fc: defer cleanup using RCU properly nvmet-fc: release reference on target port nvmet-fcloop: swap the list_add_tail arguments nvme-fc: do not wait in vain when unloading module nvme-fc: log human-readable opcode on timeout nvme: split out fabrics version of nvme_opcode_str() nvme: take const cmd pointer in read-only helpers ...
2024-02-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01Merge tag 'net-6.8-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. As Paolo promised we continue to hammer out issues in our selftests. This is not the end but probably the peak. Current release - regressions: - smc: fix incorrect SMC-D link group matching logic Current release - new code bugs: - eth: bnxt: silence WARN() when device skips a timestamp, it happens Previous releases - regressions: - ipmr: fix null-deref when forwarding mcast packets - conntrack: evaluate window negotiation only for packets in the REPLY direction, otherwise SYN retransmissions trigger incorrect window scale negotiation - ipset: fix performance regression in swap operation Previous releases - always broken: - tcp: add sanity checks to types of pages getting into the rx zerocopy path, we only support basic NIC -> user, no page cache pages etc. - ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv() - nt_tables: more input sanitization changes - dsa: mt7530: fix 10M/100M speed on MediaTek MT7988 switch - bridge: mcast: fix loss of snooping after long uptime, jiffies do wrap on 32bit - xen-netback: properly sync TX responses, protect with locking - phy: mediatek-ge-soc: sync calibration values with MediaTek SDK, increase connection stability - eth: pds: fixes for various teardown, and reset races Misc: - hsr: silence WARN() if we can't alloc supervision frame, it happens" * tag 'net-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) doc/netlink/specs: Add missing attr in rt_link spec idpf: avoid compiler padding in virtchnl2_ptype struct selftests: mptcp: join: stop transfer when check is done (part 2) selftests: mptcp: join: stop transfer when check is done (part 1) selftests: mptcp: allow changing subtests prefix selftests: mptcp: decrease BW in simult flows selftests: mptcp: increase timeout to 30 min selftests: mptcp: add missing kconfig for NF Mangle selftests: mptcp: add missing kconfig for NF Filter in v6 selftests: mptcp: add missing kconfig for NF Filter mptcp: fix data re-injection from stale subflow selftests: net: enable some more knobs selftests: net: add missing config for NF_TARGET_TTL selftests: forwarding: List helper scripts in TEST_FILES Makefile variable selftests: net: List helper scripts in TEST_FILES Makefile variable selftests: net: Remove executable bits from library scripts selftests: bonding: Check initial state selftests: team: Add missing config options hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove xen-netback: properly sync TX responses ...
2024-02-01Merge tag 'hid-for-linus-2024020101' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - cleanups in the error path in hid-steam (Dan Carpenter) - fixes for Wacom tablets selftests that sneaked in while the CI was taking a break during the year end holidays (Benjamin Tissoires) - null pointer check in nvidia-shield (Kunwu Chan) - memory leak fix in hidraw (Su Hui) - another null pointer fix in i2c-hid-of (Johan Hovold) - another memory leak fix in HID-BPF this time, as well as a double fdget() fix reported by Dan Carpenter (Benjamin Tissoires) - fix for Cirque touchpad when they go on suspend (Kai-Heng Feng) - new device ID in hid-logitech-hidpp: "Logitech G Pro X SuperLight 2" (Jiri Kosina) * tag 'hid-for-linus-2024020101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: bpf: use __bpf_kfunc instead of noinline HID: bpf: actually free hdev memory after attaching a HID-BPF program HID: bpf: remove double fdget() HID: i2c-hid-of: fix NULL-deref on failed power up HID: hidraw: fix a problem of memory leak in hidraw_release() HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend HID: nvidia-shield: Add missing null pointer checks to LED initialization HID: logitech-hidpp: add support for Logitech G Pro X Superlight 2 selftests/hid: wacom: fix confidence tests HID: hid-steam: Fix cleanup in probe() HID: hid-steam: remove pointless error message
2024-02-01Merge tag 'lsm-pr-20240131' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fixes from Paul Moore: "Two small patches to fix some problems relating to LSM hook return values and how the individual LSMs interact" * tag 'lsm-pr-20240131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: fix default return value of the socket_getpeersec_*() hooks lsm: fix the logic in security_inode_getsecctx()
2024-02-01Merge tag 'nf-24-01-31' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) TCP conntrack now only evaluates window negotiation for packets in the REPLY direction, from Ryan Schaefer. Otherwise SYN retransmissions trigger incorrect window scale negotiation. From Ryan Schaefer. 2) Restrict tunnel objects to NFPROTO_NETDEV which is where it makes sense to use this object type. 3) Fix conntrack pick up from the middle of SCTP_CID_SHUTDOWN_ACK packets. From Xin Long. 4) Another attempt from Jozsef Kadlecsik to address the slow down of the swap command in ipset. 5) Replace a BUG_ON by WARN_ON_ONCE in nf_log, and consolidate check for the case that the logger is NULL from the read side lock section. 6) Address lack of sanitization for custom expectations. Restrict layer 3 and 4 families to what it is supported by userspace. * tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger netfilter: ipset: fix performance regression in swap operation netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV netfilter: conntrack: correct window scaling with retransmitted SYN ==================== Link: https://lore.kernel.org/r/20240131225943.7536-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>