path: root/include/net
Age | Commit message | Author
2026-04-12 | net_sched: fix skb memory leak in deferred qdisc drops | Fernando Fernandez Mancera
When the network stack cleans up the deferred list via qdisc_run_end(), it operates on the root qdisc. If the root qdisc does not implement the TCQ_F_DEQUEUE_DROPS flag, the packets queued to be freed are never freed and get stranded on the child's local to_free list. Fix this by making qdisc_dequeue_drop() aware of the root qdisc. It fetches the root qdisc and checks for the TCQ_F_DEQUEUE_DROPS flag. If the flag is present, the packet is appended directly to the root's to_free list. Otherwise, the packet is dropped directly, as was done before the optimization was implemented.
Fixes: a6efc273ab82 ("net_sched: use qdisc_dequeue_drop() in cake, codel, fq_codel") Reported-by: Damilola Bello <damilola@aterlo.com> Closes: https://lore.kernel.org/netdev/CAPgFtOLaedBMU0f_BxV2bXftTJSmJr018Q5uozOo5vVo6b9tjw@mail.gmail.com/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260408100044.4530-1-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
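The root-aware drop decision described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the struct layouts, the flag value and the name dequeue_drop() are hypothetical stand-ins, not the kernel's definitions, and the sketch pushes onto the head of the list where the commit speaks of appending.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields needed here. */
#define TCQ_F_DEQUEUE_DROPS 0x1 /* illustrative value, not the kernel's */

struct pkt {
	struct pkt *next;
	int freed;              /* set to 1 when the packet is "freed" at once */
};

struct qdisc {
	unsigned int flags;
	struct qdisc *root;     /* root of the hierarchy this qdisc belongs to */
	struct pkt *to_free;    /* deferred-free list */
};

/* Drop a packet during dequeue. Defer it to the root's list only when the
 * root advertises that it flushes that list; otherwise free it right away. */
static void dequeue_drop(struct qdisc *q, struct pkt *p)
{
	struct qdisc *root = q->root;

	if (root->flags & TCQ_F_DEQUEUE_DROPS) {
		p->next = root->to_free;   /* push onto the root's deferred list */
		root->to_free = p;
	} else {
		p->freed = 1;              /* stands in for an immediate free */
	}
}
```

The point of the sketch is that the child's own to_free list is never touched, so nothing can be stranded there.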
2026-04-12 | netlink: add a nla_nest_end_safe() helper | Hangbin Liu
The nla_len field in struct nlattr is a __u16, which can only hold values up to 65535. If a nested attribute grows beyond this limit, nla_nest_end() silently truncates the length, producing a corrupted netlink message with no indication of the problem. Since nla_nest_end() is used everywhere and this issue rarely happens, let's add a new helper to check the length. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-b4-ynl_ethtool-v2-4-7623a5e8f70b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
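The truncation problem can be reproduced with a 16-bit length field in userspace. The sketch below is an assumption-laden illustration: struct attr_hdr and both function names are hypothetical, and the error code returned by the checked variant is a guess, since the commit message does not say what nla_nest_end_safe() returns.

```c
#include <stdint.h>
#include <stddef.h>
#include <errno.h>

/* Simplified attribute header: as in struct nlattr, the length is 16 bits. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Unchecked: a length above 65535 is silently truncated to 16 bits. */
static void nest_end_unchecked(struct attr_hdr *nest, size_t total_len)
{
	nest->len = (uint16_t)total_len;
}

/* Checked variant (hypothetical name and error code): refuse lengths that
 * do not fit instead of writing a corrupted header. */
static int nest_end_checked(struct attr_hdr *nest, size_t total_len)
{
	if (total_len > UINT16_MAX)
		return -EMSGSIZE;
	nest->len = (uint16_t)total_len;
	return 0;
}
```

With a nested payload of 65540 bytes, the unchecked version stores a length of 4, while the checked version leaves the header alone and reports an error to the caller.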
2026-04-12 | net: tso: Introduce tso_dma_map and helpers | Joe Damato
Add struct tso_dma_map to tso.h for tracking DMA addresses of mapped GSO payload data and tso_dma_map_completion_state. The tso_dma_map combines DMA mapping storage with iterator state, allowing drivers to walk pre-mapped DMA regions linearly. Includes fields for the DMA IOVA path (iova_state, iova_offset, total_len) and a fallback per-region path (linear_dma, frags[], frag_idx, offset). The tso_dma_map_completion_state makes the IOVA completion state opaque for drivers. Drivers are expected to allocate this and use the added helpers to update the completion state. Adds skb_frag_phys() to skbuff.h, returning the physical address of a paged fragment's data, which is used by the tso_dma_map helpers introduced in this commit described below. The added TSO DMA map helpers are:
  tso_dma_map_init(): DMA-maps the linear payload region and all frags upfront. Prefers the DMA IOVA API for a single contiguous mapping with one IOTLB sync; falls back to per-region dma_map_phys() otherwise. Returns 0 on success, cleans up partial mappings on failure.
  tso_dma_map_cleanup(): Handles both IOVA and fallback teardown paths.
  tso_dma_map_count(): counts how many descriptors the next N bytes of payload will need. Returns 1 if IOVA is used since the mapping is contiguous.
  tso_dma_map_next(): yields the next (dma_addr, chunk_len) pair. On the IOVA path, each segment is a single contiguous chunk. On the fallback path, indicates when a chunk starts a new DMA mapping so the driver can set dma_unmap_len on that descriptor for completion-time unmapping.
  tso_dma_map_completion_save(): updates the completion state. Drivers will call this at xmit time.
  tso_dma_map_complete(): tears down the mapping at completion time and returns true if the IOVA path was used. If it was not used, this is a no-op and returns false.
Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260408230607.2019402-2-joe@dama.to Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-12 | Merge tag 'nf-next-26-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next | Jakub Kicinski
Florian Westphal says:
====================
netfilter: updates for net-next
1-3) IPVS updates from Julian Anastasov to enhance visibility into IPVS internal state by exposing hash size, load factor etc and allows userspace to tune the load factor used for resizing hash tables.
4) reject empty/not nul terminated device names from xt_physdev. This isn't a bug fix; existing code doesn't require a c-string. But clean this up anyway because conceptually the interface name definitely should be a c-string.
5) Switch nfnetlink to skb_mac_header helpers that didn't exist back when this code was written. This gives us additional debug checks but is not intended to change functionality.
6) Let the xt ttl/hoplimit match reject unknown operator modes. This is a cleanup, the evaluation function simply returns false when the mode is out of range. From Marino Dzalto.
7) xt_socket match should enable defrag after all other checks. This bug is harmless, historically defrag could not be disabled either except by rmmod.
8) remove UDP-Lite conntrack support, from Fernando Fernandez Mancera.
9) Avoid a couple -Wflex-array-member-not-at-end warnings in the old xtables 32bit compat code, from Gustavo A. R. Silva.
10) nftables fwd expression should drop packets when their ttl/hl has expired. This is a deferred bug fix; it's not deemed important enough for -rc8.
11) Add additional checks before assuming the mac header is an ethernet header, from Zhengchuan Liang.
* tag 'nf-next-26-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: require Ethernet MAC header before using eth_hdr()
  netfilter: nft_fwd_netdev: check ttl/hl before forwarding
  netfilter: x_tables: Avoid a couple -Wflex-array-member-not-at-end warnings
  netfilter: conntrack: remove UDP-Lite conntrack support
  netfilter: xt_socket: enable defrag after all other checks
  netfilter: xt_HL: add pr_fmt and checkentry validation
  netfilter: nfnetlink: prefer skb_mac_header helpers
  netfilter: x_physdev: reject empty or not-nul terminated device names
  ipvs: add conn_lfactor and svc_lfactor sysctl vars
  ipvs: add ip_vs_status info
  ipvs: show the current conn_tab size to users
====================
Link: https://patch.msgid.link/20260410112352.23599-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-12 | Merge tag 'wireless-next-2026-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next | Jakub Kicinski
Johannes Berg says:
====================
Final updates, notably:
- crypto: move Michael MIC code into wireless (only)
- mac80211:
  - multi-link 4-addr support
  - NAN data support (but no drivers yet)
- ath10k: DT quirk to make it work on some devices
- ath12k: IPQ5424 support
- rtw89: USB improvements for performance
* tag 'wireless-next-2026-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (124 commits)
  wifi: cfg80211: Explicitly include <linux/export.h> in michael-mic.c
  wifi: ath10k: Add device-tree quirk to skip host cap QMI requests
  dt-bindings: wireless: ath10k: Add quirk to skip host cap QMI requests
  crypto: Remove michael_mic from crypto_shash API
  wifi: ipw2x00: Use michael_mic() from cfg80211
  wifi: ath12k: Use michael_mic() from cfg80211
  wifi: ath11k: Use michael_mic() from cfg80211
  wifi: mac80211, cfg80211: Export michael_mic() and move it to cfg80211
  wifi: ipw2x00: Rename michael_mic() to libipw_michael_mic()
  wifi: libertas_tf: refactor endpoint lookup
  wifi: libertas: refactor endpoint lookup
  wifi: at76c50x: refactor endpoint lookup
  wifi: ath12k: Enable IPQ5424 WiFi device support
  wifi: ath12k: Add CE remap hardware parameters for IPQ5424
  wifi: ath12k: add ath12k_hw_regs for IPQ5424
  wifi: ath12k: add ath12k_hw_version_map entry for IPQ5424
  wifi: ath12k: Add ath12k_hw_params for IPQ5424
  dt-bindings: net: wireless: add ath12k wifi device IPQ5424
  wifi: ath10k: fix station lookup failure during disconnect
  wifi: ath12k: Create symlink for each radio in a wiphy
  ...
====================
Link: https://patch.msgid.link/20260410064703.735099-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-12 | tcp: add indirect call wrapper in tcp_conn_request() | Eric Dumazet
Small improvement in SYN processing, to directly call tcp_v6_init_seq_and_ts_off() or tcp_v4_init_seq_and_ts_off(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260410174950.745670-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
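The idea behind an indirect call wrapper can be shown with a simplified macro: compare the function pointer against the likely targets and call the match directly, so the compiler can emit a direct call. The sketch below is a userspace approximation, not the kernel's macro; CALL_2, struct ops, init_v4(), init_v6() and conn_request() are all hypothetical names.

```c
/* Simplified form of the indirect-call-wrapper idea: test the pointer
 * against two known targets and fall back to the indirect call only when
 * neither matches. */
#define CALL_2(f, f2, f1, ...) \
	((f) == (f2) ? f2(__VA_ARGS__) : \
	 (f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

/* Stand-ins for the per-family init functions. */
static int init_v4(int x) { return x + 4; }
static int init_v6(int x) { return x + 6; }

struct ops {
	int (*init_seq)(int);
};

/* The caller still goes through the ops table, but the two common targets
 * are reached through direct calls. */
static int conn_request(const struct ops *ops, int x)
{
	return CALL_2(ops->init_seq, init_v6, init_v4, x);
}
```

The result is identical to a plain `ops->init_seq(x)`; only the call instruction chosen for the common cases differs.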
2026-04-12 | tcp: return a drop_reason from tcp_add_backlog() | Eric Dumazet
Part of a stack canary removal from tcp_v{4,6}_rcv(). Return a drop_reason instead of a boolean, so that we no longer have to pass the address of a local variable.
  $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
  add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-37 (-37)
  Function                                     old     new   delta
  tcp_v6_rcv                                  3133    3129      -4
  tcp_v4_rcv                                  3206    3202      -4
  tcp_add_backlog                             1281    1252     -29
  Total: Before=25567186, After=25567149, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260409101147.1642967-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
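The change in function shape can be illustrated as follows. This is a sketch, not the kernel code: the enum values, the length/limit test and both function names are hypothetical, chosen only to contrast a boolean-plus-out-parameter signature with one that returns the reason directly.

```c
/* Hypothetical reasons; zero means the packet was queued. */
enum drop_reason {
	NOT_DROPPED = 0,
	DROP_BACKLOG_FULL,
};

/* Old shape: boolean result plus an out-parameter, which forces the caller
 * to take the address of a local variable. */
static int add_backlog_old(int len, int limit, enum drop_reason *reason)
{
	if (len > limit) {
		*reason = DROP_BACKLOG_FULL;
		return 1;
	}
	return 0;
}

/* New shape: the reason is the return value, so the caller keeps it in a
 * register and no local has its address taken. */
static enum drop_reason add_backlog_new(int len, int limit)
{
	return len > limit ? DROP_BACKLOG_FULL : NOT_DROPPED;
}
```

A caller of the new shape tests the returned value against zero where it previously tested the boolean.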
2026-04-10 | Merge tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs | Linus Torvalds
Pull vfs fixes from Christian Brauner: "The kernfs rbtree is keyed by (hash, ns, name) where the hash is seeded with the raw namespace pointer via init_name_hash(ns). The resulting hash values are exposed to userspace through readdir seek positions, and the pointer-based ordering in kernfs_name_compare() is observable through entry order. Switch from raw pointers to ns_common::ns_id for both hashing and comparison. A preparatory commit first replaces all const void * namespace parameters with const struct ns_common * throughout kernfs, sysfs, and kobject so the code can access ns->ns_id. Also compare the ns_id when hashes match in the rbtree to handle crafted collisions. Also fix eventpoll RCU grace period issue and a cachefiles refcount problem"
* tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  kernfs: make directory seek namespace-aware
  kernfs: use namespace id instead of pointer for hashing and comparison
  kernfs: pass struct ns_common instead of const void * for namespace tags
  eventpoll: defer struct eventpoll free to RCU grace period
  cachefiles: fix incorrect dentry refcount in cachefiles_cull()
2026-04-10 | netfilter: conntrack: remove UDP-Lite conntrack support | Fernando Fernandez Mancera
UDP-Lite (RFC 3828) socket support was recently retired from the core networking stack. As a follow-up, drop the connection tracker and NAT support for UDP-Lite in Netfilter. This patch removes CONFIG_NF_CT_PROTO_UDPLITE and scrubs UDP-Lite awareness from the conntrack core, NAT core, nft_ct, and ctnetlink. Please note that stateless packet inspection, matching, ipsets or logging support for IPPROTO_UDPLITE is preserved. As conntrack no longer extracts UDP-Lite ports or tracks its L4 state, the UDP-Lite checksum can no longer be updated when performing NAT. That is an expected and acceptable consequence of removing the UDP-Lite conntrack module.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-09 | net: remove the netif_get_rx_queue_lease_locked() helpers | Jakub Kicinski
The netif_get_rx_queue_lease_locked() API hides the locking and the descent onto the leased queue, making the code harder to follow (at least to me). Remove the API and open code the descent a bit. Most of the code now looks like:
    if (!leased)
        return __helper(x);

    hw_rxq = ..
    netdev_lock(hw_rxq->dev);
    ret = __helper(x);
    netdev_unlock(hw_rxq->dev);

    return ret;
Of course if we have more code paths that need the wrapping we may need to revisit. For now, IMHO, having to know what netif_get_rx_queue_lease_locked() does is not worth the 20LoC it saves.
Link: https://patch.msgid.link/20260408151251.72bd2482@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp' | Jakub Kicinski
Daniel Borkmann says: ==================== netkit: Support for io_uring zero-copy and AF_XDP Containers use virtual netdevs to route traffic from a physical netdev in the host namespace. They do not have access to the physical netdev in the host and thus can't use memory providers or AF_XDP that require reconfiguring/restarting queues in the physical netdev. This patchset adds the concept of queue leasing to virtual netdevs that allow containers to use memory providers and AF_XDP at native speed. Leased queues are bound to a real queue in a physical netdev and act as a proxy. Memory providers and AF_XDP operations take an ifindex and queue id, so containers would pass in an ifindex for a virtual netdev and a queue id of a leased queue, which then gets proxied to the underlying real queue. We have implemented support for this concept in netkit and tested the latter against Nvidia ConnectX-6 (mlx5) as well as Broadcom BCM957504 (bnxt_en) 100G NICs. For more details see the individual patches. ==================== Link: https://patch.msgid.link/20260402231031.447597-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | net: Proxy netdev_queue_get_dma_dev for leased queues | David Wei
Extend netdev_queue_get_dma_dev to return the physical device of the real rxq for DMA in case the queue was leased. This allows memory providers like io_uring zero-copy or devmem to bind to the physically leased rxq via virtual devices such as netkit. Signed-off-by: David Wei <dw@davidwei.uk> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-8-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | net: Slightly simplify net_mp_{open,close}_rxq | Daniel Borkmann
net_mp_open_rxq is currently not used in the tree as all callers are using __net_mp_open_rxq directly, and net_mp_close_rxq is only used once while all other locations use __net_mp_close_rxq. Consolidate into a single API, netif_mp_{open,close}_rxq, using the netif_ prefix to indicate that the caller is responsible for locking. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-6-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | net: Add lease info to queue-get response | Daniel Borkmann
Populate nested lease info to the queue-get response that returns the ifindex, queue id with type and optionally netns id if the device resides in a different netns. Example with ynl client when using AF_XDP via queue leasing:
  # ip a
  [...]
  4: enp10s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:24 qdisc mq state UP group default qlen 1000
      link/ether e8:eb:d3:a3:43:f6 brd ff:ff:ff:ff:ff:ff
      inet 10.0.0.2/24 scope global enp10s0f0np0
         valid_lft forever preferred_lft forever
      inet6 fe80::eaeb:d3ff:fea3:43f6/64 scope link proto kernel_ll
         valid_lft forever preferred_lft forever
  [...]
  # ethtool -i enp10s0f0np0
  driver: mlx5_core
  [...]
  # ynl --family netdev --output-json --do queue-get \
        --json '{"ifindex": 4, "id": 15, "type": "rx"}'
  {'id': 15, 'ifindex': 4, 'lease': {'ifindex': 8, 'netns-id': 0, 'queue': {'id': 1, 'type': 'rx'}}, 'napi-id': 8227, 'type': 'rx', 'xsk': {}}
  # ip netns list
  foo (id: 0)
  # ip netns exec foo ip a
  [...]
  8: nk@NONE: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
      inet6 fe80::200:ff:fe00:0/64 scope link proto kernel_ll
         valid_lft forever preferred_lft forever
  [...]
  # ip netns exec foo ethtool -i nk
  driver: netkit
  [...]
  # ip netns exec foo ls /sys/class/net/nk/queues/
  rx-0 rx-1 tx-0
  # ip netns exec foo ynl --family netdev --output-json --do queue-get \
        --json '{"ifindex": 8, "id": 1, "type": "rx"}'
  {"id": 1, "type": "rx", "ifindex": 8, "xsk": {}}
Note that the caller of netdev_nl_queue_fill_one() holds the netdevice lock. For the queue-get we do not lock both devices. When queues get {un,}leased, both devices are locked, thus if __netif_get_rx_queue_lease() returns a lease pointer, it points to a valid device. The netns-id is fetched via peernet2id_alloc() similarly as done in OVS.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-4-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | net: Implement netdev_nl_queue_create_doit | Daniel Borkmann
Implement netdev_nl_queue_create_doit which creates a new rx queue in a virtual netdev and then leases it to a rx queue in a physical netdev. Example with ynl client:
  # ynl --family netdev --output-json --do queue-create \
        --json '{"ifindex": 8, "type": "rx", "lease": {"ifindex": 4, "queue": {"type": "rx", "id": 15}}}'
  {'id': 1}
Note that the netdevice locking order is always from the virtual to the physical device.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-3-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-7.0-rc8).
Conflicts:
net/ipv6/seg6_iptunnel.c
  c3812651b522f ("seg6: separate dst_cache for input and output paths in seg6 lwtunnel")
  78723a62b969a ("seg6: add per-route tunnel source address")
  https://lore.kernel.org/adZhwtOYfo-0ImSa@sirena.org.uk
net/ipv4/icmp.c
  fde29fd934932 ("ipv4: icmp: fix null-ptr-deref in icmp_build_probe()")
  d98adfbdd5c01 ("ipv4: drop ipv6_stub usage and use direct function calls")
  https://lore.kernel.org/adO3dccqnr6j-BL9@sirena.org.uk
Adjacent changes:
drivers/net/ethernet/stmicro/stmmac/chain_mode.c
  51f4e090b9f8 ("net: stmmac: fix integer underflow in chain mode")
  6b4286e05508 ("net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-09 | RDMA/mana_ib: Support memory windows | Konstantin Taranov
Implement .alloc_mw() and .dealloc_mw() for mana device. This is just the basic infrastructure, MW is not practically usable until additional kernel support for allowing user space to submit MW work requests is completed. Link: https://patch.msgid.link/r/20260331090851.2276205-1-kotaranov@linux.microsoft.com Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-04-09 | kernfs: pass struct ns_common instead of const void * for namespace tags | Christian Brauner
kernfs has historically used const void * to pass around namespace tags used for directory-level namespace filtering. The only current user of this is sysfs network namespace tagging where struct net pointers are cast to void *. Replace all const void * namespace parameters with const struct ns_common * throughout the kernfs, sysfs, and kobject namespace layers. This includes the kobj_ns_type_operations callbacks, kobject_namespace(), and all sysfs/kernfs APIs that accept or return namespace tags. Passing struct ns_common is needed because various codepaths require access to the underlying namespace. A struct ns_common can always be converted back to the concrete namespace type (e.g., struct net) via container_of() or to_ns_common() in the reverse direction. This is a preparatory change for switching to ns_id-based directory iteration to prevent a KASLR pointer leak through the current use of raw namespace pointers as hash seeds and comparison keys. Signed-off-by: Christian Brauner <brauner@kernel.org>
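The conversion between a common embedded struct and its containing type is the usual container_of() pattern. The sketch below is a userspace illustration under stated assumptions: the struct layouts are heavily reduced and hypothetical, and net_to_tag() and tag_to_net() are illustrative names, not kernel functions.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Common part shared by every namespace type (reduced to one field). */
struct ns_common {
	unsigned long long ns_id;
};

/* Hypothetical, heavily reduced concrete namespace type. */
struct net {
	int ifcount;
	struct ns_common ns;    /* embedded common part */
};

/* Concrete type to typed tag: take the address of the embedded member. */
static struct ns_common *net_to_tag(struct net *net)
{
	return &net->ns;
}

/* Typed tag back to the concrete type: subtract the member offset. */
static struct net *tag_to_net(struct ns_common *ns)
{
	return container_of(ns, struct net, ns);
}
```

Unlike a void * tag, the typed tag lets generic code read fields such as ns_id without knowing the concrete namespace type.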
2026-04-08 | devlink: Add port-level resource registration infrastructure | Or Har-Toov
The current devlink resource infrastructure supports only device-level resources. Some hardware resources are associated with specific ports rather than the entire device, and today we have no way to show resources per port. Add support for registering resources at the port level.
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08 | devlink: Refactor resource functions to be generic | Or Har-Toov
Currently the resource functions take devlink pointer as parameter and take the resource list from there. Allow resource functions to work with other resource lists that will be added in next patches and not only with the devlink's resource list. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08 | net: dropreason: add MACVLAN_BROADCAST_BACKLOG and IPVLAN_MULTICAST_BACKLOG | Eric Dumazet
ipvlan and macvlan use queues to process broadcast/multicast packets from a work queue. Under attack these queues can drop packets. Add MACVLAN_BROADCAST_BACKLOG drop_reason for macvlan broadcast queue. Add IPVLAN_MULTICAST_BACKLOG drop_reason for ipvlan multicast queue. Use different reasons as some deployments use both ipvlan and macvlan. Also change ipvlan_rcv_frame() to use SKB_DROP_REASON_DEV_READY when the device is not UP. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407150710.1640747-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08 | codel: annotate data-races in codel_dump_stats() | Eric Dumazet
codel_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_codel_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. No change in kernel size:
  $ scripts/bloat-o-meter -t vmlinux.0 vmlinux
  add/remove: 0/0 grow/shrink: 1/1 up/down: 3/-1 (2)
  Function                                     old     new   delta
  codel_qdisc_dequeue                         2462    2465      +3
  codel_dump_stats                             250     249      -1
  Total: Before=29739919, After=29739921, chg +0.00%
Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407143053.1570620-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
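The annotation pattern can be approximated in userspace with volatile accesses. The sketch below is an illustration only: the macros are simplified versions that rely on the GCC/Clang __typeof__ extension, and the struct, its field and both functions are hypothetical, not the kernel's codel code.

```c
#include <stdint.h>

/* Userspace approximations of the kernel macros: a volatile access makes the
 * compiler perform exactly one load or store of the field. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical statistics block updated in a fast path. */
struct stats_sketch {
	uint32_t maxpacket;
};

/* Fast path: the writer marks its store. */
static void fast_path_update(struct stats_sketch *s, uint32_t len)
{
	if (len > READ_ONCE(s->maxpacket))
		WRITE_ONCE(s->maxpacket, len);
}

/* Dump path: the lockless reader marks its load. */
static uint32_t dump_maxpacket(struct stats_sketch *s)
{
	return READ_ONCE(s->maxpacket);
}
```

Each field is read independently, which matches the commit's remark that the dumped fields do not need to be latched atomically as a group.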
2026-04-08 | bonding: remove unused bond_is_first_slave and bond_is_last_slave macros | Xiang Mei
Since commit 2884bf72fb8f ("net: bonding: fix use-after-free in bond_xmit_broadcast()"), bond_is_last_slave() was only used in bond_xmit_broadcast(). After the recent fix replaced that usage with a simple index comparison, bond_is_last_slave() has no remaining callers. bond_is_first_slave() likewise has no callers. Remove both unused macros. Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260404220412.444753-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08 | netfilter: nfnetlink_queue: make hash table per queue | Florian Westphal
Sharing a global hash table among all queues is tempting, but it can cause a crash:
  BUG: KASAN: slab-use-after-free in nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue]
  [..]
  nfqnl_recv_verdict+0x11ac/0x15e0 [nfnetlink_queue]
  nfnetlink_rcv_msg+0x46a/0x930
  kmem_cache_alloc_node_noprof+0x11e/0x450
struct nf_queue_entry is freed via kfree, but a parallel CPU can still encounter such an nf_queue_entry when walking the list. An alternative fix is to free the nf_queue_entry via kfree_rcu() instead, but as we have to alloc/free for each skb this would cause more memory pressure.
Cc: Scott Mitchell <scott.k.mitch1@gmail.com> Fixes: e19079adcd26 ("netfilter: nfnetlink_queue: optimize verdict lookup with hash table") Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08 | netfilter: nft_ct: fix use-after-free in timeout object destroy | Tuan Do
nft_ct_timeout_obj_destroy() frees the timeout object with kfree() immediately after nf_ct_untimeout(), without waiting for an RCU grace period. Concurrent packet processing on other CPUs may still hold RCU-protected references to the timeout object obtained via rcu_dereference() in nf_ct_timeout_data(). Add an rcu_head to struct nf_ct_timeout and use kfree_rcu() to defer freeing until after an RCU grace period, matching the approach already used in nfnetlink_cttimeout.c. KASAN report:
  BUG: KASAN: slab-use-after-free in nf_conntrack_tcp_packet+0x1381/0x29d0
  Read of size 4 at addr ffff8881035fe19c by task exploit/80
  Call Trace:
   nf_conntrack_tcp_packet+0x1381/0x29d0
   nf_conntrack_in+0x612/0x8b0
   nf_hook_slow+0x70/0x100
   __ip_local_out+0x1b2/0x210
   tcp_sendmsg_locked+0x722/0x1580
   __sys_sendto+0x2d8/0x320
  Allocated by task 75:
   nft_ct_timeout_obj_init+0xf6/0x290
   nft_obj_init+0x107/0x1b0
   nf_tables_newobj+0x680/0x9c0
   nfnetlink_rcv_batch+0xc29/0xe00
  Freed by task 26:
   nft_obj_destroy+0x3f/0xa0
   nf_tables_trans_destroy_work+0x51c/0x5c0
   process_one_work+0x2c4/0x5a0
Fixes: 7e0b2b57f01d ("netfilter: nft_ct: add ct timeout support") Cc: stable@vger.kernel.org Signed-off-by: Tuan Do <tuan@calif.io> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-08 | netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use it | Pablo Neira Ayuso
Add a new helper function to retrieve the next action entry in a flow rule. The helper checks whether the maximum number of actions has been reached and bails out in that case. Replace the existing open-coded iteration over the action array with this helper function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
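A bounded "next entry" helper of this kind can be sketched as follows. The types, the limit and the function name are hypothetical and chosen for illustration; the real nft_flow_action_entry_next() signature is not given in the commit message.

```c
#include <stddef.h>

#define MAX_ACTIONS 4 /* illustrative limit, not the kernel's */

struct action {
	int id;
};

struct flow {
	struct action entries[MAX_ACTIONS];
	unsigned int num;       /* number of entries handed out so far */
};

/* Return the next free action slot, or NULL once the array is full, so that
 * callers cannot index past the end of the array. */
static struct action *flow_action_next(struct flow *f)
{
	if (f->num >= MAX_ACTIONS)
		return NULL;
	return &f->entries[f->num++];
}
```

Centralising the bounds check in one helper means each caller only has to handle a NULL return instead of repeating the index arithmetic.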
2026-04-08 | netfilter: nft_meta: add double-tagged vlan and pppoe support | Pablo Neira Ayuso
Currently: add rule netdev x y ip saddr 1.1.1.1 works with neither double-tagged vlan nor pppoe packets. This is because the network and transport header offsets do not point to the IP and transport protocol headers in the stack. This patch expands NFT_META_PROTOCOL and NFT_META_L4PROTO to parse double-tagged vlan and pppoe packets so that matching network and transport header fields becomes possible with the existing userspace-generated bytecode. Note that, for simplicity, this parser only supports double-tagged vlan composed of vlan offload plus a vlan header in the skb payload area. NFT_META_PROTOCOL is used by the bridge and netdev families as an implicit dependency in the bytecode to match on network header fields. Similarly, NFT_META_L4PROTO is used as an implicit dependency when matching on transport protocol header fields.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-07 | net: pull headers in qdisc_pkt_len_segs_init() | Eric Dumazet
Most ndo_start_xmit() methods expect headers of gso packets to be already in skb->head. net/core/tso.c users are particularly at risk, because tso_build_hdr() does a memcpy(hdr, skb->data, hdr_len); qdisc_pkt_len_segs_init() already does a dissection of gso packets. Use pskb_may_pull() instead of skb_header_pointer() to make sure drivers do not have to reimplement this. Malicious packets could be fed in; detect them so that we can drop them sooner with a new SKB_DROP_REASON_SKB_BAD_GSO drop_reason.
Fixes: e876f208af18 ("net: Add a software TSO helper API") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260403221540.3297753-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07 | wifi: mac80211: add NAN peer schedule support | Miri Korenblit
Peer schedules specify which channels the peer is available on and when. Add support for configuring peer NAN schedules:
- build and store the schedule and maps
- for each channel, make sure that it fits into the capabilities, and take the minimum between it and the local compatible nan channel.
- configure the driver
Note that the removal of a peer schedule should be done by the driver upon NMI station removal.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260326121156.185ff2283fa6.I0345eb665be8ccf4a77eb1aca9a421eb8d2432e2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-07 | wifi: mac80211: support NAN stations | Miri Korenblit
Add support for both NMI and NDI stations. The NDI station will be linked to the NMI station of the NAN peer for which the NDI station is added. A peer can choose to reuse its NMI address as the NDI address. Since different keys might be in use for NAN management and for data frames, we will have 2 different stations, even if they'll have the same address. Even though there are no links in NAN, sta->deflink will still be used to store the one set of capabilities and SMPS mode. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260326121156.9fdd37b8e755.I7a7bd6e8e751cab49c329419485839afd209cfc6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-07 | wifi: mac80211: add NAN local schedule support | Miri Korenblit
A NAN local schedule consists of a list of NAN channels and an array that maps each time slot to the channel it is scheduled on (or NULL to indicate unscheduled). A NAN channel is the configuration of a channel which is used for NAN operations. It is a new type of chanctx user (before, the only user was a link). A NAN channel may not have a chanctx assigned if it is ULWed out. A NAN channel may or may not be scheduled (for example, user space may want to prepare the resources before the actual schedule is configured). Add management of the NAN local schedule. Since we introduce a new chanctx user, also adjust the different for_each_chanctx_user_* macros to also visit the NAN channels and take those into account.
Co-developed-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260326121156.03350fd40630.Id158f815cfc9b5ab1ebdb8ee608bda426e4d7474@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-07 | wifi: mac80211: export ieee80211_calculate_rx_timestamp | Benjamin Berg
The function is quite useful when handling beacon timestamps. Export it so that it can be used by mac80211_hwsim and others. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260326121156.a1abc9c52f37.Ieabfe66768b1bf64c3076d62e73c50794faeacdc@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-07 | wifi: mac80211: add a TXQ for management frames on NAN devices | Benjamin Berg
Currently there is no TXQ for non-data frames. Add a new txq_mgmt for this purpose and create one of these on NAN devices. On NAN devices, these frames may only be transmitted during the discovery window and it is therefore helpful to schedule them using a queue. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260326121156.32eddd986bd2.Iee95758287c276155fbd7779d3f263339308e083@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-06  tcp: add recv_should_stop helper  Geliang Tang
Factor out a new helper tcp_recv_should_stop() from tcp_recvmsg_locked() and tcp_splice_read() to check whether to stop receiving. And use this helper in mptcp_recvmsg() and mptcp_splice_read() to reduce redundant code. Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-3-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06  xsk: fix XDP_UMEM_SG_FLAG issues  Maciej Fijalkowski
Currently xp_assign_dev_shared() does not propagate XDP_USE_SG to flags, so set it in order to preserve the MTU check that is supposed to be done only when no multi-buffer setup is in use. Also, this flag has the same value as XDP_UMEM_TX_SW_CSUM, so we could get unexpected SG setups for software Tx checksums. Since the csum flag is UAPI, modify the value of XDP_UMEM_SG_FLAG. Fixes: d609f3d228a8 ("xsk: add multi-buffer support for sockets sharing umem") Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-4-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
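The flag collision described in this message can be illustrated with a small userspace sketch. The `DEMO_*` names and bit positions below are illustrative assumptions, not the kernel's definitions; the point is only that two flags sharing one bit cannot be told apart, and that the flag which is not user-visible is the one that can safely move.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; these are not the kernel's definitions. */
#define DEMO_UAPI_TX_SW_CSUM (1u << 1) /* fixed: part of a user-visible ABI */
#define DEMO_INTERNAL_SG_OLD (1u << 1) /* collides with the bit above */
#define DEMO_INTERNAL_SG_NEW (1u << 3) /* hypothetical non-colliding bit */

/* Returns 1 if the given "SG" bit is set in flags, 0 otherwise. */
static inline int demo_uses_sg(uint32_t flags, uint32_t sg_bit)
{
	return (flags & sg_bit) != 0;
}
```

With the old value, a caller that only asked for software Tx checksums is also reported as a scatter-gather user; with the moved bit it is not.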
2026-04-06  xsk: respect tailroom for ZC setups  Maciej Fijalkowski
Multi-buffer XDP stores information about frags in skb_shared_info that sits at the tailroom of a packet. The storage space is reserved via xdp_data_hard_end(): ((xdp)->data_hard_start + (xdp)->frame_sz - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))) and then we refer to it via the helper below: static inline struct skb_shared_info * xdp_get_shared_info_from_buff(const struct xdp_buff *xdp) { return (struct skb_shared_info *)xdp_data_hard_end(xdp); } Currently we do not respect this tailroom space in the multi-buffer AF_XDP ZC scenario. To address this, introduce xsk_pool_get_tailroom() and use it within xsk_pool_get_rx_frame_size(), which is used in ZC drivers to configure the length of the HW Rx buffer. Drivers typically work with 128-byte alignment on the HW Rx buffer side, so let us align the value returned by xsk_pool_get_rx_frame_size() in order to avoid addressing this on the driver's side. This addresses the fact that idpf uses the mentioned function *before* pool->dev is set, so we were at risk that after subtracting the tailroom we would not provide a 128-byte aligned value to HW. Since xsk_pool_get_rx_frame_size() is actively used in xsk_rcv_check() and __xsk_rcv(), add a variant of this routine that does not include the 128-byte alignment, so that the old behavior is preserved there. Reviewed-by: Björn Töpel <bjorn@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-3-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
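The arithmetic behind the tailroom change can be sketched in userspace C. The `demo_*` helpers and the example sizes are hypothetical, and rounding down (rather than up) is an assumption made here because a rounded-up length could exceed the space left in the buffer; the commit message itself only says the value is aligned.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HW_ALIGN 128u /* alignment mentioned in the commit message */

/* Round v down to a multiple of DEMO_HW_ALIGN (a power of two). */
static inline uint32_t demo_align_down(uint32_t v)
{
	return v & ~(DEMO_HW_ALIGN - 1u);
}

/* Usable Rx length once headroom and tailroom are reserved, unaligned. */
static inline uint32_t demo_rx_frame_size_raw(uint32_t chunk, uint32_t headroom,
					      uint32_t tailroom)
{
	assert(headroom + tailroom <= chunk);
	return chunk - headroom - tailroom;
}

/* Same value, rounded down for programming into a HW Rx descriptor. */
static inline uint32_t demo_rx_frame_size(uint32_t chunk, uint32_t headroom,
					  uint32_t tailroom)
{
	return demo_align_down(demo_rx_frame_size_raw(chunk, headroom, tailroom));
}
```

For a hypothetical 4096-byte chunk with 256 bytes of headroom and 320 bytes of tailroom, the raw size is 3520 bytes, which is not a multiple of 128; the aligned variant yields 3456. The two-function split mirrors the message's distinction between the aligned value for drivers and the unaligned variant kept for the receive checks.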
2026-04-06  net: dsa: add bridge member iteration macro  Daniel Golle
Drivers that offload bridges need to iterate over the ports that are members of a given bridge, for example to rebuild per-port forwarding bitmaps when membership changes. Currently drivers typically open-code this by combining dsa_switch_for_each_user_port() with a dsa_port_offloads_bridge_dev() check, or cache bridge membership within the driver. Add dsa_switch_for_each_bridge_member() macro to express this pattern directly, and use it for the existing dsa_bridge_ports() inline helper. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/e7136aaa26773f39e805a00fe4ecf13cd2b83fc0.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
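The pattern described here, a filtered for-each macro that then backs a "bitmap of member ports" helper, might look roughly like this in plain C. The structures and `demo_*` names are hypothetical stand-ins and do not reproduce the definitions in dsa.h; port indexes are assumed to be below 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a switch and its ports. */
struct demo_port {
	unsigned int index;
	const void *bridge; /* bridge this port is a member of, or NULL */
};

struct demo_switch {
	struct demo_port *ports;
	size_t num_ports;
};

/*
 * Visit every port of sw that is a member of br. The "if {} else" form
 * keeps the macro safe when the caller's body is followed by an else.
 */
#define demo_switch_for_each_bridge_member(p, sw, br)			\
	for ((p) = (sw)->ports; (p) < (sw)->ports + (sw)->num_ports; (p)++) \
		if ((p)->bridge != (br)) {} else

/* Build a bitmap of the ports that are members of br. */
static inline uint32_t demo_bridge_ports(const struct demo_switch *sw,
					 const void *br)
{
	struct demo_port *p;
	uint32_t mask = 0;

	demo_switch_for_each_bridge_member(p, sw, br)
		mask |= UINT32_C(1) << p->index;

	return mask;
}
```

A driver-side caller would then rebuild its forwarding bitmap with a single call instead of open-coding the membership check inside a loop over all ports.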
2026-04-06  net: dsa: move dsa_bridge_ports() helper to dsa.h  Daniel Golle
The yt921x driver contains a helper to create a bitmap of ports which are members of a bridge. Move the helper as a static inline function into dsa.h, so other drivers can make use of it as well. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/4f8bbfce3e4e3a02064fc4dc366263136c6e0383.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-05  mm: introduce a new page type for page pool in page type  Byungchul Park
Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to determine if a page belongs to a page pool. However, with the planned removal of @pp_magic, we should instead leverage the page_type in struct page, such as PGTY_netpp, for this purpose. Introduce and use the page type APIs e.g. PageNetpp(), __SetPageNetpp(), and __ClearPageNetpp() instead, and remove the existing APIs accessing @pp_magic e.g. page_pool_page_is_pp(), netmem_or_pp_magic(), and netmem_clear_pp_magic(). Plus, add @page_type to struct net_iov at the same offset as struct page so as to use the page_type APIs for struct net_iov as well. While at it, reorder @type and @owner in struct net_iov to avoid a hole and increasing the struct size. This work was inspired by the following link: https://lore.kernel.org/all/582f41c0-2742-4400-9c81-0d46bf4e8314@gmail.com/ While at it, move the sanity check for page pool to the free path. [byungchul@sk.com: gate the sanity check, per Johannes] Link: https://lkml.kernel.org/r/20260316223113.20097-1-byungchul@sk.com Link: https://lkml.kernel.org/r/20260224051347.19621-1-byungchul@sk.com Co-developed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Byungchul Park <byungchul@sk.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: David Wei <dw@davidwei.uk> Cc: Dragos Tatulea <dtatulea@nvidia.com> Cc: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Mark Bloch <mbloch@nvidia.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Simon Horman <horms@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Taehee Yoo <ap420073@gmail.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-03  net: increase IP_TUNNEL_RECURSION_LIMIT to 5  Chris J Arges
In configurations with multiple tunnel layers and MPLS lwtunnel routing, a single tunnel hop can increment the counter beyond this limit. This causes packets to be dropped with the "Dead loop on virtual device" message even when a routing loop doesn't exist. Increase IP_TUNNEL_RECURSION_LIMIT from 4 to 5 to handle this use-case. Fixes: 6f1a9140ecda ("net: add xmit recursion limit to tunnel xmit functions") Link: https://lore.kernel.org/netdev/88deb91b-ef1b-403c-8eeb-0f971f27e34f@redhat.com/ Signed-off-by: Chris J Arges <carges@cloudflare.com> Link: https://patch.msgid.link/20260402222401.3408368-1-carges@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-7.0-rc7). Conflicts: net/vmw_vsock/af_vsock.c b18c83388874 ("vsock: initialize child_ns_mode_locked in vsock_net_init()") 0de607dc4fd8 ("vsock: add G2H fallback for CIDs not owned by H2G transport") Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c ceee35e5674a ("bnxt_en: Refactor some basic ring setup and adjustment logic") 57cdfe0dc70b ("bnxt_en: Resize RSS contexts on channel count change") drivers/net/wireless/intel/iwlwifi/mld/mac80211.c 4d56037a02bd ("wifi: iwlwifi: mld: block EMLSR during TDLS connections") 687a95d204e7 ("wifi: iwlwifi: mld: correctly set wifi generation data") drivers/net/wireless/intel/iwlwifi/mld/scan.h b6045c899e37 ("wifi: iwlwifi: mld: Refactor scan command handling") ec66ec6a5a8f ("wifi: iwlwifi: mld: Fix MLO scan timing") drivers/net/wireless/intel/iwlwifi/mvm/fw.c 078df640ef05 ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v 2") 323156c3541e ("wifi: iwlwifi: mvm: don't send a 6E related command when not supported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02  net: mctp: perform source address lookups when we populate our dst  Jeremy Kerr
Rather than querying the output device for its address in mctp_local_output, set up the source address when we're populating the dst structure. If no address is assigned, use MCTP_ADDR_NULL. This will allow us more flexibility when routing for NULL-source-eid cases. For now though, we still reject a NULL source address in the output path. We need to update the tests a little, so that addresses are assigned before we do the dst lookups. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20260331-dev-mctp-null-eids-v1-1-b4d047372eaf@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-30  RDMA/mana_ib: Disable RX steering on RSS QP destroy  Long Li
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss() destroys the RX WQ objects but does not disable vPort RX steering in firmware. This leaves stale steering configuration that still points to the destroyed RX objects. If traffic continues to arrive (e.g. peer VM is still transmitting) and the VF interface is subsequently brought up (mana_open), the firmware may deliver completions using stale CQ IDs from the old RX objects. These CQ IDs can be reused by the ethernet driver for new TX CQs, causing RX completions to land on TX CQs: WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana] (is_sq == false) WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails) Fix this by disabling vPort RX steering before destroying RX WQ objects. Note that mana_fence_rqs() cannot be used here because the fence completion is delivered on the CQ, which is polled by user-mode (e.g. DPDK) and not visible to the kernel driver. Refactor the disable logic into a shared mana_disable_vport_rx() in mana_en, exported for use by mana_ib, replacing the duplicate code. The ethernet driver's mana_dealloc_queues() is also updated to call this common function. Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Cc: stable@vger.kernel.org Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-29  ipv6: remove ipv6_stub infrastructure completely  Fernando Fernandez Mancera
As IPv6 is built-in only and there are no more users of ipv6_stub, the ipv6_stub is now entirely obsolete. Remove all the code related to the definition, initialization and usage. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Link: https://patch.msgid.link/20260325120928.15848-11-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29  bpf: remove ipv6_bpf_stub completely and use direct function calls  Fernando Fernandez Mancera
As IPv6 is built-in only, the ipv6_bpf_stub can be removed completely. Convert all ipv6_bpf_stub usage to direct function calls instead. The fallback functions introduced previously will prevent linkage errors when CONFIG_IPV6 is disabled. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260325120928.15848-10-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29  net: convert remaining ipv6_stub users to direct function calls  Fernando Fernandez Mancera
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer necessary. Convert remaining ipv6_stub users to make direct function calls. The fallback functions introduced previously will prevent linkage errors when CONFIG_IPV6 is disabled. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Link: https://patch.msgid.link/20260325120928.15848-9-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29  ipv6: prepare headers for ipv6_stub removal  Fernando Fernandez Mancera
In preparation for dropping ipv6_stub and converting its users to direct function calls, introduce static inline dummy functions and fallback macros in the IPv6 networking headers. In addition, introduce checks on fib6_nh_init(), ip6_dst_lookup_flow() and ip6_fragment() to avoid a crash due to ipv6.disable=1 set during booting. The other functions are safe as they cannot be called with ipv6.disable=1 set. These fallbacks ensure that when CONFIG_IPV6 is completely disabled, there are no compiling or linking errors due to code paths not guarded by preprocessor macro IS_ENABLED(CONFIG_IPV6). In addition, export ndisc_send_na(), ip6_route_input() and ip6_fragment(). Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Link: https://patch.msgid.link/20260325120928.15848-6-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
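The "static inline dummy function" approach described in this message can be sketched as below. `DEMO_CONFIG_IPV6`, `demo_ip6_route_lookup()` and `demo_send()` are hypothetical names used only for illustration, and the choice of `-EAFNOSUPPORT` as the fallback return value is likewise an assumption rather than something stated in the commit.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the build-time option; 0 means "disabled". */
#ifndef DEMO_CONFIG_IPV6
#define DEMO_CONFIG_IPV6 0
#endif

struct demo_route {
	int ifindex;
};

#if DEMO_CONFIG_IPV6
/* The real implementation would live in another translation unit. */
int demo_ip6_route_lookup(int ifindex, struct demo_route *out);
#else
/* Fallback: lets callers compile and link without their own guards. */
static inline int demo_ip6_route_lookup(int ifindex, struct demo_route *out)
{
	(void)ifindex;
	(void)out;
	return -EAFNOSUPPORT;
}
#endif

/* The caller contains no preprocessor conditionals at all. */
static inline int demo_send(int ifindex)
{
	struct demo_route rt;
	int err = demo_ip6_route_lookup(ifindex, &rt);

	return err ? err : rt.ifindex;
}
```

Because the fallback has the same signature as the real function, the call site is identical in both configurations, which is what allows an indirection layer to be replaced by direct calls.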
2026-03-29  ipv6: replace IS_BUILTIN(CONFIG_IPV6) with IS_ENABLED(CONFIG_IPV6)  Fernando Fernandez Mancera
As IPv6 is built-in only, it does not make sense to continue using IS_BUILTIN(CONFIG_IPV6). Therefore, replace it with IS_ENABLED() when necessary and drop it if it isn't valid anymore. Note that one instance related to ICMPv6 remains; as it requires more changes, it will be handled separately. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260325120928.15848-4-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29  net: remove EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() macros  Fernando Fernandez Mancera
As IPv6 is built-in only, these macros always evaluate to nothing. Remove them completely from the code. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260325120928.15848-3-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26  tcp: Fix inconsistent indenting warning  Jiayuan Chen
Fix the following warning reported by the kernel test robot: include/net/tcp.h:1449 tcp_ca_event() warn: inconsistent indenting Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603251430.gQ3VuiKV-lkp@intel.com/ Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260325071854.805-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>