| Age | Commit message (Collapse) | Author |
|
Pull bpf fixes from Alexei Starovoitov:
- Fix sk_local_storage diag dump via netlink (Amery Hung)
- Fix off-by-one in arena direct-value access (Junyoung Jang)
- Reject TCP_NODELAY in bpf-tcp congestion control (KaFai Wan)
- Fix type confusion in bpf_*_sock() (Kuniyuki Iwashima)
- Reject TX-only AF_XDP sockets (Linpu Yu)
- Don't run arg-tracking analysis twice on main subprog (Paul Chaignon)
- Fix NULL pointer dereference in bpf_sk_storage_clone and fib lookup
(Weiming Shi)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix off-by-one boundary validation in arena direct-value access
xskmap: reject TX-only AF_XDP sockets
bpf: Don't run arg-tracking analysis twice on main subprog
bpf: Free reuseport cBPF prog after RCU grace period.
bpf: tcp: Fix type confusion in sol_tcp_sockopt().
bpf: tcp: Fix type confusion in bpf_skc_to_tcp6_sock().
bpf: tcp: Fix type confusion in bpf_skc_to_tcp_sock().
mptcp: bpf: Fix type confusion in bpf_mptcp_sock_from_subflow()
selftest: bpf: Add test for bpf_tcp_sock() and RAW socket.
bpf: tcp: Fix type confusion in bpf_tcp_sock().
tools/headers: Regenerate stddef.h to fix BPF selftests
bpf: Fix sk_local_storage diag dumping uninitialized special fields
bpf: Fix NULL pointer dereference in bpf_skb_fib_lookup()
sockmap: Fix sk_psock_drop() race vs sock_map_{unhash,close,destroy}().
bpf: Fix NULL pointer dereference in bpf_sk_storage_clone and diag paths
selftests/bpf: Verify bpf-tcp-cc rejects TCP_NODELAY
selftests/bpf: Test TCP_NODELAY in TCP hdr opt callbacks
bpf: Reject TCP_NODELAY in bpf-tcp-cc
bpf: Reject TCP_NODELAY in TCP header option callbacks
|
|
BPF_MAP_TYPE_ARENA accepts BPF_PSEUDO_MAP_VALUE offsets at exactly
the end of the arena mapping (off == arena_size). The boundary check
in arena_map_direct_value_addr() uses `>` instead of `>=`, which
incorrectly allows a one-past-end pointer to be accepted.
Change the condition to `>=` to correctly reject offsets that fall
outside the valid arena user_vm range.
Fixes: 317460317a02 ("bpf: Introduce bpf_arena.")
Signed-off-by: Junyoung Jang <graypanda.inzag@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260426172505.1947915-1-graypanda.inzag@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
XSKMAP entries are used as redirect targets for incoming XDP frames.
A TX-only AF_XDP socket lacks an Rx ring and cannot handle redirected
traffic, but xsk_map_update_elem() currently allows such sockets to
be inserted into the map.
Redirecting packets to such a socket on the veth generic-XDP path
causes a kernel crash in xsk_generic_rcv().
This became possible after xsk_is_setup_for_bpf_map() was removed from
the XSKMAP update path, which allowed bound TX-only sockets to be
inserted into the map.
Reject TX-only sockets during XSKMAP updates to avoid the crash.
They remain fully operational for pure Tx purposes outside XSKMAP.
Fixes: 968be23ceaca ("xsk: Fix possible segfault at xskmap entry insertion")
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yifan Wu <yifanwucs@gmail.com>
Signed-off-by: Linpu Yu <linpu5433@gmail.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20260508144344.694-1-linpu5433@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Because subprog 0, the main subprog, is considered a global function,
we end up running the arg-tracking dataflow analysis twice on it. That
results in slightly longer verification but mostly in more verbose
verifier logs. This patch fixes it by keeping only the iteration over
global subprogs.
When running over all of Cilium's programs with BPF_LOG_LEVEL2, this
reduces verbosity by ~20% on average.
Fixes: bf0c571f7feb6 ("bpf: introduce forward arg-tracking dataflow analysis")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/e4d7b53d4963ef520541a782f5fc8108a168877c.1778176504.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull fsverity fix from Eric Biggers:
"Fix a regression in overlayfs caused by an fsverity API change"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
ovl: fix verity lazy-load guard broken by fsverity_active() semantic change
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Add 'bindgen' target to make UML 32-bit builds work with GCC
- Disable two Clippy warnings ('collapsible_{if,match}')
'pin-init' crate:
- Fix unsoundness issue that created &'static references"
* tag 'rust-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: allow `clippy::collapsible_if` globally
rust: allow `clippy::collapsible_match` globally
rust: pin-init: fix incorrect accessor reference lifetime
rust: pin-init: internal: move alignment check to `make_field_check`
rust: arch: um: Fix building 32-bit UML with GCC
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- ads7871: Fix endianness bug in 16-bit register reads
- lm75: Fix configuration register writes and AS6200/TMP112 setup and
alarm handling
- lm63: Fix TOCTOU problems
- corsair-psu: Close HID device on probe errors
- ltc2992: Fix overflow and threshold range
- Documentation: fix link to ideapad-laptop.c file
- Remove stale CONFIG_SENSORS_SBRMI Makefile reference
* tag 'hwmon-for-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ads7871) Fix endianness bug in 16-bit register reads
hwmon: (lm75) Fix configuration register writes.
hwmon: (lm75) Fix AS6200 and TMP112 setup and alarm handling
hwmon: (lm63) Add locking to avoid TOCTOU
hwmon: (corsair-psu) Close HID device on probe errors
hwmon: Remove stale CONFIG_SENSORS_SBRMI Makefile reference
Documentation: hwmon: fix link to ideapad-laptop.c file
hwmon: (ltc2992) Fix u32 overflow in power read path
hwmon: (ltc2992) Clamp threshold writes to hardware range
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are two small staging driver fixes for 7.1-rc3. They are:
- vme_user root device leak fix
- NULL dereference bugfix in the rtl8723bs driver
Both of these have been in linux-next all this week with no reported
issues"
* tag 'staging-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723bs: os_dep: avoid NULL pointer dereference in rtw_cbuf_alloc
staging: vme_user: fix root device leak on init failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes for 7.1-rc3 to resolve some
reported issues, and a new device id. These are:
- usblp driver heap leak fixes
- ulpi driver memory leak fix
- typec driver fixes
- dwc3 driver fix
- omap dma driver fix
- new option driver device id addition
All of these have been in linux-next for over a week with no reported
issues"
* tag 'usb-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Telit Cinterion LE910Cx compositions
usb: usblp: fix uninitialized heap leak via LPGETSTATUS ioctl
usb: usblp: fix heap leak in IEEE 1284 device ID via short response
usb: dwc3: Move GUID programming after PHY initialization
usb: typec: tcpm: fix debug accessory mode detection for sink ports
usb: typec: tcpm: reset internal port states on soft reset AMS
usb: ulpi: fix memory leak on ulpi_register() error paths
USB: omap_udc: DMA: Don't enable burst 4 mode
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- sanitize more input parameters in the core (found by syzkaller)
- usual set of driver fixes (proper completion handling, applying
quirks, correct workqueue selection...)
- ID additions to simplify dependency handling
- new email address for Peter Rosin
* tag 'i2c-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: smbus: reject oversized block transfers in the common path
MAINTAINERS: Update mail for Peter Rosin
i2c: stub: Reject I2C block transfers with invalid length
i2c: Compare the return value of gpiod_get_direction against GPIO_LINE_DIRECTION_OUT
i2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl
i2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids
dt-bindings: i2c: apple,i2c: Add t8122 compatible
i2c: stm32f7: reinit_completion() per transfer not per msg
dt-bindings: i2c: amlogic: Add compatible for T7 SOC
i2c: testunit: Replace system_long_wq with system_dfl_long_wq
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix KASAN sanitization flag for core_$(BITS).o
- Fixes for handling offset values in pseries htmdump
- Fix interrupt mask in cpm1_gpiochip_add16()
- ps3/pasemi fixes to drop redundant result assignment
- Fixes in papr-hvpipe code path
- powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events
Thanks to Aboorva Devarajan, Athira Rajeev, Christophe Leroy (CS GROUP),
Geert Uytterhoeven, Haren Myneni, Krzysztof Kozlowski, Mukesh Kumar
Chaurasiya (IBM), Nathan Chancellor, Ritesh Harjani (IBM), Shivani
Nittor, Sourabh Jain, Thomas Zimmermann, and Venkat Rao Bagalkote.
* tag 'powerpc-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits)
powerpc/pasemi: Drop redundant res assignment
powerpc/ps3: Drop redundant result assignment
powerpc/vdso: Drop -DCC_USING_PATCHABLE_FUNCTION_ENTRY from 32-bit flags with clang
arch/powerpc: Drop CONFIG_FIRMWARE_EDID from defconfig files
powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events
powerpc/8xx: Fix interrupt mask in cpm1_gpiochip_add16()
powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec
powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o
pseries/papr-hvpipe: Fix style and checkpatch issues in enable_hvpipe_IRQ()
pseries/papr-hvpipe: Refactor and simplify hvpipe_rtas_recv_msg()
pseries/papr-hvpipe: Kill task_struct pointer from struct hvpipe_source_info
pseries/papr-hvpipe: Simplify spin unlock usage in papr_hvpipe_handle_release()
pseries/papr-hvpipe: Fix the usage of copy_to_user()
pseries/papr-hvpipe: Fix & simplify error handling in papr_hvpipe_init()
pseries/papr-hvpipe: Fix null ptr deref in papr_hvpipe_dev_create_handle()
pseries/papr-hvpipe: Prevent kernel stack memory leak to userspace
pseries/papr-hvpipe: Fix race with interrupt handler
powerpc/pseries/htmdump: Add memory configuration dump support to htmdump module
powerpc/pseries/htmdump: Fix the offset value used in htm status dump
powerpc/pseries/htmdump: Fix the offset value used in processor configuration dump
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix memory map enumeration bug in the Xen e820 parsing code (Juergen
Gross)
- Re-enable e820 BIOS fallback if e820 table is empty (David Gow)
* tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/e820: Re-enable BIOS fallback if e820 table is empty
x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix CPU hotplug activation race in the timer migration code, by
Frederic Weisbecker"
* tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/migration: Fix another hotplug activation race
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix spurious failures in rseq self-tests (Mark Brown)
- Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative
use of the supposedly read-only field
The fix is to introduce a new ABI variant based on a new (larger)
rseq area registration size, to keep the TCMalloc use of rseq
backwards compatible on new kernels (Thomas Gleixner)
- Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
- Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
* tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix wakeup_preempt_fair() for not waking up task
sched/fair: Fix overflow in vruntime_eligible()
selftests/rseq: Expand for optimized RSEQ ABI v2
rseq: Reenable performance optimizations conditionally
rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
selftests/rseq: Validate legacy behavior
selftests/rseq: Make registration flexible for legacy and optimized mode
selftests/rseq: Skip tests if time slice extensions are not available
rseq: Revert to historical performance killing behaviour
rseq: Don't advertise time slice extensions if disabled
rseq: Protect rseq_reset() against interrupts
rseq: Set rseq::cpu_id_start to 0 on unregistration
selftests/rseq: Don't run tests with runner scripts outside of the scripts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events fixes from Ingo Molnar:
- Fix deadlock in the perf_mmap() failure path (Peter Zijlstra)
- Intel ACR (Auto Counter Reload) fixes (Dapeng Mi):
- Fix validation and configuration of ACR masks
- Fix ACR rescheduling bug causing stale masks
- Disable the PMI on ACR-enabled hardware
- Enable ACR on Panther Cover uarch too
* tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Enable auto counter reload for DMR
perf/x86/intel: Disable PMI for self-reloaded ACR events
perf/x86/intel: Always reprogram ACR events to prevent stale masks
perf/x86/intel: Improve validation and configuration of ACR masks
perf/core: Fix deadlock in perf_mmap() failure path
|
|
When the device is removed all allocated resources should be freed.
In uhdlc_memclean the netdev transmit queue was already stopped. But at
this point we may have pending skb in the transmit queue which must be
freed. Therefore iterate over the tx_skbuff pointers and free all
pending skb. The issue was discovered by sashiko.
Tested on a ls1043a board running HDLC in bus mode on kernel 6.12.
https: //sashiko.dev/#/patchset/20260429114208.941011-1-holger.brunck%40hitachienergy.com
Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Link: https://patch.msgid.link/20260507155332.3452319-1-holger.brunck@hitachienergy.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
1) Allow initial x_tables table replacement without emitting an audit
log message. Delay the register message until after hooks are wired up
to avoid unnecessary unregister logs during error unwinding.
2) Fix a NULL dereference by allocating hook ops before adding the
table to the per-netns list. Use `synchronize_rcu()` during error
unwinding to ensure the table stops processing packets before
teardown. Defer audit log register message until all operations
succeed.
3) Refactor xtables to use a single `xt_unregister_table_pre_exit`
function. Eliminate code duplication by centralizing table
unregistration logic within the xtables core. ebtables cannot be
changed due to incompatibility.
4) Unregister xtables templates before module removal. This prevents
a race condition where userspace instantiates a new table after the
pernet unreg removed the current table.
5) Add `xtables_unregister_table_exit` to fully unregister netfilter
tables during module removal. Unlink the table from dying lists,
then free hook operations.
6) Implement a two-stage removal scheme for ebtables following the
x_tables pattern. Assign table->ops while holding the ebt mutex to
prevent exposing partially-filled structures.
7) Fix ebtables module initialization race. Register the template last
in table initialization functions. Prevent table instantiation before
pernet operations are available.
8) Fix a race condition in x_tables module initialization. Ensure
pernet ops are fully set up before exposing the table to userspace.
9) Fix a race condition in ebtables module initialization, similar to
previous patch.
10) Restore propagation of helper to expected connection, this is a
fix-for-recent-fix.
11) Validate that the expectation tuple and mask netlink attributes are
present when adding expectation via nfqueue, this fixes a possible
null-ptr-deref.
12) Fix possible rare memleak in the SIP helper in case helper has been
detached from conntrack entry, from Li Xiasong.
13) Fix refcount leak in nft_ct when creating custom expectation, also
from Li Xiason.
Patches 1-9 from Florian Westphal.
10) Restore propagation of helper to expected connection, this is a
fix-for-recent-fix.
11) Check that tuple and mask netlink attributes are set when creating an
expectation via nfqueue.
* tag 'nf-26-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_ct: fix missing expect put in obj eval
netfilter: nf_conntrack_sip: get helper before allocating expectation
netfilter: ctnetlink: check tuple and mask in expectations created via nfqueue
netfilter: nf_conntrack_expect: restore helper propagation via expectation
netfilter: bridge: eb_tables: close module init race
netfilter: x_tables: close dangling table module init race
netfilter: ebtables: close dangling table module init race
netfilter: ebtables: move to two-stage removal scheme
netfilter: x_tables: add and use xtables_unregister_table_exit
netfilter: x_tables: unregister the templates first
netfilter: x_tables: add and use xt_unregister_table_pre_exit
netfilter: x_tables: allocate hook ops while under mutex
netfilter: x_tables: allow initial table replace without emitting audit log message
====================
Link: https://patch.msgid.link/20260507234509.603182-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SCTP_SENDALL path in sctp_sendmsg() iterates ep->asocs with
list_for_each_entry_safe(), which caches the next entry in @tmp before
the loop body runs. The body calls sctp_sendmsg_to_asoc(), which may
drop the socket lock inside sctp_wait_for_sndbuf().
While the lock is dropped, another thread can SCTP_SOCKOPT_PEELOFF the
association cached in @tmp, migrating it to a new endpoint via
sctp_sock_migrate() (list_del_init() + list_add_tail() to
newep->asocs), and optionally close the new socket which frees the
association via kfree_rcu(). The cached @tmp can also be freed by a
network ABORT for that association, processed in softirq while the
lock is dropped.
sctp_wait_for_sndbuf() revalidates @asoc (the current entry) on re-lock
via the "sk != asoc->base.sk" and "asoc->base.dead" checks, but nothing
revalidates @tmp. After a successful return, the iterator advances to
the stale @tmp, yielding either a use-after-free (if the peeled socket
was closed) or a list-walk onto the new endpoint's list head (type
confusion of &newep->asocs as a struct sctp_association *).
Both are reachable from CapEff=0; the type-confusion path gives
controlled indirect call via the outqueue.sched->init_sid pointer.
Fix by re-deriving @tmp from @asoc after sctp_sendmsg_to_asoc()
returns. @asoc is known to still be on ep->asocs at that point: the
only callers that list_del an association from ep->asocs are
sctp_association_free() (which sets asoc->base.dead) and
sctp_assoc_migrate() (which changes asoc->base.sk), and
sctp_wait_for_sndbuf() checks both under the lock before any
successful return; a tripped check propagates as err < 0 and the loop
bails before the re-derive.
The SCTP_ABORT path in sctp_sendmsg_check_sflags() returns 0 and the
loop hits 'continue' before sctp_sendmsg_to_asoc() is ever called, so
the @tmp cached by list_for_each_entry_safe() still covers the
lock-held free that ba59fb027307 ("sctp: walk the list of asoc
safely") was added for.
Fixes: 4910280503f3 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
Cc: stable@vger.kernel.org
Signed-off-by: Ben Morris <bmorris@anthropic.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20260508001455.3137-1-joycathacker@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The error path on of_property_read_u32() failure inside
icssm_prueth_probe() returns without putting eth_ports_node,
which was acquired before the for_each_child_of_node() loop.
Drop it before returning.
Fixes: 511f6c1ae093 ("net: ti: icssm-prueth: Adds ICSSM Ethernet driver")
Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com>
Link: https://patch.msgid.link/20260506195813.641610-1-shitalkumar.gandhi@cambiumnetworks.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
lan966x_probe_port() stores the newly allocated net_device in the
port before calling register_netdev(). If register_netdev() fails,
the probe error path calls lan966x_cleanup_ports(), which sees
port->dev and calls unregister_netdev() for a device that was never
registered.
Destroy the phylink instance created for this port and clear port->dev
before returning the registration error. The common cleanup path now skips
ports without port->dev before reaching the registered netdev cleanup, so
it only handles ports that reached the registered-netdev lifetime.
This also avoids treating an uninitialized FDMA netdev and the failed port
as a NULL == NULL match in the common cleanup path.
Fixes: d28d6d2e37d1 ("net: lan966x: add port module support")
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Link: https://patch.msgid.link/20260506124331.31945-1-mhun512@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
- ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state rather
than the tracer's
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:
- Don't fallback to bus reset after failed slot reset; a bus reset
isn't safe if the .reset_slot() callback is implemented (Keith Busch)
- Update saved_config_space upon resource assignment to fix passthrough
regressions when x86 pcibios_assign_resources() updates BARs (Lukas
Wunner)
- Initialize a temporary pci_dev->dev in sysfs 'new_id' attribute to
fix a lockdep regression after driver_override was moved from PCI to
device core (Samiullah Khawaja)
- Update MAINTAINERS email addresses (Marek Vasut, Hans Zhang)
- Add MAINTAINERS reviewer for PCIe Cadence IP (Aksh Garg)
* tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer
MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1
MAINTAINERS: Update Marek Vasut email for PCIe R-Car
PCI: Initialize temporary device in new_id_store()
PCI: Update saved_config_space upon resource assignment
PCI: Don't fallback to bus reset after failed slot reset
|
|
Jacob Keller says:
====================
Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf)
Matt Volrath fixes two issues with the i40e driver probe routine, ensuring
that PTP is properly cleaned up if the probe fails.
Emil corrects the initialization of the read_dev_clk_lock spinlock in
idpf_ptp_init, ensuring it is initialized prior to when the
ptp_schedule_worker() is called.
Greg KH fixes a double free and use-after free in the idpf auxiliary device
error paths.
Marcin fixes ice_set_rss_hfunc() to use the correct q_opt_flags field,
correcting the assignment and preventing submission of invalid data to the
firmware.
Bart corrects the locking in ice_dcb_rebuild(), ensuring that the tc_mutex
is held over the entire operation.
Ivan fixes the rclk pin state get for E810 devices, ensuring the index is
properly offset by the base_rclk_idx value. This ensures that the correct
pin index is used to look up recovered clock state. He additionally adds
bounds checking to prevent attempting to access pins outside of the pin
state array.
Ivan also moves the CGU register macros to the top of ice_dpll.h, inside
the header guard to avoid duplicate macro definitions should the ice_dpll.h
header is included multiple times.
====================
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-0-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The CGU register definitions (ICE_CGU_R10, ICE_CGU_R11 and related field
masks) were placed after the #endif of the _ICE_DPLL_H_ include guard,
leaving them unprotected. Move them inside the guard.
Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-8-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The refactoring of ice_dpll_rclk_state_on_pin_get() to use
ice_dpll_pin_get_parent_idx() omitted the base_rclk_idx adjustment that was
correctly added in the ice_dpll_rclk_state_on_pin_set() path. This breaks
E810 devices where base_rclk_idx is non-zero, causing the wrong hardware
index to be used for pin state lookup and incorrect recovered clock state
to be reported via the DPLL subsystem. E825C is unaffected as its
base_rclk_idx is 0.
While at it, add bounds check against ICE_DPLL_RCLK_NUM_MAX on hw_idx after
the base_rclk_idx subtraction in both ice_dpll_rclk_state_on_pin_{get,set}()
to prevent out-of-bounds access on the pin state array.
Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-7-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the mutex_lock() call up to prevent that DCB settings change after
the first ice_query_port_ets() call. The second ice_query_port_ets()
call in ice_dcb_rebuild() is already protected by pf->tc_mutex.
This also fixes a bug in an error path, as before taking the first
"goto dcb_error" in the function jumped over mutex_lock() to
mutex_unlock().
This bug has been detected by the clang thread-safety analyzer.
Cc: intel-wired-lan@lists.osuosl.org
Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-6-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
function, leaving other VSI options unchanged. However, ::q_opt_flags is
mistakenly set to the value of another field, instead of its original
value, probably due to a typo. What happens next is hardware-dependent:
On E810, only the first bit is meaningful (see
ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
state than before VSI update.
On E830, some of the remaining bits are not reserved. Setting them
to some unrelated values can cause the firmware to reject the update
because of invalid settings, or worse - succeed.
Reproducer:
sudo ethtool -X $PF1 equal 8
Output in dmesg:
Failed to configure RSS hash for VSI 6, error -5
Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-5-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When auxiliary_device_add() fails in idpf_plug_vport_aux_dev() or
idpf_plug_core_aux_dev(), the err_aux_dev_add label calls
auxiliary_device_uninit() and falls through to err_aux_dev_init. The
uninit call will trigger put_device(), which invokes the release
callback (idpf_vport_adev_release / idpf_core_adev_release) that frees
iadev. The fall-through then reads adev->id from the freed iadev for
ida_free() and double-frees iadev with kfree().
Free the IDA slot and clear the back-pointer before uninit, while adev
is still valid, then return immediately.
Commit 65637c3a1811 ("idpf: fix UAF in RDMA core aux dev deinitialization")
fixed the same use-after-free in the matching unplug path in this file but
missed both probe error paths.
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: stable@kernel.org
Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy")
Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-4-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In idpf_ptp_init(), read_dev_clk_lock is initialized after
ptp_schedule_worker() had already been called (and after
idpf_ptp_settime64() could reach the lock). The PTP aux worker
fires immediately upon scheduling and can call into
idpf_ptp_read_src_clk_reg_direct(), which takes
spin_lock(&ptp->read_dev_clk_lock) on an uninitialized lock, triggering
the lockdep "non-static key" warning:
[12973.796587] idpf 0000:83:00.0: Device HW Reset initiated
[12974.094507] INFO: trying to register non-static key.
...
[12974.097208] Call Trace:
[12974.097213] <TASK>
[12974.097218] dump_stack_lvl+0x93/0xe0
[12974.097234] register_lock_class+0x4c4/0x4e0
[12974.097249] ? __lock_acquire+0x427/0x2290
[12974.097259] __lock_acquire+0x98/0x2290
[12974.097272] lock_acquire+0xc6/0x310
[12974.097281] ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097311] ? lockdep_hardirqs_on_prepare+0xde/0x190
[12974.097318] ? finish_task_switch.isra.0+0xd2/0x350
[12974.097330] ? __pfx_ptp_aux_kworker+0x10/0x10 [ptp]
[12974.097343] _raw_spin_lock+0x30/0x40
[12974.097353] ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097373] idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097391] ? kthread_worker_fn+0x88/0x3d0
[12974.097404] ? kthread_worker_fn+0x4e/0x3d0
[12974.097411] idpf_ptp_update_cached_phctime+0x26/0x120 [idpf]
[12974.097428] ? _raw_spin_unlock_irq+0x28/0x50
[12974.097436] idpf_ptp_do_aux_work+0x15/0x20 [idpf]
[12974.097454] ptp_aux_kworker+0x20/0x40 [ptp]
[12974.097464] kthread_worker_fn+0xd5/0x3d0
[12974.097474] ? __pfx_kthread_worker_fn+0x10/0x10
[12974.097482] kthread+0xf4/0x130
[12974.097489] ? __pfx_kthread+0x10/0x10
[12974.097498] ret_from_fork+0x32c/0x410
[12974.097512] ? __pfx_kthread+0x10/0x10
[12974.097519] ret_from_fork_asm+0x1a/0x30
[12974.097540] </TASK>
Move the call to spin_lock_init() up a bit to make sure read_dev_clk_lock
is not touched before it's been initialized.
Fixes: 5cb8805d2366 ("idpf: negotiate PTP capabilities and get PTP clock")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-3-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PTP pin structs are allocated early in probe, but never cleaned up.
Fix this by calling i40e_ptp_free_pins in the error path.
To support this, i40e_ptp_free_pins is added to the header and
pin_config is correctly nullified after being freed.
This has been an issue since i40e_ptp_alloc_pins was introduced.
Fixes: 1050713026a08 ("i40e: add support for PTP external synchronization clock")
Reported-by: Kohei Enju <kohei@enjuk.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Kohei Enju <kohei@enjuk.jp>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-2-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix two conditions which would leak PTP registration on probe failure:
1. i40e_setup_pf_switch can encounter an error in
i40e_setup_pf_filter_control, call i40e_ptp_init, then return
non-zero, sending i40e_probe to err_vsis.
2. i40e_setup_misc_vector can return non-zero, sending i40e_probe to
err_vsis.
Both of these conditions have been present since PTP was introduced in
this driver.
Found with coccinelle.
Fixes: beb0dff1251db ("i40e: enable PTP")
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-1-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When an existing node-scope shaper is moved to a different parent
via the group operation, the framework fails to update the leaves
count on both the old and new parent shapers. Only newly created
nodes (handle.id == NET_SHAPER_ID_UNSPEC) trigger the parent
leaves increment at line 1039.
This causes the parent's leaves counter to diverge from the
actual number of children in the xarray. When the node is later
deleted, pre_del_node() allocates an array sized by the stale
leaves count, but the xarray iteration finds more children than
expected, hitting the WARN_ON_ONCE guard and returning -EINVAL.
Rather than adding reparenting support with complex leaves count
bookkeeping, reject group calls that attempt to change an existing
node's parent. Updates to an existing node's rate or leaves under
the same parent remain permitted. We expect that for any modification
of the topology user should always create new groups and let the
kernel garbage collect the leaf-less nodes.
Fixes: 5d5d4700e75d ("net-shapers: implement NL group operation")
Signed-off-by: Mohsin Bashir <hmohsin@meta.com>
Link: https://patch.msgid.link/20260506233745.111895-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These methods generally consume ownership of the provided skb, so even
if an error path is encountered, the skb is freed. This is because the
very first thing they do after some initial setup is to unconditionally
consume the skb via consume_skb(skb). Any subsequent errors lead to the
core netlink layer freeing the skb.
However, there is one check that occurs before ownership is passed,
which is the check for the group index. So if this error condition is
encountered, then the skb is leaked. This error condition is generally
considered a violation of the netlink API, so it's not expected to occur
under normal circumstances. For the same reason, no callers check for
this error condition, and no callers need to be adjusted. However, we
should still follow the same ownership semantics of the rest of the
function. Thus, free the skb in this codepath.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Matthew Maurer <mmaurer@google.com>
Fixes: 2a94fe48f32c ("genetlink: make multicast groups const, prevent abuse")
Link: https://lore.kernel.org/r/845b36ba-7b3a-41f2-acb2-b284f253e2ca@lunn.ch
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260506-genlmsg-return-v2-1-a63ee2a055d6@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
NSH header length is a 6-bit field that encodes the total length of
the header in 4-byte words. So the maximum length is 0b111111 * 4,
which is 252 and not 256. The maximum context length is the same
number minus the length of the base header (8), so 244.
These macros are used to validate push_nsh() action in openvswitch.
Miscalculation here doesn't cause any real issues. In the worst case
the oversized context is truncated while building the header, so we'll
construct and send a broken packet, which is not a big problem, as any
receiver should validate the fields. No invalid memory accesses will
happen during the header push. But we should fix the macros to reject
the incorrect actions in the first place.
Using previously defined values and calculating the length instead
of defining numbers directly, so it's easier to understand where they
come from and harder to make a mistake.
Fixes: 1f0b7744c505 ("net: add NSH header structures and helpers")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260507120434.2962505-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In phy_prepare_data(), several strings such as 'name', 'drvname',
'upstream_sfp_name', and 'downstream_sfp_name' are allocated using
kstrdup(). However, these allocations were not checked for failure.
If kstrdup() fails for 'name', it returns NULL while the function
continues. This leads to a kernel NULL pointer dereference and panic
later in phy_reply_size() when it unconditionally calls strlen() on
the NULL pointer.
While other strings like 'upstream_sfp_name' might be checked before
access in certain code paths, failing to handle these allocations
consistently can lead to incomplete data reporting or hidden bugs.
Fix this by adding proper NULL checks for all kstrdup() calls in
phy_prepare_data() and implement a centralized error handling path
using goto labels to ensure all previously allocated resources are
freed on failure.
Fixes: 9dd2ad5e92b9 ("net: ethtool: phy: Convert the PHY_GET command to generic phy dump")
Signed-off-by: Quan Sun <2022090917019@std.uestc.edu.cn>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260507131738.1173835-1-2022090917019@std.uestc.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I would like to hand over the macb maintenance to Théo, as I'm unable to
keep up with the recent flow of patches for this driver. After speaking
with Claudiu, he indicated that he is in the same position as me.
To help with this work, Conor has agreed to act as a reviewer.
I was given responsibility for this driver years ago, and I'm glad to
see it continue with talented developers.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260507120444.9733-1-nicolas.ferre@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When in irq deferral mode (defer-hard-irqs > 0), a short enough
gro-flush timeout can trigger before NAPI_STATE_SCHED is cleared if the
last poll in busy_poll_stop() takes too long. This can have the effect
of leaving the queue stuck with interrupts disabled and no timer armed
which results in a tx timeout if there is no subsequent busypoll cycle.
To prevent this, defer the gro-flush timer arm after the last poll.
Fixes: 7fd3253a7de6 ("net: Introduce preferred busy-polling")
Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260506090808.820559-2-dtatulea@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Maoyi Xie says:
====================
ipv6: flowlabel: per-netns budget for unprivileged callers
From: Maoyi Xie <maoyi.xie@ntu.edu.sg>
This series fixes the cross-tenant DoS in net/ipv6/ip6_flowlabel.c.
v1 through v6 were single-patch postings, each in its own thread.
v6 review pointed out that the existing fl_size read in
mem_check() and the corresponding write in fl_intern() are not in
the same critical section. v7 split the work into 2 patches.
Patch 1/2 is a prerequisite. It moves spin_lock_bh(&ip6_fl_lock)
and the matching unlock from fl_intern() into its only caller
ipv6_flowlabel_get(), so the mem_check() call runs under the same
critical section as the fl_intern() insert. With all writers and
the read of fl_size under the lock, fl_size is converted from
atomic_t to plain int. This is independent of the per-netns
budget. It also makes 2/2 backportable without conflicts.
Patch 2/2 is the v6 patch, rebased on 1/2.
- flowlabel_count is plain int rather than atomic_t, since the
previous patch put all writers and readers under ip6_fl_lock.
- In ip6_fl_gc(), fl_free() is now placed below the fl_size
and flowlabel_count decrements, removing the v6 cache of
fl->fl_net.
- In ip6_fl_purge(), fl_free() stays in its original position.
The function argument net is used for flowlabel_count.
- mem_check() uses spaces around the / operator on all four
expressions, addressing the checkpatch note in v6 review.
Numeric budget (preserved from v6):
pre-patch:
global non-CAP_NET_ADMIN budget = FL_MAX_SIZE - FL_MAX_SIZE/4
= 4096 - 1024 = 3072
per-actor reach = 3072
post-patch:
FL_MAX_SIZE doubled to 8192
global non-CAP_NET_ADMIN budget = 8192 - 2048 = 6144
per-netns ceiling = 6144 / 2 = 3072
per-actor reach = 3072 (preserved)
CAP_NET_ADMIN against init_user_ns still bypasses both caps.
Reproducer (KASAN VM, 4 cores, qemu): unprivileged netns A holds
3072 flowlabels via 100 procs. Fresh unprivileged netns B then
allocates 32 flowlabels (the FL_MAX_PER_SOCK ceiling for one
socket), the same as a clean baseline. Without the per-netns
ceiling, netns A could push fl_size past FL_MAX_SIZE - FL_MAX_SIZE
/ 4 and netns B would see allocations denied.
====================
Link: https://patch.msgid.link/20260506082416.2259567-1-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fl_size, fl_ht and ip6_fl_lock in net/ipv6/ip6_flowlabel.c are
file scope and shared across netns. mem_check() reads fl_size to
decide whether to deny non-CAP_NET_ADMIN callers. capable() runs
against init_user_ns, so an unprivileged user in any non-init
userns can push fl_size past FL_MAX_SIZE - FL_MAX_SIZE / 4 and
starve every other unprivileged userns on the host.
Add struct netns_ipv6::flowlabel_count, bumped and decremented
next to fl_size in fl_intern, ip6_fl_gc and ip6_fl_purge. The new
field fills the existing 4-byte hole after ipmr_seq, so struct
netns_ipv6 stays the same size on 64-bit builds.
Bump FL_MAX_SIZE from 4096 to 8192. It has been 4096 since the
file was added. Machines and connection counts have grown.
mem_check() folds an extra per-netns ceiling into the existing
non-CAP_NET_ADMIN conditional. The ceiling is half of the total
budget that unprivileged callers have ever been able to use, i.e.
(FL_MAX_SIZE - FL_MAX_SIZE / 4) / 2 = 3072 entries. With
FL_MAX_SIZE doubled, this preserves the original per-user reach
of 3K (what an unprivileged caller could already obtain before
this change), while forcing an attacker to spread allocations
across at least two netns to exhaust the global non-CAP_NET_ADMIN
budget.
CAP_NET_ADMIN against init_user_ns still bypasses both caps.
The previous patch took ip6_fl_lock across mem_check and
fl_intern, so the new flowlabel_count read in mem_check and the
new flowlabel_count++ in fl_intern run under the same critical
section. flowlabel_count is therefore plain int, like fl_size.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Link: https://patch.msgid.link/20260506082416.2259567-3-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mem_check() in net/ipv6/ip6_flowlabel.c reads fl_size without
holding ip6_fl_lock. fl_intern() takes the lock immediately
afterwards. The two checks therefore race against concurrent
fl_intern, ip6_fl_gc and ip6_fl_purge writers, which makes the
mem_check budget check approximate.
Move spin_lock_bh(&ip6_fl_lock) and the matching unlock from
fl_intern() into its only caller ipv6_flowlabel_get(). The
mem_check() call now runs under the same critical section as the
fl_intern() insert, so the budget check is exact.
With all writers and the read of fl_size under ip6_fl_lock,
convert fl_size from atomic_t to plain int. The four sites that
update or read fl_size are fl_intern (insert path), ip6_fl_gc
(garbage collector, the !sched check and the per-entry decrement),
ip6_fl_purge (per-netns purge), and mem_check (budget check), and
all four now run under ip6_fl_lock.
This is a prerequisite for adding a per-netns budget alongside
fl_size. The follow-up patch adds netns_ipv6::flowlabel_count and
folds it into mem_check().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Link: https://patch.msgid.link/20260506082416.2259567-2-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It appears there's a need for a maintainer for the 3Com EtherLink III
family of Ethernet network adapters. There is documentation available
and the driver is very mature so the task ought to be of little hassle,
so I think I should be able to squeeze in any issues to be addressed.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/alpine.DEB.2.21.2604271056460.28583@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kuniyuki Iwashima says:
====================
tcp: Two fixes for socket migration in reqsk_timer_handler().
The series fixes two bugs in the error path of socket migration
in reqsk_timer_handler().
Patch 1 fixes a potential UAF in reqsk_timer_handler().
Patch 2 fixes imbalanced icsk_accept_queue count.
====================
Link: https://patch.msgid.link/20260506035954.1563147-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When TCP socket migration happens in reqsk_timer_handler(),
@sk_listener will be updated with the new listener.
When we call __inet_csk_reqsk_queue_drop(), the listener must
be the one stored in req->rsk_listener.
The cited commit accidentally replaced oreq->rsk_listener with
sk_listener, leading to imbalanced icsk_accept_queue count.
Let's pass the correct listener to __inet_csk_reqsk_queue_drop().
Fixes: e8c526f2bdf1 ("tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260506035954.1563147-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When TCP socket migration fails at inet_ehash_insert() in
reqsk_timer_handler(), we jump to the no_ownership: label
and free the new reqsk immediately with __reqsk_free().
Thus, we must stop the new reqsk's timer before jumping to the
label, but the timer might be missed since the cited commit,
resulting in UAF.
As we are in the original reqsk's timer context, we can safely
call timer_delete_sync() for the new reqsk.
Let's pass false to __inet_csk_reqsk_queue_drop() to stop
the new reqsk's timer.
Fixes: 83fccfc3940c ("inet: fix potential deadlock in reqsk_queue_unlink()")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260506035954.1563147-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I wish to contribute to the review process for Cadence PCIe IP drivers,
hence add myself as a reviewer.
Signed-off-by: Aksh Garg <a-garg7@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260508060951.840233-1-a-garg7@ti.com
|
|
Update my email address as my work email account is no longer in use.
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260508023006.1787674-1-18255117159@163.com
|
|
Use up to date address. No functional change.
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260428052030.51101-1-marek.vasut+renesas@mailbox.org
|
|
When setting new_id of a PCI device driver using sysfs a lockdep splat
occurs. This is because new_id_store() builds a temporary pci_dev for
pci_match_device(), which calls device_match_driver_override(). That
depends on the driver_override.lock added by cb3d1049f4ea ("driver core:
generalize driver_override in struct device").
The new driver_override.lock was not initialized in the temporary pci_dev,
resulting in this lockdep splat.
Initialize the temporary pci_dev to fix this.
Repro:
Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:
# echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
register_lock_class+0x77e/0x790
lock_acquire+0xbf/0x2e0
pci_match_device+0x24/0x180
new_id_store+0x189/0x1d0
kernfs_fop_write_iter+0x14f/0x210
vfs_write+0x263/0x5e0
ksys_write+0x79/0xf0
do_syscall_64+0x117/0xf80
Fixes: 10a4206a2401 ("PCI: use generic driver_override infrastructure")
Fixes: 8895d3bcb8ba ("PCI: Fail new_id for vendor/device values already built into driver")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
[bhelgaas: add commit log details and repro, trim backtrace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
|
|
Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB
adapter plugged into an ASRock X570S PG Riptide board with BIOS version
P5.41 (09/07/2023):
ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter
ddbridge 0000:05:00.0: cannot read registers
ddbridge 0000:05:00.0: fail
BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the
upstream bridge window. The kernel corrects the BAR assignment:
pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window
pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned
Correction of the BAR assignment happens in an x86-specific fs_initcall,
pcibios_assign_resources(), after device enumeration in a subsys_initcall.
This order was introduced at the behest of Linus in 2004:
https://git.kernel.org/tglx/history/c/a06a30144bbc
No other architecture performs such a late BAR correction.
Bernd bisected the issue to commit a2f1e22390ac ("PCI/ERR: Ensure error
recoverability at all times"), but it only occurs in the absence of commit
4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing").
This combination exists in stable kernel v6.12.70, but not in mainline,
hence Bernd cannot reproduce the issue with mainline.
Since a2f1e22390ac, config space is saved on enumeration, prior to BAR
correction. Upon passthrough, the corrected BAR is overwritten with the
incorrect saved value by:
vfio_pci_core_register_device()
vfio_pci_set_power_state()
pci_restore_state()
But only if the device's current_state is PCI_UNKNOWN, as it was prior to
commit 4d4c10f763d7. Since the commit, it is PCI_D0, which changes the
behavior of vfio_pci_set_power_state() to no longer restore the state
without saving it first.
Alexandre is reporting the same issue as Bernd, but in his case, mainline
is affected as well. The difference is that on Alexandre's system, the
host kernel binds a driver to the device which is unbound prior to
passthrough, whereas on Bernd's system no driver gets bound by the host
kernel.
Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when
vfio-pci is subsequently bound to the device, pci_restore_state() is once
again called without invoking pci_save_state() first.
To robustly fix the issue, always update saved_config_space upon resource
assignment.
Reported-by: Bernd Schumacher <bernd@bschu.de>
Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/
Reported-by: Alexandre N. <an.tech@mailo.com>
Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/
Fixes: a2f1e22390ac ("PCI/ERR: Ensure error recoverability at all times")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Bernd Schumacher <bernd@bschu.de>
Tested-by: Alexandre N. <an.tech@mailo.com>
Cc: stable@vger.kernel.org # v6.12+
Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de
|
|
Eulgyu Kim reported the splat below with a repro. [0]
The repro sets up a UDP reuseport group with a cBPF prog and
replaces it with a new one while another thread is sending
a UDP packet to the group.
The reuseport prog is freed by sk_reuseport_prog_free().
bpf_prog_put() is called for "e"BPF prog to destruct through
multiple stages while cBPF prog is freed immediately by
bpf_release_orig_filter() and bpf_prog_free().
If a reuseport prog is detached from the setsockopt() path
(reuseport_attach_prog() or reuseport_detach_prog()),
sk_reuseport_prog_free() is called without waiting for RCU
readers to complete, resulting in various bugs.
Let's defer freeing the reuseport cBPF prog after one RCU
grace period.
Note "e"BPF prog is safe as is unless the fast path starts
to touch fields destroyed in bpf_prog_put_deferred() and
__bpf_prog_put_noref().
[0]:
BUG: KASAN: vmalloc-out-of-bounds in reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
Read of size 4 at addr ffffc9000051e004 by task slowme/10208
CPU: 6 UID: 1000 PID: 10208 Comm: slowme Not tainted 7.0.0-geb7ac95ff75e #32 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
udp4_lib_lookup2+0x3bc/0x950 net/ipv4/udp.c:495
__udp4_lib_lookup+0x768/0xe20 net/ipv4/udp.c:723
__udp4_lib_lookup_skb+0x297/0x390 net/ipv4/udp.c:752
__udp4_lib_rcv+0x1312/0x2620 net/ipv4/udp.c:2752
ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207
ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241
NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
__netif_receive_skb_one_core net/core/dev.c:6181 [inline]
__netif_receive_skb net/core/dev.c:6294 [inline]
process_backlog+0xaa4/0x1960 net/core/dev.c:6645
__napi_poll+0xae/0x340 net/core/dev.c:7709
napi_poll net/core/dev.c:7772 [inline]
net_rx_action+0x5d7/0xf50 net/core/dev.c:7929
handle_softirqs+0x22b/0x870 kernel/softirq.c:622
do_softirq+0x76/0xd0 kernel/softirq.c:523
</IRQ>
<TASK>
__local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:924 [inline]
__dev_queue_xmit+0x1dd7/0x3710 net/core/dev.c:4890
neigh_output include/net/neighbour.h:556 [inline]
ip_finish_output2+0xca9/0x1070 net/ipv4/ip_output.c:237
NF_HOOK_COND include/linux/netfilter.h:307 [inline]
ip_output+0x29f/0x450 net/ipv4/ip_output.c:438
ip_send_skb+0x45/0xc0 net/ipv4/ip_output.c:1508
udp_send_skb+0xb04/0x1510 net/ipv4/udp.c:1195
udp_sendmsg+0x1a71/0x2350 net/ipv4/udp.c:1485
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
__sys_sendto+0x554/0x680 net/socket.c:2206
__do_sys_sendto net/socket.c:2213 [inline]
__se_sys_sendto net/socket.c:2209 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2209
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x160/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x415a2d
Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6bc31e41e8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f6bc31e4cdc RCX: 0000000000415a2d
RDX: 0000000000000001 RSI: 00007f6bc31e421f RDI: 0000000000000003
RBP: 00007f6bc31e4240 R08: 00007f6bc31e4220 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000212 R12: 00007f6bc31e46c0
R13: ffffffffffffffb8 R14: 0000000000000000 R15: 00007ffc9b0d70b0
</TASK>
Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Reported-by: Eulgyu Kim <eulgyukim@snu.ac.kr>
Reported-by: Taeyang Lee <0wn@theori.io>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20260426012647.3233119-1-kuniyu@google.com
|