| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix spurious failures in rseq self-tests (Mark Brown)
- Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative
use of the supposedly read-only field
The fix is to introduce a new ABI variant based on a new (larger)
rseq area registration size, to keep the TCMalloc use of rseq
backwards compatible on new kernels (Thomas Gleixner)
- Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
- Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
* tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix wakeup_preempt_fair() for not waking up task
sched/fair: Fix overflow in vruntime_eligible()
selftests/rseq: Expand for optimized RSEQ ABI v2
rseq: Reenable performance optimizations conditionally
rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
selftests/rseq: Validate legacy behavior
selftests/rseq: Make registration flexible for legacy and optimized mode
selftests/rseq: Skip tests if time slice extensions are not available
rseq: Revert to historical performance killing behaviour
rseq: Don't advertise time slice extensions if disabled
rseq: Protect rseq_reset() against interrupts
rseq: Set rseq::cpu_id_start to 0 on unregistration
selftests/rseq: Don't run tests with runner scripts outside of the scripts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter, IPsec, Bluetooth and WiFi.
Current release - fix to a fix:
- ipmr: add __rcu to netns_ipv4.mrt, make sure we hold the RCU lock
in all relevant places
Current release - new code bugs:
- fixes for the recently added resizable hash tables
- ipv6: make sure we default IPv6 tunnel drivers to =m now that IPv6
itself is built in
- drv: octeontx2-af: fixes for parser/CAM fixes
Previous releases - regressions:
- phy: micrel: fix LAN8814 QSGMII soft reset
- wifi:
- cw1200: revert "Fix locking in error paths"
- ath12k: fix crash on WCN7850, due to adding the same queue
buffer to a list multiple times
Previous releases - always broken:
- number of info leak fixes
- ipv6: implement limits on extension header parsing
- wifi: number of fixes for missing bound checks in the drivers
- Bluetooth: fixes for races and locking issues
- af_unix:
- fix an issue between garbage collection and PEEK
- fix yet another issue with OOB data
- xfrm: esp: avoid in-place decrypt on shared skb frags
- netfilter: replace skb_try_make_writable() by skb_ensure_writable()
- openvswitch: vport: fix race between tunnel creation and linking
leading to invalid memory accesses (type confusion)
- drv: amd-xgbe: fix PTP addend overflow causing frozen clock
Misc:
- sched/isolation: make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
(for relevant IPVS change)"
* tag 'net-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (190 commits)
net: sparx5: configure serdes for 1000BASE-X in sparx5_port_init()
net: sparx5: fix wrong chip ids for TSN SKUs
net: stmmac: dwmac-nuvoton: fix NULL pointer dereference in nvt_set_phy_intf_sel()
tcp: Fix dst leak in tcp_v6_connect().
ipmr: Call ipmr_fib_lookup() under RCU.
net: phy: broadcom: Save PHY counters during suspend
net/smc: fix missing sk_err when TCP handshake fails
af_unix: Reject SIOCATMARK on non-stream sockets
veth: fix OOB txq access in veth_poll() with asymmetric queue counts
eth: fbnic: fix double-free of PCS on phylink creation failure
net: ethernet: cortina: Drop half-assembled SKB
selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
selftests: mptcp: check output: catch cmd errors
mptcp: pm: prio: skip closed subflows
mptcp: pm: ADD_ADDR rtx: return early if no retrans
mptcp: pm: ADD_ADDR rtx: skip inactive subflows
mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
mptcp: pm: ADD_ADDR rtx: free sk if last
mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
mptcp: pm: ADD_ADDR rtx: fix potential data-race
...
|
|
Pull smb server fixes from Steve French:
- Fix memory leak in connection free
- Fix inherited ACL ACE validation
- Minor cleanup
- Fix for share config
- Fix durable handle cleanup race
- Fix close_file_table_ids in session teardown
- smbdirect fixes:
- Fix memory region registration
- Two fixes for out-of-tree builds
* tag 'v7.1-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: validate inherited ACE SID length
ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()
ksmbd: fail share config requests when path allocation fails
ksmbd: close durable scavenger races against m_fp_list lookups
ksmbd: harden file lifetime during session teardown
ksmbd: centralize ksmbd_conn final release to plug transport leak
smb: smbdirect: fix MR registration for coalesced SG lists
smb: smbdirect: introduce and use include/linux/smbdirect.h
smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPL
|
|
The optimized RSEQ V2 mode requires that user space adheres to the ABI
specification and does not modify the read-only fields cpu_id_start,
cpu_id, node_id and mm_cid behind the kernel's back.
While the kernel does not rely on these fields, the adherence to this is a
fundamental prerequisite to allow multiple entities, e.g. libraries, in an
application to utilize the full potential of RSEQ without stepping on each
other toes.
Validate this adherence on every update of these fields. If the kernel
detects that user space modified the fields, the application is force
terminated.
Fixes: d6200245c75e ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.845230956%40kernel.org
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
- Fix devm_alloc_workqueue() passing a va_list as a positional arg to
the variadic alloc_workqueue() macro, which garbled wq->name and
skipped lockdep init on the devm path. Fold both noprof entry points
onto a va_list helper.
Also, annotate it using __printf(1, 0)
* tag 'wq-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Annotate alloc_workqueue_va() with __printf(1, 0)
workqueue: fix devm_alloc_workqueue() va_list misuse
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- During v6.19, cgroup task unlink was moved from do_exit() to after the
final task switch to satisfy a controller invariant. That left the kernel
seeing tasks past exit_signals() longer than userspace expected, and
several v7.0 follow-ups tried to bridge the gap by making rmdir wait for
the kernel side. None held up.
The latest is an A-A deadlock when rmdir is invoked by the reaper of
zombies whose pidns teardown the rmdir itself is waiting on, which
points at the synchronizing approach being fundamentally wrong.
Take a different approach: drop the wait, leave rmdir's user-visible
side returning as soon as cgroup.procs is empty, and defer the css
percpu_ref kill that drives ->css_offline() until the cgroup is fully
depopulated.
Tagged for stable. Somewhat invasive but contained. The hope is that
fixing forward sticks. If not, the fallback is to revert the entire
chain and rework on the development branch.
Note that this doesn't plug a pre-existing analogous race in
cgroup_apply_control_disable() (controller disable via
subtree_control). Not a regression. The development branch will do
the more invasive restructuring needed for that.
- Documentation update for cgroup-v1 charge-commit section that still
referenced functions removed when the memcg hugetlb try-commit-cancel
protocol was retired.
* tag 'cgroup-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs: cgroup-v1: Update charge-commit section
cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix idle CPU selection returning prev_cpu outside the task's cpus_ptr
when the BPF caller's allowed mask was wider. Stable backport.
- Two opposite-direction gaps in scx_task_iter's cgroup-scoped mode
versus the global mode:
- Tasks past exit_signals() are filtered by the cgroup walk but kept
by global. Sub-scheduler enable abort leaked __scx_init_task()
state. Add a CSS_TASK_ITER_WITH_DEAD flag to cgroup's task
iterator (scx_task_iter is its only user) and use it.
- Tasks past sched_ext_dead() are still returned, tripping
WARN_ON_ONCE() in callers or making them touch torn-down state.
Mark and skip under the per-task rq lock.
* tag 'sched_ext-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: idle: Recheck prev_cpu after narrowing allowed mask
sched_ext: Skip past-sched_ext_dead() tasks in scx_task_iter_next_locked()
cgroup, sched_ext: Include exiting tasks in cgroup iter
|
|
The recent RSEQ optimization work broke the TCMalloc abuse of the RSEQ ABI
as it not longer unconditionally updates the CPU, node, mm_cid fields,
which are documented as read only for user space. Due to the observed
behavior of the kernel it was possible for TCMalloc to overwrite the
cpu_id_start field for their own purposes and rely on the kernel to update
it unconditionally after each context switch and before signal delivery.
The RSEQ ABI only guarantees that these fields are updated when the data
changes, i.e. the task is migrated or the MMCID of the task changes due to
switching from or to per CPU ownership mode.
The optimization work eliminated the unconditional updates and reduced them
to the documented ABI guarantees, which results in a massive performance
win for syscall, scheduling heavy work loads, which in turn breaks the
TCMalloc expectations.
There have been several options discussed to restore the TCMalloc
functionality while preserving the optimization benefits. They all end up
in a series of hard to maintain workarounds, which in the worst case
introduce overhead for everyone, e.g. in the scheduler.
The requirements of TCMalloc and the optimization work are diametral and
the required work arounds are a maintainence burden. They end up as fragile
constructs, which are blocking further optimization work and are pretty
much guaranteed to cause more subtle issues down the road.
The optimization work heavily depends on the generic entry code, which is
not used by all architectures yet. So the rework preserved the original
mechanism moslty unmodified to keep the support for architectures, which
handle rseq in their own exit to user space loop. That code is currently
optimized out by the compiler on architectures which use the generic entry
code.
This allows to revert back to the original behaviour by replacing the
compile time constant conditions with a runtime condition where required,
which disables the optimization and the dependend time slice extension
feature until the run-time condition can be enabled in the RSEQ
registration code on a per task basis again.
The following changes are required to restore the original behavior, which
makes TCMalloc work again:
1) Replace the compile time constant conditionals with runtime
conditionals where appropriate to prevent the compiler from optimizing
the legacy mode out
2) Enforce unconditional update of IDs on context switch for the
non-optimized v1 mode
3) Enforce update of IDs in the pre signal delivery path for the
non-optimized v1 mode
4) Enforce update of IDs in the membarrier(RSEQ) IPI for the
non-optimized v1 mode
5) Make time slice and future extensions depend on optimized v2 mode
This brings back the full performance problems, but preserves the v2
optimization code and for generic entry code using architectures also the
TIF_RSEQ optimization which avoids a full evaluation of the exit to user
mode loop in many cases.
Fixes: 566d8015f7ee ("rseq: Avoid CPU/MM CID updates when no event pending")
Reported-by: Mathias Stearn <mathias@mongodb.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Closes: https://lore.kernel.org/CAHnCjA25b+nO2n5CeifknSKHssJpPrjnf+dtr7UgzRw4Zgu=oA@mail.gmail.com
Link: https://patch.msgid.link/20260428224427.517051752%40kernel.org
Cc: stable@vger.kernel.org
|
|
Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), kthreads default to use the HK_TYPE_DOMAIN
cpumask. IOW, it is no longer affected by the setting of the nohz_full
boot kernel parameter.
That means HK_TYPE_KTHREAD should now be an alias of HK_TYPE_DOMAIN
instead of HK_TYPE_KERNEL_NOISE to correctly reflect the current kthread
behavior. Make the change as HK_TYPE_KTHREAD is still being used in
some networking code.
Fixes: 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
scx_task_iter's cgroup-scoped mode can return tasks whose
sched_ext_dead() has already completed: cgroup_task_dead() removes
from cset->tasks after sched_ext_dead() in finish_task_switch() and is
irq-work deferred on PREEMPT_RT. The global mode is fine -
sched_ext_dead() removes from scx_tasks via list_del_init() first.
Callers (sub-sched enable prep/abort/apply, scx_sub_disable(),
scx_fail_parent()) assume returned tasks are still on @sch and trip
WARN_ON_ONCE() or operate on torn-down state otherwise.
Set %SCX_TASK_OFF_TASKS in sched_ext_dead() under @p's rq lock and
have scx_task_iter_next_locked() skip flagged tasks under the same
lock. Setter and reader serialize on the per-task rq lock - no race.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup") made
css_task_iter_advance() skip exiting tasks so cgroup.procs stays consistent
with waitpid() visibility. Unfortunately, this broke scx_task_iter.
scx_task_iter walks either scx_tasks (global) or a cgroup subtree via
css_task_iter() and the two modes are expected to cover the same set of
tasks. After the above change the cgroup-scoped mode silently skips tasks
past exit_signals() that are still on scx_tasks.
scx_sub_enable_workfn()'s abort path is one of the symptoms: an exiting
SCX_TASK_SUB_INIT task can race past the cgroup iter leaking
__scx_init_task() state. Other iterations share the same gap.
Add CSS_TASK_ITER_WITH_DEAD to opt out of the skip and use it from
scx_task_iter().
Fixes: b0e4c2f8a0f0 ("sched_ext: Implement cgroup subtree iteration for scx_task_iter")
Reported-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
A chain of commits going back to v7.0 reworked rmdir to satisfy the
controller invariant that a subsystem's ->css_offline() must not run while
tasks are still doing kernel-side work in the cgroup.
[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")
[1] moved task cset unlink from do_exit() to finish_task_switch() so a
task's cset link drops only after the task has fully stopped scheduling.
That made tasks past exit_signals() linger on cset->tasks until their final
context switch, which led to a series of problems as what userspace expected
to see after rmdir diverged from what the kernel needs to wait for. [2]-[5]
tried to bridge that divergence: [2] filtered the exiting tasks from
cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4]
fixed the wait's condition; [5] made nr_dying_subsys_* visible
synchronously.
The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the
rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g.
host PID 1 systemd reaping orphan pids that were re-parented to it during
the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those
pids to free, the pids can't free because PID 1 is the reaper and it's stuck
in rmdir, and the system A-A deadlocks. No internal lock ordering breaks
this; the wait itself is the bug.
The css killing side that drove the original reorder, however, can be made
cleanly asynchronous: ->css_offline() is already async, run from
css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to
make that chain start only after all tasks have left the cgroup. rmdir's
user-visible side then returns as soon as cgroup.procs and friends are
empty, while ->css_offline() still runs only after the cgroup is fully
drained.
Verified by the original reproducer (pidns teardown + zombie reaper, runs
under vng) which hangs vanilla and succeeds here, and by per-commit
deterministic repros for [2], [3], [4], [5] with a boot parameter that
widens the post-exit_signals() window so each state is reliably reachable.
Some stress tests on top of that.
cgroup_apply_control_disable() has the same shape of pre-existing race:
when a controller is disabled via subtree_control, kill_css() ran
synchronously while tasks past exit_signals() could still be linked to
the cgroup's csets, and ->css_offline() could fire before they drained.
This patch preserves the existing synchronous behavior at that call site
(kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch
will defer kill_css_finish() there using a per-css trigger.
This seems like the right approach and I don't see problems with it. The
changes are somewhat invasive but not excessively so, so backporting to
-stable should be okay. If something does turn out to be wrong, the fallback
is to revert the entire chain ([1]-[5]) and rework in the development branch
instead.
v2: Pin cgrp across the deferred destroy work with explicit
cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1
wasn't actually broken (ordered cgroup_offline_wq + queue_work order
in cgroup_task_dead() saved it) but the explicit ref removes the
dependency on those non-obvious invariants. Also note the
pre-existing cgroup_apply_control_disable() race in the description;
a follow-up will defer kill_css_finish() there.
Fixes: 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
Cc: stable@vger.kernel.org # v7.0+
Reported-and-tested-by: Martin Pitt <martin@piware.de>
Link: https://lore.kernel.org/all/afHNg2VX2jy9bW7y@piware.de/
Link: https://lore.kernel.org/all/35e0670adb4abeab13da2c321582af9f@kernel.org/
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Pull drm fixes from Dave Airlie:
"Fixes for rc2, the usual amdgpu/xe double header, I think xe had a
couple of weeks combined due to some maintainer access issues,
otherwise there's just a few misc fixes and documentation fixups.
core and helpers:
- calculate framebuffer geometry with format helpers
- fix docs
amdgpu:
- GFX12 fix for CONFIG_DRM_DEBUG_MM configs
- Fix DC analog support
- Userq fixes
- GART placement fix
- Aldebaran SMU fixes
- AMDGPU_INFO_READ_MMR_REG fix
- UVD 3.1 fix
- GC 6 TCC fix
- Fix root reservation in amdgpu_vm_handle_fault()
- RAS fix
- Module reload fix for APUs
- Fix build for CONFIG_DRM_FBDEV_EMULATION=n
- IGT DWB regression fix
- GC 11.5.4 fix
- VCN user fence fixes
- JPEG user fence fixes
- SMU 13.0.6 fix
- VCN 3/4 IB parser fixes
- NV3x+ dGPU vblank fix
- DCE6/8 fixes for LVDS/eDP panels without an EDID
amdkfd:
- Fix for when CONFIG_HSA_AMD is not set
- SVM fixes
xe:
- uapi: Add missing pad and extensions check
- uapi: Reject unsafe PAT indices for CPU cached memory
- Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge
- Xe3p tuning and workaround fixes
- USE drm mm instead of drm SA for CCS read/write
- Fix leaks and null derefs
- Fix Wa_18022495364
appletbdrm:
- allocate protocol buffers with kvzalloc()
dma-buf:
- fix docs
imagination:
- avoid segfault in debugfs
ofdrm:
- put PCI device reference on errors
udl:
- increase USB timeout"
* tag 'drm-fixes-2026-05-02' of https://gitlab.freedesktop.org/drm/kernel: (77 commits)
drm/xe/uapi: Reject coh_none PAT index for CPU_ADDR_MIRROR
drm/xe/uapi: Reject coh_none PAT index for CPU cached memory in madvise
drm/xe/xelp: Fix Wa_18022495364
drm/xe/gsc: Fix BO leak on error in query_compatibility_version()
drm/xe/eustall: Fix drm_dev_put called before stream disable in close
drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl()
drm/xe: Fix dma-buf attachment leak in xe_gem_prime_import()
drm/xe: Fix bo leak in xe_dma_buf_init_obj() on allocation failure
drm/xe/bo: Fix bo leak on GGTT flag validation in xe_bo_init_locked()
drm/xe/bo: Fix bo leak on unaligned size validation in xe_bo_init_locked()
drm/xe: Fix potential NULL deref in xe_exec_queue_tlb_inval_last_fence_put_unlocked
drm/xe/vf: Use drm mm instead of drm sa for CCS read/write
drm/xe: Add memory pool with shadow support
drm/xe/debugfs: Correct printing of register whitelist ranges
drm/xe: Mark ROW_CHICKEN5 as a masked register
drm/xe/tuning: Use proper register offset for GAMSTLB_CTRL
drm/xe/xe3p_lpg: Add missing indirect ring state feature flag
drm/xe: Drop redundant rtp entries for Wa_14019988906 & Wa_14019877138
drm/xe/vm: Add missing pad and extensions check
drm/xe: Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
1) Replace skb_try_make_writable() by skb_ensure_writable() in
nft_fwd_netdev and the flowtable to deal with uncloned packets
having their network header in paged fragments.
2) Drop packet if output device does not exist and ensure sufficient
headroom in nft_fwd_netdev before transmitting the skb.
3) Use the existing dup recursion counter in nft_fwd_netdev for the
neigh_xmit variant, from Weiming Shi.
4) Add .check_hooks interface to x_tables to detach the control plane
hook check based on the match/target configuration. Then, update
nft_compat to use .check_hooks from .validate path, this fixes a
lack of hook validation for several match/targets.
5) Fix incorrect .usersize in xt_CT, from Florian Westphal.
6) Fix a memleak with netdev tables in dormant state,
from Florian Westphal.
7) Several patches to check if the packet is a fragment, then skip
layer 4 inspection, for x_tables and nf_tables; as well as common
nf_socket infrastructure. The xt_hashlimit match drops fragments
to stay consistent with the existing approach when failing to parse
the layer 4 protocol header.
8) Ensure sufficient headroom in the flowtable before transmitting
the skb.
9) Fix the flowtable inline vlan approach for double-tagged vlan:
Reverse the iteration over .encap[] since it represents the
encapsulation as seen from the ingress path. Postpone pushing
layer 2 header so output device is available to calculate needed
headroom. Finally, add and use nf_flow_vlan_push() to fix it.
10) Fix flowtable inline pppoe with GSO packets. Moreover, use
FLOW_OFFLOAD_XMIT_DIRECT to fill up destination hardware
address since neighbour cache does not exist in pppoe.
11) Use skb_pull_rcsum() to decapsulate vlan and pppoe headers, for
double-tagged vlan in particular this should provide some benefits
in certain scenarios.
More notes regarding 9-11):
- sashiko is also signalling to use it for IPIP headers, but that needs
more adjustments such setting skb->protocol after removing the IPIP
header, will follow up in a separated patch.
- I plan to submit selftests to cover double-tagged-vlan. As for pppoe,
it should be possible but that would mandate a few userspace dependencies.
This has been semi-automatically tested by me and reporters describing
broken double-vlan-tagged and pppoe currently in the flowtable.
* tag 'nf-26-05-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: flowtable: use skb_pull_rcsum() to pop vlan/pppoe header
netfilter: flowtable: fix inline pppoe encapsulation in xmit path
netfilter: flowtable: fix inline vlan encapsulation in xmit path
netfilter: flowtable: ensure sufficient headroom in xmit path
netfilter: xtables: fix L4 header parsing for non-first fragments
netfilter: nf_tables: skip L4 header parsing for non-first fragments
netfilter: nf_socket: skip socket lookup for non-first fragments
netfilter: nf_tables: fix netdev hook allocation memleak with dormant tables
netfilter: xt_CT: fix usersize for v1 and v2 revision
netfilter: nft_compat: run xt_check_hooks_{match,target}() from .validate
netfilter: x_tables: add .check_hooks to matches and targets
netfilter: nft_fwd_netdev: use recursion counter in neigh egress path
netfilter: nft_fwd_netdev: add device and headroom validate with neigh forwarding
netfilter: replace skb_try_make_writable() by skb_ensure_writable()
====================
Link: https://patch.msgid.link/20260501122237.296262-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This makes it easier to rebuild cifs.ko and ksmbd.ko against
a running kernel.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
rseq_reset() uses memset() to clear the tasks rseq data. That's racy
against membarrier() and preemption.
Guard it with irqsave to cure this.
Fixes: faba9d250eae ("rseq: Introduce struct rseq_data")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.353887714%40kernel.org
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- MD pull request via Yu:
- Fix a raid5 UAF on IO across the reshape position
- Avoid failing RAID1/RAID10 devices for invalid IO errors
- Fix RAID10 divide-by-zero when far_copies is zero
- Restore bitmap grow through sysfs
- Use mddev_is_dm() instead of open-coding gendisk checks
- Use ATTRIBUTE_GROUPS() for md default sysfs attributes
- Replace open-coded wait loops with wait_event helpers
- NVMe pull request via Keith:
- Target data transfer size configuation (Aurelien)
- Enable P2P for RDMA (Shivaji Kant)
- TCP target updates (Maurizio, Alistair, Chaitanya, Shivam Kumar)
- TCP host updates (Alistair, Chaitanya)
- Authentication updates (Alistair, Daniel, Chris Leech)
- Multipath fixes (John Garry)
- New quirks (Alan Cui, Tao Jiang)
- Apple driver fix (Fedor Pchelkin)
- PCI admin doorbell update fix (Keith)
- Properly propagate CDROM read-only state to the block layer
* tag 'block-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (35 commits)
md: use ATTRIBUTE_GROUPS() for md default sysfs attributes
md: use mddev_is_dm() instead of open-coding gendisk checks
md/raid1: replace wait loop with wait_event_idle() in raid1_write_request()
md/md-bitmap: add a none backend for bitmap grow
md/md-bitmap: split bitmap sysfs groups
md: factor bitmap creation away from sysfs handling
md: use mddev_lock_nointr() in mddev_suspend_and_lock_nointr()
md: replace wait loop with wait_event() in md_handle_request()
md/raid10: fix divide-by-zero in setup_geo() with zero far_copies
md/raid1,raid10: don't fail devices for invalid IO errors
MAINTAINERS: Add Xiao Ni as md/raid reviewer
md/raid5: Fix UAF on IO across the reshape position
cdrom, scsi: sr: propagate read-only status to block layer via set_disk_ro()
nvme-auth: Hash DH shared secret to create session key
nvme-pci: fix missed admin queue sq doorbell write
nvme-auth: Include SC_C in RVAL controller hash
nvme-tcp: teardown circular locking fixes
nvmet-tcp: Don't clear tls_key when freeing sq
Revert "nvmet-tcp: Don't free SQ on authentication success"
nvme: skip trace completion for host path errors
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"20 hotfixes. All are for MM (and for MMish maintainers). 9 are
cc:stable and the remainder are for post-7.0 issues or aren't deemed
suitable for backporting.
There are two DAMON series from SeongJae Park which address races
which could lead to use-after-free errors, and avoid the possibility
of presenting stale parameter values to users"
* tag 'mm-hotfixes-stable-2026-04-30-15-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end()
mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()
MAINTAINERS: remove stale kdump project URL
mm/damon/stat: detect and use fresh enabled value
mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
selftests/mm: specify requirement for PROC_MEM_ALWAYS_FORCE=y
mm/damon/sysfs-schemes: protect path kfree() with damon_sysfs_lock
mm/damon/sysfs-schemes: protect memcg_path kfree() with damon_sysfs_lock
MAINTAINERS: update Li Wang's email address
MAINTAINERS, mailmap: update email address for Qi Zheng
MAINTAINERS: update Liam's email address
mm/hugetlb_cma: round up per_node before logging it
MAINTAINERS: fix regex pattern in CORE MM category
mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap()
mm: start background writeback based on per-wb threshold for strictlimit BDIs
kho: fix error handling in kho_add_subtree()
liveupdate: fix return value on session allocation failure
mailmap: update entry for Dan Carpenter
vmalloc: fix buffer overflow in vrealloc_node_align()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"Besides an out-of-bound bug, this is about properly supporting Winbond
octal SPI NAND chips which use a specific pattern for stuffing more
address bits in some operations. This uses the spi-mem flag in SPI
NAND that was added to the spi-mem layer just before the merge window
through the spi tree"
* tag 'mtd/fixes-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: spinand: winbond: Fix ODTR write VCR on W35NxxJW
mtd: spinand: winbond: Set the packed page read flag to W35N02/04JW
mtd: spinand: Add support for packed read data ODTR commands
mtd: spi-nor: debugfs: fix out-of-bounds read in spi_nor_params_show()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- ipmr: free mr_table after RCU grace period.
Previous releases - regressions:
- core: add net_iov_init() and use it to initialize ->page_type
- sched: taprio: fix NULL pointer dereference in class dump
- netfilter: nf_tables:
- use list_del_rcu for netlink hooks
- fix strict mode inbound policy matching
- tcp: make probe0 timer handle expired user timeout
- vrf: fix a potential NPD when removing a port from a VRF
- eth: ice:
- fix NULL pointer dereference in ice_reset_all_vfs()
- fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw
Previous releases - always broken:
- page_pool: fix memory-provider leak in error path
- sched: sch_cake: annotate data-races in cake_dump_stats()
- mptcp: fix scheduling with atomic in timestamp sockopt
- psp: check for device unregister when creating assoc
- tls: fix strparser anchor skb leak on offload RX setup failure
- eth:
- stmmac: prevent NULL deref when RX memory exhausted
- airoha: do not read uninitialized fragment address
- rtl8150: fix use-after-free in rtl8150_start_xmit()
Misc:
- add Ido Schimmel as IPv4/IPv6 maintainer
- add David Heidelberg as NFC subsystem maintainer"
* tag 'net-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
net/sched: cls_flower: revert unintended changes
sfc: fix error code in efx_devlink_info_running_versions()
net: tls: fix strparser anchor skb leak on offload RX setup failure
ice: add dpll peer notification for paired SMA and U.FL pins
ice: fix missing dpll notifications for SW pins
dpll: export __dpll_pin_change_ntf() for use under dpll_lock
ice: fix SMA and U.FL pin state changes affecting paired pin
ice: fix missing SMA pin initialization in DPLL subsystem
ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw
ice: fix NULL pointer dereference in ice_reset_all_vfs()
iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler
iavf: wait for PF confirmation before removing VLAN filters
iavf: stop removing VLAN filters from PF on interface down
iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING
page_pool: fix memory-provider leak in page_pool_create_percpu() error path
bonding: 3ad: implement proper RCU rules for port->aggregator
net: airoha: Do not return err in ndo_stop() callback
hv_sock: fix ARM64 support
MAINTAINERS: update the IPv4/IPv6 entry and add Ido Schimmel
selftests: drv-net: clarify linters and frameworks in README
...
|
|
Export __dpll_pin_change_ntf() so that drivers can send pin change
notifications from within pin callbacks, which are already called
under dpll_lock. Using dpll_pin_change_ntf() in that context would
deadlock.
Add lockdep_assert_held() to catch misuse without the lock held.
Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Alexander Nowlin <alexander.nowlin@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-9-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add a new .check_hooks interface for checking if the match/target is
used from the validate hook according to its configuration.
Move existing conditional hook check based on the match/target
configuration from .checkentry to .check_hooks for the following
matches/targets:
- addrtype
- devgroup
- physdev
- policy
- set
- TCPMSS
- SET
This is a preparation patch to fix nft_compat, not functional changes
are intended.
Based on patch from Florian Westphal.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix inverted check of registering the stats for branch tracing
When calling register_stat_tracer() which returns zero on success and
negative on error, the callers were checking the return of zero as an
error and printing a warning message. Because this was just a normal
printk() message and not a WARN(), it wasn't caught in any testing.
Fix the check to print the warning message when an error actually
happens.
- Fix a typo in a comment in tracepoint.h
- Limit the size of event probes to 3K in size
It is possible to create a dynamic event probe via the tracefs system
that is greater than the max size of an event that the ring buffer
can hold. This basically causes the event to become useless.
Limit the size of an event probe to be 3K as that should be large
enough to handle any dynamic events being created, and fits within
the PAGE_SIZE sub-buffers of the ring buffer.
* tag 'trace-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Limit size of event probe to 3K
tracepoint: Fix typo in tracepoint.h comment
tracing: branch: Fix inverted check on stat tracer registration
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) IEEE1394 ARP payload contains no target hardware address in the
ARP packet. Apparently, arp_tables was never updated to deal with
IEEE1394 ARP properly. To deal with this, return no match in case
the target hardware address selector is used, either for inverse or
normal match. Moreover, arpt_mangle disallows mangling of the target
hardware and IP address because, it is not worth to adjust the
offset calculation to fix this, we suspect no users of arp_tables
for this family.
2) Use list_del_rcu() to delete device hooks in nf_tables, this hook
list is RCU protected, concurrent netlink dump readers can be
walking on this list, fix it by adding a helper function and use it
for consistency. From Florian Westphal.
3) Add list_splice_rcu(), this is useful for joining the local list of
new device hooks to the RCU protected hook list in chain and
flowtable. Reviewed by Paul E. McKenney.
4) Use list_splice_rcu() to publish the new device hooks in chain and
flowtable to fix concurrent netlink dump traversal.
5) Add a new hook transaction object to track device hook deletions.
The current approach moves device hooks to be deleted around during
the preparation phase, this breaks concurrent RCU reader via netlink
dump. This new hook transaction is combined with NFT_HOOK_REMOVE
flag to annotate hooks for removal in the preparation phase.
6) xt_policy inbound policy check in strict mode can lead to
out-of-bound access of the secpath array due to incorrect.
The iteration over the secpath needs to be reversed in the inbound
to check for the human readable policy, expecting inner in first
position and outer in second position, the secpath from inbound
actually stores outer in first position then in second position.
From Jiexun Wang.
7) Fix possible zero shift in nft_bitwise triggering UBSAN splat,
reject zero shift from control plane, from Kai Ma.
8) Replace simple_strtoul() in the conntrack SIP helper since it relies
on nul-terminated strings. From Florian Westphal.
* tag 'nf-26-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_conntrack_sip: don't use simple_strtoul
netfilter: reject zero shift in nft_bitwise
netfilter: xt_policy: fix strict mode inbound policy matching
netfilter: nf_tables: add hook transactions for device deletions
netfilter: nf_tables: join hook list via splice_list_rcu() in commit phase
rculist: add list_splice_rcu() for private lists
netfilter: nf_tables: use list_del_rcu for netlink hooks
netfilter: arp_tables: fix IEEE1394 ARP payload parsing
====================
Link: https://patch.msgid.link/20260428095840.51961-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
"The merge window pulled in the cgroup sub-scheduler infrastructure,
and new AI reviews are accelerating bug reporting and fixing - hence
the larger than usual fixes batch:
- Use-after-frees during scheduler load/unload:
- The disable path could free the BPF scheduler while deferred
irq_work / kthread work was still in flight
- cgroup setter callbacks read the active scheduler outside the
rwsem that synchronizes against teardown
Fix both, and reuse the disable drain in the enable error paths so
the BPF JIT page can't be freed under live callbacks.
- Several BPF op invocations didn't tell the framework which runqueue
was already locked, so helper kfuncs that re-acquire the runqueue
by CPU could deadlock on the held lock
Fix the affected callsites, including recursive parent-into-child
dispatch.
- The hardlockup notifier ran from NMI but eventually took a
non-NMI-safe lock. Bounce it through irq_work.
- A handful of bugs in the new sub-scheduler hierarchy:
- helper kfuncs hard-coded the root instead of resolving the
caller's scheduler
- the enable error path tried to disable per-task state that had
never been initialized, and leaked cpus_read_lock on the way
out
- a sysfs object was leaked on every load/unload
- the dispatch fast-path used the root scheduler instead of the
task's
- a couple of CONFIG #ifdef guards were misclassified
- Verifier-time hardening: BPF programs of unrelated struct_ops types
(e.g. tcp_congestion_ops) could call sched_ext kfuncs - a semantic
bug and, once sub-sched was enabled, a KASAN out-of-bounds read.
Now rejected at load. Plus a few NULL and cross-task argument
checks on sched_ext kfuncs, and a selftest covering the new deny.
- rhashtable (Herbert): restore the insecure_elasticity toggle and
bounce the deferred-resize kick through irq_work to break a
lock-order cycle observable from raw-spinlock callers. sched_ext's
scheduler-instance hash is the first user of both.
- The bypass-mode load balancer used file-scope cpumasks; with
multiple scheduler instances now possible, those raced. Move to
per-instance cpumasks, plus a follow-up to skip tasks whose
recorded CPU is stale relative to the new owning runqueue.
- Smaller fixes:
- a dispatch queue's first-task tracking misbehaved when a parked
iterator cursor sat in the list
- the runqueue's next-class wasn't promoted on local-queue
enqueue, leaving an SCX task behind RT in edge cases
- the reference qmap scheduler stopped erroring on legitimate
cross-scheduler task-storage misses"
* tag 'sched_ext-for-7.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (26 commits)
sched_ext: Fix scx_flush_disable_work() UAF race
sched_ext: Call wakeup_preempt() in local_dsq_post_enq()
sched_ext: Release cpus_read_lock on scx_link_sched() failure in root enable
sched_ext: Reject NULL-sch callers in scx_bpf_task_set_slice/dsq_vtime
sched_ext: Refuse cross-task select_cpu_from_kfunc calls
sched_ext: Align cgroup #ifdef guards with SUB_SCHED vs GROUP_SCHED
sched_ext: Make bypass LB cpumasks per-scheduler
sched_ext: Pass held rq to SCX_CALL_OP() for core_sched_before
sched_ext: Pass held rq to SCX_CALL_OP() for dump_cpu/dump_task
sched_ext: Save and restore scx_locked_rq across SCX_CALL_OP
sched_ext: Use dsq->first_task instead of list_empty() in dispatch_enqueue() FIFO-tail
sched_ext: Resolve caller's scheduler in scx_bpf_destroy_dsq() / scx_bpf_dsq_nr_queued()
sched_ext: Read scx_root under scx_cgroup_ops_rwsem in cgroup setters
sched_ext: Don't disable tasks in scx_sub_enable_workfn() abort path
sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()
sched_ext: Guard scx_dsq_move() against NULL kit->dsq after failed iter_new
sched_ext: Unregister sub_kset on scheduler disable
sched_ext: Defer scx_hardlockup() out of NMI
sched_ext: sync disable_irq_work in bpf_scx_unreg()
sched_ext: Fix local_dsq_post_enq() to use task's scheduler in sub-sched
...
|
|
Change "my" to "may" in the description of subsystem configurations.
Link: https://patch.msgid.link/20260422021819.1788091-1-synte4028@gmail.com
Signed-off-by: Sheng Che Peng <synte4028@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
devm_alloc_workqueue() built a va_list and passed it as a single
positional argument to the variadic alloc_workqueue() macro:
va_start(args, max_active);
wq = alloc_workqueue(fmt, flags, max_active, args);
va_end(args);
C does not allow forwarding a va_list through a ... parameter.
alloc_workqueue() expands to alloc_workqueue_noprof(), which runs
its own va_start() over its ... params, so the inner
vsnprintf(wq->name, sizeof(wq->name), fmt, args) in
__alloc_workqueue() received the outer va_list object as the first
variadic slot rather than the caller's actual format arguments.
Add a new static helper alloc_workqueue_va() that wraps
__alloc_workqueue() and runs wq_init_lockdep() on success, and
fold both alloc_workqueue_noprof() and devm_alloc_workqueue_noprof()
onto it as suggested by Tejun.
The wq_init_lockdep() step is required on the devm path
too, otherwise __flush_workqueue()'s on-stack
COMPLETION_INITIALIZER_ONSTACK_MAP would NULL-deref wq->lockdep_map.
No caller changes are required. devm_alloc_ordered_workqueue() is
a macro forwarding to devm_alloc_workqueue() and inherits the fix.
Two in-tree callers actively trigger the broken path on every probe:
drivers/power/supply/mt6370-charger.c:889
drivers/power/supply/max77705_charger.c:649
both of which use devm_alloc_ordered_workqueue(dev, "%s", 0,
dev_name(dev)).
A standalone reproducer module is available at[1].
Link: https://github.com/leitao/debug/blob/main/workqueue/valist/wq_va_test.c [1]
Fixes: 1dfc9d60a69e ("workqueue: devres: Add device-managed allocate workqueue")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
With CONFIG_IP_MROUTE_MULTIPLE_TABLES=n, ipmr_fib_lookup()
does not check if net->ipv4.mrt is NULL.
Since default_device_exit_batch() is called after ->exit_rtnl(),
a device could receive IGMP packets and access net->ipv4.mrt
during/after ipmr_rules_exit_rtnl().
If ipmr_rules_exit_rtnl() had already cleared it and freed the
memory, the access would trigger null-ptr-deref or use-after-free.
Let's fix it by using RCU helper and free mrt after RCU grace
period.
In addition, check_net(net) is added to mroute_clean_tables()
and ipmr_cache_unresolved() to synchronise via mfc_unres_lock.
This prevents ipmr_cache_unresolved() from putting skb into
c->_c.mfc_un.unres.unresolved after mroute_clean_tables()
purges it.
For the same reason, timer_shutdown_sync() is moved after
mroute_clean_tables().
Since rhltable_destroy() holds mutex internally, rcu_work is
used, and it is placed as the first member because rcu_head
must be placed within <4K offset. mr_table is alraedy 3864
bytes without rcu_work.
Note that IP6MR is not yet converted to ->exit_rtnl(), so this
change is not needed for now but will be.
Fixes: b22b01867406 ("ipmr: Convert ipmr_net_exit_batch() to ->exit_rtnl().")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260423053456.4097409-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
"Three fixes for fsnotify / fanotify"
* tag 'fsnotify_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: fix inode reference leak in fsnotify_recalc_mask()
fanotify: Fix spelling mistake "enforecement" -> "enforcement"
fanotify: fix false positive on permission events
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox
Pull mailbox updates from Jassi Brar:
- core: fix NULL message handling and add API to query TX queue slots
- test: resolve concurrency bugs, dangling IRQs, and memory leaks
- dt-bindings: qcom: add Eliza IPCC
- mtk: fix address calculation and pointer handling bugs
- cix: resolve SCMI suspend timeouts
- misc memory allocation optimizations and cleanups
* tag 'mailbox-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
mailbox: mailbox-test: make data_ready a per-instance variable
mailbox: mailbox-test: initialize struct earlier
mailbox: mailbox-test: don't free the reused channel
mailbox: mailbox-test: handle channel errors consistently
mailbox: update kdoc for struct mbox_controller
mailbox: add sanity check for channel array
mailbox: mailbox-test: free channels on probe error
mailbox: prefix new constants with MBOX_
dt-bindings: mailbox: qcom-ipcc: Document the Eliza Inter-Processor Communication Controller
mailbox: cix: Add IRQF_NO_SUSPEND to mailbox interrupt
mailbox: Fix NULL message support in mbox_send_message()
mailbox: remove superfluous internal header
mailbox: correct kdoc title for mbox_bind_client
mailbox: test: really ignore optional memory resources
mailbox: exynos: drop superfluous mbox setting per channel
mailbox: mtk-cmdq: Fix CURR and END addr for task insert case
mailbox: mtk-vcp-mailbox: Fix the return value in mtk_vcp_mbox_xlate()
mailbox: hi6220: kzalloc + kcalloc to kzalloc
mailbox: rockchip: kzalloc + kcalloc to kzalloc
mailbox: add API to query available TX queue slots
|
|
The cdrom core never calls set_disk_ro() for a registered device, so
BLKROGET on a CD-ROM device always returns 0 (writable), even when the
drive has no write capabilities and writes will inevitably fail. This
causes problems for userspace that relies on BLKROGET to determine
whether a block device is read-only. For example, systemd's loop device
setup uses BLKROGET to decide whether to create a loop device with
LO_FLAGS_READ_ONLY. Without the read-only flag, writes pass through the
loop device to the CD-ROM and fail with I/O errors. systemd-fsck
similarly checks BLKROGET to decide whether to run fsck in no-repair
mode (-n).
The write-capability bits in cdi->mask come from two different sources:
CDC_DVD_RAM and CDC_CD_RW are populated by the driver from the MODE
SENSE capabilities page (page 0x2A) before register_cdrom() is called,
while CDC_MRW_W and CDC_RAM require the MMC GET CONFIGURATION command
and were only probed by cdrom_open_write() at device open time. This
meant that any attempt to compute the writable state from the full
mask at probe time was incorrect, because the GET CONFIGURATION bits
were still unset (and cdi->mask is initialized such that capabilities
are assumed present).
Fix this by factoring the GET CONFIGURATION probing out of
cdrom_open_write() into a new exported helper,
cdrom_probe_write_features(), and having sr call it from sr_probe()
right after get_capabilities() has populated the MODE SENSE bits.
register_cdrom() then calls set_disk_ro() based on the full
write-capability mask (CDC_DVD_RAM | CDC_MRW_W | CDC_RAM | CDC_CD_RW)
so the block layer reflects the drive's actual write support. The
feature queries used (CDF_MRW and CDF_RWRT via GET CONFIGURATION with
RT=00) report drive-level capabilities that are persistent across
media, so a single probe before register_cdrom() is sufficient and the
redundant probe at open time is dropped.
With set_disk_ro() now accurate, the long-vestigial cd->writeable flag
in sr can go: get_capabilities() used to set cd->writeable based on
the same four mask bits, but because CDC_MRW_W and CDC_RAM default to
"capability present" in cdi->mask and aren't touched by MODE SENSE,
the condition that gated cd->writeable was always true, making it
unconditionally 1. Replace the corresponding gate in sr_init_command()
with get_disk_ro(cd->disk), which turns a previously no-op check into
a real one and also catches kernel-internal bio writers that bypass
blkdev_write_iter()'s bdev_read_only() check.
The sd driver (SCSI disks) does not have this problem because it
checks the MODE SENSE Write Protect bit and calls set_disk_ro()
accordingly. The sr driver cannot use the same approach because the
MMC specification does not define the WP bit in the MODE SENSE
device-specific parameter byte for CD-ROM devices.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Daan De Meyer <daan@amutable.com>
Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://patch.msgid.link/20260427210139.1400-2-phil@philpotter.co.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NVMe fixes from Keith:
"- Target data transfer size confiruation (Aurelien)
- Enable P2P for RDMA (Shivaji Kant)
- TCP target updates (Maurizio, Alistair, Chaitanya, Shivam Kumar)
- TCP host updates (Alistair, Chaitanya)
- Authentication updates (Alistair, Daniel, Chris Leech)
- Multipath fixes (John Garry)
- New quirks (Alan Cui, Tao Jiang)
- Apple driver fix (Fedor Pchelkin)
- PCI admin doorbell update fix (Keith)"
* tag 'nvme-7.1-2026-04-24' of git://git.infradead.org/nvme: (22 commits)
nvme-auth: Hash DH shared secret to create session key
nvme-pci: fix missed admin queue sq doorbell write
nvme-auth: Include SC_C in RVAL controller hash
nvme-tcp: teardown circular locking fixes
nvmet-tcp: Don't clear tls_key when freeing sq
Revert "nvmet-tcp: Don't free SQ on authentication success"
nvme: skip trace completion for host path errors
nvme-pci: add quirk for Memblaze Pblaze5 (0x1c5f:0x0555)
nvme-multipath: put module reference when delayed removal work is canceled
nvme: expose TLS mode
nvme-apple: drop invalid put of admin queue reference count
nvme-core: fix parameter name in comment
nvmet: avoid recursive nvmet-wq flush in nvmet_ctrl_free
nvme-multipath: drop head pointer check in nvme_mpath_clear_current_path()
nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808 (Samsung PM981/983/970 EVO Plus )
nvmet-tcp: fix race between ICReq handling and queue teardown
nvmet-tcp: remove redundant calls to nvmet_tcp_fatal_error()
nvmet-tcp: propagate nvmet_tcp_build_pdu_iovec() errors to its callers
nvme: enable PCI P2PDMA support for RDMA transport
nvmet: introduce new mdts configuration entry
...
|
|
Some devices stuff address bits in the double byte opcode (in place of
the repeated byte) in order to be able to increase the size of the
devices, without adding extra address bytes.
Create a flag to identify those devices. When the flag is set, use the
"packed" variant for the read data operation.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
Switching to private email address. Update all contact information
Add an entry to mailmap at the same time.
Link: https://lore.kernel.org/20260422184310.2682901-1-liam@infradead.org
Signed-off-by: Liam R. Howlett <liam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The mmap_prepare hook functionality includes the ability to invoke
mmap_prepare() from the mmap() hook of existing 'stacked' drivers, that is
ones which are capable of calling the mmap hooks of other drivers/file
systems (e.g. overlayfs, shm).
As part of the mmap_prepare action functionality, we deal with errors by
unmapping the VMA should one arise. This works in the usual mmap_prepare
case, as we invoke this action at the last moment, when the VMA is
established in the maple tree.
However, the mmap() hook passes a not-fully-established VMA pointer to the
caller (which is the motivation behind the mmap_prepare() work), which is
detached.
So attempting to unmap a VMA in this state will be problematic, with the
most obvious symptom being a warning in vma_mark_detached(), because the
VMA is already detached.
It's also unncessary - the mmap() handler will clean up the VMA on error.
So to fix this issue, this patch propagates whether or not an mmap action
is being completed via the compatibility layer or directly.
If the former, then we do not attempt VMA cleanup, if the latter, then we
do.
This patch also updates the userland VMA tests to reflect the change.
Link: https://lore.kernel.org/20260421102150.189982-1-ljs@kernel.org
Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reported-by: syzbot+db390288d141a1dccf96@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69e69734.050a0220.24bfd3.0027.GAE@google.com/
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Getting fixes and updates from v7.1-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"Here are the accumulated fixes for 7.1-rc1 and a single structural
change worth mentioning separately: Rafael's commit converting tpm_crb
from ACPI driver to a platform driver"
* tag 'for-next-tpm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: tpm_tis: stop transmit if retries are exhausted
tpm: tpm_tis: add error logging for data transfer
tpm: avoid -Wunused-but-set-variable
tpm: Use kfree_sensitive() to free auth session in tpm_dev_release()
tpm2-sessions: Fix missing tpm_buf_destroy() in tpm2_read_public()
tpm: Fix auth session leak in tpm2_get_random() error path
tpm: i2c: atmel: fix block comment formatting
tpm_crb: Convert ACPI driver to a platform one
tpm: Make tcpci_pm_ops variable static const
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Series for zloop, fixing a variety of issues
- t10-pi code cleanup
- Fix for a merge window regression with the bio memory allocation mask
- Fix for a merge window regression in ublk, caused by an issue with
the maple tree iteration code at teardown
- ublk self tests additions
- Zoned device pgmap fixes
- Various little cleanups and fixes
* tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (21 commits)
Revert "floppy: fix reference leak on platform_device_register() failure"
ublk: avoid unpinning pages under maple tree spinlock
ublk: refactor common helper ublk_shmem_remove_ranges()
ublk: fix maple tree lockdep warning in ublk_buf_cleanup
selftests: ublk: add ublk auto integrity test
selftests: ublk: enable test_integrity_02.sh on fio 3.42
selftests: ublk: remove unused argument to _cleanup
block: only restrict bio allocation gfp mask asked to block
block/blk-throttle: Add WQ_PERCPU to alloc_workqueue users
block: Add WQ_PERCPU to alloc_workqueue users
block: relax pgmap check in bio_add_page for compatible zone device pages
block: add pgmap check to biovec_phys_mergeable
floppy: fix reference leak on platform_device_register() failure
ublk: use unchecked copy helpers for bio page data
t10-pi: reduce ref tag code duplication
zloop: remove irq-safe locking
zloop: factor out zloop_mark_{full,empty} helpers
zloop: set RQF_QUIET when completing requests on deleted devices
zloop: improve the unaligned write pointer warning
zloop: use vfs_truncate
...
|
|
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix handling of ENOSPC so that if we have to resend writes, they
are written synchronously
- SUNRPC RDMA transport fixes from Chuck
- Several fixes for delegated timestamps in NFSv4.2
- Failure to obtain a directory delegation should not cause stat() to
fail with NFSv4
- Rename was failing to update timestamps when a directory delegation
is held on NFSv4
- Ensure we check rsize/wsize after crossing a NFSv4 filesystem
boundary
- NFSv4/pnfs:
- If the server is down, retry the layout returns on reboot
- Fallback to MDS could result in a short write being incorrectly
logged
Cleanups:
- Use memcpy_and_pad in decode_fh"
* tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits)
NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address
NFS: remove redundant __private attribute from nfs_page_class
NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes
NFS: fix writeback in presence of errors
nfs: use memcpy_and_pad in decode_fh
NFSv4.1: Apply session size limits on clone path
NFSv4: retry GETATTR if GET_DIR_DELEGATION failed
NFS: fix RENAME attr in presence of directory delegations
pnfs/flexfiles: validate ds_versions_cnt is non-zero
NFS/blocklayout: print each device used for SCSI layouts
xprtrdma: Post receive buffers after RPC completion
xprtrdma: Scale receive batch size with credit window
xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor
xprtrdma: Decouple frwr_wp_create from frwr_map
xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot
xprtrdma: Avoid 250 ms delay on backlog wakeup
xprtrdma: Close sendctx get/put race that can block a transport
nfs: update inode ctime after removexattr operation
nfs: fix utimensat() for atime with delegated timestamps
NFS: improve "Server wrote zero bytes" error
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO / and others driver updates from Greg KH:
"Here is the char/misc/iio and other smaller driver subsystem updates
for 7.1-rc1. Lots of stuff in here, all tiny, but relevant for the
different drivers they touch. Major points in here is:
- the usual large set of new IIO drivers and updates for that
subsystem (the large majority of this diffstat)
- lots of comedi driver updates and bugfixes
- coresight driver updates
- interconnect driver updates and additions
- mei driver updates
- binder (both rust and C versions) updates and fixes
- lots of other smaller driver subsystem updates and additions
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (405 commits)
coresight: tpdm: fix invalid MMIO access issue
mei: me: add nova lake point H DID
mei: lb: add late binding version 2
mei: bus: add mei_cldev_uuid
w1: ds2490: drop redundant device reference
bus: mhi: host: pci_generic: Add Telit FE912C04 modem support
mei: csc: wake device while reading firmware status
mei: csc: support controller with separate PCI device
mei: convert PCI error to common errno
mei: trace: print return value of pci_cfg_read
mei: me: move trace into firmware status read
mei: fix idle print specifiers
mei: me: use PCI_DEVICE_DATA macro
sonypi: Convert ACPI driver to a platform one
misc: apds990x: fix all kernel-doc warnings
most: usb: Use kzalloc_objs for endpoint address array
hpet: Convert ACPI driver to a platform one
misc: vmw_vmci: Fix spelling mistakes in comments
parport: Remove completed item from to-do list
char: remove unnecessary module_init/exit functions
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"This is quite a big set of fixes, almost all from Johan Hovold who is
on an ongoing quest to clean up issues with probe and removal handling
in drivers.
There isn't anything too concerning here especially with the
deregistration stuff which will very rarely get run in production
systems since this is all platform devices in the SoC on embedded
hardware, but it's all real issues which should be fixed. There's more
in flight here.
We also have a few other minor fixes, one from Felix Gu along the same
lines as Johan's work and a couple of documentation things"
* tag 'spi-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (23 commits)
spi: fix controller cleanup() documentation
spi: fix resource leaks on device setup failure
spi: axiado: clean up probe return value
spi: axiado: rename probe error labels
spi: axiado: fix runtime pm imbalance on probe failure
spi: orion: clean up probe return value
spi: orion: fix clock imbalance on registration failure
spi: orion: fix runtime pm leak on unbind
spi: imx: fix runtime pm leak on probe deferral
spi: mpc52xx: fix use-after-free on registration failure
spi: Fix the error description in the `ptp_sts_word_post` comment
spi: topcliff-pch: fix use-after-free on unbind
spi: topcliff-pch: fix controller deregistration
spi: orion: fix controller deregistration
spi: mxic: fix controller deregistration
spi: mpc52xx: fix use-after-free on unbind
spi: mpc52xx: fix controller deregistration
spi: cadence-quadspi: fix controller deregistration
spi: cadence: fix controller deregistration
spi: mtk-snfi: fix memory leak in probe
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking deletions from Jakub Kicinski:
"Delete some obsolete networking code
Old code like amateur radio and NFC have long been a burden to core
networking developers. syzbot loves to find bugs in BKL-era code, and
noobs try to fix them.
If we want to have a fighting chance of surviving the LLM-pocalypse
this code needs to find a dedicated owner or get deleted. We've talked
about these deletions multiple times in the past and every time
someone wanted the code to stay. It is never very clear to me how many
of those people actually use the code vs are just nostalgic to see it
go. Amateur radio did have occasional users (or so I think) but most
users switched to user space implementations since its all super slow
stuff. Nobody stepped up to maintain the kernel code.
We were lucky enough to find someone who wants to help with NFC so
we're giving that a chance. Let's try to put the rest of this code
behind us"
* tag 'net-deletions' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next:
drivers: net: 8390: wd80x3: Remove this driver
drivers: net: 8390: ultra: Remove this driver
drivers: net: 8390: AX88190: Remove this driver
drivers: net: fujitsu: fmvj18x: Remove this driver
drivers: net: smsc: smc91c92: Remove this driver
drivers: net: smsc: smc9194: Remove this driver
drivers: net: amd: nmclan: Remove this driver
drivers: net: amd: lance: Remove this driver
drivers: net: 3com: 3c589: Remove this driver
drivers: net: 3com: 3c574: Remove this driver
drivers: net: 3com: 3c515: Remove this driver
drivers: net: 3com: 3c509: Remove this driver
net: packetengines: remove obsolete yellowfin driver and vendor dir
net: packetengines: remove obsolete hamachi driver
net: remove unused ATM protocols and legacy ATM device drivers
net: remove ax25 and amateur radio (hamradio) subsystem
net: remove ISDN subsystem and Bluetooth CMTP
caif: remove CAIF NETWORK LAYER
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- eventpoll: fix ep_remove() UAF and follow-up cleanup
- fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference
error
- writeback: Fix use after free in inode_switch_wbs_work_fn()
- fuse: reject oversized dirents in page cache
- fs: aio: reject partial mremap to avoid Null-pointer-dereference
error
- nstree: fix func. parameter kernel-doc warnings
- fs: Handle multiply claimed blocks more gracefully with mmb
* tag 'vfs-7.1-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
eventpoll: drop vestigial epi->dying flag
eventpoll: drop dead bool return from ep_remove_epi()
eventpoll: refresh eventpoll_release() fast-path comment
eventpoll: move f_lock acquisition into ep_remove_file()
eventpoll: fix ep_remove struct eventpoll / struct file UAF
eventpoll: move epi_fget() up
eventpoll: rename ep_remove_safe() back to ep_remove()
eventpoll: drop vestigial __ prefix from ep_remove_{file,epi}()
eventpoll: kill __ep_remove()
eventpoll: split __ep_remove()
eventpoll: use hlist_is_singular_node() in __ep_remove()
fs: Handle multiply claimed blocks more gracefully with mmb
nstree: fix func. parameter kernel-doc warnings
fs: aio: reject partial mremap to avoid Null-pointer-dereference error
fuse: reject oversized dirents in page cache
writeback: Fix use after free in inode_switch_wbs_work_fn()
fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.
Steady stream of fixes. Last two weeks feel comparable to the two
weeks before the merge window. Lots of AI-aided bug discovery. A newer
big source is Sashiko/Gemini (Roman Gushchin's system), which points
out issues in existing code during patch review (maybe 25% of fixes
here likely originating from Sashiko). Nice thing is these are often
fixed by the respective maintainers, not drive-bys.
Current release - new code bugs:
- kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP
Previous releases - regressions:
- add async ndo_set_rx_mode and switch drivers which we promised to
be called under the per-netdev mutex to it
- dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops
- hv_sock: report EOF instead of -EIO for FIN
- vsock/virtio: fix MSG_PEEK calculation on bytes to copy
Previous releases - always broken:
- ipv6: fix possible UAF in icmpv6_rcv()
- icmp: validate reply type before using icmp_pointers
- af_unix: drop all SCM attributes for SOCKMAP
- netfilter: fix a number of bugs in the osf (OS fingerprinting)
- eth: intel: fix timestamp interrupt configuration for E825C
Misc:
- bunch of data-race annotations"
* tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (148 commits)
rxrpc: Fix error handling in rxgk_extract_token()
rxrpc: Fix re-decryption of RESPONSE packets
rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets
rxrpc: Fix missing validation of ticket length in non-XDR key preparsing
rxgk: Fix potential integer overflow in length check
rxrpc: Fix conn-level packet handling to unshare RESPONSE packets
rxrpc: Fix potential UAF after skb_unshare() failure
rxrpc: Fix rxkad crypto unalignment handling
rxrpc: Fix memory leaks in rxkad_verify_response()
net: rds: fix MR cleanup on copy error
m68k: mvme147: Make me the maintainer
net: txgbe: fix firmware version check
selftests/bpf: check epoll readiness during reuseport migration
tcp: call sk_data_ready() after listener migration
vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll()
ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim
tipc: fix double-free in tipc_buf_append()
llc: Return -EINPROGRESS from llc_ui_connect()
ipv4: icmp: validate reply type before using icmp_pointers
selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges
...
|
|
The old comment justified the lockless READ_ONCE(file->f_ep) check
with "False positives simply cannot happen because the file is on
the way to be removed and nobody ( but eventpoll ) has still a
reference to this file." That reasoning was the root of the UAF
fixed in "eventpoll: fix ep_remove struct eventpoll / struct file
UAF": __ep_remove() could clear f_ep while another close raced
past the fast path and freed the watched eventpoll / recycled the
struct file slot.
With ep_remove() now pinning @file via epi_fget() across the f_ep
clear and hlist_del_rcu(), the invariant is re-established for the
right reason: anyone who might clear f_ep holds @file alive for
the duration, so a NULL observation really does mean no
concurrent eventpoll path has work left on this file. Refresh the
comment accordingly so the next reader doesn't inherit the broken
model.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-8-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
|
|
Use the correct parameter name ("__ns") for function parameter kernel-doc
to avoid 3 warnings:
Warning: include/linux/nstree.h:68 function parameter '__ns' not described in 'ns_tree_add_raw'
Warning: include/linux/nstree.h:77 function parameter '__ns' not described in 'ns_tree_add'
Warning: include/linux/nstree.h:88 function parameter '__ns' not described in 'ns_tree_remove'
Fixes: 885fc8ac0a4d ("nstree: make iterator generic")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260416215429.948898-1-rdunlap@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
"These fix two potential refcount leaks in error code paths in the ACPI
core code, address a recently introduced build breakage related to the
CPU UID handling consolidation, fix up a recently added MAINTAINERS
entry, fix the quirk list in the ACPI video bus driver, and add a new
quirk to it:
- Add an acpi_get_cpu_uid() stub helper to address an x86 Xen support
build breakage (Arnd Bergmann)
- Use acpi_dev_put() in object add error paths in the ACPI core to
avoid refcount leaks (Guangshuo Li)
- Adjust the file entry in the recently added NVIDIA GHES HANDLER
entry in MAINTAINERS to the actual existing file (Lukas Bulwahn)
- Add backlight=native quirk for Dell OptiPlex 7770 AIO to the ACPI
video bus driver (Jan Schär)
- Move Lenovo Legion S7 15ACH6 quirk to the right section of the
quirk list in the ACPI video bus driver (Hans de Goede)"
* tag 'acpi-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: video: Move Lenovo Legion S7 15ACH6 quirk to the right section
ACPI: video: Add backlight=native quirk for Dell OptiPlex 7770 AIO
ACPI: add acpi_get_cpu_uid() stub helper
MAINTAINERS: adjust file entry in NVIDIA GHES HANDLER
ACPI: scan: Use acpi_dev_put() in object add error paths
|
|
Remove the amateur radio (AX.25, NET/ROM, ROSE) protocol implementation
and all associated hamradio device drivers from the kernel tree.
This set of protocols has long been a huge bug/syzbot magnet,
and since nobody stepped up to help us deal with the influx
of the AI-generated bug reports we need to move it out of tree
to protect our sanity.
The code is moved to an out-of-tree repo:
https://github.com/linux-netdev/mod-orphan
if it's cleaned up and reworked there we can accept it back.
Minimal stub headers are kept for include/net/ax25.h (AX25_P_IP,
AX25_ADDR_LEN, ax25_address) and include/net/rose.h (ROSE_ADDR_LEN)
so that the conditional integration code in arp.c and tun.c continues
to compile and work when the out-of-tree modules are loaded.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Carlos Bilbao <carlos.bilbao@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20260421021824.1293976-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove the ISDN (mISDN, CAPI) subsystem and Bluetooth CMTP protocol
from the kernel tree.
ISDN is a pretty old technology and it's unclear whether anyone still
uses it. I went over the last few years of git history and all the
commits are either tree-wide conversions or syzbot/static analyzer
fixes.
When we discussed removal in the past IIRC there were some concerns
about ISDN still being used in parts of Germany. Unfortunately, the
code base is quite old, none of the current maintainers are familiar
with it and AI tools will have a field day finding bugs here.
Delete this code and preserve it in an out-of-tree repository
for any remaining users:
https://github.com/linux-netdev/mod-orphan
UAPI constants AF_ISDN/PF_ISDN and the SELinux isdn_socket class
are preserved for ABI stability, but the rest of uAPI is removed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260421022108.1299678-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove CAIF (Communication CPU to Application CPU Interface), the
ST-Ericsson modem protocol. The subsystem has been orphaned since 2013.
The last meaningful changes from the maintainers were in March 2013:
a8c7687bf216 ("caif_virtio: Check that vringh_config is not null")
b2273be8d2df ("caif_virtio: Use vringh_notify_enable correctly")
0d2e1a2926b1 ("caif_virtio: Introduce caif over virtio")
Not-so-coincidentally, according to "the Internet" ST-Ericsson officially
shut down its modem joint venture in Aug 2013.
If anyone is using this code please yell!
In the 13 years since, the code has accumulated 200 non-merge commits,
of which 71 were cross-tree API changes, 21 carried Fixes: tags, and
the remaining ~110 were cleanups, doc conversions, treewide refactors,
and one partial removal (caif_hsi, ca75bcf0a83b).
We are still getting fixes to this code, in the last 10 days there were
3 reports on security@ about CAIF that I have been CCed on.
UAPI constants (AF_CAIF, ARPHRD_CAIF, N_CAIF, VIRTIO_ID_CAIF) and the
SELinux classmap entry are intentionally kept for ABI stability.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260416182829.1440262-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|