| Age | Commit message (Collapse) | Author |
|
sendmsg() with a single iov becomes ITER_UBUF, sendmsg() with multiple
iovs becomes ITER_IOVEC. iter_iov_len does not return correct
value for UBUF, so teach to treat UBUF differently.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As explained in the next patch, SO_PASSRIGHTS would have a problem
if we assigned a corresponding bit to socket->flags, so it must be
managed in struct sock.
Mixing socket->flags and sk->sk_flags for similar options will look
confusing, and sk->sk_flags does not have enough space on 32bit system.
Also, as mentioned in commit 16e572626961 ("af_unix: dont send
SCM_CREDENTIALS by default"), SOCK_PASSCRED and SOCK_PASSPID handling
is known to be slow, and managing the flags in struct socket cannot
avoid that for embryo sockets.
Let's move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
While at it, other SOCK_XXX flags in net.h are grouped as enum.
Note that assign_bit() was atomic, so the writer side is moved down
after lock_sock() in setsockopt(), but the bit is only read once
in sendmsg() and recvmsg(), so lock_sock() is not needed there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Lots of new things, notably:
* ath12k: monitor mode for WCN7850, better 6 GHz regulatory
* brcmfmac: SAE for some Cypress devices
* iwlwifi: rework device configuration
* mac80211: scan improvements with MLO
* mt76: EHT improvements, new device IDs
* rtw88: throughput improvements
* rtw89: MLO, STA/P2P concurrency improvements, SAR
* tag 'wireless-next-2025-05-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (389 commits)
wifi: mt76: mt7925: add rfkill_poll for hardware rfkill
wifi: mt76: support power delta calculation for 5 TX paths
wifi: mt76: fix available_antennas setting
wifi: mt76: mt7996: fix RX buffer size of MCU event
wifi: mt76: mt7996: change max beacon size
wifi: mt76: mt7996: fix invalid NSS setting when TX path differs from NSS
wifi: mt76: mt7996: drop fragments with multicast or broadcast RA
wifi: mt76: mt7996: set EHT max ampdu length capability
wifi: mt76: mt7996: fix beamformee SS field
wifi: mt76: remove capability of partial bandwidth UL MU-MIMO
wifi: mt76: mt7925: add test mode support
wifi: mt76: mt7925: extend MCU support for testmode
wifi: mt76: mt7925: ensure all MCU commands wait for response
wifi: mt76: mt7925: refine the sniffer commnad
wifi: mt76: mt7925: prevent multiple scan commands
wifi: mt76: mt7915: Fix null-ptr-deref in mt7915_mmio_wed_init()
wifi: mt76: mt7996: Fix null-ptr-deref in mt7996_mmio_wed_init()
wifi: mt76: mt7925: add RNR scan support for 6GHz
wifi: mt76: add mt76_connac_mcu_build_rnr_scan_param routine
wifi: mt76: scan: Fix 'mlink' dereferenced before IS_ERR_OR_NULL check
...
====================
Link: https://patch.msgid.link/20250522165501.189958-50-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc8).
Conflicts:
80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
4bcc063939a5 ("ice, irdma: fix an off by one in error handling code")
c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au
No extra adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"This is somewhat larger than what I hoped for, with a few PRs from
subsystems and follow-ups for the recent netdev locking changes,
anyhow there are no known pending regressions.
Including fixes from bluetooth, ipsec and CAN.
Current release - regressions:
- eth: team: grab team lock during team_change_rx_flags
- eth: bnxt_en: fix netdev locking in ULP IRQ functions
Current release - new code bugs:
- xfrm: ipcomp: fix truesize computation on receive
- eth: airoha: fix page recycling in airoha_qdma_rx_process()
Previous releases - regressions:
- sched: hfsc: fix qlen accounting bug when using peek in
hfsc_enqueue()
- mr: consolidate the ipmr_can_free_table() checks.
- bridge: netfilter: fix forwarding of fragmented packets
- xsk: bring back busy polling support in XDP_COPY
- can:
- add missing rcu read protection for procfs content
- kvaser_pciefd: force IRQ edge in case of nested IRQ
Previous releases - always broken:
- xfrm: espintcp: remove encap socket caching to avoid reference leak
- bluetooth: use skb_pull to avoid unsafe access in QCA dump handling
- eth: idpf:
- fix null-ptr-deref in idpf_features_check
- fix idpf_vport_splitq_napi_poll()
- eth: hibmcge: fix wrong ndo.open() after reset fail issue"
* tag 'net-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG
octeontx2-af: Set LMT_ENA bit for APR table entries
net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf
selftests/tc-testing: Add an HFSC qlen accounting test
sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
idpf: fix idpf_vport_splitq_napi_poll()
net: hibmcge: fix wrong ndo.open() after reset fail issue.
net: hibmcge: fix incorrect statistics update issue
xsk: Bring back busy polling support in XDP_COPY
can: slcan: allow reception of short error messages
net: lan743x: Restore SGMII CTRL register on resume
bnxt_en: Fix netdev locking in ULP IRQ functions
MAINTAINERS: Drop myself to reviewer for ravb driver
net: dwmac-sun8i: Use parsed internal PHY address instead of 1
net: ethernet: ti: am65-cpsw: Lower random mac address error print to info
can: kvaser_pciefd: Continue parsing DMA buf after dropped RX
can: kvaser_pciefd: Fix echo_skb race
can: kvaser_pciefd: Force IRQ edge in case of nested IRQ
idpf: fix null-ptr-deref in idpf_features_check
...
|
|
Multi-PTP source support within a network topology has been merged,
but the hardware timestamp source is not yet exposed to users.
Currently, users only see the PTP index, which does not indicate
whether the timestamp comes from a PHY or a MAC.
Add support for reporting the hwtstamp source using a
hwtstamp-source field, alongside hwtstamp-phyindex, to describe
the origin of the hardware timestamp.
Remove HWTSTAMP_SOURCE_UNSPEC enum value as it is not used at all.
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20250519-feature_ptp_source-v4-1-5d10e19a0265@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Introduce new API, genphy_match_phy_device(), to provide a way to check
to match a PHY driver for a PHY device based on the info stored in the
PHY device struct.
The function generalize the logic used in phy_bus_match() to check the
PHY ID whether if C45 or C22 ID should be used for matching.
This is useful for custom .match_phy_device function that wants to use
the generic logic under some condition. (example a PHY is already setup
and provide the correct PHY ID)
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://patch.msgid.link/20250517201353.5137-5-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pass PHY driver pointer to .match_phy_device OP in addition to phydev.
Having access to the PHY driver struct might be useful to check the
PHY ID of the driver is being matched for in case the PHY ID scanned in
the phydev is not consistent.
A scenario for this is a PHY that change PHY ID after a firmware is
loaded, in such case, the PHY ID stored in PHY device struct is not
valid anymore and PHY will manually scan the ID in the match_phy_device
function.
Having the PHY driver info is also useful for those PHY driver that
implement multiple simple .match_phy_device OP to match specific MMD PHY
ID. With this extra info if the parsing logic is the same, the matching
function can be generalized by using the phy_id in the PHY driver
instead of hardcoding.
Rust wrapper callback is updated to align to the new match_phy_device
arguments.
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Benno Lossin <lossin@kernel.org> # for Rust
Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Link: https://patch.msgid.link/20250517201353.5137-2-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that skb_copy_and_hash_datagram_iter() is no longer used, remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-11-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since skb_copy_and_hash_datagram_iter() is used only with CRC32C, the
crypto_ahash abstraction provides no value. Add
skb_copy_and_crc32c_datagram_iter() which just calls crc32c() directly.
This is faster and simpler. It also doesn't have the weird dependency
issue where skb_copy_and_hash_datagram_iter() depends on
CONFIG_CRYPTO_HASH=y without that being expressed explicitly in the
kconfig (presumably because it was too heavyweight for NET to select).
The new function is conditional on the hidden boolean symbol NET_CRC32C,
which selects CRC32. So it gets compiled only when something that
actually needs CRC32C packet checksums is enabled, it has no implicit
dependency, and it doesn't depend on the heavyweight crypto layer.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-9-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
crc32c_combine() and crc32c_shift() are no longer used (except by the
KUnit test that tests them), and their current implementation is very
slow. Remove them.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-8-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the only remaining caller of __skb_checksum() is
skb_checksum(), fold __skb_checksum() into skb_checksum(). This makes
struct skb_checksum_ops unnecessary, so remove that too and simply do
the "regular" net checksum. It also makes the wrapper functions
csum_partial_ext() and csum_block_add_ext() unnecessary, so remove those
too and just use the underlying functions.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-7-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add skb_crc32c(), which calculates the CRC32C of a sk_buff. It will
replace __skb_checksum(), which unnecessarily supports arbitrary
checksums. Compared to __skb_checksum(), skb_crc32c():
- Uses the correct type for CRC32C values (u32, not __wsum).
- Does not require the caller to provide a skb_checksum_ops struct.
- Is faster because it does not use indirect calls and does not use
the very slow crc32c_combine().
According to commit 2817a336d4d5 ("net: skb_checksum: allow custom
update/combine for walking skb") which added __skb_checksum(), the
original motivation for the abstraction layer was to avoid code
duplication for CRC32C and other checksums in the future. However:
- No additional checksums showed up after CRC32C. __skb_checksum()
is only used with the "regular" net checksum and CRC32C.
- Indirect calls are expensive. Commit 2544af0344ba ("net: avoid
indirect calls in L4 checksum calculation") worked around this
using the INDIRECT_CALL_1 macro. But that only avoided the indirect
call for the net checksum, and at the cost of an extra branch.
- The checksums use different types (__wsum and u32), causing casts
to be needed.
- It made the checksums of fragments be combined (rather than
chained) for both checksums, despite this being highly
counterproductive for CRC32C due to how slow crc32c_combine() is.
This can clearly be seen in commit 4c2f24549644 ("sctp: linearize
early if it's not GSO") which tried to work around this performance
bug. With a dedicated function for each checksum, we can instead
just use the proper strategy for each checksum.
As shown by the following tables, the new function skb_crc32c() is
faster than __skb_checksum(), with the improvement varying greatly from
5% to 2500% depending on the case. The largest improvements come from
fragmented packets, mainly due to eliminating the inefficient
crc32c_combine(). But linear packets are improved too, especially
shorter ones, mainly due to eliminating indirect calls. These
benchmarks were done on AMD Zen 5. On that CPU, Linux uses IBRS instead
of retpoline; an even greater improvement might be seen with retpoline:
Linear sk_buffs
Length in bytes __skb_checksum cycles skb_crc32c cycles
=============== ===================== =================
64 43 18
256 94 77
1420 204 161
16384 1735 1642
Nonlinear sk_buffs (even split between head and one fragment)
Length in bytes __skb_checksum cycles skb_crc32c cycles
=============== ===================== =================
64 579 22
256 829 77
1420 1506 194
16384 4365 1682
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-3-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Constify the passed struct fixed_phy_status *status where possible.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/d1764b62-8538-408b-a4e3-b63715481a38@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All callers pass PHY_POLL, therefore remove irq argument from
fixed_phy_register().
Note: I keep the irq argument in fixed_phy_add_gpiod() for now,
for the case that somebody may want to use a GPIO interrupt in
the future, by e.g. adding a call to fwnode_irq_get() to
fixed_phy_get_gpiod().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/31cdb232-a5e9-4997-a285-cb9a7d208124@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All callers pass PHY_POLL, therefore remove irq argument from
fixed_phy_add().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Link: https://patch.msgid.link/b3b9b3bc-c310-4a54-b376-c909c83575de@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After having factored out the provider part from mdio_bus.c, we can
make the mdio consumer / device layer a separate module. This also
allows to remove Kconfig symbol MDIO_DEVICE.
The module init / exit functions from mdio_bus.c no longer have to be
called from phy_device.c. The link order defined in
drivers/net/phy/Makefile ensures that init / exit functions are called
in the right order.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/dba6b156-5748-44ce-b5e2-e8dc2fcee5a7@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Seems like the extack cookie hasn't found any users outside
of wireless, which always uses nl_set_extack_cookie_u64().
Thus, allocating 20 bytes for it is pointless, reduce that
to 8 bytes, and add a BUILD_BUG_ON() to ensure it's enough
(obviously it is, for a u64, but in case it changes again.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20250516115927.38209-2-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Nine singleton hotfixes, all MM. Four are cc:stable"
* tag 'mm-hotfixes-stable-2025-05-17-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: userfaultfd: correct dirty flags set for both present and swap pte
zsmalloc: don't underflow size calculation in zs_obj_write()
mm/page_alloc: fix race condition in unaccepted memory handling
mm/page_alloc: ensure try_alloc_pages() plays well with unaccepted memory
MAINTAINERS: add mm GUP section
mm/codetag: move tag retrieval back upfront in __free_pages()
mm/memory: fix mapcount / refcount sanity check for mTHP reuse
kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork()
mm: hugetlb: fix incorrect fallback for subpool
|
|
Guoyu Yin reported a splat in the ipmr netns cleanup path:
WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_free_table net/ipv4/ipmr.c:440 [inline]
WARNING: CPU: 2 PID: 14564 at net/ipv4/ipmr.c:440 ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361
Modules linked in:
CPU: 2 UID: 0 PID: 14564 Comm: syz.4.838 Not tainted 6.14.0 #1
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:ipmr_free_table net/ipv4/ipmr.c:440 [inline]
RIP: 0010:ipmr_rules_exit+0x135/0x1c0 net/ipv4/ipmr.c:361
Code: ff df 48 c1 ea 03 80 3c 02 00 75 7d 48 c7 83 60 05 00 00 00 00 00 00 5b 5d 41 5c 41 5d 41 5e e9 71 67 7f 00 e8 4c 2d 8a fd 90 <0f> 0b 90 eb 93 e8 41 2d 8a fd 0f b6 2d 80 54 ea 01 31 ff 89 ee e8
RSP: 0018:ffff888109547c58 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888108c12dc0 RCX: ffffffff83e09868
RDX: ffff8881022b3300 RSI: ffffffff83e098d4 RDI: 0000000000000005
RBP: ffff888104288000 R08: 0000000000000000 R09: ffffed10211825c9
R10: 0000000000000001 R11: ffff88801816c4a0 R12: 0000000000000001
R13: ffff888108c13320 R14: ffff888108c12dc0 R15: fffffbfff0b74058
FS: 00007f84f39316c0(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f84f3930f98 CR3: 0000000113b56000 CR4: 0000000000350ef0
Call Trace:
<TASK>
ipmr_net_exit_batch+0x50/0x90 net/ipv4/ipmr.c:3160
ops_exit_list+0x10c/0x160 net/core/net_namespace.c:177
setup_net+0x47d/0x8e0 net/core/net_namespace.c:394
copy_net_ns+0x25d/0x410 net/core/net_namespace.c:516
create_new_namespaces+0x3f6/0xaf0 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc3/0x180 kernel/nsproxy.c:228
ksys_unshare+0x78d/0x9a0 kernel/fork.c:3342
__do_sys_unshare kernel/fork.c:3413 [inline]
__se_sys_unshare kernel/fork.c:3411 [inline]
__x64_sys_unshare+0x31/0x40 kernel/fork.c:3411
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xa6/0x1a0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f84f532cc29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f84f3931038 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f84f5615fa0 RCX: 00007f84f532cc29
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000400
RBP: 00007f84f53fba18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f84f5615fa0 R15: 00007fff51c5f328
</TASK>
The running kernel has CONFIG_IP_MROUTE_MULTIPLE_TABLES disabled, and
the sanity check for such build is still too loose.
Address the issue consolidating the relevant sanity check in a single
helper regardless of the kernel configuration. Also share it between
the ipv4 and ipv6 code.
Reported-by: Guoyu Yin <y04609127@gmail.com>
Fixes: 50b94204446e ("ipmr: tune the ipmr_can_free_table() checks.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/372dc261e1bf12742276e1b984fc5a071b7fc5a8.1747321903.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since its introduction 6 yrs ago this functions has never had a user.
So remove it.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/ccbeef28-65ae-4e28-b1db-816c44338dee@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull NFS client bugfixes from Trond Myklebust:
- NFS: Fix a couple of missed handlers for the ENETDOWN and ENETUNREACH
transport errors
- NFS: Handle Oopsable failure of nfs_get_lock_context in the unlock
path
- NFSv4: Fix a race in nfs_local_open_fh()
- NFSv4/pNFS: Fix a couple of layout segment leaks in layoutreturn
- NFSv4/pNFS Avoid sharing pNFS DS connections between net namespaces
since IP addresses are not guaranteed to refer to the same nodes
- NFS: Don't flush file data while holding multiple directory locks in
nfs_rename()
* tag 'nfs-for-6.15-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Avoid flushing data while holding directory locks in nfs_rename()
NFS/pnfs: Fix the error path in pnfs_layoutreturn_retry_later_locked()
NFSv4/pnfs: Reset the layout state after a layoutreturn
NFS/localio: Fix a race in nfs_local_open_fh()
nfs: nfs3acl: drop useless assignment in nfs3_get_acl()
nfs: direct: drop useless initializer in nfs_direct_write_completion()
nfs: move the nfs4_data_server_cache into struct nfs_net
nfs: don't share pNFS DS connections between net namespaces
nfs: handle failure of nfs_get_lock_context in unlock path
pNFS/flexfiles: Record the RPC errors in the I/O tracepoints
NFSv4/pnfs: Layoutreturn on close must handle fatal networking errors
NFSv4: Handle fatal ENETDOWN and ENETUNREACH errors
|
|
The Linux client assumes that all filehandles are non-volatile for
renames within the same directory (otherwise sillyrename cannot work).
However, the existence of the Linux 'subtree_check' export option has
meant that nfs_rename() has always assumed it needs to flush writes
before attempting to rename.
Since NFSv4 does allow the client to query whether or not the server
exhibits this behaviour, and since knfsd does actually set the
appropriate flag when 'subtree_check' is enabled on an export, it
should be OK to optimise away the write flushing behaviour in the cases
where it is clearly not needed.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"Fix to zone block devices to make the maximum segment count match what
the block layer is capable of"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A handful small fixes. The only significant change is the fix for MIDI
2.0 UMP handling in ALSA sequencer, but as MIDI 2.0 stuff is still new
and rarely used, the impact should be pretty limited.
Other than that, quirks for USB-audio and a few cosmetic fixes and
changes in drivers that should be safe to apply"
* tag 'sound-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
ALSA: sh: SND_AICA should depend on SH_DMA_API
ALSA: usb-audio: Add sample rate quirk for Audioengine D1
ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info
ALSA/hda: intel-sdw-acpi: Correct sdw_intel_acpi_scan() function parameter
ALSA: seq: Fix delivery of UMP events to group ports
|
|
If the application can not drain fast enough a TCP socket queue,
tcp_rcv_space_adjust() can overestimate tp->rcvq_space.space.
Then sk->sk_rcvbuf can grow and hit tcp_rmem[2] for no good reason.
Fix this by taking into acount the number of available bytes.
Keeping sk->sk_rcvbuf at the right size allows better cache efficiency.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Wei Wang <weiwan@google.com>
Link: https://patch.msgid.link/20250513193919.1089692-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc7).
Conflicts:
tools/testing/selftests/drivers/net/hw/ncdevmem.c
97c4e094a4b2 ("tests/ncdevmem: Fix double-free of queue array")
2f1a805f32ba ("selftests: ncdevmem: Implement devmem TCP TX")
https://lore.kernel.org/20250514122900.1e77d62d@canb.auug.org.au
Adjacent changes:
net/core/devmem.c
net/core/devmem.h
0afc44d8cdf6 ("net: devmem: fix kernel panic when netlink socket close after module unload")
bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth and wireless.
A few more fixes for the locking changes trickling in. Nothing too
alarming, I suspect those will continue for another release. Other
than that things are slowing down nicely.
Current release - fix to a fix:
- Bluetooth: hci_event: use key encryption size when its known
- tools: ynl-gen: allow multi-attr without nested-attributes again
Current release - regressions:
- locking fixes:
- lock lower level devices when updating features
- eth: bnxt_en: bring back rtnl_lock() in the bnxt_open() path
- devmem: fix panic when Netlink socket closes after module unload
Current release - new code bugs:
- eth: txgbe: fixes for FW communication on new AML devices
Previous releases - always broken:
- sched: flush gso_skb list too during ->change(), avoid potential
null-deref on reconfig
- wifi: mt76: disable NAPI on driver removal
- hv_netvsc: fix error 'nvsp_rndis_pkt_complete error status: 2'"
* tag 'net-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
net: devmem: fix kernel panic when netlink socket close after module unload
tsnep: fix timestamping with a stacked DSA driver
net/tls: fix kernel panic when alloc_page failed
bnxt_en: bring back rtnl_lock() in the bnxt_open() path
mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
octeontx2-pf: Do not reallocate all ntuple filters
wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
wifi: mt76: disable napi on driver removal
Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
hv_netvsc: Remove rmsg_pgcnt
hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
octeontx2-af: Fix CGX Receive counters
net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
net: libwx: Fix FW mailbox unknown command
net: libwx: Fix FW mailbox reply timeout
net: txgbe: Fix to calculate EEPROM checksum for AML devices
octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
...
|
|
mirred_nest_level is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Move mirred_nest_level to struct netdev_xmit as u8, provide wrappers.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://patch.msgid.link/20250512092736.229935-11-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
system_page_pool is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Make a struct with a page_pool member (original system_page_pool) and a
local_lock_t and use local_lock_nested_bh() for locking. This change
adds only lockdep coverage and does not alter the functional behaviour
for !PREEMPT_RT.
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-6-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Update the for_each_netdev_in_bond_rcu macro to iterate through network
devices in the bond's network namespace instead of always using
init_net. This change is safe because:
1. **Bond-Slave Namespace Relationship**: A bond device and its slaves
must reside in the same network namespace. The bond device's
namespace is established at creation time and cannot change.
2. **Slave Movement Implications**: Any attempt to move a slave device
to a different namespace automatically removes it from the bond, as
per kernel networking stack rules.
This maintains the invariant that slaves must exist in the same
namespace as their bond.
This change is part of an effort to enable Link Aggregation (LAG) to
work properly inside custom network namespaces. Previously, the macro
would only find slave devices in the initial network namespace,
preventing proper bonding functionality in custom namespaces.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250513081922.525716-1-mbloch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
callers. Remove it.
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-6-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With some Infineon chips the timeouts in tpm_tis_send_data (both B and
C) can reach up to about 2250 ms.
Timeout C is retried since
commit de9e33df7762 ("tpm, tpm_tis: Workaround failed command reception on Infineon devices")
Timeout B still needs to be extended.
The problem is most commonly encountered with context related operation
such as load context/save context. These are issued directly by the
kernel, and there is no retry logic for them.
When a filesystem is set up to use the TPM for unlocking the boot fails,
and restarting the userspace service is ineffective. This is likely
because ignoring a load context/save context result puts the real TPM
state and the TPM state expected by the kernel out of sync.
Chips known to be affected:
tpm_tis IFX1522:00: 2.0 TPM (device-id 0x1D, rev-id 54)
Description: SLB9672
Firmware Revision: 15.22
tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22)
Firmware Revision: 7.83
tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1A, rev-id 16)
Firmware Revision: 5.63
Link: https://lore.kernel.org/linux-integrity/Z5pI07m0Muapyu9w@kitsune.suse.cz/
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
tpm2_start_auth_session() does not mask TPM RC correctly from the callers:
[ 28.766528] tpm tpm0: A TPM error (2307) occurred start auth session
Process TPM RCs inside tpm2_start_auth_session(), and map them to POSIX
error codes.
Cc: stable@vger.kernel.org # v6.10+
Fixes: 699e3efd6c64 ("tpm: Add HMAC session start and end functions")
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Closes: https://lore.kernel.org/linux-integrity/Z_NgdRHuTKP6JK--@gondor.apana.org.au/
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The functionality of mdiobus_register_board_info() typically isn't
optional for the caller. Therefore remove the stub.
Note: Currently we have only one caller of mdiobus_register_board_info(),
in a DSA/PHYLINK context. Therefore CONFIG_MDIO_DEVICE is selected anyway.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/410a2222-c4e8-45b0-9091-d49674caeb00@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drivers need to make sure not to pass netmem dma-addrs to the
dma-mapping API in order to support netmem TX.
Add helpers and netmem_dma_*() helpers that enables special handling of
netmem dma-addrs that drivers can use.
Document in netmem.rst what drivers need to do to support netmem TX.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-7-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Augment dmabuf binding to be able to handle TX. Additional to all the RX
binding, we also create tx_vec needed for the TX path.
Provide API for sendmsg to be able to send dmabufs bound to this device:
- Provide a new dmabuf_tx_cmsg which includes the dmabuf to send from.
- MSG_ZEROCOPY with SCM_DEVMEM_DMABUF cmsg indicates send from dma-buf.
Devmem is uncopyable, so piggyback off the existing MSG_ZEROCOPY
implementation, while disabling instances where MSG_ZEROCOPY falls back
to copying.
We additionally pipe the binding down to the new
zerocopy_fill_skb_from_devmem which fills a TX skb with net_iov netmems
instead of the traditional page netmems.
We also special case skb_frag_dma_map to return the dma-address of these
dmabuf net_iovs instead of attempting to map pages.
The TX path may release the dmabuf in a context where we cannot wait.
This happens when the user unbinds a TX dmabuf while there are still
references to its netmems in the TX path. In that case, the netmems will
be put_netmem'd from a context where we can't unmap the dmabuf, Resolve
this by making __net_devmem_dmabuf_binding_free schedule_work'd.
Based on work by Stanislav Fomichev <sdf@fomichev.me>. A lot of the meat
of the implementation came from devmem TCP RFC v1[1], which included the
TX path, but Stan did all the rebasing on top of netmem/net_iov.
Cc: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Kaiyuan Zhang <kaiyuanz@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-5-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently net_iovs support only pp ref counts, and do not support a
page ref equivalent.
This is fine for the RX path as net_iovs are used exclusively with the
pp and only pp refcounting is needed there. The TX path however does not
use pp ref counts, thus, support for get_page/put_page equivalent is
needed for netmem.
Support get_netmem/put_netmem. Check the type of the netmem before
passing it to page or net_iov specific code to obtain a page ref
equivalent.
For dmabuf net_iovs, we obtain a ref on the underlying binding. This
ensures the entire binding doesn't disappear until all the net_iovs have
been put_netmem'ed. We do not need to track the refcount of individual
dmabuf net_iovs as we don't allocate/free them from a pool similar to
what the buddy allocator does for pages.
This code is written to be extensible by other net_iov implementers.
get_netmem/put_netmem will check the type of the netmem and route it to
the correct helper:
pages -> [get|put]_page()
dmabuf net_iovs -> net_devmem_[get|put]_net_iov()
new net_iovs -> new helpers
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-3-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The KSZ9477 PHY driver contained workarounds for broken EEE capability
advertisements by manually masking supported EEE modes and forcibly
disabling EEE if MICREL_NO_EEE was set.
With proper MAC-side EEE handling implemented via phylink, these quirks
are no longer necessary. Remove MICREL_NO_EEE handling and the use of
ksz9477_get_features().
This simplifies the PHY driver and avoids duplicated EEE management logic.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: stable@vger.kernel.org # v6.14+
Link: https://patch.msgid.link/20250504081434.424489-3-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The REPORT ZONES buffer size is currently limited by the HBA's maximum
segment count to ensure the buffer can be mapped. However, the block
layer further limits the number of iovec entries to 1024 when allocating
a bio.
To avoid allocation of buffers too large to be mapped, further restrict
the maximum buffer size to BIO_MAX_INLINE_VECS.
Replace the UIO_MAXIOV symbolic name with the more contextually
appropriate BIO_MAX_INLINE_VECS.
Fixes: b091ac616846 ("sd_zbc: Fix report zones buffer allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Steve Siwinski <ssiwinski@atto.com>
Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux
Tony Nguyen says:
====================
Prepare for Intel IPU E2000 (GEN3)
This is the first part in introducing RDMA support for idpf.
----------------------------------------------------------------
Tatyana Nikolova says:
To align with review comments, the patch series introducing RDMA
RoCEv2 support for the Intel Infrastructure Processing Unit (IPU)
E2000 line of products is going to be submitted in three parts:
1. Modify ice to use specific and common IIDC definitions and
pass a core device info to irdma.
2. Add RDMA support to idpf and modify idpf to use specific and
common IIDC definitions and pass a core device info to irdma.
3. Add RDMA RoCEv2 support for the E2000 products, referred to as
GEN3 to irdma.
This first part is a 5 patch series based on the original
"iidc/ice/irdma: Update IDC to support multiple consumers" patch
to allow for multiple CORE PCI drivers, using the auxbus.
Patches:
1) Move header file to new name for clarity and replace ice
specific DSCP define with a kernel equivalent one in irdma
2) Unify naming convention
3) Separate header file into common and driver specific info
4) Replace ice specific DSCP define with a kernel equivalent
one in ice
5) Implement core device info struct and update drivers to use it
----------------------------------------------------------------
v1: https://lore.kernel.org/20250505212037.2092288-1-anthony.l.nguyen@intel.com
IWL reviews:
[v5] https://lore.kernel.org/20250416021549.606-1-tatyana.e.nikolova@intel.com
[v4] https://lore.kernel.org/20250225050428.2166-1-tatyana.e.nikolova@intel.com
[v3] https://lore.kernel.org/20250207194931.1569-1-tatyana.e.nikolova@intel.com
[v2] https://lore.kernel.org/20240824031924.421-1-tatyana.e.nikolova@intel.com
[v1] https://lore.kernel.org/20240724233917.704-1-tatyana.e.nikolova@intel.com
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux:
iidc/ice/irdma: Update IDC to support multiple consumers
ice: Replace ice specific DSCP mapping num with a kernel define
iidc/ice/irdma: Break iidc.h into two headers
iidc/ice/irdma: Rename to iidc_* convention
iidc/ice/irdma: Rename IDC header file
====================
Link: https://patch.msgid.link/20250509200712.2911060-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 51ff4d7486f0 ("mm: avoid extra mem_alloc_profiling_enabled()
checks") introduces a possible use-after-free scenario, when page
is non-compound, page[0] could be released by other thread right
after put_page_testzero failed in current thread, pgalloc_tag_sub_pages
afterwards would manipulate an invalid page for accounting remaining
pages:
[timeline] [thread1] [thread2]
| alloc_page non-compound
V
| get_page, rf counter inc
V
| in ___free_pages
| put_page_testzero fails
V
| put_page, page released
V
| in ___free_pages,
| pgalloc_tag_sub_pages
| manipulate an invalid page
V
Restore __free_pages() to its state before, retrieve alloc tag
beforehand.
Link: https://lkml.kernel.org/r/20250505193034.91682-1-00107082@163.com
Fixes: 51ff4d7486f0 ("mm: avoid extra mem_alloc_profiling_enabled() checks")
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 ITS mitigation from Dave Hansen:
"Mitigate Indirect Target Selection (ITS) issue.
I'd describe this one as a good old CPU bug where the behavior is
_obviously_ wrong, but since it just results in bad predictions it
wasn't wrong enough to notice. Well, the researchers noticed and also
realized that thus bug undermined a bunch of existing indirect branch
mitigations.
Thus the unusually wide impact on this one. Details:
ITS is a bug in some Intel CPUs that affects indirect branches
including RETs in the first half of a cacheline. Due to ITS such
branches may get wrongly predicted to a target of (direct or indirect)
branch that is located in the second half of a cacheline. Researchers
at VUSec found this behavior and reported to Intel.
Affected processors:
- Cascade Lake, Cooper Lake, Whiskey Lake V, Coffee Lake R, Comet
Lake, Ice Lake, Tiger Lake and Rocket Lake.
Scope of impact:
- Guest/host isolation:
When eIBRS is used for guest/host isolation, the indirect branches
in the VMM may still be predicted with targets corresponding to
direct branches in the guest.
- Intra-mode using cBPF:
cBPF can be used to poison the branch history to exploit ITS.
Realigning the indirect branches and RETs mitigates this attack
vector.
- User/kernel:
With eIBRS enabled user/kernel isolation is *not* impacted by ITS.
- Indirect Branch Prediction Barrier (IBPB):
Due to this bug indirect branches may be predicted with targets
corresponding to direct branches which were executed prior to IBPB.
This will be fixed in the microcode.
Mitigation:
As indirect branches in the first half of cacheline are affected, the
mitigation is to replace those indirect branches with a call to thunk that
is aligned to the second half of the cacheline.
RETs that take prediction from RSB are not affected, but they may be
affected by RSB-underflow condition. So, RETs in the first half of
cacheline are also patched to a return thunk that executes the RET aligned
to second half of cacheline"
* tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftest/x86/bugs: Add selftests for ITS
x86/its: FineIBT-paranoid vs ITS
x86/its: Use dynamic thunks for indirect branches
x86/ibt: Keep IBT disabled during alternative patching
mm/execmem: Unify early execmem_cache behaviour
x86/its: Align RETs in BHB clear sequence to avoid thunking
x86/its: Add support for RSB stuffing mitigation
x86/its: Add "vmexit" option to skip mitigation on some CPUs
x86/its: Enable Indirect Target Selection mitigation
x86/its: Add support for ITS-safe return thunk
x86/its: Add support for ITS-safe indirect thunk
x86/its: Enumerate Indirect Target Selection (ITS) bug
Documentation: x86/bugs/its: Add ITS documentation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc timers fixes from Ingo Molnar:
- Fix time keeping bugs in CLOCK_MONOTONIC_COARSE clocks
- Work around absolute relocations into vDSO code that GCC erroneously
emits in certain arm64 build environments
- Fix a false positive lockdep warning in the i8253 clocksource driver
* tag 'timers-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
arm64: vdso: Work around invalid absolute relocations from GCC
timekeeping: Prevent coarse clocks going backwards
|
|
The acpi_handle should be just a handle and not a pointer in
sdw_intel_acpi_scan() parameter list.
It is called with 'acpi_handle handle' as parameter and it is passing it to
acpi_walk_namespace, which also expects acpi_handle and not acpi_handle*
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://patch.msgid.link/20250508181207.22113-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"22 hotfixes. 13 are cc:stable and the remainder address post-6.14
issues or aren't considered necessary for -stable kernels.
About half are for MM. Five OCFS2 fixes and a few MAINTAINERS updates"
* tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
mm: fix folio_pte_batch() on XEN PV
nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()
mm/hugetlb: copy the CMA flag when demoting
mm, swap: fix false warning for large allocation with !THP_SWAP
selftests/mm: fix a build failure on powerpc
selftests/mm: fix build break when compiling pkey_util.c
mm: vmalloc: support more granular vrealloc() sizing
tools/testing/selftests: fix guard region test tmpfs assumption
ocfs2: stop quota recovery before disabling quotas
ocfs2: implement handshaking with ocfs2 recovery thread
ocfs2: switch osb->disable_recovery to enum
mailmap: map Uwe's BayLibre addresses to a single one
MAINTAINERS: add mm THP section
mm/userfaultfd: fix uninitialized output field for -EAGAIN race
selftests/mm: compaction_test: support platform with huge mount of memory
MAINTAINERS: add core mm section
ocfs2: fix panic in failed foilio allocation
mm/huge_memory: fix dereferencing invalid pmd migration entry
MAINTAINERS: add reverse mapping section
x86: disable image size check for test builds
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO driver fixes from Greg KH:
"Here are a bunch of small driver fixes (mostly all IIO) for 6.15-rc6.
Included in here are:
- loads of tiny IIO driver fixes for reported issues
- hyperv driver fix for a much-reported and worked on sysfs ring
buffer creation bug
All of these have been in linux-next for over a week (the IIO ones for
many weeks now), with no reported issues"
* tag 'char-misc-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits)
Drivers: hv: Make the sysfs node size for the ring buffer dynamic
uio_hv_generic: Fix sysfs creation path for ring buffer
iio: adis16201: Correct inclinometer channel resolution
iio: adc: ad7606: fix serial register access
iio: pressure: mprls0025pa: use aligned_s64 for timestamp
iio: imu: adis16550: align buffers for timestamp
staging: iio: adc: ad7816: Correct conditional logic for store mode
iio: adc: ad7266: Fix potential timestamp alignment issue.
iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
iio: adc: dln2: Use aligned_s64 for timestamp
iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64
iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer.
iio: chemical: pms7003: use aligned_s64 for timestamp
iio: chemical: sps30: use aligned_s64 for timestamp
iio: imu: inv_mpu6050: align buffer for timestamp
iio: imu: st_lsm6dsx: Fix wakeup source leaks on device unbind
iio: adc: qcom-spmi-iadc: Fix wakeup source leaks on device unbind
iio: accel: fxls8962af: Fix wakeup source leaks on device unbind
iio: adc: ad7380: fix event threshold shift
iio: hid-sensor-prox: Fix incorrect OFFSET calculation
...
|
|
ITS mitigation moves the unsafe indirect branches to a safe thunk. This
could degrade the prediction accuracy as the source address of indirect
branches becomes same for different execution paths.
To improve the predictions, and hence the performance, assign a separate
thunk for each indirect callsite. This is also a defense-in-depth measure
to avoid indirect branches aliasing with each other.
As an example, 5000 dynamic thunks would utilize around 16 bits of the
address space, thereby gaining entropy. For a BTB that uses
32 bits for indexing, dynamic thunks could provide better prediction
accuracy over fixed thunks.
Have ITS thunks be variable sized and use EXECMEM_MODULE_TEXT such that
they are both more flexible (got to extend them later) and live in 2M TLBs,
just like kernel code, avoiding undue TLB pressure.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
|
|
Early kernel memory is RWX, only at the end of early boot (before SMP)
do we mark things ROX. Have execmem_cache mirror this behaviour for
early users.
This avoids having to remember what code is execmem and what is not --
we can poke everything with impunity ;-) Also performance for not
having to do endless text_poke_mm switches.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
|
|
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with
eIBRS. It affects prediction of indirect branch and RETs in the
lower half of cacheline. Due to ITS such branches may get wrongly predicted
to a target of (direct or indirect) branch that is located in the upper
half of the cacheline.
Scope of impact
===============
Guest/host isolation
--------------------
When eIBRS is used for guest/host isolation, the indirect branches in the
VMM may still be predicted with targets corresponding to branches in the
guest.
Intra-mode
----------
cBPF or other native gadgets can be used for intra-mode training and
disclosure using ITS.
User/kernel isolation
---------------------
When eIBRS is enabled user/kernel isolation is not impacted.
Indirect Branch Prediction Barrier (IBPB)
-----------------------------------------
After an IBPB, indirect branches may be predicted with targets
corresponding to direct branches which were executed prior to IBPB. This is
mitigated by a microcode update.
Add cmdline parameter indirect_target_selection=off|on|force to control the
mitigation to relocate the affected branches to an ITS-safe thunk i.e.
located in the upper half of cacheline. Also add the sysfs reporting.
When retpoline mitigation is deployed, ITS safe-thunks are not needed,
because retpoline sequence is already ITS-safe. Similarly, when call depth
tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return
thunk is not used, as CDT prevents RSB-underflow.
To not overcomplicate things, ITS mitigation is not supported with
spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy
lfence;jmp mitigation on ITS affected parts anyways.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
|