summaryrefslogtreecommitdiff
path: root/tools/testing
AgeCommit message (Collapse)Author
2025-12-19Merge tag 'net-6.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and CAN. Current release - regressions: - netfilter: nf_conncount: fix leaked ct in error paths - sched: act_mirred: fix loop detection - sctp: fix potential deadlock in sctp_clone_sock() - can: fix build dependency - eth: mlx5e: do not update BQL of old txqs during channel reconfiguration Previous releases - regressions: - sched: ets: always remove class from active list before deleting it - inet: frags: flush pending skbs in fqdir_pre_exit() - netfilter: nf_nat: remove bogus direction check - mptcp: - schedule rtx timer only after pushing data - avoid deadlock on fallback while reinjecting - can: gs_usb: fix error handling - eth: - mlx5e: - avoid unregistering PSP twice - fix double unregister of HCA_PORTS component - bnxt_en: fix XDP_TX path - mlxsw: fix use-after-free when updating multicast route stats Previous releases - always broken: - ethtool: avoid overflowing userspace buffer on stats query - openvswitch: fix middle attribute validation in push_nsh() action - eth: - mlx5: fw_tracer, validate format string parameters - mlxsw: spectrum_router: fix neighbour use-after-free - ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2() Misc: - Jozsef Kadlecsik retires from maintaining netfilter - tools: ynl: fix build on systems with old kernel headers" * tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) net: hns3: add VLAN id validation before using net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx net: hns3: using the num_tqps in the vf driver to apply for resources net: enetc: do not transmit redirected XDP frames when the link is down selftests/tc-testing: Test case exercising potential mirred redirect deadlock net/sched: act_mirred: fix loop detection sctp: Clear inet_opt in sctp_v6_copy_ip_options(). sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock(). net/handshake: duplicate handshake cancellations leak socket net/mlx5e: Don't include PSP in the hard MTU calculations net/mlx5e: Do not update BQL of old txqs during channel reconfiguration net/mlx5e: Trigger neighbor resolution for unresolved destinations net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init net/mlx5: Serialize firmware reset with devlink net/mlx5: fw_tracer, Handle escaped percent properly net/mlx5: fw_tracer, Validate format string parameters net/mlx5: Drain firmware reset in shutdown callback net/mlx5: fw reset, clear reset requested on drain_fw_reset net: dsa: mxl-gsw1xx: manually clear RANEG bit net: dsa: mxl-gsw1xx: fix .shutdown driver operation ...
2025-12-18Merge tag 'kvm-x86-fixes-6.19-rc1' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM fixes for 6.19-rc1 - Add a missing "break" to fix param parsing in the rseq selftest. - Apply runtime updates to the _current_ CPUID when userspace is setting CPUID, e.g. as part of vCPU hotplug, to fix a false positive and to avoid dropping the pending update. - Disallow toggling KVM_MEM_GUEST_MEMFD on an existing memslot, as it's not supported by KVM and leads to a use-after-free due to KVM failing to unbind the memslot from the previously-associated guest_memfd instance. - Harden against similar KVM_MEM_GUEST_MEMFD goofs, and prepare for supporting flags-only changes on KVM_MEM_GUEST_MEMFD memlslots, e.g. for dirty logging. - Set exit_code[63:32] to -1 (all 0xffs) when synthesizing a nested SVM_EXIT_ERR (a.k.a. VMEXIT_INVALID) #VMEXIT, as VMEXIT_INVALID is defined as -1ull (a 64-bit value). - Update SVI when activating APICv to fix a bug where a post-activation EOI for an in-service IRQ would effective be lost due to SVI being stale. - Immediately refresh APICv controls (if necessary) on a nested VM-Exit instead of deferring the update via KVM_REQ_APICV_UPDATE, as the request is effectively ignored because KVM thinks the vCPU already has the correct APICv settings.
2025-12-18selftests/tc-testing: Test case exercising potential mirred redirect deadlockVictor Nogueira
Add a test case that reproduces deadlock scenario where the user has a drr qdisc attached to root and has a mirred action that redirects to self on egress Signed-off-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251210162255.1057663-2-jhs@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1Alexei Starovoitov
Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-17Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Fix BPF builds due to -fms-extensions. selftests (Alexei Starovoitov), bpftool (Quentin Monnet). - Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n (Geert Uytterhoeven) - Fix livepatch/BPF interaction and support reliable unwinding through BPF stack frames (Josh Poimboeuf) - Do not audit capability check in arm64 JIT (Ondrej Mosnacek) - Fix truncated dmabuf BPF iterator reads (T.J. Mercier) - Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu) - Fix warnings in libbpf when built with -Wdiscarded-qualifiers under C23 (Mikhail Gavrilov) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: add regression test for bpf_d_path() bpf: Fix verifier assumptions of bpf_d_path's output buffer selftests/bpf: Add test for truncated dmabuf_iter reads bpf: Fix truncated dmabuf iterator reads x86/unwind/orc: Support reliable unwinding through BPF stack frames bpf: Add bpf_has_frame_pointer() bpf, arm64: Do not audit capability check in do_jit() libbpf: Fix -Wdiscarded-qualifiers under C23 bpftool: Fix build warnings due to MS extensions net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT selftests/bpf: Add -fms-extensions to bpf build flags
2025-12-16selftests/bpf: Add tests for the arena offset of globalsEmil Tsalapatis
Add tests for the new libbpf globals arena offset logic. The tests cover the case of globals being as large as the arena itself, and being smaller than the arena. In that case, the data is placed at the end of the arena, and the beginning of the arena is free. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-6-emil@etsalapatis.com
2025-12-16libbpf: Move arena globals to the end of the arenaEmil Tsalapatis
Arena globals are currently placed at the beginning of the arena by libbpf. This is convenient, but prevents users from reserving guard pages in the beginning of the arena to identify NULL pointer dereferences. Adjust the load logic to place the globals at the end of the arena instead. Also modify bpftool to set the arena pointer in the program's BPF skeleton to point to the globals. Users now call bpf_map__initial_value() to find the beginning of the arena mapping and use the arena pointer in the skeleton to determine which part of the mapping holds the arena globals and which part is free. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-5-emil@etsalapatis.com
2025-12-16bpf/verifier: Do not limit maximum direct offset into arena mapEmil Tsalapatis
The verifier currently limits direct offsets into a map to 512MiB to avoid overflow during pointer arithmetic. However, this prevents arena maps from using direct addressing instructions to access data at the end of > 512MiB arena maps. This is necessary when moving arena globals to the end of the arena instead of the front. Refactor the verifier code to remove the offset calculation during direct value access calculations. This is possible because the only two map types that implement .map_direct_value_addr() are arrays and arenas, and they both do their own internal checks to ensure the offset is within bounds. Adjust selftests that expect the old error. These tests still fail because the verifier identifies the access as out of bounds for the map, so change them to expect an "invalid access to map value pointer" error instead. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-3-emil@etsalapatis.com
2025-12-16selftests/bpf: Explicitly account for globals in verifier_arena_largeEmil Tsalapatis
The big_alloc1 test in verifier_arena_large assumes that the arena base and the first page allocated by bpf_arena_alloc_pages are identical. This is not the case, because the first page in the arena is populated by global arena data. The test still passes because the code makes the tacit assumption that the first page is on offset PAGE_SIZE instead of 0. Make this distinction explicit in the code, and adjust the page offsets requested during the test to count from the beginning of the arena instead of using the address of the first allocated page. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-2-emil@etsalapatis.com
2025-12-16Merge tag 'sched_ext-for-6.19-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix memory leak when destroying helper kthread workers during scheduler disable - Fix bypass depth accounting on scx_enable() failure which could leave the system permanently in bypass mode - Fix missing preemption handling when moving tasks to local DSQs via scx_bpf_dsq_move() - Misc fixes including NULL check for put_prev_task(), flushing stdout in selftests, and removing unused code * tag 'sched_ext-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Remove unused code in the do_pick_task_scx() selftests/sched_ext: flush stdout before test to avoid log spam sched_ext: Fix missing post-enqueue handling in move_local_task_to_local_dsq() sched_ext: Factor out local_dsq_post_enq() from dispatch_enqueue() sched_ext: Fix bypass depth leak on scx_enable() failure sched/ext: Avoid null ptr traversal when ->put_prev_task() is called with NULL next sched_ext: Fix the memleak for sch->helper objects
2025-12-15iommufd/selftest: Make it clearer to gcc that the access is not out of boundsJason Gunthorpe
GCC gets a bit confused and reports: In function '_test_cmd_get_hw_info', inlined from 'iommufd_ioas_get_hw_info' at iommufd.c:779:3, inlined from 'wrapper_iommufd_ioas_get_hw_info' at iommufd.c:752:1: >> iommufd_utils.h:804:37: warning: array subscript 'struct iommu_test_hw_info[0]' is partly outside array bounds of 'struct iommu_test_hw_info_buffer_smaller[1]' [-Warray-bounds=] 804 | assert(!info->flags); | ~~~~^~~~~~~ iommufd.c: In function 'wrapper_iommufd_ioas_get_hw_info': iommufd.c:761:11: note: object 'buffer_smaller' of size 4 761 | } buffer_smaller; | ^~~~~~~~~~~~~~ While it is true that "struct iommu_test_hw_info[0]" is partly out of bounds of the input pointer, it is not true that info->flags is out of bounds. Unclear why it warns on this. Reuse an existing properly sized stack buffer and pass a truncated length instead to test the same thing. Fixes: af4fde93c319 ("iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl") Link: https://patch.msgid.link/r/0-v1-63a2cffb09da+4486-iommufd_gcc_bounds_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512032344.kaAcKFIM-lkp@intel.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-12-15selftests: netfilter: packetdrill: avoid failure on HZ=100 kernelFlorian Westphal
packetdrill --ip_version=ipv4 --mtu=1500 --tolerance_usecs=1000000 --non_fatal packet conntrack_syn_challenge_ack.pkt conntrack v1.4.8 (conntrack-tools): 1 flow entries have been shown. conntrack_syn_challenge_ack.pkt:32: error executing `conntrack -f $NFCT_IP_VERSION \ -L -p tcp --dport 8080 | grep UNREPLIED | grep -q SYN_SENT` command: non-zero status 1 Affected kernel had CONFIG_HZ=100; reset packet was still sitting in backlog. Reported-by: Yi Chen <yiche@redhat.com> Fixes: a8a388c2aae4 ("selftests: netfilter: add packetdrill based conntrack tests") Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-15selftests: statmount: tests for STATMOUNT_BY_FDBhavik Sachdev
Add tests for STATMOUNT_BY_FD flag, which adds support for passing a file descriptors to statmount(). The fd can also be on a "unmounted" mount (mount unmounted with MNT_DETACH), we also include tests for that. Co-developed-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com> Link: https://patch.msgid.link/20251129091455.757724-4-b.sachdev1904@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15lkdtm/bugs: Add cases for BUG and PANIC occurring in hardirq contextArd Biesheuvel
Add lkdtm cases to trigger a BUG() or panic() from hardirq context. This is useful for testing pstore behavior being invoked from such contexts. Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-12-12selftests: ublk: add user copy test casesCaleb Sander Mateos
The ublk selftests cover every data copy mode except user copy. Add tests for user copy based on the existing test suite: - generic_14 ("basic recover function verification (user copy)") based on generic_04 and generic_05 - null_03 ("basic IO test with user copy") based on null_01 and null_02 - loop_06 ("write and verify over user copy") based on loop_01 and loop_03 - loop_07 ("mkfs & mount & umount with user copy") based on loop_02 and loop_04 - stripe_05 ("write and verify test on user copy") based on stripe_03 - stripe_06 ("mkfs & mount & umount on user copy") based on stripe_02 and stripe_04 - stress_06 ("run IO and remove device (user copy)") based on stress_01 and stress_03 - stress_07 ("run IO and kill ublk server (user copy)") based on stress_02 and stress_04 Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: add support for user copy to kublkCaleb Sander Mateos
The ublk selftests mock ublk server kublk supports every data copy mode except user copy. Add support for user copy to kublk, enabled via the --user_copy (-u) command line argument. On writes, issue pread() calls to copy the write data into the ublk_io's buffer before dispatching the write to the target implementation. On reads, issue pwrite() calls to copy read data from the ublk_io's buffer before committing the request. Copy in 2 KB chunks to provide some coverage of the offseting logic. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: forbid multiple data copy modesCaleb Sander Mateos
The kublk mock ublk server allows multiple data copy mode arguments to be passed on the command line (--zero_copy, --get_data, and --auto_zc). The ublk device will be created with all the requested feature flags, however kublk will only use one of the modes to interact with request data (arbitrarily preferring auto_zc over zero_copy over get_data). To clarify the intent of the test, don't allow multiple data copy modes to be specified. --zero_copy and --auto_zc are allowed together for --auto_zc_fallback, which uses both copy modes. Don't set UBLK_F_USER_COPY for zero_copy, as it's a separate feature. Fix the test cases in test_stress_05 passing --get_data along with --zero_copy or --auto_zc. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: don't share backing files between ublk serversCaleb Sander Mateos
stress_04 is missing a wait between blocks of tests, meaning multiple ublk servers will be running in parallel using the same backing files. Add a wait after each section to ensure each backing file is in use by a single ublk server at a time. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04Caleb Sander Mateos
stress_04 is described as "run IO and kill ublk server(zero copy)" but the --per_io_tasks tests cases don't use zero copy. Plus, one of the test cases is duplicated. Add --auto_zc to these test cases and --auto_zc_fallback to one of the duplicated ones. This matches the test cases in stress_03. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: fix fio arguments in run_io_and_recover()Caleb Sander Mateos
run_io_and_recover() invokes fio with --size="${size}", but the variable size doesn't exist. Thus, the argument expands to --size=, which causes fio to exit immediately with an error without issuing any I/O. Pass the value for size as the first argument to the function. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: remove unused ios map in seq_io.btCaleb Sander Mateos
The ios map populated by seq_io.bt is never read, so remove it. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: correct last_rw map type in seq_io.btCaleb Sander Mateos
The last_rw map is initialized with a value of 0 but later assigned the value args.sector + args.nr_sector, which has type sector_t = u64. bpftrace complains about the type mismatch between int64 and uint64: trace/seq_io.bt:18:3-59: ERROR: Type mismatch for @last_rw: trying to assign value of type 'uint64' when map already contains a value of type 'int64' @last_rw[$dev, str($2)] = (args.sector + args.nr_sector); Cast the initial value to uint64 so bpftrace will load the program. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()Ming Lei
The functions ublk_queue_use_zc(), ublk_queue_use_auto_zc(), and ublk_queue_auto_zc_fallback() were returning int, but performing bitwise AND on q->flags which is __u64. When a flag bit is set in the upper 32 bits (beyond INT_MAX), the result of the bitwise AND operation could overflow when cast to int, leading to incorrect boolean evaluation. For example, if UBLKS_Q_AUTO_BUF_REG_FALLBACK is 0x8000000000000000: - (u64)flags & 0x8000000000000000 = 0x8000000000000000 - Cast to int: undefined behavior / incorrect value - Used in if(): may evaluate incorrectly Fix by: 1. Changing return type from int to bool for semantic correctness 2. Using !! to explicitly convert to boolean (0 or 1) This ensures the functions return proper boolean values regardless of which bit position the flags occupy in the 64-bit field. Fixes: c3a6d48f86da ("selftests: ublk: remove ublk queue self-defined flags") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-12-12selftests/sched_ext: flush stdout before test to avoid log spamEmil Tsalapatis
The sched_ext selftests runner runs each test in the same process, with each test possibly forking multiple times. When the main runner has not flushed its stdout, the children inherit the buffered output for previous tests and emit it during exit. This causes log spam. Make sure stdout/stderr is fully flushed before each test. Cc: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-11netfilter: nf_nat: remove bogus direction checkFlorian Westphal
Jakub reports spurious failures of the 'conntrack_reverse_clash.sh' selftest. A bogus test makes nat core resort to port rewrite even though there is no need for this. When the test is made, nf_nat_used_tuple() would already have caused us to return if no other CPU had added a colliding entry. Moreover, nf_nat_used_tuple() would have ignored the colliding entry if their origin tuples had been the same. All that is left to check is if the colliding entry in the hash table is subject to NAT, and, if its not, if our entry matches in the reverse direction, e.g. hash table has addr1:1234 -> addr2:80, and we want to commit addr2:80 -> addr1:1234. Because we already checked that neither the new nor the committed entry is subject to NAT we only have to check origin vs. reply tuple: for non-nat entries, the reply tuple is always the inverted original. Just in case there are more problems extend the error reporting in the selftest while at it and dump conntrack table/stats on error. Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251206175135.4a56591b@kernel.org/ Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash") Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-11selftests/tc-testing: Create tests to exercise ets classes active list ↵Victor Nogueira
misplacements Add a test case for a bug fixed by Jamal [1] and for scenario where an ets drr class is inserted into the active list twice. - Try to delete ets drr class' qdisc while still keeping it in the active list - Try to add ets class to the active list twice [1] https://lore.kernel.org/netdev/20251128151919.576920-1-jhs@mojatatu.com/ Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251208190125.1868423-2-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11Merge tag 'nf-25-12-10' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter: updates for net 1) Fix refcount leaks in nf_conncount, from Fernando Fernandez Mancera. This addresses a recent regression that came in the last -next pull request. 2) Fix a null dereference in route error handling in IPVS, from Slavin Liu. This is an ancient issue dating back to 5.1 days. 3) Always set ifindex in route tuple in the flowtable output path, from Lorenzo Bianconi. This bug came in with the recent output path refactoring. 4) Prefer 'exit $ksft_xfail' over 'exit $ksft_skip' when we fail to trigger a nat race condition to exercise the clash resolution path in selftest infra, $ksft_skip should be reserved for missing tooling, From myself. * tag 'nf-25-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: netfilter: prefer xfail in case race wasn't triggered netfilter: always set route tuple out ifindex ipvs: fix ipv4 null-ptr-deref in route error path netfilter: nf_conncount: fix leaked ct in error paths ==================== Link: https://patch.msgid.link/20251210110754.22620-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11selftests: forwarding: vxlan_bridge_1q_mc_ul: Drop useless sleepingPetr Machata
After fixing traffic matching in the previous patch, the test does not need to use the sleep anymore. So drop vx_wait() altogether, migrate all callers of vx{10,20}_create_wait() to the corresponding _create(), and drop the now unused _create_wait() helpers. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/eabfe4fa12ae788cf3b8c5c876a989de81dfc3d3.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11selftests: forwarding: vxlan_bridge_1q_mc_ul: Fix flakinessPetr Machata
This test runs an overlay traffic, forwarded over a multicast-routed VXLAN underlay. In order to determine whether packets reach their intended destination, it uses a TC match. For convenience, it uses a flower match, which however does not allow matching on the encapsulated packet. So various service traffic ends up being indistinguishable from the test packets, and ends up confusing the test. To alleviate the problem, the test uses sleep to allow the necessary service traffic to run and clear the channel, before running the test traffic. This worked for a while, but lately we have nevertheless seen flakiness of the test in the CI. Fix the issue by using u32 to match the encapsulated packet as well. The confusing packets seem to always be IPv6 multicast listener reports. Realistically they could be ARP or other ICMP6 traffic as well. Therefore look for ethertype IPv4 in the IPv4 traffic test, and for IPv6 / UDP combination in the IPv6 traffic test. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/6438cb1613a2a667d3ff64089eb5994778f247af.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-11selftests: net: lib: tc_rule_stats_get(): Don't hard-code array indexPetr Machata
Flower is commonly used to match on packets in many bash-based selftests. A dump of a flower filter including statistics looks something like this: [ { "protocol": "all", "pref": 49152, "kind": "flower", "chain": 0 }, { ... "options": { ... "actions": [ { ... "stats": { "bytes": 0, "packets": 0, "drops": 0, "overlimits": 0, "requeues": 0, "backlog": 0, "qlen": 0 } } ] } } ] The JQ query in the helper function tc_rule_stats_get() assumes this form and looks for the second element of the array. However, a dump of a u32 filter looks like this: [ { "protocol": "all", "pref": 49151, "kind": "u32", "chain": 0 }, { "protocol": "all", "pref": 49151, "kind": "u32", "chain": 0, "options": { "fh": "800:", "ht_divisor": 1 } }, { ... "options": { ... "actions": [ { ... "stats": { "bytes": 0, "packets": 0, "drops": 0, "overlimits": 0, "requeues": 0, "backlog": 0, "qlen": 0 } } ] } }, ] There's an extra element which the JQ query ends up choosing. Instead of hard-coding a particular index, look for the entry on which a selector .options.actions yields anything. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/12982a44471c834511a0ee6c1e8f57e3a5307105.1765289566.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10selftests: netfilter: prefer xfail in case race wasn't triggeredFlorian Westphal
Jakub says: "We try to reserve SKIP for tests skipped because tool is missing in env, something isn't built into the kernel etc." use xfail, we can't force the race condition to appear at will so its expected that the test 'fails' occasionally. Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251206175647.5c32f419@kernel.org/ Signed-off-by: Florian Westphal <fw@strlen.de>
2025-12-10selftests/bpf: add regression test for bpf_d_path()Shuran Liu
Add a regression test for bpf_d_path() to cover incorrect verifier assumptions caused by an incorrect function prototype. The test attaches to the fallocate hook, calls bpf_d_path() and verifies that a simple prefix comparison on the returned pathname behaves correctly after the fix in patch 1. It ensures the verifier does not assume the buffer remains unwritten. Co-developed-by: Zesen Liu <ftyg@live.com> Signed-off-by: Zesen Liu <ftyg@live.com> Co-developed-by: Peili Gao <gplhust955@gmail.com> Signed-off-by: Peili Gao <gplhust955@gmail.com> Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Shuran Liu <electronlsr@gmail.com> Link: https://lore.kernel.org/r/20251206141210.3148-3-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-10selftests: net: tfo: Fix build warningGuenter Roeck
Fix tfo.c: In function ‘run_server’: tfo.c:84:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ by evaluating the return value from read() and displaying an error message if it reports an error. Fixes: c65b5bb2329e3 ("selftests: net: add passive TFO test binary") Cc: David Wei <dw@davidwei.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20251205171010.515236-14-linux@roeck-us.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10selftests: net: Fix build warningsGuenter Roeck
Fix ksft.h: In function ‘ksft_ready’: ksft.h:27:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ ksft.h: In function ‘ksft_wait’: ksft.h:51:9: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ by checking the return value of the affected functions and displaying an error message if an error is seen. Fixes: 2b6d490b82668 ("selftests: drv-net: Factor out ksft C helpers") Cc: Joe Damato <jdamato@fastly.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20251205171010.515236-11-linux@roeck-us.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10selftest: af_unix: Support compilers without flex-array-member-not-at-end ↵Guenter Roeck
support Fix: gcc: error: unrecognized command-line option ‘-Wflex-array-member-not-at-end’ by making the compiler option dependent on its support. Fixes: 1838731f1072c ("selftest: af_unix: Add -Wall and -Wflex-array-member-not-at-end to CFLAGS.") Cc: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20251205171010.515236-7-linux@roeck-us.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10selftests: tls: fix warning of uninitialized variableAnkit Khushwaha
In 'poll_partial_rec_async' a uninitialized char variable 'token' with is used for write/read instruction to synchronize between threads via a pipe. tls.c:2833:26: warning: variable 'token' is uninitialized when passed as a const pointer argument Initialize 'token' to '\0' to silence compiler warning. Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Link: https://patch.msgid.link/20251205163242.14615-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-10Merge tag 'locking-futex-2025-12-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex updates from Ingo Molnar: - Standardize on ktime_t in restart_block::time as well (Thomas Weißschuh) - Futex selftests: - Add robust list testcases (André Almeida) - Formatting fixes/cleanups (Carlos Llamas) * tag 'locking-futex-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Store time as ktime_t in restart block selftests/futex: Create test for robust list selftests/futex: Skip tests if shmget unsupported selftests/futex: Add newline to ksft_exit_fail_msg() selftests/futex: Remove unused test_futex_mpol()
2025-12-10selftests/bpf: add verifier sign extension bound computation tests.Cupertino Miranda
This commit adds 3 tests to verify a common compiler generated pattern for sign extension (r1 <<= 32; r1 s>>= 32). The tests make sure the register bounds are correctly computed both for positive and negative register values. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Cc: David Faust <david.faust@oracle.com> Cc: Jose Marchesi <jose.marchesi@oracle.com> Cc: Elena Zannoni <elena.zannoni@oracle.com> Link: https://lore.kernel.org/r/20251202180220.11128-3-cupertino.miranda@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-09selftests/bpf: add tests for attaching invalid fdKohei Enju
Add test cases for situations where adding the following types of file descriptors to a cpumap entry should fail: - Non-BPF file descriptor (expect -EINVAL) - Nonexistent file descriptor (expect -EBADF) Also tighten the assertion for the expected error when adding a non-BPF_XDP_CPUMAP program to a cpumap entry. Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20251208131449.73036-3-enjuk@amazon.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-09selftests/bpf: Add test for truncated dmabuf_iter readsT.J. Mercier
If many dmabufs are present, reads of the dmabuf iterator can be truncated at PAGE_SIZE or user buffer size boundaries before the fix in "bpf: Fix truncated dmabuf iterator reads". Add a test to confirm truncation does not occur. Signed-off-by: T.J. Mercier <tjmercier@google.com> Link: https://lore.kernel.org/r/20251204000348.1413593-2-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-08selftests: mptcp: pm: ensure unknown flags are ignoredMatthieu Baerts (NGI0)
This validates the previous commit: the userspace can set unknown flags -- the 7th bit is currently unused -- without errors, but only the supported ones are printed in the endpoints dumps. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251205-net-mptcp-misc-fixes-6-19-rc1-v1-2-9e4781a6c1b8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-08selftests/bpf: replace "__auto_type" with "auto"H. Peter Anvin
Replace instances of "__auto_type" with "auto" in: tools/testing/selftests/bpf/prog_tests/socket_helpers.h This file does not seem to be including <linux/compiler_types.h> directly or indirectly, so copy the definition but guard it with !defined(auto). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2025-12-06Merge tag 'tty-6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big set of tty/serial driver changes for 6.19-rc1. Nothing major at all, just small constant churn to make the tty layer "cleaner" as well as serial driver updates and even a new test added! Included in here are: - More tty/serial cleanups from Jiri - tty tiocsti test added to hopefully ensure we don't regress in this area again - sc16is7xx driver updates - imx serial driver updates - 8250 driver updates - new hardware device ids added - other minor serial/tty driver cleanups and tweaks All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (60 commits) serial: sh-sci: Fix deadlock during RSCI FIFO overrun error dt-bindings: serial: rsci: Drop "uart-has-rtscts: false" LoongArch: dts: Add uart new compatible string serial: 8250: Add Loongson uart driver support dt-bindings: serial: 8250: Add Loongson uart compatible serial: 8250: add driver for KEBA UART serial: Keep rs485 settings for devices without firmware node serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms serial: qcom-geni: Enable PM runtime for serial driver serial: sprd: Return -EPROBE_DEFER when uart clock is not ready tty: serial: samsung: Declare earlycon for Exynos850 serial: icom: Convert PCIBIOS_* return codes to errnos serial: 8250-of: Fix style issues in 8250_of.c serial: add support of CPCI cards serial: mux: Fix kernel doc for mux_poll() tty: replace use of system_unbound_wq with system_dfl_wq serial: 8250_platform: simplify IRQF_SHARED handling serial: 8250: make share_irqs local to 8250_platform serial: 8250: move skip_txen_test to core serial: drop SERIAL_8250_DEPRECATED_OPTIONS ...
2025-12-06Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...
2025-12-06Merge tag 'landlock-6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mainly fixes handling of disconnected directories and adds new tests" * tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add disconnected leafs and branch test suites selftests/landlock: Add tests for access through disconnected paths landlock: Improve variable scope landlock: Fix handling of disconnected directories selftests/landlock: Fix makefile header list landlock: Make docs in cred.h and domain.h visible landlock: Minor comments improvements
2025-12-06Merge tag 'libnvdimm-for-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull nvdimm updates from Ira Weiny: "These are mainly bug fixes and code updates. There is a new feature to divide up memmap= carve outs and a fix caught in linux-next for that patch. Managing memmap memory on the fly for multiple VM's was proving difficult and Mike provided a driver which allows for the memory to be better manged. Summary: - Allow exposing RAM carveouts as NVDIMM DIMM devices - Prevent integer overflow in ramdax_get_config_data() - Replace use of system_wq with system_percpu_wq - Documentation: btt: Unwrap bit 31-30 nested table - tools/testing/nvdimm: Use per-DIMM device handle" * tag 'libnvdimm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: Prevent integer overflow in ramdax_get_config_data() Documentation: btt: Unwrap bit 31-30 nested table nvdimm: replace use of system_wq with system_percpu_wq tools/testing/nvdimm: Use per-DIMM device handle nvdimm: allow exposing RAM carveouts as NVDIMM DIMM devices
2025-12-06Merge tag 'dma-mapping-6.19-2025-12-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: - More DMA mapping API refactoring to physical addresses as the primary interface instead of page+offset parameters. This time dma_map_ops callbacks are converted to physical addresses, what in turn results also in some simplification of architecture specific code (Leon Romanovsky and Jason Gunthorpe) - Clarify that dma_map_benchmark is not a kernel self-test, but standalone tool (Qinxin Xia) * tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-mapping: remove unused map_page callback xen: swiotlb: Convert mapping routine to rely on physical address x86: Use physical address for DMA mapping sparc: Use physical address DMA mapping powerpc: Convert to physical address DMA mapping parisc: Convert DMA map_page to map_phys interface MIPS/jazzdma: Provide physical address directly alpha: Convert mapping routine to rely on physical address dma-mapping: remove unused mapping resource callbacks xen: swiotlb: Switch to physical address mapping callbacks ARM: dma-mapping: Switch to physical address mapping callbacks ARM: dma-mapping: Reduce struct page exposure in arch_sync_dma*() dma-mapping: convert dummy ops to physical address mapping dma-mapping: prepare dma_map_ops to conversion to physical address tools/dma: move dma_map_benchmark from selftests to tools/dma
2025-12-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM - Minor fixes to KVM and selftests Loongarch: - Get VM PMU capability from HW GCFG register - Add AVEC basic support - Use 64-bit register definition for EIOINTC - Add KVM timer test cases for tools/selftests RISC/V: - SBI message passing (MPXY) support for KVM guest - Give a new, more specific error subcode for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores s390: - Always allocate ESCA (Extended System Control Area), instead of starting with the basic SCA and converting to ESCA with the addition of the 65th vCPU. The price is increased number of exits (and worse performance) on z10 and earlier processor; ESCA was introduced by z114/z196 in 2010 - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups x86: - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE caching is disabled, as there can't be any relevant SPTEs to zap - Relocate a misplaced export - Fix an async #PF bug where KVM would clear the completion queue when the guest transitioned in and out of paging mode, e.g. when handling an SMI and then returning to paged mode via RSM - Leave KVM's user-return notifier registered even when disabling virtualization, as long as kvm.ko is loaded. On reboot/shutdown, keeping the notifier registered is ok; the kernel does not use the MSRs and the callback will run cleanly and restore host MSRs if the CPU manages to return to userspace before the system goes down - Use the checked version of {get,put}_user() - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC timers can result in a hard lockup in the host - Revert the periodic kvmclock sync logic now that KVM doesn't use a clocksource that's subject to NTP corrections - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter behind CONFIG_CPU_MITIGATIONS - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path; the only reason they were handled in the fast path was to paper of a bug in the core #MC code, and that has long since been fixed - Add emulator support for AVX MOV instructions, to play nice with emulated devices whose guest drivers like to access PCI BARs with large multi-byte instructions x86 (AMD): - Fix a few missing "VMCB dirty" bugs - Fix the worst of KVM's lack of EFER.LMSLE emulation - Add AVIC support for addressing 4k vCPUs in x2AVIC mode - Fix incorrect handling of selective CR0 writes when checking intercepts during emulation of L2 instructions - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on VMRUN and #VMEXIT - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft interrupt if the guest patched the underlying code after the VM-Exit, e.g. when Linux patches code with a temporary INT3 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to userspace, and extend KVM "support" to all policy bits that don't require any actual support from KVM x86 (Intel): - Use the root role from kvm_mmu_page to construct EPTPs instead of the current vCPU state, partly as worthwhile cleanup, but mostly to pave the way for tracking per-root TLB flushes, and elide EPT flushes on pCPU migration if the root is clean from a previous flush - Add a few missing nested consistency checks - Rip out support for doing "early" consistency checks via hardware as the functionality hasn't been used in years and is no longer useful in general; replace it with an off-by-default module param to WARN if hardware fails a check that KVM does not perform - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32] on VM-Enter - Misc cleanups - Overhaul the TDX code to address systemic races where KVM (acting on behalf of userspace) could inadvertantly trigger lock contention in the TDX-Module; KVM was either working around these in weird, ugly ways, or was simply oblivious to them (though even Yan's devilish selftests could only break individual VMs, not the host kernel) - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU, if creating said vCPU failed partway through - Fix a few sparse warnings (bad annotation, 0 != NULL) - Use struct_size() to simplify copying TDX capabilities to userspace - Fix a bug where TDX would effectively corrupt user-return MSR values if the TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected Selftests: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM - Forcefully override ARCH from x86_64 to x86 to play nice with specifying ARCH=x86_64 on the command line - Extend a bunch of nested VMX to validate nested SVM as well - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT guest_memfd: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors - Misc cleanups Generic: - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for irqfd cleanup - Fix a goof in the dirty ring documentation - Fix choice of target for directed yield across different calls to kvm_vcpu_on_spin(); the function was always starting from the first vCPU instead of continuing the round-robin search" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits) KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions ...
2025-12-05Merge tag 'riscv-for-linus-6.19-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Enable parallel hotplug for RISC-V - Optimize vector regset allocation for ptrace() - Add a kernel selftest for the vector ptrace interface - Enable the userspace RAID6 test to build and run using RISC-V vectors - Add initial support for the Zalasr RISC-V ratified ISA extension - For the Zicbop RISC-V ratified ISA extension to userspace, expose hardware and kernel support to userspace and add a kselftest for Zicbop - Convert open-coded instances of 'asm goto's that are controlled by runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(), following arm64's alternative_has_cap_{un,}likely() - Remove an unnecessary mask in the GFP flags used in some calls to pagetable_alloc() * tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: selftests/riscv: Add Zicbop prefetch test riscv: hwprobe: Expose Zicbop extension and its block size riscv: Introduce Zalasr instructions riscv: hwprobe: Export Zalasr extension dt-bindings: riscv: Add Zalasr ISA extension description riscv: Add ISA extension parsing for Zalasr selftests: riscv: Add test for the Vector ptrace interface riscv: ptrace: Optimize the allocation of vector regset raid6: test: Add support for RISC-V raid6: riscv: Allow code to be compiled in userspace raid6: riscv: Prevent compiler from breaking inline vector assembly code riscv: cmpxchg: Use riscv_has_extension_likely riscv: bitops: Use riscv_has_extension_likely riscv: hweight: Use riscv_has_extension_likely riscv: checksum: Use riscv_has_extension_likely riscv: pgtable: Use riscv_has_extension_unlikely riscv: Remove __GFP_HIGHMEM masking RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
2025-12-05selftests/bpf: Test getting associated struct_ops in timer callbackAmery Hung
Make sure 1) a timer callback can also reference the associated struct_ops, and then make sure 2) the timer callback cannot get a dangled pointer to the struct_ops when the map is freed. The test schedules a timer callback from a struct_ops program since struct_ops programs do not pin the map. It is possible for the timer callback to run after the map is freed. The timer callback calls a kfunc that runs .test_1() of the associated struct_ops, which should return MAP_MAGIC when the map is still alive or -1 when the map is gone. The first subtest added in this patch schedules the timer callback to run immediately, while the map is still alive. The second subtest added schedules the callback to run 500ms after syscall_prog runs and then frees the map right after syscall_prog runs. Both subtests then wait until the callback runs to check the return of the kfunc. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-7-ameryhung@gmail.com