diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-18 20:50:32 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-18 20:50:32 -0800 |
| commit | eeccf287a2a517954b57cf9d733b3cf5d47afa34 (patch) | |
| tree | 46b5cd55d8da25cbc9aa96b38470506958851005 /kernel/cgroup | |
| parent | 956b9cbd7f156c8672dac94a00de3c6a0939c692 (diff) | |
| parent | ac1ea219590c09572ed5992dc233bbf7bb70fef9 (diff) | |
Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
couple of issues in the demotion code - pages were failed demotion
and were finding themselves demoted into disallowed nodes (Bing Jiao)
- "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
mapledtree race and performs a number of cleanups (Liam Howlett)
- "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
them" implements a lot of cleanups following on from the conversion
of the VMA flags into a bitmap (Lorenzo Stoakes)
- "support batch checking of references and unmapping for large folios"
implements batching to greatly improve the performance of reclaiming
clean file-backed large folios (Baolin Wang)
- "selftests/mm: add memory failure selftests" does as claimed (Miaohe
Lin)
* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
mm/page_alloc: clear page->private in free_pages_prepare()
selftests/mm: add memory failure dirty pagecache test
selftests/mm: add memory failure clean pagecache test
selftests/mm: add memory failure anonymous page test
mm: rmap: support batched unmapping for file large folios
arm64: mm: implement the architecture-specific clear_flush_young_ptes()
arm64: mm: support batch clearing of the young flag for large folios
arm64: mm: factor out the address and ptep alignment into a new helper
mm: rmap: support batched checks of the references for large folios
tools/testing/vma: add VMA userland tests for VMA flag functions
tools/testing/vma: separate out vma_internal.h into logical headers
tools/testing/vma: separate VMA userland tests into separate files
mm: make vm_area_desc utilise vma_flags_t only
mm: update all remaining mmap_prepare users to use vma_flags_t
mm: update shmem_[kernel]_file_*() functions to use vma_flags_t
mm: update secretmem to use VMA flags on mmap_prepare
mm: update hugetlbfs to use VMA flags on mmap_prepare
mm: add basic VMA flag operation helper functions
tools: bitmap: add missing bitmap_[subset(), andnot()]
mm: add mk_vma_flags() bitmap flag macro helper
...
Diffstat (limited to 'kernel/cgroup')
| -rw-r--r-- | kernel/cgroup/cpuset.c | 54 |
1 files changed, 36 insertions, 18 deletions
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 832179236529..7607dfe516e6 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -4145,40 +4145,58 @@ bool cpuset_current_node_allowed(int node, gfp_t gfp_mask) return allowed; } -bool cpuset_node_allowed(struct cgroup *cgroup, int nid) +/** + * cpuset_nodes_allowed - return effective_mems mask from a cgroup cpuset. + * @cgroup: pointer to struct cgroup. + * @mask: pointer to struct nodemask_t to be returned. + * + * Returns effective_mems mask from a cgroup cpuset if it is cgroup v2 and + * has cpuset subsys. Otherwise, returns node_states[N_MEMORY]. + * + * This function intentionally avoids taking the cpuset_mutex or callback_lock + * when accessing effective_mems. This is because the obtained effective_mems + * is stale immediately after the query anyway (e.g., effective_mems is updated + * immediately after releasing the lock but before returning). + * + * As a result, returned @mask may be empty because cs->effective_mems can be + * rebound during this call. Besides, nodes in @mask are not guaranteed to be + * online due to hot plugins. Callers should check the mask for validity on + * return based on its subsequent use. + **/ +void cpuset_nodes_allowed(struct cgroup *cgroup, nodemask_t *mask) { struct cgroup_subsys_state *css; struct cpuset *cs; - bool allowed; /* * In v1, mem_cgroup and cpuset are unlikely in the same hierarchy * and mems_allowed is likely to be empty even if we could get to it, - * so return true to avoid taking a global lock on the empty check. + * so return directly to avoid taking a global lock on the empty check. */ - if (!cpuset_v2()) - return true; + if (!cgroup || !cpuset_v2()) { + nodes_copy(*mask, node_states[N_MEMORY]); + return; + } css = cgroup_get_e_css(cgroup, &cpuset_cgrp_subsys); - if (!css) - return true; + if (!css) { + nodes_copy(*mask, node_states[N_MEMORY]); + return; + } /* - * Normally, accessing effective_mems would require the cpuset_mutex - * or callback_lock - but node_isset is atomic and the reference - * taken via cgroup_get_e_css is sufficient to protect css. - * - * Since this interface is intended for use by migration paths, we - * relax locking here to avoid taking global locks - while accepting - * there may be rare scenarios where the result may be innaccurate. + * The reference taken via cgroup_get_e_css is sufficient to + * protect css, but it does not imply safe accesses to effective_mems. * - * Reclaim and migration are subject to these same race conditions, and - * cannot make strong isolation guarantees, so this is acceptable. + * Normally, accessing effective_mems would require the cpuset_mutex + * or callback_lock - but the correctness of this information is stale + * immediately after the query anyway. We do not acquire the lock + * during this process to save lock contention in exchange for racing + * against mems_allowed rebinds. */ cs = container_of(css, struct cpuset, css); - allowed = node_isset(nid, cs->effective_mems); + nodes_copy(*mask, cs->effective_mems); css_put(css); - return allowed; } /** |
