From 93ef83050e597634d2c7dc838a28caf5137b9404 Mon Sep 17 00:00:00 2001 From: "YoungJun.park" Date: Fri, 28 Oct 2022 07:42:41 -0700 Subject: kunit: alloc_string_stream_fragment error handling bug fix When it fails to allocate fragment, it does not free and return error. And check the pointer inappropriately. Fixed merge conflicts with commit 618887768bb7 ("kunit: update NULL vs IS_ERR() tests") Shuah Khan Signed-off-by: YoungJun.park Reviewed-by: David Gow Signed-off-by: Shuah Khan --- lib/kunit/string-stream.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'lib') diff --git a/lib/kunit/string-stream.c b/lib/kunit/string-stream.c index f5f51166d8c2..cc32743c1171 100644 --- a/lib/kunit/string-stream.c +++ b/lib/kunit/string-stream.c @@ -23,8 +23,10 @@ static struct string_stream_fragment *alloc_string_stream_fragment( return ERR_PTR(-ENOMEM); frag->fragment = kunit_kmalloc(test, len, gfp); - if (!frag->fragment) + if (!frag->fragment) { + kunit_kfree(test, frag); return ERR_PTR(-ENOMEM); + } return frag; } -- cgit v1.2.3 From e95d50d74b93a767a026f588e8de0b9718a0105e Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 5 Jan 2023 13:23:39 +0200 Subject: lib/scatterlist: Fix to merge contiguous pages into the last SG properly When sg_alloc_append_table_from_pages() calls to pages_are_mergeable() in its 'sgt_append->prv' flow to check whether it can merge contiguous pages into the last SG, it passes the page arguments in the wrong order. The first parameter should be the next candidate page to be merged to the last page and not the opposite. The current code leads to a corrupted SG which resulted in OOPs and unexpected errors when non-contiguous pages are merged wrongly. Fix to pass the page parameters in the right order. Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages") Link: https://lore.kernel.org/r/20230105112339.107969-1-yishaih@nvidia.com Signed-off-by: Yishai Hadas Reviewed-by: Jason Gunthorpe Reviewed-by: Logan Gunthorpe Signed-off-by: Jason Gunthorpe --- lib/scatterlist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'lib') diff --git a/lib/scatterlist.c b/lib/scatterlist.c index a0ad2a7959b5..f72aa50c6654 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -476,7 +476,7 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, /* Merge contiguous pages into the last SG */ prv_len = sgt_append->prv->length; last_pg = sg_page(sgt_append->prv); - while (n_pages && pages_are_mergeable(last_pg, pages[0])) { + while (n_pages && pages_are_mergeable(pages[0], last_pg)) { if (sgt_append->prv->length + PAGE_SIZE > max_segment) break; sgt_append->prv->length += PAGE_SIZE; -- cgit v1.2.3 From f5fe24ef17b5fbe6db49534163e77499fb10ae8c Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Fri, 13 Jan 2023 19:44:47 +0100 Subject: lockref: stop doing cpu_relax in the cmpxchg loop On the x86-64 architecture even a failing cmpxchg grants exclusive access to the cacheline, making it preferable to retry the failed op immediately instead of stalling with the pause instruction. To illustrate the impact, below are benchmark results obtained by running various will-it-scale tests on top of the 6.2-rc3 kernel and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU. All results in ops/s. Note there is some variance in re-runs, but the code is consistently faster when contention is present. open3 ("Same file open/close"): proc stock no-pause 1 805603 814942 (+%1) 2 1054980 1054781 (-0%) 8 1544802 1822858 (+18%) 24 1191064 2199665 (+84%) 48 851582 1469860 (+72%) 96 609481 1427170 (+134%) fstat2 ("Same file fstat"): proc stock no-pause 1 3013872 3047636 (+1%) 2 4284687 4400421 (+2%) 8 3257721 5530156 (+69%) 24 2239819 5466127 (+144%) 48 1701072 5256609 (+209%) 96 1269157 6649326 (+423%) Additionally, a kernel with a private patch to help access() scalability: access2 ("Same file access"): proc stock patched patched +nopause 24 2378041 2005501 5370335 (-15% / +125%) That is, fixing the problems in access itself *reduces* scalability after the cacheline ping-pong only happens in lockref with the pause instruction. Note that fstat and access benchmarks are not currently integrated into will-it-scale, but interested parties can find them in pull requests to said project. Code at hand has a rather tortured history. First modification showed up in commit d472d9d98b46 ("lockref: Relax in cmpxchg loop"), written with Itanium in mind. Later it got patched up to use an arch-dependent macro to stop doing it on s390 where it caused a significant regression. Said macro had undergone revisions and was ultimately eliminated later, going back to cpu_relax. While I intended to only remove cpu_relax for x86-64, I got the following comment from Linus: I would actually prefer just removing it entirely and see if somebody else hollers. You have the numbers to prove it hurts on real hardware, and I don't think we have any numbers to the contrary. So I think it's better to trust the numbers and remove it as a failure, than say "let's just remove it on x86-64 and leave everybody else with the potentially broken code" Additionally, Will Deacon (maintainer of the arm64 port, one of the architectures previously benchmarked): So, from the arm64 side of the fence, I'm perfectly happy just removing the cpu_relax() calls from lockref. As such, come back full circle in history and whack it altogether. Signed-off-by: Mateusz Guzik Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/ Acked-by: Tony Luck # ia64 Acked-by: Nicholas Piggin # powerpc Acked-by: Will Deacon # arm64 Acked-by: Peter Zijlstra Signed-off-by: Linus Torvalds --- lib/lockref.c | 1 - 1 file changed, 1 deletion(-) (limited to 'lib') diff --git a/lib/lockref.c b/lib/lockref.c index 45e93ece8ba0..2afe4c5d8919 100644 --- a/lib/lockref.c +++ b/lib/lockref.c @@ -23,7 +23,6 @@ } \ if (!--retry) \ break; \ - cpu_relax(); \ } \ } while (0) -- cgit v1.2.3