From b985194c8c0a130ed155b71662e39f7eaea4876f Mon Sep 17 00:00:00 2001 From: Chen Yucong Date: Thu, 22 May 2014 11:54:15 -0700 Subject: hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage For handling a free hugepage in memory failure, the race will happen if another thread hwpoisoned this hugepage concurrently. So we need to check PageHWPoison instead of !PageHWPoison. If hwpoison_filter(p) returns true or a race happens, then we need to unlock_page(hpage). Signed-off-by: Chen Yucong Reviewed-by: Naoya Horiguchi Tested-by: Naoya Horiguchi Reviewed-by: Andi Kleen Cc: [2.6.36+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/memory-failure.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) (limited to 'mm/memory-failure.c') diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 35ef28acf137..dbf8922216ad 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1081,15 +1081,16 @@ int memory_failure(unsigned long pfn, int trapno, int flags) return 0; } else if (PageHuge(hpage)) { /* - * Check "just unpoisoned", "filter hit", and - * "race with other subpage." + * Check "filter hit" and "race with other subpage." */ lock_page(hpage); - if (!PageHWPoison(hpage) - || (hwpoison_filter(p) && TestClearPageHWPoison(p)) - || (p != hpage && TestSetPageHWPoison(hpage))) { - atomic_long_sub(nr_pages, &num_poisoned_pages); - return 0; + if (PageHWPoison(hpage)) { + if ((hwpoison_filter(p) && TestClearPageHWPoison(p)) + || (p != hpage && TestSetPageHWPoison(hpage))) { + atomic_long_sub(nr_pages, &num_poisoned_pages); + unlock_page(hpage); + return 0; + } } set_page_hwpoison_huge_page(hpage); res = dequeue_hwpoisoned_huge_page(hpage); -- cgit v1.2.3 From 3e030ecc0fc7de10fd0da10c1c19939872a31717 Mon Sep 17 00:00:00 2001 From: Naoya Horiguchi Date: Thu, 22 May 2014 11:54:21 -0700 Subject: mm/memory-failure.c: fix memory leak by race between poison and unpoison When a memory error happens on an in-use page or (free and in-use) hugepage, the victim page is isolated with its refcount set to one. When you try to unpoison it later, unpoison_memory() calls put_page() for it twice in order to bring the page back to free page pool (buddy or free hugepage list). However, if another memory error occurs on the page which we are unpoisoning, memory_failure() returns without releasing the refcount which was incremented in the same call at first, which results in memory leak and unconsistent num_poisoned_pages statistics. This patch fixes it. Signed-off-by: Naoya Horiguchi Cc: Andi Kleen Cc: [2.6.32+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/memory-failure.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'mm/memory-failure.c') diff --git a/mm/memory-failure.c b/mm/memory-failure.c index dbf8922216ad..9ccef39a9de2 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1153,6 +1153,8 @@ int memory_failure(unsigned long pfn, int trapno, int flags) */ if (!PageHWPoison(p)) { printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn); + atomic_long_sub(nr_pages, &num_poisoned_pages); + put_page(hpage); res = 0; goto out; } -- cgit v1.2.3