KVM: PPC: Book3S HV: Fix refcounting of hugepages

The H_REGISTER_VPA hcall implementation in HV Power KVM needs to pin some guest memory pages into host memory so that they can be safely accessed from usermode. It does this used get_user_pages_fast(). When the VPA is unregistered, or the VCPUs are cleaned up, these pages are released using put_page(). However, the get_user_pages() is invoked on the specific memory are of the VPA which could lie within hugepages. In case the pinned page is huge, we explicitly find the head page of the compound page before calling put_page() on it. At least with the latest kernel, this is not correct. put_page() already handles finding the correct head page of a compound, and also deals with various counts on the individual tail page which are important for transparent huge pages. We don't support transparent hugepages on Power, but even so, bypassing this count maintenance can lead (when the VM ends) to a hugepage being released back to the pool with a non-zero mapcount on one of the tail pages. This can then lead to a bad_page() when the page is released from the hugepage pool. This removes the explicit compound_head() call to correct this bug. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
author: David Gibson <david@gibson.dropbear.id.au> 2012-05-08 20:24:08 +1000
committer: Avi Kivity <avi@redhat.com> 2012-05-08 17:54:08 +0300
commit: de6c0b02d4d7bdf2587e679a6ddbb71b7d68bb89 (patch)
tree: d1b7cb647d6282052768c4a1e5a37fb88cd482f5 /arch
parent: 62c49cc976af84cb0ffcb5ec07ee88da1a94e222 (diff)
2 files changed, 13 insertions, 11 deletions
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index ddc485a529f2..c3beaeef3f60 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -258,6 +258,8 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
 			    !(memslot->userspace_addr & (s - 1))) {
 				start &= ~(s - 1);
 				pgsize = s;
+				get_page(hpage);
+				put_page(page);
 				page = hpage;
 			}
 		}
@@ -281,11 +283,8 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn,
 	err = 0;
 
  out:
-	if (got) {
-		if (PageHuge(page))
-			page = compound_head(page);
+	if (got)
 		put_page(page);
-	}
 	return err;
 
  up_err:
@@ -678,8 +677,15 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		SetPageDirty(page);
 
  out_put:
-	if (page)
-		put_page(page);
+	if (page) {
+		/*
+		 * We drop pages[0] here, not page because page might
+		 * have been set to the head page of a compound, but
+		 * we have to drop the reference on the correct tail
+		 * page to match the get inside gup()
+		 */
+		put_page(pages[0]);
+	}
 	return ret;
 
  out_unlock:
@@ -979,6 +985,7 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
 			pa = *physp;
 		}
 		page = pfn_to_page(pa >> PAGE_SHIFT);
+		get_page(page);
 	} else {
 		hva = gfn_to_hva_memslot(memslot, gfn);
 		npages = get_user_pages_fast(hva, 1, 1, pages);
@@ -991,8 +998,6 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
 		page = compound_head(page);
 		psize <<= compound_order(page);
 	}
-	if (!kvm->arch.using_mmu_notifiers)
-		get_page(page);
 	offset = gpa & (psize - 1);
 	if (nb_ret)
 		*nb_ret = psize - offset;
@@ -1003,7 +1008,6 @@ void kvmppc_unpin_guest_page(struct kvm *kvm, void *va)
 {
 	struct page *page = virt_to_page(va);
 
-	page = compound_head(page);
 	put_page(page);
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 01294a5099dd..108d1f580177 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1192,8 +1192,6 @@ static void unpin_slot(struct kvm *kvm, int slot_id)
 				continue;
 			pfn = physp[j] >> PAGE_SHIFT;
 			page = pfn_to_page(pfn);
-			if (PageHuge(page))
-				page = compound_head(page);
 			SetPageDirty(page);
 			put_page(page);
 		}
author	David Gibson <david@gibson.dropbear.id.au>	2012-05-08 20:24:08 +1000
committer	Avi Kivity <avi@redhat.com>	2012-05-08 17:54:08 +0300
commit	de6c0b02d4d7bdf2587e679a6ddbb71b7d68bb89 (patch)
tree	d1b7cb647d6282052768c4a1e5a37fb88cd482f5 /arch
parent	62c49cc976af84cb0ffcb5ec07ee88da1a94e222 (diff)