Age | Commit message (Collapse) | Author |
|
commit 462cdace790ac2ed6aad1b19c9c0af0143b6aab0 upstream.
The current test for bio vec merging is not fully accurate and can be
tricked into merging bios when certain grant combinations are used.
The result of these malicious bio merges is a bio that extends past
the memory page used by any of the originating bios.
Take into account the following scenario, where a guest creates two
grant references that point to the same mfn, ie: grant 1 -> mfn A,
grant 2 -> mfn A.
These references are then used in a PV block request, and mapped by
the backend domain, thus obtaining two different pfns that point to
the same mfn, pfn B -> mfn A, pfn C -> mfn A.
If those grants happen to be used in two consecutive sectors of a disk
IO operation becoming two different bios in the backend domain, the
checks in xen_biovec_phys_mergeable will succeed, because bfn1 == bfn2
(they both point to the same mfn). However due to the bio merging,
the backend domain will end up with a bio that expands past mfn A into
mfn A + 1.
Fix this by making sure the check in xen_biovec_phys_mergeable takes
into account the offset and the length of the bio, this basically
replicates whats done in __BIOVEC_PHYS_MERGEABLE using mfns (bus
addresses). While there also remove the usage of
__BIOVEC_PHYS_MERGEABLE, since that's already checked by the callers
of xen_biovec_phys_mergeable.
Reported-by: "Jan H. Schönherr" <jschoenh@amazon.de>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0b47a6bd1150f4846b1d61925a4cc5a96593a541 ]
Ensure all reserved fields of xatp are zero before making
hypervisor call to XEN in xen_map_device_mmio().
xenmem_add_to_physmap_one() in XEN fails the mapping request if
extra.res reserved field in xatp is not zero for XENMAPSPACE_dev_mmio
request.
Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9f4ab18ac51dc87345a9cbd2527e6acf7a0a9335 upstream.
scsiback_release_cmd() must not dereference se_cmd->se_tmr_req
because that memory is freed by target_free_cmd_mem() before
scsiback_release_cmd() is called. Fix this use-after-free by
inlining struct scsiback_tmr into struct vscsibk_pend.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f1225ee4c8fcf09afaa199b8b1f0450f38b8cd11 ]
In xen_swiotlb_map_page and xen_swiotlb_map_sg_attrs, if the original
page is not suitable, we swap it for another page from the swiotlb
pool.
In these cases, we don't update the previously calculated dma address
for the page before calling xen_dma_map_page. Thus, we end up calling
xen_dma_map_page passing the wrong dev_addr, resulting in
xen_dma_map_page mistakenly assuming that the page is foreign when it is
local.
Fix the bug by updating dev_addr appropriately.
This change has no effect on x86, because xen_dma_map_page is a stub
there.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Tested-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 753c09b5652bb4fe53e2db648002ec64b32b8827 upstream.
Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did
not go far enough to support 64KB in mmap_batch_fn.
The variable 'nr' is the number of 4KB chunk to map. However, when Linux
is using 64KB page granularity the array of pages (vma->vm_private_data)
contain one page per 64KB. Fix it by incrementing st->index correctly.
Furthermore, st->va is not correctly incremented as PAGE_SIZE !=
XEN_PAGE_SIZE.
Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity")
Reported-by: Feng Kan <fkan@apm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 84d582d236dc1f9085e741affc72e9ba061a67c2 upstream.
Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
established that commit 72a9b186292d ("xen: Remove event channel
notification through Xen PCI platform device") (and thus commit
da72ff5bfcb0 ("partially revert "xen: Remove event channel
notification through Xen PCI platform device"")) are unnecessary and,
in fact, prevent HVM guests from booting on Xen releases prior to 4.0
Therefore we revert both of those commits.
The summary of that discussion is below:
Here is the brief summary of the current situation:
Before the offending commit (72a9b186292):
1) INTx does not work because of the reset_watches path.
2) The reset_watches path is only taken if you have Xen > 4.0
3) The Linux Kernel by default will use vector inject if the hypervisor
support. So even INTx does not work no body running the kernel with
Xen > 4.0 would notice. Unless he explicitly disabled this feature
either in the kernel or in Xen (and this can only be disabled by
modifying the code, not user-supported way to do it).
After the offending commit (+ partial revert):
1) INTx is no longer support for HVM (only for PV guests).
2) Any HVM guest The kernel will not boot on Xen < 4.0 which does
not have vector injection support. Since the only other mode
supported is INTx which.
So based on this summary, I think before commit (72a9b186292) we were
in much better position from a user point of view.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1914f0cd203c941bba72f9452c8290324f1ef3dc upstream.
This was broken in commit cd979883b9ed ("xen/acpi-processor:
fix enabling interrupts on syscore_resume"). do_suspend (from
xen/manage.c) and thus xen_resume_notifier never get called on
the initial-domain at resume (it is if running as guest.)
The rationale for the breaking change was that upload_pm_data()
potentially does blocking work in syscore_resume(). This patch
addresses the original issue by scheduling upload_pm_data() to
execute in workqueue context.
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ae7871be189cb41184f1e05742b4a99e2c59774d upstream.
Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 30faaafdfa0c754c91bac60f216c9f34a2bfdf7e upstream.
Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
NUMA balancing") set VM_IO flag to prevent grant maps from being
subjected to NUMA balancing.
It was discovered recently that this flag causes get_user_pages() to
always fail with -EFAULT.
check_vma_flags
__get_user_pages
__get_user_pages_locked
__get_user_pages_unlocked
get_user_pages_fast
iov_iter_get_pages
dio_refill_pages
do_direct_IO
do_blockdev_direct_IO
do_blockdev_direct_IO
ext4_direct_IO_read
generic_file_read_iter
aio_run_iocb
(which can happen if guest's vdisk has direct-io-safe option).
To avoid this let's use VM_MIXEDMAP flag instead --- it prevents
NUMA balancing just as VM_IO does and has no effect on
check_vma_flags().
Reported-by: Olaf Hering <olaf@aepfle.de>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from David Vrabel:
- advertise control feature flags in xenstore
- fix x86 build when XEN_PVHVM is disabled
* tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xenbus: check return value of xenbus_scanf()
xenbus: prefer list_for_each()
x86: xen: move cpu_up functions out of ifdef
xenbus: advertise control feature flags
|
|
Don't ignore errors here: Set backend state to unknown when
unsuccessful.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
This is more efficient than list_for_each_safe() when list modification
is accompanied by breaking out of the loop.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
The Xen docs specify several flags which a guest can set to advertise
which values of the xenstore control/shutdown key it will recognize.
This patch adds code to write all the relevant feature-flag keys.
Based-on-patch-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"xen features and fixes for 4.9:
- switch to new CPU hotplug mechanism
- support driver_override in pciback
- require vector callback for HVM guests (the alternate mechanism via
the platform device has been broken for ages)"
* tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/x86: Update topology map for PV VCPUs
xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
xen/pciback: support driver_override
xen/pciback: avoid multiple entries in slot list
xen/pciback: simplify pcistub device handling
xen: Remove event channel notification through Xen PCI platform device
xen/events: Convert to hotplug state machine
xen/x86: Convert to hotplug state machine
x86/xen: add missing \n at end of printk warning message
xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
xen: Make VPMU init message look less scary
xen: rename xen_pmu_init() in sys-hypervisor.c
hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
xen/x86: Move irq allocation from Xen smp_op.cpu_up()
|
|
Support the driver_override scheme introduced with commit 782a985d7af2
("PCI: Introduce new device binding path using pci_dev.driver_override")
As pcistub_probe() is called for all devices (it has to check for a
match based on the slot address rather than device type) it has to
check for driver_override set to "pciback" itself.
Up to now for assigning a pci device to pciback you need something like:
echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe
while with the patch you can use the same mechanism as for similar
drivers like pci-stub and vfio-pci:
echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override
echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe
So e.g. libvirt doesn't need special handling for pciback.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
The Xen pciback driver has a list of all pci devices it is ready to
seize. There is no check whether a to be added entry already exists.
While this might be no problem in the common case it might confuse
those which consume the list via sysfs.
Modify the handling of this list by not adding an entry which already
exists. As this will be needed later split out the list handling into
a separate function.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
The Xen pciback driver maintains a list of all its seized devices.
There are two functions searching the list for a specific device with
basically the same semantics just returning different structures in
case of a match.
Split out the search function.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
from old kernel") using the INTx interrupt from Xen PCI platform
device for event channel notification would just lockup the guest
during bootup. postcore_initcall now calls xs_reset_watches which
will eventually try to read a value from XenStore and will get stuck
on read_reply at XenBus forever since the platform driver is not
probed yet and its INTx interrupt handler is not registered yet. That
means that the guest can not be notified at this moment of any pending
event channels and none of the per-event handlers will ever be invoked
(including the XenStore one) and the reply will never be picked up by
the kernel.
The exact stack where things get stuck during xenbus_init:
-xenbus_init
-xs_init
-xs_reset_watches
-xenbus_scanf
-xenbus_read
-xs_single
-xs_single
-xs_talkv
Vector callbacks have always been the favourite event notification
mechanism since their introduction in commit 38e20b07efd5 ("x86/xen:
event channels delivery on HVM.") and the vector callback feature has
always been advertised for quite some time by Xen that's why INTx was
broken for several years now without impacting anyone.
Luckily this also means that event channel notification through INTx
is basically dead-code which can be safely removed without impacting
anybody since it has been effectively disabled for more than 4 years
with nobody complaining about it (at least as far as I'm aware of).
This commit removes event channel notification through Xen PCI
platform device.
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Install the callbacks via the state machine.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
There are two functions with name xen_pmu_init() in the kernel. Rename
the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in
arch/x86/xen/pmu.c
To avoid the same problem in future rename some more functions.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
This should really only be done for XS_TRANSACTION_END messages, or
else at least some of the xenstore-* tools don't work anymore.
Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition")
Reported-by: Richard Schütz <rschuetz@uni-koblenz.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Richard Schütz <rschuetz@uni-koblenz.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"Features and fixes for 4.8-rc0:
- ACPI support for guests on ARM platforms.
- Generic steal time support for arm and x86.
- Support cases where kernel cpu is not Xen VCPU number (e.g., if
in-guest kexec is used).
- Use the system workqueue instead of a custom workqueue in various
places"
* tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits)
xen: add static initialization of steal_clock op to xen_time_ops
xen/pvhvm: run xen_vcpu_setup() for the boot CPU
xen/evtchn: use xen_vcpu_id mapping
xen/events: fifo: use xen_vcpu_id mapping
xen/events: use xen_vcpu_id mapping in events_base
x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
xen: introduce xen_vcpu_id mapping
x86/acpi: store ACPI ids from MADT for future usage
x86/xen: update cpuid.h from Xen-4.7
xen/evtchn: add IOCTL_EVTCHN_RESTRICT
xen-blkback: really don't leak mode property
xen-blkback: constify instance of "struct attribute_group"
xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
xen-blkback: prefer xenbus_scanf() over xenbus_gather()
xen: support runqueue steal time on xen
arm/xen: add support for vm_assist hypercall
xen: update xen headers
xen-pciback: drop superfluous variables
xen-pciback: short-circuit read path used for merging write values
...
|
|
I have noticed that frontswap.h first declares "frontswap_enabled" as
extern bool variable, and then overrides it with "#define
frontswap_enabled (1)" for CONFIG_FRONTSWAP=Y or (0) when disabled. The
bool variable isn't actually instantiated anywhere.
This all looks like an unfinished attempt to make frontswap_enabled
reflect whether a backend is instantiated. But in the current state,
all frontswap hooks call unconditionally into frontswap.c just to check
if frontswap_ops is non-NULL. This should at least be checked inline,
but we can further eliminate the overhead when CONFIG_FRONTSWAP is
enabled and no backend registered, using a static key that is initially
disabled, and gets enabled only upon first backend registration.
Thus, checks for "frontswap_enabled" are replaced with
"frontswap_enabled()" wrapping the static key check. There are two
exceptions:
- xen's selfballoon_process() was testing frontswap_enabled in code guarded
by #ifdef CONFIG_FRONTSWAP, which was effectively always true when reachable.
The patch just removes this check. Using frontswap_enabled() does not sound
correct here, as this can be true even without xen's own backend being
registered.
- in SYSCALL_DEFINE2(swapon), change the check to IS_ENABLED(CONFIG_FRONTSWAP)
as it seems the bitmap allocation cannot currently be postponed until a
backend is registered. This means that frontswap will still have some
memory overhead by being configured, but without a backend.
After the patch, we can expect that some functions in frontswap.c are
called only when frontswap_ops is non-NULL. Change the checks there to
VM_BUG_ONs. While at it, convert other BUG_ONs to VM_BUG_ONs as
frontswap has been stable for some time.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1463152235-9717-1-git-send-email-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU
id for CPU0.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of
vCPU id should be used. Use the newly introduced xen_vcpu_id mapping
to convert it from Linux's id.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter
and Xen's idea of vCPU id should be used. Use the newly introduced
xen_vcpu_id mapping to convert it from Linux's id.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
while Xen's idea is expected. In some cases these ideas diverge so we
need to do remapping.
Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().
Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
they're only being called by PV guests before perpu areas are
initialized. While the issue could be solved by switching to
early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
probably never get to the point where their idea of vCPU id diverges
from Xen's.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
IOCTL_EVTCHN_RESTRICT limits the file descriptor to being able to bind
to interdomain event channels from a specific domain. Event channels
that are already bound continue to work for sending and receiving
notifications.
This is useful as part of deprivileging a user space PV backend or
device model (QEMU). e.g., Once the device model as bound to the
ioreq server event channels it can restrict the file handle so an
exploited DM cannot use it to create or bind to arbitrary event
channels.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and
CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on
both Intel and AMD systems. Doing any kind of hardware capability
checks in the driver as a prerequisite was wrong anyway: With the
hypervisor being in charge, all such checking should be done by it. If
ACPI data gets uploaded despite some missing capability, the hypervisor
is free to ignore part or all of that data.
Ditch the entire check_prereq() function, and do the only valid check
(xen_initial_domain()) in the caller in its place.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
No need to retain a local copy of the full request message, only the
type is really needed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
xenbus_dev_request_and_reply() needs to track whether a transaction is
open. For XS_TRANSACTION_START messages it calls transaction_start()
and for XS_TRANSACTION_END messages it calls transaction_end().
If sending an XS_TRANSACTION_START message fails or responds with an
an error, the transaction is not open and transaction_end() must be
called.
If sending an XS_TRANSACTION_END message fails, the transaction is
still open, but if an error response is returned the transaction is
closed.
Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus
stalling shutdown/restart") introduced a regression where failed
XS_TRANSACTION_START messages were leaving the transaction open. This
can cause problems with suspend (and migration) as all transactions
must be closed before suspending.
It appears that the problematic change was added accidentally, so just
remove it.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Inability to locate a user mode specified transaction ID should not
lead to a kernel crash. For other than XS_TRANSACTION_START also
don't issue anything to xenbus if the specified ID doesn't match that
of any active transaction.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Up to now reading the stolen time of a remote cpu was not possible in a
performant way under Xen. This made support of runqueue steal time via
paravirt_steal_rq_enabled impossible.
With the addition of an appropriate hypervisor interface this is now
possible, so add the support.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
req_start is simply an alias of the "offset" function parameter, and
req_end is being used just once in each function. (And both variables
were loop invariant anyway, so should at least have got initialized
outside the loop.)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
There's no point calling xen_pcibk_config_read() here - all it'll do is
return whatever conf_space_read() returns for the field which was found
here (and which would be found there again). Also there's no point
clearing tmp_val before the call.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Other than for raw BAR values, flags are properly separated in the
internal representation.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
It is now identical to bar_init().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
System workqueues have been able to handle high level of concurrency
for a long time now and there's no reason to use dedicated workqueues
just to gain concurrency. Replace dedicated xenbus_frontend_wq with the
use of system_wq.
Unlike a dedicated per-cpu workqueue created with create_workqueue(),
system_wq allows multiple work items to overlap executions even on
the same CPU; however, a per-cpu workqueue doesn't have any CPU
locality or global ordering guarantees unless the target CPU is
explicitly specified and the increase of local concurrency shouldn't
make any difference.
In this case, there is only a single work item, increase of concurrency
level by switching to system_wq should not make any difference.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
System workqueues have been able to handle high level of concurrency
for a long time now and there's no reason to use dedicated workqueues
just to gain concurrency. Replace dedicated xen_pcibk_wq with the
use of system_wq.
Unlike a dedicated per-cpu workqueue created with create_workqueue(),
system_wq allows multiple work items to overlap executions even on
the same CPU; however, a per-cpu workqueue doesn't have any CPU
locality or global ordering guarantees unless the target CPU is
explicitly specified and thus the increase of local concurrency shouldn't
make any difference.
Since the work items could be pending, flush_work() has been used in
xen_pcibk_disconnect(). xen_pcibk_xenbus_remove() calls free_pdev()
which in turn calls xen_pcibk_disconnect() for every pdev to ensure that
there is no pending task while disconnecting the driver.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
The pv_time_ops structure contains a function pointer for the
"steal_clock" functionality used only by KVM and Xen on ARM. Xen on x86
uses its own mechanism to account for the "stolen" time a thread wasn't
able to run due to hypervisor scheduling.
Add support in Xen arch independent time handling for this feature by
moving it out of the arm arch into drivers/xen and remove the x86 Xen
hack.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Replace explicit computation of vma page count by a call to
vma_pages().
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
When running on Xen hypervisor, runtime services are supported through
hypercall. Add a Xen specific function to initialize runtime services.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Move x86 specific codes to architecture directory and export those EFI
runtime service functions. This will be useful for initializing runtime
service on ARM later.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
|
|
Add a bus_notifier for AMBA bus device in order to map the device
mmio regions when DOM0 booting with ACPI.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
|
|
Add a bus_notifier for platform bus device in order to map the device
mmio regions when DOM0 booting with ACPI.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Julien Grall <julien.grall@arm.com>
|