summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2018-06-06powerpc/64s: Add support for a store forwarding barrier at kernel entry/exitNicholas Piggin
commit a048a07d7f4535baa4cbad6bc024f175317ab938 upstream. On some CPUs we can prevent a vulnerability related to store-to-load forwarding by preventing store forwarding between privilege domains, by inserting a barrier in kernel entry and exit paths. This is known to be the case on at least Power7, Power8 and Power9 powerpc CPUs. Barriers must be inserted generally before the first load after moving to a higher privilege, and after the last store before moving to a lower privilege, HV and PR privilege transitions must be protected. Barriers are added as patch sections, with all kernel/hypervisor entry points patched, and the exit points to lower privilge levels patched similarly to the RFI flush patching. Firmware advertisement is not implemented yet, so CPU flush types are hard coded. Thanks to Michal Suchánek for bug fixes and review. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06powerpc: Move default security feature flagsMauricio Faria de Oliveira
commit e7347a86830f38dc3e40c8f7e28c04412b12a2e7 upstream. This moves the definition of the default security feature flags (i.e., enabled by default) closer to the security feature flags. This can be used to restore current flags to the default flags. Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06powerpc/64s: Move cpu_show_meltdown()Michael Ellerman
commit 8ad33041563a10b34988800c682ada14b2612533 upstream. This landed in setup_64.c for no good reason other than we had nowhere else to put it. Now that we have a security-related file, that is a better place for it so move it. [mpe: Add extern for rfi_flush to fix bisection break] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06powerpc: Add security feature flags for Spectre/MeltdownMichael Ellerman
commit 9a868f634349e62922c226834aa23e3d1329ae7f upstream. This commit adds security feature flags to reflect the settings we receive from firmware regarding Spectre/Meltdown mitigations. The feature names reflect the names we are given by firmware on bare metal machines. See the hostboot source for details. Arguably these could be firmware features, but that then requires them to be read early in boot so they're available prior to asm feature patching, but we don't actually want to use them for patching. We may also want to dynamically update them in future, which would be incompatible with the way firmware features work (at the moment at least). So for now just make them separate flags. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flagsMichael Ellerman
commit c4bc36628d7f8b664657d8bd6ad1c44c177880b7 upstream. Add some additional values which have been defined for the H_GET_CPU_CHARACTERISTICS hypercall. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06powerpc/rfi-flush: Make it possible to call setup_rfi_flush() againMichael Ellerman
commit abf110f3e1cea40f5ea15e85f5d67c39c14568a7 upstream. For PowerVM migration we want to be able to call setup_rfi_flush() again after we've migrated the partition. To support that we need to check that we're not trying to allocate the fallback flush area after memblock has gone away (i.e., boot-time only). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc: Add missing prototype for arch_irq_work_raise()Mathieu Malaterre
[ Upstream commit f5246862f82f1e16bbf84cda4cddf287672b30fe ] In commit 4f8b50bbbe63 ("irq_work, ppc: Fix up arch hooks") a new function arch_irq_work_raise() was added without a prototype in header irq_work.h. Fix the following warning (treated as error in W=1): arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’ Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-19futex: Remove duplicated code and fix undefined behaviourJiri Slaby
commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream. There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/powernv: define a standard delay for OPAL_BUSY type retry loopsNicholas Piggin
commit 34dd25de9fe3f60bfdb31b473bf04b28262d0896 upstream. This is the start of an effort to tidy up and standardise all the delays. Existing loops have a range of delay/sleep periods from 1ms to 20ms, and some have no delay. They all loop forever except rtc, which times out after 10 retries, and that uses 10ms delays. So use 10ms as our standard delay. The OPAL maintainer agrees 10ms is a reasonable starting point. The idea is to use the same recipe everywhere, once this is proven to work then it will be documented as an OPAL API standard. Then both firmware and OS can agree, and if a particular call needs something else, then that can be documented with reasoning. This is not the end-all of this effort, it's just a relatively easy change that fixes some existing high latency delays. There should be provision for standardising timeouts and/or interruptible loops where possible, so non-fatal firmware errors don't cause hangs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/64: Fix smp_wmb barrier definition use use lwsync consistentlyNicholas Piggin
commit 0bfdf598900fd62869659f360d3387ed80eb71cf upstream. asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hashMichael Ellerman
[ Upstream commit e41e53cd4fe331d0d1f06f8e4ed7e2cc63ee2c34 ] virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on an address. What this means in practice is that it should only return true for addresses in the linear mapping which are backed by a valid PFN. We are failing to properly check that the address is in the linear mapping, because virt_to_pfn() will return a valid looking PFN for more or less any address. That bug is actually caused by __pa(), used in virt_to_pfn(). eg: __pa(0xc000000000010000) = 0x10000 # Good __pa(0xd000000000010000) = 0x10000 # Bad! __pa(0x0000000000010000) = 0x10000 # Bad! This started happening after commit bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give you something bogus back. Until we can verify if that GCC bug is no longer an issue, or come up with another solution, this commit does the minimal fix to make virt_addr_valid() work, by explicitly checking that the address is in the linear mapping region. Fixes: bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Tested-by: Breno Leitao <breno.leitao@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/modules: If mprofile-kernel is enabled add it to vermagicMichael Ellerman
[ Upstream commit 43e24e82f35291d4c1ca78877ce1b20d3aeb78f1 ] On powerpc we can build the kernel with two different ABIs for mcount(), which is used by ftrace. Kernels built with one ABI do not know how to load modules built with the other ABI. The new style ABI is called "mprofile-kernel", for want of a better name. Currently if we build a module using the old style ABI, and the kernel with mprofile-kernel, when we load the module we'll oops something like: # insmod autofs4-no-mprofile-kernel.ko ftrace-powerpc: Unexpected instruction f8810028 around bl _mcount ------------[ cut here ]------------ WARNING: CPU: 6 PID: 3759 at ../kernel/trace/ftrace.c:2024 ftrace_bug+0x2b8/0x3c0 CPU: 6 PID: 3759 Comm: insmod Not tainted 4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269 #11 ... NIP [c0000000001eaa48] ftrace_bug+0x2b8/0x3c0 LR [c0000000001eaff8] ftrace_process_locs+0x4a8/0x590 Call Trace: alloc_pages_current+0xc4/0x1d0 (unreliable) ftrace_process_locs+0x4a8/0x590 load_module+0x1c8c/0x28f0 SyS_finit_module+0x110/0x140 system_call+0x38/0xfc ... ftrace failed to modify [<d000000002a31024>] 0xd000000002a31024 actual: 35:65:00:48 We can avoid this by including in the vermagic whether the kernel/module was built with mprofile-kernel. Which results in: # insmod autofs4-pg.ko autofs4: version magic '4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269 SMP mod_unload modversions ' should be '4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269-dirty SMP mod_unload modversions mprofile-kernel' insmod: ERROR: could not insert module autofs4-pg.ko: Invalid module format Fixes: 8c50b72a3b4f ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Jessica Yu <jeyu@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24powerpc/64s: Remove SAO feature from Power9 DD1Nicholas Piggin
[ Upstream commit ca80d5d0a8175c9be04cfbce24180b8f5e0a744b ] Power9 DD1 does not implement SAO. Although it's not widely used, its presence or absence is visible to user space via arch_validate_prot() so it's moderately important that we get the value right. Fixes: 7dccfbc325bb ("powerpc/book3s: Add a cpu table entry for different POWER9 revs") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22powerpc/modules: Don't try to restore r2 after a sibling callJosh Poimboeuf
[ Upstream commit b9eab08d012fa093947b230f9a87257c27fb829b ] When attempting to load a livepatch module, I got the following error: module_64: patch_module: Expect noop after relocate, got 3c820000 The error was triggered by the following code in unregister_netdevice_queue(): 14c: 00 00 00 48 b 14c <unregister_netdevice_queue+0x14c> 14c: R_PPC64_REL24 net_set_todo 150: 00 00 82 3c addis r4,r2,0 GCC didn't insert a nop after the branch to net_set_todo() because it's a sibling call, so it never returns. The nop isn't needed after the branch in that case. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-and-tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25powerpc/64s: Improve RFI L1-D cache flush fallbackNicholas Piggin
commit bdcb1aefc5b3f7d0f1dc8b02673602bca2ff7a4b upstream. The fallback RFI flush is used when firmware does not provide a way to flush the cache. It's a "displacement flush" that evicts useful data by displacing it with an uninteresting buffer. The flush has to take care to work with implementation specific cache replacment policies, so the recipe has been in flux. The initial slow but conservative approach is to touch all lines of a congruence class, with dependencies between each load. It has since been determined that a linear pattern of loads without dependencies is sufficient, and is significantly faster. Measuring the speed of a null syscall with RFI fallback flush enabled gives the relative improvement: P8 - 1.83x P9 - 1.75x The flush also becomes simpler and more adaptable to different cache geometries. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Backport to 4.9] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25powerpc/64s: Simple RFI macro conversionsNicholas Piggin
commit 222f20f140623ef6033491d0103ee0875fe87d35 upstream. This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Backport to 4.9, use RFI_TO_KERNEL in idle_book3s.S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-17powerpc/pseries: include linux/types.h in asm/hvcall.hMichal Suchanek
commit 1b689a95ce7427075f9ac9fb4aea1af530742b7f upstream. Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-13powerpc/64s: Add support for RFI flush of L1-D cacheMichael Ellerman
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream. On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir - back ported to stable with changes] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-13powerpc/64: Add macros for annotating the destination of rfid/hrfidNicholas Piggin
commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream. The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-13powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapperMichael Neuling
commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream. A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed conflicts in backport] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofoldPaul Mackerras
commit b492f7e4e07a28e706db26cf4943bb0911435426 upstream. These functions compute an IP checksum by computing a 64-bit sum and folding it to 32 bits (the "nofold" in their names refers to folding down to 16 bits). However, doing (u32) (s + (s >> 32)) is not sufficient to fold a 64-bit sum to 32 bits correctly. The addition can produce a carry out from bit 31, which needs to be added in to the sum to produce the correct result. To fix this, we copy the from64to32() function from lib/checksum.c and use that. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14powerpc/64: Fix checksum folding in csum_add()Shile Zhang
[ Upstream commit 6ad966d7303b70165228dba1ee8da1a05c10eefe ] Paul's patch to fix checksum folding, commit b492f7e4e07a ("powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold") missed a case in csum_add(). Fix it. Signed-off-by: Shile Zhang <shile.zhang@nokia.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09powerpc/mm: Fix memory hotplug BUG() on radixReza Arbab
[ Upstream commit 32b53c012e0bfe20b2745962a89db0dc72ef3270 ] Memory hotplug is leading to hash page table calls, even on radix: arch_add_memory create_section_mapping htab_bolt_mapping BUG_ON(!ppc_md.hpte_insert); To fix, refactor {create,remove}_section_mapping() into hash__ and radix__ variants. Leave the radix versions stubbed for now. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-30powerpc/mm: Ensure cpumask update is orderedBenjamin Herrenschmidt
commit 1a92a80ad386a1a6e3b36d576d52a1a456394b70 upstream. There is no guarantee that the various isync's involved with the context switch will order the update of the CPU mask with the first TLB entry for the new context being loaded by the HW. Be safe here and add a memory barrier to order any subsequent load/store which may bring entries into the TLB. The corresponding barrier on the other side already exists as pte updates use pte_xchg() which uses __cmpxchg_u64 which has a sync after the atomic operation. Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add comments in the code] [mpe: Backport to 4.12, minor context change] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-06Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"Greg Kroah-Hartman
This reverts commit b4624ff952ec7d268a9651cd9184a1995befc271 which is commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream. Michal Hocko writes: JFYI. We have encountered a regression after applying this patch on a large ppc machine. While the patch is the right thing to do it doesn't work well with the current vmalloc area size on ppc and large machines where NUMA nodes are very far from each other. Just for the reference the boot fails on such a machine with bunch of warning preceeding it. See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz It seems the right thing to do is to enlarge the vmalloc space on ppc but this is not the case in the upstream kernel yet AFAIK. It is also questionable whether that is a stable material but I will decision on you here. We have reverted this patch from our 4.4 based kernel. Newer kernels do not have enlarged vmalloc space yet AFAIK so they won't work properly eiter. This bug is quite rare though because you need a specific HW configuration to trigger the issue - namely NUMA nodes have to be far away from each other in the physical memory space. Cc: Michal Hocko <mhocko@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27powerpc/asm: Mark cr0 as clobbered in mftb()Oliver O'Halloran
commit 2400fd822f467cb4c886c879d8ad99feac9cf319 upstream. The workaround for the CELL timebase bug does not correctly mark cr0 as being clobbered. This means GCC doesn't know that the asm block changes cr0 and might leave the result of an unrelated comparison in cr0 across the block, which we then trash, leading to basically random behaviour. Fixes: 859deea949c3 ("[POWERPC] Cell timebase bug workaround") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Tweak change log and flag for stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27powerpc/64: Fix atomic64_inc_not_zero() to return an intMichael Ellerman
commit 01e6a61aceb82e13bec29502a8eb70d9574f97ad upstream. Although it's not documented anywhere, there is an expectation that atomic64_inc_not_zero() returns a result which fits in an int. This is the behaviour implemented on all arches except powerpc. This has caused at least one bug in practice, in the percpu-refcount code, where the long result from our atomic64_inc_not_zero() was truncated to an int leading to lost references and stuck systems. That was worked around in that code in commit 966d2b04e070 ("percpu-refcount: fix reference leak during percpu-atomic transition"). To the best of my grepping abilities there are no other callers in-tree which truncate the value, but we should fix it anyway. Because the breakage is subtle and potentially very harmful I'm also tagging it for stable. Code generation is largely unaffected because in most cases the callers are just using the result for a test anyway. In particular the case of fget() that was mentioned in commit a6cf7ed5119f ("powerpc/atomic: Implement atomic*_inc_not_zero") generates exactly the same code. Fixes: a6cf7ed5119f ("powerpc/atomic: Implement atomic*_inc_not_zero") Noticed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21powerpc: move ELF_ET_DYN_BASE to 4GB / 4MBKees Cook
commit 47ebb09d54856500c5a5e14824781902b3bb738e upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14powerpc/numa: Fix percpu allocations to be NUMA awareMichael Ellerman
commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream. In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/mm: Ensure IRQs are off in switch_mm()David Gibson
commit 9765ad134a00a01cbcc69c78ff6defbfad209bc5 upstream. powerpc expects IRQs to already be (soft) disabled when switch_mm() is called, as made clear in the commit message of 9c1e105238c4 ("powerpc: Allow perf_counters to access user memory at interrupt time"). Aside from any race conditions that might exist between switch_mm() and an IRQ, there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't followed at some point by an IRQ enable then interrupts will remain disabled until we return to userspace. It is true that when switch_mm() is called from the scheduler IRQs are off, but not when it's called by use_mm(). Looking closer we see that last year in commit f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") this was made more explicit by the addition of switch_mm_irqs_off() which is now called by the scheduler, vs switch_mm() which is used by use_mm(). Arguably it is a bug in use_mm() to call switch_mm() in a different context than it expects, but fixing that will take time. This was discovered recently when vhost started throwing warnings such as: BUG: sleeping function called from invalid context at kernel/mutex.c:578 in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760 no locks held by vhost-10760/10768. irq event stamp: 10 hardirqs last enabled at (9): _raw_spin_unlock_irq+0x40/0x80 hardirqs last disabled at (10): switch_slb+0x2e4/0x490 softirqs last enabled at (0): copy_process+0x5e8/0x1260 softirqs last disabled at (0): (null) Call Trace: show_stack+0x88/0x390 (unreliable) dump_stack+0x30/0x44 __might_sleep+0x1c4/0x2d0 mutex_lock_nested+0x74/0x5c0 cgroup_attach_task_all+0x5c/0x180 vhost_attach_cgroups_work+0x58/0x80 [vhost] vhost_worker+0x24c/0x3d0 [vhost] kthread+0xec/0x100 ret_from_kernel_thread+0x5c/0xd4 Prior to commit 04b96e5528ca ("vhost: lockless enqueuing") (Aug 2016) the vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(), which had the effect of reenabling IRQs. Since that commit removed the locking in vhost_worker() the body of the vhost_worker() loop now runs with interrupts off causing the warnings. This patch addresses the problem by making the powerpc code mirror the x86 code, ie. we disable interrupts in switch_mm(), and optimise the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Flesh out/rewrite change log, add stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14powerpc/mm: Fixup wrong LPCR_VRMASD valueAneesh Kumar K.V
commit 4ab2537c4204b976e4ca350bbdc193b4649cad28 upstream. In commit a4b349540a26af ("powerpc/mm: Cleanup LPCR defines") we updated LPCR_VRMASD wrongly as below. -#define LPCR_VRMASD (0x1ful << (63-16)) +#define LPCR_VRMASD_SH 47 +#define LPCR_VRMASD (ASM_CONST(1) << LPCR_VRMASD_SH) We initialize the VRMA bits in LPCR to 0x00 in kvm. Hence using a different mask value as above while updating lpcr should not have any impact. This patch updates it to the correct value. Fixes: a4b349540a26 ("powerpc/mm: Cleanup LPCR defines") Reported-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Jia He <hejianet@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22powerpc/iommu: Stop using @current in mm_iommu_xxxAlexey Kardashevskiy
[ Upstream commit d7baee6901b34c4895eb78efdbf13a49079d7404 ] This changes mm_iommu_xxx helpers to take mm_struct as a parameter instead of getting it from @current which in some situations may not have a valid reference to mm. This changes helpers to receive @mm and moves all references to @current to the caller, including checks for !current and !current->mm; checks in mm_iommu_preregistered() are removed as there is no caller yet. This moves the mm_iommu_adjust_locked_vm() call to the caller as it receives mm_iommu_table_group_mem_t but it needs mm. This should cause no behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22powerpc/iommu: Pass mm_struct to init/cleanup helpersAlexey Kardashevskiy
[ Upstream commit 88f54a3581eb9deaa3bd1aade40aef266d782385 ] We are going to get rid of @current references in mmu_context_boos3s64.c and cache mm_struct in the VFIO container. Since mm_context_t does not have reference counting, we will be using mm_struct which does have the reference counter. This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather than mm_context_t (which is embedded into mm). This should not cause any behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12powerpc/mm: Add MMU_FTR_KERNEL_RO to possible feature maskAneesh Kumar K.V
commit a5ecdad4847897007399d7a14c9109b65ce4c9b7 upstream. Without this we will always find the feature disabled. Fixes: 984d7a1ec6 ("powerpc/mm: Fixup kernel read only mapping") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-14powerpc/powernv: Fix CPU hotplug to handle waking on HVIBenjamin Herrenschmidt
commit 9b256714979fad61ae11d90b53cf67dd5e6484eb upstream. The IPIs come in as HVI not EE, so we need to test the appropriate SRR1 bits. The encoding is such that it won't have false positives on P7 and P8 so we can just test it like that. We also need to handle the icp-opal variant of the flush. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09powerpc: Fix build failure with clang due to BUILD_BUG_ON()Michael Ellerman
commit b5fa0f7f88edcde37df1807fdf9ff10ec787a60e upstream. Anton says: In commit 4db7327194db ("powerpc: Add option to use jump label for cpu_has_feature()") and commit c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()") we added: BUILD_BUG_ON(!__builtin_constant_p(feature)) to cpu_has_feature() and mmu_has_feature() in order to catch usage issues (such as cpu_has_feature(cpu_has_feature(X), which has happened once in the past). Unfortunately LLVM isn't smart enough to resolve this, and it errors out. I work around it in my clang/LLVM builds of the kernel, but I have just discovered that it causes a lot of issues for the bcc (eBPF) trace tool (which uses LLVM). For now just #ifdef it away for clang builds. Fixes: 4db7327194db ("powerpc: Add option to use jump label for cpu_has_feature()") Fixes: c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()") Reported-by: Anton Blanchard <anton@samba.org> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26powerpc: Ignore reserved field in DCSR and PVR reads and writesAnton Blanchard
commit 178f358208ceb8b38e5cff3f815e0db4a6a70a07 upstream. IBM bit 31 (for the rest of us - bit 0) is a reserved field in the instruction definition of mtspr and mfspr. Hardware is encouraged to (and does) ignore it. As a result, if userspace executes an mtspr DSCR with the reserved bit set, we get a DSCR facility unavailable exception. The kernel fails to match against the expected value/mask, and we silently return to userspace to try and re-execute the same mtspr DSCR instruction. We loop forever until the process is killed. We should do something here, and it seems mirroring what hardware does is the better option vs killing the process. While here, relax the matching of mfspr PVR too. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19powerpc/64: Simplify adaptation to new ISA v3.00 HPTE formatPaul Mackerras
commit 6b243fcfb5f1e16bcf732e6f86a63f8af5b59a9f upstream. This changes the way that we support the new ISA v3.00 HPTE format. Instead of adapting everything that uses HPTE values to handle either the old format or the new format, depending on which CPU we are on, we now convert explicitly between old and new formats if necessary in the low-level routines that actually access HPTEs in memory. This limits the amount of code that needs to know about the new format and makes the conversions explicit. This is OK because the old format contains all the information that is in the new format. This also fixes operation under a hypervisor, because the H_ENTER hypercall (and other hypercalls that deal with HPTEs) will continue to require the HPTE value to be supplied in the old format. At present the kernel will not boot in HPT mode on POWER9 under a hypervisor. This fixes and partially reverts commit 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29). Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09KVM: PPC: Book3S HV: Save/restore XER in checkpointed register statePaul Mackerras
commit 0d808df06a44200f52262b6eb72bcb6042f5a7c5 upstream. When switching from/to a guest that has a transaction in progress, we need to save/restore the checkpointed register state. Although XER is part of the CPU state that gets checkpointed, the code that does this saving and restoring doesn't save/restore XER. This fixes it by saving and restoring the XER. To allow userspace to read/write the checkpointed XER value, we also add a new ONE_REG specifier. The visible effect of this bug is that the guest may see its XER value being corrupted when it uses transactions. Fixes: e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") Fixes: 0a8eccefcb34 ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-26Merge tag 'powerpc-4.9-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Set missing wakeup bit in LPCR on POWER9 - Fix the early OPAL console wrappers - Fixup kernel read only mapping Fixes for code merged this cycle: - Fix missing CRCs, add more asm-prototypes.h declarations" * tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fixup kernel read only mapping powerpc/boot: Fix the early OPAL console wrappers powerpc: Fix missing CRCs, add more asm-prototypes.h declarations powerpc: Set missing wakeup bit in LPCR on POWER9
2016-11-25powerpc/mm: Fixup kernel read only mappingAneesh Kumar K.V
With commit e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO") we started using the ppp value 0b110 to map kernel readonly. But that facility was only added as part of ISA 2.04. For earlier ISA version only supported ppp bit value for readonly mapping is 0b011. (This implies both user and kernel get mapped using the same ppp bit value for readonly mapping.). Update the code such that for earlier architecture version we use ppp value 0b011 for readonly mapping. We don't differentiate between power5+ and power5 here and apply the new ppp bits only from power6 (ISA 2.05). This keep the changes minimal. This fixes issue with PS3 spu usage reported at https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org Fixes: e58e87adc8bf9 ("powerpc/mm: Update _PAGE_KERNEL_RO") Cc: stable@vger.kernel.org # v4.7+ Tested-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22powerpc: Fix missing CRCs, add more asm-prototypes.h declarationsNicholas Piggin
After patch 4efca4ed0 ("kbuild: modversions for EXPORT_SYMBOL() for asm"), asm exports can get modversions CRCs generated if they have C definitions in asm-prototypes.h. This patch adds missing definitions for 32 and 64 bit allmodconfig builds. Fixes: 9445aa1a3062 ("ppc: move exports to definitions") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22powerpc: Set missing wakeup bit in LPCR on POWER9Benjamin Herrenschmidt
There is a new bit, LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable), which controls wakeup from STOP states on Hypervisor Virtualization Interrupts (which happen to also be all external interrupts in host or bare metal mode). It needs to be set or we will miss wakeups. Fixes: 9baaef0a22c8 ("powerpc/irq: Add support for HV virtualization interrupts") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Rename it to HVEE to match the name in the ISA] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-19Merge tag 'powerpc-4.9-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - fix system reset interrupt winkle wakeups - fix setting of AIL in hypervisor mode Fixes for code merged this cycle: - fix exception vector build with 2.23 era binutils - fix missing update of HID register on secondary CPUs Other: - fix missing pr_cont()s - invalidate ERAT on tlbiel for POWER9 DD1" * tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix missing update of HID register on secondary CPUs powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1 powerpc/64: Fix setting of AIL in hypervisor mode powerpc/oops: Fix missing pr_cont()s in instruction dump powerpc/oops: Fix missing pr_cont()s in show_regs() powerpc/oops: Fix missing pr_cont()s in print_msr_bits() et. al. powerpc/oops: Fix missing pr_cont()s in show_stack() powerpc: Fix exception vector build with 2.23 era binutils powerpc/64s: Fix system reset interrupt winkle wakeups
2016-11-18powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1Michael Neuling
On POWER9 DD1, when we do a local TLB invalidate we also need to explicitly invalidate the ERAT. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-12powerpc: Fix exception vector build with 2.23 era binutilsHugh Dickins
The changes to use gas sections for constructing the exception vectors causes a build break when using binutils 2.23: arch/powerpc/kernel/exceptions-64s.S:770: Error: operand out of range (0xffffffffffff8100 is not between 0x0000000000000000 and 0x000000000000ffff) And so on. Reported by Hugh with binutils-2.23.2-8.1.4.ppc64 from openSUSE 13.1 and also Naveen & Denis using 2.23.52.0.1-26.el7 from RHEL 7. Strangely binutils 2.22 (what I test with) is not affected. This is caused by the use of @l in LOAD_HANDLER(). The @l was only recently added in commit a24553dd02dc ("powerpc/pseries: Remove unnecessary syscall trampoline"). Luckily the gas section changes split out the LOAD_SYSCALL_HANDLER() macro, which means we actually *don't* need to use @l in LOAD_HANDLER() any more, only in LOAD_SYSCALL_HANDLER(). So drop the @l from LOAD_HANDLER(). Fixes: 57f266497d81 ("powerpc: Use gas sections for arranging exception vectors") Signed-off-by: Hugh Dickins <hughd@google.com> [mpe: Add gory details to change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-12powerpc/64s: Fix system reset interrupt winkle wakeupsNicholas Piggin
Wakeups from winkle set the low bit of the HSPRG0 register, to distinguish it from other sleep states. This is also the PACA pointer. The system reset exception handler fails to mask this bit away before using this value before using it as the PACA pointer. Fix this by adding a new type of exception prolog macro where we already have the PACA set in r13, and have the system reset vector mask it out. The winkle wakeup handler will store the masked value back into HSPRG0. Fixes: fb479e44a9e2 ("powerpc/64s: relocation, register save fixes for system reset interrupt") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Lots of fixes, mostly drivers as is usually the case. 1) Don't treat zero DMA address as invalid in vmxnet3, from Alexey Khoroshilov. 2) Fix element timeouts in netfilter's nft_dynset, from Anders K. Pedersen. 3) Don't put aead_req crypto struct on the stack in mac80211, from Ard Biesheuvel. 4) Several uninitialized variable warning fixes from Arnd Bergmann. 5) Fix memory leak in cxgb4, from Colin Ian King. 6) Fix bpf handling of VLAN header push/pop, from Daniel Borkmann. 7) Several VRF semantic fixes from David Ahern. 8) Set skb->protocol properly in ip6_tnl_xmit(), from Eli Cooper. 9) Socket needs to be locked in udp_disconnect(), from Eric Dumazet. 10) Div-by-zero on 32-bit fix in mlx4 driver, from Eugenia Emantayev. 11) Fix stale link state during failover in NCSCI driver, from Gavin Shan. 12) Fix netdev lower adjacency list traversal, from Ido Schimmel. 13) Propvide proper handle when emitting notifications of filter deletes, from Jamal Hadi Salim. 14) Memory leaks and big-endian issues in rtl8xxxu, from Jes Sorensen. 15) Fix DESYNC_FACTOR handling in ipv6, from Jiri Bohac. 16) Several routing offload fixes in mlxsw driver, from Jiri Pirko. 17) Fix broadcast sync problem in TIPC, from Jon Paul Maloy. 18) Validate chunk len before using it in SCTP, from Marcelo Ricardo Leitner. 19) Revert a netns locking change that causes regressions, from Paul Moore. 20) Add recursion limit to GRO handling, from Sabrina Dubroca. 21) GFP_KERNEL in irq context fix in ibmvnic, from Thomas Falcon. 22) Avoid accessing stale vxlan/geneve socket in data path, from Pravin Shelar" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (189 commits) geneve: avoid using stale geneve socket. vxlan: avoid using stale vxlan socket. qede: Fix out-of-bound fastpath memory access net: phy: dp83848: add dp83822 PHY support enic: fix rq disable tipc: fix broadcast link synchronization problem ibmvnic: Fix missing brackets in init_sub_crq_irqs ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context" arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold net/mlx4_en: Save slave ethtool stats command net/mlx4_en: Fix potential deadlock in port statistics flow net/mlx4: Fix firmware command timeout during interrupt test net/mlx4_core: Do not access comm channel if it has not yet been initialized net/mlx4_en: Fix panic during reboot net/mlx4_en: Process all completions in RX rings after port goes up net/mlx4_en: Resolve dividing by zero in 32-bit system net/mlx4_core: Change the default value of enable_qos net/mlx4_core: Avoid setting ports to auto when only one port type is supported net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec ...
2016-10-29arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofoldIvan Vecera
Commit 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their original types" changed parameters for csum_tcpudp_magic and csum_tcpudp_nofold for many platforms but not for PowerPC. Fixes: 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their original types" Cc: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27powerpc/64s: relocation, register save fixes for system reset interruptNicholas Piggin
This patch does a couple of things. First of all, powernv immediately explodes when running a relocated kernel, because the system reset exception for handling sleeps does not do correct relocated branches. Secondly, the sleep handling code trashes the condition and cfar registers, which we would like to preserve for debugging purposes (for non-sleep case exception). This patch changes the exception to use the standard format that saves registers before any tests or branches are made. It adds the test for idle-wakeup as an "extra" to break out of the normal exception path. Then it branches to a relocated idle handler that calls the various idle handling functions. After this patch, POWER8 CPU simulator now boots powernv kernel that is running at non-zero. Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>