summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2019-03-23powerpc: Always initialize input array when calling epapr_hypercall()Seth Forshee
commit 186b8f1587c79c2fa04bfa392fdf084443e398c1 upstream. Several callers to epapr_hypercall() pass an uninitialized stack allocated array for the input arguments, presumably because they have no input arguments. However this can produce errors like this one arch/powerpc/include/asm/epapr_hcalls.h:470:42: error: 'in' may be used uninitialized in this function [-Werror=maybe-uninitialized] unsigned long register r3 asm("r3") = in[0]; ~~^~~ Fix callers to this function to always zero-initialize the input arguments array to prevent this. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: "A. Wilcox" <awilfox@adelielinux.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20powerpc/uaccess: fix warning/error with access_ok()Christophe Leroy
[ Upstream commit 05a4ab823983d9136a460b7b5e0d49ee709a6f86 ] With the following piece of code, the following compilation warning is encountered: if (_IOC_DIR(ioc) != _IOC_NONE) { int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ; if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) { drivers/platform/test/dev.c: In function 'my_ioctl': drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable] int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ; This patch fixes it by referencing 'type' in the macro allthough doing nothing with it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21powerpc/msi: Fix compile error on mpc83xxChristophe Leroy
commit 0f99153def98134403c9149128e59d3e1786cf04 upstream. mpic_get_primary_version() is not defined when not using MPIC. The compile error log like: arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe': fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version' Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Radu Rendec <radu.rendec@gmail.com> Fixes: 807d38b73b6 ("powerpc/mpic: Add get_version API both for internal and external use") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09powerpc/fadump: handle crash memory ranges array index overflowHari Bathini
commit 1bd6a1c4b80a28d975287630644e6b47d0f977a5 upstream. Crash memory ranges is an array of memory ranges of the crashing kernel to be exported as a dump via /proc/vmcore file. The size of the array is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since commit 142b45a72e22 ("memblock: Add array resizing support"). On large memory systems with a few DLPAR operations, the memblock memory regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such systems, registering fadump results in crash or other system failures like below: task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000 NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180 REGS: c00000000b73b570 TRAP: 0300 Tainted: G L X (4.4.140+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22004484 XER: 20000000 CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0 ... NIP [c000000000047df4] smp_send_reschedule+0x24/0x80 LR [c0000000000f9e58] resched_curr+0x138/0x160 Call Trace: resched_curr+0x138/0x160 (unreliable) check_preempt_curr+0xc8/0xf0 ttwu_do_wakeup+0x38/0x150 try_to_wake_up+0x224/0x4d0 __wake_up_common+0x94/0x100 ep_poll_callback+0xac/0x1c0 __wake_up_common+0x94/0x100 __wake_up_sync_key+0x70/0xa0 sock_def_readable+0x58/0xa0 unix_stream_sendmsg+0x2dc/0x4c0 sock_sendmsg+0x68/0xa0 ___sys_sendmsg+0x2cc/0x2e0 __sys_sendmsg+0x5c/0xc0 SyS_socketcall+0x36c/0x3f0 system_call+0x3c/0x100 as array index overflow is not checked for while setting up crash memory ranges causing memory corruption. To resolve this issue, dynamically allocate memory for crash memory ranges and resize it incrementally, in units of pagesize, on hitting array size limit. Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Just use PAGE_SIZE directly, fixup variable placement] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc: Add missing prototype for arch_irq_work_raise()Mathieu Malaterre
[ Upstream commit f5246862f82f1e16bbf84cda4cddf287672b30fe ] In commit 4f8b50bbbe63 ("irq_work, ppc: Fix up arch hooks") a new function arch_irq_work_raise() was added without a prototype in header irq_work.h. Fix the following warning (treated as error in W=1): arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’ Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPALStewart Smith
commit e4d54f71d29997344b4c4c8d47708240f9f23a5c upstream. Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: Remove OPALv2 firmware define and referencesStewart Smith
commit 7261aafc095763b119136a562540dea7b1ccf657 upstream. OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26futex: Remove duplicated code and fix undefined behaviourJiri Slaby
commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream. There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/powernv: define a standard delay for OPAL_BUSY type retry loopsNicholas Piggin
commit 34dd25de9fe3f60bfdb31b473bf04b28262d0896 upstream. This is the start of an effort to tidy up and standardise all the delays. Existing loops have a range of delay/sleep periods from 1ms to 20ms, and some have no delay. They all loop forever except rtc, which times out after 10 retries, and that uses 10ms delays. So use 10ms as our standard delay. The OPAL maintainer agrees 10ms is a reasonable starting point. The idea is to use the same recipe everywhere, once this is proven to work then it will be documented as an OPAL API standard. Then both firmware and OS can agree, and if a particular call needs something else, then that can be documented with reasoning. This is not the end-all of this effort, it's just a relatively easy change that fixes some existing high latency delays. There should be provision for standardising timeouts and/or interruptible loops where possible, so non-fatal firmware errors don't cause hangs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/64: Fix smp_wmb barrier definition use use lwsync consistentlyNicholas Piggin
commit 0bfdf598900fd62869659f360d3387ed80eb71cf upstream. asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hashMichael Ellerman
[ Upstream commit e41e53cd4fe331d0d1f06f8e4ed7e2cc63ee2c34 ] virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on an address. What this means in practice is that it should only return true for addresses in the linear mapping which are backed by a valid PFN. We are failing to properly check that the address is in the linear mapping, because virt_to_pfn() will return a valid looking PFN for more or less any address. That bug is actually caused by __pa(), used in virt_to_pfn(). eg: __pa(0xc000000000010000) = 0x10000 # Good __pa(0xd000000000010000) = 0x10000 # Bad! __pa(0x0000000000010000) = 0x10000 # Bad! This started happening after commit bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give you something bogus back. Until we can verify if that GCC bug is no longer an issue, or come up with another solution, this commit does the minimal fix to make virt_addr_valid() work, by explicitly checking that the address is in the linear mapping region. Fixes: bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Tested-by: Breno Leitao <breno.leitao@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/pseries: include linux/types.h in asm/hvcall.hMichal Suchanek
commit 1b689a95ce7427075f9ac9fb4aea1af530742b7f upstream. Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Add support for RFI flush of L1-D cacheMichael Ellerman
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream. On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir - back ported to stable with changes] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Simple RFI macro conversionsNicholas Piggin
commit 222f20f140623ef6033491d0103ee0875fe87d35 upstream. This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed issues with backporting to stable] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Add macros for annotating the destination of rfid/hrfidNicholas Piggin
commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream. The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapperMichael Neuling
commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream. A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed conflicts in backport] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Fix flush_(d|i)cache_range() called from modulesOliver O'Halloran
commit 8f5f525d5b83f7d76a6baf9c4e94d4bf312ea7f6 upstream. When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does not include a global entry point. A function's global entry point is used when the function is called from a different TOC context and in the kernel this typically means a call from a module into the vmlinux (or vice-versa). There are a few exported asm functions declared with _GLOBAL() and calling them from a module will likely crash the kernel since any TOC relative load will yield garbage. flush_icache_range() and flush_dcache_range() are both exported to modules, and use the TOC, so must use _GLOBAL_TOC(). Fixes: 721aeaa9fdf3 ("powerpc: Build little endian ppc64 kernel with ABIv2") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-06Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"Greg Kroah-Hartman
This reverts commit 8c92870bdbf20b5fa5150a2c8bf53ab498516b24 which is commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream. Michal Hocko writes: JFYI. We have encountered a regression after applying this patch on a large ppc machine. While the patch is the right thing to do it doesn't work well with the current vmalloc area size on ppc and large machines where NUMA nodes are very far from each other. Just for the reference the boot fails on such a machine with bunch of warning preceeding it. See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz It seems the right thing to do is to enlarge the vmalloc space on ppc but this is not the case in the upstream kernel yet AFAIK. It is also questionable whether that is a stable material but I will decision on you here. We have reverted this patch from our 4.4 based kernel. Newer kernels do not have enlarged vmalloc space yet AFAIK so they won't work properly eiter. This bug is quite rare though because you need a specific HW configuration to trigger the issue - namely NUMA nodes have to be far away from each other in the physical memory space. Cc: Michal Hocko <mhocko@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27powerpc/asm: Mark cr0 as clobbered in mftb()Oliver O'Halloran
commit 2400fd822f467cb4c886c879d8ad99feac9cf319 upstream. The workaround for the CELL timebase bug does not correctly mark cr0 as being clobbered. This means GCC doesn't know that the asm block changes cr0 and might leave the result of an unrelated comparison in cr0 across the block, which we then trash, leading to basically random behaviour. Fixes: 859deea949c3 ("[POWERPC] Cell timebase bug workaround") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Tweak change log and flag for stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27powerpc/64: Fix atomic64_inc_not_zero() to return an intMichael Ellerman
commit 01e6a61aceb82e13bec29502a8eb70d9574f97ad upstream. Although it's not documented anywhere, there is an expectation that atomic64_inc_not_zero() returns a result which fits in an int. This is the behaviour implemented on all arches except powerpc. This has caused at least one bug in practice, in the percpu-refcount code, where the long result from our atomic64_inc_not_zero() was truncated to an int leading to lost references and stuck systems. That was worked around in that code in commit 966d2b04e070 ("percpu-refcount: fix reference leak during percpu-atomic transition"). To the best of my grepping abilities there are no other callers in-tree which truncate the value, but we should fix it anyway. Because the breakage is subtle and potentially very harmful I'm also tagging it for stable. Code generation is largely unaffected because in most cases the callers are just using the result for a test anyway. In particular the case of fget() that was mentioned in commit a6cf7ed5119f ("powerpc/atomic: Implement atomic*_inc_not_zero") generates exactly the same code. Fixes: a6cf7ed5119f ("powerpc/atomic: Implement atomic*_inc_not_zero") Noticed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21powerpc: move ELF_ET_DYN_BASE to 4GB / 4MBKees Cook
commit 47ebb09d54856500c5a5e14824781902b3bb738e upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14powerpc/numa: Fix percpu allocations to be NUMA awareMichael Ellerman
commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream. In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09KVM: PPC: Book3S HV: Save/restore XER in checkpointed register statePaul Mackerras
commit 0d808df06a44200f52262b6eb72bcb6042f5a7c5 upstream. When switching from/to a guest that has a transaction in progress, we need to save/restore the checkpointed register state. Although XER is part of the CPU state that gets checkpointed, the code that does this saving and restoring doesn't save/restore XER. This fixes it by saving and restoring the XER. To allow userspace to read/write the checkpointed XER value, we also add a new ONE_REG specifier. The visible effect of this bug is that the guest may see its XER value being corrupted when it uses transactions. Fixes: e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") Fixes: 0a8eccefcb34 ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-16KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 registerThomas Huth
commit fa73c3b25bd8d0d393dc6109a1dba3c2aef0451e upstream. The MMCR2 register is available twice, one time with number 785 (privileged access), and one time with number 769 (unprivileged, but it can be disabled completely). In former times, the Linux kernel was using the unprivileged register 769 only, but since commit 8dd75ccb571f3c92c ("powerpc: Use privileged SPR number for MMCR2"), it uses the privileged register 785 instead. The KVM-PR code then of course also switched to use the SPR 785, but this is causing older guest kernels to crash, since these kernels still access 769 instead. So to support older kernels with KVM-PR again, we have to support register 769 in KVM-PR, too. Fixes: 8dd75ccb571f3c92c48014b3dabd3d51a115ab41 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24ppc32: fix copy_from_user()Al Viro
commit 224264657b8b228f949b42346e09ed8c90136a8e upstream. should clear on access_ok() failures. Also remove the useless range truncation logics. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15crypto: nx-842 - Mask XERS0 bit in return valueHaren Myneni
[ Upstream commit 6333ed8f26cf77311088d2e2b7cf16d8480bcbb2 ] NX842 coprocessor sets 3rd bit in CR register with XER[S0] which is nothing to do with NX request. Since this bit can be set with other valuable return status, mast this bit. One of other bits (INITIATED, BUSY or REJECTED) will be returned for any given NX request. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-24powerpc: Use privileged SPR number for MMCR2Thomas Huth
commit 8dd75ccb571f3c92c48014b3dabd3d51a115ab41 upstream. We are already using the privileged versions of MMCR0, MMCR1 and MMCRA in the kernel, so for MMCR2, we should better use the privileged versions, too, to be consistent. Fixes: 240686c13687 ("powerpc: Initialise PMU related regs on Power8") Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-24powerpc: Fix definition of SIAR and SDAR registersThomas Huth
commit d23fac2b27d94aeb7b65536a50d32bfdc21fe01e upstream. The SIAR and SDAR registers are available twice, one time as SPRs 780 / 781 (unprivileged, but read-only), and one time as the SPRs 796 / 797 (privileged, but read and write). The Linux kernel code currently uses the unprivileged SPRs - while this is OK for reading, writing to that register of course does not work. Since the KVM code tries to write to this register, too (see the mtspr in book3s_hv_rmhandlers.S), the contents of this register sometimes get lost for the guests, e.g. during migration of a VM. To fix this issue, simply switch to the privileged SPR numbers instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-11powerpc: Fix bad inline asm constraint in create_zero_mask()Anton Blanchard
commit b4c112114aab9aff5ed4568ca5e662bb02cdfe74 upstream. In create_zero_mask() we have: addi %1,%2,-1 andc %1,%1,%2 popcntd %0,%1 using the "r" constraint for %2. r0 is a valid register in the "r" set, but addi X,r0,X turns it into an li: li r7,-1 andc r7,r7,r0 popcntd r4,r7 Fix this by using the "b" constraint, for which r0 is not a valid register. This was found with a kernel build using gcc trunk, narrowed down to when -frename-registers was enabled at -O2. It is just luck however that we aren't seeing this on older toolchains. Thanks to Segher for working with me to find this issue. Fixes: d0cebfa650a0 ("powerpc: word-at-a-time optimization for 64-bit Little Endian") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04powerpc: scan_features() updates incorrect bits for REAL_LEAnton Blanchard
commit 6997e57d693b07289694239e52a10d2f02c3a46f upstream. The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU feature value, meaning all the remaining elements initialise the wrong values. This means instead of checking for byte 5, bit 0, we check for byte 0, bit 0, and then we incorrectly set the CPU feature bit as well as MMU feature bit 1 and CPU user feature bits 0 and 2 (5). Checking byte 0 bit 0 (IBM numbering), means we're looking at the "Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU. In practice that bit is set on all platforms which have the property. This means we set CPU_FTR_REAL_LE always. In practice that seems not to matter because all the modern cpus which have this property also implement REAL_LE, and we've never needed to disable it. We're also incorrectly setting MMU feature bit 1, which is: #define MMU_FTR_TYPE_8xx 0x00000002 Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E code, which can't run on the same cpus as scan_features(). So this also doesn't matter in practice. Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2 is not currently used, and bit 0 is: #define PPC_FEATURE_PPC_LE 0x00000001 Which says the CPU supports the old style "PPC Little Endian" mode. Again this should be harmless in practice as no 64-bit CPUs implement that mode. Fix the code by adding the missing initialisation of the MMU feature. Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It would be unsafe to start using it as old kernels incorrectly set it. Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out MMU-related features") Signed-off-by: Anton Blanchard <anton@samba.org> [mpe: Flesh out changelog, add comment reserving 0x4] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-16powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usagesRussell Currey
commit c88c5d43732a0356f99e5e4d1ad62ab1ea516b81 upstream. The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no parameters and returned nothing. The call was updated to accept the terminal number to flush, and returned various values depending on the state of the output buffer. The prototype has been updated and its usage in the OPAL kmsg dumper has been modified to support its new behaviour as an incremental flush. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-16powerpc/powernv: Add a kmsg_dumper that flushes console output on panicRussell Currey
commit affddff69c55eb68969448f35f59054a370bc7c1 upstream. On BMC machines, console output is controlled by the OPAL firmware and is only flushed when its pollers are called. When the kernel is in a panic state, it no longer calls these pollers and thus console output does not completely flush, causing some output from the panic to be lost. Output is only actually lost when the kernel is configured to not power off or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL flushes the console buffer as part of its power down routines. Before this patch, however, only partial output would be printed during the timeout wait. This patch adds a new kmsg_dumper which gets called at panic time to ensure panic output is not lost. It accomplishes this by calling OPAL_CONSOLE_FLUSH in the OPAL API, and if that is not available, the pollers are called enough times to (hopefully) completely flush the buffer. The flushing mechanism will only affect output printed at and before the kmsg_dump call in kernel/panic.c:panic(). As such, the "end Kernel panic" message may still be truncated as follows: >Call Trace: >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable) >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4 >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254 >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350 >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130 >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac >---[ end Kernel panic - not This functionality is implemented as a kmsg_dumper as it seems to be the most sensible way to introduce platform-specific functionality to the panic function. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25powerpc/eeh: Fix stale cached primary busGavin Shan
commit 05ba75f848647135f063199dc0e9f40fee769724 upstream. When PE is created, its primary bus is cached to pe->bus. At later point, the cached primary bus is returned from eeh_pe_bus_get(). However, we could get stale cached primary bus and run into kernel crash in one case: full hotplug as part of fenced PHB error recovery releases all PCI busses under the PHB at unplugging time and recreate them at plugging time. pe->bus is still dereferencing the PCI bus that was released. This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity of pe->bus. pe->bus is updated when its first child EEH device is online and the flag is set. Before unplugging in full hotplug for error recovery, the flag is cleared. Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE") Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/module: Handle R_PPC64_ENTRY relocationsUlrich Weigand
commit a61674bdfc7c2bf909c4010699607b62b69b7bec upstream. GCC 6 will include changes to generated code with -mcmodel=large, which is used to build kernel modules on powerpc64le. This was necessary because the large model is supposed to allow arbitrary sizes and locations of the code and data sections, but the ELFv2 global entry point prolog still made the unconditional assumption that the TOC associated with any particular function can be found within 2 GB of the function entry point: func: addis r2,r12,(.TOC.-func)@ha addi r2,r2,(.TOC.-func)@l .localentry func, .-func To remove this assumption, GCC will now generate instead this global entry point prolog sequence when using -mcmodel=large: .quad .TOC.-func func: .reloc ., R_PPC64_ENTRY ld r2, -8(r12) add r2, r2, r12 .localentry func, .-func The new .reloc triggers an optimization in the linker that will replace this new prolog with the original code (see above) if the linker determines that the distance between .TOC. and func is in range after all. Since this new relocation is now present in module object files, the kernel module loader is required to handle them too. This patch adds support for the new relocation and implements the same optimization done by the GNU linker. Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc: Make {cmp}xchg* and their atomic_ versions fully orderedBoqun Feng
commit 81d7a3294de7e9828310bbf986a67246b13fa01e upstream. According to memory-barriers.txt, xchg*, cmpxchg* and their atomic_ versions all need to be fully ordered, however they are now just RELEASE+ACQUIRE, which are not fully ordered. So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") This patch depends on patch "powerpc: Make value-returning atomics fully ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc: Make value-returning atomics fully orderedBoqun Feng
commit 49e9cf3f0c04bf76ffa59242254110309554861d upstream. According to memory-barriers.txt: > Any atomic operation that modifies some state in memory and returns > information about the state (old or new) implies an SMP-conditional > general memory barrier (smp_mb()) on each side of the actual > operation ... Which mean these operations should be fully ordered. However on PPC, PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation, which is currently "lwsync" if SMP=y. The leading "lwsync" can not guarantee fully ordered atomics, according to Paul Mckenney: https://lkml.org/lkml/2015/10/14/970 To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee the fully-ordered semantics. This also makes futex atomics fully ordered, which can avoid possible memory ordering problems if userspace code relies on futex system call for fully ordered semantics. Fixes: b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-16Partial revert of "powerpc: Individual System V IPC system calls"Michael Ellerman
This partially reverts commit a34236155afb1cc41945e58388ac988431bcb0b8. While reviewing the glibc patch to exploit the individual IPC calls, Arnd & Andreas noticed that we were still requiring userspace to pass IPC_64 in order to get the new style IPC API. With a bit of cleanup in the kernel we can drop that requirement, and instead only provide the new style API, which will simplify things for userspace. Rather than try and sneak that patch into 4.4, instead we will drop the individual IPC calls for powerpc, and merge them again in 4.5 once the cleanup patch has gone in. Because we've already added sys_mlock2() as syscall #378, we don't do a full revert of the IPC calls. Instead we drop the __NR #defines, and send those now undefined syscall numbers to sys_ni_syscall(). This leaves a gap in the syscall numbers, but we'll reuse them when we merge the individual IPC calls. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de>
2015-11-23powerpc/tm: Block signal return setting invalid MSR stateMichael Neuling
Currently we allow both the MSR T and S bits to be set by userspace on a signal return. Unfortunately this is a reserved configuration and will cause a TM Bad Thing exception if attempted (via rfid). This patch checks for this case in both the 32 and 64 bit signals code. If both T and S are set, we mark the context as invalid. Found using a syscall fuzzer. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-16powerpc: Wire up sys_mlock2()Michael Ellerman
The selftest passes on 64-bit LE and 32-bit BE. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-09kmap_atomic_to_page() has no users, remove itNicolas Pitre
Removal started in commit 5bbeed12bdc3 ("sparc32: drop unused kmap_atomic_to_page"). Let's do it across the whole tree. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05Merge tag 'powerpc-4.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng - Refresh ps3_defconfig from Geoff Levand - Emit GNU & SysV hashes for the vdso from Michael Ellerman - Define an enum for the bolted SLB indexes from Anshuman Khandual - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman - Add gettimeofday() benchmark from Michael Neuling - Avoid link stack corruption in __get_datapage() from Michael Neuling - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V - Add ppc64le_defconfig from Michael Ellerman - pseries: extract of_helpers module from Andy Shevchenko - Correct string length in pseries_of_derive_parent() from Nathan Fontenot - Free the MSI bitmap if it was slab allocated from Denis Kirjanov - Shorten irq_chip name for the SIU from Christophe Leroy - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas - Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King - Disable hugepd for 64K page size, from Aneesh Kumar K.V - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V - Make PCI non-optional for pseries from Michael Ellerman - Individual System V IPC system calls from Sam bobroff - Add selftest of unmuxed IPC calls from Michael Ellerman - discard .exit.data at runtime from Stephen Rothwell - Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul Gortmaker - Use of_get_next_parent to simplify code from Christophe Jaillet - Paginate some xmon output from Sam bobroff - Add some more elements to the xmon PACA dump from Michael Ellerman - Allow the tm-syscall selftest to build with old headers from Michael Ellerman - Run EBB selftests only on POWER8 from Denis Kirjanov - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet - Quieten boot wrapper output with run_cmd from Geoff Levand - EEH fixes and cleanups from Gavin Shan - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov - Fix ps3-lpm white space from Rudhresh Kumar J - Fix ps3-vuart null dereference from Colin King - nvram: Add missing kfree in error path from Christophe Jaillet - nvram: Fix function name in some errors messages, from Christophe Jaillet - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov - cxl: Free virtual PHB when removing from Andrew Donnellan - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump support, a rework of the qoriq clock driver, device tree changes including qoriq fman nodes, support for a new 85xx board, and some fixes. - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x LocalPlus Bus FIFO with its device tree binding documentation, mpc512x device tree updates and some minor fixes. * tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits) powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc() powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id() powerpc/pseries: Correct string length in pseries_of_derive_parent() powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s) powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes powerpc: handle error case in cpm_muram_alloc() powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake powerpc/book3e-64: Enable kexec powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 powerpc/book3e-64/kexec: Enable SMP release powerpc/book3e-64/kexec: create an identity TLB mapping powerpc/book3e-64: Don't limit paca to 256 MiB powerpc/book3e/kdump: Enable crash_kexec_wait_realmode powerpc/book3e: support CONFIG_RELOCATABLE powerpc/booke64: Fix args to copy_and_flush powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts powerpc/e6500: kexec: Handle hardware threads ...
2015-11-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge patch-bomb from Andrew Morton: - inotify tweaks - some ocfs2 updates (many more are awaiting review) - various misc bits - kernel/watchdog.c updates - Some of mm. I have a huge number of MM patches this time and quite a lot of it is quite difficult and much will be held over to next time. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits) selftests: vm: add tests for lock on fault mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage mm: introduce VM_LOCKONFAULT mm: mlock: add new mlock system call mm: mlock: refactor mlock, munlock, and munlockall code kasan: always taint kernel on report mm, slub, kasan: enable user tracking by default with KASAN=y kasan: use IS_ALIGNED in memory_is_poisoned_8() kasan: Fix a type conversion error lib: test_kasan: add some testcases kasan: update reference to kasan prototype repo kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile kasan: various fixes in documentation kasan: update log messages kasan: accurately determine the type of the bad access kasan: update reported bug types for kernel memory accesses kasan: update reported bug types for not user nor kernel memory accesses mm/kasan: prevent deadlock in kasan reporting mm/kasan: don't use kasan shadow pointer in generic functions mm/kasan: MODULE_VADDR is not available on all archs ...
2015-11-05mm: mlock: add mlock flags to enable VM_LOCKONFAULT usageEric B Munson
The previous patch introduced a flag that specified pages in a VMA should be placed on the unevictable LRU, but they should not be made present when the area is created. This patch adds the ability to set this state via the new mlock system calls. We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall. MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED. MCL_ONFAULT should be used as a modifier to the two other mlockall flags. When used with MCL_CURRENT, all current mappings will be marked with VM_LOCKED | VM_LOCKONFAULT. When used with MCL_FUTURE, the mm->def_flags will be marked with VM_LOCKED | VM_LOCKONFAULT. When used with both MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be marked with VM_LOCKED | VM_LOCKONFAULT. Prior to this patch, mlockall() will unconditionally clear the mm->def_flags any time it is called without MCL_FUTURE. This behavior is maintained after adding MCL_ONFAULT. If a call to mlockall(MCL_FUTURE) is followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and new VMAs will be unlocked. This remains true with or without MCL_ONFAULT in either mlockall() invocation. munlock() will unconditionally clear both vma flags. munlockall() unconditionally clears for VMA flags on all VMAs and in the mm->def_flags field. Signed-off-by: Eric B Munson <emunson@akamai.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.4. s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: Quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits) KVM: VMX: Fix commit which broke PML KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() KVM: x86: allow RSM from 64-bit mode KVM: VMX: fix SMEP and SMAP without EPT KVM: x86: move kvm_set_irq_inatomic to legacy device assignment KVM: device assignment: remove pointless #ifdefs KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic KVM: x86: zero apic_arb_prio on reset drivers/hv: share Hyper-V SynIC constants with userspace KVM: x86: handle SMBASE as physical address in RSM KVM: x86: add read_phys to x86_emulate_ops KVM: x86: removing unused variable KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr() KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings KVM: arm/arm64: Optimize away redundant LR tracking KVM: s390: use simple switch statement as multiplexer KVM: s390: drop useless newline in debugging data KVM: s390: SCA must not cross page boundaries KVM: arm: Do not indent the arguments of DECLARE_BITMAP ...
2015-11-05Merge branch 'next' of git://git.denx.de/linux-denx-agust into nextMichael Ellerman
MPC5xxx updates from Anatolij: "Highlights include a driver for MPC512x LocalPlus Bus FIFO with its device tree binding documentation, mpc512x device tree updates and some minor fixes."
2015-11-04Merge tag 'kvm-arm-for-4.4' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.4-rc1 Includes a number of fixes for the arch-timer, introducing proper level-triggered semantics for the arch-timers, a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups getting rid of redundant state, and finally a stylistic change that gets rid of some ctags warnings. Conflicts: arch/x86/include/asm/kvm_host.h
2015-10-27powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same ↵Kevin Hao
tlb entry In order to workaround Erratum A-008139, we have to invalidate the tlb entry with tlbilx before overwriting. Due to the performance consideration, we don't add any memory barrier when acquire/release the tcd lock. This means the two load instructions for esel_next do have the possibility to return different value. This is definitely not acceptable due to the Erratum A-008139. We have two options to fix this issue: a) Add memory barrier when acquire/release tcd lock to order the load/store to esel_next. b) Just make sure to invalidate and write to the same tlb entry and tolerate the race that we may get the wrong value and overwrite the tlb entry just updated by the other thread. We observe better performance using option b. So reserve an additional register to save the value of the esel_next. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27Merge branch 'clock' into HEADScott Wood
This is a major overhaul of the clk-qoriq driver, which I'm merging via PPC with Stephen Boyd's ack in order to apply subsequent PPC patches that depend on it.
2015-10-27powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32Scott Wood
The way VIRT_PHYS_OFFSET is not correct on book3e-64, because it does not account for CONFIG_RELOCATABLE other than via the 32-bit-only virt_phys_offset. book3e-64 can (and if the comment about a GCC miscompilation is still relevant, should) use the normal ppc64 __va/__pa. At this point, only booke-32 will use VIRT_PHYS_OFFSET, so given the issues with its calculation, restrict its definition to booke-32. Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-10-27powerpc/book3e: support CONFIG_RELOCATABLETiejun Chen
book3e is different with book3s since 3s includes the exception vectors code in head_64.S as it relies on absolute addressing which is only possible within this compilation unit. So we have to get that label address with got. And when boot a relocated kernel, we should reset ipvr properly again after .relocate. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: cleanup and ifdef removal] Signed-off-by: Scott Wood <scottwood@freescale.com>