summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2018-06-06powerpc/64s: Clear PCR on bootMichael Neuling
commit faf37c44a105f3608115785f17cbbf3500f8bc71 upstream. Clear the PCR (Processor Compatibility Register) on boot to ensure we are not running in a compatibility mode. We've seen this cause problems when a crash (and kdump) occurs while running compat mode guests. The kdump kernel then runs with the PCR set and causes problems. The symptom in the kdump kernel (also seen in petitboot after fast-reboot) is early userspace programs taking sigills on newer instructions (seen in libc). Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc: Add missing prototype for arch_irq_work_raise()Mathieu Malaterre
[ Upstream commit f5246862f82f1e16bbf84cda4cddf287672b30fe ] In commit 4f8b50bbbe63 ("irq_work, ppc: Fix up arch hooks") a new function arch_irq_work_raise() was added without a prototype in header irq_work.h. Fix the following warning (treated as error in W=1): arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’ Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/perf: Fix kernel address leak via sampling registersMichael Ellerman
[ Upstream commit e1ebd0e5b9d0a10ba65e63a3514b6da8c6a5a819 ] Current code in power_pmu_disable() does not clear the sampling registers like Sampling Instruction Address Register (SIAR) and Sampling Data Address Register (SDAR) after disabling the PMU. Since these are userspace readable and could contain kernel addresses, add code to explicitly clear the content of these registers. Also add a "context synchronizing instruction" to enforce no further updates to these registers as suggested by Power ISA v3.0B. From section 9.4, on page 1108: "If an mtspr instruction is executed that changes the value of a Performance Monitor register other than SIAR, SDAR, and SIER, the change is not guaranteed to have taken effect until after a subsequent context synchronizing instruction has been executed (see Chapter 11. "Synchronization Requirements for Context Alterations" on page 1133)." Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Massage change log and add ISA reference] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/perf: Prevent kernel address leak to userspace via BHRB bufferMadhavan Srinivasan
[ Upstream commit bb19af816025d495376bd76bf6fbcf4244f9a06d ] The current Branch History Rolling Buffer (BHRB) code does not check for any privilege levels before updating the data from BHRB. This could leak kernel addresses to userspace even when profiling only with userspace privileges. Add proper checks to prevent it. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/mpic: Check if cpu_possible() in mpic_physmask()Michael Ellerman
[ Upstream commit 0834d627fbea00c1444075eb3e448e1974da452d ] In mpic_physmask() we loop over all CPUs up to 32, then get the hard SMP processor id of that CPU. Currently that's possibly walking off the end of the paca array, but in a future patch we will change the paca array to be an array of pointers, and in that case we will get a NULL for missing CPUs and oops. eg: Unable to handle kernel paging request for data at address 0x88888888888888b8 Faulting instruction address: 0xc00000000004e380 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP .mpic_set_affinity+0x60/0x1a0 LR .irq_do_set_affinity+0x48/0x100 Fix it by checking the CPU is possible, this also fixes the code if there are gaps in the CPU numbering which probably never happens on mpic systems but who knows. Debugged-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backingPaul Mackerras
[ Upstream commit debd574f4195e205ba505b25e19b2b797f4bcd94 ] The current code for initializing the VRMA (virtual real memory area) for HPT guests requires the page size of the backing memory to be one of 4kB, 64kB or 16MB. With a radix host we have the possibility that the backing memory page size can be 2MB or 1GB. In these cases, if the guest switches to HPT mode, KVM will not initialize the VRMA and the guest will fail to run. In fact it is not necessary that the VRMA page size is the same as the backing memory page size; any VRMA page size less than or equal to the backing memory page size is acceptable. Therefore we now choose the largest page size out of the set {4k, 64k, 16M} which is not larger than the backing memory page size. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data accessMark Lord
[ Upstream commit 083b20907185b076f21c265b30fe5b5f24c03d8c ] I am using SECCOMP to filter syscalls on a ppc32 platform, and noticed that the JIT compiler was failing on the BPF even though the interpreter was working fine. The issue was that the compiler was missing one of the instructions used by SECCOMP, so here is a patch to enable JIT for that instruction. Fixes: eb84bab0fb38 ("ppc: Kconfig: Enable BPF JIT on ppc32") Signed-off-by: Mark Lord <mlord@pobox.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/numa: Ensure nodes initialized for hotplugMichael Bringmann
[ Upstream commit ea05ba7c559c8e5a5946c3a94a2a266e9a6680a6 ] This patch fixes some problems encountered at runtime with configurations that support memory-less nodes, or that hot-add CPUs into nodes that are memoryless during system execution after boot. The problems of interest include: * Nodes known to powerpc to be memoryless at boot, but to have CPUs in them are allowed to be 'possible' and 'online'. Memory allocations for those nodes are taken from another node that does have memory until and if memory is hot-added to the node. * Nodes which have no resources assigned at boot, but which may still be referenced subsequently by affinity or associativity attributes, are kept in the list of 'possible' nodes for powerpc. Hot-add of memory or CPUs to the system can reference these nodes and bring them online instead of redirecting the references to one of the set of nodes known to have memory at boot. Note that this software operates under the context of CPU hotplug. We are not doing memory hotplug in this code, but rather updating the kernel's CPU topology (i.e. arch_update_cpu_topology / numa_update_cpu_topology). We are initializing a node that may be used by CPUs or memory before it can be referenced as invalid by a CPU hotplug operation. CPU hotplug operations are protected by a range of APIs including cpu_maps_update_begin/cpu_maps_update_done, cpus_read/write_lock / cpus_read/write_unlock, device locks, and more. Memory hotplug operations, including try_online_node, are protected by mem_hotplug_begin/mem_hotplug_done, device locks, and more. In the case of CPUs being hot-added to a previously memoryless node, the try_online_node operation occurs wholly within the CPU locks with no overlap. Using HMC hot-add/hot-remove operations, we have been able to add and remove CPUs to any possible node without failures. HMC operations involve a degree self-serialization, though. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30powerpc/numa: Use ibm,max-associativity-domains to discover possible nodesMichael Bringmann
[ Upstream commit a346137e9142b039fd13af2e59696e3d40c487ef ] On powerpc systems which allow 'hot-add' of CPU or memory resources, it may occur that the new resources are to be inserted into nodes that were not used for these resources at bootup. In the kernel, any node that is used must be defined and initialized. These empty nodes may occur when, * Dedicated vs. shared resources. Shared resources require information such as the VPHN hcall for CPU assignment to nodes. Associativity decisions made based on dedicated resource rules, such as associativity properties in the device tree, may vary from decisions made using the values returned by the VPHN hcall. * memoryless nodes at boot. Nodes need to be defined as 'possible' at boot for operation with other code modules. Previously, the powerpc code would limit the set of possible nodes to those which have memory assigned at boot, and were thus online. Subsequent add/remove of CPUs or memory would only work with this subset of possible nodes. * memoryless nodes with CPUs at boot. Due to the previous restriction on nodes, nodes that had CPUs but no memory were being collapsed into other nodes that did have memory at boot. In practice this meant that the node assignment presented by the runtime kernel differed from the affinity and associativity attributes presented by the device tree or VPHN hcalls. Nodes that might be known to the pHyp were not 'possible' in the runtime kernel because they did not have memory at boot. This patch ensures that sufficient nodes are defined to support configuration requirements after boot, as well as at boot. This patch set fixes a couple of problems. * Nodes known to powerpc to be memoryless at boot, but to have CPUs in them are allowed to be 'possible' and 'online'. Memory allocations for those nodes are taken from another node that does have memory until and if memory is hot-added to the node. * Nodes which have no resources assigned at boot, but which may still be referenced subsequently by affinity or associativity attributes, are kept in the list of 'possible' nodes for powerpc. Hot-add of memory or CPUs to the system can reference these nodes and bring them online instead of redirecting to one of the set of nodes that were known to have memory at boot. This patch extracts the value of the lowest domain level (number of allocable resources) from the device tree property "ibm,max-associativity-domains" to use as the maximum number of nodes to setup as possibly available in the system. This new setting will override the instruction: nodes_and(node_possible_map, node_possible_map, node_online_map); presently seen in the function arch/powerpc/mm/numa.c:initmem_init(). If the "ibm,max-associativity-domains" property is not present at boot, no operation will be performed to define or enable additional nodes, or enable the above 'nodes_and()'. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: Fix NVRAM sleep in invalid context when crashingNicholas Piggin
commit c1d2a31397ec51f0370f6bd17b19b39152c263cb upstream. Similarly to opal_event_shutdown, opal_nvram_write can be called in the crash path with irqs disabled. Special case the delay to avoid sleeping in invalid context. Fixes: 3b8070335f75 ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops") Cc: stable@vger.kernel.org # v3.2 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc: Don't preempt_disable() in show_cpuinfo()Benjamin Herrenschmidt
commit 349524bc0da698ec77f2057cf4a4948eb6349265 upstream. This causes warnings from cpufreq mutex code. This is also rather unnecessary and ineffective. If we really want to prevent concurrent unplug, we could take the unplug read lock but I don't see this being critical. Fixes: cd77b5ce208c ("powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPALStewart Smith
commit e4d54f71d29997344b4c4c8d47708240f9f23a5c upstream. Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: Remove OPALv2 firmware define and referencesStewart Smith
commit 7261aafc095763b119136a562540dea7b1ccf657 upstream. OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26powerpc/powernv: panic() on OPAL < V3Stewart Smith
commit 786842b62f81f20d14894925e8c225328ee8144b upstream. The OpenPower Abstraction Layer firmware went through a couple of iterations in the lab before being released. What we now know as OPAL advertises itself as OPALv3. OPALv2 and OPALv1 never made it outside the lab, and the possibility of anyone at all ever building a mainline kernel today and expecting it to boot on such hardware is zero. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26futex: Remove duplicated code and fix undefined behaviourJiri Slaby
commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream. There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/lib: Fix off-by-one in alternate feature patchingMichael Ellerman
commit b8858581febb050688e276b956796bc4a78299ed upstream. When we patch an alternate feature section, we have to adjust any relative branches that branch out of the alternate section. But currently we have a bug if we have a branch that points to past the last instruction of the alternate section, eg: FTR_SECTION_ELSE 1: b 2f or 6,6,6 2: ALT_FTR_SECTION_END(...) nop This will result in a relative branch at 1 with a target that equals the end of the alternate section. That branch does not need adjusting when it's moved to the non-else location. Currently we do adjust it, resulting in a branch that goes off into the link-time location of the else section, which is junk. The fix is to not patch branches that have a target == end of the alternate section. Fixes: d20fe50a7b3c ("KVM: PPC: Book3S HV: Branch inside feature section") Fixes: 9b1a735de64c ("powerpc: Add logic to patch alternative feature sections") Cc: stable@vger.kernel.org # v2.6.27+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/eeh: Fix enabling bridge MMIO windowsMichael Neuling
commit 13a83eac373c49c0a081cbcd137e79210fe78acd upstream. On boot we save the configuration space of PCIe bridges. We do this so when we get an EEH event and everything gets reset that we can restore them. Unfortunately we save this state before we've enabled the MMIO space on the bridges. Hence if we have to reset the bridge when we come back MMIO is not enabled and we end up taking an PE freeze when the driver starts accessing again. This patch forces the memory/MMIO and bus mastering on when restoring bridges on EEH. Ideally we'd do this correctly by saving the configuration space writes later, but that will have to come later in a larger EEH rewrite. For now we have this simple fix. The original bug can be triggered on a boston machine by doing: echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound On boston, this PHB has a PCIe switch on it. Without this patch, you'll see two EEH events, 1 expected and 1 the failure we are fixing here. The second EEH event causes the anything under the PHB to disappear (i.e. the i40e eth). With this patch, only 1 EEH event occurs and devices properly recover. Fixes: 652defed4875 ("powerpc/eeh: Check PCIe link after reset") Cc: stable@vger.kernel.org # v3.11+ Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loopsNicholas Piggin
commit 3b8070335f751aac9f1526ae2e012e6f5b8b0f21 upstream. The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or OPAL_BUSY_EVENT from firmware, which causes large scheduling latencies, and various lockup errors to trigger (again, BMC reboot can cause it). Fix this by converting it to the standard form OPAL_BUSY loop that sleeps. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Depends-on: 34dd25de9fe3 ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/powernv: define a standard delay for OPAL_BUSY type retry loopsNicholas Piggin
commit 34dd25de9fe3f60bfdb31b473bf04b28262d0896 upstream. This is the start of an effort to tidy up and standardise all the delays. Existing loops have a range of delay/sleep periods from 1ms to 20ms, and some have no delay. They all loop forever except rtc, which times out after 10 retries, and that uses 10ms delays. So use 10ms as our standard delay. The OPAL maintainer agrees 10ms is a reasonable starting point. The idea is to use the same recipe everywhere, once this is proven to work then it will be documented as an OPAL API standard. Then both firmware and OS can agree, and if a particular call needs something else, then that can be documented with reasoning. This is not the end-all of this effort, it's just a relatively easy change that fixes some existing high latency delays. There should be provision for standardising timeouts and/or interruptible loops where possible, so non-fatal firmware errors don't cause hangs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/64: Fix smp_wmb barrier definition use use lwsync consistentlyNicholas Piggin
commit 0bfdf598900fd62869659f360d3387ed80eb71cf upstream. asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()Nicholas Piggin
commit 741de617661794246f84a21a02fc5e327bffc9ad upstream. opal_nvram_write currently just assumes success if it encounters an error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO on other errors instead. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/spufs: Fix coredump of SPU contextsMichael Ellerman
[ Upstream commit 99acc9bede06bbb2662aafff51f5b9e529fa845e ] If a process dumps core while it has SPU contexts active then we have code to also dump information about the SPU contexts. Unfortunately it's been broken for 3 1/2 years, and we didn't notice. In commit 7b1f4020d0d1 ("spufs: get rid of dump_emit() wrappers") the nread variable was removed and rc used instead. That means when the loop exits successfully, rc has the number of bytes read, but it's then used as the return value for the function, which should return 0 on success. So fix it by setting rc = 0 before returning in the success case. Fixes: 7b1f4020d0d1 ("spufs: get rid of dump_emit() wrappers") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/[booke|4xx]: Don't clobber TCR[WP] when setting TCR[DIE]Ivan Mikhaylov
[ Upstream commit 6e2f03e292ef46eed2b31b0a344a91d514f9cd81 ] Prevent a kernel panic caused by unintentionally clearing TCR watchdog bits. At this point in the kernel boot, the watchdog may have already been enabled by u-boot. The original code's attempt to write to the TCR register results in an inadvertent clearing of the watchdog configuration bits, causing the 476 to reset. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hashMichael Ellerman
[ Upstream commit e41e53cd4fe331d0d1f06f8e4ed7e2cc63ee2c34 ] virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on an address. What this means in practice is that it should only return true for addresses in the linear mapping which are backed by a valid PFN. We are failing to properly check that the address is in the linear mapping, because virt_to_pfn() will return a valid looking PFN for more or less any address. That bug is actually caused by __pa(), used in virt_to_pfn(). eg: __pa(0xc000000000010000) = 0x10000 # Good __pa(0xd000000000010000) = 0x10000 # Bad! __pa(0x0000000000010000) = 0x10000 # Bad! This started happening after commit bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give you something bogus back. Until we can verify if that GCC bug is no longer an issue, or come up with another solution, this commit does the minimal fix to make virt_addr_valid() work, by explicitly checking that the address is in the linear mapping region. Fixes: bdbc29c19b26 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Tested-by: Breno Leitao <breno.leitao@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13KVM: PPC: Book3S PR: Check copy_to/from_user return valuesPaul Mackerras
[ Upstream commit 67325e988faea735d663799b6d152b5f4254093c ] The PR KVM implementation of the PAPR HPT hypercalls (H_ENTER etc.) access an image of the HPT in userspace memory using copy_from_user and copy_to_user. Recently, the declarations of those functions were annotated to indicate that the return value must be checked. Since this code doesn't currently check the return value, this causes compile warnings like the ones shown below, and since on PPC the default is to compile arch/powerpc with -Werror, this causes the build to fail. To fix this, we check the return values, and if non-zero, fail the hypercall being processed with a H_FUNCTION error return value. There is really no good error return value to use since PAPR didn't envisage the possibility that the hypervisor may not be able to access the guest's HPT, and H_FUNCTION (function not supported) seems as good as any. The typical compile warnings look like this: CC arch/powerpc/kvm/book3s_pr_papr.o /home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c: In function ‘kvmppc_h_pr_enter’: /home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c:53:2: error: ignoring return value of ‘copy_from_user’, declared with attribute warn_unused_result [-Werror=unused-result] copy_from_user(pteg, (void __user *)pteg_addr, sizeof(pteg)); ^ /home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c:74:2: error: ignoring return value of ‘copy_to_user’, declared with attribute warn_unused_result [-Werror=unused-result] copy_to_user((void __user *)pteg_addr, hpte, HPTE_SIZE); ^ ... etc. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24KVM: PPC: Book3S PR: Exit KVM on failed mappingAlexey Kardashevskiy
[ Upstream commit bd9166ffe624000140fc6b606b256df01fc0d060 ] At the moment kvmppc_mmu_map_page() returns -1 if mmu_hash_ops.hpte_insert() fails for any reason so the page fault handler resumes the guest and it faults on the same address again. This adds distinction to kvmppc_mmu_map_page() to return -EIO if mmu_hash_ops.hpte_insert() failed for a reason other than full pteg. At the moment only pSeries_lpar_hpte_insert() returns -2 if plpar_pte_enter() failed with a code other than H_PTEG_FULL. Other mmu_hash_ops.hpte_insert() instances can only fail with -1 "full pteg". With this change, if PR KVM fails to update HPT, it can signal the userspace about this instead of returning to guest and having the very same page fault over and over again. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22powerpc: Avoid taking a data miss on every userspace instruction missAnton Blanchard
[ Upstream commit a7a9dcd882a67b68568868b988289fce5ffd8419 ] Early on in do_page_fault() we call store_updates_sp(), regardless of the type of exception. For an instruction miss this doesn't make sense, because we only use this information to detect if a data miss is the result of a stack expansion instruction or not. Worse still, it results in a data miss within every userspace instruction miss handler, because we try and load the very instruction we are about to install a pte for! A simple exec microbenchmark runs 6% faster on POWER8 with this fix: #include <stdlib.h> #include <stdio.h> #include <unistd.h> int main(int argc, char *argv[]) { unsigned long left = atol(argv[1]); char leftstr[16]; if (left-- == 0) return 0; sprintf(leftstr, "%ld", left); execlp(argv[0], argv[0], leftstr, NULL); perror("exec failed\n"); return 0; } Pass the number of iterations on the command line (eg 10000) and time how long it takes to execute. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28powerpc/64s: Fix RFI flush dependency on HARDLOCKUP_DETECTORMichael Ellerman
The backport of commit aa8a5e0062ac ("powerpc/64s: Add support for RFI flush of L1-D cache"), incorrectly placed the new RFI flush code inside an existing #ifdef CONFIG_HARDLOCKUP_DETECTOR block. This has the obvious effect of requiring HARDLOCKUP_DETECTOR to be enabled in order for RFI flush to be enabled, which is a bug. Fix it by moving the #endif up to where it belongs. Fixes: c3892946315e ("powerpc/64s: Add support for RFI flush of L1-D cache") Reported-by: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25powerpc/perf: Fix oops when grouping different pmu eventsRavi Bangoria
[ Upstream commit 5aa04b3eb6fca63d2e9827be656dcadc26d54e11 ] When user tries to group imc (In-Memory Collections) event with normal event, (sometime) kernel crashes with following log: Faulting instruction address: 0x00000000 [link register ] c00000000010ce88 power_check_constraints+0x128/0x980 ... c00000000010e238 power_pmu_event_init+0x268/0x6f0 c0000000002dc60c perf_try_init_event+0xdc/0x1a0 c0000000002dce88 perf_event_alloc+0x7b8/0xac0 c0000000002e92e0 SyS_perf_event_open+0x530/0xda0 c00000000000b004 system_call+0x38/0xe0 'event_base' field of 'struct hw_perf_event' is used as flags for normal hw events and used as memory address for imc events. While grouping these two types of events, collect_events() tries to interpret imc 'event_base' as a flag, which causes a corruption resulting in a crash. Consider only those events which belongs to 'perf_hw_context' in collect_events(). Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-By: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/pseries: include linux/types.h in asm/hvcall.hMichal Suchanek
commit 1b689a95ce7427075f9ac9fb4aea1af530742b7f upstream. Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Allow control of RFI flush via debugfsMichael Ellerman
commit 236003e6b5443c45c18e613d2b0d776a9f87540e upstream. Expose the state of the RFI flush (enabled/disabled) via debugfs, and allow it to be enabled/disabled at runtime. eg: $ cat /sys/kernel/debug/powerpc/rfi_flush 1 $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush $ cat /sys/kernel/debug/powerpc/rfi_flush 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Wire up cpu_show_meltdown()Michael Ellerman
commit fd6e440f20b1a4304553775fc55938848ff617c9 upstream. The recent commit 87590ce6e373 ("sysfs/cpu: Add vulnerability folder") added a generic folder and set of files for reporting information on CPU vulnerabilities. One of those was for meltdown: /sys/devices/system/cpu/vulnerabilities/meltdown This commit wires up that file for 64-bit Book3S powerpc. For now we default to "Vulnerable" unless the RFI flush is enabled. That may not actually be true on all hardware, further patches will refine the reporting based on the CPU/platform etc. But for now we default to being pessimists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/powernv: Check device-tree for RFI flush settingsOliver O'Halloran
commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream. New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/pseries: Query hypervisor for RFI flush settingsMichael Neuling
commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream. A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Support disabling RFI flush with no_rfi_flush and noptiMichael Ellerman
commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream. Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it. We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate every one about a different command line option. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Add support for RFI flush of L1-D cacheMichael Ellerman
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream. On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir - back ported to stable with changes] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNELNicholas Piggin
commit c7305645eb0c1621351cfc104038831ae87c0053 upstream. In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Backport to 4.4 based on patch from Balbir] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNELNicholas Piggin
commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream. In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNELNicholas Piggin
commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream. Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64s: Simple RFI macro conversionsNicholas Piggin
commit 222f20f140623ef6033491d0103ee0875fe87d35 upstream. This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed issues with backporting to stable] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Add macros for annotating the destination of rfid/hrfidNicholas Piggin
commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream. The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapperMichael Neuling
commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream. A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed conflicts in backport] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc: Simplify module TOC handlingAlan Modra
commit c153693d7eb9eeb28478aa2deaaf0b4e7b5ff5e9 upstream. PowerPC64 uses the symbol .TOC. much as other targets use _GLOBAL_OFFSET_TABLE_. It identifies the value of the GOT pointer (or in powerpc parlance, the TOC pointer). Global offset tables are generally local to an executable or shared library, or in the kernel, module. Thus it does not make sense for a module to resolve a relocation against .TOC. to the kernel's .TOC. value. A module has its own .TOC., and indeed the powerpc64 module relocation processing ignores the kernel value of .TOC. and instead calculates a module-local value. This patch removes code involved in exporting the kernel .TOC., tweaks modpost to ignore an undefined .TOC., and the module loader to twiddle the section symbol so that .TOC. isn't seen as undefined. Note that if the kernel was compiled with -msingle-pic-base then ELFv2 would not have function global entry code setting up r2. In that case the module call stubs would need to be modified to set up r2 using the kernel .TOC. value, requiring some of this code to be reinstated. mpe: Furthermore a change in binutils master (not yet released) causes the current way we handle the TOC to no longer work when building with MODVERSIONS=y and RELOCATABLE=n. The symptom is that modules can not be loaded due to there being no version found for TOC. Signed-off-by: Alan Modra <amodra@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VECBenjamin Herrenschmidt
commit 5a69aec945d27e78abac9fd032533d3aaebf7c1e upstream. VSX uses a combination of the old vector registers, the old FP registers and new "second halves" of the FP registers. Thus when we need to see the VSX state in the thread struct (flush_vsx_to_thread()) or when we'll use the VSX in the kernel (enable_kernel_vsx()) we need to ensure they are all flushed into the thread struct if either of them is individually enabled. Unfortunately we only tested if the whole VSX was enabled, not if they were individually enabled. Fixes: 72cd7b44bc99 ("powerpc: Uncomment and make enable_kernel_vsx() routine available") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Backported due to changed context] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/64: Fix flush_(d|i)cache_range() called from modulesOliver O'Halloran
commit 8f5f525d5b83f7d76a6baf9c4e94d4bf312ea7f6 upstream. When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does not include a global entry point. A function's global entry point is used when the function is called from a different TOC context and in the kernel this typically means a call from a module into the vmlinux (or vice-versa). There are a few exported asm functions declared with _GLOBAL() and calling them from a module will likely crash the kernel since any TOC relative load will yield garbage. flush_icache_range() and flush_dcache_range() are both exported to modules, and use the TOC, so must use _GLOBAL_TOC(). Fixes: 721aeaa9fdf3 ("powerpc: Build little endian ppc64 kernel with ABIv2") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16powerpc/bpf/jit: Disable classic BPF JIT on ppc64leNaveen N. Rao
commit 844e3be47693f92a108cb1fb3b0606bf25e9c7a6 upstream. Classic BPF JIT was never ported completely to work on little endian powerpc. However, it can be enabled and will crash the system when used. As such, disable use of BPF JIT on ppc64le. Fixes: 7c105b63bd98 ("powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option.") Reported-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02powerpc/perf: Dereference BHRB entries safelyRavi Bangoria
commit f41d84dddc66b164ac16acf3f584c276146f1c48 upstream. It's theoretically possible that branch instructions recorded in BHRB (Branch History Rolling Buffer) entries have already been unmapped before they are processed by the kernel. Hence, trying to dereference such memory location will result in a crash. eg: Unable to handle kernel paging request for data at address 0xd000000019c41764 Faulting instruction address: 0xc000000000084a14 NIP [c000000000084a14] branch_target+0x4/0x70 LR [c0000000000eb828] record_and_restart+0x568/0x5c0 Call Trace: [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable) [c0000000000ec378] perf_event_interrupt+0x298/0x460 [c000000000027964] performance_monitor_exception+0x54/0x70 [c000000000009ba4] performance_monitor_common+0x114/0x120 Fix it by deferefencing the addresses safely. Fixes: 691231846ceb ("powerpc/perf: Fix setting of "to" addresses for BHRB") Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Use probe_kernel_read() which is clearer, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20powerpc/perf/hv-24x7: Fix incorrect comparison in memordMichael Ellerman
[ Upstream commit 05c14c03138532a3cb2aa29c2960445c8753343b ] In the hv-24x7 code there is a function memord() which tries to implement a sort function return -1, 0, 1. However one of the conditions is incorrect, such that it can never be true, because we will have already returned. I don't believe there is a bug in practice though, because the comparisons are an optimisation prior to calling memcmp(). Fix it by swapping the second comparision, so it can be true. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20powerpc/ipic: Fix status get and status clearChristophe Leroy
[ Upstream commit 6b148a7ce72a7f87c81cbcde48af014abc0516a9 ] IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR which is the mask register. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20powerpc/opal: Fix EBUSY bug in acquiring tokensWilliam A. Kennington III
[ Upstream commit 71e24d7731a2903b1ae2bba2b2971c654d9c2aa6 ] The current code checks the completion map to look for the first token that is complete. In some cases, a completion can come in but the token can still be on lease to the caller processing the completion. If this completed but unreleased token is the first token found in the bitmap by another tasks trying to acquire a token, then the __test_and_set_bit call will fail since the token will still be on lease. The acquisition will then fail with an EBUSY. This patch reorganizes the acquisition code to look at the opal_async_token_map for an unleased token. If the token has no lease it must have no outstanding completions so we should never see an EBUSY, unless we have leased out too many tokens. Since opal_async_get_token_inrerruptible is protected by a semaphore, we will practically never see EBUSY anymore. Fixes: 8d7248232208 ("powerpc/powernv: Infrastructure to support OPAL async completion") Signed-off-by: William A. Kennington III <wak@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>