summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/pseries
AgeCommit message (Collapse)Author
2014-03-06powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctlyTony Breeds
commit 41dd03a94c7d408d2ef32530545097f7d1befe5c upstream. Currently we're storing a host endian RTAS token in rtas_stop_self_args.token. We then pass that directly to rtas. This is fine on big endian however on little endian the token is not what we expect. This will typically result in hitting: panic("Alas, I survived.\n"); To fix this we always use the stop-self token in host order and always convert it to be32 before passing this to rtas. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26powerpc: Default arch idle could cede processor on pseriesVaidyanathan Srinivasan
commit 363edbe2614aa90df706c0f19ccfa2a6c06af0be upstream. When adding cpuidle support to pSeries, we introduced two regressions: - The new cpuidle backend driver only works under hypervisors supporting the "SLPLAR" option, which isn't the case of the old POWER4 hypervisor and the HV "light" used on js2x blades - The cpuidle driver registers fairly late, meaning that for a significant portion of the boot process, we end up having all threads spinning. This slows down the boot process and increases the overall resource usage if the hypervisor has shared processors. This fixes both by implementing a "default" idle that will cede to the hypervisor when possible, in a very simple way without all the bells and whisles of cpuidle. Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-30powerpc/eeh: Fix fetching bus for single-dev-PEGavin Shan
While running Linux as guest on top of phyp, we possiblly have PE that includes single PCI device. However, we didn't return its PCI bus correctly and it leads to failure on recovery from EEH errors for single-dev-PE. The patch fixes the issue. Cc: <stable@vger.kernel.org> # v3.7+ Cc: Steve Best <sbest@us.ibm.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-28powerpc/eeh: Add eeh_dev to the cache during bootThadeu Lima de Souza Cascardo
commit f8f7d63fd96ead101415a1302035137a866f8998 ("powerpc/eeh: Trace eeh device from I/O cache") broke EEH on pseries for devices that were present during boot and have not been hotplugged/DLPARed. eeh_check_failure will get the eeh_dev from the cache, and will get NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev for the giving pci_device is not set yet. Just reordering the call to eeh_addr_cache_insert_dev works fine. The ordering is similar to the one in eeh_add_device_late. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10powerpc/eeh: Don't check RTAS token to get PE addrGavin Shan
RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2" are used to retrieve the PE address according to PCI address, which made up of domain/bus/slot/function. If we don't have those 2 tokens, the domain/bus/slot/function would be used as the address for EEH RTAS operations. Some older f/w might not have those 2 tokens and that blocks the EEH functionality to be initialized. It was introduced by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization"). The patch skips the check on those 2 tokens so we can bring up EEH functionality successfully. And domain/bus/slot/function will be used as address for EEH RTAS operations. Cc: <stable@vger.kernel.org> # v3.4+ Reported-by: Robert Knight <knight@princeton.edu> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Robert Knight <knight@princeton.edu> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/pseries: Always enable CONFIG_HOTPLUG_CPU on PSERIES SMPSrivatsa S. Bhat
Adam Lackorzynski reported the following build failure on !CONFIG_HOTPLUG_CPU configuration: CC arch/powerpc/kernel/rtas.o arch/powerpc/kernel/rtas.c: In function ‘rtas_cpu_state_change_mask’: arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function ‘cpu_down’ [-Werror=implicit-function-declaration] cc1: all warnings being treated as errors make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1 make: *** [arch/powerpc/kernel] Error 2 The build fails because cpu_down() is defined only under CONFIG_HOTPLUG_CPU. Looking further, the mobility code in pseries is one of the call-sites which uses rtas_ibm_suspend_me(), which in turn calls rtas_cpu_state_change_mask(). And the mobility code is unconditionally compiled-in (it does not fall under any Kconfig option). And commit 120496ac (powerpc: Bring all threads online prior to migration/hibernation) which introduced this build regression is critical for the proper functioning of the migration code. So it appears that the only solution to this problem is to enable CONFIG_HOTPLUG_CPU if SMP is enabled on PPC_PSERIES platforms. So make that change in the Kconfig. Reported-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de> Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc/pseries: Make 32-bit MSI quirk work on systems lacking firmware supportBrian King
Recent commit e61133dda480062d221f09e4fc18f66763f8ecd0 added support for a new firmware feature to force an adapter to use 32 bit MSIs. However, this firmware is not available for all systems. The hack below allows devices needing 32 bit MSIs to work on these systems as well. It is careful to only enable this on Gen2 slots, which should limit this to configurations where this hack is needed and tested to work. [Small change to factor out the hack into a separate function -- BenH] Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-24powerpc: Make radeon 32-bit MSI quirk work on powernvBenjamin Herrenschmidt
This moves the quirk itself to pci_64.c as to get built on all ppc64 platforms (the only ones with a pci_dn), factors the two implementations of get_pdn() into a single pci_get_dn() and use the quirk to do 32-bit MSIs on IODA based powernv platforms. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: select HAVE_CONTEXT_TRACKING for pSeriesLi Zhong
Start context tracking support from pSeries. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-14powerpc: Bring all threads online prior to migration/hibernationRobert Jennings
This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Cc: <stable@kernel.org> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-06powerpc/pseries: Perform proper max_bus_speed detectionKleber Sacilotto de Souza
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries. This is done by overwriting the weak pcibios_root_bridge_prepare function which is called by pci_create_root_bus(). From: Lucas Kannebley Tavares <lucaskt@linux.vnet.ibm.com> Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-06powerpc/pseries: Force 32 bit MSIs for devices that require itBrian King
The following patch implements a new PAPR change which allows the OS to force the use of 32 bit MSIs, regardless of what the PCI capabilities indicate. This is required for some devices that advertise support for 64 bit MSIs but don't actually support them. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-02Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlights this time around are: - A pile of addition POWER8 bits and nits, such as updated performance counter support (Michael Ellerman), new branch history buffer support (Anshuman Khandual), base support for the new PCI host bridge when not using the hypervisor (Gavin Shan) and other random related bits and fixes from various contributors. - Some rework of our page table format by Aneesh Kumar which fixes a thing or two and paves the way for THP support. THP itself will not make it this time around however. - More Freescale updates, including Altivec support on the new e6500 cores, new PCI controller support, and a pile of new boards support and updates. - The usual batch of trivial cleanups & fixes" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) powerpc: Fix build error for book3e powerpc: Context switch the new EBB SPRs powerpc: Turn on the EBB H/FSCR bits powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S powerpc: Setup BHRB instructions facility in HFSCR for POWER8 powerpc: Fix interrupt range check on debug exception powerpc: Update tlbie/tlbiel as per ISA doc powerpc: Print page size info during boot powerpc: print both base and actual page size on hash failure powerpc: Fix hpte_decode to use the correct decoding for page sizes powerpc: Decode the pte-lp-encoding bits correctly. powerpc: Use encode avpn where we need only avpn values powerpc: Reduce PTE table memory wastage powerpc: Move the pte free routines from common header powerpc: Reduce the PTE_INDEX_SIZE powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format powerpc: New hugepage directory format powerpc: Don't truncate pgd_index wrongly powerpc: Don't hard code the size of pte page powerpc: Save DAR and DSISR in pt_regs on MCE ...
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
2013-05-01ppc: Clean up scanlogDavid Howells
Clean up the pseries scanlog driver's use of procfs: (1) Don't need to save the proc_dir_entry pointer as we have the filename to remove with. (2) Save the scan log buffer pointer in a static variable (there is only one of it) and don't save it in the PDE (which doesn't have a destructor). Signed-off-by: David Howells <dhowells@redhat.com> cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> cc: Paul Mackerras <paulus@samba.org> cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01proc: Supply PDE attribute setting accessor functionsDavid Howells
Supply accessor functions to set attributes in proc_dir_entry structs. The following are supplied: proc_set_size() and proc_set_user(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> cc: linuxppc-dev@lists.ozlabs.org cc: linux-media@vger.kernel.org cc: netdev@vger.kernel.org cc: linux-wireless@vger.kernel.org cc: linux-pci@vger.kernel.org cc: netfilter-devel@vger.kernel.org cc: alsa-devel@alsa-project.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-30Merge tag 'pm+acpi-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael J Wysocki: - ARM big.LITTLE cpufreq driver from Viresh Kumar. - exynos5440 cpufreq driver from Amit Daniel Kachhap. - cpufreq core cleanup and code consolidation from Viresh Kumar and Stratos Karafotis. - cpufreq scalability improvement from Nathan Zimmer. - AMD "frequency sensitivity feedback" powersave bias for the ondemand cpufreq governor from Jacob Shin. - cpuidle code consolidation and cleanups from Daniel Lezcano. - ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano. - ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim, Lv Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto. - ACPI core updates related to hotplug from Toshi Kani, Paul Bolle, Yasuaki Ishimatsu, and Rafael J Wysocki. - Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements from Rafael J Wysocki and Andy Shevchenko. * tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (192 commits) cpufreq: Revert incorrect commit 5800043 cpufreq: MAINTAINERS: Add co-maintainer cpuidle: add maintainer entry ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points ARM: s3c64xx: cpuidle: use init/exit common routine cpufreq: pxa2xx: initialize variables ACPI: video: correct acpi_video_bus_add error processing SH: cpuidle: use init/exit common routine ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y ACPI: Fix wrong parameter passed to memblock_reserve cpuidle: fix comment format pnp: use %*phC to dump small buffers isapnp: remove debug leftovers ARM: imx: cpuidle: use init/exit common routine ARM: davinci: cpuidle: use init/exit common routine ARM: kirkwood: cpuidle: use init/exit common routine ARM: calxeda: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine for tegra3 ARM: tegra: cpuidle: use init/exit common routine for tegra2 ARM: OMAP4: cpuidle: use init/exit common routine ...
2013-04-30powerpc: Decode the pte-lp-encoding bits correctly.Aneesh Kumar K.V
We look at both the segment base page size and actual page size and store the pte-lp-encodings in an array per base page size. We also update all relevant functions to take actual page size argument so that we can use the correct PTE LP encoding in HPTE. This should also get the basic Multiple Page Size per Segment (MPSS) support. This is needed to enable THP on ppc64. [Fixed PR KVM build --BenH] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-30powerpc: Use signed formatting when printing errorAneesh Kumar K.V
PAPR defines these errors as negative values. So print them accordingly for easy debugging. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-29mm, hotplug: avoid compiling memory hotremove functions when disabledDavid Rientjes
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC pseries will return -EOPNOTSUPP if unsupported. Adding an #ifdef causes several other functions it depends on to also become unnecessary, which saves in .text when disabled (it's disabled in most defconfigs besides powerpc, including x86). remove_memory_block() becomes static since it is not referenced outside of drivers/base/memory.c. Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled and disabled. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-28Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: (51 commits) cpuidle: add maintainer entry ARM: s3c64xx: cpuidle: use init/exit common routine SH: cpuidle: use init/exit common routine cpuidle: fix comment format ARM: imx: cpuidle: use init/exit common routine ARM: davinci: cpuidle: use init/exit common routine ARM: kirkwood: cpuidle: use init/exit common routine ARM: calxeda: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine for tegra3 ARM: tegra: cpuidle: use init/exit common routine for tegra2 ARM: OMAP4: cpuidle: use init/exit common routine ARM: shmobile: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine ARM: OMAP3: cpuidle: use init/exit common routine ARM: at91: cpuidle: use init/exit common routine ARM: ux500: cpuidle: use init/exit common routine cpuidle: make a single register function for all ARM: ux500: cpuidle: replace for_each_online_cpu by for_each_possible_cpu cpuidle: remove en_core_tk_irqen flag ARM: OMAP3: remove cpuidle_wrap_enter ...
2013-04-26powerpc/pseries: Update CPU maps when device tree is updatedJesse Larrew
Platform events such as partition migration or the new PRRN firmware feature can cause the NUMA characteristics of a CPU to change, and these changes will be reflected in the device tree nodes for the affected CPUs. This patch registers a handler for Open Firmware device tree updates and reconfigures the CPU and node maps whenever the associativity changes. Currently, this is accomplished by marking the affected CPUs in the cpu_associativity_changes_mask and allowing arch_update_cpu_topology() to retrieve the new associativity information using hcall_vphn(). Protecting the NUMA cpu maps from concurrent access during an update operation will be addressed in a subsequent patch in this series. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Update firmware_has_feature() to check architecture vector ↵Nathan Fontenot
5 bits The firmware_has_feature() function makes it easy to check for supported features of the hypervisor. This patch extends the capability of firmware_has_feature() to include checking for specified bits in vector 5 of the architecture vector as reported in the device tree. As part of this the #defines used for the architecture vector are re-defined such that each option has the index into vector 5 and the feature bit encoded into it. This makes checking for architecture bits when initiating data for firmware_has_feature much easier. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Use ARRAY_SIZE to iterate over firmware_features_table arrayNathan Fontenot
When iterating over the entries in firmware_features_table we only need to go over the actual number of entries in the array instead of declaring it to be bigger and checking to make sure there is a valid entry in every slot. This patch removes the FIRMWARE_MAX_FEATURES #define and replaces the array looping with the use of ARRAY_SIZE(). Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Correct buffer parsing in update_dt_node()Nathan Fontenot
Correct parsing of the buffer returned from ibm,update-properties. The first element is a length and the path to the property which is slightly different from the list of properties in the buffer so we need to specifically handle this. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/pseries: Expose pseries devicetree_update()Nathan Fontenot
Newer firmware on Power systems can transparently reassign platform resources (CPU and Memory) in use. For instance, if a processor or memory unit is predicted to fail, the platform may transparently move the processing to an equivalent unused processor or the memory state to an equivalent unused memory unit. However, reassigning resources across NUMA boundaries may alter the performance of the partition. When such reassignment is necessary, the Platform Resource Reassignment Notification (PRRN) option provides a mechanism to inform the Linux kernel of changes to the NUMA affinity of its platform resources. When rtasd receives a PRRN event, it needs to make a series of RTAS calls (ibm,update-nodes and ibm,update-properties) to retrieve the updated device tree information. These calls are already handled in the pseries_devicetree_update() routine used in partition migration. This patch exposes pseries_devicetree_update() to make it accessible to other pseries routines, this patch also updates pseries_devicetree_update() to take a 32-bit scope parameter. The scope value, which was previously hard coded to 1 for partition migration, is used for the RTAS calls ibm,update-nodes/properties to update the device tree. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23cpuidle: remove en_core_tk_irqen flagDaniel Lezcano
The en_core_tk_irqen flag is set in all the cpuidle driver which means it is not necessary to specify this flag. Remove the flag and the code related to it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Kevin Hilman <khilman@linaro.org> # for mach-omap2/* Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-18powerpc/pseries: close DDW race between functions of adapterNishanth Aravamudan
Given a PCI device with multiple functions in a DDW capable slot, the following situation can be encountered: When the first function sets a 64-bit DMA mask, enable_ddw() will be called and we can fail to properly configure DDW (the most common reason being the new DMA window's size is not large enough to map all of an LPAR's memory). With the recent changes to DDW, we remove the base window in order to determine if the new window is of sufficient size to cover an LPAR's memory. We correctly replace the base window if we find that not to be the case. However, once we go through and re-configured 32-bit DMA via the IOMMU, the next function of the adapter will go through the same process. And since DDW is a characteristic of the slot itself, we are most likely going to fail again. But to determine we are going to fail the second slot, we again remove the base window -- but that is now in-use by the first function/driver, which might be issuing I/O already. To close this window, keep a list of all the failed struct device_nodes that have failed to configure DDW. If the current device_node is in that list, just fail out immediately and fall back to 32-bit DMA without doing any DDW manipulation. Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18powerpc: Use VPA subfunction macros instead of numbers for vpa callsLi Zhong
Use macros in vpa calls. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-09procfs: new helper - PDE_DATA(inode)Al Viro
The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-08POWERPC: pseries: cpuidle: use time keeping flagDaniel Lezcano
The current code computes the idle time but that can be handled by the cpuidle framework if we enable the .en_core_tk_irqen flag. Set the flag and remove the code related to the time computation. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-08powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being ↵Michael Wolf
performed before the ANDCOND test Some versions of pHyp will perform the adjunct partition test before the ANDCOND test. The result of this is that H_RESOURCE can be returned and cause the BUG_ON condition to occur. The HPTE is not removed. So add a check for H_RESOURCE, it is ok if this HPTE is not removed as pSeries_lpar_hpte_remove is looking for an HPTE to remove and not a specific HPTE to remove. So it is ok to just move on to the next slot and try again. Cc: stable@vger.kernel.org Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2013-03-05powerpc/pseries/hvcserver: Fix strncpy buffer limit in location codeChen Gang
the dest buf len is 80 (HVCS_CLC_LENGTH + 1). the src buf len is PAGE_SIZE. if src buf string len is more than 80, it will cause issue. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-23Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "So from the depth of frozen Minnesota, here's the powerpc pull request for 3.9. It has a few interesting highlights, in addition to the usual bunch of bug fixes, minor updates, embedded device tree updates and new boards: - Hand tuned asm implementation of SHA1 (by Paulus & Michael Ellerman) - Support for Doorbell interrupts on Power8 (kind of fast thread-thread IPIs) by Ian Munsie - Long overdue cleanup of the way we handle relocation of our open firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard - Support for saving/restoring & context switching the PPR (Processor Priority Register) on server processors that support it. This allows the kernel to preserve thread priorities established by userspace. By Haren Myneni. - DAWR (new watchpoint facility) support on Power8 by Michael Neuling - Ability to change the DSCR (Data Stream Control Register) which controls cache prefetching on a running process via ptrace by Alexey Kardashevskiy - Support for context switching the TAR register on Power8 (new branch target register meant to be used by some new specific userspace perf event interrupt facility which is yet to be enabled) by Ian Munsie. - Improve preservation of the CFAR register (which captures the origin of a branch) on various exception conditions by Paulus. - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where it belongs by Philippe De Muyter - Support for Transactional Memory on Power8 by Michael Neuling (based on original work by Matt Evans). For those curious about the feature, the patch contains a pretty good description." (See commit db8ff907027b: "powerpc: Documentation for transactional memory on powerpc" for the mentioned description added to the file Documentation/powerpc/transactional_memory.txt) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits) powerpc/kexec: Disable hard IRQ before kexec powerpc/85xx: l2sram - Add compatible string for BSC9131 platform powerpc/85xx: bsc9131 - Correct typo in SDHC device node powerpc/e500/qemu-e500: enable coreint powerpc/mpic: allow coreint to be determined by MPIC version powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct powerpc/85xx: Board support for ppa8548 powerpc/fsl: remove extraneous DIU platform functions arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test powerpc: Documentation for transactional memory on powerpc powerpc: Add transactional memory to pseries and ppc64 defconfigs powerpc: Add config option for transactional memory powerpc: Add transactional memory to POWER8 cpu features powerpc: Add new transactional memory state to the signal context powerpc: Hook in new transactional memory code powerpc: Routines for FP/VSX/VMX unavailable during a transaction powerpc: Add transactional memory unavaliable execption handler powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes powerpc: Add FP/VSX and VMX register load functions for transactional memory powerpc: Add helper functions for transactional memory context switching ...
2013-02-22new helper: file_inode(file)Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-08pseries/iommu: Remove DDW on kexecNishanth Aravamudan
pseries/iommu: remove DDW on kexec We currently insert a property in the device-tree when we successfully configure DDW for a given slot. This was meant to be an optimization to speed up kexec/kdump, so that we don't need to make the RTAS calls again to re-configured DDW in the new kernel. However, we end up tripping a plpar_tce_stuff failure on kexec/kdump because we unconditionally parse the ibm,dma-window property for the node at bus/dev setup time. This property contains the 32-bit DMA window LIOBN, which is distinct from the DDW window's. We pass that LIOBN (via iommu_table_init -> iommu_table_clear -> tce_free -> tce_freemulti_pSeriesLP) to plpar_tce_stuff, which fails because that 32-bit window is no longer present after 25ebc45b93452d0bc60271f178237123c4b26808 ("powerpc/pseries/iommu: remove default window before attempting DDW manipulation"). I believe the simplest, easiest-to-maintain fix is to just change our initcall to, rather than detecting and updating the new kernel's DDW knowledge, just remove all DDW configurations. When the drivers re-initialize, we will set everything back up as it was before. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-08pseries/iommu: Restore_default_window does not use liobn parameterNishanth Aravamudan
The parameter is unused, and complicates a following fix. Just remove it. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29pseries/iommu: Ensure TCEs are cleared with non-huge DDWNishanth Aravamudan
There are now two kinds of DMA windows that might be presented by PowerVM DDW support -- huge windows (that can map all of system memory regardless of the LPAR configuration) and non-huge windows (which can't). They are implemented slightly differently in PowerVM, and thus have different characteristics. The most obvious is that slot isolate doesn't clear the TCEs/window for us with non-huge windows. Thus, when a DLPAR operation occurs on a slot using a non-huge window, TCEs are still present (the notifier chain doesn't currently remove them explicitly) and the DLPAR fails. Fix this by calling remove_ddw() first, which will unmap the DDW TCEs. Note: a corresponding change to drmgr is needed to actually successfully DLPAR, such that the device-tree update (which causes the notifier chain to fire) occurs before slot isolate. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29pseries/iommu: Fix iteration in DDW TCE clearrangeNishanth Aravamudan
tce_clearrange_multi_pSeriesLP is attempting to iterate over all TCEs in a given range. However, is it not advancing the dma_offset value passed to plpar_tce_stuff via the next value. This prevents DLPAR from completing, because TCEs are still present at slot isolation time. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-27cputime: Generic on-demand virtual cputime accountingFrederic Weisbecker
If we want to stop the tick further idle, we need to be able to account the cputime without using the tick. Virtual based cputime accounting solves that problem by hooking into kernel/user boundaries. However implementing CONFIG_VIRT_CPU_ACCOUNTING require low level hooks and involves more overhead. But we already have a generic context tracking subsystem that is required for RCU needs by archs which plan to shut down the tick outside idle. This patch implements a generic virtual based cputime accounting that relies on these generic kernel/user hooks. There are some upsides of doing this: - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING if context tracking is already built (already necessary for RCU in full tickless mode). - We can rely on the generic context tracking subsystem to dynamically (de)activate the hooks, so that we can switch anytime between virtual and tick based accounting. This way we don't have the overhead of the virtual accounting when the tick is running periodically. And one downside: - There is probably more overhead than a native virtual based cputime accounting. But this relies on hooks that are already set anyway. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2013-01-10powerpc/eeh: Fix crash when adding a device in a slot with DDWThadeu Lima de Souza Cascardo
The DDW code uses a eeh_dev struct from the pci_dev. However, this is not set until eeh_add_device_late is called. Since pci_bus_add_devices is called before eeh_add_device_late, the PCI devices are added to the bus, making drivers' probe hooks to be called. These will call set_dma_mask, which will call the DDW code, which will require the eeh_dev struct from pci_dev. This would result in a crash, due to a NULL dereference. Calling eeh_add_device_late after pci_bus_add_devices would make the system BUG, because device files shouldn't be added to devices there were not added to the system. So, a new function is needed to add such files only after pci_bus_add_devices have been called. Cc: stable@vger.kernel.org Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Add the DAWR support to the set_break()Michael Neuling
This adds DAWR supoprt to the set_break(). It does both bare metal and PAPR versions of setting the DAWR. There is still some work we can do to make full use of the watchpoint but that will come later. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Add helper functions set the DAWR and CIABR using set_modeIan Munsie
These are just wrappers around the new set_mode HCALL. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Disable relocation on exceptions whenever PR KVM is activeIan Munsie
For PR KVM we allow userspace to map 0xc000000000000000. Because transitioning from userspace to the guest kernel may use the relocated exception vectors we have to disable relocation on exceptions whenever PR KVM is active as we cannot trust that address. This issue does not apply to HV KVM, since changing from a guest to the hypervisor will never use the relocated exception vectors. Currently the hypervisor interface only allows us to toggle relocation on exceptions on a partition wide scope, so we need to globally disable relocation on exceptions when the first PR KVM instance is started and only re-enable them when all PR KVM instances have been destroyed. It's a bit heavy handed, but until the hypervisor gives us a lightweight way to toggle relocation on exceptions on a single thread it's only real option. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Build kernel with -mcmodel=mediumAnton Blanchard
Finally remove the two level TOC and build with -mcmodel=medium. Unfortunately we can't build modules with -mcmodel=medium due to the tricks the kernel module loader plays with percpu data: # -mcmodel=medium breaks modules because it uses 32bit offsets from # the TOC pointer to create pointers where possible. Pointers into the # percpu data area are created by this method. # # The kernel module loader relocates the percpu data section from the # original location (starting with 0xd...) to somewhere in the base # kernel percpu data space (starting with 0xc...). We need a full # 64bit relocation for this to work, hence -mcmodel=large. On older kernels we fall back to the two level TOC (-mminimal-toc) Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc/pseries/pci: Use NULL instead of 0 for pointersTushar Behera
The third argument for of_get_property() is a pointer, hence pass NULL instead of 0. Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Hook up doorbells on serverIan Munsie
This patch actually hooks up doorbell interrupts on POWER8: - Select the PPC_DOORBELL Kconfig option from PPC_PSERIES - Add the doorbell CPU feature bit to POWER8 - We define a new pSeries_cause_ipi_mux() function that issues a doorbell interrupt if the recipient is another thread within the same core as the sender. If the recipient is in a different core it falls back to using XICS to deliver the IPI as before. - During pSeries_smp_probe() at boot, we check if doorbell interrupts are supported. If they are we set the cause_ipi function pointer to the above mentioned function, otherwise we leave it as whichever XICS cause_ipi function was determined by xics_smp_probe(). Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc/pseries: Cleanup best_energy_hcall detectionMichael Neuling
Currently we search for the best_energy hcall using a custom function. Move this to using the firmware_feature_table. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> cc: Linux PPC dev <linuxppc-dev@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc/pseries: Allow firmware features to match partial stringsMichael Neuling
This allows firmware_features_table names to add a '*' at the end so that only partial strings are matched. When a '*' is added, only upto the '*' is matched when setting firmware feature bits. This is useful for the matching best energy feature. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> cc: Linux PPC dev <linuxppc-dev@ozlabs.org> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>