summaryrefslogtreecommitdiff
path: root/arch/arm/mm
AgeCommit message (Collapse)Author
2017-07-05ARM: 8685/1: ensure memblock-limit is pmd-alignedDoug Berger
commit 9e25ebfe56ece7541cd10a20d715cbdd148a2e06 upstream. The pmd containing memblock_limit is cleared by prepare_page_table() which creates the opportunity for early_alloc() to allocate unmapped memory if memblock_limit is not pmd aligned causing a boot-time hang. Commit 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") attempted to resolve this problem, but there is a path through the adjust_lowmem_bounds() routine where if all memory regions start and end on pmd-aligned addresses the memblock_limit will be set to arm_lowmem_limit. Since arm_lowmem_limit can be affected by the vmalloc early parameter, the value of arm_lowmem_limit may not be pmd-aligned. This commit corrects this oversight such that memblock_limit is always rounded down to pmd-alignment. Fixes: 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") Signed-off-by: Doug Berger <opendmb@gmail.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-24mm: larger stack guard gap, between vmasHugh Dickins
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [wt: backport to 4.11: adjust context] [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14ARM: 8637/1: Adjust memory boundaries after reservationsLaura Abbott
commit 985626564eedc470ce2866e53938303368ad41b7 upstream. adjust_lowmem_bounds is responsible for setting up the boundary for lowmem/highmem. This needs to be setup before memblock reservations can occur. At the time memblock reservations can occur, memory can also be removed from the system. The lowmem/highmem boundary and end of memory may be affected by this but it is currently not recalculated. On some systems this may be harmless, on others this may result in incorrect ranges being passed to the main memory allocator. Correct this by recalculating the lowmem/highmem boundary after all reservations have been made. Tested-by: Magnus Lilja <lilja.magnus@gmail.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Julien Grall <julien.grall@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14ARM: 8636/1: Cleanup sanity_check_meminfoLaura Abbott
commit 374d446d25d6271ee615952a3b7f123ba4983c35 upstream. The logic for sanity_check_meminfo has become difficult to follow. Clean up the code so it's more obvious what the code is actually trying to do. Additionally, meminfo is now removed so rename the function to better describe its purpose. Tested-by: Magnus Lilja <lilja.magnus@gmail.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Julien Grall <julien.grall@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 callVladimir Murzin
commit 6d80594936914e798b1b54b3bfe4bd68d8418966 upstream. We save/restore registers around v7m_invalidate_l1 to address pointed by r12, which is vector table, so the first eight entries are overwritten with a garbage. We already have stack setup at that stage, so use it to save/restore register. Fixes: 6a8146f420be ("ARM: 8609/1: V7M: Add support for the Cortex-M7 processor") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-14ARM: 8642/1: LPAE: catch pending imprecise abort on unmaskAlexander Sverdlin
commit 97a98ae5b8acf08d07d972c087b2def060bc9b73 upstream. Asynchronous external abort is coded differently in DFSR with LPAE enabled. Fixes: 9254970c "ARM: 8447/1: catch pending imprecise abort on unmask". Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-15ARM: 8628/1: dma-mapping: preallocate DMA-debug hash tables in core_initcallMarek Szyprowski
fs_initcall is definitely too late to initialize DMA-debug hash tables, because some drivers might get probed and use DMA mapping framework already in core_initcall. Late initialization of DMA-debug results in false warning about accessing memory, that was not allocated, like this one: ------------[ cut here ]------------ WARNING: CPU: 5 PID: 1 at lib/dma-debug.c:1104 check_unmap+0xa1c/0xe50 exynos-sysmmu 10a60000.sysmmu: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000006ebd0000] [size=16384 bytes] Modules linked in: CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc5-00028-g39dde3d-dirty #44 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c0119dd4>] (unwind_backtrace) from [<c01122bc>] (show_stack+0x20/0x24) [<c01122bc>] (show_stack) from [<c062714c>] (dump_stack+0x84/0xa0) [<c062714c>] (dump_stack) from [<c0132560>] (__warn+0x14c/0x180) [<c0132560>] (__warn) from [<c01325dc>] (warn_slowpath_fmt+0x48/0x50) [<c01325dc>] (warn_slowpath_fmt) from [<c06814f8>] (check_unmap+0xa1c/0xe50) [<c06814f8>] (check_unmap) from [<c06819c4>] (debug_dma_unmap_page+0x98/0xc8) [<c06819c4>] (debug_dma_unmap_page) from [<c076c3e8>] (exynos_iommu_domain_free+0x158/0x380) [<c076c3e8>] (exynos_iommu_domain_free) from [<c0764a30>] (iommu_domain_free+0x34/0x60) [<c0764a30>] (iommu_domain_free) from [<c011f168>] (release_iommu_mapping+0x30/0xb8) [<c011f168>] (release_iommu_mapping) from [<c011f23c>] (arm_iommu_release_mapping+0x4c/0x50) [<c011f23c>] (arm_iommu_release_mapping) from [<c0b061ac>] (s5p_mfc_probe+0x640/0x80c) [<c0b061ac>] (s5p_mfc_probe) from [<c07e6750>] (platform_drv_probe+0x70/0x148) [<c07e6750>] (platform_drv_probe) from [<c07e25c0>] (driver_probe_device+0x12c/0x6b0) [<c07e25c0>] (driver_probe_device) from [<c07e2c6c>] (__driver_attach+0x128/0x17c) [<c07e2c6c>] (__driver_attach) from [<c07df74c>] (bus_for_each_dev+0x88/0xc8) [<c07df74c>] (bus_for_each_dev) from [<c07e1b6c>] (driver_attach+0x34/0x58) [<c07e1b6c>] (driver_attach) from [<c07e1350>] (bus_add_driver+0x18c/0x32c) [<c07e1350>] (bus_add_driver) from [<c07e4198>] (driver_register+0x98/0x148) [<c07e4198>] (driver_register) from [<c07e5cb0>] (__platform_driver_register+0x58/0x74) [<c07e5cb0>] (__platform_driver_register) from [<c174cb30>] (s5p_mfc_driver_init+0x1c/0x20) [<c174cb30>] (s5p_mfc_driver_init) from [<c0102690>] (do_one_initcall+0x64/0x258) [<c0102690>] (do_one_initcall) from [<c17014c0>] (kernel_init_freeable+0x3d0/0x4d0) [<c17014c0>] (kernel_init_freeable) from [<c116eeb4>] (kernel_init+0x18/0x134) [<c116eeb4>] (kernel_init) from [<c010bbd8>] (ret_from_fork+0x14/0x3c) ---[ end trace dc54c54bd3581296 ]--- This patch moves initialization of DMA-debug to core_initcall. This is safe from the initialization perspective. dma_debug_do_init() internally calls debugfs functions and debugfs also gets initialised at core_initcall(), and that is earlier than arch code in the link order, so it will get initialized just before the DMA-debug. Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-11-15ARM: 8624/1: proc-v7m.S: fix init section nameNicolas Pitre
There is no .text.init sections. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-10-19ARM: fix oops when using older ARMv4T CPUsRussell King
Alexander Shiyan reports that CLPS711x fails at boot time in the data exception handler due to a NULL pointer dereference. This is caused by the late-v4t abort handler overwriting R9 (which becomes zero). Fix this by making the abort handler save and restore R9. Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = c3b58000 [00000008] *pgd=800000000, *pte=00000000, *ppte=feff4140 Internal error: Oops: 63c11817 [#1] PREEMPT ARM CPU: 0 PID: 448 Comm: ash Not tainted 4.8.1+ #1 Hardware name: Cirrus Logic CLPS711X (Device Tree Support) task: c39e03a0 ti: c3b4e000 task.ti: c3b4e000 PC is at __dabt_svc+0x4c/0x60 LR is at do_page_fault+0x144/0x2ac pc : [<c000d3ac>] lr : [<c000fcec>] psr: 60000093 sp : c3b4fe6c ip : 00000001 fp : b6f1bf88 r10: c387a5a0 r9 : 00000000 r8 : e4e0e001 r7 : bee3ef83 r6 : 00100000 r5 : 80000013 r4 : c022fcf8 r3 : 00000000 r2 : 00000008 r1 : bf000000 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 0000217f Table: c3b58055 DAC: 00000055 Process ash (pid: 448, stack limit = 0xc3b4e190) Stack: (0xc3b4fe6c to 0xc3b50000) fe60: bee3ef83 c05168d1 ffffffff 00000000 c3adfe80 fe80: c3a03300 00000000 c3b4fed0 c3a03400 bee3ef83 c387a5a0 b6f1bf88 00000001 fea0: c3b4febc 00000076 c022fcf8 80000013 ffffffff 0000003f bf000000 bee3ef83 fec0: 00000004 00000000 c3adfe80 c00e432c 00000812 00000005 00000001 00000006 fee0: b6f1b000 00000000 00010000 0003c944 0004d000 0004d439 00010000 b6f1b000 ff00: 00000005 00000000 00015ecc c3b4fed0 0000000a 00000000 00000000 c00a1dc0 ff20: befff000 c3a03300 c3b4e000 c0507cd8 c0508024 fffffff8 c3a03300 00000000 ff40: c0516a58 c00a35bc c39e03a0 000001c0 bea84ce8 0004e008 c3b3a000 c00a3ac0 ff60: c3b40374 c3b3a000 bea84d11 00000000 c0500188 bea84d11 bea84ce8 00000001 ff80: 0000000b c000a304 c3b4e000 00000000 bea84ce4 c00a3cd0 00000000 bea84d11 ffa0: bea84ce8 c000a160 bea84d11 bea84ce8 bea84d11 bea84ce8 0004e008 0004d450 ffc0: bea84d11 bea84ce8 00000001 0000000b b6f45ee4 00000000 b6f5ff70 bea84ce4 ffe0: b6f2f130 bea84cb0 b6f2f194 b6ef29f4 a0000010 bea84d11 02c7cffa 02c7cffd [<c000d3ac>] (__dabt_svc) from [<c022fcf8>] (__copy_to_user_std+0xf8/0x330) [<c022fcf8>] (__copy_to_user_std) from [<c00e432c>] +(load_elf_binary+0x920/0x107c) [<c00e432c>] (load_elf_binary) from [<c00a35bc>] +(search_binary_handler+0x80/0x16c) [<c00a35bc>] (search_binary_handler) from [<c00a3ac0>] +(do_execveat_common+0x418/0x600) [<c00a3ac0>] (do_execveat_common) from [<c00a3cd0>] (do_execve+0x28/0x30) [<c00a3cd0>] (do_execve) from [<c000a160>] (ret_fast_syscall+0x0/0x30) Code: e1a0200d eb00136b e321f093 e59d104c (e5891008) ---[ end trace 4b4f8086ebef98c5 ]--- Fixes: e6978e4bf181 ("ARM: save and reset the address limit when entering an exception") Reported-by: Alexander Shiyan <shc_work@mail.ru> Tested-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-10-11Merge branch 'work.uaccess2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess.h prepwork from Al Viro: "Preparations to tree-wide switch to use of linux/uaccess.h (which, obviously, will allow to start unifying stuff for real). The last step there, ie PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ `git grep -l "$PATT"|grep -v ^include/linux/uaccess.h` is not taken here - I would prefer to do it once just before or just after -rc1. However, everything should be ready for it" * 'work.uaccess2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: remove a stray reference to asm/uaccess.h in docs sparc64: separate extable_64.h, switch elf_64.h to it score: separate extable.h, switch module.h to it mips: separate extable.h, switch module.h to it x86: separate extable.h, switch sections.h to it remove stray include of asm/uaccess.h from cacheflush.h mn10300: remove a bogus processor.h->uaccess.h include xtensa: split uaccess.h into C and asm sides bonding: quit messing with IOCTL kill __kernel_ds_p off mn10300: finish verify_area() off frv: move HAVE_ARCH_UNMAPPED_AREA to pgtable.h exceptions: detritus removal
2016-10-07Merge tag 'armsoc-soc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform updates from Arnd Bergmann: "These are updates for platform specific code on 32-bit ARM machines, essentially anything that can not (yet) be expressed using DT files. Noteworthy changes include: - We get support for running in big-endian mode on two platforms: sunxi (Allwinner) and s3c24xx (old Samsung). - The recently added Uniphier platform now uses standard PSCI methods for SMP booting and we remove support for old bootloader versions that did not support it yet. - In sunxi, we gain support for the "Nextthing GR8" SoC, which is a close relative of the Allwinner A13 and R8 chips. - PXA completes its move over to the generic dmaengine framework and removes its old private API - mach-bcm gains support for BCM47189/BCM53573, their first ARM SoC with integrated 802.11ac wireless networking" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits) ARM: imx legacy: pca100: move peripheral initialization to .init_late ARM: imx legacy: mx27ads: move peripheral initialization to .init_late ARM: imx legacy: mx21ads: move peripheral initialization to .init_late ARM: imx legacy: pcm043: move peripheral initialization to .init_late ARM: imx legacy: mx35-3ds: move peripheral initialization to .init_late ARM: imx legacy: mx27-3ds: move peripheral initialization to .init_late ARM: imx legacy: imx27-visstrim-m10: move peripheral initialization to .init_late ARM: imx legacy: vpr200: move peripheral initialization to .init_late ARM: imx legacy: mx31moboard: move peripheral initialization to .init_late ARM: imx legacy: armadillo5x0: move peripheral initialization to .init_late ARM: imx legacy: qong: move peripheral initialization to .init_late ARM: imx legacy: mx31-3ds: move peripheral initialization to .init_late ARM: imx legacy: pcm037: move peripheral initialization to .init_late ARM: imx legacy: mx31lilly: move peripheral initialization to .init_late ARM: imx legacy: mx31ads: move peripheral initialization to .init_late ARM: imx legacy: mx31lite: move peripheral initialization to .init_late ARM: imx legacy: kzm: move peripheral initialization to .init_late MAINTAINERS: update list of Oxnas maintainers ARM: orion5x: remove extraneous NO_IRQ ARM: orion: simplify orion_ge00_switch_init ...
2016-10-06Merge tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull dmaengine updates from Vinod Koul: "This is bit large pile of code which bring in some nice additions: - Error reporting: we have added a new mechanism for users of dmaenegine to register a callback_result which tells them the result of the dma transaction. Right now only one user (ntb) is using it. - As we discussed on KS mailing list and pointed out NO_IRQ has no place in kernel, this also remove NO_IRQ from dmaengine subsystem (both arm and ppc users) - Support for IOMMU slave transfers and its implementation for arm. - To get better build coverage, enable COMPILE_TEST for bunch of driver, and fix the warning and sparse complaints on these. - Apart from above, usual updates spread across drivers" * tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (169 commits) async_pq_val: fix DMA memory leak dmaengine: virt-dma: move function declarations dmaengine: omap-dma: Enable burst and data pack for SG DT: dmaengine: rcar-dmac: document R8A7743/5 support dmaengine: fsldma: Unmap region obtained by of_iomap dmaengine: jz4780: fix resource leaks on error exit return dma-debug: fix ia64 build, use PHYS_PFN dmaengine: coh901318: fix integer overflow when shifting more than 32 places dmaengine: edma: avoid uninitialized variable use dma-mapping: fix m32r build warning dma-mapping: fix ia64 build, use PHYS_PFN dmaengine: ti-dma-crossbar: enable COMPILE_TEST dmaengine: omap-dma: enable COMPILE_TEST dmaengine: edma: enable COMPILE_TEST dmaengine: ti-dma-crossbar: Fix of_device_id data parameter usage dmaengine: ti-dma-crossbar: Correct type for of_find_property() third parameter dmaengine/ARM: omap-dma: Fix the DMAengine compile test on non OMAP configs dmaengine: edma: Rename set_bits and remove unused clear_bits helper dmaengine: edma: Use correct type for of_find_property() third parameter dmaengine: edma: Fix of_device_id data parameter usage (legacy vs TPCC) ...
2016-10-06Merge branches 'misc' and 'sa1111-base' into for-linusRussell King
2016-09-27exceptions: detritus removalAl Viro
externs and defines for stuff that is never used Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-26arm: dma-mapping: add {map,unmap}_resource for iommu opsNiklas Söderlund
Add methods to map/unmap device resources addresses for dma_map_ops that are IOMMU aware. This is needed to map a device MMIO register from a physical address. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-09-20Merge tag 'uniphier-soc-v4.9' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier into next/soc Pull "UniPhier ARM SoC updates for v4.9" from Masahiro Yamada: * Remove unneeded SMP code * tag 'uniphier-soc-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier: ARM: uniphier: remove SoC-specific SMP code
2016-09-12ARM: 8612/1: LPAE: initialize cache policy correctlyStefan Agner
The cachepolicy variable gets initialized using a masked pmd value. So far, the pmd has been masked with flags valid for the 2-page table format, but the 3-page table format requires a different mask. On LPAE, this lead to a wrong assumption of what initial cache policy has been used. Later a check forces the cache policy to writealloc and prints the following warning: Forcing write-allocate cache policy for SMP This patch introduces a new definition PMD_SECT_CACHE_MASK for both page table formats which masks in all cache flags in both cases. Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8611/1: l2x0: add PMU supportMark Rutland
The L2C-220 (AKA L220) and L2C-310 (AKA PL310) cache controllers feature a Performance Monitoring Unit (PMU), which can be useful for tuning and/or debugging. This hardware is always present and the relevant registers are accessible to non-secure accesses. Thus, no special firmware interface is necessary. This patch adds support for the PMU, plugging into the usual perf infrastructure. The overflow interrupt is not always available (e.g. on RealView PBX A9 it is not wired up at all), and the hardware counters saturate, so the driver does not make use of this. Instead, the driver periodically polls and reset counters as required to avoid losing events due to saturation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8610/1: V7M: Add dsb before jumping in handler modeTorgue Alexandre
According to ARM AN321 (section 4.12): "If the vector table is in writable memory such as SRAM, either relocated by VTOR or a device dependent memory remapping mechanism, then architecturally a memory barrier instruction is required after the vector table entry is updated, and if the exception is to be activated immediately" Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8609/1: V7M: Add support for the Cortex-M7 processorJonathan Austin
Cortex-M7 is a new member of the V7M processor family that adds, among other things, caches over the features available in Cortex-M4. This patch adds support for recognising the processor at boot time, and make use of recently introduced cache functions. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8608/1: V7M: Indirect proc_info construction for V7M CPUsJonathan Austin
This patch copies the method used for V7A/R CPUs to specify differing processor info for different cores. This patch differentiates Cortex-M3 and Cortex-M4 and leaves a fallback case for any other V7M processors. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8607/1: V7M: Wire up caches for V7M processors with cache support.Jonathan Austin
This patch does the plumbing required to invoke the V7M cache code added in earlier patches in this series, although there is no users for that yet. In order to honour the I/D cache disable config options, this patch changes the mechanism by which the CCR is set on boot, to be more like V7A/R. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06ARM: 8606/1: V7M: introduce cache operationsVladimir Murzin
This commit implements the cache operation for V7M. It is based on V7 counterpart and differs as follows: - cache operations are memory mapped - only Thumb instruction set is supported - we don't handle user access faults Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-29ARM: uniphier: remove SoC-specific SMP codeMasahiro Yamada
The UniPhier architecture (32bit) switched over to PSCI. Remove the SoC-specific SMP operations. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2016-08-23ARM: 8599/1: mm: pull asm/memory.h explicitlyVladimir Murzin
Commit d78114554939a (""ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNEL"") introduced a macro which lives under asm/memory.h. Unfortunately, for MMU-less systems (like R-class) it leads to build failure: arch/arm/mm/proc-v7.S: Assembler messages: arch/arm/mm/proc-v7.S:538: Error: unrecognised relocation suffix make[1]: *** [arch/arm/mm/proc-v7.o] Error 1 make: *** [arch/arm/mm] Error 2 since it is implicitly pulled via asm/pgtable.h for MMU capable systems only. To fix it include asm/memory.h explicitly. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12ARM: 8595/2: apply more __ro_after_initKees Cook
Guided by grsecurity's analogous __read_only markings in arch/arm, this applies several uses of __ro_after_init to structures that are only updated during __init. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12ARM: 8593/1: cache-l2x0.c: Do not clear bit 23 in prefetch control registerAndrey Smirnov
As per L2C-310 TRM[1]: "... You can control this feature using bits 30,27 and 23 of the Prefetch Control Register. Bit 23 and 27 are only used if you set bit 30 HIGH..." which means there is no need to clear bit 23 if bit 30 is being cleared. [1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0246e/CJAJACBJ.html Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12ARM: 8592/1: cache-l2x0.c: Replace magic numbersAndrey Smirnov
Replace magic numbers used for L310 Prefetch Control Register Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12ARM: 8587/1: dma-mapping: Use %zu for printing a size_t variableFabio Estevam
According to Documentation/printk-formats.txt when printing a size_t variable we should use %zu or %zx format specifiers. As we are printing a memory size value, we should better use %zu in this case. Reported-by: Frank Mori Hess <fmh6jj@gmail.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-09ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocationsArd Biesheuvel
The late_alloc() PTE allocation function used by create_mapping_late() does not call pgtable_page_ctor() on PTE pages it allocates, leaving the per-page spinlock uninitialized. Since generic page table manipulation code may assume that translation table pages that are not owned by init_mm are covered by fully constructed struct pages, the following crash may occur with the new UEFI memory attributes table code. efi: memattr: Processing EFI Memory Attributes table: efi: memattr: 0x0000ffa16000-0x0000ffa82fff [Runtime Code |RUN| | |XP| | | | | | | | ] Unable to handle kernel NULL pointer dereference at virtual address 00000010 pgd = c0204000 [00000010] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361 Hardware name: Generic DT based system task: ed858000 ti: ed842000 task.ti: ed842000 PC is at __lock_acquire+0xa0/0x19a8 ... [<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88) [<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c) [<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238) [<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c) [<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378) [<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c) [<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174) [<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8) [<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114) [<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24) The crash is due to the fact that the UEFI page tables are not owned by init_mm, but are not covered by fully constructed struct pages. Given that the UEFI subsystem is currently the only user of create_mapping_late(), add an unconditional call to pgtable_page_ctor() to late_alloc(). Fixes: 9fc68b717c24 ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-09ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limitNicolas Pitre
To limit the amount of mapped low memory, we determine a physical address boundary based on the start of the vmalloc area using __pa(). Strictly speaking, the vmalloc area location is arbitrary and does not necessarily corresponds to a valid physical address. For example, if PAGE_OFFSET = 0x80000000 PHYS_OFFSET = 0x90000000 vmalloc_min = 0xf0000000 then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t is a 32-bit type. Then the code that follows determines that the entire physical memory is above that boundary and no low memory gets mapped at all: |[...] |Machine model: Freescale i.MX51 NA04 Board |Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM) |Consider using a HIGHMEM enabled kernel. To avoid this problem let's make vmalloc_limit a 64-bit value all the time and determine that boundary explicitly without using __pa(). Reported-by: Emil Renner Berthing <kernel@esmil.dk> Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Emil Renner Berthing <kernel@esmil.dk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the next part of the hotplug rework. - Convert all notifiers with a priority assigned - Convert all CPU_STARTING/DYING notifiers The final removal of the STARTING/DYING infrastructure will happen when the merge window closes. Another 700 hundred line of unpenetrable maze gone :)" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) timers/core: Correct callback order during CPU hot plug leds/trigger/cpu: Move from CPU_STARTING to ONLINE level powerpc/numa: Convert to hotplug state machine arm/perf: Fix hotplug state machine conversion irqchip/armada: Avoid unused function warnings ARC/time: Convert to hotplug state machine clocksource/atlas7: Convert to hotplug state machine clocksource/armada-370-xp: Convert to hotplug state machine clocksource/exynos_mct: Convert to hotplug state machine clocksource/arm_global_timer: Convert to hotplug state machine rcu: Convert rcutree to hotplug state machine KVM/arm/arm64/vgic-new: Convert to hotplug state machine smp/cfd: Convert core to hotplug state machine x86/x2apic: Convert to CPU hotplug state machine profile: Convert to hotplug state machine timers/core: Convert to hotplug state machine hrtimer: Convert to hotplug state machine x86/tboot: Convert to hotplug state machine arm64/armv8 deprecated: Convert to hotplug state machine hwtracing/coresight-etm4x: Convert to hotplug state machine ...
2016-07-29Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Included in this update are: - Patches from Gregory Clement to fix the coherent DMA cases in our dma-mapping code. - A number of CPU errata updates and fixes. - ARM cpuidle improvements from Jisheng Zhang. - Fix from Kees for the location of _etext. - Cleanups from Masahiro Yamada to avoid duplicated messages during the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS. - Remove a udelay loop limitation, allowing for faster CPUs to calibrate the delay correctly. - Cleanup some left-overs from the SW PAN implementation. - Ensure that a modified address limit is not visible to exception handlers" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits) ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421 ARM: 8559/1: errata: Workaround erratum A12 821420 ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423 ARM: save and reset the address limit when entering an exception ARM: 8577/1: Fix Cortex-A15 798181 errata initialization ARM: 8584/1: floppy: avoid gcc-6 warning ARM: 8583/1: mm: fix location of _etext ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS ARM: 8306/1: loop_udelay: remove bogomips value limitation ARM: 8581/1: add missing <asm/prom.h> to arch/arm/kernel/devtree.c ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready" ARM: 8556/1: on a generic DT system: do not touch l2x0 ARM: uaccess: remove put_user() code duplication ARM: 8580/1: Remove orphaned __addr_ok() definition ARM: get rid of horrible *(unsigned int *)(regs + 1) ARM: introduce svc_pt_regs structure ...
2016-07-26mm: do not pass mm_struct into handle_mm_faultKirill A. Shutemov
We always have vma->vm_mm around. Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26arm: get rid of superfluous __GFP_REPEATMichal Hocko
__GFP_REPEAT has a rather weak semantic but since it has been introduced around 2.6.12 it has been ignored for low order allocations. PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this flag is for more than order-2. This means that this flag has never been actually useful here because it has always been used only for PAGE_ALLOC_COSTLY requests. Link: http://lkml.kernel.org/r/1464599699-30131-5-git-send-email-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-15arm/l2c: Convert to hotplug state machineRichard Cochran
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Brad Mouring <brad.mouring@ni.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153336.801270887@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is usedGregory CLEMENT
When doing dma allocation with IOMMU the __iommu_alloc_atomic() was used even when the system was coherent. However, this function allocates from a non-cacheable pool, which is fine when the device is not cache coherent but won't work as expected in the device is cache coherent. Indeed, the CPU and device must access the memory using the same cacheability attributes. Moreover when the devices are coherent, the mmap call must not change the pg_prot flags in the vma struct. The arm_coherent_iommu_mmap_attrs has been updated in the same way that it was done for the arm_dma_mmap in commit 55af8a91640d ("ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap"). Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherentGregory CLEMENT
When a L2 cache controller is used in a system that provides hardware coherency, the entire outer cache operations are useless, and can be skipped. Moreover, on some systems, it is harmful as it causes deadlocks between the Marvell coherency mechanism, the Marvell PCIe controller and the Cortex-A9. In the current kernel implementation, the outer cache flush range operation is triggered by the dma_alloc function. This operation can be take place during runtime and in some circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x SoCs. This patch extends the __dma_clear_buffer() function to receive a boolean argument related to the coherency of the system. The same things is done for the calling functions. Reported-by: Nadav Haklai <nadavh@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421Doug Anderson
The workaround for both errata is to set bit 24 in the diagnostic register. There are no known end-user bugs solved by fixing this errata, but the fix is trivial and it seems sane to apply it. The arguments for why this needs to be in the kernel are similar to the arugments made in the patch "Workaround errata A12 818325/852422 A17 852423". Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14ARM: 8559/1: errata: Workaround erratum A12 821420Doug Anderson
This erratum has a very simple workaround (set a bit in a register), so let's apply it. Apparently the workaround's downside is a very slight power impact. Note that applying this errata fixes deadlocks that are easy to reproduce with real world applications. The arguments for why this needs to be in the kernel are similar to the arugments made in the patch "Workaround errata A12 818325/852422 A17 852423". Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423Doug Anderson
There are several similar errata on Cortex A12 and A17 that all have the same workaround: setting bit[12] of the Feature Register. Technically the list of errata are: - A12 818325: Execution of an UNPREDICTABLE STR or STM instruction might deadlock. Fixed in r0p1. - A12 852422: Execution of a sequence of instructions might lead to either a data corruption or a CPU deadlock. Not fixed in any A12s yet. - A17 852423: Execution of a sequence of instructions might lead to either a data corruption or a CPU deadlock. Not fixed in any A17s yet. Since A12 got renamed to A17 it seems likely that there won't be any future Cortex-A12 cores, so we'll enable for all Cortex-A12. For Cortex-A17 I believe that all known revisions are affected and that all knows revisions means <= r1p2. Presumably if a new A17 was released it would have this problem fixed. Note that in <https://patchwork.kernel.org/patch/4735341/> folks previously expressed opposition to this change because: A) It was thought to only apply to r0p0 and there were no known r0p0 boards supported in mainline. B) It was argued that such a workaround beloned in firmware. Now that this same fix solves other errata on real boards (like rk3288) point A) is addressed. Point B) is impossible to address on boards like rk3288. On rk3288 the firmware doesn't stay resident in RAM and isn't involved at all in the suspend/resume process nor in the SMP bringup process. That means that the most the firmware could do would be to set the bit on "core 0" and this bit would be lost at suspend/resume time. It is true that we could write a "generic" solution that saved the boot-time "core 0" value of this register and applied it at SMP bringup / resume time. However, since this register (described as the "Feature Register" in errata) appears to be undocumented (as far as I can tell) and is only modified for these errata, that "generic" solution seems questionably cleaner. The generic solution also won't fix existing users that haven't happened to do a FW update. Note that in ARM64 presumably PSCI will be universal and fixes like this will end up in ATF. Hopefully we are nearing the end of this style of errata workaround. Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Huang Tao <huangtao@rock-chips.com> Signed-off-by: Kever Yang <kever.yang@rock-chips.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-02ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERSMasahiro Yamada
Since commit 2b749cb3a515 ("ARM: realview: remove private barrier implementation"), this config is not used by any platform. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-20lib/GCD.c: use binary GCD algorithm instead of EuclideanZhaoxiu Zeng
The binary GCD algorithm is based on the following facts: 1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2) 2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b) 3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b) Even on x86 machines with reasonable division hardware, the binary algorithm runs about 25% faster (80% the execution time) than the division-based Euclidian algorithm. On platforms like Alpha and ARMv6 where division is a function call to emulation code, it's even more significant. There are two variants of the code here, depending on whether a fast __ffs (find least significant set bit) instruction is available. This allows the unpredictable branches in the bit-at-a-time shifting loop to be eliminated. If fast __ffs is not available, the "even/odd" GCD variant is used. I use the following code to benchmark: #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <string.h> #include <time.h> #include <unistd.h> #define swap(a, b) \ do { \ a ^= b; \ b ^= a; \ a ^= b; \ } while (0) unsigned long gcd0(unsigned long a, unsigned long b) { unsigned long r; if (a < b) { swap(a, b); } if (b == 0) return a; while ((r = a % b) != 0) { a = b; b = r; } return b; } unsigned long gcd1(unsigned long a, unsigned long b) { unsigned long r = a | b; if (!a || !b) return r; b >>= __builtin_ctzl(b); for (;;) { a >>= __builtin_ctzl(a); if (a == b) return a << __builtin_ctzl(r); if (a < b) swap(a, b); a -= b; } } unsigned long gcd2(unsigned long a, unsigned long b) { unsigned long r = a | b; if (!a || !b) return r; r &= -r; while (!(b & r)) b >>= 1; for (;;) { while (!(a & r)) a >>= 1; if (a == b) return a; if (a < b) swap(a, b); a -= b; a >>= 1; if (a & r) a += b; a >>= 1; } } unsigned long gcd3(unsigned long a, unsigned long b) { unsigned long r = a | b; if (!a || !b) return r; b >>= __builtin_ctzl(b); if (b == 1) return r & -r; for (;;) { a >>= __builtin_ctzl(a); if (a == 1) return r & -r; if (a == b) return a << __builtin_ctzl(r); if (a < b) swap(a, b); a -= b; } } unsigned long gcd4(unsigned long a, unsigned long b) { unsigned long r = a | b; if (!a || !b) return r; r &= -r; while (!(b & r)) b >>= 1; if (b == r) return r; for (;;) { while (!(a & r)) a >>= 1; if (a == r) return r; if (a == b) return a; if (a < b) swap(a, b); a -= b; a >>= 1; if (a & r) a += b; a >>= 1; } } static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = { gcd0, gcd1, gcd2, gcd3, gcd4, }; #define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0])) #if defined(__x86_64__) #define rdtscll(val) do { \ unsigned long __a,__d; \ __asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \ (val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \ } while(0) static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long), unsigned long a, unsigned long b, unsigned long *res) { unsigned long long start, end; unsigned long long ret; unsigned long gcd_res; rdtscll(start); gcd_res = gcd(a, b); rdtscll(end); if (end >= start) ret = end - start; else ret = ~0ULL - start + 1 + end; *res = gcd_res; return ret; } #else static inline struct timespec read_time(void) { struct timespec time; clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time); return time; } static inline unsigned long long diff_time(struct timespec start, struct timespec end) { struct timespec temp; if ((end.tv_nsec - start.tv_nsec) < 0) { temp.tv_sec = end.tv_sec - start.tv_sec - 1; temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec; } else { temp.tv_sec = end.tv_sec - start.tv_sec; temp.tv_nsec = end.tv_nsec - start.tv_nsec; } return temp.tv_sec * 1000000000ULL + temp.tv_nsec; } static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long), unsigned long a, unsigned long b, unsigned long *res) { struct timespec start, end; unsigned long gcd_res; start = read_time(); gcd_res = gcd(a, b); end = read_time(); *res = gcd_res; return diff_time(start, end); } #endif static inline unsigned long get_rand() { if (sizeof(long) == 8) return (unsigned long)rand() << 32 | rand(); else return rand(); } int main(int argc, char **argv) { unsigned int seed = time(0); int loops = 100; int repeats = 1000; unsigned long (*res)[TEST_ENTRIES]; unsigned long long elapsed[TEST_ENTRIES]; int i, j, k; for (;;) { int opt = getopt(argc, argv, "n:r:s:"); /* End condition always first */ if (opt == -1) break; switch (opt) { case 'n': loops = atoi(optarg); break; case 'r': repeats = atoi(optarg); break; case 's': seed = strtoul(optarg, NULL, 10); break; default: /* You won't actually get here. */ break; } } res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops); memset(elapsed, 0, sizeof(elapsed)); srand(seed); for (j = 0; j < loops; j++) { unsigned long a = get_rand(); /* Do we have args? */ unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand(); unsigned long long min_elapsed[TEST_ENTRIES]; for (k = 0; k < repeats; k++) { for (i = 0; i < TEST_ENTRIES; i++) { unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]); if (k == 0 || min_elapsed[i] > tmp) min_elapsed[i] = tmp; } } for (i = 0; i < TEST_ENTRIES; i++) elapsed[i] += min_elapsed[i]; } for (i = 0; i < TEST_ENTRIES; i++) printf("gcd%d: elapsed %llu\n", i, elapsed[i]); k = 0; srand(seed); for (j = 0; j < loops; j++) { unsigned long a = get_rand(); unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand(); for (i = 1; i < TEST_ENTRIES; i++) { if (res[j][i] != res[j][0]) break; } if (i < TEST_ENTRIES) { if (k == 0) { k = 1; fprintf(stderr, "Error:\n"); } fprintf(stderr, "gcd(%lu, %lu): ", a, b); for (i = 0; i < TEST_ENTRIES; i++) fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n"); } } if (k == 0) fprintf(stderr, "PASS\n"); free(res); return 0; } Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got: zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10 gcd0: elapsed 10174 gcd1: elapsed 2120 gcd2: elapsed 2902 gcd3: elapsed 2039 gcd4: elapsed 2812 PASS zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10 gcd0: elapsed 9309 gcd1: elapsed 2280 gcd2: elapsed 2822 gcd3: elapsed 2217 gcd4: elapsed 2710 PASS zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10 gcd0: elapsed 9589 gcd1: elapsed 2098 gcd2: elapsed 2815 gcd3: elapsed 2030 gcd4: elapsed 2718 PASS zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10 gcd0: elapsed 9914 gcd1: elapsed 2309 gcd2: elapsed 2779 gcd3: elapsed 2228 gcd4: elapsed 2709 PASS [akpm@linux-foundation.org: avoid #defining a CONFIG_ variable] Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com> Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Changes included in this pull request: - revert pxa2xx-flash back to using ioremap_cached() and switch memremap() to use arch_memremap_wb() - remove pci=firmware command line argument handling - remove unnecessary arm_dma_set_mask() implementation, the generic implementation will do for ARM - removal of the ARM kallsyms "hack" to work around mode switching veneers and vectors located below PAGE_OFFSET - tidy up build system output a little - add L2 cache power management DT bindings - remove duplicated local_irq_disable() in reboot paths - handle AMBA primecell devices better at registration time with PM domains (needed for Samsung SoCs) - ARM specific preparation to support Keystone II kexec" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs ARM: 8570/2: Documentation: devicetree: Add PL310 PM bindings ARM: 8569/1: pl2x0: Add OF control of cache power management ARM: 8568/1: reboot: remove duplicated local_irq_disable() ARM: 8566/1: drivers: amba: properly handle devices with power domains ARM: provide arm_has_idmap_alias() helper ARM: kexec: remove 512MB restriction on kexec crashdump ARM: provide improved virt_to_idmap() functionality ARM: kexec: fix crashkernel= handling ARM: 8557/1: specify install, zinstall, and uinstall as PHONY targets ARM: 8562/1: suppress "include/generated/mach-types.h is up to date." ARM: 8553/1: kallsyms: remove --page-offset command line option ARM: 8552/1: kallsyms: remove special lower address limit for CONFIG_ARM ARM: 8555/1: kallsyms: ignore ARM mode switching veneers ARM: 8548/1: dma-mapping: remove arm_dma_set_mask() ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handling ARM: memremap: implement arch_memremap_wb() memremap: add arch specific hook for MEMREMAP_WB mappings mtd: pxa2xx-flash: switch back from memremap to ioremap_cached ARM: reintroduce ioremap_cached() for creating cached I/O mappings
2016-05-19Merge branches 'amba', 'devel-stable', 'kexec-for-next' and 'misc' into ↵Russell King
for-linus
2016-05-09iommu: of: enforce const-ness of struct iommu_opsRobin Murphy
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-07Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "These are a number of updates to fix a few problems found in the ARM nommu code over the last couple of years, caused mostly by changes on the mmu side" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8573/1: domain: move {set,get}_domain under config guard ARM: 8572/1: nommu: change memory reserve for the vectors ARM: 8571/1: nommu: fix PMSAv7 setup
2016-05-05ARM: 8567/1: cache-uniphier: activate ways for secondary CPUsMasahiro Yamada
This outer cache allows to control active ways independently for each CPU, but currently nothing is done for secondary CPUs. In other words, all the ways are locked for secondary CPUs by default. This commit fixes it to fully bring out the performance of this outer cache. There would be two possible ways to achieve this: [1] Each CPU initializes active ways for itself. This can be done via the SSCLPDAWCR register. This is a banked register, so each CPU sees a different instance of the register for its own. [2] The master CPU initializes active ways for all the CPUs. This is available via SSCDAWCARMR(N) registers, where all instances of SSCLPDAWCR are mirrored. They are mapped at the address SSCDAWCARMR + 4 * N, where N is the CPU number. The outer cache frame work does not support a per-CPU init callback. So this commit adopts [2]; the master CPU iterates over possible CPUs setting up SSCDAWCARMR(N) registers. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05ARM: 8572/1: nommu: change memory reserve for the vectorsJean-Philippe Brucker
Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an additional page above the base vector one. This change wasn't taken into account by the nommu memreserve. This patch ensures that the kernel won't overwrite any vector stub on nommu. [changed the MPU side too] Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>