Change-Id: I318afbe66efa346b71e82413ac6442672cef4d36
Reviewed-on: http://git-master/r/21196
Reviewed-by: Jonathan B White (Engrg-Mobile) <jwhite@nvidia.com>
Tested-by: Jonathan B White (Engrg-Mobile) <jwhite@nvidia.com>
Reviewed-by: Maria Gutowski <mgutowski@nvidia.com>
video:tegra:nvmap: Clean whole L1 instead of cleaning by MVA
For large allocations, cleaning each page of the allocation can
take a significant amount of time. If an allocation that nvmap needs
to clean or invalidate out of the cache is significantly larger than
the cache, just flush the entire cache by set/ways.
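A minimal sketch of the size check being described; the helper and
threshold names below are illustrative, not the actual nvmap code
(flush_cache_all() and __cpuc_flush_dcache_area() come from
asm/cacheflush.h):

static size_t cache_maint_threshold = 2 * SZ_1M;  /* assumed: a few times the cache size */

static void cache_maint_sketch(void *vaddr, size_t size)
{
        /* if the range dwarfs the caches, flushing by set/way is cheaper */
        if (size >= cache_maint_threshold) {
                flush_cache_all();
                return;
        }
        /* otherwise clean/invalidate the range by MVA */
        __cpuc_flush_dcache_area(vaddr, size);
}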
bug 788967
Reviewed-on: http://git-master/r/19354
(cherry picked from commit c01c12e63b1476501204152356867aeb5091fb80)
tegra:video:nvmap: optimize cache_maint operation.
Optimize the cache_maint operation for carveout and heap memories.
Flush carveout memory allocations on memory free.
Bug 761637
Reviewed-on: http://git-master/r/21205
Conflicts:
drivers/video/tegra/nvmap/nvmap_dev.c
drivers/video/tegra/nvmap/nvmap_heap.c
drivers/video/tegra/nvmap/nvmap_ioctl.c
(cherry picked from commit 731df4df5e895e1d4999359d6d5939fc2095f883)
tegra:video:nvmap: optimize cache flush for system heap pages.
Optimize the cache flush for pages allocated from the system heap.
Bug 788187
Reviewed-on: http://git-master/r/21687
(cherry picked from commit 3f318911ad91410aed53c90494210e2b8f74308b)
Change-Id: Ia7b90ba0b50acfef1b88dd8095219c51733e027f
Reviewed-on: http://git-master/r/23465
Reviewed-by: Kirill Artamonov <kartamonov@nvidia.com>
Tested-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Conflicts:
arch/arm/mach-tegra/Makefile
arch/arm/mach-tegra/fuse.c
arch/arm/mach-tegra/fuse.h
arch/arm/mach-tegra/kfuse.c
arch/arm/mach-tegra/tegra2_clocks.c
drivers/video/tegra/dc/Makefile
drivers/video/tegra/dc/hdmi.c
drivers/video/tegra/dc/hdmi.h
drivers/video/tegra/dc/nvhdcp.c
Change-Id: I60a025d9e23e0699afcfaf9e3e42a98263cd7de8
Change-Id: I1d7f83e8eb433df8076a9d636ff03e174a3ff581
Enable dynamic high-level clock gating for Cortex-A9 CPUs, as
described in section 2.3.3 "Dynamic high level clock gating" of the
Cortex-A9 TRM. This may cut the clock of the integer core,
system control block, and Data Engine under certain conditions.
Add a workaround for ARM erratum 720791 to avoid corrupting the
Jazelle instruction stream on earlier Cortex-A9 revisions.
Change-Id: I48e51d907e593f26982ea91b0a811553f68e3c86
Signed-off-by: Todd Poynor <toddpoynor@google.com>
Conflicts:
arch/arm/mach-tegra/fuse.c
drivers/misc/Makefile
Change-Id: I300b925d78b31efe00c342190d8dbd50e2e81230
Conflicts:
arch/arm/mm/cache-v6.S
Change-Id: I1a2063218dd705a762a40f4a9dfe504ce1a1d491
commit 85b093bcc5322baa811a03ec73de0909c157f181 upstream.
Cache ownership must be acquired by reading/writing data from the
cache line to make cache operations have the desired effect on the
SMP MPCore CPU. However, ownership is never acquired in the
v6_dma_inv_range function when cleaning the first line and
flushing the last one, in the case where the address is not aligned
to a D_CACHE_LINE_SIZE boundary.
Fix this by reading/writing data, if needed, before performing the
cache operations.
While at it, fix v6_dma_flush_range to prevent RWFO outside
the buffer.
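A conceptual C rendering of the change (the real fix is in the assembly
of arch/arm/mm/cache-v6.S; clean_dcache_line() below stands in for the
MCR that cleans a single line):

static void dma_inv_range_sketch(unsigned long start, unsigned long end)
{
        if (start & (D_CACHE_LINE_SIZE - 1)) {
                unsigned long line = start & ~(D_CACHE_LINE_SIZE - 1);

                /* read+write a word so this CPU owns the line (RWFO) */
                *(volatile unsigned long *)line = *(volatile unsigned long *)line;
                clean_dcache_line((void *)line);  /* keep data outside the range */
                start = line + D_CACHE_LINE_SIZE;
        }
        /* ...do the same for a partial last line, then invalidate the
         * fully covered lines in between by MVA... */
}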
Signed-off-by: Valentine Barshak <vbarshak@mvista.com>
Signed-off-by: George G. Davis <gdavis@mvista.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Conflicts:
arch/arm/mach-tegra/board-ventana-power.c
drivers/mfd/tps6586x.c
Change-Id: Ic8c46d4251d6e71fa2900b7e876f87e256299bc4
Conflicts:
drivers/usb/gadget/composite.c
Change-Id: I1a332ec21da62aea98912df9a01cf0282ed50ee1
commit 4e54d93d3c9846ba1c2644ad06463dafa690d1b7 upstream.
When running the following code on a machine which has VIVT caches and
USE_SPLIT_PTLOCKS is not defined:
fd = open("/etc/passwd", O_RDONLY);
addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
v = *((int *)addr);
we will hang in spinlock recursion in the page fault handler:
BUG: spinlock recursion on CPU#0, mmap_test/717
lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717, .owner_cpu: 0
[<c0026604>] (unwind_backtrace+0x0/0xec)
[<c014ee48>] (do_raw_spin_lock+0x40/0x140)
[<c0027f68>] (update_mmu_cache+0x208/0x250)
[<c0079db4>] (__do_fault+0x320/0x3ec)
[<c007af7c>] (handle_mm_fault+0x2f0/0x6d8)
[<c0027834>] (do_page_fault+0xdc/0x1cc)
[<c00202d0>] (do_DataAbort+0x34/0x94)
This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
the only lock protecting the page tables is mm->page_table_lock
which is already locked before update_mmu_cache() is called.
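A sketch in the spirit of the fix: only take the PTE lock when
USE_SPLIT_PTLOCKS actually provides a separate lock (helper names are
illustrative):

#if USE_SPLIT_PTLOCKS
static inline void do_pte_lock(spinlock_t *ptl)
{
        /* a distinct per-page-table lock exists, so it is safe to take it;
         * use the nested annotation since a similar lock is already held */
        spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
}
static inline void do_pte_unlock(spinlock_t *ptl)
{
        spin_unlock(ptl);
}
#else
/* mm->page_table_lock is already held by the page fault path: do nothing */
static inline void do_pte_lock(spinlock_t *ptl) {}
static inline void do_pte_unlock(spinlock_t *ptl) {}
#endif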
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Change-Id: Idabf0fb5eed1dee0d1329c6588524ab828c278ab
Conflicts:
arch/arm/mach-tegra/tegra_i2s_audio.c
Change-Id: I3e05a70e3fb8fdaa8ca4c5ed78ca020c75ed0caa
Based on a patch by rmk on lkml at http://lkml.org/lkml/2010/10/11/179.
Reverts changes to find_limits to fix a crash when using memblock_remove
on the end of memory.
Original-author: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Colin Cross <ccross@android.com>
Change-Id: I6137a7939329381e4ed34bfcdc8b713dc50ebcc8
Conflicts:
arch/arm/configs/tegra_defconfig
arch/arm/mach-tegra/tegra_i2s_audio.c
Change-Id: Ib0d7fc5c84b21a58f78a4a987c245e0e110ff437
This reverts commit 54d414570432ce07fa1a14b657f53bed752e3d7e.
Change-Id: I8e5cf6ef3555129da9741ef52a1e6a3a772ad588
Signed-off-by: Gary King <gking@nvidia.com>
The current flush_dcache_page() implementation does not allow lazy cache
flushing for highmem pages (introduced by commit d73cd42) on the
assumption that the temporary kmap mapping would disappear. A subsequent
commit (7e5a69e) allows __flush_dcache_page() to handle highmem pages, so
we can allow lazy cache flushing even for highmem pages.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
ARMv7 processors like Cortex-A9 broadcast the cache maintenance
operations in hardware. This patch allows the
flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode
similar to the UP case.
Note that cache flushing on SMP systems now takes place via the
set_pte_at() call (__sync_icache_dcache) and there is no race with other
CPUs executing code from the new PTE before the cache flushing has
taken place.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
On SMP systems, there is a small chance of a PTE becoming visible to a
different CPU before the cache maintenance operations in
update_mmu_cache(). This patch follows the IA-64 and PowerPC approach of
synchronising the I and D caches via the set_pte_at() function. In this
case, there is no need for update_mmu_cache() to be implemented since
lazy cache flushing is already handled by the time this function is
called.
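A simplified sketch of where the synchronisation now happens; the real
macro in asm/pgtable.h carries more detail, and pte_present_user() is
used here as shorthand for "a present user mapping":

/* simplified sketch, not the exact kernel macro */
#define set_pte_at(mm, addr, ptep, pteval)              \
        do {                                            \
                if (pte_present_user(pteval))           \
                        __sync_icache_dcache(pteval);   \
                set_pte_ext(ptep, pteval, 0);           \
        } while (0)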
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers, including the USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flushes the D-cache for a
newly mapped page in update_mmu_cache().
The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
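A sketch of the resulting lazy-flush test; PG_dcache_clean aliases the
PG_arch_1 page flag, and the wrapper function name is illustrative:

#define PG_dcache_clean PG_arch_1       /* new meaning of the arch page flag */

static void sync_dcache_sketch(struct address_space *mapping, struct page *page)
{
        /* flush only the first time the page is mapped; later mappings
         * find the bit already set and skip the D-cache maintenance */
        if (!test_and_set_bit(PG_dcache_clean, &page->flags))
                __flush_dcache_page(mapping, page);
}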
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
This reverts commit ff6c5cd434c779b6b3e8140a5bd5c30793a6123f.
This reverts commit ac21b321048091bdbf45bbda87161cc9f312c393.
When allocating uncached pages, the outer cache should be flushed, and
the end address should be specified in bytes, not in pages.
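A sketch of the corrected call: the outer cache is flushed over the byte
extent of the allocation (the helper and variable names below are
illustrative, not the actual nvmap code):

static void flush_uncached_pages_sketch(struct page *page, unsigned int nr_pages)
{
        unsigned long phys = page_to_phys(page);

        /* end address is in bytes, not in pages */
        outer_flush_range(phys, phys + (nr_pages << PAGE_SHIFT));
}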
Change-Id: I3fe036f4f7e10e009f96567e3afeeef6ea603240
Signed-off-by: Gary King <gking@nvidia.com>
... but produce a big warning about the problem as encouragement
for people to fix their drivers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For streaming-style operations (e.g., software rendering of graphics
surfaces shared with non-coherent DMA devices), the cost of performing
L2 cache maintenance can exceed the benefit of having the larger cache
(this is particularly true for OUTER_CACHE configurations like the ARM
PL2x0).
This change uses the currently-unused mapping 5 (TEX[0]=1, C=0, B=1)
in the TEX remapping tables as an inner-writeback-write-allocate, outer
non-cacheable memory type, so that this mapping will be available to
clients which will benefit from the reduced L2 maintenance.
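Purely illustrative: one way such a memory type could be exposed to
clients (the macro and flag names below are assumptions, not necessarily
what the patch adds):

#define L_PTE_MT_INNER_WB       (0x05 << 2)     /* TEX[0]=1, C=0, B=1 */

#define pgprot_inner_writeback(prot) \
        __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_INNER_WB)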
Change-Id: Iaec3314a304eab2215100d991b1e880b676ac906
Signed-off-by: Gary King <gking@nvidia.com>
the "streaming" mode optimization which skips cacheline allocation
for fully-dirty lines is frequently defeated when coherent processors
perfom stores simultaneously
this results in cachelines being allocated in SMP which are not
allocated when run in uniprocessor, resulting in a significant
reduction in aggregate write bandwidth. for example, on Tegra 2
systems with 300MHz DDR main memory, running memset over a large
buffer (i.e., L2 miss) on a single processor will achieve ~2GB/sec
of write bandwidth, but if the same operation is run in parallel on
both CPUs, the aggregate write bandwidth is just 500MB/sec
changing the cache allocation policy to read-allocate reduces some
of this performance loss on SMP systems.
Change-Id: Ice47ab0a15f2490b7e9a007b4b37800566ed7be1
Signed-off-by: Gary King <gking@nvidia.com>
ARM CPUs with speculative prefetching have undefined behavior when the
same physical page is mapped to two different virtual addresses with
conflicting cache attributes.
Since many recent systems include IOMMU functionality (i.e., remapping
of discontiguous physical pages into a virtually-contiguous address
range for I/O devices), it is desirable to support allocating any
available OS memory for use by the I/O devices. However, since many
systems do not support cache coherency between the CPU and DMA devices,
these devices are left either using DMA-coherent allocations from the OS
(which severely limits the benefit of an IOMMU) or performing cache
maintenance (which can be a severe performance loss, particularly on
systems with outer caches, compared to using DMA-coherent memory).
This change adds an API for allocating pages from the OS with specific
cache maintenance properties and ensures that the kernel's mapping
of the page reflects the desired cache attributes, in line with the
ARMv7 architectural requirements.
Change-Id: If0bd3cfe339b9a9b10fd6d45a748cd5e65931cf0
Signed-off-by: Gary King <gking@nvidia.com>
Add a kernel configuration option to map the kernel's lowmem pages using
PTE mappings, rather than the default behavior of 1MiB section mappings.
On ARMv7 processors, to support allocating pages with DMA-coherent
cache attributes, the cache attributes specified in the kernel's
mapping must match the cache attributes specified for other mappings;
to ensure that this is the case, the kernel's attributes must be
specified on a per-page basis.
To avoid problems caused by the init_mm page table allocations exceeding
the available initial memory, when this config is enabled lowmem is
initially mapped using sections (matching the current behavior), then
remapped using pages after bootmem is initialized.
Change-Id: I8a6feba1d6806d007e17d9d4616525b0446c0fb1
Signed-off-by: Gary King <gking@nvidia.com>
Commit 14eff1812679c76564b775aa95cdd378965f6cfb added proper
detection for ARM11MPCore/Cortex-A9 instead of detecting them
as ARMv7. However, it was missing the HWCAP_TLS flag.
HWCAP_TLS is needed if support for earlier ARMv6 is compiled
into the same kernel. Without the HWCAP_TLS flag, userspace
won't work unless nosmp is specified:
Kernel panic - not syncing: Attempted to kill init!
CPU0: stopping
[<c005d5e4>] (unwind_backtrace+0x0/0xec) from [<c004c2f8>] (do_IPI+0xfc/0x184)
[<c004c2f8>] (do_IPI+0xfc/0x184) from [<c03f25bc>] (__irq_svc+0x9c/0x160)
Exception stack(0xc0565f80 to 0xc0565fc8)
5f80: 00000001 c05772a0 00000000 00003a61 c0564000 c05cf500 c003603c c0578600
5fa0: 80033ef0 410fc091 0000001f 00000000 00000000 c0565fc8 c00b91f8 c0057cb4
5fc0: 20000013 ffffffff
[<c03f25bc>] (__irq_svc+0x9c/0x160) from [<c0057cb4>] (default_idle+0x30/0x38)
[<c0057cb4>] (default_idle+0x30/0x38) from [<c005829c>] (cpu_idle+0x9c/0xf8)
[<c005829c>] (cpu_idle+0x9c/0xf8) from [<c0008d48>] (start_kernel+0x2a4/0x300)
[<c0008d48>] (start_kernel+0x2a4/0x300) from [<80008084>] (0x80008084)
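Conceptually, the missing piece is just advertising the capability; the
actual change belongs in the proc_info entries for these cores
(arch/arm/mm/proc-v7.S), so the line below is only a sketch:

elf_hwcap |= HWCAP_TLS;         /* report TLS support to userspace */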
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARMv7 processors like Cortex-A9 broadcast the cache maintenance
operations in hardware. The patch adds CPU ID checks for this
feature and allows the flush_dcache_page/update_mmu_cache pair to work
in lazy flushing mode, similarly to the UP case.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
drivers/input/touchscreen/Kconfig
Change-Id: Ifc75266e258f9513d78c47c12e2f1de1d2344f02
Some development platforms may have issues with this controller, so
allow easy disabling from the kernel command line. The patch also adds
a check for l2x0_disabled in the realview_pbx.c code to avoid setting
additional L2x0 registers.
Change-Id: Icbbd3e054688811200a4c96bf7e0a81c9c0ab790
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add shutdown and restart functions to the L2X0 outer cache controller,
so that machines which need to flush and disable the outer cache
controller prior to executing the architecture reset or platform
suspend code can do so.
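A sketch of the kind of flush-and-disable sequence involved, using the
register offsets from asm/hardware/cache-l2x0.h and the l2x0_base /
l2x0_way_mask statics from cache-l2x0.c; the function actually added may
differ in name and detail:

static void l2x0_shutdown_sketch(void)
{
        /* clean+invalidate every way, wait for completion, then sync */
        writel_relaxed(l2x0_way_mask, l2x0_base + L2X0_CLEAN_INV_WAY);
        while (readl_relaxed(l2x0_base + L2X0_CLEAN_INV_WAY) & l2x0_way_mask)
                cpu_relax();
        writel_relaxed(0, l2x0_base + L2X0_CACHE_SYNC);

        /* switch the controller off before reset/suspend proceeds */
        writel_relaxed(0, l2x0_base + L2X0_CTRL);
}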
Change-Id: I042aae121e7ba75223ed502afb4d118b0441597e
Signed-off-by: Gary King <gking@nvidia.com>
Signed-off-by: Colin Cross <ccross@android.com>
Commit f1a2481c0 sets up the default flags for the MT_MEMORY and
MT_MEMORY_NONCACHED memory types. The L_PTE_USER flag is wrongly
set as a default for these entries, so remove it. Also add the
'L_PTE_WRITE' flag so that these pages become read-write instead
of just being read-only.
[this stops them being exposed to userspace, which is the main
concern here --rmk]
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
data corruption
On the r2p0, r2p1 and r2p2 versions of the Cortex-A9, data corruption
can occur under very rare conditions due to a store buffer optimisation.
This workaround sets a bit in the diagnostic register of the Cortex-A9,
disabling the optimisation and preventing the problem from occurring.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With the PL310 L2 cache controller, the cache maintenance by PA and sync
operations are atomic and do not require a "wait" loop or spinlocks.
This patch conditionally defines the cache_wait() function and locking
primitives (rather than duplicating the functions or file).
Since L2x0 cache controllers do not work with ARMv7 CPUs, the patch
automatically enables CACHE_PL310 when CPU_V7 is defined.
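A minimal sketch of the conditional helper being described, modeled on
arch/arm/mm/cache-l2x0.c (details may differ from the actual patch):

#ifdef CONFIG_CACHE_PL310
static inline void cache_wait(void __iomem *reg, unsigned long mask)
{
        /* PL310 maintenance-by-PA and sync operations are atomic: no polling */
}
#else
static inline void cache_wait(void __iomem *reg, unsigned long mask)
{
        /* older L2x0 controllers: poll until the operation completes */
        while (readl_relaxed(reg) & mask)
                cpu_relax();
}
#endif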
Change-Id: I23e8fc326e6c42e7b36c7b67393fa91576692b48
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
If CACHE_FLUSH_RANGE_LIMIT is defined, then the entire dcache will
be flushed if the requested range is larger than this limit.
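A conceptual C sketch of the check; the real change is in the ARM
assembly flush routines, and the function name here is illustrative:

static void dma_flush_range_sketch(const void *start, const void *end)
{
#ifdef CACHE_FLUSH_RANGE_LIMIT
        if ((const char *)end - (const char *)start >= CACHE_FLUSH_RANGE_LIMIT) {
                flush_cache_all();      /* cheaper than walking a huge range */
                return;
        }
#endif
        /* ...otherwise clean/invalidate [start, end) line by line... */
}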
Change-Id: I29277d645a9d6716b1952cf3b870c78496261dd0
Signed-off-by: Arve Hjønnevåg <arve@android.com>
This patch populates the L1 entries for the MT_MEMORY and MT_MEMORY_NONCACHED
types so that, at boot-up, we can map memories outside system memory
at page-level granularity.
Previously the mapping was limited to section level, which creates
unnecessary additional mappings for which physical memory may not be
present. On newer ARM processors with speculation, this is dangerous and
can result in untraceable aborts.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When the policy for user space is to ignore misaligned accesses from user
space, the processor then performs a documented rotation on the accessed
data. This is the result of the access being trapped, and the kernel
disabling the alignment trap before returning to user space again.
In kernel space we always want misaligned accesses to be fixed up. This
is enforced by always re-enabling the alignment trap on every entry into
kernel space from user space. No such re-enabling is performed when an
exception occurs while already in kernel space as the alignment trap is
always supposed to be enabled in that case.
There is however a small race window when a misaligned access in user
space is trapped and the alignment trap disabled, but the CPU didn't
return to user space just yet. Any exception would be entered from kernel
space at that point and the kernel would then execute with the alignment
trap disabled.
Thanks to Maxime Bizon <mbizon@freebox.fr> for providing a test module
that made this issue reproducible.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARMv7 onwards requires that there are no aliases to the same physical
location using different memory types (i.e. Normal vs Strongly Ordered).
Access to SO mappings when the unaligned accesses are handled in
hardware is also Unpredictable (pgprot_noncached() mappings in user
space).
The /dev/mem driver requires uncached mappings with O_SYNC. The patch
implements the phys_mem_access_prot() function which generates Strongly
Ordered memory attributes if !pfn_valid() (independent of O_SYNC) and
Normal Noncacheable (writecombine) if O_SYNC.
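A sketch of the mapping policy described, using the standard ARM pgprot
helpers (close to, though not guaranteed to match, the actual
implementation):

pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
                              unsigned long size, pgprot_t vma_prot)
{
        if (!pfn_valid(pfn))
                return pgprot_noncached(vma_prot);      /* Strongly Ordered */
        else if (file->f_flags & O_SYNC)
                return pgprot_writecombine(vma_prot);   /* Normal Noncacheable */

        return vma_prot;        /* normal cacheable mapping otherwise */
}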
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Setting these bits can cause issues on other SMP SoCs not produced
by ARM.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Daniel Walker <dwalker@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
corruption
On the r2p0, r2p1 and r2p2 versions of the Cortex-A9, data corruption
can occur if a shared cache line is replaced on one CPU as another CPU
is accessing it.
This workaround sets two bits in the diagnostic register of the Cortex-A9,
reducing the linefill issuing capabilities of the processor and
avoiding the erroneous behaviour.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>