There are a large number of retries on the per-cpu event device when
running 'cat /proc/timer_list', as shown below:
root@~$ cat /proc/timer_list
Timer List Version: v0.6
HRTIMER_MAX_CLOCK_BASES: 3
now at 3297691988044 nsecs
Tick Device: mode: 1
Per CPU device: 0
Clock Event Device: local_timer
max_delta_ns: 8624432320
min_delta_ns: 1000
mult: 2138893713
shift: 32
mode: 3
next_event: 3297700000000 nsecs
set_next_event: twd_set_next_event
set_mode: twd_set_mode
event_handler: hrtimer_interrupt
retries: 36383
The reason is that the local timer stops in the C3 state, so we need to
switch the local timer to the broadcast (bc) timer when entering that state
and switch back when exiting it. The code is like this:
void arch_idle(void)
{
....
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
enter_the_wait_mode();
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
}
When the broadcast timer interrupt arrives (this interrupt just wakes up
the ARM core, which has no chance to handle it since local IRQs are
disabled; in fact they are disabled in cpu_idle() of
arch/arm/kernel/process.c), it wakes up the CPU and runs:
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu); ->
tick_broadcast_oneshot_control(...);
->
tick_program_event(dev->next_event, 1);
->
tick_dev_program_event(dev, expires, force);
->
for (i = 0;;) {
int ret = clockevents_program_event(dev, expires, now);
if (!ret || !force)
return ret;
dev->retries++;
....
now = ktime_get();
expires = ktime_add_ns(now, dev->min_delta_ns);
}
clockevents_program_event(dev, expires, now);
delta = ktime_to_ns(ktime_sub(expires, now));
if (delta <= 0)
return -ETIME;
When the bc timer interrupt arrives, the last local timer has already
expired as well, so clockevents_program_event() returns -ETIME, which
increments dev->retries each time the expired timer is re-programmed.
Worse, if after re-programming the expired timer the CPU enters idle
again before the re-programmed timer expires, the system will ping-pong
forever if no other interrupt happens.
We have seen this ping-pong issue during video play-back testing: the
system freezes and the video stops playing for some time, until another
interrupt occurs and breaks the error condition.
For detailed information, please refer to the LKML thread posted by
Jason Liu: https://lkml.org/lkml/2013/2/20/216
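As an illustration, below is a minimal userspace sketch (not kernel code;
the types, constants and the non-advancing fake clock are simplified
assumptions) of the forced-reprogram loop quoted above, showing how a
single late broadcast wakeup already bumps dev->retries:

#include <stdio.h>
#include <stdint.h>

#define ETIME 62

struct clock_event_device {
	uint64_t next_event;		/* ns */
	uint64_t min_delta_ns;
	unsigned long retries;
};

static uint64_t now_ns;			/* fake clock; does not advance here */

static int clockevents_program_event(struct clock_event_device *dev,
				     uint64_t expires, uint64_t now)
{
	int64_t delta = (int64_t)(expires - now);

	if (delta <= 0)
		return -ETIME;		/* event already in the past */
	dev->next_event = expires;
	return 0;
}

/* Mirrors the forced-retry loop: push the event min_delta_ns into the
 * future and count a retry each time programming an expired event fails. */
static int tick_dev_program_event(struct clock_event_device *dev,
				  uint64_t expires, int force)
{
	for (;;) {
		int ret = clockevents_program_event(dev, expires, now_ns);

		if (!ret || !force)
			return ret;
		dev->retries++;
		expires = now_ns + dev->min_delta_ns;
	}
}

int main(void)
{
	struct clock_event_device dev = { .min_delta_ns = 1000 };

	/* Broadcast wakeup arrives just after the local next_event expired. */
	now_ns = 3297700000100ULL;
	tick_dev_program_event(&dev, 3297700000000ULL, 1);
	printf("retries after one late wakeup: %lu\n", dev.retries);
	return 0;
}

Every idle/wakeup cycle that finds the local event already expired adds
another retry, which is how counts like the 36383 above accumulate.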
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason Liu <r64343@freescale.com>
Tested-by: Jason Liu <r64343@freescale.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
|
Change-Id: Icb73d60324ad0ddfc3e8a450a28bb3d90c702788
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Ib91208a2c33621aa2d7bd9aa72bfbc670d9d5f1d
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: I520dfb6e593dac131de8b9b1db77f1c734f18c24
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Iff964ba2f6236ed81863e02ec7b3ec9fbc48044a
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: If2cb7928d0711f48348443d882a12416be9c5910
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
When PowerDown was requested at the same time as ProgBit, the
formatter flush command that follows could get stuck.
Change-Id: Iafb665f61f055819e64ca1dcb60398c656f593e4
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Without this change I saw an 18% increase in idle power consumption
on one device when trace support is compiled into the kernel. Now
I see the same increase only when tracing.
Change-Id: I21bb5ecf1b7d29ce3790ceeb5323409cc22d5a3b
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
If more than one ETM or PTM are present, configure all of them
and enable the formatter in the ETB. This allows tracing on dual
core systems (e.g. omap4).
Change-Id: I028657d5cf2bee1b23f193d4387b607953b35888
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
On some SOCs the read and write pointer are reset when the chip
resets, but the trace buffer content is preserved. If the status
bits indicate that the buffer is empty and we have never started
tracing, assume the buffer is full instead. This can be useful
if the system rebooted from a watchdog reset.
Change-Id: Iaf21c2c329c6059004ee1d38e3dfff66d7d28029
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
It is not safe to call etm_lock or etb_lock without holding the
mutex since another thread may also have unlocked the registers.
Also add some missing checks for valid etb_regs in the etm sysfs
entries.
Change-Id: I939f76a6ea7546a8fc0d4ddafa2fd2b6f38103bb
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
The old code enabled data tracing, but did not configure the
range. We now configure it to trace all data addresses by default,
and add a trace_data_range attribute to change the range or disable
data tracing.
Change-Id: I9d04e3e1ea0d0b4d4d5bcb93b1b042938ad738b2
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Trace kernel text segment by default as before, allow tracing of other
ranges by writing a range to /sys/devices/etm/trace_range, or to trace
everything by writing 0 0.
Change-Id: Ibb734ca820fedf79560b20536247f1e1700cdc71
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
If the write address was at the end of the buffer, toggling the trace
capture bit would set the RAM-full status instead of clearing it, and
if any of the stop bits in the formatter is set toggling the trace
capture bit may not do anything.
Instead use the read position to find out if the data has already
been returned.
This also fixes the read function so it works when the trace buffer is
larger than the buffer passed in from user space. The old version
would reset the trace buffer pointers after every read, so the second
call to read would always return 0.
Change-Id: I75256abe2556adfd66fd5963e46f9e84ae4645e1
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
On some systems kernel code is considered secure, and this code
already limits tracing to the kernel text segment which results
in no trace data.
Change-Id: I098a0753e874859446d098e1ee209f67fc13cd5d
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
If clk_get fails, assume the ETB does not need a separate clock.
Change-Id: Ia0bf3f5391e94a60ea45876aa7afc8a88a7ec3bf
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
commit 7deabca0acfe02b8e18f59a4c95676012f49a304 upstream.
We can stall RCU processing on SMP platforms if a CPU sits in its idle
loop for a long time. This happens because we don't call irq_enter()
and irq_exit() around generic_smp_call_function_interrupt() and
friends. Add the necessary calls, and remove the one from within
ipi_timer(), so that they're all in a common place.
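A hedged sketch of the resulting shape of the IPI dispatcher (simplified,
not the exact diff; the handler name and set of IPI cases only approximate
that era's arch/arm/kernel/smp.c):

asmlinkage void __exception_irq_entry do_IPI(int ipinr, struct pt_regs *regs)
{
	switch (ipinr) {
	case IPI_TIMER:
		irq_enter();
		ipi_timer();	/* no longer does irq_enter()/irq_exit() itself */
		irq_exit();
		break;

	case IPI_CALL_FUNC:
		irq_enter();
		generic_smp_call_function_interrupt();
		irq_exit();
		break;

	case IPI_CALL_FUNC_SINGLE:
		irq_enter();
		generic_smp_call_function_single_interrupt();
		irq_exit();
		break;

	/* ... remaining IPI types ... */
	}
}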
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[add irq_enter()/irq_exit() in do_local_timer]
Signed-off-by: UCHINO Satoshi <satoshi.uchino@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
During booting of cpu1, there is a short window where cpu1 is online
but not yet active, during which cpu1 is busy waiting to become active.
If cpu0 then decides to schedule something on cpu1 and waits for it to
complete before cpu0 has set cpu1 active, we have a deadlock.
Typically it is the CPU frequency transition that happens at this time,
so let's just not wait for it; it will happen whenever the CPU
eventually comes online instead.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Reviewed-by: Rickard Andersson <rickard.andersson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
When a CPU is hotplugged off, we migrate any IRQs currently affine to it
away and onto another online CPU by calling the irq_set_affinity
function of the relevant interrupt controller chip. This function
returns either IRQ_SET_MASK_OK or IRQ_SET_MASK_OK_NOCOPY, to indicate
whether irq_data.affinity was updated.
If we are forcefully migrating an interrupt (because the affinity mask
no longer identifies any online CPUs) then we should update the IRQ
affinity mask to reflect the new CPU set. Failure to do so can
potentially leave /proc/irq/n/smp_affinity identifying only offline
CPUs, which may confuse userspace IRQ balancing daemons.
This patch updates migrate_one_irq to copy the affinity mask when
the interrupt chip returns IRQ_SET_MASK_OK after forcefully changing the
affinity of an interrupt.
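A hedged sketch of the resulting logic in migrate_one_irq (simplified, not
the exact upstream diff):

static bool migrate_one_irq(struct irq_desc *desc)
{
	struct irq_data *d = irq_desc_get_irq_data(desc);
	const struct cpumask *affinity = d->affinity;
	struct irq_chip *c;
	bool forced = false;

	/* No online CPU left in the mask: force it onto the online set. */
	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
		affinity = cpu_online_mask;
		forced = true;
	}

	c = irq_data_get_irq_chip(d);
	if (!c->irq_set_affinity)
		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && forced)
		cpumask_copy(d->affinity, affinity); /* keep /proc/irq/n/smp_affinity truthful */

	return forced;
}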
Cc: stable@vger.kernel.org
Reported-by: Leif Lindholm <leif.lindholm@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
ARM unconditionally selects CONFIG_GENERIC_HARDIRQS, so the definition
of for_each_irq_desc will check that the desc is non-NULL anyway.
This patch removes a redundant check from the IRQ migration code.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Our selection of interrupts to consider for IRQ migration is sub-
standard. We were potentially including per-CPU interrupts in our
migration strategy, but omitting chained interrupts. This caused
some interrupts to remain on a downed CPU.
We were also trying to migrate interrupts which were not migratable,
resulting in an OOPS.
Instead, iterate over all interrupts, skipping per-CPU interrupts and
interrupts whose affinity does not include the downed CPU, and attempt
to set the affinity of everything else whose chip implements
irq_set_affinity().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Now that the GIC takes care of selecting a target interrupt from the
affinity mask, we don't need all this complexity in the core code
anymore. Just detect when we need to break affinity.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
irqdesc's node member is supposed to mark the numa node number for the
interrupt. Our use of it is non-standard. Remove this, replacing the
functionality with a test of the affinity mask.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
On Mon, Jul 11, 2011 at 3:52 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
...
> The __exception annotation on a function causes this to happen:
>
> [<c002406c>] (asm_do_IRQ+0x6c/0x8c) from [<c0024b84>]
> (__irq_svc+0x44/0xcc)
> Exception stack(0xc3897c78 to 0xc3897cc0)
> 7c60: 4022d320 4022e000
> 7c80: 08000075 00001000 c32273c0 c03ce1c0 c2b49b78 4022d000 c2b420b4 00000001
> 7ca0: 00000000 c3897cfc 00000000 c3897cc0 c00afc54 c002edd8 00000013 ffffffff
>
> Where that stack dump represents the pt_regs for the exception which
> happened. Any function with that annotation found while unwinding will
> cause this to be printed.
>
> If you insert a C function between the IRQ assembly and asm_do_IRQ, the
> dump you get from asm_do_IRQ will be the stack for your function, not
> the pt_regs. That makes the feature useless.
>
When __irq_svc - or any of the other exception handling assembly code -
calls the C code, the stack pointer will be pointing at the pt_regs
structure.
All the entry points into C code from the exception handling code are
marked with __exception or __exception_irq_enter to indicate that they
are one of the functions which has pt_regs above them.
Normally, when you've entered asm_do_IRQ() you will have this stack
layout (higher address towards top):
pt_regs
asm_do_IRQ frame
If you insert a C function between the exception assembly code and
asm_do_IRQ, you end up with this stack layout instead:
pt_regs
your function frame
asm_do_IRQ frame
This means when we unwind, we'll get to asm_do_IRQ, and rather than
dumping out the pt_regs, we'll dump out your function's stack frame
instead, because that's what is above the asm_do_IRQ stack frame
rather than the expected pt_regs structure.
The fix is to introduce handle_IRQ(), which carries no exception stack
dump annotation, so it can be called when MULTI_IRQ_HANDLER is selected
and a C function sits between the assembly code and the actual IRQ
handling code.
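A hedged sketch of how a platform's MULTI_IRQ_HANDLER entry point can then
look (the controller register and names are hypothetical):

asmlinkage void __exception_irq_entry my_soc_handle_irq(struct pt_regs *regs)
{
	u32 irqnr;

	/* Read the pending interrupt number from the (hypothetical) controller. */
	irqnr = readl_relaxed(intc_base + INTC_IRQ_PENDING);

	/*
	 * handle_IRQ() performs irq_enter()/generic_handle_irq()/irq_exit()
	 * without carrying its own exception annotation, so only this entry
	 * point, which really does sit directly above pt_regs, is treated as
	 * an exception boundary by the unwinder.
	 */
	handle_IRQ(irqnr, regs);
}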
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
|
|
Rather than open-coding the jiffy-based wait, and polling for the
secondary CPU to come online, use a completion instead. This
removes the need to poll, instead we will be notified when the
secondary CPU has initialized.
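A hedged sketch of the pattern (heavily condensed from the actual change):

static DECLARE_COMPLETION(cpu_running);

int __cpu_up(unsigned int cpu)
{
	/* ... wake the secondary core ... */

	/* Boot CPU: block until the secondary signals it is up, instead of
	 * polling cpu_online() under a jiffy-based timeout. */
	if (!wait_for_completion_timeout(&cpu_running, msecs_to_jiffies(1000)))
		pr_crit("CPU%u: failed to come online\n", cpu);

	return cpu_online(cpu) ? 0 : -EIO;
}

asmlinkage void secondary_start_kernel(void)
{
	/* ... per-CPU setup, set_cpu_online() ... */

	complete(&cpu_running);		/* notify the waiting boot CPU */

	/* ... local_irq_enable(), cpu_idle() ... */
}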
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Stepan found:
CPU0 CPUn
_cpu_up()
__cpu_up()
boostrap()
notify_cpu_starting()
set_cpu_online()
while (!cpu_active())
cpu_relax()
<PREEMPT-out>
smp_call_function(.wait=1)
/* we find cpu_online() is true */
arch_send_call_function_ipi_mask()
/* wait-forever-more */
<PREEMPT-in>
local_irq_enable()
cpu_notify(CPU_ONLINE)
sched_cpu_active()
set_cpu_active()
Now the purpose of cpu_active is mostly with bringing down a cpu, where
we mark it !active to avoid the load-balancer from moving tasks to it
while we tear down the cpu. This is required because we only update the
sched_domain tree after we brought the cpu-down. And this is needed so
that some tasks can still run while we bring it down, we just don't want
new tasks to appear.
On cpu-up however the sched_domain tree doesn't yet include the new cpu,
so it's invisible to the load-balancer, regardless of the active state.
So instead of setting the active state after we boot the new cpu (and
consequently having to wait for it before enabling interrupts) set the
cpu active before we set it online and avoid the whole mess.
Reported-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1323965362.18942.71.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
We can stall RCU processing on SMP platforms if a CPU sits in its idle
loop for a long time. This happens because we don't call irq_enter()
and irq_exit() around generic_smp_call_function_interrupt() and
friends. Add the necessary calls, and remove the one from within
ipi_timer(), so that they're all in a common place.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Patch is the last version from tglx on Oct 7.
Discussion is at: http://comments.gmane.org/gmane.linux.ports.arm.kernel/131919
The original commit message for the first patch version:
Frank Rowand reported:
I have a consistent (every boot) hang on boot with the RT patches.
With a few hacks to get console output, I get:
rcu_preempt_state detected stalls on CPUs/tasks
I have also replicated the problem on the ARM RealView (in tree) and
without the RT patches.
The problem ended up being caused by the allowed cpus mask being set
to all possible cpus for the ksoftirqd on the secondary processors.
So the RCU softirq was never executing on the secondary cpu.
The problem was that ksoftirqd was woken on the secondary processors before
the secondary processors were online. This led to allowed cpus being set
to all cpus.
wake_up_process()
try_to_wake_up()
select_task_rq()
if (... || !cpu_online(cpu))
select_fallback_rq(task_cpu(p), p)
...
/* No more Mr. Nice Guy. */
dest_cpu = cpuset_cpus_allowed_fallback(p)
do_set_cpus_allowed(p, cpu_possible_mask)
# Thus ksoftirqd can now run on any cpu...
</report>
The reason is that the ARM SMP boot code for the secondary CPUs enables
interrupts before the newly brought up CPU is marked online and
active.
That wakes up ksoftirqd, or any other kernel thread which is affine to
the brought-up CPU; the wakeup breaks that thread's affinity, and the
thread is therefore scheduled on already-online CPUs.
This problem has been observed on x86 before and the only solution is
to mark the CPU online and wait for the CPU active bit before the
point where interrupts are enabled.
Change-Id: If948ef52d434191579e1ca95d18d0c50e91a03b9
Signed-off-by: Dima Zavin <dima@android.com>
|
|
cpufreq: interactive: New 'interactive' governor
This governor is designed for latency-sensitive workloads, such as
interactive user interfaces. The interactive governor aims to be
significantly more responsive, ramping the CPU up quickly when
CPU-intensive activity begins.
Existing governors sample CPU load at a particular rate, typically
every X ms. This can lead to under-powering UI threads from the moment
the user begins interacting with a previously idle system until the
next sample period happens.
The 'interactive' governor uses a different approach. Instead of sampling
the CPU at a specified rate, the governor will check whether to scale the
CPU frequency up soon after coming out of idle. When the CPU comes out of
idle, a timer is configured to fire within 1-2 ticks. If the CPU is very
busy from exiting idle to when the timer fires then we assume the CPU is
underpowered and ramp to MAX speed.
If the CPU was not sufficiently busy to immediately ramp to MAX speed, then
the governor evaluates the CPU load since the last speed adjustment,
choosing the highest value between that longer-term load or the short-term
load since idle exit to determine the CPU speed to ramp to.
A realtime thread is used for scaling up, giving the remaining tasks the
CPU performance benefit, unlike existing governors which are more likely to
schedule rampup work to occur after your performance starved tasks have
completed.
The tuneables for this governor are:
/sys/devices/system/cpu/cpufreq/interactive/min_sample_time:
The minimum amount of time to spend at the current frequency before
ramping down. This is to ensure that the governor has seen enough
historic CPU load data to determine the appropriate workload.
/sys/devices/system/cpu/cpufreq/interactive/go_maxspeed_load
The CPU load at which to ramp to max speed.
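To make the ramp decision concrete, here is a hedged userspace sketch of
the policy described above (an illustration, not the driver source; the
scaling arithmetic and the default go_maxspeed_load value are assumptions):

#include <stdio.h>

static unsigned int go_maxspeed_load = 85;	/* tunable: load (%) that forces max speed */

/* Pick a target frequency from the short-term load since idle exit and the
 * longer-term load since the last speed adjustment. */
static unsigned int choose_target_freq(unsigned int max_freq,
				       unsigned int load_since_idle_exit,
				       unsigned int load_since_last_adjust)
{
	unsigned int load;

	/* Busy essentially the whole time since leaving idle: assume the CPU
	 * is underpowered and jump straight to max. */
	if (load_since_idle_exit >= go_maxspeed_load)
		return max_freq;

	/* Otherwise scale by the higher of the two load estimates. */
	load = load_since_idle_exit > load_since_last_adjust ?
	       load_since_idle_exit : load_since_last_adjust;

	return max_freq * load / 100;
}

int main(void)
{
	printf("%u kHz\n", choose_target_freq(1200000, 95, 40)); /* ramps to max */
	printf("%u kHz\n", choose_target_freq(1200000, 40, 70)); /* 70% of max */
	return 0;
}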
Signed-off-by: Anson Huang <b20788@freescale.com>
|
|
The local timer's PPI needs to be disabled during suspend, or the ARM
core will run into an exception on resume.
|
|
After a cpufreq transition, update the clockevent's frequency
by fetching the new clock rate from the clock framework and
reprogram the next clock event.
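A hedged sketch of the mechanism (names are illustrative, not the exact
smp_twd.c diff):

static struct clk *twd_clk;
static unsigned long twd_timer_rate;
static DEFINE_PER_CPU(struct clock_event_device *, twd_ce);

/* Runs on the CPU whose clock rate just changed. */
static void twd_update_frequency(void *data)
{
	twd_timer_rate = clk_get_rate(twd_clk);
	clockevents_update_freq(per_cpu(twd_ce, smp_processor_id()),
				twd_timer_rate);
}

static int twd_cpufreq_transition(struct notifier_block *nb,
				  unsigned long state, void *data)
{
	struct cpufreq_freqs *freqs = data;

	if (state == CPUFREQ_POSTCHANGE)
		smp_call_function_single(freqs->cpu, twd_update_frequency,
					 NULL, 1);
	return NOTIFY_OK;
}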
Signed-off-by: Xinyu Chen <xinyu.chen@freescale.com>
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The smp_twd clockevents driver currently enables the local timer PPI
before the clockevents device is registered. This can lead to a kernel
panic if a spurious timer interrupt is generated before registration
has completed since the kernel will treat it as an IPI timer.
This patch moves the clockevents device registration before the IRQ
unmasking so that we can always handle timer interrupts once they can
occur.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
commit 435a7ef52db7d86e67a009b36cac1457f8972391 upstream.
We can't be holding the mmap_sem while calling flush_cache_user_range
because the flush can fault. If we fault on a user address, the
page fault handler will try to take mmap_sem again. Since both places
acquire the read lock, most of the time it succeeds. However, if another
thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
between the call to flush_cache_user_range and the fault, the down_read
in do_page_fault will deadlock.
[will: removed drop of vma parameter as already queued by rmk (7365/1)]
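A hedged sketch of the resulting shape of the cache-op path (heavily
simplified; flags handling and range clamping omitted):

static void do_cache_op(unsigned long start, unsigned long end)
{
	struct mm_struct *mm = current->active_mm;
	struct vm_area_struct *vma;

	down_read(&mm->mmap_sem);
	vma = find_vma(mm, start);
	if (vma && start >= vma->vm_start && end <= vma->vm_end) {
		/* Drop the lock before the flush: the flush may fault on the
		 * user address, and the fault handler takes mmap_sem again,
		 * so holding it here can deadlock against a queued writer. */
		up_read(&mm->mmap_sem);
		flush_cache_user_range(start, end);
		return;
	}
	up_read(&mm->mmap_sem);
}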
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dima Zavin <dima@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4542b6a0fa6b48d9ae6b41c1efeb618b7a221b2a upstream.
vma isn't used and flush_cache_user_range isn't a standard macro that
is used on several archs with the same prototype. In fact only unicore32
has a macro with the same name (with an identical implementation and no
in-tree users).
This is a part of a patch proposed by Dima Zavin (with Message-id:
1272439931-12795-1-git-send-email-dima@android.com) that didn't get
accepted.
Cc: Dima Zavin <dima@android.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fde165b2a29673aabf18ceff14dea1f1cfb0daad upstream.
Commit 4e8ee7de227e3ab9a72040b448ad728c5428a042 (ARM: SMP: use
idmap_pgd for mapping MMU enable during secondary booting)
switched secondary boot to use idmap_pgd, which is initialized
during early_initcall, instead of a page table initialized during
__cpu_up. This causes idmap_pgd to contain the static mappings
but be missing all dynamic mappings.
If a console is registered that creates a dynamic mapping, the
printk in secondary_start_kernel will trigger a data abort on
the missing mapping before the exception handlers have been
initialized, leading to a hang. Initial boot is not affected
because no consoles have been registered, and resume is usually
not affected because the offending console is suspended.
Onlining a cpu with hotplug triggers the problem.
A workaround is to delay the printk in secondary_start_kernel until
after the page tables have been switched back to init_mm.
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e787ec1376e862fcea1bfd523feb7c5fb43ecdb9 upstream.
The inline assembly in kernel_execve() uses r8 and r9. Since this
code sequence does not return, it usually doesn't matter if the
register clobber list is accurate. However, I saw a case where a
particular version of gcc used r8 as an intermediate for the value
eventually passed to r9. Because r8 is used in the inline
assembly, and not mentioned in the clobber list, r9 was set
to an incorrect value.
This resulted in a kernel panic on execution of the first user-space
program in the system. r9 is used in ret_to_user as the thread_info
pointer, and if it's wrong, bad things happen.
Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8130b9d7b9d858aa04ce67805e8951e3cb6e9b2f upstream.
If we are context switched whilst copying into a thread's
vfp_hard_struct then the partial copy may be corrupted by the VFP
context switching code (see "ARM: vfp: flush thread hwstate before
restoring context from sigframe").
This patch updates the ptrace VFP set code so that the thread state is
flushed before the copy, therefore disabling VFP and preventing
corruption from occurring.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 247f4993a5974e6759606c4d380748eecfd273ff upstream.
In a preemptible kernel, vfp_set() can be preempted, causing the
hardware VFP context to be switched while the thread vfp state is
being read and modified. This leads to a race condition which can
cause the thread vfp state to become corrupted if lazy VFP context
save occurs due to preemption in between the time thread->vfpstate
is read and the time the modified state is written back.
This may occur if preemption occurs during the execution of a
ptrace() call which modifies the VFP register state of a thread.
Such instances should be very rare in most realistic scenarios --
none has been reported, so far as I am aware. Only uniprocessor
systems should be affected, since VFP context save is not currently
lazy in SMP kernels.
The problem was introduced by my earlier patch migrating to use
regsets to implement ptrace.
This patch does a vfp_sync_hwstate() before reading
thread->vfpstate, to make sure that the thread's VFP state is not
live in the hardware registers while the registers are modified.
Thanks to Will Deacon for spotting this.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2af276dfb1722e97b190bd2e646b079a2aa674db upstream.
Following execution of a signal handler, we currently restore the VFP
context from the ucontext in the signal frame. This involves copying
from the user stack into the current thread's vfp_hard_struct and then
flushing the new data out to the hardware registers.
This is problematic when using a preemptible kernel because we could be
context switched whilst updating the vfp_hard_struct. If the current
thread has made use of VFP since the last context switch, the VFP
notifier will copy from the hardware registers into the vfp_hard_struct,
overwriting any data that had been partially copied by the signal code.
Disabling preemption across copy_from_user calls is a terrible idea, so
instead we move the VFP thread flush *before* we update the
vfp_hard_struct. Since the flushing is performed lazily, this has the
effect of disabling VFP and clearing the CPU's VFP state pointer,
therefore preventing the thread from being updated with stale data on
the next context switch.
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 11ed0ba1754841316d4095478944300acf19acc3 upstream.
This patch implements a workaround for PL310 erratum 769419. On
revisions of the PL310 prior to r3p2, the Store Buffer does not
automatically drain. This can cause normal, non-cacheable writes to be
retained when the memory system is idle, leading to suboptimal I/O
performance for drivers using coherent DMA.
This patch adds an optional wmb() call to the cpu_idle loop. On systems
with an outer cache, this causes an explicit flush of the store buffer.
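A hedged sketch of where the barrier ends up in the idle loop (simplified):

	while (!need_resched()) {
#ifdef CONFIG_PL310_ERRATA_769419
		wmb();	/* with an outer cache this drains the store buffer */
#endif
		/* ... enter the low-power wait (pm_idle/arch_idle) ... */
	}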
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 8428e84d42179c2a00f5f6450866e70d802d1d05 upstream.
Recent gcc versions generate unaligned accesses by default on ARMv6 and
later processors. This patch ensures that the SCTLR.A bit is always
cleared on such processors to avoid the kernel trapping before
alignment_init() is called.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: John Linn <John.Linn@xilinx.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 29a541f6c1f6e4a85628bb86071b9e72c9f8be2c upstream.
Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.
Reported-by: Alasdair Grant <alasdair.grant@arm.com>
Reported-by: Matt Horsnell <matt.horsnell@arm.com>
Reported-by: Michael Williams <michael.williams@arm.com>
Cc: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit f630c1bdfbf8fe423325beaf60027cfc7fd7c610 upstream.
This patch implements a workaround for erratum 764369 affecting
Cortex-A9 MPCore with two or more processors (all current revisions).
Under certain timing circumstances, a data cache line maintenance
operation by MVA targeting an Inner Shareable memory region may fail to
proceed up to either the Point of Coherency or to the Point of
Unification of the system. This workaround adds a DSB instruction before
the relevant cache maintenance functions and sets a specific bit in the
diagnostic control register of the SCU.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
To get hundredths of MHz the rate needs to be divided by 10'000.
Here is an example:
twd_timer_rate = 123456789
Before the patch:
twd_timer_rate / 1000000 = 123
(twd_timer_rate / 1000000) % 100 = 23
Result: 123.23MHz.
After being fixed:
twd_timer_rate / 1000000 = 123
(twd_timer_rate / 10000) % 100 = 45
Result: 123.45MHz.
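A standalone check of the arithmetic, assuming the example rate above:

#include <stdio.h>

int main(void)
{
	unsigned long twd_timer_rate = 123456789;

	/* before: the fractional part just repeats digits of the MHz value */
	printf("before: %lu.%02luMHz\n",
	       twd_timer_rate / 1000000, (twd_timer_rate / 1000000) % 100);

	/* after: divide by 10000 to get hundredths of MHz */
	printf("after:  %lu.%02luMHz\n",
	       twd_timer_rate / 1000000, (twd_timer_rate / 10000) % 100);
	return 0;
}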
Signed-off-by: Vitaly Kuzmichev <vkuzmichev@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Ensure that the meminfo array is sanity checked before we pass the
memory to memblock. This helps to ensure that memblock and meminfo
agree on the dimensions of memory, especially when more memory is
passed than the kernel can deal with.
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
armpmu_enable can be called in situations where no events are present
(for example, from the event rotation tick after a profiled task has
exited). In this case, we currently start the PMU anyway, which may
leave it active indefinitely without any events being monitored.
This patch adds a simple check to the enabling code so that we avoid
starting the PMU when no events are present.
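A hedged sketch of the check (structure and field names approximate the
driver of that era):

static void armpmu_enable(struct pmu *pmu)
{
	struct arm_pmu *armpmu = to_arm_pmu(pmu);
	struct pmu_hw_events *hw_events = armpmu->get_hw_events();
	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);

	/* Only touch the hardware if at least one event is scheduled. */
	if (enabled)
		armpmu->start();
}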
Cc: <stable@kernel.org>
Reported-by: Ashwin Chaugle <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
When we bring a CPU online, we should wait for it to become active
before entering the idle thread, so we know that the scheduler and
thread migration is going to work.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The "Thumb bit" of a symbol is only really meaningful for function
symbols (STT_FUNC).
However, sometimes a branch is relocated against a non-function
symbol; for example, PC-relative branches to anonymous assembler
local symbols are typically fixed up against the start-of-section
symbol, which is not a function symbol. Some inline assembler
generates references of this type, such as fixup code generated by
macros in <asm/uaccess.h>.
The existing relocation code for R_ARM_THM_CALL/R_ARM_THM_JUMP24
interprets this case as an error, because the target symbol appears
to be an ARM symbol; but this is really not the case, since the
target symbol is just a base in these cases. The addend defines
the precise offset to the target location, but since the addend is
encoded in a non-interworking Thumb branch instruction, there is no
explicit Thumb bit in the addend. Because these instructions never
interwork, the implied Thumb bit in the addend is 1, and the
destination is Thumb by definition.
This patch removes the extraneous Thumb bit check for non-function
symbols, enabling modules containing the affected relocation types
to be loaded. No modification to the actual relocation code is
required, since this code does not take bit[0] of the
location->destination offset into account in any case.
Function symbols are always checked for interworking conflicts, as
before.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Dump out the following 16-bit instruction to the faulting instruction
in the Code: line. This allows Thumb-2 instructions to be properly
encoded.
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|