| Age | Commit message (Collapse) | Author |
|
If hrtimer_try_to_cancel() requires a retry, then depending on the
priority setting te retry loop might prevent timer callback completion
on RT. Prevent that by waiting for completion on RT, no change for a
non RT kernel.
Reported-by: Sankara Muthukrishnan <sankara.m@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
On Fri, Oct 07, 2011 at 10:25:25AM -0700, Fernando Lopez-Lezcano wrote:
> On 10/06/2011 06:15 PM, Thomas Gleixner wrote:
> >Dear RT Folks,
> >
> >I'm pleased to announce the 3.0.6-rt17 release.
>
> Hi and thanks again. So far this one is not hanging which is very
> good news. But I still see the hrtimer_fixup_activate warnings I
> reported for rt16...
Hi Fernando,
I think below patch will smooth your concern?
Thanks,
Yong
|
|
[<ffffffff812de4a9>] __delay+0xf/0x11
[<ffffffff812e36e9>] do_raw_spin_lock+0xd2/0x13c
[<ffffffff815028ee>] _raw_spin_lock+0x60/0x73 rt_b->rt_runtime_lock
[<ffffffff81068f68>] ? sched_rt_period_timer+0xad/0x281
[<ffffffff81068f68>] sched_rt_period_timer+0xad/0x281
[<ffffffff8109e5e1>] __run_hrtimer+0x1e4/0x347
[<ffffffff81068ebb>] ? enqueue_rt_entity+0x36/0x36
[<ffffffff8109f2b1>] __hrtimer_start_range_ns+0x2b5/0x40a base->cpu_base->lock (lock_hrtimer_base)
[<ffffffff81068b6f>] __enqueue_rt_entity+0x26f/0x2aa rt_b->rt_runtime_lock (start_rt_bandwidth)
[<ffffffff81068ead>] enqueue_rt_entity+0x28/0x36
[<ffffffff81069355>] enqueue_task_rt+0x3d/0xb0
[<ffffffff810679d6>] enqueue_task+0x5d/0x64
[<ffffffff810714fc>] task_setprio+0x210/0x29c rq->lock
[<ffffffff810b56cb>] __rt_mutex_adjust_prio+0x25/0x2a p->pi_lock
[<ffffffff810b5d2c>] task_blocks_on_rt_mutex+0x196/0x20f
Instead make __hrtimer_start_range_ns() return -ETIME when the timer
is in the past. Since body actually uses the hrtimer_start*() return
value its pretty safe to wreck it.
Also, it will only ever return -ETIME for timer->irqsafe || !wakeup
timers.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
|
|
In preempt-rt we can not call the callbacks which take sleeping locks
from the timer interrupt context.
Bring back the softirq split for now, until we fixed the signal
delivery problem for real.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Make cancellation of a running callback in softirq context safe
against preemption.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
printk_tick() can't be called in atomic context when RT is enabled,
otherwise below warning will show:
[ 117.597095] BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
[ 117.597102] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
[ 117.597111] Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419-dirty #7
[ 117.597116] Call Trace:
[ 117.597131] [<c06e3b61>] ? printk+0x1d/0x24
[ 117.597142] [<c01390b6>] __might_sleep+0xe6/0x110
[ 117.597151] [<c06e634c>] rt_spin_lock+0x1c/0x30
[ 117.597158] [<c0142f26>] __wake_up+0x26/0x60
[ 117.597166] [<c014c78e>] printk_tick+0x3e/0x40
[ 117.597173] [<c014c7b4>] printk_needs_cpu+0x24/0x30
[ 117.597181] [<c017ecc8>] tick_nohz_stop_sched_tick+0x2e8/0x410
[ 117.597191] [<c017305a>] ? sched_clock_idle_wakeup_event+0x1a/0x20
[ 117.597201] [<c010182a>] cpu_idle+0x4a/0xb0
[ 117.597209] [<c06e0b97>] start_secondary+0xd3/0xd7
Now this is a really rare case and it's very unlikely that we starve
an logbuf waiter that way.
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-4-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
On RT that code is preemptible, so we cannot assign NULL to timers
base as a preempter would spin forever in lock_timer_base().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
People were complaining about broken balancing with the recent -rt
series.
A look at /proc/sched_debug yielded:
cpu#0, 2393.874 MHz
.nr_running : 0
.load : 0
.cpu_load[0] : 177522
.cpu_load[1] : 177522
.cpu_load[2] : 177522
.cpu_load[3] : 177522
.cpu_load[4] : 177522
cpu#1, 2393.874 MHz
.nr_running : 4
.load : 4096
.cpu_load[0] : 181618
.cpu_load[1] : 180850
.cpu_load[2] : 180274
.cpu_load[3] : 179938
.cpu_load[4] : 179758
Which indicated the cpu_load computation was hosed, the 177522 value
indicates that there is one RT task runnable. Initially I thought the
old problem of calculating the cpu_load from a softirq had re-surfaced,
however looking at the code shows its being done from scheduler_tick().
[ we really should fix this RT/cfs interaction some day... ]
A few trace_printk()s later:
sirq-timer/1-19 [001] 174.289744: 19: 50:S ==> [001] 0:140:R <idle>
<idle>-0 [001] 174.290724: enqueue_task_rt: adding task: 19/sirq-timer/1 with load: 177522
<idle>-0 [001] 174.290725: 0:140:R + [001] 19: 50:S sirq-timer/1
<idle>-0 [001] 174.290730: scheduler_tick: current load: 177522
<idle>-0 [001] 174.290732: scheduler_tick: current: 0/swapper
<idle>-0 [001] 174.290736: 0:140:R ==> [001] 19: 50:R sirq-timer/1
sirq-timer/1-19 [001] 174.290741: dequeue_task_rt: removing task: 19/sirq-timer/1 with load: 177522
sirq-timer/1-19 [001] 174.290743: 19: 50:S ==> [001] 0:140:R <idle>
We see that we always raise the timer softirq before doing the load
calculation. Avoid this by re-ordering the scheduler_tick() call in
update_process_times() to occur before we deal with timers.
This lowers the load back to sanity and restores regular load-balancing
behaviour.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Here we are in the CPU_DEAD notifier, and we must not sleep nor
enable interrupts.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When softirqs can be preempted we need to make sure that cancelling
the timer from the active thread can not deadlock vs. a running timer
callback. Add a waitqueue to resolve that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
remove timer calls (!!!) from deep within the tracing infrastructure.
This was totally bogus code that can cause lockups and worse. Poll
the buffer every 2 jiffies for now.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
(Repost for v3.0-rt1 and changed the distination addreses)
I have tested the following patch on v3.0-rt1 with PREEMPT_RT_FULL.
In POSIX message queue, if a sender process uses SCHED_FIFO and
has a higher priority than a receiver process, the sender will
be stuck at ipc/mqueue.c:452
452 while (ewp->state == STATE_PENDING)
453 cpu_relax();
Description of the problem
(receiver process)
1. receiver changes sender's state to STATE_PENDING (mqueue.c:846)
2. wake up sender process and "switch to sender" (mqueue.c:847)
Note: This context switch only happens in PREEMPT_RT_FULL kernel.
(sender process)
3. sender check the own state in above loop (mqueue.c:452-453)
*. receiver will never wake up and cannot change sender's state to
STATE_READY because sender has higher priority
Signed-off-by: Yoshitake Kobayashi <yoshitake.kobayashi@toshiba.co.jp>
Cc: viro@zeniv.linux.org.uk
Cc: dchinner@redhat.com
Cc: npiggin@kernel.dk
Cc: hch@lst.de
Cc: arnd@arndb.de
Link: http://lkml.kernel.org/r/4E2A38A0.1090601@toshiba.co.jp
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
RT serializes the code with the (rt)spinlock but keeps preemption
enabled. Some parts of the code need to be atomic nevertheless.
Protect it with preempt_disable/enable_rt pairts.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Without this patch, ARM can not use SPLIT_PTLOCK_CPUS if
PREEMPT_RT_FULL=y because vectors_user_mapping() creates a
VM_ALWAYSDUMP mapping of the vector page (address 0xffff0000), but no
ptl->lock has been allocated for the page. An attempt to coredump
that page will result in a kernel NULL pointer dereference when
follow_page() attempts to lock the page.
The call tree to the NULL pointer dereference is:
do_notify_resume()
get_signal_to_deliver()
do_coredump()
elf_core_dump()
get_dump_page()
__get_user_pages()
follow_page()
pte_offset_map_lock() <----- a #define
...
rt_spin_lock()
The underlying problem is exposed by mm-shrink-the-page-frame-to-rt-size.patch.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Frank <Frank_Rowand@sonyusa.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4E87C535.2030907@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
He below is a boot-tested hack to shrink the page frame size back to
normal.
Should be a net win since there should be many less PTE-pages than
page-frames.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Split out the pages which are to be freed into a separate list and
call free_pages_bulk() outside of the percpu page allocator locks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
rt-friendly per-cpu pages: convert the irqs-off per-cpu locking
method into a preemptible, explicit-per-cpu-locks method.
Contains fixes from:
Peter Zijlstra <a.p.zijlstra@chello.nl>
Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Handle __free_pages outside of the locked regions. This reduces the
lock contention on the percpu slab locks in -rt significantly.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The array cache in __do_drain() was using the cpu_cache_get() function
which uses smp_processor_id() to get the proper array. On mainline, this
is fine as __do_drain() is called by for_each_cpu() which runs
__do_drain() on the CPU it is processing. In RT locks are used instead
and __do_drain() is only called from a single CPU. This can cause the
accounting to be off and trigger the following bug:
slab error in kmem_cache_destroy(): cache `nfs_write_data': Can't free all objects
Pid: 2905, comm: rmmod Not tainted 3.0.6-test-rt17+ #78
Call Trace:
[<ffffffff810fb623>] kmem_cache_destroy+0xa0/0xdf
[<ffffffffa03aaffb>] nfs_destroy_writepagecache+0x49/0x4e [nfs]
[<ffffffffa03c0fe0>] exit_nfs_fs+0xe/0x46 [nfs]
[<ffffffff8107af09>] sys_delete_module+0x1ba/0x22c
[<ffffffff8109429d>] ? audit_syscall_entry+0x11c/0x148
[<ffffffff814b6442>] system_call_fastpath+0x16/0x1b
This can be easily triggered by a simple while loop:
# while :; do modprobe nfs; rmmod nfs; done
The proper function to use is cpu_cache_get_on_cpu(). It works for both
RT and non-RT as the non-RT passes in smp_processor_id() into
__do_drain().
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Clark Williams <clark@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1318391783.13262.11.camel@gandalf.stny.rr.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When copying large amounts of data between the USB storage devices and
the hard disk, the USB mouse will not work, this patch fixes it.
[NOTE: This problem have been found in the Loongson family machines, not
sure whether it is producible on other platforms]
Signed-off-by: Hu Hongbing <huhb@lemote.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
|
|
The adjust_link() disables interrupts before taking the queue
locks. On RT those locks are converted to "sleeping" locks and
therefor the local_irq_save/restore must be converted to
local_irq_save/restore_nort.
Reported-by: Xianghua Xiao <xiaoxianghua@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Xianghua Xiao <xiaoxianghua@gmail.com>
|
|
Argh, cut and paste wasn't enough...
Use this patch instead. It needs an irq disable. But, believe it or not,
on SMP this is actually better. If the irq is shared (as it is in Mark's
case), we don't stop the irq of other devices from being handled on
another CPU (unfortunately for Mark, he pinned all interrupts to one CPU).
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
drivers/net/ethernet/3com/3c59x.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Preempt-RT runs into a live lock issue with the NETDEV_TX_LOCKED micro
optimization. The reason is that the softirq thread is rescheduling
itself on that return value. Depending on priorities it starts to
monoplize the CPU and livelock on UP systems.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Creates long latencies for no value
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The random call introduces high latencies and is almost
unused. Disable it for -rt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
bit_spin_locks break under RT.
Based on a previous patch from Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
--
include/linux/buffer_head.h | 10 ++++++++++
include/linux/jbd_common.h | 24 ++++++++++++++++++++++++
2 files changed, 34 insertions(+)
|
|
Wrap the bit_spin_lock calls into a separate inline and add the RT
replacements with a real spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Bit spinlocks are not working on RT. Replace them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Fixes the following on PREEMPT_RT:
BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
in_atomic(): 0, irqs_disabled(): 1, pid: 9116, name: sshd
Pid: 9116, comm: sshd Not tainted 2.6.31-rc6-rt2 #6
Call Trace:
[<ffffffff81034a4f>] __might_sleep+0xec/0xee
[<ffffffff812fbc6d>] rt_spin_lock+0x34/0x75
[ffffffff81064a83>] atomic_dec_and_spin_lock+0x36/0x54
[<ffffffff811df7c7>] put_ldisc+0x57/0xa6
[<ffffffff811dfb87>] tty_ldisc_hangup+0xe7/0x19f
[<ffffffff811d9224>] do_tty_hangup+0xff/0x319
[<ffffffff811d9453>] tty_vhangup+0x15/0x17
[<ffffffff811e1263>] pty_close+0x127/0x12b
[<ffffffff811dac41>] tty_release_dev+0x1ad/0x4c0
....
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|