summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-02-15kconfig-preempt-rt-full.patchv3.2.6-rt13Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15kconfig-disable-a-few-options-rt.patchThomas Gleixner
Disable stuff which is known to have issues on RT Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15cpumask: Disable CONFIG_CPUMASK_OFFSTACK for RTThomas Gleixner
We can't deal with the cpumask allocations which happen in atomic context (see arch/x86/kernel/apic/io_apic.c) on RT right now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15acpi-gpe-use-wait-simple.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15wait-simple: Simple waitqueue implementationThomas Gleixner
wait_queue is a swiss army knife and in most of the cases the complexity is not needed. For RT waitqueues are a constant source of trouble as we can't convert the head lock to a raw spinlock due to fancy and long lasting callbacks. Provide a slim version, which allows RT to replace wait queues. This should go mainline as well, as it lowers memory consumption and runtime overhead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15dm: Make rt awareThomas Gleixner
Use the BUG_ON_NORT variant for the irq_disabled() checks. RT has interrupts legitimately enabled here as we cant deadlock against the irq thread due to the "sleeping spinlocks" conversion. Reported-by: Luis Claudio R. Goncalves <lclaudio@uudg.org> Cc: stable-rt@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86: crypto: Reduce preempt disabled regionsPeter Zijlstra
Restrict the preempt disabled regions to the actual floating point operations and enable preemption for the administrative actions. This is necessary on RT to avoid that kfree and other operations are called with preemption disabled. Reported-and-tested-by: Carsten Emde <cbe@osadl.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: stable-rt@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15scsi-fcoe-rt-aware.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86-kvm-require-const-tsc-for-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15sysrq: Allow immediate Magic SysRq output for PREEMPT_RT_FULLFrank Rowand
Add a CONFIG option to allow the output from Magic SysRq to be output immediately, even if this causes large latencies. If PREEMPT_RT_FULL, printk() will not try to acquire the console lock when interrupts or preemption are disabled. If the console lock is not acquired the printk() output will be buffered, but will not be output immediately. Some drivers call into the Magic SysRq code with interrupts or preemption disabled, so the output of Magic SysRq will be buffered instead of printing immediately if this option is not selected. Even with this option selected, Magic SysRq output will be delayed if the attempt to acquire the console lock fails. Signed-off-by: Frank Rowand <frank.rowand@am.sony.com> Link: http://lkml.kernel.org/r/4E7CEF60.5020508@am.sony.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15ipc/sem: Rework semaphore wakeupsPeter Zijlstra
Current sysv sems have a weird ass wakeup scheme that involves keeping preemption disabled over a potential O(n^2) loop and busy waiting on that on other CPUs. Kill this and simply wake the task directly from under the sem_lock. This was discovered by a migrate_disable() debug feature that disallows: spin_lock(); preempt_disable(); spin_unlock() preempt_enable(); Cc: Manfred Spraul <manfred@colorfullife.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Manfred Spraul <manfred@colorfullife.com> Link: http://lkml.kernel.org/r/1315994224.5040.1.camel@twins Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15mm, rt: kmap_atomic schedulingPeter Zijlstra
In fact, with migrate_disable() existing one could play games with kmap_atomic. You could save/restore the kmap_atomic slots on context switch (if there are any in use of course), this should be esp easy now that we have a kmap_atomic stack. Something like the below.. it wants replacing all the preempt_disable() stuff with pagefault_disable() && migrate_disable() of course, but then you can flip kmaps around like below. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [dvhart@linux.intel.com: build fix] Link: http://lkml.kernel.org/r/1311842631.5890.208.camel@twins
2012-02-15add /sys/kernel/realtime entryClark Williams
Add a /sys/kernel entry to indicate that the kernel is a realtime kernel. Clark says that he needs this for udev rules, udev needs to evaluate if its a PREEMPT_RT kernel a few thousand times and parsing uname output is too slow or so. Are there better solutions? Should it exist and return 0 on !-rt? Signed-off-by: Clark Williams <williams@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2012-02-15kgdb/serial: Short term workaroundJason Wessel
On 07/27/2011 04:37 PM, Thomas Gleixner wrote: > - KGDB (not yet disabled) is reportedly unusable on -rt right now due > to missing hacks in the console locking which I dropped on purpose. > To work around this in the short term you can use this patch, in addition to the clocksource watchdog patch that Thomas brewed up. Comments are welcome of course. Ultimately the right solution is to change separation between the console and the HW to have a polled mode + work queue so as not to introduce any kind of latency. Thanks, Jason.
2012-02-15ping-sysrq.patchCarsten Emde
There are (probably rare) situations when a system crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysreq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details. Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2012-02-15net: Avoid livelock in net_tx_action() on RTSteven Rostedt
qdisc_lock is taken w/o disabling interrupts or bottom halfs. So code holding a qdisc_lock() can be interrupted and softirqs can run on the return of interrupt in !RT. The spin_trylock() in net_tx_action() makes sure, that the softirq does not deadlock. When the lock can't be acquired q is requeued and the NET_TX softirq is raised. That causes the softirq to run over and over. That works in mainline as do_softirq() has a retry loop limit and leaves the softirq processing in the interrupt return path and schedules ksoftirqd. The task which holds qdisc_lock cannot be preempted, so the lock is released and either ksoftirqd or the next softirq in the return from interrupt path can proceed. Though it's a bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it decides to bail out even if it's clear in the first iteration :) On RT all softirq processing is done in a FIFO thread and we don't have a loop limit, so ksoftirqd preempts the lock holder forever and unqueues and requeues until the reset button is hit. Due to the forced threading of ksoftirqd on RT we actually cannot deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to replace the spin_trylock() with a spin_lock(). When contended, ksoftirqd is scheduled out and the lock holder can proceed. [ tglx: Massaged changelog and code comments ] Solved-by: Thomas Gleixner <tglx@linuxtronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Carsten Emde <cbe@osadl.org> Cc: Clark Williams <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15mips-disable-highmem-on-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15ARM: at91: tclib: Default to tclib timer for RTThomas Gleixner
RT is not too happy about the shared timer interrupt in AT91 devices. Default to tclib timer for RT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15arm-disable-highmem-on-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15power-disable-highmem-on-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15power-use-generic-rwsem-on-rtThomas Gleixner
2012-02-15printk: Disable migration instead of preemptionRichard Weinberger
There is no need do disable preemption in vprintk(), disable_migrate() is sufficient. This fixes the following bug in -rt: [ 14.759233] BUG: sleeping function called from invalid context at /home/rw/linux-rt/kernel/rtmutex.c:645 [ 14.759235] in_atomic(): 1, irqs_disabled(): 0, pid: 547, name: bash [ 14.759244] Pid: 547, comm: bash Not tainted 3.0.12-rt29+ #3 [ 14.759246] Call Trace: [ 14.759301] [<ffffffff8106fade>] __might_sleep+0xeb/0xf0 [ 14.759318] [<ffffffff810ad784>] rt_spin_lock_fastlock.constprop.9+0x21/0x43 [ 14.759336] [<ffffffff8161fef0>] rt_spin_lock+0xe/0x10 [ 14.759354] [<ffffffff81347ad1>] serial8250_console_write+0x81/0x121 [ 14.759366] [<ffffffff8107ecd3>] __call_console_drivers+0x7c/0x93 [ 14.759369] [<ffffffff8107ef31>] _call_console_drivers+0x5c/0x60 [ 14.759372] [<ffffffff8107f7e5>] console_unlock+0x147/0x1a2 [ 14.759374] [<ffffffff8107fd33>] vprintk+0x3ea/0x462 [ 14.759383] [<ffffffff816160e0>] printk+0x51/0x53 [ 14.759399] [<ffffffff811974e4>] ? proc_reg_poll+0x9a/0x9a [ 14.759403] [<ffffffff81335b42>] __handle_sysrq+0x50/0x14d [ 14.759406] [<ffffffff81335c8a>] write_sysrq_trigger+0x4b/0x53 [ 14.759408] [<ffffffff81335c3f>] ? __handle_sysrq+0x14d/0x14d [ 14.759410] [<ffffffff81197583>] proc_reg_write+0x9f/0xbe [ 14.759426] [<ffffffff811497ec>] vfs_write+0xac/0xf3 [ 14.759429] [<ffffffff8114a9b3>] ? fget_light+0x3a/0x9b [ 14.759431] [<ffffffff811499db>] sys_write+0x4a/0x6e [ 14.759438] [<ffffffff81625d52>] system_call_fastpath+0x16/0x1b Signed-off-by: Richard Weinberger <rw@linutronix.de> Link: http://lkml.kernel.org/r/1323696956-11445-1-git-send-email-rw@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15console-make-rt-friendly.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86-no-perf-irq-work-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15skbufhead-raw-lock.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15jump-label-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15debugobjects-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15hotplug-stuff.patchThomas Gleixner
Do not take lock for non handled cases (might be atomic context) Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15workqueue: Use get_cpu_light() in flush_gcwq()Yong Zhang
BUG: sleeping function called from invalid context at kernel/rtmutex.c:645 in_atomic(): 1, irqs_disabled(): 0, pid: 1739, name: bash Pid: 1739, comm: bash Not tainted 3.0.6-rt17-00284-gb76d419 #3 Call Trace: [<c06e3b5d>] ? printk+0x1d/0x20 [<c01390b6>] __might_sleep+0xe6/0x110 [<c06e633c>] rt_spin_lock+0x1c/0x30 [<c01655a6>] flush_gcwq+0x236/0x320 [<c021c651>] ? kfree+0xe1/0x1a0 [<c05b7178>] ? __cpufreq_remove_dev+0xf8/0x260 [<c0183fad>] ? rt_down_write+0xd/0x10 [<c06cd91e>] workqueue_cpu_down_callback+0x26/0x2d [<c06e9d65>] notifier_call_chain+0x45/0x60 [<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30 [<c014c9b4>] __cpu_notify+0x24/0x40 [<c06cbc6f>] _cpu_down+0xdf/0x330 [<c06cbef0>] cpu_down+0x30/0x50 [<c06cd6b0>] store_online+0x50/0xa7 [<c06cd660>] ? acpi_os_map_memory+0xec/0xec [<c04f2faa>] sysdev_store+0x2a/0x40 [<c02887a4>] sysfs_write_file+0xa4/0x100 [<c0229ab2>] vfs_write+0xa2/0x170 [<c0288700>] ? sysfs_poll+0x90/0x90 [<c0229d92>] sys_write+0x42/0x70 [<c06ecedf>] sysenter_do_call+0x12/0x2d CPU 1 is now offline SMP alternatives: switching to UP code SMP alternatives: switching to SMP code Booting Node 0 Processor 1 APIC 0x1 smpboot cpu 1: start_ip = 9b000 Initializing CPU#1 BUG: sleeping function called from invalid context at kernel/rtmutex.c:645 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0 Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419 #3 Call Trace: [<c06e3b5d>] ? printk+0x1d/0x20 [<c01390b6>] __might_sleep+0xe6/0x110 [<c06e633c>] rt_spin_lock+0x1c/0x30 [<c06cd85b>] workqueue_cpu_up_callback+0x56/0xf3 [<c06e9d65>] notifier_call_chain+0x45/0x60 [<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30 [<c014c9b4>] __cpu_notify+0x24/0x40 [<c014c9ec>] cpu_notify+0x1c/0x20 [<c06e1d43>] notify_cpu_starting+0x1e/0x20 [<c06e0aad>] smp_callin+0xfb/0x10e [<c06e0ad9>] start_secondary+0x19/0xd7 NMI watchdog enabled, takes one hw-pmu counter. Switched to NOHz mode on CPU #1 Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Link: http://lkml.kernel.org/r/1318762607-2261-5-git-send-email-yong.zhang0@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15workqueue: Fix PF_THREAD_BOUND abusePeter Zijlstra
PF_THREAD_BOUND is set by kthread_bind() and means the thread is bound to a particular cpu for correctness. The workqueue code abuses this flag and blindly sets it for all created threads, including those that are free to migrate. Restore the original semantics now that the worst abuse in the cpu-hotplug path are gone. The only icky bit is the rescue thread for per-cpu workqueues, this cannot use kthread_bind() but will use set_cpus_allowed_ptr() to migrate itself to the desired cpu. Set and clear PF_THREAD_BOUND manually here. XXX: I think worker_maybe_bind_and_lock()/worker_unbind_and_unlock() should also do a get_online_cpus(), this would likely allow us to remove the while loop. XXX: should probably repurpose GCWQ_DISASSOCIATED to warn on adding works after CPU_DOWN_PREPARE -- its dual use to mark unbound gcwqs is a tad annoying though. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15workqueue: Fix cpuhotplug trainwreckPeter Zijlstra
The current workqueue code does crazy stuff on cpu unplug, it relies on forced affine breakage, thereby violating per-cpu expectations. Worse, it tries to re-attach to a cpu if the thing comes up again before all previously queued works are finished. This breaks (admittedly bonkers) cpu-hotplug use that relies on a down-up cycle to push all usage away. Introduce a new WQ_NON_AFFINE flag that indicates a per-cpu workqueue will not respect cpu affinity and use this to migrate all its pending works to whatever cpu is doing cpu-down. This also adds a warning for queue_on_cpu() users which warns when its used on WQ_NON_AFFINE workqueues for the API implies you care about what cpu things are ran on when such workqueues cannot guarantee this. For the rest, simply flush all per-cpu works and don't mess about. This also means that currently all workqueues that are manually flushing things on cpu-down in order to provide the per-cpu guarantee no longer need to do so. In short, we tell the WQ what we want it to do, provide validation for this and loose ~250 lines of code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15mm-vmalloc.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15epoll.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15workqueue-use-get-cpu-light.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RTAndi Kleen
Normally the x86-64 trap handlers for debug/int 3/stack fault run on a special interrupt stack to make them more robust when dealing with kernel code. The PREEMPT_RT kernel can sleep in locks even while allocating GFP_ATOMIC memory. When one of these trap handlers needs to send real time signals for ptrace it allocates memory and could then try to to schedule. But it is not allowed to schedule on a IST stack. This can cause warnings and hangs. This patch disables the IST stacks for these handlers for PREEMPT_RT kernel. Instead let them run on the normal process stack. The kernel only really needs the ISTs here to make kernel debuggers more robust in case someone sets a break point somewhere where the stack is invalid. But there are no kernel debuggers in the standard kernel that do this. It also means kprobes cannot be set in situations with invalid stack; but that sounds like a reasonable restriction. The stack fault change could minimally impact oops quality, but not very much because stack faults are fairly rare. A better solution would be to use similar logic as the NMI "paranoid" path: check if signal is for user space, if yes go back to entry.S, switch stack, call sync_regs, then do the signal sending etc. But this patch is much simpler and should work too with minimal impact. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86: Use generic rwsem_spinlocks on -rtThomas Gleixner
Simplifies the separation of anon_rw_semaphores and rw_semaphores for -rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86: stackprotector: Avoid random pool on rtThomas Gleixner
CPU bringup calls into the random pool to initialize the stack canary. During boot that works nicely even on RT as the might sleep checks are disabled. During CPU hotplug the might sleep checks trigger. Making the locks in random raw is a major PITA, so avoid the call on RT is the only sensible solution. This is basically the same randomness which we get during boot where the random pool has no entropy and we rely on the TSC randomnness. Reported-by: Carsten Emde <carsten.emde@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15x86: Convert mce timer to hrtimerThomas Gleixner
mce_timer is started in atomic contexts of cpu bringup. This results in might_sleep() warnings on RT. Convert mce_timer to a hrtimer to avoid this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15fs: ntfs: disable interrupt only on !RTMike Galbraith
On Sat, 2007-10-27 at 11:44 +0200, Ingo Molnar wrote: > * Nick Piggin <nickpiggin@yahoo.com.au> wrote: > > > > [10138.175796] [<c0105de3>] show_trace+0x12/0x14 > > > [10138.180291] [<c0105dfb>] dump_stack+0x16/0x18 > > > [10138.184769] [<c011609f>] native_smp_call_function_mask+0x138/0x13d > > > [10138.191117] [<c0117606>] smp_call_function+0x1e/0x24 > > > [10138.196210] [<c012f85c>] on_each_cpu+0x25/0x50 > > > [10138.200807] [<c0115c74>] flush_tlb_all+0x1e/0x20 > > > [10138.205553] [<c016caaf>] kmap_high+0x1b6/0x417 > > > [10138.210118] [<c011ec88>] kmap+0x4d/0x4f > > > [10138.214102] [<c026a9d8>] ntfs_end_buffer_async_read+0x228/0x2f9 > > > [10138.220163] [<c01a0e9e>] end_bio_bh_io_sync+0x26/0x3f > > > [10138.225352] [<c01a2b09>] bio_endio+0x42/0x6d > > > [10138.229769] [<c02c2a08>] __end_that_request_first+0x115/0x4ac > > > [10138.235682] [<c02c2da7>] end_that_request_chunk+0x8/0xa > > > [10138.241052] [<c0365943>] ide_end_request+0x55/0x10a > > > [10138.246058] [<c036dae3>] ide_dma_intr+0x6f/0xac > > > [10138.250727] [<c0366d83>] ide_intr+0x93/0x1e0 > > > [10138.255125] [<c015afb4>] handle_IRQ_event+0x5c/0xc9 > > > > Looks like ntfs is kmap()ing from interrupt context. Should be using > > kmap_atomic instead, I think. > > it's not atomic interrupt context but irq thread context - and -rt > remaps kmap_atomic() to kmap() internally. Hm. Looking at the change to mm/bounce.c, perhaps I should do this instead? Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15fs-block-rt-support.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15mm-protect-activate-switch-mm.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15fs: namespace preemption fixThomas Gleixner
On RT we cannot loop with preemption disabled here as mnt_make_readonly() might have been preempted. We can safely enable preemption while waiting for MNT_WRITE_HOLD to be cleared. Safe on !RT as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15rt: Improve the serial console PASS_LIMITIngo Molnar
Beyond the warning: drivers/tty/serial/8250.c:1613:6: warning: unused variable ‘pass_counter’ [-Wunused-variable] the solution of just looping infinitely was ugly - up it to 1 million to give it a chance to continue in some really ugly situation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15drivers-tty-fix-omap-lock-crap.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15serial: 8250: Call flush_to_ldisc when the irq is threadedIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-15serial: 8250: Clean up the locking for -rtIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15lglocks-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15rt/rcutree: Move misplaced prototypeIngo Molnar
Fix this warning on x86 defconfig: kernel/rcutree.h:433:13: warning: ‘rcu_preempt_qs’ declared ‘static’ but never defined [-Wunused-function] The #ifdefs and prototypes here are a maze, move it closer to the usage site that needs it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15rcu: Make ksoftirqd do RCU quiescent statesPaul E. McKenney
Implementing RCU-bh in terms of RCU-preempt makes the system vulnerable to network-based denial-of-service attacks. This patch therefore makes __do_softirq() invoke rcu_bh_qs(), but only when __do_softirq() is running in ksoftirqd context. A wrapper layer in interposed so that other calls to __do_softirq() avoid invoking rcu_bh_qs(). The underlying function __do_softirq_common() does the actual work. The reason that rcu_bh_qs() is bad in these non-ksoftirqd contexts is that there might be a local_bh_enable() inside an RCU-preempt read-side critical section. This local_bh_enable() can invoke __do_softirq() directly, so if __do_softirq() were to invoke rcu_bh_qs() (which just calls rcu_preempt_qs() in the PREEMPT_RT_FULL case), there would be an illegal RCU-preempt quiescent state in the middle of an RCU-preempt read-side critical section. Therefore, quiescent states can only happen in cases where __do_softirq() is invoked directly from ksoftirqd. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20111005184518.GA21601@linux.vnet.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-15rcu-more-fallout.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>