summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel/smp.c
AgeCommit message (Collapse)Author
2010-09-02powerpc: Account time using timebase rather than PURRPaul Mackerras
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the PURR register for measuring the user and system time used by processes, as well as other related times such as hardirq and softirq times. This turns out to be quite confusing for users because it means that a program will often be measured as taking less time when run on a multi-threaded processor (SMT2 or SMT4 mode) than it does when run on a single-threaded processor (ST mode), even though the program takes longer to finish. The discrepancy is accounted for as stolen time, which is also confusing, particularly when there are no other partitions running. This changes the accounting to use the timebase instead, meaning that the reported user and system times are the actual number of real-time seconds that the program was executing on the processor thread, regardless of which SMT mode the processor is in. Thus a program will generally show greater user and system times when run on a multi-threaded processor than on a single-threaded processor. On pSeries systems on POWER5 or later processors, we measure the stolen time (time when this partition wasn't running) using the hypervisor dispatch trace log. We check for new entries in the log on every entry from user mode and on every transition from kernel process context to soft or hard IRQ context (i.e. when account_system_vtime() gets called). So that we can correctly distinguish time stolen from user time and time stolen from system time, without having to check the log on every exit to user mode, we store separate timestamps for exit to user mode and entry from user mode. On systems that have a SPURR (POWER6 and POWER7), we read the SPURR in account_system_vtime() (as before), and then apportion the SPURR ticks since the last time we read it between scaled user time and scaled system time according to the relative proportions of user time and system time over the same interval. This avoids having to read the SPURR on every kernel entry and exit. On systems that have PURR but not SPURR (i.e., POWER5), we do the same using the PURR rather than the SPURR. This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl for now since it conflicts with the use of the dispatch trace log by the time accounting code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Move arch_sd_sibling_asym_packing() to smp.cMichael Neuling
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to smp.c to save an #ifdef CONFIG_SMP No functionality change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Silence __cpu_up() under normal operationSigned-off-by: Darren Hart
During CPU offline/online tests __cpu_up would flood the logs with the following message: Processor 0 found. This provides no useful information to the user as there is no context provided, and since the operation was a success (to this point) it is expected that the CPU will come back online, providing all the feedback necessary. Change the "Processor found" message to DBG() similar to other such messages in the same function. Also, add an appropriate log level for the "Processor is stuck" message. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31powerpc/smp: remove the incorrect decrementer initial codes for APTiejun Chen
We already defined start_cpu_decrementer() to invoke decrementer for AP as the following path: start_secondary() -> secondary_cpu_time_init() -> start_cpu_decrementer() So remove these incorrect codes introduced from commit: e7f75ad0 powerpc/47x: Base ppc476 support And actually we really should not enable decrementer before calling set_dec(). Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09powerpc: Clean up obsolete code relating to decrementer and timebasePaul Mackerras
Since the decrementer and timekeeping code was moved over to using the generic clockevents and timekeeping infrastructure, several variables and functions have been obsolete and effectively unused. This deletes them. In particular, wakeup_decrementer() is no longer needed since the generic code reprograms the decrementer as part of the process of resuming the timekeeping code, which happens during sysdev resume. Thus the wakeup_decrementer calls in the suspend_enter methods for 52xx platforms have been removed. The call in the powermac cpu frequency change code has been replaced by set_dec(1), which will cause a timer interrupt as soon as interrupts are enabled, and the generic code will then reprogram the decrementer with the correct value. This also simplifies the generic_suspend_en/disable_irqs functions and makes them static since they are not referenced outside time.c. The preempt_enable/disable calls are removed because the generic code has disabled all but the boot cpu at the point where these functions are called, so we can't be moved to another cpu. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc: Use common cpu_die (fixes SMP+SUSPEND build)Milton Miller
Configuring a powerpc 32 bit kernel for both SMP and SUSPEND turns on CPU_HOTPLUG to enable disable_nonboot_cpus to be called by the common suspend code. Previously the definition of cpu_die for ppc32 was in the powermac platform code, causing it to be undefined if that platform as not selected. arch/powerpc/kernel/built-in.o: In function 'cpu_idle': arch/powerpc/kernel/idle.c:98: undefined reference to 'cpu_die' Move the code from setup_64 to smp.c and rename the power mac versions to their specific names. Note that this does not setup the cpu_die pointers in either smp_ops (request a given cpu die) or ppc_md (make this cpu die), for other platforms but there are generic versions in smp.c. Reported-by: Matt Sealey <matt@genesi-usa.com> Reported-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06powerpc/cpumask: Update some commentsAnton Blanchard
Since the *_map cpumask variants are deprecated, change the comments to instead refer to *_mask. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06powerpc/cpumask: Dynamically allocate cpu_sibling_map and cpu_core_map cpumasksAnton Blanchard
Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks. We don't need to set_cpu_online() the boot cpu in smp_prepare_boot_cpu, init/main.c does it for us. We also postpone setting of the boot cpu in cpu_sibling_map and cpu_core_map until when the memory allocator is available (smp_prepare_cpus), similar to x86. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06powerpc/cpumask: Convert fixup_irqs to new cpumask APIAnton Blanchard
Use new cpumask_* functions, and dynamically allocate cpumask in fixup_irqs. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06powerpc/cpumask: Convert smp_cpus_done to new cpumask APIAnton Blanchard
Use the new cpumask_* functions and dynamically allocate the cpumask in smp_cpus_done. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-05powerpc/47x: Base ppc476 supportDave Kleikamp
This patch adds the base support for the 476 processor. The code was primarily written by Ben Herrenschmidt and Torez Smith, but I've been maintaining it for a while. The goal is to have a single binary that will run on 44x and 47x, but we still have some details to work out. The biggest is that the L1 cache line size differs on the two platforms, but it's currently a compile-time option. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-04-07powerpc: Use set_cpus_allowed_ptrJulia Lawall
Use set_cpus_allowed_ptr rather than set_cpus_allowed. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E1,E2; @@ - set_cpus_allowed(E1, cpumask_of_cpu(E2)) + set_cpus_allowed_ptr(E1, cpumask_of(E2)) @@ expression E; identifier I; @@ - set_cpus_allowed(E, I) + set_cpus_allowed_ptr(E, &I) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15powerpc: Move cpu hotplug driver lock from pseries to powerpcNathan Fontenot
Move the defintion and lock helper routines for the cpu hotplug driver lock from pseries to powerpc code to avoid build breaks for platforms other than pseries that use cpu hotplug. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
2009-12-09powerpc: stop_this_cpu: remove the cpu from the online map.Valentine Barshak
Remove the CPU from the online map to prevent smp_call_function from sending messages to a stopped CPU. Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-29percpu: make percpu symbols in powerpc uniqueTejun Heo
This patch updates percpu related symbols in powerpc such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/powerpc/kernel/perf_callchain.c: s/callchain/cpu_perf_callchain/ * arch/powerpc/kernel/setup-common.c: s/pvr/cpu_pvr/ * arch/powerpc/platforms/pseries/dtl.c: s/dtl/cpu_dtl/ * arch/powerpc/platforms/cell/interrupt.c: s/iic/cpu_iic/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org
2009-09-24cpumask: Use accessors for cpu_*_mask: powerpcRusty Russell
Use the accessors rather than frobbing bits directly (the new versions are const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24cpumask: arch_send_call_function_ipi_mask: powerpcRusty Russell
We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-11powerpc/85xx: Fix SMP compile error and allow NULL for smp_opsKumar Gala
The following commit introduced a compile error since it removed the implementation of smp_85xx_basic_setup: commit 77c0a700c1c292edafa11c1e52821ce4636f81b0 Author: Benjamin Herrenschmidt <benh@kernel.crashing.org> Date: Fri Aug 28 14:25:04 2009 +1000 powerpc: Properly start decrementer on BookE secondary CPUs Make it so that smp_ops probe() and setup_cpu() can be set to NULL. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27powerpc/pseries: Reduce the polling interval in __cpu_up()Gautham R Shenoy
Time time taken for a single cpu online operation on a pseries machine is as follows: Dedicated LPAR (POWER6): ~220ms. Shared LPAR (POWER5) : ~240ms. Of this time, approximately 200ms is taken up by __cpu_up(). This is because we poll every 200ms to check if the new cpu has notified it's presence through the cpu_callin_map. We repeat this operation until the new cpu sets the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier. However, using completion_structs instead of polling loops, the time taken by the new processor to indicate it's presence has found to be less than 1ms on pseries. This method however may not work on all powerpc platforms due to the time-base synchronization code. Keeping this in mind, we could reduce msleep polling interval from 200ms to 1ms while retaining the 5 second timeout. With this, the time taken for a cpu online operation changes as follows: Dedicated LPAR (POWER6): 20-25ms. Shared LPAR (POWER5) : 60-80ms. In both these cases, it was found that the code polls through the loop only once indicating that 1ms is a reasonable value, atleast on pseries. The code needs testing on other powerpc platforms. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26powerpc/pmac: Fix issues with PowerMac "PowerSurge" SMPBenjamin Herrenschmidt
The old PowerSurge SMP (ie, dual or quad 604 machines) code has numerous issues in modern world. One is cpu_possible_map is set too late (the device-tree is bogus) so we fail to allocate the interrupt stacks and crash. Another problem is the fact the timebase is frozen by the bringup of the second CPU so the delays in the generic code will hang, we need to move some of the calling procedure to inside the powermac code. This makes it boot again for me Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-02Merge branch 'cpus4096-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits) x86: export vector_used_by_percpu_irq x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and() sched: nominate preferred wakeup cpu, fix x86: fix lguest used_vectors breakage, -v2 x86: fix warning in arch/x86/kernel/io_apic.c sched: fix warning in kernel/sched.c sched: move test_sd_parent() to an SMP section of sched.h sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 sched: activate active load balancing in new idle cpus sched: bias task wakeups to preferred semi-idle packages sched: nominate preferred wakeup cpu sched: favour lower logical cpu number for sched_mc balance sched: framework for sched_mc/smt_power_savings=N sched: convert BALANCE_FOR_xx_POWER to inline functions x86: use possible_cpus=NUM to extend the possible cpus allowed x86: fix cpu_mask_to_apicid_and to include cpu_online_mask x86: update io_apic.c to the new cpumask code x86: Introduce topology_core_cpumask()/topology_thread_cpumask() x86: xen: use smp_call_function_many() x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c ... Fixed up trivial conflict in kernel/time/tick-sched.c manually
2008-12-21powerpc: Convert cpu_to_l2cache() to of_find_next_cache_node()Nathan Lynch
The smp code uses cache information to populate cpu_core_map; change it to use common code for cache lookup. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-16powerpc: Move smp_hw_index to 32-bit codeNathan Lynch
smp_hw_index isn't used on 64-bit, so move it from smp.c to setup_32.c. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-13cpumask: centralize cpu_online_map and cpu_possible_mapRusty Russell
Impact: cleanup Each SMP arch defines these themselves. Move them to a central location. Twists: 1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a CONFIG_INIT_ALL_POSSIBLE for this rather than break them. 2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'. Those archs simply have phys_cpu_present_map replaced everywhere. 3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky so I just manipulate them both in sync. 4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map' declarations. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Tested-by: Tony Luck <tony.luck@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Mike Travis <travis@sgi.com> Cc: ink@jurassic.park.msu.ru Cc: rmk@arm.linux.org.uk Cc: starvik@axis.com Cc: tony.luck@intel.com Cc: takata@linux-m32r.org Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: paulus@samba.org Cc: schwidefsky@de.ibm.com Cc: lethal@linux-sh.org Cc: wli@holomorphy.com Cc: davem@davemloft.net Cc: jdike@addtoit.com Cc: mingo@redhat.com
2008-11-19powerpc: Provide a separate handler for each IPI actionMilton Miller
With the new generic smp call function helpers, I noticed the code in smp_message_recv was a single function call in many cases. While getting the message number from the ipi data is easy, we can reduce the path length by a function and data-dependent switch by registering seperate IPI actions for these simple calls. Originally I left the ipi action array exposed, but then I realized the registration code should be common too. The three users each had their own name array, so I made a fourth to convert all users to use a common one. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-10-15Merge commit 'origin'Benjamin Herrenschmidt
Manual fixup of conflicts on: arch/powerpc/include/asm/dcr-regs.h drivers/net/ibm_newemac/core.h
2008-10-13powerpc/smp: No need to set_need_resched when getting a resched IPIMilton Miller
The comment in the code was asking "Do we have to do this?", and according to x86 and s390 the answer is no, the scheduler will do it before calling the arch hook. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-09-08kernel/cpu.c: create a CPU_STARTING cpu_chain notifierManfred Spraul
Right now, there is no notifier that is called on a new cpu, before the new cpu begins processing interrupts/softirqs. Various kernel function would need that notification, e.g. kvm works around by calling smp_call_function_single(), rcu polls cpu_online_map. The patch adds a CPU_STARTING notification. It also adds a helper function that sends the message to all cpu_chain handlers. Tested on x86-64. All other archs are untested. Especially on sparc, I'm not sure if I got it right. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28powerpc: Make core id information available to userspaceNathan Lynch
Existing Open Firmware practice is to report each processor core as a separate node in the device tree. Report the value of the "reg" OF property corresponding to a logical CPU's device node as the core_id attribute in /sys/devices/system/cpu/cpu*/topology/core_id. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28powerpc: Make core sibling information available to userspaceNathan Lynch
Implement the notion of "core siblings" for powerpc. This makes /sys/devices/system/cpu/cpu*/topology/core_siblings present sensible values, indicating online CPUs which share an L2 cache. BenH: Made cpu_to_l2cache() use of_find_node_by_phandle() instead of IBM-specific open coded search Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28powerpc: Update cpu_sibling_maps dynamicallyNathan Lynch
Rather doing one initialization pass over all the per-cpu cpu_sibling_maps at boot, update the maps at cpu online/offline time. This is a behavior change -- the thread_siblings attribute now reflects only online siblings, whereas it would display offline siblings before. The new behavior matches that of x86, and is arguably more useful. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-16Merge commit 'origin/master'Benjamin Herrenschmidt
Manual merge of: arch/powerpc/Kconfig arch/powerpc/kernel/stacktrace.c arch/powerpc/mm/slice.c arch/ppc/kernel/smp.c
2008-06-26smp_call_function: get rid of the unused nonatomic/retry argumentJens Axboe
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26powerpc: convert to generic helpers for IPI function callsJens Axboe
This converts ppc to use the new helpers for smp_call_function() and friends, and adds support for smp_call_function_single(). ppc loses the timeout functionality of smp_call_function_mask() with this change, as the generic code does not provide that. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-14[POWERPC] Fix sparse warnings in arch/powerpc/kernelMichael Ellerman
Make a few things static in lparcfg.c Make init and exit routines static in rtas_flash.c Make things static in rtas_pci.c Make some functions static in rtas.c Make fops static in rtas-proc.c Remove unneeded extern for do_gtod in smp.c Make clocksource_init() static in time.c Make last_tick_len and ticklen_to_xs static in time.c Move the declaration of the pvr per-cpu into smp.h Make kexec_smp_down() and kexec_stack static in machine_kexec_64.c Don't return void in arch_teardown_msi_irqs() in msi.c Move declaration of GregorianDay()into asm/time.h Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-02[POWERPC] Bolt in SLB entry for kernel stack on secondary cpusPaul Mackerras
This fixes a regression reported by Kamalesh Bulabel where a POWER4 machine would crash because of an SLB miss at a point where the SLB miss exception was unrecoverable. This regression is tracked at: http://bugzilla.kernel.org/show_bug.cgi?id=10082 SLB misses at such points shouldn't happen because the kernel stack is the only memory accessed other than things in the first segment of the linear mapping (which is mapped at all times by entry 0 of the SLB). The context switch code ensures that SLB entry 2 covers the kernel stack, if it is not already covered by entry 0. None of entries 0 to 2 are ever replaced by the SLB miss handler. Where this went wrong is that the context switch code assumes it doesn't have to write to SLB entry 2 if the new kernel stack is in the same segment as the old kernel stack, since entry 2 should already be correct. However, when we start up a secondary cpu, it calls slb_initialize, which doesn't set up entry 2. This is correct for the boot cpu, where we will be using a stack in the kernel BSS at this point (i.e. init_thread_union), but not necessarily for secondary cpus, whose initial stack can be allocated anywhere. This doesn't cause any immediate problem since the SLB miss handler will just create an SLB entry somewhere else to cover the initial stack. In fact it's possible for the cpu to go quite a long time without SLB entry 2 being valid. Eventually, though, the entry created by the SLB miss handler will get overwritten by some other entry, and if the next access to the stack is at an unrecoverable point, we get the crash. This fixes the problem by making slb_initialize create a suitable entry for the kernel stack, if we are on a secondary cpu and the stack isn't covered by SLB entry 0. This requires initializing the get_paca()->kstack field earlier, so I do that in smp_create_idle where the current field is initialized. This also abstracts a bit of the computation that mk_esid_data in slb.c does so that it can be used in slb_initialize. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-01-25[POWERPC] Make smp_send_stop() handle panic and xmon rebootOlof Johansson
smp_send_stop() will send an IPI to all other cpus to shut them down. However, for the case of xmon-based reboots (as well as potentially some panics), the other cpus are (or might be) spinning with interrupts off, and won't take the IPI. Current code will drop us into the debugger when the IPI fails, which means we're in an infinite loop that we can't get out of without an external reset of some sort. Instead, make the smp_send_stop() IPI call path just print the warning about being unable to send IPIs, but make it return so the rest of the shutdown sequence can continue. It's not perfect, but the lesser of two evils. Also move the call_lock handling outside of smp_call_function_map so we can avoid deadlocks in smp_send_stop(). Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-01-25[POWERPC] Make smp_call_function_map staticOlof Johansson
smp_call_function_map should be static, and for consistency prepend it with __ like other local helper functions in the same file. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-16Convert cpu_sibling_map to be a per cpu variableMike Travis
Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu variable. This saves sizeof(cpumask_t) * NR unused cpus. Access is mostly from startup and CPU HOTPLUG functions. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-03[POWERPC] Implement clockevents driver for powerpcTony Breeds
This registers a clock event structure for the decrementer and turns on CONFIG_GENERIC_CLOCKEVENTS, which means that we now don't need most of timer_interrupt(), since the work is done in generic code. For secondary CPUs, their decrementer clockevent is registered when the CPU comes up (the generic code automatically removes the clockevent when the CPU goes down). Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-22[POWERPC] Avoid pointless WARN_ON(irqs_disabled()) from panic codepathSatyam Sharma
> ------------[ cut here ]------------ > Badness at arch/powerpc/kernel/smp.c:202 comes when smp_call_function_map() has been called with irqs disabled, which is illegal. However, there is a special case, the panic() codepath, when we do not want to warn about this -- warning at that time is pointless anyway, and only serves to scroll away the *real* cause of the panic and distracts from the real bug. * So let's extract the WARN_ON() from smp_call_function_map() into all its callers -- smp_call_function() and smp_call_function_single() * Also, introduce another caller of smp_call_function_map(), namely __smp_call_function() (and make smp_call_function() a wrapper over this) which does *not* warn about disabled irqs * Use this __smp_call_function() from the panic codepath's smp_send_stop() We also end having to move code of smp_send_stop() below the definition of __smp_call_function(). Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-08-03[POWERPC] Fix num_cpus calculation in smp_call_function_map()Kevin Corry
In smp_call_function_map(), num_cpus is set to the number of online CPUs minus one. However, if the CPU mask does not include all CPUs (except the one we're running on), the routine will hang in the first while() loop until the 8 second timeout occurs. The num_cpus should be set to the number of CPUs specified in the mask passed into the routine, after we've made any modifications to the mask. With this change, we can also get rid of the call to cpus_empty() and avoid adding another pass through the bitmask. Signed-off-by: Kevin Corry <kevcorry@us.ibm.com> Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-22[POWERPC] Allow smp_call_function_single() to current cpuAvi Kivity
This removes the requirement for callers to get_cpu() to check in simple cases. i386 and x86_64 already received a similar treatment. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-22[POWERPC] Fix smp_call_function to be preempt-safeHugh Dickins
smp_call_function_map() was not safe against preemption to another cpu: its test for removing self from map was outside the spinlock. Rearrange it a little to fix that. smp_call_function_single() was also wrong: now get_cpu() before excluding self, as other architectures do. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-07[POWERPC] Add smp_call_function_map and smp_call_function_singlewill schmidt
Add a new function named smp_call_function_single(). This matches a generic prototype from include/linux/smp.h. Add a function smp_call_function_map(). This is, for the most part, a rename of smp_call_function, with some added cpumask support. smp_call_function and smp_call_function_single call into smp_call_function_map. Lightly tested on 970mp (blade), power4 and power5. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> cc: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13[POWERPC] Make tlb flush batch use lazy MMU modeBenjamin Herrenschmidt
The current tlb flush code on powerpc 64 bits has a subtle race since we lost the page table lock due to the possible faulting in of new PTEs after a previous one has been removed but before the corresponding hash entry has been evicted, which can leads to all sort of fatal problems. This patch reworks the batch code completely. It doesn't use the mmu_gather stuff anymore. Instead, we use the lazy mmu hooks that were added by the paravirt code. They have the nice property that the enter/leave lazy mmu mode pair is always fully contained by the PTE lock for a given range of PTEs. Thus we can guarantee that all batches are flushed on a given CPU before it drops that lock. We also generalize batching for any PTE update that require a flush. Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and disabled by arch_leave_lazy_mmu_mode(). The code epects that this is always contained within a PTE lock section so no preemption can happen and no PTE insertion in that range from another CPU. When batching is enabled on a CPU, every PTE updates that need a hash flush will use the batch for that flush. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-14[POWERPC] Move MPIC smp routines into mpic.cMichael Ellerman
Move a couple of MPIC smp routines into mpic.c, they're inside an SMP block in mpic.c - so they're still only built for SMP. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-01-11[PATCH] Change cpu_up and co from __devinit to __cpuinitGautham R Shenoy
Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n with CONFIG_RELOCATABLE = y generates the following modpost warnings WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up' This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are defined as __devinit AND __cpu_up calls some __cpuinit functions. Since __cpuinit would map to __init with this kind of a configuration, we get a .text refering .init.data warning. This patch solves the problem by converting all of __cpu_up, _cpu_up and cpu_up from __devinit to __cpuinit. The approach is justified since the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or are of __init type. Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would land up in .init section. Tested on a i386 SMP machine running linux-2.6.20-rc3-mm1. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-25[POWERPC] cell: add cpufreq driver for Cell BE processorChristian Krafft
This patch adds a cpufreq backend driver to enable frequency scaling on cell. Signed-off-by: Christian Krafft <krafft@de.ibm.com> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>