summaryrefslogtreecommitdiff
path: root/arch/ia64/kernel/ivt.S
AgeCommit message (Collapse)Author
2008-05-27[IA64] Workaround for RSE issueTony Luck
Problem: An application violating the architectural rules regarding operation dependencies and having specific Register Stack Engine (RSE) state at the time of the violation, may result in an illegal operation fault and invalid RSE state. Such faults may initiate a cascade of repeated illegal operation faults within OS interruption handlers. The specific behavior is OS dependent. Implication: An application causing an illegal operation fault with specific RSE state may result in a series of illegal operation faults and an eventual OS stack overflow condition. Workaround: OS interruption handlers that switch to kernel backing store implement a check for invalid RSE state to avoid the series of illegal operation faults. The core of the workaround is the RSE_WORKAROUND code sequence inserted into each invocation of the SAVE_MIN_WITH_COVER and SAVE_MIN_WITH_COVER_R19 macros. This sequence includes hard-coded constants that depend on the number of stacked physical registers being 96. The rest of this patch consists of code to disable this workaround should this not be the case (with the presumption that if a future Itanium processor increases the number of registers, it would also remove the need for this patch). Move the start of the RBS up to a mod32 boundary to avoid some corner cases. The dispatch_illegal_op_fault code outgrew the spot it was squatting in when built with this patch and CONFIG_VIRT_CPU_ACCOUNTING=y Move it out to the end of the ivt. Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-20[IA64] VIRT_CPU_ACCOUNTING (accurate cpu time accounting)Hidetoshi Seto
This patch implements VIRT_CPU_ACCOUNTING for ia64, which enable us to use more accurate cpu time accounting. The VIRT_CPU_ACCOUNTING is an item of kernel config, which s390 and powerpc arch have. By turning this config on, these archs change the mechanism of cpu time accounting from tick-sampling based one to state-transition based one. The state-transition based accounting is done by checking time (cycle counter in processor) at every state-transition point, such as entrance/exit of kernel, interrupt, softirq etc. The difference between point to point is the actual time consumed during in the state. There is no doubt about that this value is more accurate than that of tick-sampling based accounting. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-06[IA64] relax per-cpu TLB requirement to DTCChen, Kenneth W
Instead of pinning per-cpu TLB into a DTR, use DTC. This will free up one TLB entry for application, or even kernel if access pattern to per-cpu data area has high temporal locality. Since per-cpu is mapped at the top of region 7 address, we just need to add special case in alt_dtlb_miss. The physical address of per-cpu data is already conveniently stored in IA64_KR(PER_CPU_DATA). Latency for alt_dtlb_miss is not affected as we can hide all the latency. It was measured that alt_dtlb_miss handler has 23 cycles latency before and after the patch. The performance effect is massive for applications that put lots of tlb pressure on CPU. Workload environment like database online transaction processing or application uses tera-byte of memory would benefit the most. Measurement with industry standard database benchmark shown an upward of 1.6% gain. While smaller workloads like cpu, java also showing small improvement. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24[IA64] MCA recovery: kernel context recovery tableRuss Anderson
Memory errors encountered by user applications may surface when the CPU is running in kernel context. The current code will not attempt recovery if the MCA surfaces in kernel context (privilage mode 0). This patch adds a check for cases where the user initiated the load that surfaces in kernel interrupt code. An example is a user process lauching a load from memory and the data in memory had bad ECC. Before the bad data gets to the CPU register, and interrupt comes in. The code jumps to the IVT interrupt entry point and begins execution in kernel context. The process of saving the user registers (SAVE_REST) causes the bad data to be loaded into a CPU register, triggering the MCA. The MCA surfaces in kernel context, even though the load was initiated from user context. As suggested by David and Tony, this patch uses an exception table like approach, puting the tagged recovery addresses in a searchable table. One difference from the exception table is that MCAs do not surface in precise places (such as with a TLB miss), so instead of tagging specific instructions, address ranges are registers. A single macro is used to do the tagging, with the input parameter being the label of the starting address and the macro being the ending address. This limits clutter in the code. This patch only tags one spot, the interrupt ivt entry. Testing showed that spot to be a "heavy hitter" with MCAs surfacing while saving user registers. Other spots can be added as needed by adding a single macro. Signed-off-by: Russ Anderson (rja@sgi.com) Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-03-08[IA64] Fix race in the accessed/dirty bit handlersChristoph Lameter
A pte may be zapped by the swapper, exiting process, unmapping or page migration while the accessed or dirty bit handers are about to run. In that case the accessed bit or dirty is set on an zeroed pte which leads the VM to conclude that this is a swap pte. This may lead to - Messages from the vm like swap_free: Bad swap file entry 4000000000000000 - Processes being aborted swap_dup: Bad swap file entry 4000000000000000 VM: killing process .... Page migration is particular suitable for the creation of this race since it needs to remove and restore page table entries. The fix here is to check for the present bit and simply not update the pte if the page is not present anymore. If the page is not present then the fault handler should run next which will take care of the problem by bringing the page back and then mark the page dirty or move it onto the active list. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-02-27[IA64] Delete a redundant instruction in unaligned_accessZhang, Yanmin
unaligned_access does fetch cr.ipsr, then calls dispatch_unaligned_handler, but dispatch_unaligned_handler fetches cr.ipsr again, so delete the first one. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-17[IA64] polish comments for tlb fault handler in ivt.SChen, Kenneth W
Polish the comments specifically in vhpt_miss and nested_dtlb_miss handlers. I think it's better to explicitly name each page table level with its name instead of numerically name them. i.e., use pgd, pud, pmd, and pte instead of referring as L1, L2, L3 etc. Along the line, remove some magic number in the comments like: "PTA + (((IFA(61,63) << 7) | IFA(33,39))*8)". No code change at all, pure comment update. Feel free to shoot anything you have, darts or tomahawk cruise missile. I will duck behind a bunker ;-) Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-17[IA64] 4 level page table bug fix in vhpt_missChen, Kenneth W
From source code inspection, I think there is a bug with 4 level page table with vhpt_miss handler. In the code path of rechecking page table entry against previously read value after tlb insertion, *pte value in register r18 was overwritten with value newly read from pud pointer, render the check of new *pte against previous *pte completely wrong. Though the bug is none fatal and the penalty is to purge the entry and retry. For functional correctness, it should be fixed. The fix is to use a different register so new *pud don't trash *pte. (btw, the comments in the cmp statement is wrong as well, which I will address in the next patch). Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-11[IA64] 4-level page tablesRobin Holt
This patch introduces 4-level page tables to ia64. I have run some benchmarks and found nothing interesting. Performance has consistently fallen within the noise range. It also introduces a config option (setting the default to 3 levels). The config option prevents having 4 level page tables with 64k base page size. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11[IA64] MCA/INIT: remove the physical mode path from minstate.hKeith Owens
Remove the physical mode path from minstate.h. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-09kbuild: ia64 use generic asm-offsets.h supportSam Ravnborg
Delete obsolete stuff from arch Makefile Rename file to asm-offsets.h The trick used in the arch Makefile to circumvent the circular dependency is kept. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-06-28[IA64] Speed up lfetch.fault [NULL]David Mosberger-Tang
This patch greatly speeds up the handling of lfetch.fault instructions which result in NaT consumption. Due to the NaT-page mapped at address 0, this is guaranteed to happen when lfetch.fault'ing a NULL pointer. With this patch in place, we can even define prefetch()/prefetchw() as lfetch.fault without significant performance degradation. More importantly, it allows compilers to be more aggressive with using lfetch.fault on pointers that might be NULL. Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28Auto merge with /home/aegl/GIT/ia64-testTony Luck
2005-06-21[IA64] fix nested_dtlb_miss handler for hugetlb addressKen Chen
The nested_dtlb_miss handler currently does not handle fault from hugetlb address correctly. It walks the page table assuming PAGE_SIZE. Thus when taking a fault triggered from hugetlb address, it would not calculate the pgd/pmd/pte address correctly and thus result an incorrect invocation of ia64_do_page_fault(). In there, kernel will signal SIGBUS and application dies (The faulting address is perfectly legal and we have a valid pte for the corresponding user hugetlb address as well). This patch fix the described kernel bug. Since nested_dtlb_miss is a rare event and a slow path anyway, I'm making the change without #ifdef CONFIG_HUGETLB_PAGE for code readability. Tony, please apply. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-27[IA64] Reschedule break_fault() for better performance.David Mosberger-Tang
This patch reorganizes break_fault() to optimistically assume that a system-call is being performed from user-space (which is almost always the case). If it turns out that (a) we're not being called due to a system call or (b) we're being called from within the kernel, we fixup the no-longer-valid assumptions in non_syscall() and .break_fixup(), respectively. With this approach, there are 3 major phases: - Phase 1: Read various control & application registers, in particular the current task pointer from AR.K6. - Phase 2: Do all memory loads (load system-call entry, load current_thread_info()->flags, prefetch kernel register-backing store) and switch to kernel register-stack. - Phase 3: Call ia64_syscall_setup() and invoke syscall-handler. Good for 26-30 cycles of improvement on break-based syscall-path. Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-27[IA64] In syscall-entry, use st8 instead of stf8 to clear pt_regs.r8David Mosberger-Tang
Using stf8 seemed like a clever idea at the time, but stf8 forces the cache-line to be invalidated in the L1D (if it happens to be there already). This patch eliminates a guaranteed L1D cache-miss and, by itself, is good for a 1-2 cycle improvement for heavy-weight syscalls. Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-16Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!