summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-02-28Merge branch 'tip/tracing/ftrace' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
2009-02-28tracing: create the C style tracing for the irq subsystemSteven Rostedt
This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style faster tracing for the irq subsystem trace points. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: create the C style tracing for the sched subsystemSteven Rostedt
This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style faster tracing for the sched subsystem trace points. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add raw fast tracing interface for trace eventsSteven Rostedt
This patch adds the interface to enable the C style trace points. In the directory /debugfs/tracing/events/subsystem/event We now have three files: enable : values 0 or 1 to enable or disable the trace event. available_types: values 'raw' and 'printf' which indicate the tracing types available for the trace point. If a developer does not use the TRACE_EVENT_FORMAT macro and just uses the TRACE_FORMAT macro, then only 'printf' will be available. This file is read only. type: values 'raw' or 'printf'. This indicates which type of tracing is active for that trace point. 'printf' is the default and if 'raw' is not available, this file is read only. # echo raw > /debug/tracing/events/sched/sched_wakeup/type # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable Will enable the C style tracing for the sched_wakeup trace point. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add raw trace point recording infrastructureSteven Rostedt
Impact: lower overhead tracing The current event tracer can automatically pick up trace points that are registered with the TRACE_FORMAT macro. But it required a printf format string and parsing. Although, this adds the ability to get guaranteed information like task names and such, it took a hit in overhead processing. This processing can add about 500-1000 nanoseconds overhead, but in some cases that too is considered too much and we want to shave off as much from this overhead as possible. Tom Zanussi recently posted tracing patches to lkml that are based on a nice idea about capturing the data via C structs using STRUCT_ENTER, STRUCT_EXIT type of macros. I liked that method very much, but did not like the implementation that required a developer to add data/code in several disjoint locations. This patch extends the event_tracer macros to do a similar "raw C" approach that Tom Zanussi did. But instead of having the developers needing to tweak a bunch of code all over the place, they can do it all in one macro - preferably placed near the code that it is tracing. That makes it much more likely that tracepoints will be maintained on an ongoing basis by the code they modify. The new macro TRACE_EVENT_FORMAT is created for this approach. (Note, a developer may still utilize the more low level DECLARE_TRACE macros if they don't care about getting their traces automatically in the event tracer.) They can also use the existing TRACE_FORMAT if they don't need to code the tracepoint in C, but just want to use the convenience of printf. So if the developer wants to "hardwire" a tracepoint in the fastest possible way, and wants to acquire their data via a user space utility in a raw binary format, or wants to see it in the trace output but not sacrifice any performance, then they can implement the faster but more complex TRACE_EVENT_FORMAT macro. Here's what usage looks like: TRACE_EVENT_FORMAT(name, TPPROTO(proto), TPARGS(args), TPFMT(fmt, fmt_args), TRACE_STUCT( TRACE_FIELD(type1, item1, assign1) TRACE_FIELD(type2, item2, assign2) [...] ), TPRAWFMT(raw_fmt) ); Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT uses. name: is the unique identifier of the trace point proto: The proto type that the trace point uses args: the args in the proto type fmt: printf format to use with the event printf tracer fmt_args: the printf argments to match fmt TRACE_STRUCT starts the ability to create a structure. Each item in the structure is defined with a TRACE_FIELD TRACE_FIELD(type, item, assign) type: the C type of item. item: the name of the item in the stucture assign: what to assign the item in the trace point callback raw_fmt is a way to pretty print the struct. It must match the order of the items are added in TRACE_STUCT An example of this would be: TRACE_EVENT_FORMAT(sched_wakeup, TPPROTO(struct rq *rq, struct task_struct *p, int success), TPARGS(rq, p, success), TPFMT("task %s:%d %s", p->comm, p->pid, success?"succeeded":"failed"), TRACE_STRUCT( TRACE_FIELD(pid_t, pid, p->pid) TRACE_FIELD(int, success, success) ), TPRAWFMT("task %d success=%d") ); This creates us a unique struct of: struct { pid_t pid; int success; }; And the way the call back would assign these values would be: entry->pid = p->pid; entry->success = success; The nice part about this is that the creation of the assignent is done via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is created, the developer will then have a faster method to record into the ring buffer. They do not need to worry about the tracer itself. The developer would only need to touch the files in include/trace/*.h Again, I would like to give special thanks to Tom Zanussi for this nice idea. Idea-from: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add interface to write into current tracer bufferSteven Rostedt
Right now all tracers must manage their own trace buffers. This was to enforce tracers to be independent in case we finally decide to allow each tracer to have their own trace buffer. But now we are adding event tracing that writes to the current tracer's buffer. This adds an interface to allow events to write to the current tracer buffer without having to manage its own. Since event tracing has no "tracer", and is just a way to hook into any other tracer. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add subsystem sched for sched eventsSteven Rostedt
Add the TRACE_SYSTEM sched for the sched events. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: add subsystem irq for irq eventsSteven Rostedt
Add the TRACE_SYSTEM irq for the irq events. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: make the set_event and available_events subsystem awareSteven Rostedt
This patch makes the event files, set_event and available_events aware of the subsystem. Now you can enable an entire subsystem with: echo 'irq:*' > set_event Note: the '*' is not needed. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28Merge branch 'tip/tracing/ftrace' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
2009-02-28tracing: add subsystem level to trace eventsSteven Rostedt
If a trace point header defines TRACE_SYSTEM, then it will add the following trace points into that event system. If include/trace/irq_event_types.h has: #define TRACE_SYSTEM irq at the top and #undef TRACE_SYSTEM at the bottom, then a directory "irq" will be created in the /debug/tracing/events directory. Inside that directory will contain the two trace points that are defined in include/trace/irq_event_types.h. Only adding the above to irq and not to sched, we get: # ls /debug/tracing/events/ irq sched_process_exit sched_signal_send sched_wakeup_new sched_kthread_stop sched_process_fork sched_switch sched_kthread_stop_ret sched_process_free sched_wait_task sched_migrate_task sched_process_wait sched_wakeup # ls /debug/tracing/events/irq irq_handler_entry irq_handler_exit If we add #define TRACE_SYSTEM sched to the trace/sched_event_types.h then the rest of the trace events will be put in a sched directory within the events directory. I've been playing with this idea of the subsystem for a while, but recently Tom Zanussi posted some patches to lkml that included this method. Tom's approach was clean and got me to finally put some effort to clean up the event trace points. Thanks to Tom Zanussi for demonstrating how nice the subsystem method is. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28tracing: move trace point formats to files in include/trace directorySteven Rostedt
Impact: clean up To further facilitate the ease of adding trace points for developers, this patch creates include/trace/trace_events.h and include/trace/trace_event_types.h. The former file will hold the trace/<type>.h files and the latter will hold the trace/<type>_event_types.h files. To create new tracepoints and to have them automatically appear in the event tracer, a developer makes the trace/<type>.h file which includes <linux/tracepoint.h> and the trace/<type>_event_types.h file. The trace/<type>_event_types.h file will hold the TRACE_FORMAT macros. Then add the trace/<type>.h file to trace/trace_events.h, and add the trace/<type>_event_types.h to the trace_event_types.h file. No need to modify files elsewhere. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-27tracing: replace kzalloc with kcallocSteven Rostedt
Impact: clean up kcalloc is a better approach to allocate a NULL array. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-27Merge branch 'tip/tracing/ftrace' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace Conflicts: kernel/sched_clock.c
2009-02-27Merge branches 'tracing/ftrace' and 'linus' into tracing/coreIngo Molnar
2009-02-27Merge branch 'sched/clock' into tracing/ftraceIngo Molnar
Conflicts: kernel/sched_clock.c
2009-02-27tracing: use newline separator for trace options listSteven Rostedt
Impact: clean up Instead of listing the trace options like: # cat /debug/tracing/trace_options print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock nostacktrace nosched-tree ftrace_printk noftrace_preempt nobranch annotate nouserstacktrace nosym-userobj We now list them like: # cat /debug/tracing/trace_options print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock nostacktrace nosched-tree ftrace_printk noftrace_preempt nobranch annotate nouserstacktrace nosym-userobj Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-27tracing: use pointer error returns for __tracing_openSteven Rostedt
Impact: fix compile warning and clean up When I first wrote __tracing_open, instead of passing the error code via the ERR_PTR macros, I lazily used a separate parameter to hold the return for errors. When Frederic Weisbecker updated that function, he used the Linux kernel ERR_PTR for the returns. This caused the parameter return to possibly not be initialized on error. gcc correctly pointed this out with a warning. This patch converts the entire function to use the Linux kernel ERR_PTR macro methods. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-26tracing: add protection around open use of current_tracerSteven Rostedt
Impact: fix to possible race conditions There's some uses of current_tracer that is not protected by the trace_types_lock. There is a small chance that a sysadmin changes the tracer while the current_tracer is being referenced. If the race is hit, it is unlikely to cause any harm since the tracers are constant and are not freed. But some strang side effects may occur. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-26tracing: add tracer dependent options to options directorySteven Rostedt
This patch adds the tracer dependent options dynamically to the options directory when the tracer is activated. These options are removed when the tracer is deactivated. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-26tracing: add options directory and core option filesSteven Rostedt
This patch creates an options directory in the debugfs, that contains the available tracing options. These files contain 1 or 0, where 1 is the option is enabled and 0 it is disabled. Simply echoing in 1 will enable the option and 0 will disable it. This patch only contains the core options, not the tracer options. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-26Merge branch 'sched/clock' into tracing/ftraceIngo Molnar
Conflicts: kernel/sched_clock.c
2009-02-26x86: set X86_FEATURE_TSC_RELIABLEIngo Molnar
If the TSC is constant and non-stop, also set it reliable. (We will turn this off in DMI quirks for multi-chassis systems) The performance number on a 16-way Nehalem system running 32 tasks that context-switch between each other is significant: sched_clock_stable=0 sched_clock_stable=1 .................... .................... 22.456925 million/sec 24.306972 million/sec [+8.2%] lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to 0.59 microseconds - a 6.7% increase in context-switching performance. Perfstat of 1 million pipe context switches between two tasks: Performance counter stats for './pipe-test-1m': [before] [after] ............ ............ 37621.421089 36436.848378 task clock ticks (msecs) 0 0 CPU migrations (events) 2000274 2000189 context switches (events) 194 193 pagefaults (events) 8433799643 8171016416 CPU cycles (events) -3.21% 8370133368 8180999694 instructions (events) -2.31% 4158565 3895941 cache references (events) -6.74% 44312 46264 cache misses (events) 2349.287976 2279.362465 wall-time (msecs) -3.06% The speedup comes straight from the reduction in the instruction count. sched_clock_cpu() got simpler and the whole workload thus executes faster. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26sched: allow architectures to specify sched_clock_stableIngo Molnar
Allow CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures to still specify that their sched_clock() implementation is reliable. This will be used by x86 to switch on a faster sched_clock_cpu() implementation on certain CPU types. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: try committing transaction before returning ENOSPC Btrfs: add better -ENOSPC handling
2009-02-26Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: xen/blkfront: use blk_rq_map_sg to generate ring entries block: reduce stack footprint of blk_recount_segments() cciss: shorten 30s timeout on controller reset block: add documentation for register_blkdev() block: fix bogus gcc warning for uninitialized var usage
2009-02-26Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix 64bit __copy_tofrom_user() regression powerpc: Fix 64bit memcpy() regression powerpc: Fix load/store float double alignment handler
2009-02-26Make ieee1394_init a fs-initcallLinus Torvalds
It needs to happen before any firewire driver actually registers itself, and that was previously handled by having the Makefile list the core ieee1394 files before the drivers. But now there are firewire drivers in drivers/media, and the Makefile games aren't enough. So just make ieee1394_init happen earlier in the init sequence, the way all other bus layers already do. Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Henrik Kurelid <henrik@kurelid.se> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Ben Backx <ben@bbackx.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-26tracing: implement trace_clock_*() APIsIngo Molnar
Impact: implement new tracing timestamp APIs Add three trace clock variants, with differing scalability/precision tradeoffs: - local: CPU-local trace clock - medium: scalable global clock with some jitter - global: globally monotonic, serialized clock Make the ring-buffer use the local trace clock internally. Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26sched: sched_clock() improvement: use in_nmi()Ingo Molnar
make sure we dont execute more complex sched_clock() code in NMI context. Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26tracing, genirq: add irq enter and exit trace eventsJason Baron
Impact: add new tracepoints Add them to the generic IRQ code, that way every architecture gets these new tracepoints, not just x86. Using Steve's new 'TRACE_FORMAT', I can get function graph trace as follows using the original two IRQ tracepoints: 3) | handle_IRQ_event() { 3) | /* (irq_handler_entry) irq=28 handler=eth0 */ 3) | e1000_intr_msi() { 3) 2.460 us | __napi_schedule(); 3) 9.416 us | } 3) | /* (irq_handler_exit) irq=28 handler=eth0 return=handled */ 3) + 22.935 us | } Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26tracing/core: make the per cpu trace files in per cpu directoriesFrederic Weisbecker
Impact: restructure the VFS layout of per CPU trace buffers The per cpu trace files are all in a single directory: /debug/tracing/per_cpu. In case of a large number of cpu, the content of this directory becomes messy so we create now one directory per cpu inside /debug/tracing/per_cpu which contain each their own trace_pipe and trace files. Ie: /debug/tracing$ ls -R per_cpu per_cpu: cpu0 cpu1 per_cpu/cpu0: trace trace_pipe per_cpu/cpu1: trace trace_pipe Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26xen/blkfront: use blk_rq_map_sg to generate ring entriesJens Axboe
On occasion, the request will apparently have more segments than we fit into the ring. Jens says: > The second problem is that the block layer then appears to create one > too many segments, but from the dump it has rq->nr_phys_segments == > BLKIF_MAX_SEGMENTS_PER_REQUEST. I suspect the latter is due to > xen-blkfront not handling the merging on its own. It should check that > the new page doesn't form part of the previous page. The > rq_for_each_segment() iterates all single bits in the request, not dma > segments. The "easiest" way to do this is to call blk_rq_map_sg() and > then iterate the mapped sg list. That will give you what you are > looking for. > Here's a test patch, compiles but otherwise untested. I spent more > time figuring out how to enable XEN than to code it up, so YMMV! > Probably the sg list wants to be put inside the ring and only > initialized on allocation, then you can get rid of the sg on stack and > sg_init_table() loop call in the function. I'll leave that, and the > testing, to you. [Moved sg array into info structure, and initialize once. -J] Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-26block: reduce stack footprint of blk_recount_segments()Jens Axboe
blk_recalc_rq_segments() requires a request structure passed in, which we don't have from blk_recount_segments(). So the latter allocates one on the stack, using > 400 bytes of stack for that. This can cause us to spill over one page of stack from ext4 at least: 0) 4560 400 blk_recount_segments+0x43/0x62 1) 4160 32 bio_phys_segments+0x1c/0x24 2) 4128 32 blk_rq_bio_prep+0x2a/0xf9 3) 4096 32 init_request_from_bio+0xf9/0xfe 4) 4064 112 __make_request+0x33c/0x3f6 5) 3952 144 generic_make_request+0x2d1/0x321 6) 3808 64 submit_bio+0xb9/0xc3 7) 3744 48 submit_bh+0xea/0x10e 8) 3696 368 ext4_mb_init_cache+0x257/0xa6a [ext4] 9) 3328 288 ext4_mb_regular_allocator+0x421/0xcd9 [ext4] 10) 3040 160 ext4_mb_new_blocks+0x211/0x4b4 [ext4] 11) 2880 336 ext4_ext_get_blocks+0xb61/0xd45 [ext4] 12) 2544 96 ext4_get_blocks_wrap+0xf2/0x200 [ext4] 13) 2448 80 ext4_da_get_block_write+0x6e/0x16b [ext4] 14) 2368 352 mpage_da_map_blocks+0x7e/0x4b3 [ext4] 15) 2016 352 ext4_da_writepages+0x2ce/0x43c [ext4] 16) 1664 32 do_writepages+0x2d/0x3c 17) 1632 144 __writeback_single_inode+0x162/0x2cd 18) 1488 96 generic_sync_sb_inodes+0x1e3/0x32b 19) 1392 16 sync_sb_inodes+0xe/0x10 20) 1376 48 writeback_inodes+0x69/0xb3 21) 1328 208 balance_dirty_pages_ratelimited_nr+0x187/0x2f9 22) 1120 224 generic_file_buffered_write+0x1d4/0x2c4 23) 896 176 __generic_file_aio_write_nolock+0x35f/0x393 24) 720 80 generic_file_aio_write+0x6c/0xc8 25) 640 80 ext4_file_write+0xa9/0x137 [ext4] 26) 560 320 do_sync_write+0xf0/0x137 27) 240 48 vfs_write+0xb3/0x13c 28) 192 64 sys_write+0x4c/0x74 29) 128 128 system_call_fastpath+0x16/0x1b Split the segment counting out into a __blk_recalc_rq_segments() helper to avoid allocating an onstack request just for checking the physical segment count. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26cciss: shorten 30s timeout on controller resetJens Axboe
If reset_devices is set for kexec, then cciss will delay 30 seconds since the old 5i controller _may_ need that long to recover. Replace the long sleep with incremental sleep and tests to reduce the 30 seconds to worst case for 5i, so that other controllers will proceed quickly. Reviewed-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26block: add documentation for register_blkdev()Márton Németh
Add documentation for register_blkdev() function and for the parameters. Signed-off-by: Márton Németh <nm127@freemail.hu> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26block: fix bogus gcc warning for uninitialized var usageJens Axboe
Newer gcc throw this warning: fs/bio.c: In function ?bio_alloc_bioset?: fs/bio.c:305: warning: ?p? may be used uninitialized in this function since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26powerpc: Fix 64bit __copy_tofrom_user() regressionMark Nelson
This fixes a regression introduced by commit a4e22f02f5b6518c1484faea1f88d81802b9feac ("powerpc: Update 64bit __copy_tofrom_user() using CPU_FTR_UNALIGNED_LD_STD"). The same bug that existed in the 64bit memcpy() also exists here so fix it here too. The fix is the same as that applied to memcpy() with the addition of fixes for the exception handling code required for __copy_tofrom_user(). This stops us reading beyond the end of the source region we were told to copy. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26powerpc: Fix 64bit memcpy() regressionMark Nelson
This fixes a regression introduced by commit 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556 ("powerpc: Update 64bit memcpy() using CPU_FTR_UNALIGNED_LD_STD"). This commit allowed CPUs that have the CPU_FTR_UNALIGNED_LD_STD CPU feature bit present to do the memcpy() with unaligned load doubles. But, along with this came a bug where our final load double would read bytes beyond a page boundary and into the next (unmapped) page. This was caught by enabling CONFIG_DEBUG_PAGEALLOC, The fix was to read only the number of bytes that we need to store rather than reading a full 8-byte doubleword and storing only a portion of that. In order to minimise the amount of existing code touched we use the original do_tail for the src_unaligned case. Below is an example of the regression, as reported by Sachin Sant: Unable to handle kernel paging request for data at address 0xc00000003f380000 Faulting instruction address: 0xc000000000039574 cpu 0x1: Vector: 300 (Data Access) at [c00000003baf3020] pc: c000000000039574: .memcpy+0x74/0x244 lr: d00000000244916c: .ext3_xattr_get+0x288/0x2f4 [ext3] sp: c00000003baf32a0 msr: 8000000000009032 dar: c00000003f380000 dsisr: 40000000 current = 0xc00000003e54b010 paca = 0xc000000000a53680 pid = 1840, comm = readahead enter ? for help [link register ] d00000000244916c .ext3_xattr_get+0x288/0x2f4 [ext3] [c00000003baf32a0] d000000002449104 .ext3_xattr_get+0x220/0x2f4 [ext3] (unreliab le) [c00000003baf3390] d00000000244a6e8 .ext3_xattr_security_get+0x40/0x5c [ext3] [c00000003baf3400] c000000000148154 .generic_getxattr+0x74/0x9c [c00000003baf34a0] c000000000333400 .inode_doinit_with_dentry+0x1c4/0x678 [c00000003baf3560] c00000000032c6b0 .security_d_instantiate+0x50/0x68 [c00000003baf35e0] c00000000013c818 .d_instantiate+0x78/0x9c [c00000003baf3680] c00000000013ced0 .d_splice_alias+0xf0/0x120 [c00000003baf3720] d00000000243e05c .ext3_lookup+0xec/0x134 [ext3] [c00000003baf37c0] c000000000131e74 .do_lookup+0x110/0x260 [c00000003baf3880] c000000000134ed0 .__link_path_walk+0xa98/0x1010 [c00000003baf3970] c0000000001354a0 .path_walk+0x58/0xc4 [c00000003baf3a20] c000000000135720 .do_path_lookup+0x138/0x1e4 [c00000003baf3ad0] c00000000013645c .path_lookup_open+0x6c/0xc8 [c00000003baf3b70] c000000000136780 .do_filp_open+0xcc/0x874 [c00000003baf3d10] c0000000001251e0 .do_sys_open+0x80/0x140 [c00000003baf3dc0] c00000000016aaec .compat_sys_open+0x24/0x38 [c00000003baf3e30] c00000000000855c syscall_exit+0x0/0x40 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26powerpc: Fix load/store float double alignment handlerMichael Neuling
When we introduced VSX, we changed the way FPRs are stored in the thread_struct. Unfortunately we missed the load/store float double alignment handler code when updating how we access FPRs in the thread_struct. Below fixes this and merges the little/big endian case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26Merge branch 'tip/tracing/ftrace' of ↵Ingo Molnar
ssh://master.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
2009-02-26Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'linus' ↵Ingo Molnar
into tracing/core
2009-02-25tracing: wrap arguments with PARAMSSteven Rostedt
Peter Zijlstra warned that TPPROTO and TPARGS might become something other than a simple copy of itself. To prevent this from having side effects in the TRACE_FORMAT macro in tracepoint.h, we add a PARAMS() macro to be defined as just a wrapper. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25tracing: rename DEFINE_TRACE_FMT to just TRACE_FORMATSteven Rostedt
There's been a bit confusion to whether DEFINE/DECLARE_TRACE_FMT should be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it TRACE_FORMAT. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: emu10k1 - Fix digital/analog switch on audigy2 ZS ALSA: hda - Quirk for Acer Aspire 6530G ALSA: hda - add another MacBook Pro 3,1 SSID ALSA: fix excessive background noise introduced by OSS emulation rate shrink ALSA: aw2: do not grab every saa7146 based device ALSA: hda - Fix parse of init_verbs sysfs entry ALSA: pcxhr.h replace signed one-bit bitfields
2009-02-25Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Don't go beyond iosapic_intr_info's arraysize [IA64] Do not go beyond ARRAY_SIZE of unw.hash [IA64] enable setting DMAR on by default
2009-02-25Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] pata_legacy: for VLB 32bit PIO don't try tricks with slop [libata] pata_amd: program FIFO sata_mv: fix SoC interrupt breakage pata_it821x: resume from hibernation fails with RAID volume
2009-02-25[libata] pata_legacy: for VLB 32bit PIO don't try tricks with slopAlan Cox
These devices are generally used with ATA anyway and it seems that some ATAPI will need us to issue the right number of words. Therefore as we can't switch mid burst on VLB devices we should only use 32bit I/O for suitable block sizes. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25[libata] pata_amd: program FIFOAlan Cox
With 32bit PIO we can use the posted write buffers, but only for 32bit I/O cycles. This means we must disable the FIFO for ATAPI where a final 16bit cycle may occur. Rework the FIFO logic so that we disable the FIFO then selectively re-enable it when we set the timings on AMD devices. Also fix a case where we scribbled on PCI config 0x41 of Nvidia chips when we shouldn't. Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25sata_mv: fix SoC interrupt breakageMark Lord
For some reason, sata_mv doesn't clear interrupt status during init when it's running on an SoC host adapter. If the bootloader has touched the SATA controller before starting Linux, Linux can end up enabling the SATA interrupt with events pending, which will cause the interrupt to be marked as spurious and then be disabled, which then breaks all further accesses to the controller. This patch makes the SoC path clear interrupt status on init like in the non-SoC case. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>