summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2014-06-07ftrace/module: Hardcode ftrace_module_init() call into load_module()Steven Rostedt (Red Hat)
commit a949ae560a511fe4e3adf48fa44fefded93e5c2b upstream. A race exists between module loading and enabling of function tracer. CPU 1 CPU 2 ----- ----- load_module() module->state = MODULE_STATE_COMING register_ftrace_function() mutex_lock(&ftrace_lock); ftrace_startup() update_ftrace_function(); ftrace_arch_code_modify_prepare() set_all_module_text_rw(); <enables-ftrace> ftrace_arch_code_modify_post_process() set_all_module_text_ro(); [ here all module text is set to RO, including the module that is loading!! ] blocking_notifier_call_chain(MODULE_STATE_COMING); ftrace_init_module() [ tries to modify code, but it's RO, and fails! ftrace_bug() is called] When this race happens, ftrace_bug() will produces a nasty warning and all of the function tracing features will be disabled until reboot. The simple solution is to treate module load the same way the core kernel is treated at boot. To hardcode the ftrace function modification of converting calls to mcount into nops. This is done in init/main.c there's no reason it could not be done in load_module(). This gives a better control of the changes and doesn't tie the state of the module to its notifiers as much. Ftrace is special, it needs to be treated as such. The reason this would work, is that the ftrace_module_init() would be called while the module is in MODULE_STATE_UNFORMED, which is ignored by the set_all_module_text_ro() call. Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-31tracing: Use rcu_dereference_sched() for trace event triggersSteven Rostedt (Red Hat)
commit 561a4fe851ccab9dd0d14989ab566f9392d9f8b5 upstream. As trace event triggers are now part of the mainline kernel, I added my trace event trigger tests to my test suite I run on all my kernels. Now these tests get run under different config options, and one of those options is CONFIG_PROVE_RCU, which checks under lockdep that the rcu locking primitives are being used correctly. This triggered the following splat: =============================== [ INFO: suspicious RCU usage. ] 3.15.0-rc2-test+ #11 Not tainted ------------------------------- kernel/trace/trace_events_trigger.c:80 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 4 locks held by swapper/1/0: #0: ((&(&j_cdbs->work)->timer)){..-...}, at: [<ffffffff8104d2cc>] call_timer_fn+0x5/0x1be #1: (&(&pool->lock)->rlock){-.-...}, at: [<ffffffff81059856>] __queue_work+0x140/0x283 #2: (&p->pi_lock){-.-.-.}, at: [<ffffffff8106e961>] try_to_wake_up+0x2e/0x1e8 #3: (&rq->lock){-.-.-.}, at: [<ffffffff8106ead3>] try_to_wake_up+0x1a0/0x1e8 stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc2-test+ #11 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 0000000000000001 ffff88007e083b98 ffffffff819f53a5 0000000000000006 ffff88007b0942c0 ffff88007e083bc8 ffffffff81081307 ffff88007ad96d20 0000000000000000 ffff88007af2d840 ffff88007b2e701c ffff88007e083c18 Call Trace: <IRQ> [<ffffffff819f53a5>] dump_stack+0x4f/0x7c [<ffffffff81081307>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff810ee51c>] event_triggers_call+0x99/0x108 [<ffffffff810e8174>] ftrace_event_buffer_commit+0x42/0xa4 [<ffffffff8106aadc>] ftrace_raw_event_sched_wakeup_template+0x71/0x7c [<ffffffff8106bcbf>] ttwu_do_wakeup+0x7f/0xff [<ffffffff8106bd9b>] ttwu_do_activate.constprop.126+0x5c/0x61 [<ffffffff8106eadf>] try_to_wake_up+0x1ac/0x1e8 [<ffffffff8106eb77>] wake_up_process+0x36/0x3b [<ffffffff810575cc>] wake_up_worker+0x24/0x26 [<ffffffff810578bc>] insert_work+0x5c/0x65 [<ffffffff81059982>] __queue_work+0x26c/0x283 [<ffffffff81059999>] ? __queue_work+0x283/0x283 [<ffffffff810599b7>] delayed_work_timer_fn+0x1e/0x20 [<ffffffff8104d3a6>] call_timer_fn+0xdf/0x1be^M [<ffffffff8104d2cc>] ? call_timer_fn+0x5/0x1be [<ffffffff81059999>] ? __queue_work+0x283/0x283 [<ffffffff8104d823>] run_timer_softirq+0x1a4/0x22f^M [<ffffffff8104696d>] __do_softirq+0x17b/0x31b^M [<ffffffff81046d03>] irq_exit+0x42/0x97 [<ffffffff81a08db6>] smp_apic_timer_interrupt+0x37/0x44 [<ffffffff81a07a2f>] apic_timer_interrupt+0x6f/0x80 <EOI> [<ffffffff8100a5d8>] ? default_idle+0x21/0x32 [<ffffffff8100a5d6>] ? default_idle+0x1f/0x32 [<ffffffff8100ac10>] arch_cpu_idle+0xf/0x11 [<ffffffff8107b3a4>] cpu_startup_entry+0x1a3/0x213 [<ffffffff8102a23c>] start_secondary+0x212/0x219 The cause is that the triggers are protected by rcu_read_lock_sched() but the data is dereferenced with rcu_dereference() which expects it to be protected with rcu_read_lock(). The proper reference should be rcu_dereference_sched(). Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-31tracing/uprobes: Fix uprobe_cpu_buffer memory leakzhangwei(Jovi)
commit 6ea6215fe394e320468589d9bba464a48f6d823a upstream. Forgot to free uprobe_cpu_buffer percpu page in uprobe_buffer_disable(). Link: http://lkml.kernel.org/p/534F8B3F.1090407@huawei.com Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-31blktrace: fix accounting of partially completed requestsRoman Pen
commit af5040da01ef980670b3741b3e10733ee3e33566 upstream. trace_block_rq_complete does not take into account that request can be partially completed, so we can get the following incorrect output of blkparser: C R 232 + 240 [0] C R 240 + 232 [0] C R 248 + 224 [0] C R 256 + 216 [0] but should be: C R 232 + 8 [0] C R 240 + 8 [0] C R 248 + 8 [0] C R 256 + 8 [0] Also, the whole output summary statistics of completed requests and final throughput will be incorrect. This patch takes into account real completion size of the request and fixes wrong completion accounting. Signed-off-by: Roman Pen <r.peniaev@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com> CC: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-25tracing: Fix traceon trigger condition to actually turn tracing onSteven Rostedt (Red Hat)
While working on my tutorial for 2014 Linux Collaboration Summit I found that the traceon trigger did not work when conditions were used. The other triggers worked fine though. Looking into it, it is because of the way the triggers use the ring buffer to store the fields it will use for the condition. But if tracing is off, nothing is stored in the buffer, and the tracepoint exits before calling the trigger to test the condition. This is fine for all the triggers that only work when tracing is on, but for traceon trigger that is to work when tracing is off, nothing happens. The fix is simple, just use a temp ring buffer to record the event if tracing is off and the event has a trace event conditional trigger enabled. The rest of the tracepoint code will work just fine, but the tracepoint wont be recorded in the other buffers. Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-03-20tracing: Fix array size mismatch in format stringVaibhav Nagarnaik
In event format strings, the array size is reported in two locations. One in array subscript and then via the "size:" attribute. The values reported there have a mismatch. For e.g., in sched:sched_switch the prev_comm and next_comm character arrays have subscript values as [32] where as the actual field size is 16. name: sched_switch ID: 301 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[32]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[32]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; After bisection, the following commit was blamed: 92edca0 tracing: Use direct field, type and system names This commit removes the duplication of strings for field->name and field->type assuming that all the strings passed in __trace_define_field() are immutable. This is not true for arrays, where the type string is created in event_storage variable and field->type for all array fields points to event_storage. Use __stringify() to create a string constant for the type string. Also, get rid of event_storage and event_storage_mutex that are not needed anymore. also, an added benefit is that this reduces the overhead of events a bit more: text data bss dec hex filename 8424787 2036472 1302528 11763787 b3804b vmlinux 8420814 2036408 1302528 11759750 b37086 vmlinux.patched Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com Cc: Laurent Chavey <chavey@google.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-03-03tracing: Do not add event files for modules that fail tracepointsSteven Rostedt (Red Hat)
If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Link: http://lkml.kernel.org/r/20140227154923.265882695@goodmis.org Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org # 2.6.31+ Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-11ring-buffer: Fix first commit on sub-buffer having non-zero deltaSteven Rostedt (Red Hat)
Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on that page use a 27 bit delta against that timestamp in order to save on bits written to the ring buffer. If the time between events is larger than what the 27 bits can hold, a "time extend" event is added to hold the entire 64 bit timestamp again and the events after that hold a delta from that timestamp. As a "time extend" is always paired with an event, it is logical to just allocate the event with the time extend, to make things a bit more efficient. Unfortunately, when the pairing code was written, it removed the "delta = 0" from the first commit on a page, causing the events on the page to be slightly skewed. Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together" Cc: stable@vger.kernel.org # 2.6.37+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-30Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...
2014-01-27Merge tag 'trace-fixes-3.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "The first two patches fix the debugfs README file to reflect better the new features added to 3.14. The third patch is a minor bugfix to the trace_puts() functions that will crash the system if a developer adds one before the tracing system is setup. It also affects trace_printk() if it has no arguments, as the code will convert it to a trace_puts() as well. Note, this bug will not affect unmodified kernels, as trace_printk() and trace_puts() should only be used by developers for testing" * tag 'trace-fixes-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Check if tracing is enabled in trace_puts() tracing: Fix formatting of trace README file tracing/README: Add event file usage to tracing mini-HOWTO
2014-01-23tracing: Check if tracing is enabled in trace_puts()Steven Rostedt (Red Hat)
If trace_puts() is used very early in boot up, it can crash the machine if it is called before the ring buffer is allocated. If a trace_printk() is used with no arguments, then it will be converted into a trace_puts() and suffer the same fate. Cc: stable@vger.kernel.org # 3.10+ Fixes: 09ae72348ecc "tracing: Add trace_puts() for even faster trace_printk() tracing" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-23tracing: Fix formatting of trace README fileSteven Rostedt (Red Hat)
Fix the formatting of the README file in the trace debugfs to fit in an 80 character window. Also add a comment about the event trigger counter with regards to traceon and traceoff. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-22tracing/README: Add event file usage to tracing mini-HOWTOTom Zanussi
It would be useful to have a cheat-sheet for everything under tracing/events/ alongside the existing text describing the other files in the tracing/ dir. Add short descriptions of the directories and files under events/ along with examples, similar to the existing text for the other files in tracing/. Also clean up a few minor alignment problems noticed when adding the new text. Link: http://lkml.kernel.org/r/1389993104.3040.445.camel@empanada Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-22Merge tag 'trace-3.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This pull request has a new feature to ftrace, namely the trace event triggers by Tom Zanussi. A trigger is a way to enable an action when an event is hit. The actions are: o trace on/off - enable or disable tracing o snapshot - save the current trace buffer in the snapshot o stacktrace - dump the current stack trace to the ringbuffer o enable/disable events - enable or disable another event Namhyung Kim added updates to the tracing uprobes code. Having the uprobes add support for fetch methods. The rest are various bug fixes with the new code, and minor ones for the old code" * tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (38 commits) tracing: Fix buggered tee(2) on tracing_pipe tracing: Have trace buffer point back to trace_array ftrace: Fix synchronization location disabling and freeing ftrace_ops ftrace: Have function graph only trace based on global_ops filters ftrace: Synchronize setting function_trace_op with ftrace_trace_function tracing: Show available event triggers when no trigger is set tracing: Consolidate event trigger code tracing: Fix counter for traceon/off event triggers tracing: Remove double-underscore naming in syscall trigger invocations tracing/kprobes: Add trace event trigger invocations tracing/probes: Fix build break on !CONFIG_KPROBE_EVENT tracing/uprobes: Add @+file_offset fetch method uprobes: Allocate ->utask before handler_chain() for tracing handlers tracing/uprobes: Add support for full argument access methods tracing/uprobes: Fetch args before reserving a ring buffer tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg() tracing/probes: Implement 'memory' fetch method for uprobes tracing/probes: Add fetch{,_size} member into deref fetch method tracing/probes: Move 'symbol' fetch method to kprobes tracing/probes: Implement 'stack' fetch method for uprobes ...
2014-01-19tracing: Fix buggered tee(2) on tracing_pipeAl Viro
In kernel/trace/trace.c we have this: static void tracing_pipe_buf_release(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { __free_page(buf->page); } static const struct pipe_buf_operations tracing_pipe_buf_ops = { .can_merge = 0, .map = generic_pipe_buf_map, .unmap = generic_pipe_buf_unmap, .confirm = generic_pipe_buf_confirm, .release = tracing_pipe_buf_release, .steal = generic_pipe_buf_steal, .get = generic_pipe_buf_get, }; with void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { page_cache_get(buf->page); } and I don't see anything that would've prevented tee(2) called on the pipe that got stuff spliced into it from that sucker. ->ops->get() will be called, then buf gets copied into target pipe's ->bufs[] and eventually readers get to both copies of the buffer. With get_page(page) look at that page __free_page(page) look at that page __free_page(page) which is not a good thing, to put it mildly. AFAICS, that ought to use the normal generic_pipe_buf_release() (aka page_cache_release(buf->page)), shouldn't it? [ SDR - As trace_pipe just allocates the page with alloc_page(GFP_KERNEL), and doesn't do anything special with it (no LRU logic). The __free_page() should be fine, as it wont actually free a page with reference count. Maybe there's a chance to leak memory? Anyway, This change is at a minimum good for being symmetric with generic_pipe_buf_get, it is fine to add. ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ SDR - Removed no longer used tracing_pipe_buf_release ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-14tracing: Have trace buffer point back to trace_arraySteven Rostedt (Red Hat)
The trace buffer has a descriptor pointer that goes back to the trace array. But it was never assigned. Luckily, nothing uses it (yet), but it will in the future. Although nothing currently uses this, if any of the new features get backported to older kernels, and because this is such a simple change, I'm marking it for stable too. Cc: stable@vger.kernel.org # v3.10+ Fixes: 12883efb670c "tracing: Consolidate max_tr into main trace_array structure" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-13ftrace: Fix synchronization location disabling and freeing ftrace_opsSteven Rostedt (Red Hat)
The synchronization needed after ftrace_ops are unregistered must happen after the callback is disabled from becing called by functions. The current location happens after the function is being removed from the internal lists, but not after the function callbacks were disabled, leaving the functions susceptible of being called after their callbacks are freed. This affects perf and any externel users of function tracing (LTTng and SystemTap). Cc: stable@vger.kernel.org # 3.0+ Fixes: cdbe61bfe704 "ftrace: Allow dynamically allocated function tracers" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-13ftrace: Have function graph only trace based on global_ops filtersSteven Rostedt (Red Hat)
Doing some different tests, I discovered that function graph tracing, when filtered via the set_ftrace_filter and set_ftrace_notrace files, does not always keep with them if another function ftrace_ops is registered to trace functions. The reason is that function graph just happens to trace all functions that the function tracer enables. When there was only one user of function tracing, the function graph tracer did not need to worry about being called by functions that it did not want to trace. But now that there are other users, this becomes a problem. For example, one just needs to do the following: # cd /sys/kernel/debug/tracing # echo schedule > set_ftrace_filter # echo function_graph > current_tracer # cat trace [..] 0) | schedule() { ------------------------------------------ 0) <idle>-0 => rcu_pre-7 ------------------------------------------ 0) ! 2980.314 us | } 0) | schedule() { ------------------------------------------ 0) rcu_pre-7 => <idle>-0 ------------------------------------------ 0) + 20.701 us | } # echo 1 > /proc/sys/kernel/stack_tracer_enabled # cat trace [..] 1) + 20.825 us | } 1) + 21.651 us | } 1) + 30.924 us | } /* SyS_ioctl */ 1) | do_page_fault() { 1) | __do_page_fault() { 1) 0.274 us | down_read_trylock(); 1) 0.098 us | find_vma(); 1) | handle_mm_fault() { 1) | _raw_spin_lock() { 1) 0.102 us | preempt_count_add(); 1) 0.097 us | do_raw_spin_lock(); 1) 2.173 us | } 1) | do_wp_page() { 1) 0.079 us | vm_normal_page(); 1) 0.086 us | reuse_swap_page(); 1) 0.076 us | page_move_anon_rmap(); 1) | unlock_page() { 1) 0.082 us | page_waitqueue(); 1) 0.086 us | __wake_up_bit(); 1) 1.801 us | } 1) 0.075 us | ptep_set_access_flags(); 1) | _raw_spin_unlock() { 1) 0.098 us | do_raw_spin_unlock(); 1) 0.105 us | preempt_count_sub(); 1) 1.884 us | } 1) 9.149 us | } 1) + 13.083 us | } 1) 0.146 us | up_read(); When the stack tracer was enabled, it enabled all functions to be traced, which now the function graph tracer also traces. This is a side effect that should not occur. To fix this a test is added when the function tracing is changed, as well as when the graph tracer is enabled, to see if anything other than the ftrace global_ops function tracer is enabled. If so, then the graph tracer calls a test trampoline that will look at the function that is being traced and compare it with the filters defined by the global_ops. As an optimization, if there's no other function tracers registered, or if the only registered function tracers also use the global ops, the function graph infrastructure will call the registered function graph callback directly and not go through the test trampoline. Cc: stable@vger.kernel.org # 3.3+ Fixes: d2d45c7a03a2 "tracing: Have stack_tracer use a separate list of functions" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-13sched/clock, x86: Use a static_key for sched_clock_stablePeter Zijlstra
In order to avoid the runtime condition and variable load turn sched_clock_stable into a static_key. Also provide a shorter implementation of local_clock() and cpu_clock(int) when sched_clock_stable==1. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 221876 215295 (cold) local_clock: 301773 234692 220773 (warm) sched_clock: 38375 25602 25659 (warm) local_clock: 100371 33265 27242 (warm) rdtsc: 27340 24214 24208 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 235941 237019 (cold) local_clock: 396890 297017 294819 (warm) sched_clock: 38194 25233 25609 (warm) local_clock: 143452 71234 71232 (warm) rdtsc: 27345 24245 24243 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/deadline: Add SCHED_DEADLINE inheritance logicDario Faggioli
Some method to deal with rt-mutexes and make sched_dl interact with the current PI-coded is needed, raising all but trivial issues, that needs (according to us) to be solved with some restructuring of the pi-code (i.e., going toward a proxy execution-ish implementation). This is under development, in the meanwhile, as a temporary solution, what this commits does is: - ensure a pi-lock owner with waiters is never throttled down. Instead, when it runs out of runtime, it immediately gets replenished and it's deadline is postponed; - the scheduling parameters (relative deadline and default runtime) used for that replenishments --during the whole period it holds the pi-lock-- are the ones of the waiting task with earliest deadline. Acting this way, we provide some kind of boosting to the lock-owner, still by using the existing (actually, slightly modified by the previous commit) pi-architecture. We would stress the fact that this is only a surely needed, all but clean solution to the problem. In the end it's only a way to re-start discussion within the community. So, as always, comments, ideas, rants, etc.. are welcome! :-) Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> [ Added !RT_MUTEXES build fix. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/deadline: Add latency tracing for SCHED_DEADLINE tasksDario Faggioli
It is very likely that systems that wants/needs to use the new SCHED_DEADLINE policy also want to have the scheduling latency of the -deadline tasks under control. For this reason a new version of the scheduling wakeup latency, called "wakeup_dl", is introduced. As a consequence of applying this patch there will be three wakeup latency tracer: * "wakeup", that deals with all tasks in the system; * "wakeup_rt", that deals with -rt and -deadline tasks only; * "wakeup_dl", that deals with -deadline tasks only. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-9-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-09ftrace: Synchronize setting function_trace_op with ftrace_trace_functionSteven Rostedt (Red Hat)
ftrace_trace_function is a variable that holds what function will be called directly by the assembly code (mcount). If just a single function is registered and it handles recursion itself, then the assembly will call that function directly without any helper function. It also passes in the ftrace_op that was registered with the callback. The ftrace_op to send is stored in the function_trace_op variable. The ftrace_trace_function and function_trace_op needs to be coordinated such that the called callback wont be called with the wrong ftrace_op, otherwise bad things can happen if it expected a different op. Luckily, there's no callback that doesn't use the helper functions that requires this. But there soon will be and this needs to be fixed. Use a set_function_trace_op to store the ftrace_op to set the function_trace_op to when it is safe to do so (during the update function within the breakpoint or stop machine calls). Or if dynamic ftrace is not being used (static tracing) then we have to do a bit more synchronization when the ftrace_trace_function is set as that takes affect immediately (as oppose to dynamic ftrace doing it with the modification of the trampoline). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-09tracing: Show available event triggers when no trigger is setSteven Rostedt (Red Hat)
Currently there's no way to know what triggers exist on a kernel without looking at the source of the kernel or randomly trying out triggers. Instead of creating another file in the debugfs system, simply show what available triggers are there when cat'ing the trigger file when it has no events: [root /sys/kernel/debug/tracing]# cat events/sched/sched_switch/trigger # Available triggers: # traceon traceoff snapshot stacktrace enable_event disable_event This stays consistent with other debugfs files where meta data like this is always proceeded with a '#' at the start of the line so that tools can strip these out. Link: http://lkml.kernel.org/r/20140107103548.0a84536d@gandalf.local.home Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-09tracing: Consolidate event trigger codeSteven Rostedt (Red Hat)
The event trigger code that checks for callback triggers before and after recording of an event has lots of flags checks. This code is duplicated throughout the ftrace events, kprobes and system calls. They all do the exact same checks against the event flags. Added helper functions ftrace_trigger_soft_disabled(), event_trigger_unlock_commit() and event_trigger_unlock_commit_regs() that consolidated the code and these are used instead. Link: http://lkml.kernel.org/r/20140106222703.5e7dbba2@gandalf.local.home Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-09tracing: Fix counter for traceon/off event triggersSteven Rostedt (Red Hat)
The counters for the traceon and traceoff are only suppose to decrement when the trigger enables or disables tracing. It is not suppose to decrement every time the event is hit. Only decrement the counter if the trigger actually did something. Link: http://lkml.kernel.org/r/20140106223124.0e5fd0b4@gandalf.local.home Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-06tracing: Remove double-underscore naming in syscall trigger invocationsTom Zanussi
There's no reason to use double-underscores for any variable name in ftrace_syscall_enter()/exit(), since those functions aren't generated and there's no need to avoid namespace collisions as with the event macros, which is where the original invocation code came from. Link: http://lkml.kernel.org/r/0b489c9d1f7ee315cff60fa0e4c2b433ade8ae0d.1389036657.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-06tracing/kprobes: Add trace event trigger invocationsTom Zanussi
Add code to the kprobe/kretprobe event functions that will invoke any event triggers associated with a probe's ftrace_event_file. The code to do this is very similar to the invocation code already used to invoke the triggers associated with static events and essentially replaces the existing soft-disable checks with a superset that preserves the original behavior but adds the bits needed to support event triggers. Link: http://lkml.kernel.org/r/f2d49f157b608070045fdb26c9564d5a05a5a7d0.1389036657.git.tom.zanussi@linux.intel.com Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-03tracing/probes: Fix build break on !CONFIG_KPROBE_EVENTNamhyung Kim
When kprobe-based dynamic event tracer is not enabled, it caused following build error: kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c8dd): undefined reference to `fetch_symbol_u8' kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c8e9): undefined reference to `fetch_symbol_u16' kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c8f5): undefined reference to `fetch_symbol_u32' kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c901): undefined reference to `fetch_symbol_u64' kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c909): undefined reference to `fetch_symbol_string' kernel/built-in.o: In function `traceprobe_update_arg': (.text+0x10c913): undefined reference to `fetch_symbol_string_size' ... It was due to the fetch methods are referred from CHECK_FETCH_FUNCS macro and since it was only defined in trace_kprobe.c. Move NULL definition of such fetch functions to the header file. Note, it also requires CONFIG_BRANCH_PROFILING enabled to trigger this failure as well. This is because the "fetch_symbol_*" variables are referenced in a "else if" statement that will only call update_symbol_cache(), which is a static inline stub function when CONFIG_KPROBE_EVENT is not enabled. gcc is smart enough to optimize this "else if" out and that also removes the code that references the undefined variables. But when BRANCH_PROFILING is enabled, it fools gcc into keeping the if statement around and thus references the undefined symbols and fails to build. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02tracing/uprobes: Add @+file_offset fetch methodNamhyung Kim
Enable to fetch data from a file offset. Currently it only supports fetching from same binary uprobe set. It'll translate the file offset to a proper virtual address in the process. The syntax is "@+OFFSET" as it does similar to normal memory fetching (@ADDR) which does no address translation. Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/uprobes: Add support for full argument access methodsNamhyung Kim
Enable to fetch other types of argument for the uprobes. IOW, we can access stack, memory, deref, bitfield and retval from uprobes now. The format for the argument types are same as kprobes (but @SYMBOL type is not supported for uprobes), i.e: @ADDR : Fetch memory at ADDR $stackN : Fetch Nth entry of stack (N >= 0) $stack : Fetch stack address $retval : Fetch return value +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address Note that the retval only can be used with uretprobes. Original-patch-by: Hyeoncheol Lee <cheol.lee@lge.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Hyeoncheol Lee <cheol.lee@lge.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/uprobes: Fetch args before reserving a ring bufferNamhyung Kim
Fetching from user space should be done in a non-atomic context. So use a per-cpu buffer and copy its content to the ring buffer atomically. Note that we can migrate during accessing user memory thus use a per-cpu mutex to protect concurrent accesses. This is needed since we'll be able to fetch args from an user memory which can be swapped out. Before that uprobes could fetch args from registers only which saved in a kernel space. While at it, use __get_data_size() and store_trace_args() to reduce code duplication. And add struct uprobe_cpu_buffer and its helpers as suggested by Oleg. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()Namhyung Kim
Currently uprobes don't pass is_return to the argument parser so that it cannot make use of "$retval" fetch method since it only works for return probes. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Implement 'memory' fetch method for uprobesNamhyung Kim
Use separate method to fetch from memory. Move existing functions to trace_kprobe.c and make them static. Also add new memory fetch implementation for uprobes. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Add fetch{,_size} member into deref fetch methodHyeoncheol Lee
The deref fetch methods access a memory region but it assumes that it's a kernel memory since uprobes does not support them. Add ->fetch and ->fetch_size member in order to provide a proper access methods for supporting uprobes. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Hyeoncheol Lee <cheol.lee@lge.com> [namhyung@kernel.org: Split original patch into pieces as requested] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Move 'symbol' fetch method to kprobesNamhyung Kim
Move existing functions to trace_kprobe.c and add NULL entries to the uprobes fetch type table. I don't make them static since some generic routines like update/free_XXX_fetch_param() require pointers to the functions. Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Implement 'stack' fetch method for uprobesNamhyung Kim
Use separate method to fetch from stack. Move existing functions to trace_kprobe.c and make them static. Also add new stack fetch implementation for uprobes. Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Split [ku]probes_fetch_type_tableNamhyung Kim
Use separate fetch_type_table for kprobes and uprobes. It currently shares all fetch methods but some of them will be implemented differently later. This is not to break build if [ku]probes is configured alone (like !CONFIG_KPROBE_EVENT and CONFIG_UPROBE_EVENT). So I added '__weak' to the table declaration so that it can be safely omitted when it configured out. Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Move fetch function helpers to trace_probe.hNamhyung Kim
Move fetch function helper macros/functions to the header file and make them external. This is preparation of supporting uprobe fetch table in next patch. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Integrate duplicate set_print_fmt()Namhyung Kim
The set_print_fmt() functions are implemented almost same for [ku]probes. Move it to a common place and get rid of the duplication. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/kprobes: Move common functions to trace_probe.hNamhyung Kim
The __get_data_size() and store_trace_args() will be used by uprobes too. Move them to a common location. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/uprobes: Convert to struct trace_probeNamhyung Kim
Convert struct trace_uprobe to make use of the common trace_probe structure. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/kprobes: Factor out struct trace_probeNamhyung Kim
There are functions that can be shared to both of kprobes and uprobes. Separate common data structure to struct trace_probe and use it from the shared functions. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/probes: Fix basic print type functionsNamhyung Kim
The print format of s32 type was "ld" and it's casted to "long". So it turned out to print 4294967295 for "-1" on 64-bit systems. Not sure whether it worked well on 32-bit systems. Anyway, it doesn't need to have cast argument at all since it already casted using type pointer - just get rid of it. Thanks to Oleg for pointing that out. And print 0x prefix for unsigned type as it shows hex numbers. Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing/uprobes: Fix documentation of uprobe registration syntaxNamhyung Kim
The uprobe syntax requires an offset after a file path not a symbol. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2014-01-02tracing: Fix rcu handling of event_trigger_data filter fieldSteven Rostedt (Red Hat)
The filter field of the event_trigger_data structure is protected under RCU sched locks. It was not annotated as such, and after doing so, sparse pointed out several locations that required fix ups. Reported-by: kbuild test robot <fengguang.wu@intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02tracing: Add generic tracing_lseek() functionSteven Rostedt (Red Hat)
Trace event triggers added a lseek that uses the ftrace_filter_lseek() function. Unfortunately, when function tracing is not configured in that function is not defined and the kernel fails to build. This is the second time that function was added to a file ops and it broke the build due to requiring special config dependencies. Make a generic tracing_lseek() that all the tracing utilities may use. Also, modify the old ftrace_filter_lseek() to return 0 instead of 1 on WRONLY. Not sure why it was a 1 as that does not make sense. This also changes the old tracing_seek() to modify the file pos pointer on WRONLY as well. Reported-by: kbuild test robot <fengguang.wu@intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-31Merge tag 'v3.13-rc6' into for-3.14/coreJens Axboe
Needed to bring blk-mq uptodate, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes. Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-flush.c fs/btrfs/check-integrity.c fs/btrfs/extent_io.c fs/btrfs/scrub.c fs/logfs/dev_bdev.c
2013-12-21tracing: Add and use generic set_trigger_filter() implementationTom Zanussi
Add a generic event_command.set_trigger_filter() op implementation and have the current set of trigger commands use it - this essentially gives them all support for filters. Syntactically, filters are supported by adding 'if <filter>' just after the command, in which case only events matching the filter will invoke the trigger. For example, to add a filter to an enable/disable_event command: echo 'enable_event:system:event if common_pid == 999' > \ .../othersys/otherevent/trigger The above command will only enable the system:event event if the common_pid field in the othersys:otherevent event is 999. As another example, to add a filter to a stacktrace command: echo 'stacktrace if common_pid == 999' > \ .../somesys/someevent/trigger The above command will only trigger a stacktrace if the common_pid field in the event is 999. The filter syntax is the same as that described in the 'Event filtering' section of Documentation/trace/events.txt. Because triggers can now use filters, the trigger-invoking logic needs to be moved in those cases - e.g. for ftrace_raw_event_calls, if a trigger has a filter associated with it, the trigger invocation now needs to happen after the { assign; } part of the call, in order for the trigger condition to be tested. There's still a SOFT_DISABLED-only check at the top of e.g. the ftrace_raw_events function, so when an event is soft disabled but not because of the presence of a trigger, the original SOFT_DISABLED behavior remains unchanged. There's also a bit of trickiness in that some triggers need to avoid being invoked while an event is currently in the process of being logged, since the trigger may itself log data into the trace buffer. Thus we make sure the current event is committed before invoking those triggers. To do that, we split the trigger invocation in two - the first part (event_triggers_call()) checks the filter using the current trace record; if a command has the post_trigger flag set, it sets a bit for itself in the return value, otherwise it directly invoks the trigger. Once all commands have been either invoked or set their return flag, event_triggers_call() returns. The current record is then either committed or discarded; if any commands have deferred their triggers, those commands are finally invoked following the close of the current event by event_triggers_post_call(). To simplify the above and make it more efficient, the TRIGGER_COND bit is introduced, which is set only if a soft-disabled trigger needs to use the log record for filter testing or needs to wait until the current log record is closed. The syscall event invocation code is also changed in analogous ways. Because event triggers need to be able to create and free filters, this also adds a couple external wrappers for the existing create_filter and free_filter functions, which are too generic to be made extern functions themselves. Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21tracing: Move ftrace_event_file() out of DYNAMIC_FTRACE ifdefSteven Rostedt (Red Hat)
Now that event triggers use ftrace_event_file(), it needs to be outside the #ifdef CONFIG_DYNAMIC_FTRACE, as it can now be used when that is not defined. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21tracing: Add 'enable_event' and 'disable_event' event trigger commandsTom Zanussi
Add 'enable_event' and 'disable_event' event_command commands. enable_event and disable_event event triggers are added by the user via these commands in a similar way and using practically the same syntax as the analagous 'enable_event' and 'disable_event' ftrace function commands, but instead of writing to the set_ftrace_filter file, the enable_event and disable_event triggers are written to the per-event 'trigger' files: echo 'enable_event:system:event' > .../othersys/otherevent/trigger echo 'disable_event:system:event' > .../othersys/otherevent/trigger The above commands will enable or disable the 'system:event' trace events whenever the othersys:otherevent events are hit. This also adds a 'count' version that limits the number of times the command will be invoked: echo 'enable_event:system:event:N' > .../othersys/otherevent/trigger echo 'disable_event:system:event:N' > .../othersys/otherevent/trigger Where N is the number of times the command will be invoked. The above commands will will enable or disable the 'system:event' trace events whenever the othersys:otherevent events are hit, but only N times. This also makes the find_event_file() helper function extern, since it's useful to use from other places, such as the event triggers code, so make it accessible. Link: http://lkml.kernel.org/r/f825f3048c3f6b026ee37ae5825f9fc373451828.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>