From 2b144498350860b6ee9dc57ff27a93ad488de5dc Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Thu, 9 Feb 2012 14:56:42 +0530 Subject: uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints Add uprobes support to the core kernel, with x86 support. This commit adds the kernel facilities, the actual uprobes user-space ABI and perf probe support comes in later commits. General design: Uprobes are maintained in an rb-tree indexed by inode and offset (the offset here is from the start of the mapping). For a unique (inode, offset) tuple, there can be at most one uprobe in the rb-tree. Since the (inode, offset) tuple identifies a unique uprobe, more than one user may be interested in the same uprobe. This provides the ability to connect multiple 'consumers' to the same uprobe. Each consumer defines a handler and a filter (optional). The 'handler' is run every time the uprobe is hit, if it matches the 'filter' criteria. The first consumer of a uprobe causes the breakpoint to be inserted at the specified address and subsequent consumers are appended to this list. On subsequent probes, the consumer gets appended to the existing list of consumers. The breakpoint is removed when the last consumer unregisters. For all other unregisterations, the consumer is removed from the list of consumers. Given a inode, we get a list of the mms that have mapped the inode. Do the actual registration if mm maps the page where a probe needs to be inserted/removed. We use a temporary list to walk through the vmas that map the inode. - The number of maps that map the inode, is not known before we walk the rmap and keeps changing. - extending vm_area_struct wasn't recommended, it's a size-critical data structure. - There can be more than one maps of the inode in the same mm. We add callbacks to the mmap methods to keep an eye on text vmas that are of interest to uprobes. When a vma of interest is mapped, we insert the breakpoint at the right address. Uprobe works by replacing the instruction at the address defined by (inode, offset) with the arch specific breakpoint instruction. We save a copy of the original instruction at the uprobed address. This is needed for: a. executing the instruction out-of-line (xol). b. instruction analysis for any subsequent fixups. c. restoring the instruction back when the uprobe is unregistered. We insert or delete a breakpoint instruction, and this breakpoint instruction is assumed to be the smallest instruction available on the platform. For fixed size instruction platforms this is trivially true, for variable size instruction platforms the breakpoint instruction is typically the smallest (often a single byte). Writing the instruction is done by COWing the page and changing the instruction during the copy, this even though most platforms allow atomic writes of the breakpoint instruction. This also mirrors the behaviour of a ptrace() memory write to a PRIVATE file map. The core worker is derived from KSM's replace_page() logic. In essence, similar to KSM: a. allocate a new page and copy over contents of the page that has the uprobed vaddr b. modify the copy and insert the breakpoint at the required address c. switch the original page with the copy containing the breakpoint d. flush page tables. replace_page() is being replicated here because of some minor changes in the type of pages and also because Hugh Dickins had plans to improve replace_page() for KSM specific work. Instruction analysis on x86 is based on instruction decoder and determines if an instruction can be probed and determines the necessary fixups after singlestep. Instruction analysis is done at probe insertion time so that we avoid having to repeat the same analysis every time a probe is hit. A lot of code here is due to the improvement/suggestions/inputs from Peter Zijlstra. Changelog: (v10): - Add code to clear REX.B prefix as suggested by Denys Vlasenko and Masami Hiramatsu. (v9): - Use insn_offset_modrm as suggested by Masami Hiramatsu. (v7): Handle comments from Peter Zijlstra: - Dont take reference to inode. (expect inode to uprobe_register to be sane). - Use PTR_ERR to set the return value. - No need to take reference to inode. - use PTR_ERR to return error value. - register and uprobe_unregister share code. (v5): - Modified del_consumer as per comments from Peter. - Drop reference to inode before dropping reference to uprobe. - Use i_size_read(inode) instead of inode->i_size. - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called. - Includes errno.h as recommended by Stephen Rothwell to fix a build issue on sparc defconfig - Remove restrictions while unregistering. - Earlier code leaked inode references under some conditions while registering/unregistering. - Continue the vma-rmap walk even if the intermediate vma doesnt meet the requirements. - Validate the vma found by find_vma before inserting/removing the breakpoint - Call del_consumer under mutex_lock. - Use hash locks. - Handle mremap. - Introduce find_least_offset_node() instead of close match logic in find_uprobe - Uprobes no more depends on MM_OWNER; No reference to task_structs while inserting/removing a probe. - Uses read_mapping_page instead of grab_cache_page so that the pages have valid content. - pass NULL to get_user_pages for the task parameter. - call SetPageUptodate on the new page allocated in write_opcode. - fix leaking a reference to the new page under certain conditions. - Include Instruction Decoder if Uprobes gets defined. - Remove const attributes for instruction prefix arrays. - Uses mm_context to know if the application is 32 bit. Signed-off-by: Srikar Dronamraju Also-written-by: Jim Keniston Reviewed-by: Peter Zijlstra Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Roland McGrath Cc: Masami Hiramatsu Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Cc: Stephen Rothwell Cc: Denys Vlasenko Cc: Peter Zijlstra Cc: Linus Torvalds Cc: Andrew Morton Cc: Linux-mm Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com [ Made various small edits to the commit log ] Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 98 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 98 insertions(+) create mode 100644 include/linux/uprobes.h (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h new file mode 100644 index 000000000000..f1d13fd140f2 --- /dev/null +++ b/include/linux/uprobes.h @@ -0,0 +1,98 @@ +#ifndef _LINUX_UPROBES_H +#define _LINUX_UPROBES_H +/* + * Userspace Probes (UProbes) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2008-2011 + * Authors: + * Srikar Dronamraju + * Jim Keniston + */ + +#include +#include + +struct vm_area_struct; +#ifdef CONFIG_ARCH_SUPPORTS_UPROBES +#include +#else + +typedef u8 uprobe_opcode_t; +struct uprobe_arch_info {}; + +#define MAX_UINSN_BYTES 4 +#endif + +#define uprobe_opcode_sz sizeof(uprobe_opcode_t) + +/* flags that denote/change uprobes behaviour */ +/* Have a copy of original instruction */ +#define UPROBES_COPY_INSN 0x1 +/* Dont run handlers when first register/ last unregister in progress*/ +#define UPROBES_RUN_HANDLER 0x2 + +struct uprobe_consumer { + int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs); + /* + * filter is optional; If a filter exists, handler is run + * if and only if filter returns true. + */ + bool (*filter)(struct uprobe_consumer *self, struct task_struct *task); + + struct uprobe_consumer *next; +}; + +struct uprobe { + struct rb_node rb_node; /* node in the rb tree */ + atomic_t ref; + struct rw_semaphore consumer_rwsem; + struct list_head pending_list; + struct uprobe_arch_info arch_info; + struct uprobe_consumer *consumers; + struct inode *inode; /* Also hold a ref to inode */ + loff_t offset; + int flags; + u8 insn[MAX_UINSN_BYTES]; +}; + +#ifdef CONFIG_UPROBES +extern int __weak set_bkpt(struct mm_struct *mm, struct uprobe *uprobe, + unsigned long vaddr); +extern int __weak set_orig_insn(struct mm_struct *mm, struct uprobe *uprobe, + unsigned long vaddr, bool verify); +extern bool __weak is_bkpt_insn(uprobe_opcode_t *insn); +extern int register_uprobe(struct inode *inode, loff_t offset, + struct uprobe_consumer *consumer); +extern void unregister_uprobe(struct inode *inode, loff_t offset, + struct uprobe_consumer *consumer); +extern int mmap_uprobe(struct vm_area_struct *vma); +#else /* CONFIG_UPROBES is not defined */ +static inline int register_uprobe(struct inode *inode, loff_t offset, + struct uprobe_consumer *consumer) +{ + return -ENOSYS; +} +static inline void unregister_uprobe(struct inode *inode, loff_t offset, + struct uprobe_consumer *consumer) +{ +} +static inline int mmap_uprobe(struct vm_area_struct *vma) +{ + return 0; +} +#endif /* CONFIG_UPROBES */ +#endif /* _LINUX_UPROBES_H */ -- cgit v1.2.3 From 7b2d81d48a2d8e37efb6ce7b4d5ef58822b30d89 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Fri, 17 Feb 2012 09:27:41 +0100 Subject: uprobes/core: Clean up, refactor and improve the code Make the uprobes code readable to me: - improve the Kconfig text so that a mere mortal gets some idea what CONFIG_UPROBES=y is really about - do trivial renames to standardize around the uprobes_*() namespace - clean up and simplify various code flow details - separate basic blocks of functionality - line break artifact and white space related removal - use standard local varible definition blocks - use vertical spacing to make things more readable - remove unnecessary volatile - restructure comment blocks to make them more uniform and more readable in general Cc: Srikar Dronamraju Cc: Jim Keniston Cc: Peter Zijlstra Cc: Oleg Nesterov Cc: Masami Hiramatsu Cc: Arnaldo Carvalho de Melo Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.org Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index f1d13fd140f2..64e45f116b2a 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -1,7 +1,7 @@ #ifndef _LINUX_UPROBES_H #define _LINUX_UPROBES_H /* - * Userspace Probes (UProbes) + * User-space Probes (UProbes) * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -40,8 +40,10 @@ struct uprobe_arch_info {}; #define uprobe_opcode_sz sizeof(uprobe_opcode_t) /* flags that denote/change uprobes behaviour */ + /* Have a copy of original instruction */ #define UPROBES_COPY_INSN 0x1 + /* Dont run handlers when first register/ last unregister in progress*/ #define UPROBES_RUN_HANDLER 0x2 @@ -70,27 +72,23 @@ struct uprobe { }; #ifdef CONFIG_UPROBES -extern int __weak set_bkpt(struct mm_struct *mm, struct uprobe *uprobe, - unsigned long vaddr); -extern int __weak set_orig_insn(struct mm_struct *mm, struct uprobe *uprobe, - unsigned long vaddr, bool verify); +extern int __weak set_bkpt(struct mm_struct *mm, struct uprobe *uprobe, unsigned long vaddr); +extern int __weak set_orig_insn(struct mm_struct *mm, struct uprobe *uprobe, unsigned long vaddr, bool verify); extern bool __weak is_bkpt_insn(uprobe_opcode_t *insn); -extern int register_uprobe(struct inode *inode, loff_t offset, - struct uprobe_consumer *consumer); -extern void unregister_uprobe(struct inode *inode, loff_t offset, - struct uprobe_consumer *consumer); -extern int mmap_uprobe(struct vm_area_struct *vma); +extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); +extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); +extern int uprobe_mmap(struct vm_area_struct *vma); #else /* CONFIG_UPROBES is not defined */ -static inline int register_uprobe(struct inode *inode, loff_t offset, - struct uprobe_consumer *consumer) +static inline int +uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer) { return -ENOSYS; } -static inline void unregister_uprobe(struct inode *inode, loff_t offset, - struct uprobe_consumer *consumer) +static inline void +uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer) { } -static inline int mmap_uprobe(struct vm_area_struct *vma) +static inline int uprobe_mmap(struct vm_area_struct *vma) { return 0; } -- cgit v1.2.3 From 96379f60075c75b261328aa7830ef8aa158247ac Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Wed, 22 Feb 2012 14:45:49 +0530 Subject: uprobes/core: Remove uprobe_opcode_sz uprobe_opcode_sz refers to the smallest instruction size for that architecture. UPROBES_BKPT_INSN_SIZE refers to the size of the breakpoint instruction for that architecture. For now we are assuming that both uprobe_opcode_sz and UPROBES_BKPT_INSN_SIZE are the same for all archs and hence removing uprobe_opcode_sz in favour of UPROBES_BKPT_INSN_SIZE. However if we have to support architectures where the smallest instruction size is different from the size of breakpoint instruction, we may have to re-introduce uprobe_opcode_sz. Signed-off-by: Srikar Dronamraju Cc: Peter Zijlstra Cc: Linus Torvalds Cc: Oleg Nesterov Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Jiri Olsa Cc: Josh Stone Link: http://lkml.kernel.org/r/20120222091549.15880.67020.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 2 -- 1 file changed, 2 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 64e45f116b2a..fd45b70750d4 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -37,8 +37,6 @@ struct uprobe_arch_info {}; #define MAX_UINSN_BYTES 4 #endif -#define uprobe_opcode_sz sizeof(uprobe_opcode_t) - /* flags that denote/change uprobes behaviour */ /* Have a copy of original instruction */ -- cgit v1.2.3 From 3ff54efdfaace9e9b2b7c1959a865be6b91de96c Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Wed, 22 Feb 2012 14:46:02 +0530 Subject: uprobes/core: Move insn to arch specific structure Few cleanups suggested by Ingo Molnar. - Rename struct uprobe_arch_info to struct arch_uprobe. - Move insn from struct uprobe to struct arch_uprobe. - Make arch specific uprobe functions to accept struct arch_uprobe instead of struct uprobe. - Move struct uprobe to kernel/uprobes.c from include/linux/uprobes.h Signed-off-by: Srikar Dronamraju Cc: Peter Zijlstra Cc: Linus Torvalds Cc: Oleg Nesterov Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Jiri Olsa Cc: Josh Stone Link: http://lkml.kernel.org/r/20120222091602.15880.40249.sendpatchset@srdronam.in.ibm.com [ Made various small improvements ] Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 23 ++--------------------- 1 file changed, 2 insertions(+), 21 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index fd45b70750d4..9c6be62787ed 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -29,12 +29,6 @@ struct vm_area_struct; #ifdef CONFIG_ARCH_SUPPORTS_UPROBES #include -#else - -typedef u8 uprobe_opcode_t; -struct uprobe_arch_info {}; - -#define MAX_UINSN_BYTES 4 #endif /* flags that denote/change uprobes behaviour */ @@ -56,22 +50,9 @@ struct uprobe_consumer { struct uprobe_consumer *next; }; -struct uprobe { - struct rb_node rb_node; /* node in the rb tree */ - atomic_t ref; - struct rw_semaphore consumer_rwsem; - struct list_head pending_list; - struct uprobe_arch_info arch_info; - struct uprobe_consumer *consumers; - struct inode *inode; /* Also hold a ref to inode */ - loff_t offset; - int flags; - u8 insn[MAX_UINSN_BYTES]; -}; - #ifdef CONFIG_UPROBES -extern int __weak set_bkpt(struct mm_struct *mm, struct uprobe *uprobe, unsigned long vaddr); -extern int __weak set_orig_insn(struct mm_struct *mm, struct uprobe *uprobe, unsigned long vaddr, bool verify); +extern int __weak set_bkpt(struct mm_struct *mm, struct arch_uprobe *auprobe, unsigned long vaddr); +extern int __weak set_orig_insn(struct mm_struct *mm, struct arch_uprobe *auprobe, unsigned long vaddr, bool verify); extern bool __weak is_bkpt_insn(uprobe_opcode_t *insn); extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); -- cgit v1.2.3 From 35aa621b5ab9d08767f7bc8d209b696df281d715 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 22 Feb 2012 11:37:29 +0100 Subject: uprobes: Update copyright notices Add Peter Zijlstra's copyright to the uprobes code, whose contributions to the uprobes code are not visible in the Git history, because they were backmerged. Also update existing copyright notices to the year 2012. Acked-by: Srikar Dronamraju Cc: Peter Zijlstra Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Link: http://lkml.kernel.org/n/tip-vjqxst502pc1efz7ah8cyht4@git.kernel.org Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 9c6be62787ed..f85797e1ccd4 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -17,10 +17,11 @@ * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. * - * Copyright (C) IBM Corporation, 2008-2011 + * Copyright (C) IBM Corporation, 2008-2012 * Authors: * Srikar Dronamraju * Jim Keniston + * Copyright (C) 2011-2012 Red Hat, Inc., Peter Zijlstra */ #include -- cgit v1.2.3 From 900771a483ef28915a48066d7895d8252315607a Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Mon, 12 Mar 2012 14:55:14 +0530 Subject: uprobes/core: Make macro names consistent Rename macros that refer to individual uprobe to start with UPROBE_ instead of UPROBES_. This is pure cleanup, no functional change intended. Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120312092514.5379.36595.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index f85797e1ccd4..838fb312926a 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -35,10 +35,10 @@ struct vm_area_struct; /* flags that denote/change uprobes behaviour */ /* Have a copy of original instruction */ -#define UPROBES_COPY_INSN 0x1 +#define UPROBE_COPY_INSN 0x1 /* Dont run handlers when first register/ last unregister in progress*/ -#define UPROBES_RUN_HANDLER 0x2 +#define UPROBE_RUN_HANDLER 0x2 struct uprobe_consumer { int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs); -- cgit v1.2.3 From e3343e6a2819ff5d0dfc4bb5c9fb7f9a4d04da73 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Mon, 12 Mar 2012 14:55:30 +0530 Subject: uprobes/core: Make order of function parameters consistent across functions If a function takes struct uprobe or struct arch_uprobe, then it is passed as the first parameter. This is pure cleanup, no functional change intended. Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120312092530.5379.18394.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 838fb312926a..58699182e9a7 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -52,20 +52,20 @@ struct uprobe_consumer { }; #ifdef CONFIG_UPROBES -extern int __weak set_bkpt(struct mm_struct *mm, struct arch_uprobe *auprobe, unsigned long vaddr); -extern int __weak set_orig_insn(struct mm_struct *mm, struct arch_uprobe *auprobe, unsigned long vaddr, bool verify); +extern int __weak set_bkpt(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); +extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify); extern bool __weak is_bkpt_insn(uprobe_opcode_t *insn); -extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); -extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer); +extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); +extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern int uprobe_mmap(struct vm_area_struct *vma); #else /* CONFIG_UPROBES is not defined */ static inline int -uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer) +uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) { return -ENOSYS; } static inline void -uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *consumer) +uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) { } static inline int uprobe_mmap(struct vm_area_struct *vma) -- cgit v1.2.3 From 5cb4ac3a583d4ee18c8682ab857e093c4a0d0895 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Mon, 12 Mar 2012 14:55:45 +0530 Subject: uprobes/core: Rename bkpt to swbp bkpt doesnt seem to be a correct abbrevation for breakpoint. Choice was between bp and breakpoint. Since bp can refer to things other than breakpoint, use swbp to refer to breakpoints. This is pure cleanup, no functional change intended. Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120312092545.5379.91251.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 58699182e9a7..eac525f41b94 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -52,9 +52,9 @@ struct uprobe_consumer { }; #ifdef CONFIG_UPROBES -extern int __weak set_bkpt(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); +extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify); -extern bool __weak is_bkpt_insn(uprobe_opcode_t *insn); +extern bool __weak is_swbp_insn(uprobe_opcode_t *insn); extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern int uprobe_mmap(struct vm_area_struct *vma); -- cgit v1.2.3 From 0326f5a94ddea33fa331b2519f4172f4fb387baa Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Tue, 13 Mar 2012 23:30:11 +0530 Subject: uprobes/core: Handle breakpoint and singlestep exceptions Uprobes uses exception notifiers to get to know if a thread hit a breakpoint or a singlestep exception. When a thread hits a uprobe or is singlestepping post a uprobe hit, the uprobe exception notifier sets its TIF_UPROBE bit, which will then be checked on its return to userspace path (do_notify_resume() ->uprobe_notify_resume()), where the consumers handlers are run (in task context) based on the defined filters. Uprobe hits are thread specific and hence we need to maintain information about if a task hit a uprobe, what uprobe was hit, the slot where the original instruction was copied for xol so that it can be singlestepped with appropriate fixups. In some cases, special care is needed for instructions that are executed out of line (xol). These are architecture specific artefacts, such as handling RIP relative instructions on x86_64. Since the instruction at which the uprobe was inserted is executed out of line, architecture specific fixups are added so that the thread continues normal execution in the presence of a uprobe. Postpone the signals until we execute the probed insn. post_xol() path does a recalc_sigpending() before return to user-mode, this ensures the signal can't be lost. Uprobes relies on DIE_DEBUG notification to notify if a singlestep is complete. Adds x86 specific uprobe exception notifiers and appropriate hooks needed to determine a uprobe hit and subsequent post processing. Add requisite x86 fixups for xol for uprobes. Specific cases needing fixups include relative jumps (x86_64), calls, etc. Where possible, we check and skip singlestepping the breakpointed instructions. For now we skip single byte as well as few multibyte nop instructions. However this can be extended to other instructions too. Credits to Oleg Nesterov for suggestions/patches related to signal, breakpoint, singlestep handling code. Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120313180011.29771.89027.sendpatchset@srdronam.in.ibm.com [ Performed various cleanliness edits ] Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 55 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 52 insertions(+), 3 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index eac525f41b94..5ec778fdce6f 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -28,8 +28,9 @@ #include struct vm_area_struct; + #ifdef CONFIG_ARCH_SUPPORTS_UPROBES -#include +# include #endif /* flags that denote/change uprobes behaviour */ @@ -39,6 +40,8 @@ struct vm_area_struct; /* Dont run handlers when first register/ last unregister in progress*/ #define UPROBE_RUN_HANDLER 0x2 +/* Can skip singlestep */ +#define UPROBE_SKIP_SSTEP 0x4 struct uprobe_consumer { int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs); @@ -52,13 +55,42 @@ struct uprobe_consumer { }; #ifdef CONFIG_UPROBES +enum uprobe_task_state { + UTASK_RUNNING, + UTASK_BP_HIT, + UTASK_SSTEP, + UTASK_SSTEP_ACK, + UTASK_SSTEP_TRAPPED, +}; + +/* + * uprobe_task: Metadata of a task while it singlesteps. + */ +struct uprobe_task { + enum uprobe_task_state state; + struct arch_uprobe_task autask; + + struct uprobe *active_uprobe; + + unsigned long xol_vaddr; + unsigned long vaddr; +}; + extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify); extern bool __weak is_swbp_insn(uprobe_opcode_t *insn); extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern int uprobe_mmap(struct vm_area_struct *vma); -#else /* CONFIG_UPROBES is not defined */ +extern void uprobe_free_utask(struct task_struct *t); +extern void uprobe_copy_process(struct task_struct *t); +extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs); +extern int uprobe_post_sstep_notifier(struct pt_regs *regs); +extern int uprobe_pre_sstep_notifier(struct pt_regs *regs); +extern void uprobe_notify_resume(struct pt_regs *regs); +extern bool uprobe_deny_signal(void); +extern bool __weak arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs); +#else /* !CONFIG_UPROBES */ static inline int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) { @@ -72,5 +104,22 @@ static inline int uprobe_mmap(struct vm_area_struct *vma) { return 0; } -#endif /* CONFIG_UPROBES */ +static inline void uprobe_notify_resume(struct pt_regs *regs) +{ +} +static inline bool uprobe_deny_signal(void) +{ + return false; +} +static inline unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) +{ + return 0; +} +static inline void uprobe_free_utask(struct task_struct *t) +{ +} +static inline void uprobe_copy_process(struct task_struct *t) +{ +} +#endif /* !CONFIG_UPROBES */ #endif /* _LINUX_UPROBES_H */ -- cgit v1.2.3 From d4b3b6384f98f8692ad0209891ccdbc7e78bbefe Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Fri, 30 Mar 2012 23:56:31 +0530 Subject: uprobes/core: Allocate XOL slots for uprobes use Uprobes executes the original instruction at a probed location out of line. For this, we allocate a page (per mm) upon the first uprobe hit, in the process user address space, divide it into slots that are used to store the actual instructions to be singlestepped. These slots are known as xol (execution out of line) slots. Care is taken to ensure that the allocation is in an unmapped area as close to the top of the user address space as possible, with appropriate permission settings to keep selinux like frameworks happy. Upon a uprobe hit, a free slot is acquired, and is released after the singlestep completes. Lots of improvements courtesy suggestions/inputs from Peter and Oleg. [ Folded a fix for build issue on powerpc fixed and reported by Stephen Rothwell. ] Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120330182631.10018.48175.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index 5ec778fdce6f..a111460c07d5 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -28,6 +28,8 @@ #include struct vm_area_struct; +struct mm_struct; +struct inode; #ifdef CONFIG_ARCH_SUPPORTS_UPROBES # include @@ -76,6 +78,28 @@ struct uprobe_task { unsigned long vaddr; }; +/* + * On a breakpoint hit, thread contests for a slot. It frees the + * slot after singlestep. Currently a fixed number of slots are + * allocated. + */ +struct xol_area { + wait_queue_head_t wq; /* if all slots are busy */ + atomic_t slot_count; /* number of in-use slots */ + unsigned long *bitmap; /* 0 = free slot */ + struct page *page; + + /* + * We keep the vma's vm_start rather than a pointer to the vma + * itself. The probed process or a naughty kernel module could make + * the vma go away, and we must handle that reasonably gracefully. + */ + unsigned long vaddr; /* Page(s) of instruction slots */ +}; + +struct uprobes_state { + struct xol_area *xol_area; +}; extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify); extern bool __weak is_swbp_insn(uprobe_opcode_t *insn); @@ -90,7 +114,11 @@ extern int uprobe_pre_sstep_notifier(struct pt_regs *regs); extern void uprobe_notify_resume(struct pt_regs *regs); extern bool uprobe_deny_signal(void); extern bool __weak arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs); +extern void uprobe_clear_state(struct mm_struct *mm); +extern void uprobe_reset_state(struct mm_struct *mm); #else /* !CONFIG_UPROBES */ +struct uprobes_state { +}; static inline int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc) { @@ -121,5 +149,11 @@ static inline void uprobe_free_utask(struct task_struct *t) static inline void uprobe_copy_process(struct task_struct *t) { } +static inline void uprobe_clear_state(struct mm_struct *mm) +{ +} +static inline void uprobe_reset_state(struct mm_struct *mm) +{ +} #endif /* !CONFIG_UPROBES */ #endif /* _LINUX_UPROBES_H */ -- cgit v1.2.3 From 682968e0c425c60f0dde37977e5beb2b12ddc4cc Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Fri, 30 Mar 2012 23:56:46 +0530 Subject: uprobes/core: Optimize probe hits with the help of a counter Maintain a per-mm counter: number of uprobes that are inserted on this process address space. This counter can be used at probe hit time to determine if we need a lookup in the uprobes rbtree. Everytime a probe gets inserted successfully, the probe count is incremented and everytime a probe gets removed, the probe count is decremented. The new uprobe_munmap hook ensures the count is correct on a unmap or remap of a region. We expect that once a uprobe_munmap() is called, the vma goes away. So uprobe_unregister() finding a probe to unregister would either mean unmap event hasnt occurred yet or a mmap event on the same executable file occured after a unmap event. Additionally, uprobe_mmap hook now also gets called: a. on every executable vma that is COWed at fork. b. a vma of interest is newly mapped; breakpoint insertion also happens at the required address. On process creation, make sure the probes count in the child is set correctly. Special cases that are taken care include: a. mremap b. VM_DONTCOPY vmas on fork() c. insertion/removal races in the parent during fork(). Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120330182646.10018.85805.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index a111460c07d5..d594d3b3ad4c 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -99,6 +99,7 @@ struct xol_area { struct uprobes_state { struct xol_area *xol_area; + atomic_t count; }; extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr); extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify); @@ -106,6 +107,7 @@ extern bool __weak is_swbp_insn(uprobe_opcode_t *insn); extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern int uprobe_mmap(struct vm_area_struct *vma); +extern void uprobe_munmap(struct vm_area_struct *vma); extern void uprobe_free_utask(struct task_struct *t); extern void uprobe_copy_process(struct task_struct *t); extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs); @@ -132,6 +134,9 @@ static inline int uprobe_mmap(struct vm_area_struct *vma) { return 0; } +static inline void uprobe_munmap(struct vm_area_struct *vma) +{ +} static inline void uprobe_notify_resume(struct pt_regs *regs) { } -- cgit v1.2.3 From cbc91f71b51b8335f1fc7ccfca8011f31a717367 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Wed, 11 Apr 2012 16:05:27 +0530 Subject: uprobes/core: Decrement uprobe count before the pages are unmapped Uprobes has a callback (uprobe_munmap()) in the unmap path to maintain the uprobes count. In the exit path this callback gets called in unlink_file_vma(). However by the time unlink_file_vma() is called, the pages would have been unmapped (in unmap_vmas()) and the task->rss_stat counts accounted (in zap_pte_range()). If the exiting process has probepoints, uprobe_munmap() checks if the breakpoint instruction was around before decrementing the probe count. This results in a file backed page being reread by uprobe_munmap() and hence it does not find the breakpoint. This patch fixes this problem by moving the callback to unmap_single_vma(). Since unmap_single_vma() may not unmap the complete vma, add start and end parameters to uprobe_munmap(). This bug became apparent courtesy of commit c3f0327f8e9d ("mm: add rss counters consistency check"). Signed-off-by: Srikar Dronamraju Cc: Linus Torvalds Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Linux-mm Cc: Oleg Nesterov Cc: Andi Kleen Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Anton Arapov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20120411103527.23245.9835.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar --- include/linux/uprobes.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include/linux/uprobes.h') diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index d594d3b3ad4c..efe4b3308c74 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -107,7 +107,7 @@ extern bool __weak is_swbp_insn(uprobe_opcode_t *insn); extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc); extern int uprobe_mmap(struct vm_area_struct *vma); -extern void uprobe_munmap(struct vm_area_struct *vma); +extern void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end); extern void uprobe_free_utask(struct task_struct *t); extern void uprobe_copy_process(struct task_struct *t); extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs); @@ -134,7 +134,8 @@ static inline int uprobe_mmap(struct vm_area_struct *vma) { return 0; } -static inline void uprobe_munmap(struct vm_area_struct *vma) +static inline void +uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end) { } static inline void uprobe_notify_resume(struct pt_regs *regs) -- cgit v1.2.3