<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/tools/objtool/elf.h, branch v5.7-rc2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>objtool: Optimize find_rela_by_dest_range()</title>
<updated>2020-03-25T17:28:31+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-12T10:30:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=74b873e49d92f90deb41d1a2a8fbb70328aebd67'/>
<id>74b873e49d92f90deb41d1a2a8fbb70328aebd67</id>
<content type='text'>
Perf shows there is significant time in find_rela_by_dest(); this is
because we have to iterate the address space per byte, looking for
relocation entries.

Optimize this by reducing the address space granularity.

This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.861321325@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Perf shows there is significant time in find_rela_by_dest(); this is
because we have to iterate the address space per byte, looking for
relocation entries.

Optimize this by reducing the address space granularity.

This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.861321325@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize read_sections()</title>
<updated>2020-03-25T17:28:30+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-12T10:23:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8b5fa6bc326bf02f293b5a39a8f5b3de816265d3'/>
<id>8b5fa6bc326bf02f293b5a39a8f5b3de816265d3</id>
<content type='text'>
Perf showed that __hash_init() is a significant portion of
read_sections(), so instead of doing a per section rela_hash, use an
elf-wide rela_hash.

Statistics show us there are about 1.1 million relas, so size it
accordingly.

This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5
seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.739153726@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Perf showed that __hash_init() is a significant portion of
read_sections(), so instead of doing a per section rela_hash, use an
elf-wide rela_hash.

Statistics show us there are about 1.1 million relas, so size it
accordingly.

This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5
seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.739153726@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize find_symbol_by_name()</title>
<updated>2020-03-25T17:28:30+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-12T09:17:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cdb3d057a17d56363a831e486ea39e4c389a6cf9'/>
<id>cdb3d057a17d56363a831e486ea39e4c389a6cf9</id>
<content type='text'>
Perf showed that find_symbol_by_name() takes time; add a symbol name
hash.

This shaves another second off of objtool on vmlinux.o runtime, down
to 15 seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.676865656@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Perf showed that find_symbol_by_name() takes time; add a symbol name
hash.

This shaves another second off of objtool on vmlinux.o runtime, down
to 15 seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.676865656@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Rename find_containing_func()</title>
<updated>2020-03-25T17:28:29+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-16T09:36:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=53d20720bbc8718ef86fdfe53dec0accfb593ef8'/>
<id>53d20720bbc8718ef86fdfe53dec0accfb593ef8</id>
<content type='text'>
For consistency; we have:

  find_symbol_by_offset() / find_symbol_containing()
  find_func_by_offset()   / find_containing_func()

fix that.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.558470724@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For consistency; we have:

  find_symbol_by_offset() / find_symbol_containing()
  find_func_by_offset()   / find_containing_func()

fix that.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.558470724@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize find_symbol_*() and read_symbols()</title>
<updated>2020-03-25T17:28:29+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-12T08:34:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2a362ecc3ec9632aeea4b9a9062db91b2bd9975a'/>
<id>2a362ecc3ec9632aeea4b9a9062db91b2bd9975a</id>
<content type='text'>
All of:

  read_symbols(), find_symbol_by_offset(), find_symbol_containing(),
  find_containing_func()

do a linear search of the symbols. Add an RB tree to make it go
faster.

This about halves objtool runtime on vmlinux.o, from 34s to 18s.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.499016559@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All of:

  read_symbols(), find_symbol_by_offset(), find_symbol_containing(),
  find_containing_func()

do a linear search of the symbols. Add an RB tree to make it go
faster.

This about halves objtool runtime on vmlinux.o, from 34s to 18s.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.499016559@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize find_section_by_name()</title>
<updated>2020-03-25T17:28:29+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-12T08:32:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ae358196fac3a0b4d2a7d47a4f401e3421027b03'/>
<id>ae358196fac3a0b4d2a7d47a4f401e3421027b03</id>
<content type='text'>
In order to avoid yet another linear search of (20k) sections, add a
name based hash.

This reduces objtool runtime on vmlinux.o by some 10s to around 35s.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.440174280@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to avoid yet another linear search of (20k) sections, add a
name based hash.

This reduces objtool runtime on vmlinux.o by some 10s to around 35s.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.440174280@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize find_section_by_index()</title>
<updated>2020-03-25T17:28:28+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-10T17:43:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=530389968739883a61192767e1c215653ba4ba2b'/>
<id>530389968739883a61192767e1c215653ba4ba2b</id>
<content type='text'>
In order to avoid a linear search (over 20k entries), add an
section_hash to the elf object.

This reduces objtool on vmlinux.o from a few minutes to around 45
seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.381249993@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to avoid a linear search (over 20k entries), add an
section_hash to the elf object.

This reduces objtool on vmlinux.o from a few minutes to around 45
seconds.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.381249993@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Optimize find_symbol_by_index()</title>
<updated>2020-03-25T17:28:28+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-03-10T17:39:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=65fb11a7f6aeae678043738d06248a4e21f4e4e4'/>
<id>65fb11a7f6aeae678043738d06248a4e21f4e4e4</id>
<content type='text'>
The symbol index is object wide, not per section, so it makes no sense
to have the symbol_hash be part of the section object. By moving it to
the elf object we avoid the linear sections iteration.

This reduces the runtime of objtool on vmlinux.o from over 3 hours (I
gave up) to a few minutes. The defconfig vmlinux.o has around 20k
sections.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.261852348@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The symbol index is object wide, not per section, so it makes no sense
to have the symbol_hash be part of the section object. By moving it to
the elf object we avoid the linear sections iteration.

This reduces the runtime of objtool on vmlinux.o from over 3 hours (I
gave up) to a few minutes. The defconfig vmlinux.o has around 20k
sections.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Miroslav Benes &lt;mbenes@suse.cz&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200324160924.261852348@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Improve call destination function detection</title>
<updated>2020-02-21T09:20:34+00:00</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@redhat.com</email>
</author>
<published>2020-02-18T03:41:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7acfe5315312fc56c2a94c9216448087b38ae909'/>
<id>7acfe5315312fc56c2a94c9216448087b38ae909</id>
<content type='text'>
A recent clang change, combined with a binutils bug, can trigger a
situation where a ".Lprintk$local" STT_NOTYPE symbol gets created at the
same offset as the "printk" STT_FUNC symbol.  This confuses objtool:

  kernel/printk/printk.o: warning: objtool: ignore_loglevel_setup()+0x10: can't find call dest symbol at .text+0xc67

Improve the call destination detection by looking specifically for an
STT_FUNC symbol.

Reported-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/872
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25551
Link: https://lkml.kernel.org/r/0a7ee320bc0ea4469bd3dc450a7b4725669e0ea9.1581997059.git.jpoimboe@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A recent clang change, combined with a binutils bug, can trigger a
situation where a ".Lprintk$local" STT_NOTYPE symbol gets created at the
same offset as the "printk" STT_FUNC symbol.  This confuses objtool:

  kernel/printk/printk.o: warning: objtool: ignore_loglevel_setup()+0x10: can't find call dest symbol at .text+0xc67

Improve the call destination detection by looking specifically for an
STT_FUNC symbol.

Reported-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/872
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25551
Link: https://lkml.kernel.org/r/0a7ee320bc0ea4469bd3dc450a7b4725669e0ea9.1581997059.git.jpoimboe@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>objtool: Support repeated uses of the same C jump table</title>
<updated>2019-07-18T19:01:09+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2019-07-18T01:36:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bd98c81346468fc2f86aeeb44d4d0d6f763a62b7'/>
<id>bd98c81346468fc2f86aeeb44d4d0d6f763a62b7</id>
<content type='text'>
This fixes objtool for both a GCC issue and a Clang issue:

1) GCC issue:

   kernel/bpf/core.o: warning: objtool: ___bpf_prog_run()+0x8d5: sibling call from callable instruction with modified stack frame

   With CONFIG_RETPOLINE=n, GCC is doing the following optimization in
   ___bpf_prog_run().

   Before:

           select_insn:
                   jmp *jumptable(,%rax,8)
                   ...
           ALU64_ADD_X:
                   ...
                   jmp select_insn
           ALU_ADD_X:
                   ...
                   jmp select_insn

   After:

           select_insn:
                   jmp *jumptable(, %rax, 8)
                   ...
           ALU64_ADD_X:
                   ...
                   jmp *jumptable(, %rax, 8)
           ALU_ADD_X:
                   ...
                   jmp *jumptable(, %rax, 8)

   This confuses objtool.  It has never seen multiple indirect jump
   sites which use the same jump table.

   For GCC switch tables, the only way of detecting the size of a table
   is by continuing to scan for more tables.  The size of the previous
   table can only be determined after another switch table is found, or
   when the scan reaches the end of the function.

   That logic was reused for C jump tables, and was based on the
   assumption that each jump table only has a single jump site.  The
   above optimization breaks that assumption.

2) Clang issue:

   drivers/usb/misc/sisusbvga/sisusb.o: warning: objtool: sisusb_write_mem_bulk()+0x588: can't find switch jump table

   With clang 9, code can be generated where a function contains two
   indirect jump instructions which use the same switch table.

The fix is the same for both issues: split the jump table parsing into
two passes.

In the first pass, locate the heads of all switch tables for the
function and mark their locations.

In the second pass, parse the switch tables and add them.

Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reported-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/e995befaada9d4d8b2cf788ff3f566ba900d2b4d.1563413318.git.jpoimboe@redhat.com

Co-developed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes objtool for both a GCC issue and a Clang issue:

1) GCC issue:

   kernel/bpf/core.o: warning: objtool: ___bpf_prog_run()+0x8d5: sibling call from callable instruction with modified stack frame

   With CONFIG_RETPOLINE=n, GCC is doing the following optimization in
   ___bpf_prog_run().

   Before:

           select_insn:
                   jmp *jumptable(,%rax,8)
                   ...
           ALU64_ADD_X:
                   ...
                   jmp select_insn
           ALU_ADD_X:
                   ...
                   jmp select_insn

   After:

           select_insn:
                   jmp *jumptable(, %rax, 8)
                   ...
           ALU64_ADD_X:
                   ...
                   jmp *jumptable(, %rax, 8)
           ALU_ADD_X:
                   ...
                   jmp *jumptable(, %rax, 8)

   This confuses objtool.  It has never seen multiple indirect jump
   sites which use the same jump table.

   For GCC switch tables, the only way of detecting the size of a table
   is by continuing to scan for more tables.  The size of the previous
   table can only be determined after another switch table is found, or
   when the scan reaches the end of the function.

   That logic was reused for C jump tables, and was based on the
   assumption that each jump table only has a single jump site.  The
   above optimization breaks that assumption.

2) Clang issue:

   drivers/usb/misc/sisusbvga/sisusb.o: warning: objtool: sisusb_write_mem_bulk()+0x588: can't find switch jump table

   With clang 9, code can be generated where a function contains two
   indirect jump instructions which use the same switch table.

The fix is the same for both issues: split the jump table parsing into
two passes.

In the first pass, locate the heads of all switch tables for the
function and mark their locations.

In the second pass, parse the switch tables and add them.

Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reported-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/e995befaada9d4d8b2cf788ff3f566ba900d2b4d.1563413318.git.jpoimboe@redhat.com

Co-developed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
