<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/tools, branch master</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc64/ftrace: fix OOL stub count with clang</title>
<updated>2026-03-07T10:32:25+00:00</updated>
<author>
<name>Hari Bathini</name>
<email>hbathini@linux.ibm.com</email>
</author>
<published>2026-01-27T08:49:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=875612a7745013a43c67493cb0583ee3f7476344'/>
<id>875612a7745013a43c67493cb0583ee3f7476344</id>
<content type='text'>
The total number of out-of-line (OOL) stubs required for function
tracing is determined using the following command:

    $(OBJDUMP) -r -j __patchable_function_entries vmlinux.o

While this works correctly with GNU objdump, llvm-objdump does not
list the expected relocation records for this section. Fix this by
using the -d option and counting R_PPC64_ADDR64 relocation entries.
This works as desired with both objdump and llvm-objdump.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Tested-by: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260127084926.34497-3-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The total number of out-of-line (OOL) stubs required for function
tracing is determined using the following command:

    $(OBJDUMP) -r -j __patchable_function_entries vmlinux.o

While this works correctly with GNU objdump, llvm-objdump does not
list the expected relocation records for this section. Fix this by
using the -d option and counting R_PPC64_ADDR64 relocation entries.
This works as desired with both objdump and llvm-objdump.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Tested-by: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260127084926.34497-3-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64: make clang cross-build friendly</title>
<updated>2026-03-07T10:32:25+00:00</updated>
<author>
<name>Hari Bathini</name>
<email>hbathini@linux.ibm.com</email>
</author>
<published>2026-01-27T08:49:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=73cdf24e81e4eba52a40a6b10c6cf285d0ac23fd'/>
<id>73cdf24e81e4eba52a40a6b10c6cf285d0ac23fd</id>
<content type='text'>
ARCH_USING_PATCHABLE_FUNCTION_ENTRY depends on toolchain support for
-fpatchable-function-entry option. The current script that checks
for this support only handles GCC. Rename the script and extend it
to detect support for -fpatchable-function-entry with Clang as well,
allowing clean cross-compilation with Clang toolchains.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Tested-by: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260127084926.34497-2-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ARCH_USING_PATCHABLE_FUNCTION_ENTRY depends on toolchain support for
-fpatchable-function-entry option. The current script that checks
for this support only handles GCC. Rename the script and extend it
to detect support for -fpatchable-function-entry with Clang as well,
allowing clean cross-compilation with Clang toolchains.

Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Tested-by: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20260127084926.34497-2-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/tools: drop `-o pipefail` in gcc check scripts</title>
<updated>2025-12-22T12:26:34+00:00</updated>
<author>
<name>Jan Stancek</name>
<email>jstancek@redhat.com</email>
</author>
<published>2025-09-23T15:32:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f1164534ad62f0cc247d99650b07bd59ad2a49fd'/>
<id>f1164534ad62f0cc247d99650b07bd59ad2a49fd</id>
<content type='text'>
Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry")
Fixes: b71c9ffb1405 ("powerpc: Add arch/powerpc/tools directory")
Reported-by: Joe Lawrence &lt;joe.lawrence@redhat.com&gt;
Acked-by: Joe Lawrence &lt;joe.lawrence@redhat.com&gt;
Signed-off-by: Jan Stancek &lt;jstancek@redhat.com&gt;
Fixes: 8c50b72a3b4f ("powerpc/ftrace: Add Kconfig &amp; Make glue for mprofile-kernel")
Fixes: abba759796f9 ("powerpc/kbuild: move -mprofile-kernel check to Kconfig")
Tested-by: Justin M. Forbes &lt;jforbes@fedoraproject.org&gt;
Reviewed-by: Naveen N Rao (AMD) &lt;naveen@kernel.org&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/cc6cdd116c3ad9d990df21f13c6d8e8a83815bbd.1758641374.git.jstancek@redhat.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry")
Fixes: b71c9ffb1405 ("powerpc: Add arch/powerpc/tools directory")
Reported-by: Joe Lawrence &lt;joe.lawrence@redhat.com&gt;
Acked-by: Joe Lawrence &lt;joe.lawrence@redhat.com&gt;
Signed-off-by: Jan Stancek &lt;jstancek@redhat.com&gt;
Fixes: 8c50b72a3b4f ("powerpc/ftrace: Add Kconfig &amp; Make glue for mprofile-kernel")
Fixes: abba759796f9 ("powerpc/kbuild: move -mprofile-kernel check to Kconfig")
Tested-by: Justin M. Forbes &lt;jforbes@fedoraproject.org&gt;
Reviewed-by: Naveen N Rao (AMD) &lt;naveen@kernel.org&gt;
Reviewed-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/cc6cdd116c3ad9d990df21f13c6d8e8a83815bbd.1758641374.git.jstancek@redhat.com

</pre>
</div>
</content>
</entry>
<entry>
<title>arch:powerpc:tools This file was missing shebang line, so added it</title>
<updated>2025-11-11T09:06:24+00:00</updated>
<author>
<name>Bhaskar Chowdhury</name>
<email>unixbhaskar@gmail.com</email>
</author>
<published>2025-07-22T21:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f90d28443b1f4dbbbdcfea1be5295f6903acc94c'/>
<id>f90d28443b1f4dbbbdcfea1be5295f6903acc94c</id>
<content type='text'>
This file was missing the shebang line, so added it.

Signed-off-by: Bhaskar Chowdhury &lt;unixbhaskar@gmail.com&gt;
Reviewed-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20250722220043.14862-1-unixbhaskar@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This file was missing the shebang line, so added it.

Signed-off-by: Bhaskar Chowdhury &lt;unixbhaskar@gmail.com&gt;
Reviewed-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20250722220043.14862-1-unixbhaskar@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/ftrace: Fix ftrace bug with KASAN=y</title>
<updated>2024-11-10T11:33:51+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2024-11-07T11:16:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cfec8463d9a19ec043845525fe5fd675e59a8aab'/>
<id>cfec8463d9a19ec043845525fe5fd675e59a8aab</id>
<content type='text'>
Booting a KASAN=y kernel with the recently added ftrace out-of-line
support causes a warning at boot:

  ------------[ cut here ]------------
  Stub index overflow (1729 &gt; 1728)
  WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/trace/ftrace.c:209 ftrace_init_nop+0x408/0x444
  ...
  NIP ftrace_init_nop+0x408/0x444
  LR  ftrace_init_nop+0x404/0x444
  Call Trace:
    ftrace_init_nop+0x404/0x444 (unreliable)
    ftrace_process_locs+0x544/0x8a0
    ftrace_init+0xb4/0x22c
    start_kernel+0x1dc/0x4d4
    start_here_common+0x1c/0x20
  ...
  ftrace failed to modify
  [&lt;c0000000030beddc&gt;] _sub_I_65535_1+0x8/0x3c
   actual:   00:00:00:60
  Initializing ftrace call sites
  ftrace record flags: 0
   (0)
   expected tramp: c00000000008b418
  ------------[ cut here ]------------

The function in question, _sub_I_65535_1 is some sort of trampoline
generated for KASAN, and is in the .text.startup section. That section
is part of INIT_TEXT, meaning is_kernel_inittext() returns true for it.

But the script that determines how many out-of-line ftrace stubs are
needed isn't doesn't consider .text.startup as inittext, leading to
there not being enough space for the init stubs.

Conversely the logic to calculate how many stubs are needed for the text
section isn't filtering out the symbols in .text.startup and so ends up
over counting.

Fix both problems by calculating the total number of stubs first, then
the number that count as inittext, and then subtract the latter from the
former to get the count for the text section.

Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241107111630.31068-1-mpe@ellerman.id.au

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Booting a KASAN=y kernel with the recently added ftrace out-of-line
support causes a warning at boot:

  ------------[ cut here ]------------
  Stub index overflow (1729 &gt; 1728)
  WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/trace/ftrace.c:209 ftrace_init_nop+0x408/0x444
  ...
  NIP ftrace_init_nop+0x408/0x444
  LR  ftrace_init_nop+0x404/0x444
  Call Trace:
    ftrace_init_nop+0x404/0x444 (unreliable)
    ftrace_process_locs+0x544/0x8a0
    ftrace_init+0xb4/0x22c
    start_kernel+0x1dc/0x4d4
    start_here_common+0x1c/0x20
  ...
  ftrace failed to modify
  [&lt;c0000000030beddc&gt;] _sub_I_65535_1+0x8/0x3c
   actual:   00:00:00:60
  Initializing ftrace call sites
  ftrace record flags: 0
   (0)
   expected tramp: c00000000008b418
  ------------[ cut here ]------------

The function in question, _sub_I_65535_1 is some sort of trampoline
generated for KASAN, and is in the .text.startup section. That section
is part of INIT_TEXT, meaning is_kernel_inittext() returns true for it.

But the script that determines how many out-of-line ftrace stubs are
needed isn't doesn't consider .text.startup as inittext, leading to
there not being enough space for the init stubs.

Conversely the logic to calculate how many stubs are needed for the text
section isn't filtering out the symbols in .text.startup and so ends up
over counting.

Fix both problems by calculating the total number of stubs first, then
the number that count as inittext, and then subtract the latter from the
former to get the count for the text section.

Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241107111630.31068-1-mpe@ellerman.id.au

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS</title>
<updated>2024-10-31T00:00:55+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e717754f0bb5c5347aac82232691340955735ce1'/>
<id>e717754f0bb5c5347aac82232691340955735ce1</id>
<content type='text'>
Implement support for DYNAMIC_FTRACE_WITH_CALL_OPS similar to the
arm64 implementation.

This works by patching-in a pointer to an associated ftrace_ops
structure before each traceable function. If multiple ftrace_ops are
associated with a call site, then a special ftrace_list_ops is used to
enable iterating over all the registered ftrace_ops. If no ftrace_ops
are associated with a call site, then a special ftrace_nop_ops structure
is used to render the ftrace call as a no-op. ftrace trampoline can then
read the associated ftrace_ops for a call site by loading from an offset
from the LR, and branch directly to the associated function.

The primary advantage with this approach is that we don't have to
iterate over all the registered ftrace_ops for call sites that have a
single ftrace_ops registered. This is the equivalent of implementing
support for dynamic ftrace trampolines, which set up a special ftrace
trampoline for each registered ftrace_ops and have individual call sites
branch into those directly.

A secondary advantage is that this gives us a way to add support for
direct ftrace callers without having to resort to using stubs. The
address of the direct call trampoline can be loaded from the ftrace_ops
structure.

To support this, we reserve a nop before each function on 32-bit
powerpc. For 64-bit powerpc, two nops are reserved before each
out-of-line stub. During ftrace activation, we update this location with
the associated ftrace_ops pointer. Then, on ftrace entry, we load from
this location and call into ftrace_ops-&gt;func().

For 64-bit powerpc, we ensure that the out-of-line stub area is
doubleword aligned so that ftrace_ops address can be updated atomically.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-15-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement support for DYNAMIC_FTRACE_WITH_CALL_OPS similar to the
arm64 implementation.

This works by patching-in a pointer to an associated ftrace_ops
structure before each traceable function. If multiple ftrace_ops are
associated with a call site, then a special ftrace_list_ops is used to
enable iterating over all the registered ftrace_ops. If no ftrace_ops
are associated with a call site, then a special ftrace_nop_ops structure
is used to render the ftrace call as a no-op. ftrace trampoline can then
read the associated ftrace_ops for a call site by loading from an offset
from the LR, and branch directly to the associated function.

The primary advantage with this approach is that we don't have to
iterate over all the registered ftrace_ops for call sites that have a
single ftrace_ops registered. This is the equivalent of implementing
support for dynamic ftrace trampolines, which set up a special ftrace
trampoline for each registered ftrace_ops and have individual call sites
branch into those directly.

A secondary advantage is that this gives us a way to add support for
direct ftrace callers without having to resort to using stubs. The
address of the direct call trampoline can be loaded from the ftrace_ops
structure.

To support this, we reserve a nop before each function on 32-bit
powerpc. For 64-bit powerpc, two nops are reserved before each
out-of-line stub. During ftrace activation, we update this location with
the associated ftrace_ops pointer. Then, on ftrace entry, we load from
this location and call into ftrace_ops-&gt;func().

For 64-bit powerpc, we ensure that the out-of-line stub area is
doubleword aligned so that ftrace_ops address can be updated atomically.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-15-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64/ftrace: Support .text larger than 32MB with out-of-line stubs</title>
<updated>2024-10-31T00:00:55+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cf9bc0efcce2c324314cf7f5138c08f85ef7b5eb'/>
<id>cf9bc0efcce2c324314cf7f5138c08f85ef7b5eb</id>
<content type='text'>
We are restricted to a .text size of ~32MB when using out-of-line
function profile sequence. Allow this to be extended up to the previous
limit of ~64MB by reserving space in the middle of .text.

A new config option CONFIG_PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE is
introduced to specify the number of function stubs that are reserved in
.text. On boot, ftrace utilizes stubs from this area first before using
the stub area at the end of .text.

A ppc64le defconfig has ~44k functions that can be traced. A more
conservative value of 32k functions is chosen as the default value of
PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE so that we do not allot more space
than necessary by default. If building a kernel that only has 32k
trace-able functions, we won't allot any more space at the end of .text
during the pass on vmlinux.o. Otherwise, only the remaining functions
get space for stubs at the end of .text. This default value should help
cover a .text size of ~48MB in total (including space reserved at the
end of .text which can cover up to 32MB), which should be sufficient for
most common builds. For a very small kernel build, this can be set to 0.
Or, this can be bumped up to a larger value to support vmlinux .text
size up to ~64MB.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-14-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We are restricted to a .text size of ~32MB when using out-of-line
function profile sequence. Allow this to be extended up to the previous
limit of ~64MB by reserving space in the middle of .text.

A new config option CONFIG_PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE is
introduced to specify the number of function stubs that are reserved in
.text. On boot, ftrace utilizes stubs from this area first before using
the stub area at the end of .text.

A ppc64le defconfig has ~44k functions that can be traced. A more
conservative value of 32k functions is chosen as the default value of
PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE so that we do not allot more space
than necessary by default. If building a kernel that only has 32k
trace-able functions, we won't allot any more space at the end of .text
during the pass on vmlinux.o. Otherwise, only the remaining functions
get space for stubs at the end of .text. This default value should help
cover a .text size of ~48MB in total (including space reserved at the
end of .text which can cover up to 32MB), which should be sufficient for
most common builds. For a very small kernel build, this can be set to 0.
Or, this can be bumped up to a larger value to support vmlinux .text
size up to ~64MB.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-14-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64/ftrace: Move ftrace sequence out of line</title>
<updated>2024-10-31T00:00:54+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=eec37961a56aa4f3fe1c33ffd48eec7d1bb0c009'/>
<id>eec37961a56aa4f3fe1c33ffd48eec7d1bb0c009</id>
<content type='text'>
Function profile sequence on powerpc includes two instructions at the
beginning of each function:
	mflr	r0
	bl	ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is
patched in when ftrace is enabled.

Given the sequence, we cannot return from ftrace_caller with 'blr' as we
need to keep LR and r0 intact. This results in link stack (return
address predictor) imbalance when ftrace is enabled. To address that, we
would like to use a three instruction sequence:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0

Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
reserve two instruction slots before the function. This results in a
total of five instruction slots to be reserved for ftrace use on each
function that is traced.

Move the function profile sequence out-of-line to minimize its impact.
To do this, we reserve a single nop at function entry using
-fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
the total number of functions that can be traced. This is then used to
generate a .S file reserving the appropriate amount of space for use as
ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and
populated with the proper instruction sequence. A pointer to the
associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module
stub space.

This is restricted to and enabled by default only on 64-bit powerpc,
though there are some changes to accommodate 32-bit powerpc. This is
done so that 32-bit powerpc could choose to opt into this based on
further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop
at function entry:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	nop
	mfocrf	r11,8
	...

When ftrace is enabled, the nop is converted to an unconditional branch
to the stub associated with that function:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	b	ftrace_ool_stub_text_end+0x11b28
	mfocrf	r11,8
	...

The associated stub:
&lt;ftrace_ool_stub_text_end+0x11b28&gt;:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0
	b	kernel_clone+0xc
	...

This change showed an improvement of ~10% in null_syscall benchmark on a
Power 10 system with ftrace enabled.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Function profile sequence on powerpc includes two instructions at the
beginning of each function:
	mflr	r0
	bl	ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is
patched in when ftrace is enabled.

Given the sequence, we cannot return from ftrace_caller with 'blr' as we
need to keep LR and r0 intact. This results in link stack (return
address predictor) imbalance when ftrace is enabled. To address that, we
would like to use a three instruction sequence:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0

Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
reserve two instruction slots before the function. This results in a
total of five instruction slots to be reserved for ftrace use on each
function that is traced.

Move the function profile sequence out-of-line to minimize its impact.
To do this, we reserve a single nop at function entry using
-fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
the total number of functions that can be traced. This is then used to
generate a .S file reserving the appropriate amount of space for use as
ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and
populated with the proper instruction sequence. A pointer to the
associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module
stub space.

This is restricted to and enabled by default only on 64-bit powerpc,
though there are some changes to accommodate 32-bit powerpc. This is
done so that 32-bit powerpc could choose to opt into this based on
further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop
at function entry:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	nop
	mfocrf	r11,8
	...

When ftrace is enabled, the nop is converted to an unconditional branch
to the stub associated with that function:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	b	ftrace_ool_stub_text_end+0x11b28
	mfocrf	r11,8
	...

The associated stub:
&lt;ftrace_ool_stub_text_end+0x11b28&gt;:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0
	b	kernel_clone+0xc
	...

This change showed an improvement of ~10% in null_syscall benchmark on a
Power 10 system with ftrace enabled.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/ftrace: Add a postlink script to validate function tracer</title>
<updated>2024-10-31T00:00:54+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=782f46cbce5328da9380f166bd31cd17a04a7b10'/>
<id>782f46cbce5328da9380f166bd31cd17a04a7b10</id>
<content type='text'>
Function tracer on powerpc can only work with vmlinux having a .text
size of up to ~64MB due to powerpc branch instruction having a limited
relative branch range of 32MB. Today, this is only detected on kernel
boot when ftrace is init'ed. Add a post-link script to check the size of
.text so that we can detect this at build time, and break the build if
necessary.

We add a dependency on !COMPILE_TEST for CONFIG_HAVE_FUNCTION_TRACER so
that allyesconfig and other test builds can continue to work without
enabling ftrace.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-11-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Function tracer on powerpc can only work with vmlinux having a .text
size of up to ~64MB due to powerpc branch instruction having a limited
relative branch range of 32MB. Today, this is only detected on kernel
boot when ftrace is init'ed. Add a post-link script to check the size of
.text so that we can detect this at build time, and break the build if
necessary.

We add a dependency on !COMPILE_TEST for CONFIG_HAVE_FUNCTION_TRACER so
that allyesconfig and other test builds can continue to work without
enabling ftrace.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-11-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/tools: Pass -mabi=elfv2 to gcc-check-mprofile-kernel.sh</title>
<updated>2023-10-20T06:46:33+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2023-05-30T09:38:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d42f55e8ae741540d0d2ebab8ff02f68fb0c802f'/>
<id>d42f55e8ae741540d0d2ebab8ff02f68fb0c802f</id>
<content type='text'>
Toolchains don't always default to the ELFv2 ABI. This is true with at
least the kernel.org toolchains. As such, pass -mabi=elfv2 explicitly to
ensure that we are testing against the correct compiler output.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
[mpe: Tweak comment wording]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230530093821.298590-1-naveen@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Toolchains don't always default to the ELFv2 ABI. This is true with at
least the kernel.org toolchains. As such, pass -mabi=elfv2 explicitly to
ensure that we are testing against the correct compiler output.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
[mpe: Tweak comment wording]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230530093821.298590-1-naveen@kernel.org
</pre>
</div>
</content>
</entry>
</feed>
