<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/kernel/asm-offsets.c, branch v4.8</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc/8xx: Force VIRT_IMMR_BASE to be a positive number</title>
<updated>2016-07-09T08:26:53+00:00</updated>
<author>
<name>Scott Wood</name>
<email>oss@buserror.net</email>
</author>
<published>2016-07-09T08:22:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f595fd8b54809fed13fc30906ef1e90a3fcfbc9'/>
<id>9f595fd8b54809fed13fc30906ef1e90a3fcfbc9</id>
<content type='text'>
The asm-offsets mechanism generates signed numbers, even if the
input value is explicitly unsigned.  This causes a problem with
older binutils (e.g. 2.23), which sign-extend a negative number
when @h is applied.  Thus, this instruction:

	cmpli   cr0, r11, VIRT_IMMR_BASE@h

resulted in this:

Error: operand out of range (0xfffffff0 is not between 0x00000000 and
0x0000ffff)

By casting to a larger type, we can force the output to be expressed
as a positive number.

Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The asm-offsets mechanism generates signed numbers, even if the
input value is explicitly unsigned.  This causes a problem with
older binutils (e.g. 2.23), which sign-extend a negative number
when @h is applied.  Thus, this instruction:

	cmpli   cr0, r11, VIRT_IMMR_BASE@h

resulted in this:

Error: operand out of range (0xfffffff0 is not between 0x00000000 and
0x0000ffff)

By casting to a larger type, we can force the output to be expressed
as a positive number.

Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/8xx: Fix vaddr for IMMR early remap</title>
<updated>2016-07-09T07:02:48+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2016-05-17T07:02:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f86ef74ed9193c52411277eeac2eec69af553392'/>
<id>f86ef74ed9193c52411277eeac2eec69af553392</id>
<content type='text'>
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xfffff000  : fixmap
  * 0xfde00000..0xfe000000  : consistent mem
  * 0xfddf6000..0xfde00000  : early ioremap
  * 0xc9000000..0xfddf6000  : vmalloc &amp; ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, IMMR is mapped 1:1 at startup

Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as IMMR is set to 0xff000000
but for instance on EP88xC board, IMMR is at 0xfa200000 which
overlaps with VM ioremap area

This patch fixes the virtual address for remapping IMMR with the fixmap
regardless of the value of IMMR.

The size of IMMR area is 256kbytes (CPM at offset 0, security engine
at offset 128k) so a 512k page is enough

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xfffff000  : fixmap
  * 0xfde00000..0xfe000000  : consistent mem
  * 0xfddf6000..0xfde00000  : early ioremap
  * 0xc9000000..0xfddf6000  : vmalloc &amp; ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, IMMR is mapped 1:1 at startup

Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as IMMR is set to 0xff000000
but for instance on EP88xC board, IMMR is at 0xfa200000 which
overlaps with VM ioremap area

This patch fixes the virtual address for remapping IMMR with the fixmap
regardless of the value of IMMR.

The size of IMMR area is 256kbytes (CPM at offset 0, security engine
at offset 128k) so a 512k page is enough

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc32: provide VIRT_CPU_ACCOUNTING</title>
<updated>2016-07-09T06:43:50+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2016-05-17T06:33:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c223c90386bc2306510e0ceacd768a0123ff2a2f'/>
<id>c223c90386bc2306510e0ceacd768a0123ff2a2f</id>
<content type='text'>
This patch provides VIRT_CPU_ACCOUTING to PPC32 architecture.
PPC32 doesn't have the PACA structure, so we use the task_info
structure to store the accounting data.

In order to reuse on PPC32 the PPC64 functions, all u64 data has
been replaced by 'unsigned long' so that it is u32 on PPC32 and
u64 on PPC64

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch provides VIRT_CPU_ACCOUTING to PPC32 architecture.
PPC32 doesn't have the PACA structure, so we use the task_info
structure to store the accounting data.

In order to reuse on PPC32 the PPC64 functions, all u64 data has
been replaced by 'unsigned long' so that it is u32 on PPC32 and
u64 on PPC64

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/asm: Remove unused symbols in asm-offsets.c</title>
<updated>2016-06-16T05:11:25+00:00</updated>
<author>
<name>Rashmica Gupta</name>
<email>rashmicy@gmail.com</email>
</author>
<published>2016-06-01T22:56:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=aac6a91fea93e6bdd7ac20365d7ecc9187ca61da'/>
<id>aac6a91fea93e6bdd7ac20365d7ecc9187ca61da</id>
<content type='text'>
THREAD_DSCR:
  Added in efcac6589a27 "powerpc: Per process DSCR + some fixes (try#4)"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_DSCR_INHERIT:
  Added in 714332858bfd "powerpc: Restore correct DSCR in context switch"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_TAR:
  Added in 2468dcf641e4 "powerpc: Add support for context switching the TAR register"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_BESCR, THREAD_EBBHR and THREAD_EBBRR:
  Added in 9353374b8e15 "powerpc: Context switch the new EBB SPRs"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_SIAR, THREAD_SDAR, THREAD_SIER, THREAD_MMCR0, and THREAD_MMCR2:
  Added in 59affcd3e460 "powerpc: Context switch more PMU related SPRs"
  Last usage removed in b11ae95100f7 "powerpc: Partial revert of "Context switch more PMU related SPRs""

PACA_LOCK_TOKEN:
  Added in 9e368f291560 "KVM: PPC: book3s_hv: Add support for PPC970-family processors"
  Last usage removed in c17b98cf6028 "KVM: PPC: Book3S HV: Remove code for PPC970 processors"

HCALL_STAT_SIZE, HCALL_STAT_CALLS, HCALL_STAT_TB and HCALL_STAT_PURR:
  Added in 57852a853b0d "[POWERPC] powerpc: Instrument Hypervisor Calls"
  Last usage removed in c8cd093a6e9f "powerpc: tracing: Add hypervisor call tracepoints"

VCPU_EPLC:
  Added in d30f6e480055 "KVM: PPC: booke: category E.HV (GS-mode) support"
  Never used.

CPU_DOWN_FLUSH:
  Added in e7affb1dba0e "powerpc/cache: add cache flush operation for various e500"
  Never used.

CFG_STAMP_XSEC:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 0e469db8f70c "powerpc: Rework VDSO gettimeofday to prevent time going backwards"

KVM_LPCR:
  Added in aa04b4cc5be6 "KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests"
  Last usage removed in a0144e2a6b0b "KVM: PPC: Book3S HV: Store LPCR value for each virtual core"

GPR15, GPR16, GPR17, GPR18, GPR19, GPR20, GPR21, GPR22, GPR23, GPR24,
GPR25, GPR26, GPR27, GPR28, GPR29, GPR30 and GPR31:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

VCPU_SHADOW_FSCR:
  Added in 616dff860282 "KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR"
  Never used.

VCPU_SHADOW_SRR1:
  Added in a2d56020d1d9 "KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu"
  Never used.

KVM_SPLIT_SIZE:
  Added in b4deba5c41e9 "KVM: PPC: Book3S HV: Implement dynamicmicro-threading on POWER8"
  Never used.

VCPU_VCPUID:
  Added in de56a948b918 "KVM: PPC: Add support for Book3S processors in hypervisor mode"
  Last usage removed 1b400ba0cd24 "KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations"

_MQ:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

AUDITCONTEXT:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 401d1f029beb "[PATCH] syscall entry/exit revamp"

CLONE_VM:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

CLONE_UNTRACED:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

Signed-off-by: Rashmica Gupta &lt;rashmicy@gmail.com&gt;
[mpe: Munge change log]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
THREAD_DSCR:
  Added in efcac6589a27 "powerpc: Per process DSCR + some fixes (try#4)"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_DSCR_INHERIT:
  Added in 714332858bfd "powerpc: Restore correct DSCR in context switch"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_TAR:
  Added in 2468dcf641e4 "powerpc: Add support for context switching the TAR register"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_BESCR, THREAD_EBBHR and THREAD_EBBRR:
  Added in 9353374b8e15 "powerpc: Context switch the new EBB SPRs"
  Last usage removed in 152d523e6307 "powerpc: Create context switch helpers save_sprs() and restore_sprs()"

THREAD_SIAR, THREAD_SDAR, THREAD_SIER, THREAD_MMCR0, and THREAD_MMCR2:
  Added in 59affcd3e460 "powerpc: Context switch more PMU related SPRs"
  Last usage removed in b11ae95100f7 "powerpc: Partial revert of "Context switch more PMU related SPRs""

PACA_LOCK_TOKEN:
  Added in 9e368f291560 "KVM: PPC: book3s_hv: Add support for PPC970-family processors"
  Last usage removed in c17b98cf6028 "KVM: PPC: Book3S HV: Remove code for PPC970 processors"

HCALL_STAT_SIZE, HCALL_STAT_CALLS, HCALL_STAT_TB and HCALL_STAT_PURR:
  Added in 57852a853b0d "[POWERPC] powerpc: Instrument Hypervisor Calls"
  Last usage removed in c8cd093a6e9f "powerpc: tracing: Add hypervisor call tracepoints"

VCPU_EPLC:
  Added in d30f6e480055 "KVM: PPC: booke: category E.HV (GS-mode) support"
  Never used.

CPU_DOWN_FLUSH:
  Added in e7affb1dba0e "powerpc/cache: add cache flush operation for various e500"
  Never used.

CFG_STAMP_XSEC:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 0e469db8f70c "powerpc: Rework VDSO gettimeofday to prevent time going backwards"

KVM_LPCR:
  Added in aa04b4cc5be6 "KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests"
  Last usage removed in a0144e2a6b0b "KVM: PPC: Book3S HV: Store LPCR value for each virtual core"

GPR15, GPR16, GPR17, GPR18, GPR19, GPR20, GPR21, GPR22, GPR23, GPR24,
GPR25, GPR26, GPR27, GPR28, GPR29, GPR30 and GPR31:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

VCPU_SHADOW_FSCR:
  Added in 616dff860282 "KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR"
  Never used.

VCPU_SHADOW_SRR1:
  Added in a2d56020d1d9 "KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu"
  Never used.

KVM_SPLIT_SIZE:
  Added in b4deba5c41e9 "KVM: PPC: Book3S HV: Implement dynamicmicro-threading on POWER8"
  Never used.

VCPU_VCPUID:
  Added in de56a948b918 "KVM: PPC: Add support for Book3S processors in hypervisor mode"
  Last usage removed 1b400ba0cd24 "KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations"

_MQ:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Never used.

AUDITCONTEXT:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Last usage removed in 401d1f029beb "[PATCH] syscall entry/exit revamp"

CLONE_VM:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

CLONE_UNTRACED:
  Added in 14cf11af6cf6 "powerpc: Merge enough to start building in arch/powerpc."
  Currently unused.

Signed-off-by: Rashmica Gupta &lt;rashmicy@gmail.com&gt;
[mpe: Munge change log]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/mm: Make page table size a variable</title>
<updated>2016-05-01T08:32:48+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2016-04-29T13:25:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd1842a2a448bb66d74aa02a550df6be8c25f20b'/>
<id>dd1842a2a448bb66d74aa02a550df6be8c25f20b</id>
<content type='text'>
Radix and hash MMU models support different page table sizes. Make
the #defines a variable so that existing code can work with variable
sizes.

Slice related code is only used by hash, so use hash constants there. We
will replicate some of the boundary conditions with resepct to TASK_SIZE
using radix values too. Right now we do boundary condition check using
hash constants.

Swapper pgdir size is initialized in asm code. We select the max pgd
size to keep it simple. For now we select hash pgdir. When adding radix
we will switch that to radix pgdir which is 64K.

BUILD_BUG_ON check which is removed is already done in hugepage_init()
using MAYBE_BUILD_BUG_ON().

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Radix and hash MMU models support different page table sizes. Make
the #defines a variable so that existing code can work with variable
sizes.

Slice related code is only used by hash, so use hash constants there. We
will replicate some of the boundary conditions with resepct to TASK_SIZE
using radix values too. Right now we do boundary condition check using
hash constants.

Swapper pgdir size is initialized in asm code. We select the max pgd
size to keep it simple. For now we select hash pgdir. When adding radix
we will switch that to radix pgdir which is 64K.

BUILD_BUG_ON check which is removed is already done in hugepage_init()
using MAYBE_BUILD_BUG_ON().

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'topic/livepatch' into next</title>
<updated>2016-04-18T10:45:32+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2016-04-18T10:45:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8404410b296095c78ed63f163ac5d417ff0647dd'/>
<id>8404410b296095c78ed63f163ac5d417ff0647dd</id>
<content type='text'>
Merge the support for live patching on ppc64le using mprofile-kernel.
This branch has also been merged into the livepatching tree for v4.7.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge the support for live patching on ppc64le using mprofile-kernel.
This branch has also been merged into the livepatching tree for v4.7.
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/livepatch: Add live patching support on ppc64le</title>
<updated>2016-04-14T05:48:06+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2016-03-24T11:04:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=85baa095497f3e590df9f6c8932121f123efca5c'/>
<id>85baa095497f3e590df9f6c8932121f123efca5c</id>
<content type='text'>
Add the kconfig logic &amp; assembly support for handling live patched
functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
depends on the new -mprofile-kernel ftrace ABI, which is only supported
currently on ppc64le.

Live patching is handled by a special ftrace handler. This means it runs
from ftrace_caller(). The live patch handler modifies the NIP so as to
redirect the return from ftrace_caller() to the new patched function.

However there is one particularly tricky case we need to handle.

If a function A calls another function B, and it is known at link time
that they share the same TOC, then A will not save or restore its TOC,
and will call the local entry point of B.

When we live patch B, we replace it with a new function C, which may
not have the same TOC as A. At live patch time it's too late to modify A
to do the TOC save/restore, so the live patching code must interpose
itself between A and C, and do the TOC save/restore that A omitted.

An additionaly complication is that the livepatch code can not create a
stack frame in order to save the TOC. That is because if C takes &gt; 8
arguments, or is varargs, A will have written the arguments for C in
A's stack frame.

To solve this, we introduce a "livepatch stack" which grows upward from
the base of the regular stack, and is used to store the TOC &amp; LR when
calling a live patched function.

When the patched function returns, we retrieve the real LR &amp; TOC from
the livepatch stack, restore them, and pop the livepatch "stack frame".

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Reviewed-by: Torsten Duwe &lt;duwe@suse.de&gt;
Reviewed-by: Balbir Singh &lt;bsingharora@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the kconfig logic &amp; assembly support for handling live patched
functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
depends on the new -mprofile-kernel ftrace ABI, which is only supported
currently on ppc64le.

Live patching is handled by a special ftrace handler. This means it runs
from ftrace_caller(). The live patch handler modifies the NIP so as to
redirect the return from ftrace_caller() to the new patched function.

However there is one particularly tricky case we need to handle.

If a function A calls another function B, and it is known at link time
that they share the same TOC, then A will not save or restore its TOC,
and will call the local entry point of B.

When we live patch B, we replace it with a new function C, which may
not have the same TOC as A. At live patch time it's too late to modify A
to do the TOC save/restore, so the live patching code must interpose
itself between A and C, and do the TOC save/restore that A omitted.

An additionaly complication is that the livepatch code can not create a
stack frame in order to save the TOC. That is because if C takes &gt; 8
arguments, or is varargs, A will have written the arguments for C in
A's stack frame.

To solve this, we introduce a "livepatch stack" which grows upward from
the base of the regular stack, and is used to store the TOC &amp; LR when
calling a live patched function.

When the patched function returns, we retrieve the real LR &amp; TOC from
the livepatch stack, restore them, and pop the livepatch "stack frame".

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Reviewed-by: Torsten Duwe &lt;duwe@suse.de&gt;
Reviewed-by: Balbir Singh &lt;bsingharora@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/cache: add cache flush operation for various e500</title>
<updated>2016-03-05T05:44:51+00:00</updated>
<author>
<name>chenhui zhao</name>
<email>chenhui.zhao@freescale.com</email>
</author>
<published>2015-11-20T09:13:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e7affb1dba0e9068aeb3978e858f39753e0dc20a'/>
<id>e7affb1dba0e9068aeb3978e858f39753e0dc20a</id>
<content type='text'>
Various e500 core have different cache architecture, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.

Signed-off-by: Chenhui Zhao &lt;chenhui.zhao@freescale.com&gt;
Signed-off-by: Tang Yuantian &lt;Yuantian.Tang@feescale.com&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Various e500 core have different cache architecture, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.

Signed-off-by: Chenhui Zhao &lt;chenhui.zhao@freescale.com&gt;
Signed-off-by: Tang Yuantian &lt;Yuantian.Tang@feescale.com&gt;
Signed-off-by: Scott Wood &lt;oss@buserror.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Restore FPU/VEC/VSX if previously used</title>
<updated>2016-03-02T12:34:48+00:00</updated>
<author>
<name>Cyril Bur</name>
<email>cyrilbur@gmail.com</email>
</author>
<published>2016-02-29T06:53:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=70fe3d980f5f14d8125869125ba9a0ea95e09c6b'/>
<id>70fe3d980f5f14d8125869125ba9a0ea95e09c6b</id>
<content type='text'>
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code,
new and modernised workloads make use of floating point and vector
facilities, even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8bit counter is used to detect if the registers have been used in the
past and the registers are always loaded until the value wraps to back
to zero.

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.

Signed-off-by: Cyril Bur &lt;cyrilbur@gmail.com&gt;
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;

fixup
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code,
new and modernised workloads make use of floating point and vector
facilities, even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8bit counter is used to detect if the registers have been used in the
past and the registers are always loaded until the value wraps to back
to zero.

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.

Signed-off-by: Cyril Bur &lt;cyrilbur@gmail.com&gt;
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;

fixup
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Copy only required pieces of the mm_context_t to the paca</title>
<updated>2015-12-27T08:12:14+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2015-12-10T22:34:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2fc251a8dda56b71ec491bee4c6897e3e12c0739'/>
<id>2fc251a8dda56b71ec491bee4c6897e3e12c0739</id>
<content type='text'>
Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of space paca and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
[mpe: Rename paca fields to be mm_ctx_foo rather than context_foo]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of space paca and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
[mpe: Rename paca fields to be mm_ctx_foo rather than context_foo]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
