<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/lib/Makefile, branch tegra</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc: Add support for popcnt instructions</title>
<updated>2010-11-29T04:48:17+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-08-12T16:28:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=64ff31287693c1f325cb9cb049569c1611438ef1'/>
<id>64ff31287693c1f325cb9cb049569c1611438ef1</id>
<content type='text'>
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/Makefiles: Change to new flag variables</title>
<updated>2010-10-13T05:19:22+00:00</updated>
<author>
<name>matt mooney</name>
<email>mfm@muteddisk.com</email>
</author>
<published>2010-09-22T20:51:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4108d9ba9091c55cfb968d42dd7dcae9a098b876'/>
<id>4108d9ba9091c55cfb968d42dd7dcae9a098b876</id>
<content type='text'>
Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y.

Signed-off-by: matt mooney &lt;mfm@muteddisk.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y.

Signed-off-by: matt mooney &lt;mfm@muteddisk.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Optimise 64bit csum_partial_copy_generic and add csum_and_copy_from_user</title>
<updated>2010-09-02T04:07:30+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-08-02T20:09:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fdd374b62ca4df144c0138359dcffa83df7a0ea8'/>
<id>fdd374b62ca4df144c0138359dcffa83df7a0ea8</id>
<content type='text'>
We use the same core loop as the new csum_partial, adding in the
stores and exception handling code. To keep things simple we do all the
exception fixup in csum_and_copy_from_user. This wrapper function is
modelled on the generic checksum code and is careful to always calculate
a complete checksum even if we only copied part of the data to userspace.

To test this I forced checksumming on over loopback and ran socklib (a
simple TCP benchmark). On a POWER6 575 throughput improved by 19% with
this patch. If I forced both the sender and receiver onto the same cpu
(with the hope of shifting the benchmark from being cache bandwidth limited
to cpu limited), adding this patch improved performance by 55%

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We use the same core loop as the new csum_partial, adding in the
stores and exception handling code. To keep things simple we do all the
exception fixup in csum_and_copy_from_user. This wrapper function is
modelled on the generic checksum code and is careful to always calculate
a complete checksum even if we only copied part of the data to userspace.

To test this I forced checksumming on over loopback and ran socklib (a
simple TCP benchmark). On a POWER6 575 throughput improved by 19% with
this patch. If I forced both the sender and receiver onto the same cpu
(with the hope of shifting the benchmark from being cache bandwidth limited
to cpu limited), adding this patch improved performance by 55%

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge commit 'paulus-perf/master' into next</title>
<updated>2010-07-09T01:25:48+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2010-07-09T01:25:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5f07aa7524e98d6f68f2bec54f155ef6012e2c9a'/>
<id>5f07aa7524e98d6f68f2bec54f155ef6012e2c9a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix module building for gcc 4.5 and 64 bit</title>
<updated>2010-07-08T08:11:38+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2010-06-29T20:08:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7fca5dc8aa7aaa6a1023bd3587901b88ebfe8154'/>
<id>7fca5dc8aa7aaa6a1023bd3587901b88ebfe8154</id>
<content type='text'>
Gcc 4.5 is now generating out of line register save and restore
in the function prefix and postfix when we use -Os.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Gcc 4.5 is now generating out of line register save and restore
in the function prefix and postfix when we use -Os.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors</title>
<updated>2010-06-22T09:40:50+00:00</updated>
<author>
<name>K.Prasad</name>
<email>prasad@linux.vnet.ibm.com</email>
</author>
<published>2010-06-15T06:05:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5aae8a53708025d4e718f0d2e7c2f766779ddc71'/>
<id>5aae8a53708025d4e718f0d2e7c2f766779ddc71</id>
<content type='text'>
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras &lt;paulus@samba.org&gt; to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras &lt;paulus@samba.org&gt; to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Emulate most Book I instructions in emulate_step()</title>
<updated>2010-06-22T09:40:29+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2010-06-15T04:48:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0016a4cf5582415849fafbf9f019dd9530824789'/>
<id>0016a4cf5582415849fafbf9f019dd9530824789</id>
<content type='text'>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Add configurable -Werror for arch/powerpc</title>
<updated>2009-06-16T04:15:45+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2009-06-09T20:48:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba55bd74360ea4b8b95e73ed79474d37ff482b36'/>
<id>ba55bd74360ea4b8b95e73ed79474d37ff482b36</id>
<content type='text'>
Add the option to build the code under arch/powerpc with -Werror.

The intention is to make it harder for people to inadvertantly introduce
warnings in the arch/powerpc code. It needs to be configurable so that
if a warning is introduced, people can easily work around it while it's
being fixed.

The option is a negative, ie. don't enable -Werror, so that it will be
turned on for allyes and allmodconfig builds.

The default is n, in the hope that developers will build with -Werror,
that will probably lead to some build breaks, I am prepared to be flamed.

It's not enabled for math-emu, which is a steaming pile of warnings.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the option to build the code under arch/powerpc with -Werror.

The intention is to make it harder for people to inadvertantly introduce
warnings in the arch/powerpc code. It needs to be configurable so that
if a warning is introduced, people can easily work around it while it's
being fixed.

The option is a negative, ie. don't enable -Werror, so that it will be
turned on for allyes and allmodconfig builds.

The default is n, in the hope that developers will build with -Werror,
that will probably lead to some build breaks, I am prepared to be flamed.

It's not enabled for math-emu, which is a steaming pile of warnings.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Move dma-noncoherent.c from arch/powerpc/lib to arch/powerpc/mm</title>
<updated>2009-05-27T06:32:05+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2009-05-27T03:36:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b16e7766d6436835f473ba823ad04fbdfe5e9cbd'/>
<id>b16e7766d6436835f473ba823ad04fbdfe5e9cbd</id>
<content type='text'>
(pre-requisite to make the next patches more palatable)

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
(pre-requisite to make the next patches more palatable)

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/ppc32: static ftrace fixes for PPC32</title>
<updated>2008-11-28T13:08:07+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>srotedt@redhat.com</email>
</author>
<published>2008-11-26T20:54:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f1eecf0e4f0796911cc076f38fcf05fea0b353d5'/>
<id>f1eecf0e4f0796911cc076f38fcf05fea0b353d5</id>
<content type='text'>
Impact: fix for PowerPC 32 code

There were some early init code that was not safe for static
ftrace to boot on my PowerBook. This code must only use relative
addressing, and static mcount performs a compare of the
ftrace_trace_function pointer, and gets that with an absolute address.
In the early init boot up code, this will cause a fault.

This patch removes tracing from the files containing the offending
functions.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Impact: fix for PowerPC 32 code

There were some early init code that was not safe for static
ftrace to boot on my PowerBook. This code must only use relative
addressing, and static mcount performs a compare of the
ftrace_trace_function pointer, and gets that with an absolute address.
In the early init boot up code, this will cause a fault.

This patch removes tracing from the files containing the offending
functions.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
