<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/powerpc/lib/string.S, branch v3.9.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>powerpc: 64bit optimised __clear_user</title>
<updated>2012-07-03T04:14:41+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2012-05-27T19:54:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=17968fbbd19f1bb281ee4eb2548764ac5664c4ec'/>
<id>17968fbbd19f1bb281ee4eb2548764ac5664c4ec</id>
<content type='text'>
I noticed __clear_user high up in a profile of one of my RAID stress
tests. The testcase was doing a dd from /dev/zero which ends up
calling __clear_user.

__clear_user is basically a loop with a single 4 byte store which
is horribly slow. We can do much better by aligning the desination
and doing 32 bytes of 8 byte stores in a loop.

The following testcase was used to verify the patch:

http://ozlabs.org/~anton/junkcode/stress_clear_user.c

To show the improvement in performance I ran a dd from /dev/zero
to /dev/null on a POWER7 box:

Before:

# dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s

After:

# time dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s

Over 5x faster.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I noticed __clear_user high up in a profile of one of my RAID stress
tests. The testcase was doing a dd from /dev/zero which ends up
calling __clear_user.

__clear_user is basically a loop with a single 4 byte store which
is horribly slow. We can do much better by aligning the desination
and doing 32 bytes of 8 byte stores in a loop.

The following testcase was used to verify the patch:

http://ozlabs.org/~anton/junkcode/stress_clear_user.c

To show the improvement in performance I ran a dd from /dev/zero
to /dev/null on a POWER7 box:

Before:

# dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s

After:

# time dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s

Over 5x faster.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use the new generic strncpy_from_user() and strnlen_user()</title>
<updated>2012-05-28T04:00:07+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2012-05-28T03:03:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1629372caaaf7ef744d3b983be56b99468a68ff8'/>
<id>1629372caaaf7ef744d3b983be56b99468a68ff8</id>
<content type='text'>
This is much the same as for SPARC except that we can do the find_zero()
function more efficiently using the count-leading-zeroes instructions.
Tested on 32-bit and 64-bit PowerPC.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is much the same as for SPARC except that we can do the find_zero()
function more efficiently using the count-leading-zeroes instructions.
Tested on 32-bit and 64-bit PowerPC.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix string library functions</title>
<updated>2010-05-21T07:31:08+00:00</updated>
<author>
<name>Andreas Schwab</name>
<email>schwab@linux-m68k.org</email>
</author>
<published>2010-05-18T08:15:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca5d0674c37840366f04a7bbfbf78e7b5f3ce0a4'/>
<id>ca5d0674c37840366f04a7bbfbf78e7b5f3ce0a4</id>
<content type='text'>
The powerpc strncmp implementation does not correctly handle a zero
length, despite the claim in 0119536cd314ef95553604208c25bc35581f7f0a
(Add hand-coded assembly strcmp).

Additionally, all the length arguments are size_t, not int, so use
PPC_LCMPI and eq instead of cmpwi and le throughout.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The powerpc strncmp implementation does not correctly handle a zero
length, despite the claim in 0119536cd314ef95553604208c25bc35581f7f0a
(Add hand-coded assembly strcmp).

Additionally, all the length arguments are size_t, not int, so use
PPC_LCMPI and eq instead of cmpwi and le throughout.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix handling of strncmp with zero len</title>
<updated>2010-04-07T08:00:39+00:00</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2010-03-17T10:55:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=637a99022fb119b90fb281715d13172f0394fc12'/>
<id>637a99022fb119b90fb281715d13172f0394fc12</id>
<content type='text'>
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len &lt;= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
CC: &lt;stable@kernel.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len &lt;= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
CC: &lt;stable@kernel.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S</title>
<updated>2008-07-22T00:39:35+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2008-07-17T07:17:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=76bfdcf71c421154b69470e1ce346c8cf4ee1230'/>
<id>76bfdcf71c421154b69470e1ce346c8cf4ee1230</id>
<content type='text'>
Replace ifdef clutter with the PPC_LONG and PPC_LONG_ALIGN macros
for readability.

No change to the generated code.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace ifdef clutter with the PPC_LONG and PPC_LONG_ALIGN macros
for readability.

No change to the generated code.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[POWERPC] Add hand-coded assembly strcmp</title>
<updated>2008-04-07T00:03:03+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2008-02-29T16:04:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0119536cd314ef95553604208c25bc35581f7f0a'/>
<id>0119536cd314ef95553604208c25bc35581f7f0a</id>
<content type='text'>
We have an assembly version of strncmp for the bootwrapper, but not
for the kernel, so we end up using the C version in the kernel.  This
takes the strncmp code from the bootup and copies it to the kernel
proper, adding two instructions so it copes correctly with len==0.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have an assembly version of strncmp for the bootwrapper, but not
for the kernel, so we end up using the C version in the kernel.  This
takes the strncmp code from the bootup and copies it to the kernel
proper, adding two instructions so it copes correctly with len==0.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove obsolete #include &lt;linux/config.h&gt;</title>
<updated>2006-06-30T17:25:36+00:00</updated>
<author>
<name>Jörn Engel</name>
<email>joern@wohnheim.fh-wedel.de</email>
</author>
<published>2006-06-30T17:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6ab3d5624e172c553004ecc862bfeac16d9d68b7'/>
<id>6ab3d5624e172c553004ecc862bfeac16d9d68b7</id>
<content type='text'>
Signed-off-by: Jörn Engel &lt;joern@wohnheim.fh-wedel.de&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Jörn Engel &lt;joern@wohnheim.fh-wedel.de&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use reg.h instead of processor.h when we just want reg names</title>
<updated>2005-10-10T12:20:10+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2005-10-10T12:20:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b3b8dc6c07cecc1f8d52d03f677206bdf9f794c9'/>
<id>b3b8dc6c07cecc1f8d52d03f677206bdf9f794c9</id>
<content type='text'>
Now that the register names and bit definitions are all in reg.h,
use that instead of processor.h in assembly code in a few places.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the register names and bit definitions are all in reg.h,
use that instead of processor.h in assembly code in a few places.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Merge enough to start building in arch/powerpc.</title>
<updated>2005-09-26T06:04:21+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2005-09-26T06:04:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=14cf11af6cf608eb8c23e989ddb17a715ddce109'/>
<id>14cf11af6cf608eb8c23e989ddb17a715ddce109</id>
<content type='text'>
This creates the directory structure under arch/powerpc and a bunch
of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
to build a 32-bit powermac kernel with ARCH=powerpc.

For now we are getting some unmerged files from arch/ppc/kernel and
arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
to files in those directories and files outside arch/powerpc.

The boot directory is still not merged.  That's going to be interesting.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This creates the directory structure under arch/powerpc and a bunch
of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
to build a 32-bit powermac kernel with ARCH=powerpc.

For now we are getting some unmerged files from arch/ppc/kernel and
arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
to files in those directories and files outside arch/powerpc.

The boot directory is still not merged.  That's going to be interesting.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
