<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/sparc/lib/Makefile, branch v3.7.6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy.</title>
<updated>2012-10-05T20:45:26+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-10-05T20:45:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f825962efdee5c2b22ac1f6cda50056336c06e1'/>
<id>9f825962efdee5c2b22ac1f6cda50056336c06e1</id>
<content type='text'>
This adds optimized memset/bzero/page-clear routines for Niagara-4.

We basically can do what powerpc has been able to do for a decade (via
the "dcbz" instruction), which is use cache line clearing stores for
bzero and memsets with a 'c' argument of zero.

As long as we make the cache initializing store to each 32-byte
subblock of the L2 cache line, it works.

As with other Niagara-4 optimized routines, the key is to make sure to
avoid any usage of the %asi register, as reads and writes to it cost
at least 50 cycles.

For the user clear cases, we don't use these new routines, we use the
Niagara-1 variants instead.  Those have to use %asi in an unavoidable
way.

A Niagara-4 8K page clear costs just under 600 cycles.

Add definitions of the MRU variants of the cache initializing store
ASIs.  By default, cache initializing stores install the line as Least
Recently Used.  If we know we're going to use the data immediately
(which is true for page copies and clears) we can use the Most
Recently Used variant, to decrease the likelyhood of the lines being
evicted before they get used.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds optimized memset/bzero/page-clear routines for Niagara-4.

We basically can do what powerpc has been able to do for a decade (via
the "dcbz" instruction), which is use cache line clearing stores for
bzero and memsets with a 'c' argument of zero.

As long as we make the cache initializing store to each 32-byte
subblock of the L2 cache line, it works.

As with other Niagara-4 optimized routines, the key is to make sure to
avoid any usage of the %asi register, as reads and writes to it cost
at least 50 cycles.

For the user clear cases, we don't use these new routines, we use the
Niagara-1 variants instead.  Those have to use %asi in an unavoidable
way.

A Niagara-4 8K page clear costs just under 600 cycles.

Add definitions of the MRU variants of the cache initializing store
ASIs.  By default, cache initializing stores install the line as Least
Recently Used.  If we know we're going to use the data immediately
(which is true for page copies and clears) we can use the Most
Recently Used variant, to decrease the likelyhood of the lines being
evicted before they get used.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc64: Add SPARC-T4 optimized memcpy.</title>
<updated>2012-09-27T07:35:11+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-09-27T04:11:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ae2c6ca64118b934ef85f66adb03d5bbfdd57201'/>
<id>ae2c6ca64118b934ef85f66adb03d5bbfdd57201</id>
<content type='text'>
		Before		After
		--------------	--------------
bw_tcp:         1288.53 MB/sec	1637.77 MB/sec
bw_pipe:        1517.18 MB/sec	2107.61 MB/sec
bw_unix:        1838.38 MB/sec	2640.91 MB/sec

make -s -j128
allmodconfig	5min 49sec	5min 31sec

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
		Before		After
		--------------	--------------
bw_tcp:         1288.53 MB/sec	1637.77 MB/sec
bw_pipe:        1517.18 MB/sec	2107.61 MB/sec
bw_unix:        1838.38 MB/sec	2640.91 MB/sec

make -s -j128
allmodconfig	5min 49sec	5min 31sec

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc: use the new generic strnlen_user() function</title>
<updated>2012-05-26T18:33:54+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-05-26T18:14:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2c66f623631709aa5f2e4c14c7e089682e7394a3'/>
<id>2c66f623631709aa5f2e4c14c7e089682e7394a3</id>
<content type='text'>
This throws away the sparc-specific functions in favor of the generic
optimized version.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This throws away the sparc-specific functions in favor of the generic
optimized version.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc: Add full proper error handling to strncpy_from_user().</title>
<updated>2012-05-23T06:32:27+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-05-23T00:53:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ff06dffbc8abfc60d6a0332f058f1d1bb01abb31'/>
<id>ff06dffbc8abfc60d6a0332f058f1d1bb01abb31</id>
<content type='text'>
Linus removed the end-of-address-space hackery from
fs/namei.c:do_getname() so we really have to validate these edge
conditions and cannot cheat any more (as x86 used to as well).

Move to a common C implementation like x86 did.  And if both
src and dst are sufficiently aligned we'll do word at a time
copies and checks as well.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Linus removed the end-of-address-space hackery from
fs/namei.c:do_getname() so we really have to validate these edge
conditions and cannot cheat any more (as x86 used to as well).

Move to a common C implementation like x86 did.  And if both
src and dst are sufficiently aligned we'll do word at a time
copies and checks as well.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc32: Add ucmpdi2.o to obj-y instead of lib-y.</title>
<updated>2012-05-19T22:27:01+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-05-19T22:27:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=74c7b28953d4eaa6a479c187aeafcfc0280da5e8'/>
<id>74c7b28953d4eaa6a479c187aeafcfc0280da5e8</id>
<content type='text'>
Otherwise if no references exist in the static kernel image,
we won't export the symbol properly to modules.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise if no references exist in the static kernel image,
we won't export the symbol properly to modules.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc32: add ucmpdi2</title>
<updated>2012-05-19T22:23:57+00:00</updated>
<author>
<name>Sam Ravnborg</name>
<email>sam@ravnborg.org</email>
</author>
<published>2012-05-19T09:54:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=de36e66d5fa52bc6e2dacd95c701a1762b5308a7'/>
<id>de36e66d5fa52bc6e2dacd95c701a1762b5308a7</id>
<content type='text'>
Based on copy from microblaze add ucmpdi2 implementation.
This fixes build of niu driver which failed with:

drivers/built-in.o: In function `niu_get_nfc':
niu.c:(.text+0x91494): undefined reference to `__ucmpdi2'

This driver will never be used on a sparc32 system,
but patch added to fix build breakage with all*config builds.

Signed-off-by: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Based on copy from microblaze add ucmpdi2 implementation.
This fixes build of niu driver which failed with:

drivers/built-in.o: In function `niu_get_nfc':
niu.c:(.text+0x91494): undefined reference to `__ucmpdi2'

This driver will never be used on a sparc32 system,
but patch added to fix build breakage with all*config builds.

Signed-off-by: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc32: Kill off software 32-bit multiply/divide routines.</title>
<updated>2012-05-15T18:23:47+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-05-15T18:23:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1b35a57b1c1781f0fc8fc554f732b3a5408c5244'/>
<id>1b35a57b1c1781f0fc8fc554f732b3a5408c5244</id>
<content type='text'>
For the explicit calls to .udiv/.umul in assembler, I made a
mechanical (read as: safe) transformation.  I didn't attempt
to make any simplifications.

In particular, __ndelay and __udelay can be simplified significantly.
Some of the %y reads are unnecessary and these routines have no need
any longer for allocating a register window, they can be leaf
functions.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For the explicit calls to .udiv/.umul in assembler, I made a
mechanical (read as: safe) transformation.  I didn't attempt
to make any simplifications.

In particular, __ndelay and __udelay can be simplified significantly.
Some of the %y reads are unnecessary and these routines have no need
any longer for allocating a register window, they can be leaf
functions.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc32: Kill btfixup for xchg()'s 'swap' instruction.</title>
<updated>2012-05-13T20:07:16+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-05-13T20:07:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=73c1377da9fb732bf8ff1b262a92507e8736b1bf'/>
<id>73c1377da9fb732bf8ff1b262a92507e8736b1bf</id>
<content type='text'>
We always have this instruction available, so no need to use
btfixup for it any more.

This also eradicates the whole of atomic_32.S and thus the
__atomic_begin and __atomic_end symbols completely.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We always have this instruction available, so no need to use
btfixup for it any more.

This also eradicates the whole of atomic_32.S and thus the
__atomic_begin and __atomic_end symbols completely.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc: Use popc when possible for ffs/__ffs/ffz.</title>
<updated>2011-08-03T04:28:53+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-08-03T03:23:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=56d205cc5c0a3032a605121d4253e111193bf923'/>
<id>56d205cc5c0a3032a605121d4253e111193bf923</id>
<content type='text'>
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sparc: Use popc if possible for hweight routines.</title>
<updated>2011-08-03T04:28:50+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-07-29T16:42:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ef7c4d4675d2a9206f913f26ca1a5cd41bff9d41'/>
<id>ef7c4d4675d2a9206f913f26ca1a5cd41bff9d41</id>
<content type='text'>
Just like powerpc, we code patch at boot time.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just like powerpc, we code patch at boot time.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
