<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/parisc/include/asm/hash.h, branch v4.10</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>parisc: Add &lt;asm/hash.h&gt;</title>
<updated>2016-08-02T14:44:29+00:00</updated>
<author>
<name>George Spelvin</name>
<email>linux@sciencehorizons.net</email>
</author>
<published>2016-06-07T23:45:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=773e1c5fa4bf1faa25e119490b26ece2ef1bdb46'/>
<id>773e1c5fa4bf1faa25e119490b26ece2ef1bdb46</id>
<content type='text'>
PA-RISC is interesting; integer multiplies are implemented in the
FPU, so are painful in the kernel.  But it tries to be friendly to
shift-and-add sequences for constant multiplies.
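
As a rough illustration of the idea (not the hand-scheduled sequence
used in this patch), a constant multiply can be rebuilt from doublings
and adds alone.  The helper name and the sample constant below are
made up for the sketch; results are reduced modulo 2**32 to mimic a
32-bit register.

```python
WORD = 2**32  # assumed register width: results wrap modulo 2**32


def mul_const_shift_add(x, c):
    """Multiply x by the constant c using only doubling and addition.

    Doubling (term + term) stands in for a 1-bit left shift.  This is the
    naive one-add-per-set-bit form, not an optimized addition chain.
    """
    acc = 0
    term = x % WORD          # x shifted left by the current bit position
    while c != 0:
        if c % 2 == 1:       # lowest remaining bit of the constant is set
            acc = (acc + term) % WORD
        term = (term + term) % WORD   # shift left by one bit
        c //= 2              # move on to the next bit of the constant
    return acc


if __name__ == "__main__":
    x = 0x12345678
    c = 0x9E3779B1           # arbitrary 32-bit constant, for illustration only
    print(mul_const_shift_add(x, c) == (x * c) % WORD)
```

A real sequence would combine several bits per step and reuse
intermediate results, which is where the manual optimization goes.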

__hash_32 is implemented using the same shift-and-add sequence as
Microblaze, just scheduled for the PA7100.  (It's 2-way superscalar
but in-order, like the Pentium.)

hash_64 was tricky, but a suggestion from Jason Thong allowed a
good solution by breaking up the multiplier.  After a lot of manual
optimization, I found a 19-instruction sequence for the multiply that
can be executed in 10 cycles using only 4 temporaries.

(The PA8xxx can issue 4 instructions per cycle, but 2 must be ALU ops
and 2 must be loads/stores.  And the final add can't be paired.)

An alternative considered, but ultimately not used, was Thomas Wang's
64-to-32-bit integer hash.  At 12 instructions, it's smaller, but they're
all sequentially dependent, so it has longer latency.

https://web.archive.org/web/2011/http://www.concentric.net/~Ttwang/tech/inthash.htm
http://burtleburtle.net/bob/hash/integer.html

Signed-off-by: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PA-RISC is interesting; integer multiplies are implemented in the
FPU, so are painful in the kernel.  But it tries to be friendly to
shift-and-add sequences for constant multiplies.

__hash_32 is implemented using the same shift-and-add sequence as
Microblaze, just scheduled for the PA7100.  (It's 2-way superscalar
but in-order, like the Pentium.)

hash_64 was tricky, but a suggestion from Jason Thong allowed a
good solution by breaking up the multiplier.  After a lot of manual
optimization, I found a 19-instruction sequence for the multiply that
can be executed in 10 cycles using only 4 temporaries.

(The PA8xxx can issue 4 instructions per cycle, but 2 must be ALU ops
and 2 must be loads/stores.  And the final add can't be paired.)

An alternative considered, but ultimately not used, was Thomas Wang's
64-to-32-bit integer hash.  At 12 instructions, it's smaller, but they're
all sequentially dependent, so it has longer latency.

https://web.archive.org/web/2011/http://www.concentric.net/~Ttwang/tech/inthash.htm
http://burtleburtle.net/bob/hash/integer.html

Signed-off-by: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
