<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/riscv/lib, branch v5.18</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>riscv: Fixed misaligned memory access. Fixed pointer comparison.</title>
<updated>2022-03-10T18:24:04+00:00</updated>
<author>
<name>Michael T. Kloos</name>
<email>michael@michaelkloos.com</email>
</author>
<published>2022-03-08T01:03:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9d1f0ec9f71780e69ceb9d91697600c747d6e02e'/>
<id>9d1f0ec9f71780e69ceb9d91697600c747d6e02e</id>
<content type='text'>
Rewrote the RISC-V memmove() assembly implementation.  The
previous implementation did not check memory alignment and it
compared 2 pointers with a signed comparison.  The misaligned
memory access would cause the kernel to crash on systems that
did not emulate it in firmware and did not support it in hardware.
Firmware emulation is slow and may not exist.  The RISC-V spec
does not guarantee that support for misaligned memory accesses
will exist.  It should not be depended on.

This patch now checks for XLEN granularity of co-alignment between
the pointers.  Failing that, copying is done by loading from the 2
contiguous and naturally aligned XLEN memory locations containing
the overlapping XLEN sized data to be copied.  The data is shifted
into the correct place and binary or'ed together on each
iteration.  The result is then stored into the corresponding
naturally aligned XLEN sized location in the destination.  For
unaligned data at the terminations of the regions to be copied
or for copies less than (2 * XLEN) in size, byte copy is used.

This patch also now uses unsigned comparison for the pointers and
migrates to the newer assembler annotations from the now deprecated
ones.

Signed-off-by: Michael T. Kloos &lt;michael@michaelkloos.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rewrote the RISC-V memmove() assembly implementation.  The
previous implementation did not check memory alignment and it
compared 2 pointers with a signed comparison.  The misaligned
memory access would cause the kernel to crash on systems that
did not emulate it in firmware and did not support it in hardware.
Firmware emulation is slow and may not exist.  The RISC-V spec
does not guarantee that support for misaligned memory accesses
will exist.  It should not be depended on.

This patch now checks for XLEN granularity of co-alignment between
the pointers.  Failing that, copying is done by loading from the 2
contiguous and naturally aligned XLEN memory locations containing
the overlapping XLEN sized data to be copied.  The data is shifted
into the correct place and binary or'ed together on each
iteration.  The result is then stored into the corresponding
naturally aligned XLEN sized location in the destination.  For
unaligned data at the terminations of the regions to be copied
or for copies less than (2 * XLEN) in size, byte copy is used.

This patch also now uses unsigned comparison for the pointers and
migrates to the newer assembler annotations from the now deprecated
ones.

Signed-off-by: Michael T. Kloos &lt;michael@michaelkloos.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: extable: consolidate definitions</title>
<updated>2022-01-06T01:52:47+00:00</updated>
<author>
<name>Jisheng Zhang</name>
<email>jszhang@kernel.org</email>
</author>
<published>2021-11-18T11:25:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6dd10d9166a0c06260e0ac6b1fac454117c8024a'/>
<id>6dd10d9166a0c06260e0ac6b1fac454117c8024a</id>
<content type='text'>
This is a riscv port of commit 819771cc2892 ("arm64: extable:
consolidate definitions").

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a riscv port of commit 819771cc2892 ("arm64: extable:
consolidate definitions").

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: lib: uaccess: fold fixups into body</title>
<updated>2022-01-06T01:52:39+00:00</updated>
<author>
<name>Jisheng Zhang</name>
<email>jszhang@kernel.org</email>
</author>
<published>2021-11-18T11:25:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9d504f9aa5c1b76673018da9503e76b351a24b8c'/>
<id>9d504f9aa5c1b76673018da9503e76b351a24b8c</id>
<content type='text'>
uaccess functions such __asm_copy_to_user(),  __arch_copy_from_user()
and __clear_user() place their exception fixups in the `.fixup` section
without any clear association with themselves. If we backtrace the
fixup code, it will be symbolized as an offset from the nearest prior
symbol.

Similar as arm64 does, we must move fixups into the body of the
functions themselves, after the usual fast-path returns.

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
uaccess functions such __asm_copy_to_user(),  __arch_copy_from_user()
and __clear_user() place their exception fixups in the `.fixup` section
without any clear association with themselves. If we backtrace the
fixup code, it will be symbolized as an offset from the nearest prior
symbol.

Similar as arm64 does, we must move fixups into the body of the
functions themselves, after the usual fast-path returns.

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: switch to relative exception tables</title>
<updated>2022-01-06T01:52:20+00:00</updated>
<author>
<name>Jisheng Zhang</name>
<email>jszhang@kernel.org</email>
</author>
<published>2021-11-18T11:22:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bb1f85d6046f0db757ac52ed60a5eba5df394819'/>
<id>bb1f85d6046f0db757ac52ed60a5eba5df394819</id>
<content type='text'>
Similar as other architectures such as arm64, x86 and so on, use
offsets relative to the exception table entry values rather than
absolute addresses for both the exception locationand the fixup.

However, RISCV label difference will actually produce two relocations,
a pair of R_RISCV_ADD32 and R_RISCV_SUB32. Take below simple code for
example:

$ cat test.S
.section .text
1:
        nop
.section __ex_table,"a"
        .balign 4
        .long (1b - .)
.previous

$ riscv64-linux-gnu-gcc -c test.S
$ riscv64-linux-gnu-readelf -r test.o
Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000600000023 R_RISCV_ADD32     0000000000000000 .L1^B1 + 0
000000000000  000500000027 R_RISCV_SUB32     0000000000000000 .L0  + 0

The modpost will complain the R_RISCV_SUB32 relocation, so we need to
patch modpost.c to skip this relocation for .rela__ex_table section.

After this patch, the __ex_table section size of defconfig vmlinux is
reduced from 7072 Bytes to 3536 Bytes.

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Reviewed-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar as other architectures such as arm64, x86 and so on, use
offsets relative to the exception table entry values rather than
absolute addresses for both the exception locationand the fixup.

However, RISCV label difference will actually produce two relocations,
a pair of R_RISCV_ADD32 and R_RISCV_SUB32. Take below simple code for
example:

$ cat test.S
.section .text
1:
        nop
.section __ex_table,"a"
        .balign 4
        .long (1b - .)
.previous

$ riscv64-linux-gnu-gcc -c test.S
$ riscv64-linux-gnu-readelf -r test.o
Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000600000023 R_RISCV_ADD32     0000000000000000 .L1^B1 + 0
000000000000  000500000027 R_RISCV_SUB32     0000000000000000 .L0  + 0

The modpost will complain the R_RISCV_SUB32 relocation, so we need to
patch modpost.c to skip this relocation for .rela__ex_table section.

After this patch, the __ex_table section size of defconfig vmlinux is
reduced from 7072 Bytes to 3536 Bytes.

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Reviewed-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>include/linux/delay.h: replace kernel.h with the necessary inclusions</title>
<updated>2021-11-09T18:02:49+00:00</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2021-11-09T02:32:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5f6286a608107a68646241b57d2bd2a4fc139ef9'/>
<id>5f6286a608107a68646241b57d2bd2a4fc139ef9</id>
<content type='text'>
When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

[akpm@linux-foundation.org: cxd2880_common.h needs bits.h for GENMASK()]
[andriy.shevchenko@linux.intel.com: delay.h: fix for removed kernel.h]
  Link: https://lkml.kernel.org/r/20211028170143.56523-1-andriy.shevchenko@linux.intel.com
[akpm@linux-foundation.org: include/linux/fwnode.h needs bits.h for BIT()]

Link: https://lkml.kernel.org/r/20211027150324.79827-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

[akpm@linux-foundation.org: cxd2880_common.h needs bits.h for GENMASK()]
[andriy.shevchenko@linux.intel.com: delay.h: fix for removed kernel.h]
  Link: https://lkml.kernel.org/r/20211028170143.56523-1-andriy.shevchenko@linux.intel.com
[akpm@linux-foundation.org: include/linux/fwnode.h needs bits.h for BIT()]

Link: https://lkml.kernel.org/r/20211027150324.79827-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: __asm_copy_to-from_user: Fix: Typos in comments</title>
<updated>2021-07-24T00:49:12+00:00</updated>
<author>
<name>Akira Tsukamoto</name>
<email>akira.tsukamoto@gmail.com</email>
</author>
<published>2021-07-20T08:53:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ea196c548c0ac407afd31d142712b6da8bd00244'/>
<id>ea196c548c0ac407afd31d142712b6da8bd00244</id>
<content type='text'>
Fixing typos and grammar mistakes and using more intuitive label
name.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixing typos and grammar mistakes and using more intuitive label
name.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: __asm_copy_to-from_user: Remove unnecessary size check</title>
<updated>2021-07-24T00:49:07+00:00</updated>
<author>
<name>Akira Tsukamoto</name>
<email>akira.tsukamoto@gmail.com</email>
</author>
<published>2021-07-20T08:52:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d4b3e0105e3c2411af666a50b1bf2d25656a5e83'/>
<id>d4b3e0105e3c2411af666a50b1bf2d25656a5e83</id>
<content type='text'>
Clean up:

The size of 0 will be evaluated in the next step. Not
required here.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clean up:

The size of 0 will be evaluated in the next step. Not
required here.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: __asm_copy_to-from_user: Fix: fail on RV32</title>
<updated>2021-07-24T00:49:01+00:00</updated>
<author>
<name>Akira Tsukamoto</name>
<email>akira.tsukamoto@gmail.com</email>
</author>
<published>2021-07-20T08:51:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=22b5f16ffeff38938ad7420a2bfa3c281c36fd17'/>
<id>22b5f16ffeff38938ad7420a2bfa3c281c36fd17</id>
<content type='text'>
Had a bug when converting bytes to bits when the cpu was rv32.

The a3 contains the number of bytes and multiple of 8
would be the bits. The LGREG is holding 2 for RV32 and 3 for
RV32, so to achieve multiple of 8 it must always be constant 3.
The 2 was mistakenly used for rv32.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Had a bug when converting bytes to bits when the cpu was rv32.

The a3 contains the number of bytes and multiple of 8
would be the bits. The LGREG is holding 2 for RV32 and 3 for
RV32, so to achieve multiple of 8 it must always be constant 3.
The 2 was mistakenly used for rv32.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: __asm_copy_to-from_user: Fix: overrun copy</title>
<updated>2021-07-24T00:48:52+00:00</updated>
<author>
<name>Akira Tsukamoto</name>
<email>akira.tsukamoto@gmail.com</email>
</author>
<published>2021-07-20T08:50:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6010d300f9f7e16d1bf327b4730bcd0c0886d9e6'/>
<id>6010d300f9f7e16d1bf327b4730bcd0c0886d9e6</id>
<content type='text'>
There were two causes for the overrun memory access.

The threshold size was too small.
The aligning dst require one SZREG and unrolling word copy requires
8*SZREG, total have to be at least 9*SZREG.

Inside the unrolling copy, the subtracting -(8*SZREG-1) would make
iteration happening one extra loop. Proper value is -(8*SZREG).

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There were two causes for the overrun memory access.

The threshold size was too small.
The aligning dst require one SZREG and unrolling word copy requires
8*SZREG, total have to be at least 9*SZREG.

Inside the unrolling copy, the subtracting -(8*SZREG-1) would make
iteration happening one extra loop. Proper value is -(8*SZREG).

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall</title>
<updated>2021-07-06T22:09:48+00:00</updated>
<author>
<name>Akira Tsukamoto</name>
<email>akira.tsukamoto@gmail.com</email>
</author>
<published>2021-06-23T12:40:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca6eaaa210deec0e41cbfc380bf89cf079203569'/>
<id>ca6eaaa210deec0e41cbfc380bf89cf079203569</id>
<content type='text'>
This patch will reduce cpu usage dramatically in kernel space especially
for application which use sys-call with large buffer size, such as
network applications. The main reason behind this is that every
unaligned memory access will raise exceptions and switch between s-mode
and m-mode causing large overhead.

First copy in bytes until reaches the first word aligned boundary in
destination memory address. This is the preparation before the bulk
aligned word copy.

The destination address is aligned now, but oftentimes the source
address is not in an aligned boundary. To reduce the unaligned memory
access, it reads the data from source in aligned boundaries, which will
cause the data to have an offset, and then combines the data in the next
iteration by fixing offset with shifting before writing to destination.
The majority of the improving copy speed comes from this shift copy.

In the lucky situation that the both source and destination address are
on the aligned boundary, perform load and store with register size to
copy the data. Without the unrolling, it will reduce the speed since the
next store instruction for the same register using from the load will
stall the pipeline.

At last, copying the remainder in one byte at a time.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch will reduce cpu usage dramatically in kernel space especially
for application which use sys-call with large buffer size, such as
network applications. The main reason behind this is that every
unaligned memory access will raise exceptions and switch between s-mode
and m-mode causing large overhead.

First copy in bytes until reaches the first word aligned boundary in
destination memory address. This is the preparation before the bulk
aligned word copy.

The destination address is aligned now, but oftentimes the source
address is not in an aligned boundary. To reduce the unaligned memory
access, it reads the data from source in aligned boundaries, which will
cause the data to have an offset, and then combines the data in the next
iteration by fixing offset with shifting before writing to destination.
The majority of the improving copy speed comes from this shift copy.

In the lucky situation that the both source and destination address are
on the aligned boundary, perform load and store with register size to
copy the data. Without the unrolling, it will reduce the speed since the
next store instruction for the same register using from the load will
stall the pipeline.

At last, copying the remainder in one byte at a time.

Signed-off-by: Akira Tsukamoto &lt;akira.tsukamoto@gmail.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmerdabbelt@google.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
