<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/tools/include/nolibc/string.h, branch v6.9-rc4</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function</title>
<updated>2023-10-12T19:14:03+00:00</updated>
<author>
<name>Ammar Faizi</name>
<email>ammarfaizi2@gnuweeb.org</email>
</author>
<published>2023-09-02T13:35:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc61614de044fe298b5f8ba0876bb6f3294c2644'/>
<id>bc61614de044fe298b5f8ba0876bb6f3294c2644</id>
<content type='text'>
This function is only called by memcpy(), there is no real reason to
have this wrapper. Delete this function and move the code to memcpy()
directly.

Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This function is only called by memcpy(), there is no real reason to
have this wrapper. Delete this function and move the code to memcpy()
directly.

Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function</title>
<updated>2023-10-12T19:14:02+00:00</updated>
<author>
<name>Ammar Faizi</name>
<email>ammarfaizi2@gnuweeb.org</email>
</author>
<published>2023-09-02T13:35:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5dfc79b20e467f59878c86e9203c5c291d496b45'/>
<id>5dfc79b20e467f59878c86e9203c5c291d496b45</id>
<content type='text'>
This nolibc internal function is not used. Delete it. It was probably
supposed to handle memmove(), but today the memmove() has its own
implementation.

Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This nolibc internal function is not used. Delete it. It was probably
supposed to handle memmove(), but today the memmove() has its own
implementation.

Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: x86-64: Use `rep stosb` for `memset()`</title>
<updated>2023-10-12T19:14:00+00:00</updated>
<author>
<name>Ammar Faizi</name>
<email>ammarfaizi2@gnuweeb.org</email>
</author>
<published>2023-09-02T13:35:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=12108aa8c1a13bdf4024f1e2e94a6121ac644d2c'/>
<id>12108aa8c1a13bdf4024f1e2e94a6121ac644d2c</id>
<content type='text'>
Simplify memset() on the x86-64 arch.

The x86-64 arch has a 'rep stosb' instruction, which can perform
memset() using only a single instruction, given:

    %al  = value (just like the second argument of memset())
    %rdi = destination
    %rcx = length

Before this patch:
```
  00000000000010c9 &lt;memset&gt;:
    10c9: 48 89 f8              mov    %rdi,%rax
    10cc: 48 85 d2              test   %rdx,%rdx
    10cf: 74 0e                 je     10df &lt;memset+0x16&gt;
    10d1: 31 c9                 xor    %ecx,%ecx
    10d3: 40 88 34 08           mov    %sil,(%rax,%rcx,1)
    10d7: 48 ff c1              inc    %rcx
    10da: 48 39 ca              cmp    %rcx,%rdx
    10dd: 75 f4                 jne    10d3 &lt;memset+0xa&gt;
    10df: c3                    ret
```

After this patch:
```
  0000000000001511 &lt;memset&gt;:
    1511: 96                    xchg   %eax,%esi
    1512: 48 89 d1              mov    %rdx,%rcx
    1515: 57                    push   %rdi
    1516: f3 aa                 rep stos %al,%es:(%rdi)
    1518: 58                    pop    %rax
    1519: c3                    ret
```

v2:
  - Use pushq %rdi / popq %rax (Alviro).
  - Use xchg %eax, %esi (Willy).

Link: https://lore.kernel.org/lkml/ZO9e6h2jjVIMpBJP@1wt.eu
Suggested-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Suggested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Simplify memset() on the x86-64 arch.

The x86-64 arch has a 'rep stosb' instruction, which can perform
memset() using only a single instruction, given:

    %al  = value (just like the second argument of memset())
    %rdi = destination
    %rcx = length

Before this patch:
```
  00000000000010c9 &lt;memset&gt;:
    10c9: 48 89 f8              mov    %rdi,%rax
    10cc: 48 85 d2              test   %rdx,%rdx
    10cf: 74 0e                 je     10df &lt;memset+0x16&gt;
    10d1: 31 c9                 xor    %ecx,%ecx
    10d3: 40 88 34 08           mov    %sil,(%rax,%rcx,1)
    10d7: 48 ff c1              inc    %rcx
    10da: 48 39 ca              cmp    %rcx,%rdx
    10dd: 75 f4                 jne    10d3 &lt;memset+0xa&gt;
    10df: c3                    ret
```

After this patch:
```
  0000000000001511 &lt;memset&gt;:
    1511: 96                    xchg   %eax,%esi
    1512: 48 89 d1              mov    %rdx,%rcx
    1515: 57                    push   %rdi
    1516: f3 aa                 rep stos %al,%es:(%rdi)
    1518: 58                    pop    %rax
    1519: c3                    ret
```

v2:
  - Use pushq %rdi / popq %rax (Alviro).
  - Use xchg %eax, %esi (Willy).

Link: https://lore.kernel.org/lkml/ZO9e6h2jjVIMpBJP@1wt.eu
Suggested-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Suggested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Reviewed-by: Alviro Iskandar Setiawan &lt;alviro.iskandar@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()`</title>
<updated>2023-10-12T19:13:56+00:00</updated>
<author>
<name>Ammar Faizi</name>
<email>ammarfaizi2@gnuweeb.org</email>
</author>
<published>2023-09-02T13:35:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=553845eebd6003314bde2f40c1721275bfd2bfcd'/>
<id>553845eebd6003314bde2f40c1721275bfd2bfcd</id>
<content type='text'>
Simplify memcpy() and memmove() on the x86-64 arch.

The x86-64 arch has a 'rep movsb' instruction, which can perform
memcpy() using only a single instruction, given:

    %rdi = destination
    %rsi = source
    %rcx = length

Additionally, it can also handle the overlapping case by setting DF=1
(backward copy), which can be used as the memmove() implementation.

Before this patch:
```
  00000000000010ab &lt;memmove&gt;:
    10ab: 48 89 f8              mov    %rdi,%rax
    10ae: 31 c9                 xor    %ecx,%ecx
    10b0: 48 39 f7              cmp    %rsi,%rdi
    10b3: 48 83 d1 ff           adc    $0xffffffffffffffff,%rcx
    10b7: 48 85 d2              test   %rdx,%rdx
    10ba: 74 25                 je     10e1 &lt;memmove+0x36&gt;
    10bc: 48 83 c9 01           or     $0x1,%rcx
    10c0: 48 39 f0              cmp    %rsi,%rax
    10c3: 48 c7 c7 ff ff ff ff  mov    $0xffffffffffffffff,%rdi
    10ca: 48 0f 43 fa           cmovae %rdx,%rdi
    10ce: 48 01 cf              add    %rcx,%rdi
    10d1: 44 8a 04 3e           mov    (%rsi,%rdi,1),%r8b
    10d5: 44 88 04 38           mov    %r8b,(%rax,%rdi,1)
    10d9: 48 01 cf              add    %rcx,%rdi
    10dc: 48 ff ca              dec    %rdx
    10df: 75 f0                 jne    10d1 &lt;memmove+0x26&gt;
    10e1: c3                    ret

  00000000000010e2 &lt;memcpy&gt;:
    10e2: 48 89 f8              mov    %rdi,%rax
    10e5: 48 85 d2              test   %rdx,%rdx
    10e8: 74 12                 je     10fc &lt;memcpy+0x1a&gt;
    10ea: 31 c9                 xor    %ecx,%ecx
    10ec: 40 8a 3c 0e           mov    (%rsi,%rcx,1),%dil
    10f0: 40 88 3c 08           mov    %dil,(%rax,%rcx,1)
    10f4: 48 ff c1              inc    %rcx
    10f7: 48 39 ca              cmp    %rcx,%rdx
    10fa: 75 f0                 jne    10ec &lt;memcpy+0xa&gt;
    10fc: c3                    ret
```

After this patch:
```
  // memmove is an alias for memcpy
  000000000040133b &lt;memcpy&gt;:
    40133b: 48 89 d1              mov    %rdx,%rcx
    40133e: 48 89 f8              mov    %rdi,%rax
    401341: 48 89 fa              mov    %rdi,%rdx
    401344: 48 29 f2              sub    %rsi,%rdx
    401347: 48 39 ca              cmp    %rcx,%rdx
    40134a: 72 03                 jb     40134f &lt;memcpy+0x14&gt;
    40134c: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40134e: c3                    ret
    40134f: 48 8d 7c 0f ff        lea    -0x1(%rdi,%rcx,1),%rdi
    401354: 48 8d 74 0e ff        lea    -0x1(%rsi,%rcx,1),%rsi
    401359: fd                    std
    40135a: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40135c: fc                    cld
    40135d: c3                    ret
```

v3:
  - Make memmove as an alias for memcpy (Willy).
  - Make the forward copy the likely case (Alviro).

v2:
  - Fix the broken memmove implementation (David).

Link: https://lore.kernel.org/lkml/20230902062237.GA23141@1wt.eu
Link: https://lore.kernel.org/lkml/5a821292d96a4dbc84c96ccdc6b5b666@AcuMS.aculab.com
Suggested-by: David Laight &lt;David.Laight@aculab.com&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Simplify memcpy() and memmove() on the x86-64 arch.

The x86-64 arch has a 'rep movsb' instruction, which can perform
memcpy() using only a single instruction, given:

    %rdi = destination
    %rsi = source
    %rcx = length

Additionally, it can also handle the overlapping case by setting DF=1
(backward copy), which can be used as the memmove() implementation.

Before this patch:
```
  00000000000010ab &lt;memmove&gt;:
    10ab: 48 89 f8              mov    %rdi,%rax
    10ae: 31 c9                 xor    %ecx,%ecx
    10b0: 48 39 f7              cmp    %rsi,%rdi
    10b3: 48 83 d1 ff           adc    $0xffffffffffffffff,%rcx
    10b7: 48 85 d2              test   %rdx,%rdx
    10ba: 74 25                 je     10e1 &lt;memmove+0x36&gt;
    10bc: 48 83 c9 01           or     $0x1,%rcx
    10c0: 48 39 f0              cmp    %rsi,%rax
    10c3: 48 c7 c7 ff ff ff ff  mov    $0xffffffffffffffff,%rdi
    10ca: 48 0f 43 fa           cmovae %rdx,%rdi
    10ce: 48 01 cf              add    %rcx,%rdi
    10d1: 44 8a 04 3e           mov    (%rsi,%rdi,1),%r8b
    10d5: 44 88 04 38           mov    %r8b,(%rax,%rdi,1)
    10d9: 48 01 cf              add    %rcx,%rdi
    10dc: 48 ff ca              dec    %rdx
    10df: 75 f0                 jne    10d1 &lt;memmove+0x26&gt;
    10e1: c3                    ret

  00000000000010e2 &lt;memcpy&gt;:
    10e2: 48 89 f8              mov    %rdi,%rax
    10e5: 48 85 d2              test   %rdx,%rdx
    10e8: 74 12                 je     10fc &lt;memcpy+0x1a&gt;
    10ea: 31 c9                 xor    %ecx,%ecx
    10ec: 40 8a 3c 0e           mov    (%rsi,%rcx,1),%dil
    10f0: 40 88 3c 08           mov    %dil,(%rax,%rcx,1)
    10f4: 48 ff c1              inc    %rcx
    10f7: 48 39 ca              cmp    %rcx,%rdx
    10fa: 75 f0                 jne    10ec &lt;memcpy+0xa&gt;
    10fc: c3                    ret
```

After this patch:
```
  // memmove is an alias for memcpy
  000000000040133b &lt;memcpy&gt;:
    40133b: 48 89 d1              mov    %rdx,%rcx
    40133e: 48 89 f8              mov    %rdi,%rax
    401341: 48 89 fa              mov    %rdi,%rdx
    401344: 48 29 f2              sub    %rsi,%rdx
    401347: 48 39 ca              cmp    %rcx,%rdx
    40134a: 72 03                 jb     40134f &lt;memcpy+0x14&gt;
    40134c: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40134e: c3                    ret
    40134f: 48 8d 7c 0f ff        lea    -0x1(%rdi,%rcx,1),%rdi
    401354: 48 8d 74 0e ff        lea    -0x1(%rsi,%rcx,1),%rsi
    401359: fd                    std
    40135a: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40135c: fc                    cld
    40135d: c3                    ret
```

v3:
  - Make memmove as an alias for memcpy (Willy).
  - Make the forward copy the likely case (Alviro).

v2:
  - Fix the broken memmove implementation (David).

Link: https://lore.kernel.org/lkml/20230902062237.GA23141@1wt.eu
Link: https://lore.kernel.org/lkml/5a821292d96a4dbc84c96ccdc6b5b666@AcuMS.aculab.com
Suggested-by: David Laight &lt;David.Laight@aculab.com&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: use standard __asm__ statements</title>
<updated>2023-06-09T18:46:07+00:00</updated>
<author>
<name>Thomas Weißschuh</name>
<email>linux@weissschuh.net</email>
</author>
<published>2023-04-06T21:54:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7f291cfa90d7f95da11fe9aa7062344ddfce603a'/>
<id>7f291cfa90d7f95da11fe9aa7062344ddfce603a</id>
<content type='text'>
Most of the code was migrated to C99-conformant __asm__ statements
before. It seems string.h was missed.

Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Most of the code was migrated to C99-conformant __asm__ statements
before. It seems string.h was missed.

Signed-off-by: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: prevent gcc from making memset() loop over itself</title>
<updated>2023-01-09T17:36:05+00:00</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2023-01-09T07:54:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1bfbe1f3e96720daf185f03d101f072d69753f88'/>
<id>1bfbe1f3e96720daf185f03d101f072d69753f88</id>
<content type='text'>
When building on ARM in thumb mode with gcc-11.3 at -O2 or -O3,
nolibc-test segfaults during the select() tests. It turns out that at
this level, gcc recognizes an opportunity for using memset() to zero
the fd_set, but it miscompiles it because it also recognizes a memset
pattern as well, and decides to call memset() from the memset() code:

  000122bc &lt;memset&gt;:
     122bc:       b510            push    {r4, lr}
     122be:       0004            movs    r4, r0
     122c0:       2a00            cmp     r2, #0
     122c2:       d003            beq.n   122cc &lt;memset+0x10&gt;
     122c4:       23ff            movs    r3, #255        ; 0xff
     122c6:       4019            ands    r1, r3
     122c8:       f7ff fff8       bl      122bc &lt;memset&gt;
     122cc:       0020            movs    r0, r4
     122ce:       bd10            pop     {r4, pc}

Simply placing an empty asm() statement inside the loop suffices to
avoid this.

Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When building on ARM in thumb mode with gcc-11.3 at -O2 or -O3,
nolibc-test segfaults during the select() tests. It turns out that at
this level, gcc recognizes an opportunity for using memset() to zero
the fd_set, but it miscompiles it because it also recognizes a memset
pattern as well, and decides to call memset() from the memset() code:

  000122bc &lt;memset&gt;:
     122bc:       b510            push    {r4, lr}
     122be:       0004            movs    r4, r0
     122c0:       2a00            cmp     r2, #0
     122c2:       d003            beq.n   122cc &lt;memset+0x10&gt;
     122c4:       23ff            movs    r3, #255        ; 0xff
     122c6:       4019            ands    r1, r3
     122c8:       f7ff fff8       bl      122bc &lt;memset&gt;
     122cc:       0020            movs    r0, r4
     122ce:       bd10            pop     {r4, pc}

Simply placing an empty asm() statement inside the loop suffices to
avoid this.

Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: fix missing includes causing build issues at -O0</title>
<updated>2023-01-09T17:36:05+00:00</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2023-01-09T07:54:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=55abdd1f5e1e07418bf4a46c233a92f83cb5ae97'/>
<id>55abdd1f5e1e07418bf4a46c233a92f83cb5ae97</id>
<content type='text'>
After the nolibc includes were split to facilitate portability from
standard libcs, programs that include only what they need may miss
some symbols which are needed by libgcc. This is the case for raise()
which is needed by the divide by zero code in some architectures for
example.

Regardless, being able to include only the apparently needed files is
convenient.

Instead of trying to move all exported definitions to a single file,
since this can change over time, this patch takes another approach
consisting in including the nolibc header at the end of all standard
include files. This way their types and functions are already known
at the moment of inclusion, and including any single one of them is
sufficient to bring all the required ones.

Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the nolibc includes were split to facilitate portability from
standard libcs, programs that include only what they need may miss
some symbols which are needed by libgcc. This is the case for raise()
which is needed by the divide by zero code in some architectures for
example.

Regardless, being able to include only the apparently needed files is
convenient.

Instead of trying to move all exported definitions to a single file,
since this can change over time, this patch takes another approach
consisting in including the nolibc header at the end of all standard
include files. This way their types and functions are already known
at the moment of inclusion, and including any single one of them is
sufficient to bring all the required ones.

Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc/string: Fix memcmp() implementation</title>
<updated>2022-10-28T22:07:02+00:00</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2022-10-21T06:01:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b3f4f51ea68a495f8a5956064c33dce711a2df91'/>
<id>b3f4f51ea68a495f8a5956064c33dce711a2df91</id>
<content type='text'>
The C standard says that memcmp() must treat the buffers as consisting
of "unsigned chars". If char happens to be unsigned, the casts are ok,
but then obviously the c1 variable can never contain a negative
value. And when char is signed, the casts are wrong, and there's still
a problem with using an 8-bit quantity to hold the difference, because
that can range from -255 to +255.

For example, assuming char is signed, comparing two 1-byte buffers,
one containing 0x00 and another 0x80, the current implementation would
return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one
of those should of course return something positive.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The C standard says that memcmp() must treat the buffers as consisting
of "unsigned chars". If char happens to be unsigned, the casts are ok,
but then obviously the c1 variable can never contain a negative
value. And when char is signed, the casts are wrong, and there's still
a problem with using an 8-bit quantity to hold the difference, because
that can range from -255 to +255.

For example, assuming char is signed, comparing two 1-byte buffers,
one containing 0x00 and another 0x80, the current implementation would
return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one
of those should of course return something positive.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc: Fix missing strlen() definition and infinite loop with gcc-12</title>
<updated>2022-10-28T22:07:02+00:00</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2022-10-09T18:29:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bfc3b0f05653a28c8d41067a2aa3875d1f982e3e'/>
<id>bfc3b0f05653a28c8d41067a2aa3875d1f982e3e</id>
<content type='text'>
When built at -Os, gcc-12 recognizes an strlen() pattern in nolibc_strlen()
and replaces it with a jump to strlen(), which is not defined as a symbol
and breaks compilation. Worse, when the function is called strlen(), the
function is simply replaced with a jump to itself, hence becomes an
infinite loop.

One way to avoid this is to always set -ffreestanding, but the calling
code doesn't know this and there's no way (either via attributes or
pragmas) to globally enable it from include files, effectively leaving
a painful situation for the caller.

Alexey suggested to place an empty asm() statement inside the loop to
stop gcc from recognizing a well-known pattern, which happens to work
pretty fine. At least it allows us to make sure our local definition
is not replaced with a self jump.

The function only needs to be renamed back to strlen() so that the symbol
exists, which implies that nolibc_strlen() which is used on variable
strings has to be declared as a macro that points back to it before the
strlen() macro is redifined.

It was verified to produce valid code with gcc 3.4 to 12.1 at different
optimization levels, and both with constant and variable strings.

In case this problem surfaces again in the future, an alternate approach
consisting in adding an optimize("no-tree-loop-distribute-patterns")
function attribute for gcc&gt;=12 worked as well but is less pretty.

Reported-by: kernel test robot &lt;yujie.liu@intel.com&gt;
Link: https://lore.kernel.org/r/202210081618.754a77db-yujie.liu@intel.com
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Fixes: 96980b833a21 ("tools/nolibc/string: do not use __builtin_strlen() at -O0")
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When built at -Os, gcc-12 recognizes an strlen() pattern in nolibc_strlen()
and replaces it with a jump to strlen(), which is not defined as a symbol
and breaks compilation. Worse, when the function is called strlen(), the
function is simply replaced with a jump to itself, hence becomes an
infinite loop.

One way to avoid this is to always set -ffreestanding, but the calling
code doesn't know this and there's no way (either via attributes or
pragmas) to globally enable it from include files, effectively leaving
a painful situation for the caller.

Alexey suggested to place an empty asm() statement inside the loop to
stop gcc from recognizing a well-known pattern, which happens to work
pretty fine. At least it allows us to make sure our local definition
is not replaced with a self jump.

The function only needs to be renamed back to strlen() so that the symbol
exists, which implies that nolibc_strlen() which is used on variable
strings has to be declared as a macro that points back to it before the
strlen() macro is redifined.

It was verified to produce valid code with gcc 3.4 to 12.1 at different
optimization levels, and both with constant and variable strings.

In case this problem surfaces again in the future, an alternate approach
consisting in adding an optimize("no-tree-loop-distribute-patterns")
function attribute for gcc&gt;=12 worked as well but is less pretty.

Reported-by: kernel test robot &lt;yujie.liu@intel.com&gt;
Link: https://lore.kernel.org/r/202210081618.754a77db-yujie.liu@intel.com
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Fixes: 96980b833a21 ("tools/nolibc/string: do not use __builtin_strlen() at -O0")
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools/nolibc/string: Implement `strdup()` and `strndup()`</title>
<updated>2022-04-21T00:05:46+00:00</updated>
<author>
<name>Ammar Faizi</name>
<email>ammarfaizi2@gnuweeb.org</email>
</author>
<published>2022-03-29T10:17:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=11dbdaeff41d9ec9376476889651fac4838bff99'/>
<id>11dbdaeff41d9ec9376476889651fac4838bff99</id>
<content type='text'>
These functions are currently only available on architectures that have
my_syscall6() macro implemented. Since these functions use malloc(),
malloc() uses mmap(), mmap() depends on my_syscall6() macro.

On architectures that don't support my_syscall6(), these function will
always return NULL with errno set to ENOSYS.

Acked-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These functions are currently only available on architectures that have
my_syscall6() macro implemented. Since these functions use malloc(),
malloc() uses mmap(), mmap() depends on my_syscall6() macro.

On architectures that don't support my_syscall6(), these function will
always return NULL with errno set to ENOSYS.

Acked-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Ammar Faizi &lt;ammarfaizi2@gnuweeb.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
