<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/lib/zstd, branch v5.19-rc7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>lib: zstd: Don't add -O3 to cflags</title>
<updated>2021-11-18T21:16:22+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2021-11-16T23:11:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7416cdc9b9c10968c57b1f73be5d48b3ecdaf3c8'/>
<id>7416cdc9b9c10968c57b1f73be5d48b3ecdaf3c8</id>
<content type='text'>
After the update to zstd-1.4.10 passing -O3 is no longer necessary to
get good performance from zstd. Using the default optimization level -O2
is sufficient to get good performance.

I've measured no significant change to compression speed, and a ~1%
decompression speed loss, which is acceptable.

This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
stacks that are ~3KB. With -O2 these same functions generate stacks in
the &lt; 100B, completely fixing the problem. Function size deltas are
listed below:

ZSTD_compressBlock_fast_extDict_generic: 3800 -&gt; 68
ZSTD_compressBlock_fast: 2216 -&gt; 40
ZSTD_compressBlock_fast_dictMatchState: 1848 -&gt;  64
ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -&gt; 76
ZSTD_fillDoubleHashTable: 3252 -&gt; 0
ZSTD_compressBlock_doubleFast: 5856 -&gt; 36
ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -&gt; 84
ZSTD_copmressBlock_lazy2: 2420 -&gt; 72

Additionally, this improves the reported code bloat [1]. With gcc-11
bloat-o-meter shows an 80KB code size improvement:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
Total: Before=6418562, After=6336372, chg -1.28%
```

Compared to before the zstd-1.4.10 update we see a total code size
regression of 105KB, down from 374KB at v5.16-rc1:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
Total: Before=6228850, After=6336372, chg +1.73%
```

[0] https://lkml.org/lkml/2021/11/15/710
[1] https://lkml.org/lkml/2021/11/14/189

Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Tested-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the update to zstd-1.4.10 passing -O3 is no longer necessary to
get good performance from zstd. Using the default optimization level -O2
is sufficient to get good performance.

I've measured no significant change to compression speed, and a ~1%
decompression speed loss, which is acceptable.

This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
stacks that are ~3KB. With -O2 these same functions generate stacks in
the &lt; 100B, completely fixing the problem. Function size deltas are
listed below:

ZSTD_compressBlock_fast_extDict_generic: 3800 -&gt; 68
ZSTD_compressBlock_fast: 2216 -&gt; 40
ZSTD_compressBlock_fast_dictMatchState: 1848 -&gt;  64
ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -&gt; 76
ZSTD_fillDoubleHashTable: 3252 -&gt; 0
ZSTD_compressBlock_doubleFast: 5856 -&gt; 36
ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -&gt; 84
ZSTD_copmressBlock_lazy2: 2420 -&gt; 72

Additionally, this improves the reported code bloat [1]. With gcc-11
bloat-o-meter shows an 80KB code size improvement:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
Total: Before=6418562, After=6336372, chg -1.28%
```

Compared to before the zstd-1.4.10 update we see a total code size
regression of 105KB, down from 374KB at v5.16-rc1:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
Total: Before=6228850, After=6336372, chg +1.73%
```

[0] https://lkml.org/lkml/2021/11/15/710
[1] https://lkml.org/lkml/2021/11/14/189

Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Tested-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Don't inline functions in zstd_opt.c</title>
<updated>2021-11-18T21:15:33+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2021-11-16T04:33:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1974990cca43a6ba708a70b15862113eb9c2f399'/>
<id>1974990cca43a6ba708a70b15862113eb9c2f399</id>
<content type='text'>
`zstd_opt.c` contains the match finder for the highest compression
levels. These levels are already very slow, and are unlikely to be used
in the kernel. If they are used, they shouldn't be used in latency
sensitive workloads, so slowing them down shouldn't be a big deal.

This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
I've also opened an issue upstream [1] so that we can properly tackle
the code size issue in `zstd_opt.c` for all users, and can hopefully
remove this hack in the next zstd version we import.

Bloat-o-meter output on x86-64:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
Function                                     old     new   delta
ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
ZSTD_insertBt1                                 -    1731   +1731
ZSTD_storeSeq                                  -     693    +693
ZSTD_BtGetAllMatches                           -     255    +255
ZSTD_updateRep                                 -     128    +128
ZSTD_updateTree                               96      99      +3
ZSTD_insertAndFindFirstIndexHash3             81       -     -81
ZSTD_setBasePrices.constprop                  98       -     -98
ZSTD_litLengthPrice.constprop                138       -    -138
ZSTD_count                                   362     181    -181
ZSTD_count_2segments                        1407     938    -469
ZSTD_insertBt1.constprop                    2689       -   -2689
ZSTD_compressBlock_btultra2                19990     423  -19567
ZSTD_compressBlock_btultra                 19633      15  -19618
ZSTD_initStats_ultra                       19825       -  -19825
ZSTD_compressBlock_btopt                   20374      12  -20362
ZSTD_compressBlock_btopt_extDict           29984      12  -29972
ZSTD_compressBlock_btultra_extDict         30718      15  -30703
ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
Total: Before=6611828, After=6418562, chg -2.92%
```

[0] https://lkml.org/lkml/2021/11/14/189
[1] https://github.com/facebook/zstd/issues/2862

Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Tested-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`zstd_opt.c` contains the match finder for the highest compression
levels. These levels are already very slow, and are unlikely to be used
in the kernel. If they are used, they shouldn't be used in latency
sensitive workloads, so slowing them down shouldn't be a big deal.

This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
I've also opened an issue upstream [1] so that we can properly tackle
the code size issue in `zstd_opt.c` for all users, and can hopefully
remove this hack in the next zstd version we import.

Bloat-o-meter output on x86-64:

```
&gt; ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
Function                                     old     new   delta
ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
ZSTD_insertBt1                                 -    1731   +1731
ZSTD_storeSeq                                  -     693    +693
ZSTD_BtGetAllMatches                           -     255    +255
ZSTD_updateRep                                 -     128    +128
ZSTD_updateTree                               96      99      +3
ZSTD_insertAndFindFirstIndexHash3             81       -     -81
ZSTD_setBasePrices.constprop                  98       -     -98
ZSTD_litLengthPrice.constprop                138       -    -138
ZSTD_count                                   362     181    -181
ZSTD_count_2segments                        1407     938    -469
ZSTD_insertBt1.constprop                    2689       -   -2689
ZSTD_compressBlock_btultra2                19990     423  -19567
ZSTD_compressBlock_btultra                 19633      15  -19618
ZSTD_initStats_ultra                       19825       -  -19825
ZSTD_compressBlock_btopt                   20374      12  -20362
ZSTD_compressBlock_btopt_extDict           29984      12  -29972
ZSTD_compressBlock_btultra_extDict         30718      15  -30703
ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
Total: Before=6611828, After=6418562, chg -2.92%
```

[0] https://lkml.org/lkml/2021/11/14/189
[1] https://github.com/facebook/zstd/issues/2862

Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Tested-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Fix unused variable warning</title>
<updated>2021-11-18T21:12:26+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2021-11-16T03:08:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ae8d67b2117f1ec6c8170d6e1af8ded17392bd2c'/>
<id>ae8d67b2117f1ec6c8170d6e1af8ded17392bd2c</id>
<content type='text'>
The variable `litLengthSum` is only used by an `assert()`, so when
asserts are disabled the compiler doesn't see any usage and warns.

This issue is already fixed upstream by PR #2838 [0]. It was reported
by the Kernel test robot in [1].

Another approach would be to change zstd's disabled `assert()`
definition to use the argument in a disabled branch, instead of
ignoring the argument. I've avoided this approach because there are
some small changes necessary to get zstd to build, and I would
want to thoroughly re-test for performance, since that is slightly
changing the code in every function in zstd. It seems like a
trivial change, but some functions are pretty sensitive to small
changes. However, I think it is a valid approach that I would
like to see upstream take, so I've opened Issue #2868 to attempt
this upstream.

Lastly, I've chosen not to use __maybe_unused because all code
in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
use __maybe_unused because it isn't portable across all compilers.

[0] https://github.com/facebook/zstd/pull/2838
[1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
[2] https://github.com/facebook/zstd/issues/2868

Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The variable `litLengthSum` is only used by an `assert()`, so when
asserts are disabled the compiler doesn't see any usage and warns.

This issue is already fixed upstream by PR #2838 [0]. It was reported
by the Kernel test robot in [1].

Another approach would be to change zstd's disabled `assert()`
definition to use the argument in a disabled branch, instead of
ignoring the argument. I've avoided this approach because there are
some small changes necessary to get zstd to build, and I would
want to thoroughly re-test for performance, since that is slightly
changing the code in every function in zstd. It seems like a
trivial change, but some functions are pretty sensitive to small
changes. However, I think it is a valid approach that I would
like to see upstream take, so I've opened Issue #2868 to attempt
this upstream.

Lastly, I've chosen not to use __maybe_unused because all code
in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
use __maybe_unused because it isn't portable across all compilers.

[0] https://github.com/facebook/zstd/pull/2838
[1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
[2] https://github.com/facebook/zstd/issues/2868

Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical</title>
<updated>2021-11-09T00:55:38+00:00</updated>
<author>
<name>Nathan Chancellor</name>
<email>nathan@kernel.org</email>
</author>
<published>2021-10-21T20:23:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a8ea235837cc39f27c45689930aa97ae91d5953'/>
<id>0a8ea235837cc39f27c45689930aa97ae91d5953</id>
<content type='text'>
A new warning in clang warns that there is an instance where boolean
expressions are being used with bitwise operators instead of logical
ones:

lib/zstd/decompress/huf_decompress.c:890:25: warning: use of bitwise '&amp;' with boolean operands [-Wbitwise-instead-of-logical]
                       (BIT_reloadDStreamFast(&amp;bitD1) == BIT_DStream_unfinished)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstd does this frequently to help with performance, as logical operators
have branches whereas bitwise ones do not.

To fix this warning in other cases, the expressions were placed on
separate lines with the '&amp;=' operator; however, this particular instance
was moved away from that so that it could be surrounded by LIKELY, which
is a macro for __builtin_expect(), to help with a performance
regression, according to upstream zstd pull #1973.

Aside from switching to logical operators, which is likely undesirable
in this instance, or disabling the warning outright, the solution is
casting one of the expressions to an integer type to make it clear to
clang that the author knows what they are doing. Add a cast to U32 to
silence the warning. The first U32 cast is to silence an instance of
-Wshorten-64-to-32 because __builtin_expect() returns long so it cannot
be moved.

Link: https://github.com/ClangBuiltLinux/linux/issues/1486
Link: https://github.com/facebook/zstd/pull/1973
Reported-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A new warning in clang warns that there is an instance where boolean
expressions are being used with bitwise operators instead of logical
ones:

lib/zstd/decompress/huf_decompress.c:890:25: warning: use of bitwise '&amp;' with boolean operands [-Wbitwise-instead-of-logical]
                       (BIT_reloadDStreamFast(&amp;bitD1) == BIT_DStream_unfinished)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstd does this frequently to help with performance, as logical operators
have branches whereas bitwise ones do not.

To fix this warning in other cases, the expressions were placed on
separate lines with the '&amp;=' operator; however, this particular instance
was moved away from that so that it could be surrounded by LIKELY, which
is a macro for __builtin_expect(), to help with a performance
regression, according to upstream zstd pull #1973.

Aside from switching to logical operators, which is likely undesirable
in this instance, or disabling the warning outright, the solution is
casting one of the expressions to an integer type to make it clear to
clang that the author knows what they are doing. Add a cast to U32 to
silence the warning. The first U32 cast is to silence an instance of
-Wshorten-64-to-32 because __builtin_expect() returns long so it cannot
be moved.

Link: https://github.com/ClangBuiltLinux/linux/issues/1486
Link: https://github.com/facebook/zstd/pull/1973
Reported-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Upgrade to latest upstream zstd version 1.4.10</title>
<updated>2021-11-09T00:55:32+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2020-09-11T23:37:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e0c1b49f5b674cca7b10549c53b3791d0bbc90a8'/>
<id>e0c1b49f5b674cca7b10549c53b3791d0bbc90a8</id>
<content type='text'>
Upgrade to the latest upstream zstd version 1.4.10.

This patch is 100% generated from upstream zstd commit 20821a46f412 [0].

This patch is very large because it is transitioning from the custom
kernel zstd to using upstream directly. The new zstd follows upstreams
file structure which is different. Future update patches will be much
smaller because they will only contain the changes from one upstream
zstd release.

As an aid for review I've created a commit [1] that shows the diff
between upstream zstd as-is (which doesn't compile), and the zstd
code imported in this patch. The verion of zstd in this patch is
generated from upstream with changes applied by automation to replace
upstreams libc dependencies, remove unnecessary portability macros,
replace `/**` comments with `/*` comments, and use the kernel's xxhash
instead of bundling it.

The benefits of this patch are as follows:
1. Using upstream directly with automated script to generate kernel
   code. This allows us to update the kernel every upstream release, so
   the kernel gets the latest bug fixes and performance improvements,
   and doesn't get 3 years out of date again. The automation and the
   translated code are tested every upstream commit to ensure it
   continues to work.
2. Upgrades from a custom zstd based on 1.3.1 to 1.4.10, getting 3 years
   of performance improvements and bug fixes. On x86_64 I've measured
   15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
   kernel decompression, and 30% faster ZRAM decompression+read speeds.
3. Zstd-1.4.10 supports negative compression levels, which allow zstd to
   match or subsume lzo's performance.
4. Maintains the same kernel-specific wrapper API, so no callers have to
   be modified with zstd version updates.

One concern that was brought up was stack usage. Upstream zstd had
already removed most of its heavy stack usage functions, but I just
removed the last functions that allocate arrays on the stack. I've
measured the high water mark for both compression and decompression
before and after this patch. Decompression is approximately neutral,
using about 1.2KB of stack space. Compression levels up to 3 regressed
from 1.4KB -&gt; 1.6KB, and higher compression levels regressed from 1.5KB
-&gt; 2KB. We've added unit tests upstream to prevent further regression.
I believe that this is a reasonable increase, and if it does end up
causing problems, this commit can be cleanly reverted, because it only
touches zstd.

I chose the bulk update instead of replaying upstream commits because
there have been ~3500 upstream commits since the 1.3.1 release, zstd
wasn't ready to be used in the kernel as-is before a month ago, and not
all upstream zstd commits build. The bulk update preserves bisectablity
because bugs can be bisected to the zstd version update. At that point
the update can be reverted, and we can work with upstream to find and
fix the bug.

Note that upstream zstd release 1.4.10 doesn't exist yet. I have cut a
staging branch at 20821a46f412 [0] and will apply any changes requested
to the staging branch. Once we're ready to merge this update I will cut
a zstd release at the commit we merge, so we have a known zstd release
in the kernel.

The implementation of the kernel API is contained in
zstd_compress_module.c and zstd_decompress_module.c.

[0] https://github.com/facebook/zstd/commit/20821a46f4122f9abd7c7b245d28162dde8129c9
[1] https://github.com/terrelln/linux/commit/e0fa481d0e3df26918da0a13749740a1f6777574

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Upgrade to the latest upstream zstd version 1.4.10.

This patch is 100% generated from upstream zstd commit 20821a46f412 [0].

This patch is very large because it is transitioning from the custom
kernel zstd to using upstream directly. The new zstd follows upstreams
file structure which is different. Future update patches will be much
smaller because they will only contain the changes from one upstream
zstd release.

As an aid for review I've created a commit [1] that shows the diff
between upstream zstd as-is (which doesn't compile), and the zstd
code imported in this patch. The verion of zstd in this patch is
generated from upstream with changes applied by automation to replace
upstreams libc dependencies, remove unnecessary portability macros,
replace `/**` comments with `/*` comments, and use the kernel's xxhash
instead of bundling it.

The benefits of this patch are as follows:
1. Using upstream directly with automated script to generate kernel
   code. This allows us to update the kernel every upstream release, so
   the kernel gets the latest bug fixes and performance improvements,
   and doesn't get 3 years out of date again. The automation and the
   translated code are tested every upstream commit to ensure it
   continues to work.
2. Upgrades from a custom zstd based on 1.3.1 to 1.4.10, getting 3 years
   of performance improvements and bug fixes. On x86_64 I've measured
   15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
   kernel decompression, and 30% faster ZRAM decompression+read speeds.
3. Zstd-1.4.10 supports negative compression levels, which allow zstd to
   match or subsume lzo's performance.
4. Maintains the same kernel-specific wrapper API, so no callers have to
   be modified with zstd version updates.

One concern that was brought up was stack usage. Upstream zstd had
already removed most of its heavy stack usage functions, but I just
removed the last functions that allocate arrays on the stack. I've
measured the high water mark for both compression and decompression
before and after this patch. Decompression is approximately neutral,
using about 1.2KB of stack space. Compression levels up to 3 regressed
from 1.4KB -&gt; 1.6KB, and higher compression levels regressed from 1.5KB
-&gt; 2KB. We've added unit tests upstream to prevent further regression.
I believe that this is a reasonable increase, and if it does end up
causing problems, this commit can be cleanly reverted, because it only
touches zstd.

I chose the bulk update instead of replaying upstream commits because
there have been ~3500 upstream commits since the 1.3.1 release, zstd
wasn't ready to be used in the kernel as-is before a month ago, and not
all upstream zstd commits build. The bulk update preserves bisectablity
because bugs can be bisected to the zstd version update. At that point
the update can be reverted, and we can work with upstream to find and
fix the bug.

Note that upstream zstd release 1.4.10 doesn't exist yet. I have cut a
staging branch at 20821a46f412 [0] and will apply any changes requested
to the staging branch. Once we're ready to merge this update I will cut
a zstd release at the commit we merge, so we have a known zstd release
in the kernel.

The implementation of the kernel API is contained in
zstd_compress_module.c and zstd_decompress_module.c.

[0] https://github.com/facebook/zstd/commit/20821a46f4122f9abd7c7b245d28162dde8129c9
[1] https://github.com/terrelln/linux/commit/e0fa481d0e3df26918da0a13749740a1f6777574

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Add decompress_sources.h for decompress_unzstd</title>
<updated>2021-11-09T00:55:26+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2020-09-14T19:54:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2479b523898633768e28796238534af31fbd6846'/>
<id>2479b523898633768e28796238534af31fbd6846</id>
<content type='text'>
Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: zstd: Add kernel-specific API</title>
<updated>2021-11-09T00:55:21+00:00</updated>
<author>
<name>Nick Terrell</name>
<email>terrelln@fb.com</email>
</author>
<published>2020-09-11T23:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cf30f6a5f0c60ec98a637b836bef6915f602c6ab'/>
<id>cf30f6a5f0c60ec98a637b836bef6915f602c6ab</id>
<content type='text'>
This patch:
- Moves `include/linux/zstd.h` -&gt; `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional change, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch:
- Moves `include/linux/zstd.h` -&gt; `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional change, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell &lt;terrelln@fb.com&gt;
Tested By: Paul Jones &lt;paul@pauljones.id.au&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt; # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard &lt;jd.girard@sysnux.pf&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib/decompressors: fix spelling mistakes</title>
<updated>2021-07-01T18:06:05+00:00</updated>
<author>
<name>Zhen Lei</name>
<email>thunder.leizhen@huawei.com</email>
</author>
<published>2021-07-01T01:55:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=05911c5d964956442d17fe21db239de5a1dace4a'/>
<id>05911c5d964956442d17fe21db239de5a1dace4a</id>
<content type='text'>
Fix some spelling mistakes in comments:
sentinal ==&gt; sentinel
compresed ==&gt; compressed
dependeny ==&gt; dependency
immediatelly ==&gt; immediately
dervied ==&gt; derived
splitted ==&gt; split
nore ==&gt; not
independed ==&gt; independent
asumed ==&gt; assumed

Link: https://lkml.kernel.org/r/20210604085656.12257-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix some spelling mistakes in comments:
sentinal ==&gt; sentinel
compresed ==&gt; compressed
dependeny ==&gt; dependency
immediatelly ==&gt; immediately
dervied ==&gt; derived
splitted ==&gt; split
nore ==&gt; not
independed ==&gt; independent
asumed ==&gt; assumed

Link: https://lkml.kernel.org/r/20210604085656.12257-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: Fix fall-through warnings for Clang</title>
<updated>2020-11-19T13:23:47+00:00</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-11-19T13:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36f9ff9e03de89691274a6aec45aa079bd3ae405'/>
<id>36f9ff9e03de89691274a6aec45aa079bd3ae405</id>
<content type='text'>
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of
letting the code fall through to the next case, and by replacing a
number of /* fall through */ comments with the new pseudo-keyword
macro fallthrough.

Notice that Clang doesn't recognize /* Fall through */ comments as
implicit fall-through markings.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of
letting the code fall through to the next case, and by replacing a
number of /* fall through */ comments with the new pseudo-keyword
macro fallthrough.

Notice that Clang doesn't recognize /* Fall through */ comments as
implicit fall-through markings.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"</title>
<updated>2020-11-18T20:15:17+00:00</updated>
<author>
<name>Nick Desaulniers</name>
<email>ndesaulniers@google.com</email>
</author>
<published>2020-11-16T04:35:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4c1ca831adb1010e473a18eb01b3fbef7595f230'/>
<id>4c1ca831adb1010e473a18eb01b3fbef7595f230</id>
<content type='text'>
This reverts commit 6a9dc5fd6170 ("lib: Revert use of fallthrough
pseudo-keyword in lib/")

Now that we can build arch/powerpc/boot/ free of -Wimplicit-fallthrough,
re-enable these fixes for lib/.

Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Reviewed-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Reviewed-by: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/236
Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 6a9dc5fd6170 ("lib: Revert use of fallthrough
pseudo-keyword in lib/")

Now that we can build arch/powerpc/boot/ free of -Wimplicit-fallthrough,
re-enable these fixes for lib/.

Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Reviewed-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Reviewed-by: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/236
Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
