<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/skbuff.h, branch v5.12-rc6</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct</title>
<updated>2021-03-16T22:22:18+00:00</updated>
<author>
<name>wenxu</name>
<email>wenxu@ucloud.cn</email>
</author>
<published>2021-03-16T08:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d29334c15d33a6a92d2043ca88f84cd5ad026c57'/>
<id>d29334c15d33a6a92d2043ca88f84cd5ad026c57</id>
<content type='text'>
When openvswitch conntrack offload with act_ct action. The first rule
do conntrack in the act_ct in tc subsystem. And miss the next rule in
the tc and fallback to the ovs datapath but miss set post_ct flag
which will lead the ct_state_key with -trk flag.

Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: wenxu &lt;wenxu@ucloud.cn&gt;
Reviewed-by: Marcelo Ricardo Leitner &lt;marcelo.leitner@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When openvswitch conntrack offload with act_ct action. The first rule
do conntrack in the act_ct in tc subsystem. And miss the next rule in
the tc and fallback to the ovs datapath but miss set post_ct flag
which will lead the ct_state_key with -trk flag.

Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: wenxu &lt;wenxu@ucloud.cn&gt;
Reviewed-by: Marcelo Ricardo Leitner &lt;marcelo.leitner@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing</title>
<updated>2021-02-13T22:32:04+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-02-13T14:13:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9243adfc311a20371c3f4d8eaf0af4b135e6fac3'/>
<id>9243adfc311a20371c3f4d8eaf0af4b135e6fac3</id>
<content type='text'>
napi_frags_finish() and napi_skb_finish() can only be called inside
NAPI Rx context, so we can feed NAPI cache with skbuff_heads that
got NAPI_MERGED_FREE verdict instead of immediate freeing.
Replace __kfree_skb() with __kfree_skb_defer() in napi_skb_finish()
and move napi_skb_free_stolen_head() to skbuff.c, so it can drop skbs
to NAPI cache.
As many drivers call napi_alloc_skb()/napi_get_frags() on their
receive path, this becomes especially useful.

Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
napi_frags_finish() and napi_skb_finish() can only be called inside
NAPI Rx context, so we can feed NAPI cache with skbuff_heads that
got NAPI_MERGED_FREE verdict instead of immediate freeing.
Replace __kfree_skb() with __kfree_skb_defer() in napi_skb_finish()
and move napi_skb_free_stolen_head() to skbuff.c, so it can drop skbs
to NAPI cache.
As many drivers call napi_alloc_skb()/napi_get_frags() on their
receive path, this becomes especially useful.

Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>skbuff: introduce {,__}napi_build_skb() which reuses NAPI cache heads</title>
<updated>2021-02-13T22:32:04+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-02-13T14:12:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f450d539c05a14c103dd174718f81bb2fe65cb4b'/>
<id>f450d539c05a14c103dd174718f81bb2fe65cb4b</id>
<content type='text'>
Instead of just bulk-flushing skbuff_heads queued up through
napi_consume_skb() or __kfree_skb_defer(), try to reuse them
on allocation path.
If the cache is empty on allocation, bulk-allocate the first
16 elements, which is more efficient than per-skb allocation.
If the cache is full on freeing, bulk-wipe the second half of
the cache (32 elements).
This also includes custom KASAN poisoning/unpoisoning to be
double sure there are no use-after-free cases.

To not change current behaviour, introduce a new function,
napi_build_skb(), to optionally use a new approach later
in drivers.

Note on selected bulk size, 16:
 - this equals to XDP_BULK_QUEUE_SIZE, DEV_MAP_BULK_SIZE
   and especially VETH_XDP_BATCH, which is also used to
   bulk-allocate skbuff_heads and was tested on powerful
   setups;
 - this also showed the best performance in the actual
   test series (from the array of {8, 16, 32}).

Suggested-by: Edward Cree &lt;ecree.xilinx@gmail.com&gt; # Divide on two halves
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;   # KASAN poisoning
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;             # Help with KASAN
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;                # Reduced batch size
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of just bulk-flushing skbuff_heads queued up through
napi_consume_skb() or __kfree_skb_defer(), try to reuse them
on allocation path.
If the cache is empty on allocation, bulk-allocate the first
16 elements, which is more efficient than per-skb allocation.
If the cache is full on freeing, bulk-wipe the second half of
the cache (32 elements).
This also includes custom KASAN poisoning/unpoisoning to be
double sure there are no use-after-free cases.

To not change current behaviour, introduce a new function,
napi_build_skb(), to optionally use a new approach later
in drivers.

Note on selected bulk size, 16:
 - this equals to XDP_BULK_QUEUE_SIZE, DEV_MAP_BULK_SIZE
   and especially VETH_XDP_BATCH, which is also used to
   bulk-allocate skbuff_heads and was tested on powerful
   setups;
 - this also showed the best performance in the actual
   test series (from the array of {8, 16, 32}).

Suggested-by: Edward Cree &lt;ecree.xilinx@gmail.com&gt; # Divide on two halves
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;   # KASAN poisoning
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;             # Help with KASAN
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;                # Reduced batch size
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>skbuff: remove __kfree_skb_flush()</title>
<updated>2021-02-13T22:32:03+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-02-13T14:12:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fec6e49b63989657bc4076dad99fa51d5ece34da'/>
<id>fec6e49b63989657bc4076dad99fa51d5ece34da</id>
<content type='text'>
This function isn't much needed as NAPI skb queue gets bulk-freed
anyway when there's no more room, and even may reduce the efficiency
of bulk operations.
It will be even less needed after reusing skb cache on allocation path,
so remove it and this way lighten network softirqs a bit.

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This function isn't much needed as NAPI skb queue gets bulk-freed
anyway when there's no more room, and even may reduce the efficiency
of bulk operations.
It will be even less needed after reusing skb cache on allocation path,
so remove it and this way lighten network softirqs a bit.

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Introduce {netdev,napi}_alloc_frag_align()</title>
<updated>2021-02-06T19:57:28+00:00</updated>
<author>
<name>Kevin Hao</name>
<email>haokexin@gmail.com</email>
</author>
<published>2021-02-04T10:56:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3f6e687dff395da43b056c18150a423bc7bf5d14'/>
<id>3f6e687dff395da43b056c18150a423bc7bf5d14</id>
<content type='text'>
In the current implementation of {netdev,napi}_alloc_frag(), it doesn't
have any align guarantee for the returned buffer address, But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the {netdev,napi}_alloc_frag() are used by these hardwares
for DMA.
    buf = napi_alloc_frag(really_needed_size + align);
    buf = PTR_ALIGN(buf, align);

These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. We have added the align
support for the page_frag functions, so add the corresponding
{netdev,napi}_frag functions.

Signed-off-by: Kevin Hao &lt;haokexin@gmail.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the current implementation of {netdev,napi}_alloc_frag(), it doesn't
have any align guarantee for the returned buffer address, But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the {netdev,napi}_alloc_frag() are used by these hardwares
for DMA.
    buf = napi_alloc_frag(really_needed_size + align);
    buf = PTR_ALIGN(buf, align);

These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. We have added the align
support for the page_frag functions, so add the corresponding
{netdev,napi}_frag functions.

Signed-off-by: Kevin Hao &lt;haokexin@gmail.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: introduce common dev_page_is_reusable()</title>
<updated>2021-02-05T02:20:14+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-02-02T13:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc38f30f8dbce0afb8af05d917bee084b1329418'/>
<id>bc38f30f8dbce0afb8af05d917bee084b1329418</id>
<content type='text'>
A bunch of drivers test the page before reusing/recycling for two
common conditions:
 - if a page was allocated under memory pressure (pfmemalloc page);
 - if a page was allocated at a distant memory node (to exclude
   slowdowns).

Introduce a new common inline for doing this, with likely() already
folded inside to make driver code a bit simpler.

Suggested-by: David Rientjes &lt;rientjes@google.com&gt;
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Reviewed-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A bunch of drivers test the page before reusing/recycling for two
common conditions:
 - if a page was allocated under memory pressure (pfmemalloc page);
 - if a page was allocated at a distant memory node (to exclude
   slowdowns).

Introduce a new common inline for doing this, with likely() already
folded inside to make driver code a bit simpler.

Suggested-by: David Rientjes &lt;rientjes@google.com&gt;
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Reviewed-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>skbuff: constify skb_propagate_pfmemalloc() "page" argument</title>
<updated>2021-02-05T02:20:13+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-02-02T13:31:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=48f971c9c80a728646fc03367a28df747f20d0f4'/>
<id>48f971c9c80a728646fc03367a28df747f20d0f4</id>
<content type='text'>
The function doesn't write anything to the page struct itself,
so this argument can be const.

Misc: align second argument to the brace while at it.

Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Reviewed-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The function doesn't write anything to the page struct itself,
so this argument can be const.

Misc: align second argument to the brace while at it.

Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Reviewed-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: add TTL to SCM_TIMESTAMPING_OPT_STATS</title>
<updated>2021-01-23T02:20:52+00:00</updated>
<author>
<name>Yousuk Seung</name>
<email>ysseung@google.com</email>
</author>
<published>2021-01-20T20:41:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e7ed11ee945438b737e2ae2370e35591e16ec371'/>
<id>e7ed11ee945438b737e2ae2370e35591e16ec371</id>
<content type='text'>
This patch adds TCP_NLA_TTL to SCM_TIMESTAMPING_OPT_STATS that exports
the time-to-live or hop limit of the latest incoming packet with
SCM_TSTAMP_ACK. The value exported may not be from the packet that acks
the sequence when incoming packets are aggregated. Exporting the
time-to-live or hop limit value of incoming packets helps to estimate
the hop count of the path of the flow that may change over time.

Signed-off-by: Yousuk Seung &lt;ysseung@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Link: https://lore.kernel.org/r/20210120204155.552275-1-ysseung@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds TCP_NLA_TTL to SCM_TIMESTAMPING_OPT_STATS that exports
the time-to-live or hop limit of the latest incoming packet with
SCM_TSTAMP_ACK. The value exported may not be from the packet that acks
the sequence when incoming packets are aggregated. Exporting the
time-to-live or hop limit value of incoming packets helps to estimate
the hop count of the path of the flow that may change over time.

Signed-off-by: Yousuk Seung &lt;ysseung@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Link: https://lore.kernel.org/r/20210120204155.552275-1-ysseung@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/sched: cls_flower add CT_FLAGS_INVALID flag support</title>
<updated>2021-01-21T05:09:44+00:00</updated>
<author>
<name>wenxu</name>
<email>wenxu@ucloud.cn</email>
</author>
<published>2021-01-19T08:31:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7baf2429a1a965369b0ce44efb6315cdd515aa9c'/>
<id>7baf2429a1a965369b0ce44efb6315cdd515aa9c</id>
<content type='text'>
This patch add the TCA_FLOWER_KEY_CT_FLAGS_INVALID flag to
match the ct_state with invalid for conntrack.

Signed-off-by: wenxu &lt;wenxu@ucloud.cn&gt;
Acked-by: Marcelo Ricardo Leitner &lt;marcelo.leitner@gmail.com&gt;
Link: https://lore.kernel.org/r/1611045110-682-1-git-send-email-wenxu@ucloud.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch add the TCA_FLOWER_KEY_CT_FLAGS_INVALID flag to
match the ct_state with invalid for conntrack.

Signed-off-by: wenxu &lt;wenxu@ucloud.cn&gt;
Acked-by: Marcelo Ricardo Leitner &lt;marcelo.leitner@gmail.com&gt;
Link: https://lore.kernel.org/r/1611045110-682-1-git-send-email-wenxu@ucloud.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add inline function skb_csum_is_sctp</title>
<updated>2021-01-19T22:31:25+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2021-01-16T06:13:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fa82117010430aff2ce86400f7328f55a31b48a6'/>
<id>fa82117010430aff2ce86400f7328f55a31b48a6</id>
<content type='text'>
This patch is to define a inline function skb_csum_is_sctp(), and
also replace all places where it checks if it's a SCTP CSUM skb.
This function would be used later in many networking drivers in
the following patches.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch is to define a inline function skb_csum_is_sctp(), and
also replace all places where it checks if it's a SCTP CSUM skb.
This function would be used later in many networking drivers in
the following patches.

Suggested-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
