author     Eric Dumazet <edumazet@google.com>        2025-03-05 03:45:49 +0000
committer  Jakub Kicinski <kuba@kernel.org>          2025-03-06 15:26:02 -0800
commit     9544d60a2605d1500cf5c3e331a76b9eaf4538c9
tree       1e22f074f1c01429e233ecd65a2eebd326726809
parent     f8ece40786c9342249aa0a1b55e148ee23b2a746
inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()
In order to speed up __inet_hash_connect(), we want to ensure that hash values
for <source address, port X, destination address, destination port>
are not randomly spread, but monotonically increasing.
The goal is to allow __inet_hash_connect() to derive the hash value
of a candidate 4-tuple with a single addition in the following
patch in the series.
Given:
hash_0     = inet_ehashfn(saddr, 0, daddr, dport)
hash_sport = inet_ehashfn(saddr, sport, daddr, dport)
Then (hash_sport == hash_0 + sport) for all sport values.
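This relation can be checked outside the kernel. Below is a minimal, hypothetical
userspace sketch: base_hash() is only a stand-in for the secret-keyed, jhash-based
__inet_ehashfn()/__inet6_ehashfn(); the additive property depends solely on how
lport is folded in after this patch, not on the underlying hash.

#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's secret-keyed hash over <saddr, daddr, dport>;
 * the real helper is jhash based, but that does not matter here. */
static uint32_t base_hash(uint32_t saddr, uint32_t daddr,
			  uint16_t dport, uint32_t secret)
{
	uint32_t h = saddr * 2654435761u;

	h ^= daddr * 2246822519u;
	h ^= ((uint32_t)dport << 16) ^ secret;
	return h * 2654435761u;
}

/* Shape of inet_ehashfn() after this patch: lport is added on top of
 * a hash computed with lport == 0. */
static uint32_t ehashfn(uint32_t saddr, uint16_t sport, uint32_t daddr,
			uint16_t dport, uint32_t secret)
{
	return sport + base_hash(saddr, daddr, dport, secret);
}

int main(void)
{
	const uint32_t secret = 0x5eedf00d;	/* arbitrary demo secret */
	uint32_t hash_0 = ehashfn(0x0a000001, 0, 0x0a000002, 443, secret);

	for (uint32_t sport = 32768; sport < 32772; sport++) {
		uint32_t hash_sport = ehashfn(0x0a000001, (uint16_t)sport,
					      0x0a000002, 443, secret);

		/* hash_sport == hash_0 + sport, for every sport */
		printf("sport=%u hash_sport=%u hash_0+sport=%u\n",
		       (unsigned int)sport, (unsigned int)hash_sport,
		       (unsigned int)(hash_0 + sport));
	}
	return 0;
}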
As far as I know, there is no security implication with this change.
After this patch, when __inet_hash_connect() has to try XXXX candidates,
the hash table buckets are contiguous and packed, allowing
better use of CPU caches and hardware prefetchers.
Tested:
Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server
Before this patch:
utime_start=0.271607
utime_end=3.847111
stime_start=18.407684
stime_end=1997.485557
num_transactions=1350742
latency_min=0.014131929
latency_max=17.895073144
latency_mean=0.505675853
latency_stddev=2.125164772
num_samples=307884
throughput=139866.80
perf top on client:
56.86% [kernel] [k] __inet6_check_established
17.96% [kernel] [k] __inet_hash_connect
13.88% [kernel] [k] inet6_ehashfn
2.52% [kernel] [k] rcu_all_qs
2.01% [kernel] [k] __cond_resched
0.41% [kernel] [k] _raw_spin_lock
After this patch:
utime_start=0.286131
utime_end=4.378886
stime_start=11.952556
stime_end=1991.655533
num_transactions=1446830
latency_min=0.001061085
latency_max=12.075275028
latency_mean=0.376375302
latency_stddev=1.361969596
num_samples=306383
throughput=151866.56
perf top:
50.01% [kernel] [k] __inet6_check_established
20.65% [kernel] [k] __inet_hash_connect
15.81% [kernel] [k] inet6_ehashfn
2.92% [kernel] [k] rcu_all_qs
2.34% [kernel] [k] __cond_resched
0.50% [kernel] [k] _raw_spin_lock
0.34% [kernel] [k] sched_balance_trigger
0.24% [kernel] [k] queued_spin_lock_slowpath
There is indeed an increase in throughput and a reduction in latency.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250305034550.879255-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 net/ipv4/inet_hashtables.c  | 4
 net/ipv6/inet6_hashtables.c | 4
 2 files changed, 4 insertions, 4 deletions
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 3edbe2dad8ba..ccf69b534d2e 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -35,8 +35,8 @@ u32 inet_ehashfn(const struct net *net, const __be32 laddr,
 {
 	net_get_random_once(&inet_ehash_secret, sizeof(inet_ehash_secret));
 
-	return __inet_ehashfn(laddr, lport, faddr, fport,
-			      inet_ehash_secret + net_hash_mix(net));
+	return lport + __inet_ehashfn(laddr, 0, faddr, fport,
+				      inet_ehash_secret + net_hash_mix(net));
 }
 EXPORT_SYMBOL_GPL(inet_ehashfn);
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 9be315496459..3d95f1e75a11 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -35,8 +35,8 @@ u32 inet6_ehashfn(const struct net *net,
 	lhash = (__force u32)laddr->s6_addr32[3];
 	fhash = __ipv6_addr_jhash(faddr, tcp_ipv6_hash_secret);
 
-	return __inet6_ehashfn(lhash, lport, fhash, fport,
-			       inet6_ehash_secret + net_hash_mix(net));
+	return lport + __inet6_ehashfn(lhash, 0, fhash, fport,
+				       inet6_ehash_secret + net_hash_mix(net));
 }
 EXPORT_SYMBOL_GPL(inet6_ehashfn);
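For context, a rough, hypothetical fragment of how a caller such as
__inet_hash_connect() could exploit the new property (the actual change lands in
the next patch of the series): compute the port-0 hash once, then derive each
candidate's bucket with a single addition.

	/* Illustrative fragment only, not the actual follow-up patch. */
	u32 hash_0 = inet_ehashfn(net, saddr, 0, daddr, dport);

	for (port = low; port < high; port++) {
		/* Same value as inet_ehashfn(net, saddr, port, daddr, dport). */
		u32 hash = hash_0 + port;
		struct inet_ehash_bucket *head = inet_ehash_bucket(hinfo, hash);

		/* ... lock and scan this bucket for the candidate 4-tuple ... */
	}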