<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/ipv4/inetpeer.c, branch v3.4.12</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>inetpeer: fix a race in inetpeer_gc_worker()</title>
<updated>2012-07-16T16:03:45+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-05T03:00:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=89a5feb2d59123824c344665c09328bb9fdb4fe9'/>
<id>89a5feb2d59123824c344665c09328bb9fdb4fe9</id>
<content type='text'>
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]

commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :

Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.

inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Acked-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]

commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :

Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.

inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Acked-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>route: Remove redirect_genid</title>
<updated>2012-03-08T08:30:32+00:00</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2012-03-06T21:21:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ac3f48de09d8f4b73397047e413fadff7f65cfa7'/>
<id>ac3f48de09d8f4b73397047e413fadff7f65cfa7</id>
<content type='text'>
As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: Invalidate the inetpeer tree along with the routing cache</title>
<updated>2012-03-08T08:30:24+00:00</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2012-03-06T21:20:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5faa5df1fa2024bd750089ff21dcc4191798263d'/>
<id>5faa5df1fa2024bd750089ff21dcc4191798263d</id>
<content type='text'>
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics.

To fix this issue, we replace the old tree with a fresh initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics.

To fix this issue, we replace the old tree with a fresh initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: initialize -&gt;redirect_genid in inet_getpeer()</title>
<updated>2012-01-17T20:52:12+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2012-01-17T10:48:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10ec1bb7e9eb462548f14dd53c73e927a3ddf31c'/>
<id>10ec1bb7e9eb462548f14dd53c73e927a3ddf31c</id>
<content type='text'>
kmemcheck complains that -&gt;redirect_genid doesn't get initialized.
Presumably it should be set to zero.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kmemcheck complains that -&gt;redirect_genid doesn't get initialized.
Presumably it should be set to zero.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix some sparse errors</title>
<updated>2012-01-17T15:31:12+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-01-16T19:27:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=747465ef7a082033e086dedc8189febfda43b015'/>
<id>747465ef7a082033e086dedc8189febfda43b015</id>
<content type='text'>
make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Compute protocol sequence numbers and fragment IDs using MD5.</title>
<updated>2011-08-07T01:33:19+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-08-04T03:50:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6e5714eaf77d79ae1c8b47e3e040ff5411b717ec'/>
<id>6e5714eaf77d79ae1c8b47e3e040ff5411b717ec</id>
<content type='text'>
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky &lt;dan@doxpara.com&gt;
Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky &lt;dan@doxpara.com&gt;
Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: make fragment identifications less predictable</title>
<updated>2011-07-22T04:25:58+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-07-22T04:25:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=87c48fa3b4630905f98268dde838ee43626a060c'/>
<id>87c48fa3b4630905f98268dde838ee43626a060c</id>
<content type='text'>
IPv6 fragment identification generation is way beyond what we use for
IPv4 : It uses a single generator. Its not scalable and allows DOS
attacks.

Now inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide)

This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter

Reported-by: Fernando Gont &lt;fernando@gont.com.ar&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IPv6 fragment identification generation is way beyond what we use for
IPv4 : It uses a single generator. Its not scalable and allows DOS
attacks.

Now inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide)

This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter

Reported-by: Fernando Gont &lt;fernando@gont.com.ar&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: kill inet_putpeer race</title>
<updated>2011-07-12T03:25:04+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-07-11T02:49:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d1a3e042f55861a785527a35a6f1ab4217ee810'/>
<id>6d1a3e042f55861a785527a35a6f1ab4217ee810</id>
<content type='text'>
We currently can free inetpeer entries too early :

[  782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
[  782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
[  782.636686]  i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
[  782.636694]                          ^
[  782.636696]
[  782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[  782.636702] EIP: 0060:[&lt;c13fefbb&gt;] EFLAGS: 00010286 CPU: 0
[  782.636707] EIP is at inet_getpeer+0x25b/0x5a0
[  782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
[  782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
[  782.636712]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
[  782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  782.636717] DR6: ffff4ff0 DR7: 00000400
[  782.636718]  [&lt;c13fbf76&gt;] rt_set_nexthop.clone.45+0x56/0x220
[  782.636722]  [&lt;c13fc449&gt;] __ip_route_output_key+0x309/0x860
[  782.636724]  [&lt;c141dc54&gt;] tcp_v4_connect+0x124/0x450
[  782.636728]  [&lt;c142ce43&gt;] inet_stream_connect+0xa3/0x270
[  782.636731]  [&lt;c13a8da1&gt;] sys_connect+0xa1/0xb0
[  782.636733]  [&lt;c13a99dd&gt;] sys_socketcall+0x25d/0x2a0
[  782.636736]  [&lt;c149deb8&gt;] sysenter_do_call+0x12/0x28
[  782.636738]  [&lt;ffffffff&gt;] 0xffffffff

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We currently can free inetpeer entries too early :

[  782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
[  782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
[  782.636686]  i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
[  782.636694]                          ^
[  782.636696]
[  782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[  782.636702] EIP: 0060:[&lt;c13fefbb&gt;] EFLAGS: 00010286 CPU: 0
[  782.636707] EIP is at inet_getpeer+0x25b/0x5a0
[  782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
[  782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
[  782.636712]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
[  782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  782.636717] DR6: ffff4ff0 DR7: 00000400
[  782.636718]  [&lt;c13fbf76&gt;] rt_set_nexthop.clone.45+0x56/0x220
[  782.636722]  [&lt;c13fc449&gt;] __ip_route_output_key+0x309/0x860
[  782.636724]  [&lt;c141dc54&gt;] tcp_v4_connect+0x124/0x450
[  782.636728]  [&lt;c142ce43&gt;] inet_stream_connect+0xa3/0x270
[  782.636731]  [&lt;c13a8da1&gt;] sys_connect+0xa1/0xb0
[  782.636733]  [&lt;c13a99dd&gt;] sys_socketcall+0x25d/0x2a0
[  782.636736]  [&lt;c149deb8&gt;] sysenter_do_call+0x12/0x28
[  782.636738]  [&lt;ffffffff&gt;] 0xffffffff

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: remove unused list</title>
<updated>2011-06-09T00:05:30+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-08T13:35:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4b9d9be839fdb7dcd7ce7619a623fd9015a50cda'/>
<id>4b9d9be839fdb7dcd7ce7619a623fd9015a50cda</id>
<content type='text'>
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on memcached workload on a 40 core machine, with
disabled route cache.

It appears we constantly flip peers refcnt between 0 and 1 values, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform a garbage collection on-the-fly,
at lookup time, using the expired nodes we met during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in inet_peer structure.

There is still a false sharing effect because refcnt is in first cache
line of object [were the links and keys used by lookups are located], we
might move it at the end of inet_peer structure to let this first cache
line mostly read by cpus.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on memcached workload on a 40 core machine, with
disabled route cache.

It appears we constantly flip peers refcnt between 0 and 1 values, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform a garbage collection on-the-fly,
at lookup time, using the expired nodes we met during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in inet_peer structure.

There is still a false sharing effect because refcnt is in first cache
line of object [were the links and keys used by lookups are located], we
might move it at the end of inet_peer structure to let this first cache
line mostly read by cpus.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: fix race in unused_list manipulations</title>
<updated>2011-05-27T17:39:11+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-05-26T17:27:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=686a7e32ca7fdd819eb9606abd3db52b77d1479f'/>
<id>686a7e32ca7fdd819eb9606abd3db52b77d1479f</id>
<content type='text'>
Several crashes in cleanup_once() were reported in recent kernels.

Commit d6cc1d642de9 (inetpeer: various changes) added a race in
unlink_from_unused().

One way to avoid taking unused_peers.lock before doing the list_empty()
test is to catch 0-&gt;1 refcnt transitions, using full barrier atomic
operations variants (atomic_cmpxchg() and atomic_inc_return()) instead
of previous atomic_inc() and atomic_add_unless() variants.

We then call unlink_from_unused() only for the owner of the 0-&gt;1
transition.

Add a new atomic_add_unless_return() static helper

With help from Arun Sharma.

Refs: https://bugzilla.kernel.org/show_bug.cgi?id=32772

Reported-by: Arun Sharma &lt;asharma@fb.com&gt;
Reported-by: Maximilian Engelhardt &lt;maxi@daemonizer.de&gt;
Reported-by: Yann Dupont &lt;Yann.Dupont@univ-nantes.fr&gt;
Reported-by: Denys Fedoryshchenko &lt;denys@visp.net.lb&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Several crashes in cleanup_once() were reported in recent kernels.

Commit d6cc1d642de9 (inetpeer: various changes) added a race in
unlink_from_unused().

One way to avoid taking unused_peers.lock before doing the list_empty()
test is to catch 0-&gt;1 refcnt transitions, using full barrier atomic
operations variants (atomic_cmpxchg() and atomic_inc_return()) instead
of previous atomic_inc() and atomic_add_unless() variants.

We then call unlink_from_unused() only for the owner of the 0-&gt;1
transition.

Add a new atomic_add_unless_return() static helper

With help from Arun Sharma.

Refs: https://bugzilla.kernel.org/show_bug.cgi?id=32772

Reported-by: Arun Sharma &lt;asharma@fb.com&gt;
Reported-by: Maximilian Engelhardt &lt;maxi@daemonizer.de&gt;
Reported-by: Yann Dupont &lt;Yann.Dupont@univ-nantes.fr&gt;
Reported-by: Denys Fedoryshchenko &lt;denys@visp.net.lb&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
