<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/ipv4/inetpeer.c, branch v3.4.94</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ip: generate unique IP identificator if local fragmentation is allowed</title>
<updated>2013-10-13T22:42:48+00:00</updated>
<author>
<name>Ansis Atteka</name>
<email>aatteka@nicira.com</email>
</author>
<published>2013-09-18T22:29:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f72299da3e1a010a3d77fbed0b9ee6abd0a19911'/>
<id>f72299da3e1a010a3d77fbed0b9ee6abd0a19911</id>
<content type='text'>
[ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]

If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.

For example, if IPsec (tunnel mode) has to encrypt large skbs
that have the local_df bit set, then all IP fragments that belong
to different ESP datagrams would use the same identifier.
If one of these IP fragments were lost or reordered, the
peer could stitch together IP fragments that do
not belong to the same datagram. This would lead to packet loss
or data corruption.
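
The failure mode described above can be illustrated with a toy
reassembler. This is only a sketch under stated assumptions, not kernel
code: reassemble() and fragment() are hypothetical helpers, the addresses
are made up, and fragments are modelled as plain dicts keyed by
(src, dst, proto, ident).

```python
from collections import defaultdict


def reassemble(fragments):
    """Group fragments by (src, dst, proto, ident), then join payloads by offset."""
    buckets = defaultdict(dict)
    for frag in fragments:
        key = (frag["src"], frag["dst"], frag["proto"], frag["ident"])
        buckets[key][frag["offset"]] = frag["data"]
    return {
        key: b"".join(data for _, data in sorted(parts.items()))
        for key, parts in buckets.items()
    }


def fragment(payload, ident, size=4):
    """Split a payload into fixed-size fragments that all carry the same ident."""
    return [
        {
            "src": "10.0.0.1",
            "dst": "10.0.0.2",
            "proto": 50,  # ESP
            "ident": ident,
            "offset": off,
            "data": payload[off:off + size],
        }
        for off in range(0, len(payload), size)
    ]


first = fragment(b"AAAAaaaa", ident=0)
second_same_id = fragment(b"BBBBbbbb", ident=0)
second_new_id = fragment(b"BBBBbbbb", ident=1)

# The tail of the first datagram and the head of the second never arrive.
shared = reassemble([first[0], second_same_id[1]])
unique = reassemble([first[0], second_new_id[1]])
```

With a shared identifier the two surviving fragments land in one bucket
and are joined into a datagram that was never sent. With unique
identifiers they stay in separate, incomplete buckets and can simply
time out.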

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]

If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.

For example, if IPsec (tunnel mode) has to encrypt large skbs
that have the local_df bit set, then all IP fragments that belong
to different ESP datagrams would use the same identifier.
If one of these IP fragments were lost or reordered, the
peer could stitch together IP fragments that do
not belong to the same datagram. This would lead to packet loss
or data corruption.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: fix a race in inetpeer_gc_worker()</title>
<updated>2012-07-16T16:03:45+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-05T03:00:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=89a5feb2d59123824c344665c09328bb9fdb4fe9'/>
<id>89a5feb2d59123824c344665c09328bb9fdb4fe9</id>
<content type='text'>
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]

commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race:

Before freeing an inetpeer, we must respect an RCU grace period, and make
sure no user will attempt to increase refcnt.

inetpeer_invalidate_tree() waits for an RCU grace period before inserting
the inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find an inetpeer in this tree.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Acked-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 55432d2b543a4b6dfae54f5c432a566877a85d90 ]

commit 5faa5df1fa2024 (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race:

Before freeing an inetpeer, we must respect an RCU grace period, and make
sure no user will attempt to increase refcnt.

inetpeer_invalidate_tree() waits for an RCU grace period before inserting
the inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find an inetpeer in this tree.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Acked-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>route: Remove redirect_genid</title>
<updated>2012-03-08T08:30:32+00:00</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2012-03-06T21:21:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ac3f48de09d8f4b73397047e413fadff7f65cfa7'/>
<id>ac3f48de09d8f4b73397047e413fadff7f65cfa7</id>
<content type='text'>
As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: Invalidate the inetpeer tree along with the routing cache</title>
<updated>2012-03-08T08:30:24+00:00</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2012-03-06T21:20:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5faa5df1fa2024bd750089ff21dcc4191798263d'/>
<id>5faa5df1fa2024bd750089ff21dcc4191798263d</id>
<content type='text'>
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user-configured fib_metrics.

To fix this issue, we replace the old tree with a freshly initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user-configured fib_metrics.

To fix this issue, we replace the old tree with a freshly initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: initialize -&gt;redirect_genid in inet_getpeer()</title>
<updated>2012-01-17T20:52:12+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2012-01-17T10:48:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=10ec1bb7e9eb462548f14dd53c73e927a3ddf31c'/>
<id>10ec1bb7e9eb462548f14dd53c73e927a3ddf31c</id>
<content type='text'>
kmemcheck complains that -&gt;redirect_genid doesn't get initialized.
Presumably it should be set to zero.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kmemcheck complains that -&gt;redirect_genid doesn't get initialized.
Presumably it should be set to zero.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix some sparse errors</title>
<updated>2012-01-17T15:31:12+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-01-16T19:27:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=747465ef7a082033e086dedc8189febfda43b015'/>
<id>747465ef7a082033e086dedc8189febfda43b015</id>
<content type='text'>
make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Compute protocol sequence numbers and fragment IDs using MD5.</title>
<updated>2011-08-07T01:33:19+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-08-04T03:50:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6e5714eaf77d79ae1c8b47e3e040ff5411b717ec'/>
<id>6e5714eaf77d79ae1c8b47e3e040ff5411b717ec</id>
<content type='text'>
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is in line with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, having only 24 bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 48 bits) and that is fixed here as well.

Reported-by: Dan Kaminsky &lt;dan@doxpara.com&gt;
Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is in line with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, having only 24 bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 48 bits) and that is fixed here as well.

Reported-by: Dan Kaminsky &lt;dan@doxpara.com&gt;
Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: make fragment identifications less predictable</title>
<updated>2011-07-22T04:25:58+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-07-22T04:25:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=87c48fa3b4630905f98268dde838ee43626a060c'/>
<id>87c48fa3b4630905f98268dde838ee43626a060c</id>
<content type='text'>
IPv6 fragment identification generation is way behind what we use for
IPv4: it uses a single generator. It's not scalable and allows DoS
attacks.

Now that inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide).

This patch:
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter
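
The per-destination idea can be sketched roughly as follows. This is an
illustration only and does not reproduce secure_ipv6_id() or
inet_getid(): the FragIdGenerator class, its next_id() method and the
use of SHA-256 for the keyed seed are all assumptions made for the
example.

```python
import hashlib
import os


class FragIdGenerator:
    """Hypothetical per-destination fragment ID generator (not kernel code)."""

    def __init__(self):
        # Boot-time secret so that starting values cannot be guessed remotely.
        self.secret = os.urandom(16)
        # One independent 32-bit counter per destination address.
        self.counters = {}

    def next_id(self, dst):
        if dst not in self.counters:
            # Seed the counter for a new destination from a keyed hash.
            digest = hashlib.sha256(self.secret + dst.encode()).digest()
            self.counters[dst] = int.from_bytes(digest[:4], "big")
        ident = self.counters[dst]
        # Advance only this destination's counter, wrapping at 32 bits.
        self.counters[dst] = (ident + 1) % 2**32
        return ident


gen = FragIdGenerator()
first = gen.next_id("2001:db8::1")
second = gen.next_id("2001:db8::1")
other = gen.next_id("2001:db8::2")
```

Because each destination has its own counter, traffic sent to one peer
reveals nothing about the identifiers used towards another, and the
counters no longer contend on a single system-wide generator.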

Reported-by: Fernando Gont &lt;fernando@gont.com.ar&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IPv6 fragment identification generation is way behind what we use for
IPv4: it uses a single generator. It's not scalable and allows DoS
attacks.

Now that inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide).

This patch:
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter

Reported-by: Fernando Gont &lt;fernando@gont.com.ar&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: kill inet_putpeer race</title>
<updated>2011-07-12T03:25:04+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-07-11T02:49:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6d1a3e042f55861a785527a35a6f1ab4217ee810'/>
<id>6d1a3e042f55861a785527a35a6f1ab4217ee810</id>
<content type='text'>
We can currently free inetpeer entries too early:

[  782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
[  782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
[  782.636686]  i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
[  782.636694]                          ^
[  782.636696]
[  782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[  782.636702] EIP: 0060:[&lt;c13fefbb&gt;] EFLAGS: 00010286 CPU: 0
[  782.636707] EIP is at inet_getpeer+0x25b/0x5a0
[  782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
[  782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
[  782.636712]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
[  782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  782.636717] DR6: ffff4ff0 DR7: 00000400
[  782.636718]  [&lt;c13fbf76&gt;] rt_set_nexthop.clone.45+0x56/0x220
[  782.636722]  [&lt;c13fc449&gt;] __ip_route_output_key+0x309/0x860
[  782.636724]  [&lt;c141dc54&gt;] tcp_v4_connect+0x124/0x450
[  782.636728]  [&lt;c142ce43&gt;] inet_stream_connect+0xa3/0x270
[  782.636731]  [&lt;c13a8da1&gt;] sys_connect+0xa1/0xb0
[  782.636733]  [&lt;c13a99dd&gt;] sys_socketcall+0x25d/0x2a0
[  782.636736]  [&lt;c149deb8&gt;] sysenter_do_call+0x12/0x28
[  782.636738]  [&lt;ffffffff&gt;] 0xffffffff

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We can currently free inetpeer entries too early:

[  782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
[  782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
[  782.636686]  i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
[  782.636694]                          ^
[  782.636696]
[  782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[  782.636702] EIP: 0060:[&lt;c13fefbb&gt;] EFLAGS: 00010286 CPU: 0
[  782.636707] EIP is at inet_getpeer+0x25b/0x5a0
[  782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
[  782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
[  782.636712]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
[  782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  782.636717] DR6: ffff4ff0 DR7: 00000400
[  782.636718]  [&lt;c13fbf76&gt;] rt_set_nexthop.clone.45+0x56/0x220
[  782.636722]  [&lt;c13fc449&gt;] __ip_route_output_key+0x309/0x860
[  782.636724]  [&lt;c141dc54&gt;] tcp_v4_connect+0x124/0x450
[  782.636728]  [&lt;c142ce43&gt;] inet_stream_connect+0xa3/0x270
[  782.636731]  [&lt;c13a8da1&gt;] sys_connect+0xa1/0xb0
[  782.636733]  [&lt;c13a99dd&gt;] sys_socketcall+0x25d/0x2a0
[  782.636736]  [&lt;c149deb8&gt;] sysenter_do_call+0x12/0x28
[  782.636738]  [&lt;ffffffff&gt;] 0xffffffff

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: remove unused list</title>
<updated>2011-06-09T00:05:30+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-08T13:35:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4b9d9be839fdb7dcd7ce7619a623fd9015a50cda'/>
<id>4b9d9be839fdb7dcd7ce7619a623fd9015a50cda</id>
<content type='text'>
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on a memcached workload on a 40-core machine, with
the route cache disabled.

It appears we constantly flip the peers' refcnt between 0 and 1, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform garbage collection on the fly,
at lookup time, using the expired nodes we meet during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in the inet_peer structure.

There is still a false sharing effect because refcnt is in the first cache
line of the object [where the links and keys used by lookups are located]; we
might move it to the end of the inet_peer structure so that this first cache
line is mostly read by CPUs.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on a memcached workload on a 40-core machine, with
the route cache disabled.

It appears we constantly flip the peers' refcnt between 0 and 1, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform garbage collection on the fly,
at lookup time, using the expired nodes we meet during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in the inet_peer structure.

There is still a false sharing effect because refcnt is in the first cache
line of the object [where the links and keys used by lookups are located]; we
might move it to the end of the inet_peer structure so that this first cache
line is mostly read by CPUs.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
