<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/netfilter, branch v2.6.32.50</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>netfilter: IPv6: fix DSCP mangle code</title>
<updated>2011-06-23T22:24:09+00:00</updated>
<author>
<name>Fernando Luis Vazquez Cao</name>
<email>fernando@oss.ntt.co.jp</email>
</author>
<published>2011-05-10T08:00:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b69807562af90c4b041637716a399cd7adc7060b'/>
<id>b69807562af90c4b041637716a399cd7adc7060b</id>
<content type='text'>
commit 1ed2f73d90fb49bcf5704aee7e9084adb882bfc5 upstream.

The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.

Signed-off-by: Fernando Luis Vazquez Cao &lt;fernando@oss.ntt.co.jp&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 1ed2f73d90fb49bcf5704aee7e9084adb882bfc5 upstream.

The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.

Signed-off-by: Fernando Luis Vazquez Cao &lt;fernando@oss.ntt.co.jp&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values</title>
<updated>2011-03-14T21:29:58+00:00</updated>
<author>
<name>Jan Engelhardt</name>
<email>jengelh@medozas.de</email>
</author>
<published>2011-03-02T11:10:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0d122be36b4d7b9a6cb5464c872f37e8dd2ea044'/>
<id>0d122be36b4d7b9a6cb5464c872f37e8dd2ea044</id>
<content type='text'>
commit 9ef0298a8e5730d9a46d640014c727f3b4152870 upstream.

Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.

[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP:  __find_logger+0x6f/0xa0
[ 5954.123979]  nf_log_bind_pf+0x2b/0x70
[ 5954.123979]  nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979]  nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...

The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.

Reported-by: Miguel Di Ciurcio Filho &lt;miguel.filho@gmail.com&gt;,
  via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt &lt;jengelh@medozas.de&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9ef0298a8e5730d9a46d640014c727f3b4152870 upstream.

Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.

[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP:  __find_logger+0x6f/0xa0
[ 5954.123979]  nf_log_bind_pf+0x2b/0x70
[ 5954.123979]  nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979]  nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...

The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.

Reported-by: Miguel Di Ciurcio Filho &lt;miguel.filho@gmail.com&gt;,
  via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt &lt;jengelh@medozas.de&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: allow nf_ct_alloc_hashtable() to get highmem pages</title>
<updated>2010-12-09T21:26:51+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-10-28T10:34:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=611a418f2d8508757f9ca9499fe5348e3df21e8e'/>
<id>611a418f2d8508757f9ca9499fe5348e3df21e8e</id>
<content type='text'>
commit 6b1686a71e3158d3c5f125260effce171cc7852b upstream.

commit ea781f197d6a8 (use SLAB_DESTROY_BY_RCU and get rid of call_rcu())
did a mistake in __vmalloc() call in nf_ct_alloc_hashtable().

I forgot to add __GFP_HIGHMEM, so pages were taken from LOWMEM only.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6b1686a71e3158d3c5f125260effce171cc7852b upstream.

commit ea781f197d6a8 (use SLAB_DESTROY_BY_RCU and get rid of call_rcu())
did a mistake in __vmalloc() call in nf_ct_alloc_hashtable().

I forgot to add __GFP_HIGHMEM, so pages were taken from LOWMEM only.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: Add missing locking during connection table hashing and unhashing</title>
<updated>2010-08-02T17:20:50+00:00</updated>
<author>
<name>Sven Wegener</name>
<email>sven.wegener@stealer.net</email>
</author>
<published>2010-06-09T14:10:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fb97898fa8db2f812c630912aa80ea9608a40f2c'/>
<id>fb97898fa8db2f812c630912aa80ea9608a40f2c</id>
<content type='text'>
commit aea9d711f3d68c656ad31ab578ecfb0bb5cd7f97 upstream.

The code that hashes and unhashes connections from the connection table
is missing locking of the connection being modified, which opens up a
race condition and results in memory corruption when this race condition
is hit.

Here is what happens in pretty verbose form:

CPU 0					CPU 1
------------				------------
An active connection is terminated and
we schedule ip_vs_conn_expire() on this
CPU to expire this connection.

					IRQ assignment is changed to this CPU,
					but the expire timer stays scheduled on
					the other CPU.

					New connection from same ip:port comes
					in right before the timer expires, we
					find the inactive connection in our
					connection table and get a reference to
					it. We proper lock the connection in
					tcp_state_transition() and read the
					connection flags in set_tcp_state().

ip_vs_conn_expire() gets called, we
unhash the connection from our
connection table and remove the hashed
flag in ip_vs_conn_unhash(), without
proper locking!

					While still holding proper locks we
					write the connection flags in
					set_tcp_state() and this sets the hashed
					flag again.

ip_vs_conn_expire() fails to expire the
connection, because the other CPU has
incremented the reference count. We try
to re-insert the connection into our
connection table, but this fails in
ip_vs_conn_hash(), because the hashed
flag has been set by the other CPU. We
re-schedule execution of
ip_vs_conn_expire(). Now this connection
has the hashed flag set, but isn't
actually hashed in our connection table
and has a dangling list_head.

					We drop the reference we held on the
					connection and schedule the expire timer
					for timeouting the connection on this
					CPU. Further packets won't be able to
					find this connection in our connection
					table.

					ip_vs_conn_expire() gets called again,
					we think it's already hashed, but the
					list_head is dangling and while removing
					the connection from our connection table
					we write to the memory location where
					this list_head points to.

The result will probably be a kernel oops at some other point in time.

This race condition is pretty subtle, but it can be triggered remotely.
It needs the IRQ assignment change or another circumstance where packets
coming from the same ip:port for the same service are being processed on
different CPUs. And it involves hitting the exact time at which
ip_vs_conn_expire() gets called. It can be avoided by making sure that
all packets from one connection are always processed on the same CPU and
can be made harder to exploit by changing the connection timeouts to
some custom values.

Signed-off-by: Sven Wegener &lt;sven.wegener@stealer.net&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit aea9d711f3d68c656ad31ab578ecfb0bb5cd7f97 upstream.

The code that hashes and unhashes connections from the connection table
is missing locking of the connection being modified, which opens up a
race condition and results in memory corruption when this race condition
is hit.

Here is what happens in pretty verbose form:

CPU 0					CPU 1
------------				------------
An active connection is terminated and
we schedule ip_vs_conn_expire() on this
CPU to expire this connection.

					IRQ assignment is changed to this CPU,
					but the expire timer stays scheduled on
					the other CPU.

					New connection from same ip:port comes
					in right before the timer expires, we
					find the inactive connection in our
					connection table and get a reference to
					it. We proper lock the connection in
					tcp_state_transition() and read the
					connection flags in set_tcp_state().

ip_vs_conn_expire() gets called, we
unhash the connection from our
connection table and remove the hashed
flag in ip_vs_conn_unhash(), without
proper locking!

					While still holding proper locks we
					write the connection flags in
					set_tcp_state() and this sets the hashed
					flag again.

ip_vs_conn_expire() fails to expire the
connection, because the other CPU has
incremented the reference count. We try
to re-insert the connection into our
connection table, but this fails in
ip_vs_conn_hash(), because the hashed
flag has been set by the other CPU. We
re-schedule execution of
ip_vs_conn_expire(). Now this connection
has the hashed flag set, but isn't
actually hashed in our connection table
and has a dangling list_head.

					We drop the reference we held on the
					connection and schedule the expire timer
					for timeouting the connection on this
					CPU. Further packets won't be able to
					find this connection in our connection
					table.

					ip_vs_conn_expire() gets called again,
					we think it's already hashed, but the
					list_head is dangling and while removing
					the connection from our connection table
					we write to the memory location where
					this list_head points to.

The result will probably be a kernel oops at some other point in time.

This race condition is pretty subtle, but it can be triggered remotely.
It needs the IRQ assignment change or another circumstance where packets
coming from the same ip:port for the same service are being processed on
different CPUs. And it involves hitting the exact time at which
ip_vs_conn_expire() gets called. It can be avoided by making sure that
all packets from one connection are always processed on the same CPU and
can be made harder to exploit by changing the connection timeouts to
some custom values.

Signed-off-by: Sven Wegener &lt;sven.wegener@stealer.net&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: xt_recent: fix regression in rules using a zero hit_count</title>
<updated>2010-04-01T22:58:47+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2010-03-22T17:25:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5a32ab6f23dd07b97a6095d6ddbfa79c47cf3bbd'/>
<id>5a32ab6f23dd07b97a6095d6ddbfa79c47cf3bbd</id>
<content type='text'>
commit ef1691504c83ba3eb636c0cfd3ed33f7a6d0b4ee upstream.

Commit 8ccb92ad (netfilter: xt_recent: fix false match) fixed supposedly
false matches in rules using a zero hit_count. As it turns out there is
nothing false about these matches and people are actually using entries
with a hit_count of zero to make rules dependant on addresses inserted
manually through /proc.

Since this slipped past the eyes of three reviewers, instead of
reverting the commit in question, this patch explicitly checks
for a hit_count of zero to make the intentions more clear.

Reported-by: Thomas Jarosch &lt;thomas.jarosch@intra2net.com&gt;
Tested-by: Thomas Jarosch &lt;thomas.jarosch@intra2net.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ef1691504c83ba3eb636c0cfd3ed33f7a6d0b4ee upstream.

Commit 8ccb92ad (netfilter: xt_recent: fix false match) fixed supposedly
false matches in rules using a zero hit_count. As it turns out there is
nothing false about these matches and people are actually using entries
with a hit_count of zero to make rules dependant on addresses inserted
manually through /proc.

Since this slipped past the eyes of three reviewers, instead of
reverting the commit in question, this patch explicitly checks
for a hit_count of zero to make the intentions more clear.

Reported-by: Thomas Jarosch &lt;thomas.jarosch@intra2net.com&gt;
Tested-by: Thomas Jarosch &lt;thomas.jarosch@intra2net.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: xt_recent: fix false match</title>
<updated>2010-03-15T15:50:01+00:00</updated>
<author>
<name>Tim Gardner</name>
<email>tim.gardner@canonical.com</email>
</author>
<published>2010-02-23T13:59:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=58368fc8325ee2a9af4c29e232e319d78903f64b'/>
<id>58368fc8325ee2a9af4c29e232e319d78903f64b</id>
<content type='text'>
commit 8ccb92ad41cb311e52ad1b1fe77992c7f47a3b63 upstream.

A rule with a zero hit_count will always match.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8ccb92ad41cb311e52ad1b1fe77992c7f47a3b63 upstream.

A rule with a zero hit_count will always match.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: xt_recent: fix buffer overflow</title>
<updated>2010-03-15T15:50:00+00:00</updated>
<author>
<name>Tim Gardner</name>
<email>tim.gardner@canonical.com</email>
</author>
<published>2010-02-23T13:55:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6f2deb6aad5e242d5573ae8d98a90a454f5e79d0'/>
<id>6f2deb6aad5e242d5573ae8d98a90a454f5e79d0</id>
<content type='text'>
commit 2c08522e5d2f0af2d6f05be558946dcbf8173683 upstream.

e-&gt;index overflows e-&gt;stamps[] every ip_pkt_list_tot packets.

Consider the case when ip_pkt_list_tot==1; the first packet received is stored
in e-&gt;stamps[0] and e-&gt;index is initialized to 1. The next received packet
timestamp is then stored at e-&gt;stamps[1] in recent_entry_update(),
a buffer overflow because the maximum e-&gt;stamps[] index is 0.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2c08522e5d2f0af2d6f05be558946dcbf8173683 upstream.

e-&gt;index overflows e-&gt;stamps[] every ip_pkt_list_tot packets.

Consider the case when ip_pkt_list_tot==1; the first packet received is stored
in e-&gt;stamps[0] and e-&gt;index is initialized to 1. The next received packet
timestamp is then stored at e-&gt;stamps[1] in recent_entry_update(),
a buffer overflow because the maximum e-&gt;stamps[] index is 0.

Signed-off-by: Tim Gardner &lt;tim.gardner@canonical.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: fix hash resizing with namespaces</title>
<updated>2010-02-23T15:37:53+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2010-02-08T19:18:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=242a71829e57a4962e43f89cf50d5fa99ff8a3e5'/>
<id>242a71829e57a4962e43f89cf50d5fa99ff8a3e5</id>
<content type='text'>
commit d696c7bdaa55e2208e56c6f98e6bc1599f34286d upstream.

As noticed by Jon Masters &lt;jonathan@jonmasters.org&gt;, the conntrack hash
size is global and not per namespace, but modifiable at runtime through
/sys/module/nf_conntrack/hashsize. Changing the hash size will only
resize the hash in the current namespace however, so other namespaces
will use an invalid hash size. This can cause crashes when enlarging
the hashsize, or false negative lookups when shrinking it.

Move the hash size into the per-namespace data and only use the global
hash size to initialize the per-namespace value when instanciating a
new namespace. Additionally restrict hash resizing to init_net for
now as other namespaces are not handled currently.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d696c7bdaa55e2208e56c6f98e6bc1599f34286d upstream.

As noticed by Jon Masters &lt;jonathan@jonmasters.org&gt;, the conntrack hash
size is global and not per namespace, but modifiable at runtime through
/sys/module/nf_conntrack/hashsize. Changing the hash size will only
resize the hash in the current namespace however, so other namespaces
will use an invalid hash size. This can cause crashes when enlarging
the hashsize, or false negative lookups when shrinking it.

Move the hash size into the per-namespace data and only use the global
hash size to initialize the per-namespace value when instanciating a
new namespace. Additionally restrict hash resizing to init_net for
now as other namespaces are not handled currently.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: restrict runtime expect hashsize modifications</title>
<updated>2010-02-23T15:37:53+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2010-02-08T19:17:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=51d3a347944f76bc2f304e0622d61b9b39fec585'/>
<id>51d3a347944f76bc2f304e0622d61b9b39fec585</id>
<content type='text'>
commit 13ccdfc2af03e09e60791f7d4bc4ccf53398af7c upstream.

Expectation hashtable size was simply glued to a variable with no code
to rehash expectations, so it was a bug to allow writing to it.
Make "expect_hashsize" readonly.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 13ccdfc2af03e09e60791f7d4bc4ccf53398af7c upstream.

Expectation hashtable size was simply glued to a variable with no code
to rehash expectations, so it was a bug to allow writing to it.
Make "expect_hashsize" readonly.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: per netns nf_conntrack_cachep</title>
<updated>2010-02-23T15:37:53+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-02-08T19:16:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=747edef00c9b2147ca0b3d5bc33e9291a9a6d86e'/>
<id>747edef00c9b2147ca0b3d5bc33e9291a9a6d86e</id>
<content type='text'>
commit 5b3501faa8741d50617ce4191c20061c6ef36cb3 upstream.

nf_conntrack_cachep is currently shared by all netns instances, but
because of SLAB_DESTROY_BY_RCU special semantics, this is wrong.

If we use a shared slab cache, one object can instantly flight between
one hash table (netns ONE) to another one (netns TWO), and concurrent
reader (doing a lookup in netns ONE, 'finding' an object of netns TWO)
can be fooled without notice, because no RCU grace period has to be
observed between object freeing and its reuse.

We dont have this problem with UDP/TCP slab caches because TCP/UDP
hashtables are global to the machine (and each object has a pointer to
its netns).

If we use per netns conntrack hash tables, we also *must* use per netns
conntrack slab caches, to guarantee an object can not escape from one
namespace to another one.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
[Patrick: added unique slab name allocation]
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5b3501faa8741d50617ce4191c20061c6ef36cb3 upstream.

nf_conntrack_cachep is currently shared by all netns instances, but
because of SLAB_DESTROY_BY_RCU special semantics, this is wrong.

If we use a shared slab cache, one object can instantly flight between
one hash table (netns ONE) to another one (netns TWO), and concurrent
reader (doing a lookup in netns ONE, 'finding' an object of netns TWO)
can be fooled without notice, because no RCU grace period has to be
observed between object freeing and its reuse.

We dont have this problem with UDP/TCP slab caches because TCP/UDP
hashtables are global to the machine (and each object has a pointer to
its netns).

If we use per netns conntrack hash tables, we also *must* use per netns
conntrack slab caches, to guarantee an object can not escape from one
namespace to another one.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
[Patrick: added unique slab name allocation]
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
</feed>
