<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/sctp, branch v3.2.28</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sctp: Fix list corruption resulting from freeing an association on a list</title>
<updated>2012-08-19T17:15:22+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2012-07-16T09:13:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e44b8b47d096751cb15a81b4c0f83a05fc1f49c8'/>
<id>e44b8b47d096751cb15a81b4c0f83a05fc1f49c8</id>
<content type='text'>
[ Upstream commit 2eebc1e188e9e45886ee00662519849339884d6d ]

A few days ago Dave Jones reported this oops:

[22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
[22766.295376] CPU 0
[22766.295384] Modules linked in:
[22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
ffff880147c03a74
[22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
[22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
[22766.387137] Stack:
[22766.387140]  ffff880147c03a10
[22766.387140]  ffffffffa169f2b6
[22766.387140]  ffff88013ed95728
[22766.387143]  0000000000000002
[22766.387143]  0000000000000000
[22766.387143]  ffff880003fad062
[22766.387144]  ffff88013c120000
[22766.387144]
[22766.387145] Call Trace:
[22766.387145]  &lt;IRQ&gt;
[22766.387150]  [&lt;ffffffffa169f292&gt;] ? __sctp_lookup_association+0x62/0xd0
[sctp]
[22766.387154]  [&lt;ffffffffa169f2b6&gt;] __sctp_lookup_association+0x86/0xd0 [sctp]
[22766.387157]  [&lt;ffffffffa169f597&gt;] sctp_rcv+0x207/0xbb0 [sctp]
[22766.387161]  [&lt;ffffffff810d4da8&gt;] ? trace_hardirqs_off_caller+0x28/0xd0
[22766.387163]  [&lt;ffffffff815827e3&gt;] ? nf_hook_slow+0x133/0x210
[22766.387166]  [&lt;ffffffff815902fc&gt;] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387168]  [&lt;ffffffff8159043d&gt;] ip_local_deliver_finish+0x18d/0x4c0
[22766.387169]  [&lt;ffffffff815902fc&gt;] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387171]  [&lt;ffffffff81590a07&gt;] ip_local_deliver+0x47/0x80
[22766.387172]  [&lt;ffffffff8158fd80&gt;] ip_rcv_finish+0x150/0x680
[22766.387174]  [&lt;ffffffff81590c54&gt;] ip_rcv+0x214/0x320
[22766.387176]  [&lt;ffffffff81558c07&gt;] __netif_receive_skb+0x7b7/0x910
[22766.387178]  [&lt;ffffffff8155856c&gt;] ? __netif_receive_skb+0x11c/0x910
[22766.387180]  [&lt;ffffffff810d423e&gt;] ? put_lock_stats.isra.25+0xe/0x40
[22766.387182]  [&lt;ffffffff81558f83&gt;] netif_receive_skb+0x23/0x1f0
[22766.387183]  [&lt;ffffffff815596a9&gt;] ? dev_gro_receive+0x139/0x440
[22766.387185]  [&lt;ffffffff81559280&gt;] napi_skb_finish+0x70/0xa0
[22766.387187]  [&lt;ffffffff81559cb5&gt;] napi_gro_receive+0xf5/0x130
[22766.387218]  [&lt;ffffffffa01c4679&gt;] e1000_receive_skb+0x59/0x70 [e1000e]
[22766.387242]  [&lt;ffffffffa01c5aab&gt;] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
[22766.387266]  [&lt;ffffffffa01c9c18&gt;] e1000e_poll+0x78/0x430 [e1000e]
[22766.387268]  [&lt;ffffffff81559fea&gt;] net_rx_action+0x1aa/0x3d0
[22766.387270]  [&lt;ffffffff810a495f&gt;] ? account_system_vtime+0x10f/0x130
[22766.387273]  [&lt;ffffffff810734d0&gt;] __do_softirq+0xe0/0x420
[22766.387275]  [&lt;ffffffff8169826c&gt;] call_softirq+0x1c/0x30
[22766.387278]  [&lt;ffffffff8101db15&gt;] do_softirq+0xd5/0x110
[22766.387279]  [&lt;ffffffff81073bc5&gt;] irq_exit+0xd5/0xe0
[22766.387281]  [&lt;ffffffff81698b03&gt;] do_IRQ+0x63/0xd0
[22766.387283]  [&lt;ffffffff8168ee2f&gt;] common_interrupt+0x6f/0x6f
[22766.387283]  &lt;EOI&gt;
[22766.387284]
[22766.387285]  [&lt;ffffffff8168eed9&gt;] ? retint_swapgs+0x13/0x1b
[22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
89 e5 48 83
ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 &lt;0f&gt; b7 87 98 00 00 00
48 89 fb
49 89 f5 66 c1 c0 08 66 39 46 02
[22766.387307]
[22766.387307] RIP
[22766.387311]  [&lt;ffffffffa168a2c9&gt;] sctp_assoc_is_match+0x19/0x90 [sctp]
[22766.387311]  RSP &lt;ffff880147c039b0&gt;
[22766.387142]  ffffffffa16ab120
[22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
[22766.601221] Kernel panic - not syncing: Fatal exception in interrupt

It appears from his analysis and some staring at the code that this is likely
occuring because an association is getting freed while still on the
sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
while a freed node corrupts part of the list.

Nominally I would think that an mibalanced refcount was responsible for this,
but I can't seem to find any obvious imbalance.  What I did note however was
that the two places where we create an association using
sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
which free a newly created association after calling sctp_primitive_ASSOCIATE.
sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
the aforementioned hash table.  the sctp command interpreter that process side
effects has not way to unwind previously processed commands, so freeing the
association from the __sctp_connect or sctp_sendmsg error path would lead to a
freed association remaining on this hash table.

I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
which allows us to proerly use hlist_unhashed to check if the node is on a
hashlist safely during a delete.  That in turn alows us to safely call
sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
before freeing them, regardles of what the associations state is on the hash
list.

I noted, while I was doing this, that the __sctp_unhash_endpoint was using
hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
pointers to make that function work properly, so I fixed that up in a simmilar
fashion.

I attempted to test this using a virtual guest running the SCTP_RR test from
netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
able to recreate the problem prior to this fix, nor was I able to trigger the
failure after (neither of which I suppose is suprising).  Given the trace above
however, I think its likely that this is what we hit.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: davej@redhat.com
CC: davej@redhat.com
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Vlad Yasevich &lt;vyasevich@gmail.com&gt;
CC: Sridhar Samudrala &lt;sri@us.ibm.com&gt;
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2eebc1e188e9e45886ee00662519849339884d6d ]

A few days ago Dave Jones reported this oops:

[22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
[22766.295376] CPU 0
[22766.295384] Modules linked in:
[22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
ffff880147c03a74
[22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
[22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
[22766.387137] Stack:
[22766.387140]  ffff880147c03a10
[22766.387140]  ffffffffa169f2b6
[22766.387140]  ffff88013ed95728
[22766.387143]  0000000000000002
[22766.387143]  0000000000000000
[22766.387143]  ffff880003fad062
[22766.387144]  ffff88013c120000
[22766.387144]
[22766.387145] Call Trace:
[22766.387145]  &lt;IRQ&gt;
[22766.387150]  [&lt;ffffffffa169f292&gt;] ? __sctp_lookup_association+0x62/0xd0
[sctp]
[22766.387154]  [&lt;ffffffffa169f2b6&gt;] __sctp_lookup_association+0x86/0xd0 [sctp]
[22766.387157]  [&lt;ffffffffa169f597&gt;] sctp_rcv+0x207/0xbb0 [sctp]
[22766.387161]  [&lt;ffffffff810d4da8&gt;] ? trace_hardirqs_off_caller+0x28/0xd0
[22766.387163]  [&lt;ffffffff815827e3&gt;] ? nf_hook_slow+0x133/0x210
[22766.387166]  [&lt;ffffffff815902fc&gt;] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387168]  [&lt;ffffffff8159043d&gt;] ip_local_deliver_finish+0x18d/0x4c0
[22766.387169]  [&lt;ffffffff815902fc&gt;] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387171]  [&lt;ffffffff81590a07&gt;] ip_local_deliver+0x47/0x80
[22766.387172]  [&lt;ffffffff8158fd80&gt;] ip_rcv_finish+0x150/0x680
[22766.387174]  [&lt;ffffffff81590c54&gt;] ip_rcv+0x214/0x320
[22766.387176]  [&lt;ffffffff81558c07&gt;] __netif_receive_skb+0x7b7/0x910
[22766.387178]  [&lt;ffffffff8155856c&gt;] ? __netif_receive_skb+0x11c/0x910
[22766.387180]  [&lt;ffffffff810d423e&gt;] ? put_lock_stats.isra.25+0xe/0x40
[22766.387182]  [&lt;ffffffff81558f83&gt;] netif_receive_skb+0x23/0x1f0
[22766.387183]  [&lt;ffffffff815596a9&gt;] ? dev_gro_receive+0x139/0x440
[22766.387185]  [&lt;ffffffff81559280&gt;] napi_skb_finish+0x70/0xa0
[22766.387187]  [&lt;ffffffff81559cb5&gt;] napi_gro_receive+0xf5/0x130
[22766.387218]  [&lt;ffffffffa01c4679&gt;] e1000_receive_skb+0x59/0x70 [e1000e]
[22766.387242]  [&lt;ffffffffa01c5aab&gt;] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
[22766.387266]  [&lt;ffffffffa01c9c18&gt;] e1000e_poll+0x78/0x430 [e1000e]
[22766.387268]  [&lt;ffffffff81559fea&gt;] net_rx_action+0x1aa/0x3d0
[22766.387270]  [&lt;ffffffff810a495f&gt;] ? account_system_vtime+0x10f/0x130
[22766.387273]  [&lt;ffffffff810734d0&gt;] __do_softirq+0xe0/0x420
[22766.387275]  [&lt;ffffffff8169826c&gt;] call_softirq+0x1c/0x30
[22766.387278]  [&lt;ffffffff8101db15&gt;] do_softirq+0xd5/0x110
[22766.387279]  [&lt;ffffffff81073bc5&gt;] irq_exit+0xd5/0xe0
[22766.387281]  [&lt;ffffffff81698b03&gt;] do_IRQ+0x63/0xd0
[22766.387283]  [&lt;ffffffff8168ee2f&gt;] common_interrupt+0x6f/0x6f
[22766.387283]  &lt;EOI&gt;
[22766.387284]
[22766.387285]  [&lt;ffffffff8168eed9&gt;] ? retint_swapgs+0x13/0x1b
[22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
89 e5 48 83
ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 &lt;0f&gt; b7 87 98 00 00 00
48 89 fb
49 89 f5 66 c1 c0 08 66 39 46 02
[22766.387307]
[22766.387307] RIP
[22766.387311]  [&lt;ffffffffa168a2c9&gt;] sctp_assoc_is_match+0x19/0x90 [sctp]
[22766.387311]  RSP &lt;ffff880147c039b0&gt;
[22766.387142]  ffffffffa16ab120
[22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
[22766.601221] Kernel panic - not syncing: Fatal exception in interrupt

It appears from his analysis and some staring at the code that this is likely
occuring because an association is getting freed while still on the
sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
while a freed node corrupts part of the list.

Nominally I would think that an mibalanced refcount was responsible for this,
but I can't seem to find any obvious imbalance.  What I did note however was
that the two places where we create an association using
sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
which free a newly created association after calling sctp_primitive_ASSOCIATE.
sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
the aforementioned hash table.  the sctp command interpreter that process side
effects has not way to unwind previously processed commands, so freeing the
association from the __sctp_connect or sctp_sendmsg error path would lead to a
freed association remaining on this hash table.

I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
which allows us to proerly use hlist_unhashed to check if the node is on a
hashlist safely during a delete.  That in turn alows us to safely call
sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
before freeing them, regardles of what the associations state is on the hash
list.

I noted, while I was doing this, that the __sctp_unhash_endpoint was using
hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
pointers to make that function work properly, so I fixed that up in a simmilar
fashion.

I attempted to test this using a virtual guest running the SCTP_RR test from
netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
able to recreate the problem prior to this fix, nor was I able to trigger the
failure after (neither of which I suppose is suprising).  Given the trace above
however, I think its likely that this is what we hit.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: davej@redhat.com
CC: davej@redhat.com
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Vlad Yasevich &lt;vyasevich@gmail.com&gt;
CC: Sridhar Samudrala &lt;sri@us.ibm.com&gt;
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sctp: check cached dst before using it</title>
<updated>2012-06-10T13:42:06+00:00</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2012-05-04T05:24:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba9ee5eb730f20bab874ba91d3dfec7816459dad'/>
<id>ba9ee5eb730f20bab874ba91d3dfec7816459dad</id>
<content type='text'>
[ Upstream commit e0268868ba064980488fc8c194db3d8e9fb2959c ]

dst_check() will take care of SA (and obsolete field), hence
IPsec rekeying scenario is taken into account.

Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Vlad Yaseivch &lt;vyasevich@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e0268868ba064980488fc8c194db3d8e9fb2959c ]

dst_check() will take care of SA (and obsolete field), hence
IPsec rekeying scenario is taken into account.

Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Vlad Yaseivch &lt;vyasevich@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sctp: Allow struct sctp_event_subscribe to grow without breaking binaries</title>
<updated>2012-05-11T12:14:18+00:00</updated>
<author>
<name>Thomas Graf</name>
<email>tgraf@infradead.org</email>
</author>
<published>2012-04-03T22:17:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a13e640b8d17e0a5ab4b388431f877e2feef7abf'/>
<id>a13e640b8d17e0a5ab4b388431f877e2feef7abf</id>
<content type='text'>
[ Upstream commit acdd5985364f8dc511a0762fab2e683f29d9d692 ]

getsockopt(..., SCTP_EVENTS, ...) performs a length check and returns
an error if the user provides less bytes than the size of struct
sctp_event_subscribe.

Struct sctp_event_subscribe needs to be extended by an u8 for every
new event or notification type that is added.

This obviously makes getsockopt fail for binaries that are compiled
against an older versions of &lt;net/sctp/user.h&gt; which do not contain
all event types.

This patch changes getsockopt behaviour to no longer return an error
if not enough bytes are being provided by the user. Instead, it
returns as much of sctp_event_subscribe as fits into the provided buffer.

This leads to the new behavior that users see what they have been aware
of at compile time.

The setsockopt(..., SCTP_EVENTS, ...) API is already behaving like this.

Signed-off-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Acked-by: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit acdd5985364f8dc511a0762fab2e683f29d9d692 ]

getsockopt(..., SCTP_EVENTS, ...) performs a length check and returns
an error if the user provides less bytes than the size of struct
sctp_event_subscribe.

Struct sctp_event_subscribe needs to be extended by an u8 for every
new event or notification type that is added.

This obviously makes getsockopt fail for binaries that are compiled
against an older versions of &lt;net/sctp/user.h&gt; which do not contain
all event types.

This patch changes getsockopt behaviour to no longer return an error
if not enough bytes are being provided by the user. Instead, it
returns as much of sctp_event_subscribe as fits into the provided buffer.

This leads to the new behavior that users see what they have been aware
of at compile time.

The setsockopt(..., SCTP_EVENTS, ...) API is already behaving like this.

Signed-off-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Acked-by: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd</title>
<updated>2011-12-20T18:58:37+00:00</updated>
<author>
<name>Thomas Graf</name>
<email>tgraf@redhat.com</email>
</author>
<published>2011-12-19T04:11:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a76c0adf60f6ca5ff3481992e4ea0383776b24d2'/>
<id>a76c0adf60f6ca5ff3481992e4ea0383776b24d2</id>
<content type='text'>
When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.

The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.

When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.

Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.

The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.

Chunk
Size    Unpatched     No Overhead
-------------------------------------
   4    15.2 Kbit [!]   12.2 Mbit [!]
   8    35.8 Kbit [!]   26.0 Mbit [!]
  16    95.5 Kbit [!]   54.4 Mbit [!]
  32   106.7 Mbit      102.3 Mbit
  64   189.2 Mbit      188.3 Mbit
 128   331.2 Mbit      334.8 Mbit
 256   537.7 Mbit      536.0 Mbit
 512   766.9 Mbit      766.6 Mbit
1024   810.1 Mbit      808.6 Mbit

Signed-off-by: Thomas Graf &lt;tgraf@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.

The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.

When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.

Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.

The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.

Chunk
Size    Unpatched     No Overhead
-------------------------------------
   4    15.2 Kbit [!]   12.2 Mbit [!]
   8    35.8 Kbit [!]   26.0 Mbit [!]
  16    95.5 Kbit [!]   54.4 Mbit [!]
  32   106.7 Mbit      102.3 Mbit
  64   189.2 Mbit      188.3 Mbit
 128   331.2 Mbit      334.8 Mbit
 256   537.7 Mbit      536.0 Mbit
 512   766.9 Mbit      766.6 Mbit
1024   810.1 Mbit      808.6 Mbit

Signed-off-by: Thomas Graf &lt;tgraf@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sctp: fix incorrect overflow check on autoclose</title>
<updated>2011-12-19T21:25:46+00:00</updated>
<author>
<name>Xi Wang</name>
<email>xi.wang@gmail.com</email>
</author>
<published>2011-12-16T12:44:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2692ba61a82203404abd7dd2a027bda962861f74'/>
<id>2692ba61a82203404abd7dd2a027bda962861f74</id>
<content type='text'>
Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value.  If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
   platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
   getsockopt() calls.

Suggested-by: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: Xi Wang &lt;xi.wang@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value.  If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
   platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
   getsockopt() calls.

Suggested-by: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: Xi Wang &lt;xi.wang@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sctp: better integer overflow check in sctp_auth_create_key()</title>
<updated>2011-11-29T20:51:03+00:00</updated>
<author>
<name>Xi Wang</name>
<email>xi.wang@gmail.com</email>
</author>
<published>2011-11-29T09:26:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c89304b8ea34ab48ba6ae10e06a8b1b8c8212307'/>
<id>c89304b8ea34ab48ba6ae10e06a8b1b8c8212307</id>
<content type='text'>
The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1).  In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.

However this shouldn't be a security issue.  The function is called
from the following two code paths:

 1) setsockopt()

 2) sctp_auth_asoc_set_secret()

In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API.  As such, the key length will
never overflow.

In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.

In other words, this overflow check is not really necessary.  Just
make it more correct.

Signed-off-by: Xi Wang &lt;xi.wang@gmail.com&gt;
Cc: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1).  In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.

However this shouldn't be a security issue.  The function is called
from the following two code paths:

 1) setsockopt()

 2) sctp_auth_asoc_set_secret()

In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API.  As such, the key length will
never overflow.

In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.

In other words, this overflow check is not really necessary.  Just
make it more correct.

Signed-off-by: Xi Wang &lt;xi.wang@gmail.com&gt;
Cc: Vlad Yasevich &lt;vladislav.yasevich@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux</title>
<updated>2011-11-07T03:44:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-11-07T03:44:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=32aaeffbd4a7457bf2f7448b33b5946ff2a960eb'/>
<id>32aaeffbd4a7457bf2f7448b33b5946ff2a960eb</id>
<content type='text'>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include &lt;linux/module.h&gt;
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include &lt;linux/module.h&gt;
  net: sch_generic remove redundant use of &lt;linux/module.h&gt;
  net: inet_timewait_sock doesnt need &lt;linux/module.h&gt;
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include &lt;linux/module.h&gt;
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include &lt;linux/module.h&gt;
  net: sch_generic remove redundant use of &lt;linux/module.h&gt;
  net: inet_timewait_sock doesnt need &lt;linux/module.h&gt;
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules</title>
<updated>2011-10-31T23:30:30+00:00</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2011-07-15T15:47:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc3b2d7fb9b014d75ebb79ba371a763dbab5e8cf'/>
<id>bc3b2d7fb9b014d75ebb79ba371a763dbab5e8cf</id>
<content type='text'>
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAIT</title>
<updated>2011-10-27T04:44:35+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-10-27T04:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b903d324bee2627036d024dceed73b3c96558795'/>
<id>b903d324bee2627036d024dceed73b3c96558795</id>
<content type='text'>
commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
TIME_WAIT) fixed IPv4 only.

This part is for the IPv6 side, adding a tclass param to ip6_xmit()

We alias tw_tclass and tw_tos, if socket family is INET6.

[ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill
TOS field, TCLASS is not taken into account ]

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
TIME_WAIT) fixed IPv4 only.

This part is for the IPv6 side, adding a tclass param to ip6_xmit()

We alias tw_tclass and tw_tos, if socket family is INET6.

[ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill
TOS field, TCLASS is not taken into account ]

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: more accurate skb truesize</title>
<updated>2011-10-13T20:05:07+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-10-13T07:28:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=87fb4b7b533073eeeaed0b6bf7c2328995f6c075'/>
<id>87fb4b7b533073eeeaed0b6bf7c2328995f6c075</id>
<content type='text'>
skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb-&gt;head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;ak@linux.intel.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb-&gt;head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;ak@linux.intel.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
