<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/core, branch v2.6.18.3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>[PATCH] NET: Set truesize in pskb_copy</title>
<updated>2006-11-19T03:28:03+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2006-11-09T06:33:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=37ccc3f978bc86c131eea70d7a5afb29b2ca3404'/>
<id>37ccc3f978bc86c131eea70d7a5afb29b2ca3404</id>
<content type='text'>
Since pskb_copy tacks on the non-linear bits from the original
skb, it needs to count them in the truesize field of the new skb.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since pskb_copy tacks on the non-linear bits from the original
skb, it needs to count them in the truesize field of the new skb.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] NET: __alloc_pages() failures reported due to fragmentation</title>
<updated>2006-11-19T03:28:02+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2006-11-06T23:07:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=90be3aa1886f6c18b99aacf6c682cf6c3493d385'/>
<id>90be3aa1886f6c18b99aacf6c682cf6c3493d385</id>
<content type='text'>
We have seen a couple of __alloc_pages() failures due to
fragmentation, there is plenty of free memory but no large order pages
available.  I think the problem is in sock_alloc_send_pskb(), the
gfp_mask includes __GFP_REPEAT but its never used/passed to the page
allocator.  Shouldnt the gfp_mask be passed to alloc_skb() ?

Signed-off-by: Larry Woodman &lt;lwoodman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have seen a couple of __alloc_pages() failures due to
fragmentation, there is plenty of free memory but no large order pages
available.  I think the problem is in sock_alloc_send_pskb(), the
gfp_mask includes __GFP_REPEAT but its never used/passed to the page
allocator.  Shouldnt the gfp_mask be passed to alloc_skb() ?

Signed-off-by: Larry Woodman &lt;lwoodman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] NET: Fix skb_segment() handling of fully linear SKBs</title>
<updated>2006-11-04T01:33:48+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2006-10-30T00:11:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5565a6be17231fdcbaa65e2ef41e4f67bf709a81'/>
<id>5565a6be17231fdcbaa65e2ef41e4f67bf709a81</id>
<content type='text'>
[NET]: Fix segmentation of linear packets

skb_segment fails to segment linear packets correctly because it
tries to write all linear parts of the original skb into each
segment.  This will always panic as each segment only contains
enough space for one MSS.

This was not detected earlier because linear packets should be
rare for GSO.  In fact it still remains to be seen what exactly
created the linear packets that triggered this bug.  Basically
the only time this should happen is if someone enables GSO
emulation on an interface that does not support SG.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[NET]: Fix segmentation of linear packets

skb_segment fails to segment linear packets correctly because it
tries to write all linear parts of the original skb into each
segment.  This will always panic as each segment only contains
enough space for one MSS.

This was not detected earlier because linear packets should be
rare for GSO.  In fact it still remains to be seen what exactly
created the linear packets that triggered this bug.  Basically
the only time this should happen is if someone enables GSO
emulation on an interface that does not support SG.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NET_SCHED: Fix fallout from dev-&gt;qdisc RCU change</title>
<updated>2006-10-13T20:23:20+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2006-10-10T04:16:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5f804c752ba4e4e85457bf0e4da9f8521720328b'/>
<id>5f804c752ba4e4e85457bf0e4da9f8521720328b</id>
<content type='text'>
The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.

The two assumptions were:

- since changes only happen in process context, read_lock doesn't need
  bottem half protection. Now invalid since destruction of inner qdiscs,
  classifiers, actions and estimators happens in the RCU callback unless
  they're manually deleted, resulting in dead-locks when read_lock in
  process context is interrupted by write_lock_bh in bottem half context.

- since changes only happen under the RTNL, no additional locking is
  necessary for data not used during packet processing (f.e. u32_list).
  Again, since destruction now happens in the RCU callback, this assumption
  is not valid anymore, causing races while using this data, which can
  result in corruption or use-after-free.

Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev-&gt;qdisc
pointer is protected by RCU, but -&gt;enqueue and the qdisc tree are still
protected by dev-&gt;qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.

The two assumptions were:

- since changes only happen in process context, read_lock doesn't need
  bottem half protection. Now invalid since destruction of inner qdiscs,
  classifiers, actions and estimators happens in the RCU callback unless
  they're manually deleted, resulting in dead-locks when read_lock in
  process context is interrupted by write_lock_bh in bottem half context.

- since changes only happen under the RTNL, no additional locking is
  necessary for data not used during packet processing (f.e. u32_list).
  Again, since destruction now happens in the RCU callback, this assumption
  is not valid anymore, causing races while using this data, which can
  result in corruption or use-after-free.

Instead of "fixing" this by disabling bottem halfs everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev-&gt;qdisc
pointer is protected by RCU, but -&gt;enqueue and the qdisc tree are still
protected by dev-&gt;qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>[NEIGH]: neigh_table_clear() doesn't free stats</title>
<updated>2006-09-18T06:21:01+00:00</updated>
<author>
<name>Kirill Korotaev</name>
<email>dev@openvz.org</email>
</author>
<published>2006-09-01T08:34:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3fcde74b3877756f4b4725a883d0b48696c0d369'/>
<id>3fcde74b3877756f4b4725a883d0b48696c0d369</id>
<content type='text'>
neigh_table_clear() doesn't free tbl-&gt;stats.
Found by Alexey Kuznetsov. Though Alexey considers this
leak minor for mainstream, I still believe that cleanup
code should not forget to free some of the resources :)

At least, this is critical for OpenVZ with virtualized
neighbour tables.

Signed-Off-By: Kirill Korotaev &lt;dev@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
neigh_table_clear() doesn't free tbl-&gt;stats.
Found by Alexey Kuznetsov. Though Alexey considers this
leak minor for mainstream, I still believe that cleanup
code should not forget to free some of the resources :)

At least, this is critical for OpenVZ with virtualized
neighbour tables.

Signed-Off-By: Kirill Korotaev &lt;dev@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Disallow whitespace in network device names.</title>
<updated>2006-08-17T23:29:56+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@sunset.davemloft.net</email>
</author>
<published>2006-08-15T23:34:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c7fa9d189e93877a1fa08ab00f230e0689125e45'/>
<id>c7fa9d189e93877a1fa08ab00f230e0689125e45</id>
<content type='text'>
It causes way too much trouble and confusion in userspace.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It causes way too much trouble and confusion in userspace.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Fix potential stack overflow in net/core/utils.c</title>
<updated>2006-08-17T23:29:47+00:00</updated>
<author>
<name>Suresh Siddha</name>
<email>suresh.b.siddha@intel.com</email>
</author>
<published>2006-08-15T07:03:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=855751125093f758871b70da2951d8b92b6368cc'/>
<id>855751125093f758871b70da2951d8b92b6368cc</id>
<content type='text'>
On High end systems (1024 or so cpus) this can potentially cause stack
overflow.  Fix the stack usage.

Signed-off-by: Suresh Siddha &lt;suresh.b.siddha@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On High end systems (1024 or so cpus) this can potentially cause stack
overflow.  Fix the stack usage.

Signed-off-by: Suresh Siddha &lt;suresh.b.siddha@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[VLAN]: Make sure bonding packet drop checks get done in hwaccel RX path.</title>
<updated>2006-08-17T23:29:46+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@sunset.davemloft.net</email>
</author>
<published>2006-08-15T00:08:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7ea49ed73c8d0d0bdf7c11fc18c61572d2d22176'/>
<id>7ea49ed73c8d0d0bdf7c11fc18c61572d2d22176</id>
<content type='text'>
Since __vlan_hwaccel_rx() is essentially bypassing the
netif_receive_skb() call that would have occurred if we did the VLAN
decapsulation in software, we are missing the skb_bond() call and the
assosciated checks it does.

Export those checks via an inline function, skb_bond_should_drop(),
and use this in __vlan_hwaccel_rx().

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since __vlan_hwaccel_rx() is essentially bypassing the
netif_receive_skb() call that would have occurred if we did the VLAN
decapsulation in software, we are missing the skb_bond() call and the
assosciated checks it does.

Export those checks via an inline function, skb_bond_should_drop(),
and use this in __vlan_hwaccel_rx().

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2006-08-09T18:49:13+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@suse.de</email>
</author>
<published>2006-08-09T18:49:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a4657141091c2f975fa35ac1ad28fffdd756091e'/>
<id>a4657141091c2f975fa35ac1ad28fffdd756091e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: add_timer -&gt; mod_timer() in dst_run_gc()</title>
<updated>2006-08-09T09:25:54+00:00</updated>
<author>
<name>Dmitry Mishin</name>
<email>dim@openvz.org</email>
</author>
<published>2006-08-09T09:25:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7c91767a6b701543c93ebcd611dab61deff3dad1'/>
<id>7c91767a6b701543c93ebcd611dab61deff3dad1</id>
<content type='text'>
Patch from Dmitry Mishin &lt;dim@openvz.org&gt;:

Replace add_timer() by mod_timer() in dst_run_gc
in order to avoid BUG message.

       CPU1                            CPU2
dst_run_gc()  entered           dst_run_gc() entered
spin_lock(&amp;dst_lock)                   .....
del_timer(&amp;dst_gc_timer)         fail to get lock
       ....                         mod_timer() &lt;--- puts 
                                                 timer back
                                                 to the list
add_timer(&amp;dst_gc_timer) &lt;--- BUG because timer is in list already.

Found during OpenVZ internal testing.

At first we thought that it is OpenVZ specific as we
added dst_run_gc(0) call in dst_dev_event(),
but as Alexey pointed to me it is possible to trigger
this condition in mainstream kernel.

F.e. timer has fired on CPU2, but the handler was preeempted
by an irq before dst_lock is tried.
Meanwhile, someone on CPU1 adds an entry to gc list and
starts the timer.
If CPU2 was preempted long enough, this timer can expire
simultaneously with resuming timer handler on CPU1, arriving
exactly to the situation described.

Signed-off-by: Dmitry Mishin &lt;dim@openvz.org&gt;
Signed-off-by: Kirill Korotaev &lt;dev@openvz.org&gt;
Signed-off-by: Alexey Kuznetsov &lt;kuznet@ms2.inr.ac.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch from Dmitry Mishin &lt;dim@openvz.org&gt;:

Replace add_timer() by mod_timer() in dst_run_gc
in order to avoid BUG message.

       CPU1                            CPU2
dst_run_gc()  entered           dst_run_gc() entered
spin_lock(&amp;dst_lock)                   .....
del_timer(&amp;dst_gc_timer)         fail to get lock
       ....                         mod_timer() &lt;--- puts 
                                                 timer back
                                                 to the list
add_timer(&amp;dst_gc_timer) &lt;--- BUG because timer is in list already.

Found during OpenVZ internal testing.

At first we thought that it is OpenVZ specific as we
added dst_run_gc(0) call in dst_dev_event(),
but as Alexey pointed to me it is possible to trigger
this condition in mainstream kernel.

F.e. timer has fired on CPU2, but the handler was preeempted
by an irq before dst_lock is tried.
Meanwhile, someone on CPU1 adds an entry to gc list and
starts the timer.
If CPU2 was preempted long enough, this timer can expire
simultaneously with resuming timer handler on CPU1, arriving
exactly to the situation described.

Signed-off-by: Dmitry Mishin &lt;dim@openvz.org&gt;
Signed-off-by: Kirill Korotaev &lt;dev@openvz.org&gt;
Signed-off-by: Alexey Kuznetsov &lt;kuznet@ms2.inr.ac.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
