<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/net/bonding, branch v3.0.7</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>bonding: fix string comparison errors</title>
<updated>2011-08-16T01:31:38+00:00</updated>
<author>
<name>Andy Gospodarek</name>
<email>andy@greyhouse.net</email>
</author>
<published>2011-07-26T11:12:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=247460c92c3a5d58496c6d65c4b4604ca0e2ff0e'/>
<id>247460c92c3a5d58496c6d65c4b4604ca0e2ff0e</id>
<content type='text'>
[ Upstream commit f4bb2e9c4fa9e5fdddf90589703613fd1a9c519f ]

When a bond contains a device where one name is the subset of another
(eth1 and eth10, for example), one cannot properly set the primary
device or the currently active device.

This was reported and based on work by Takuma Umeya.  I also verified
the problem and tested that this fix resolves it.

V2: A few did not like the the current code or my changes, so I
refactored bonding_store_primary and bonding_store_active_slave to be a
bit cleaner, dropped the use of strnicmp since we did not really need
the comparison to be case insensitive, and formatted the input string
from sysfs so a comparison to IFNAMSIZ could be used.

I also discovered an error in bonding_store_active_slave that would
modify bond-&gt;primary_slave rather than bond-&gt;curr_active_slave before
forcing the bonding driver to choose a new active slave.

V3: Actually sending the proper patch....

Signed-off-by: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Reported-by: Takuma Umeya &lt;tumeya@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f4bb2e9c4fa9e5fdddf90589703613fd1a9c519f ]

When a bond contains a device where one name is the subset of another
(eth1 and eth10, for example), one cannot properly set the primary
device or the currently active device.

This was reported and based on work by Takuma Umeya.  I also verified
the problem and tested that this fix resolves it.

V2: A few did not like the the current code or my changes, so I
refactored bonding_store_primary and bonding_store_active_slave to be a
bit cleaner, dropped the use of strnicmp since we did not really need
the comparison to be case insensitive, and formatted the input string
from sysfs so a comparison to IFNAMSIZ could be used.

I also discovered an error in bonding_store_active_slave that would
modify bond-&gt;primary_slave rather than bond-&gt;curr_active_slave before
forcing the bonding driver to choose a new active slave.

V3: Actually sending the proper patch....

Signed-off-by: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Reported-by: Takuma Umeya &lt;tumeya@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared</title>
<updated>2011-08-16T01:31:38+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2011-07-26T06:05:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9cf81e790a0d8709cbadbaff73ee40aa944e2ea1'/>
<id>9cf81e790a0d8709cbadbaff73ee40aa944e2ea1</id>
<content type='text'>
[ Upstream commit 550fd08c2cebad61c548def135f67aba284c6162 ]

After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs.  There are a handful of drivers that violate this assumption of
course, and need to be fixed up.  This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
CC: Karsten Keil &lt;isdn@linux-pingi.de&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Krzysztof Halasa &lt;khc@pm.waw.pl&gt;
CC: "John W. Linville" &lt;linville@tuxdriver.com&gt;
CC: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
CC: Marcel Holtmann &lt;marcel@holtmann.org&gt;
CC: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 550fd08c2cebad61c548def135f67aba284c6162 ]

After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs.  There are a handful of drivers that violate this assumption of
course, and need to be fixed up.  This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
CC: Karsten Keil &lt;isdn@linux-pingi.de&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Krzysztof Halasa &lt;khc@pm.waw.pl&gt;
CC: "John W. Linville" &lt;linville@tuxdriver.com&gt;
CC: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
CC: Marcel Holtmann &lt;marcel@holtmann.org&gt;
CC: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove NETIF_F_ALL_TX_OFFLOADS</title>
<updated>2011-07-14T22:18:49+00:00</updated>
<author>
<name>Michał Mirosław</name>
<email>mirq-linux@rere.qmqm.pl</email>
</author>
<published>2011-07-13T14:10:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=62f2a3a48bdc99822a24356e667e52c30df287c9'/>
<id>62f2a3a48bdc99822a24356e667e52c30df287c9</id>
<content type='text'>
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.

For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.

This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.

NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.

Signed-off-by: Michał Mirosław &lt;mirq-linux@rere.qmqm.pl&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.

For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.

This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.

NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.

Signed-off-by: Michał Mirosław &lt;mirq-linux@rere.qmqm.pl&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netpoll: copy dev name of slaves to struct netpoll</title>
<updated>2011-06-19T23:13:01+00:00</updated>
<author>
<name>WANG Cong</name>
<email>amwang@redhat.com</email>
</author>
<published>2011-06-19T23:13:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cefa9993f161c1c2b6b91b7ea2e84a9bfbd43d2e'/>
<id>cefa9993f161c1c2b6b91b7ea2e84a9bfbd43d2e</id>
<content type='text'>
Otherwise we will not see the name of the slave dev in error
message:

[  388.469446] (null):  doesn't support polling, aborting.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise we will not see the name of the slave dev in error
message:

[  388.469446] (null):  doesn't support polling, aborting.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: reset queue mapping prior to transmission to physical device (v5)</title>
<updated>2011-06-05T21:31:25+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2011-06-03T10:35:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=374eeb5a9d77ea719c5c46f4d70226623f4528ce'/>
<id>374eeb5a9d77ea719c5c46f4d70226623f4528ce</id>
<content type='text'>
The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb-&gt;queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

v3) Fixed spelling an casting error in skb-&gt;cb

v4) fixed to store raw queue_mapping to avoid double decrement

v5) Eric D requested that -&gt;cb access be wrapped in a macro.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bonding driver is multiqueue enabled, in which each queue represents a slave
to enable optional steering of output frames to given slaves against the default
output policy.  However, it needs to reset the skb-&gt;queue_mapping prior to
queuing to the physical device or the physical slave (if it is multiqueue) could
wind up transmitting on an unintended tx queue

Change Notes:
v2) Based on first pass review, updated the patch to restore the origional queue
mapping that was found in bond_select_queue, rather than simply resetting to
zero.  This preserves the value of queue_mapping when it was set on receive in
the forwarding case which is desireable.

v3) Fixed spelling an casting error in skb-&gt;cb

v4) fixed to store raw queue_mapping to avoid double decrement

v5) Eric D requested that -&gt;cb access be wrapped in a macro.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: cleanup module option descriptions</title>
<updated>2011-05-26T18:57:17+00:00</updated>
<author>
<name>Andy Gospodarek</name>
<email>andy@greyhouse.net</email>
</author>
<published>2011-05-25T04:41:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=90e62474fd08e16ba5309886c801243b0eb782f3'/>
<id>90e62474fd08e16ba5309886c801243b0eb782f3</id>
<content type='text'>
Weiping Pan noticed that the module option description for
xmit_hash_policy was incorrect and was nice enough to post a patch to
fix it.  The text was correct, but created a line over 80 characters and
I would rather not add those.  I realized I could take a few minutes and
clean up all the descriptions and things would look much better.  This
is the result.

Based on patch from Weiping Pan &lt;panweiping3@gmail.com&gt;.

Signed-off-by: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Weiping Pan &lt;panweiping3@gmail.com&gt;
Reviewed-by: Weiping Pan &lt;panweiping3@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Weiping Pan noticed that the module option description for
xmit_hash_policy was incorrect and was nice enough to post a patch to
fix it.  The text was correct, but created a line over 80 characters and
I would rather not add those.  I realized I could take a few minutes and
clean up all the descriptions and things would look much better.  This
is the result.

Based on patch from Weiping Pan &lt;panweiping3@gmail.com&gt;.

Signed-off-by: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: Weiping Pan &lt;panweiping3@gmail.com&gt;
Reviewed-by: Weiping Pan &lt;panweiping3@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: documentation and code cleanup for resend_igmp</title>
<updated>2011-05-25T21:55:33+00:00</updated>
<author>
<name>Flavio Leitner</name>
<email>fbl@redhat.com</email>
</author>
<published>2011-05-25T08:38:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=94265cf5f731c7df29fdfde262ca3e6d51e6828c'/>
<id>94265cf5f731c7df29fdfde262ca3e6d51e6828c</id>
<content type='text'>
Improves the documentation about how IGMP resend parameter
works, fix two missing checks and coding style issues.

Signed-off-by: Flavio Leitner &lt;fbl@redhat.com&gt;
Acked-by: Rick Jones &lt;rick.jones2@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Improves the documentation about how IGMP resend parameter
works, fix two missing checks and coding style issues.

Signed-off-by: Flavio Leitner &lt;fbl@redhat.com&gt;
Acked-by: Rick Jones &lt;rick.jones2@hp.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: prevent deadlock on slave store with alb mode (v3)</title>
<updated>2011-05-25T21:55:33+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2011-05-25T08:13:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9fe0617d9b6d21f700ee9e658e1c9fe3be2fb402'/>
<id>9fe0617d9b6d21f700ee9e658e1c9fe3be2fb402</id>
<content type='text'>
This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 &gt; /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 &gt; /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 &gt; /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[&lt;ffffffff80064bf0&gt;]  [&lt;ffffffff80064bf0&gt;]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
 [&lt;ffffffff80064af9&gt;] _spin_lock_bh+0x9/0x14
 [&lt;ffffffff886937d7&gt;] :bonding:tlb_clear_slave+0x22/0xa1
 [&lt;ffffffff8869423c&gt;] :bonding:bond_alb_deinit_slave+0xba/0xf0
 [&lt;ffffffff8868dda6&gt;] :bonding:bond_release+0x1b4/0x450
 [&lt;ffffffff8006457b&gt;] __down_write_nested+0x12/0x92
 [&lt;ffffffff88696ae4&gt;] :bonding:bonding_store_slaves+0x25c/0x2f7
 [&lt;ffffffff801106f7&gt;] sysfs_write_file+0xb9/0xe8
 [&lt;ffffffff80016b87&gt;] vfs_write+0xce/0x174
 [&lt;ffffffff80017450&gt;] sys_write+0x45/0x6e
 [&lt;ffffffff8005d28d&gt;] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 &gt; /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 &gt; /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 &gt; /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[&lt;ffffffff80064bf0&gt;]  [&lt;ffffffff80064bf0&gt;]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
 [&lt;ffffffff80064af9&gt;] _spin_lock_bh+0x9/0x14
 [&lt;ffffffff886937d7&gt;] :bonding:tlb_clear_slave+0x22/0xa1
 [&lt;ffffffff8869423c&gt;] :bonding:bond_alb_deinit_slave+0xba/0xf0
 [&lt;ffffffff8868dda6&gt;] :bonding:bond_release+0x1b4/0x450
 [&lt;ffffffff8006457b&gt;] __down_write_nested+0x12/0x92
 [&lt;ffffffff88696ae4&gt;] :bonding:bonding_store_slaves+0x25c/0x2f7
 [&lt;ffffffff801106f7&gt;] sysfs_write_file+0xb9/0xe8
 [&lt;ffffffff80016b87&gt;] vfs_write+0xce/0x174
 [&lt;ffffffff80017450&gt;] sys_write+0x45/0x6e
 [&lt;ffffffff8005d28d&gt;] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
CC: Andy Gospodarek &lt;andy@greyhouse.net&gt;
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Jay Vosburgh &lt;fubar@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: rename NETDEV_BONDING_DESLAVE to NETDEV_RELEASE</title>
<updated>2011-05-23T01:01:19+00:00</updated>
<author>
<name>Amerigo Wang</name>
<email>amwang@redhat.com</email>
</author>
<published>2011-05-19T21:39:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=daf9209bb2c8b07ca025eac82e3d175534086c77'/>
<id>daf9209bb2c8b07ca025eac82e3d175534086c77</id>
<content type='text'>
s/NETDEV_BONDING_DESLAVE/NETDEV_RELEASE/ as Andy suggested.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Cc: Neil Horman &lt;nhorman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
s/NETDEV_BONDING_DESLAVE/NETDEV_RELEASE/ as Andy suggested.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Cc: Neil Horman &lt;nhorman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netpoll: disable netpoll when enslave a device</title>
<updated>2011-05-23T01:01:19+00:00</updated>
<author>
<name>Amerigo Wang</name>
<email>amwang@redhat.com</email>
</author>
<published>2011-05-19T21:39:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8d8fc29d02a33e4bd5f4fa47823c1fd386346093'/>
<id>8d8fc29d02a33e4bd5f4fa47823c1fd386346093</id>
<content type='text'>
V3: rename NETDEV_ENSLAVE to NETDEV_JOIN

Currently we do nothing when we enslave a net device which is running netconsole.
Neil pointed out that we may get weird results in such case, so let's disable
netpoll on the device being enslaved. I think it is too harsh to prevent
the device being ensalved if it is running netconsole.

By the way, this patch also removes the NETDEV_GOING_DOWN from netconsole
netdev notifier, because netpoll will check if the device is running or not
and we don't handle NETDEV_PRE_UP neither.

This patch is based on net-next-2.6.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Neil Horman &lt;nhorman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
V3: rename NETDEV_ENSLAVE to NETDEV_JOIN

Currently we do nothing when we enslave a net device which is running netconsole.
Neil pointed out that we may get weird results in such case, so let's disable
netpoll on the device being enslaved. I think it is too harsh to prevent
the device being ensalved if it is running netconsole.

By the way, this patch also removes the NETDEV_GOING_DOWN from netconsole
netdev notifier, because netpoll will check if the device is running or not
and we don't handle NETDEV_PRE_UP neither.

This patch is based on net-next-2.6.

Signed-off-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Neil Horman &lt;nhorman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
