<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/infiniband/ulp, branch v3.2.53</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>IPoIB: Fix send lockup due to missed TX completion</title>
<updated>2013-04-10T02:20:02+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2013-02-26T15:46:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6273c03e47f9192594311b3d06601c695d2c7171'/>
<id>6273c03e47f9192594311b3d06601c695d2c7171</id>
<content type='text'>
commit 1ee9e2aa7b31427303466776f455d43e5e3c9275 upstream.

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

The patch doesn't fully resolve the issue because if more than half
the tx_outstanding's were UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch uses the IB_CQ_REPORT_MISSED_EVENTS on the
ib_req_notify_cq().  If the rc is above 0, the UD send cq completion
callback is called directly to re-arm the send completion timer.

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Reviewed-by: Dean Luick &lt;dean.luick@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 1ee9e2aa7b31427303466776f455d43e5e3c9275 upstream.

Commit f0dc117abdfa ("IPoIB: Fix TX queue lockup with mixed UD/CM
traffic") attempts to solve an issue where unprocessed UD send
completions can deadlock the netdev.

The patch doesn't fully resolve the issue because if more than half
the tx_outstanding's were UD and all of the destinations are RC
reachable, arming the CQ doesn't solve the issue.

This patch uses the IB_CQ_REPORT_MISSED_EVENTS on the
ib_req_notify_cq().  If the rc is above 0, the UD send cq completion
callback is called directly to re-arm the send completion timer.

This issue is seen in very large parallel filesystem deployments
and the patch has been shown to correct the issue.

Reviewed-by: Dean Luick &lt;dean.luick@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srp: Avoid having aborted requests hang</title>
<updated>2012-10-17T02:49:19+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2012-08-24T10:29:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0575df129e2eb4a801beae0e6e041787480f42b9'/>
<id>0575df129e2eb4a801beae0e6e041787480f42b9</id>
<content type='text'>
commit d8536670916a685df116b5c2cb256573fd25e4e3 upstream.

We need to call scsi_done() for commands after we abort them.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d8536670916a685df116b5c2cb256573fd25e4e3 upstream.

We need to call scsi_done() for commands after we abort them.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srp: Fix use-after-free in srp_reset_req()</title>
<updated>2012-10-17T02:49:18+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2012-08-24T10:27:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6bfc0805e76ff5a2fd0e2b2f1bee777c38b0da1c'/>
<id>6bfc0805e76ff5a2fd0e2b2f1bee777c38b0da1c</id>
<content type='text'>
commit 9b796d06d5d1b1e85ae2316a283ea11dd739ef96 upstream.

srp_free_req() uses the scsi_cmnd structure contents to unmap
buffers, so we must invoke srp_free_req() before we release
ownership of that structure.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9b796d06d5d1b1e85ae2316a283ea11dd739ef96 upstream.

srp_free_req() uses the scsi_cmnd structure contents to unmap
buffers, so we must invoke srp_free_req() before we release
ownership of that structure.

Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPoIB: Fix use-after-free of multicast object</title>
<updated>2012-10-17T02:49:17+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2012-08-30T07:01:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1ba95552a4efd2f9e779cd840ffbd159bd0e9c7f'/>
<id>1ba95552a4efd2f9e779cd840ffbd159bd0e9c7f</id>
<content type='text'>
commit bea1e22df494a729978e7f2c54f7bda328f74bc3 upstream.

Fix a crash in ipoib_mcast_join_task().  (with help from Or Gerlitz)

Commit c8c2afe360b7 ("IPoIB: Use rtnl lock/unlock when changing device
flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which
is run from the ipoib_workqueue, and hence the workqueue can't be
flushed from the context of ipoib_stop().

In the current code, ipoib_stop() (which doesn't flush the workqueue)
calls ipoib_mcast_dev_flush(), which goes and deletes all the
multicast entries.  This takes place without any synchronization with
a possible running instance of ipoib_mcast_join_task() for the same
ipoib device, leading to a crash due to NULL pointer dereference.

Fix this by making sure that the workqueue is flushed before
ipoib_mcast_dev_flush() is called.  To make that possible, we move the
RTNL-lock wrapped code to ipoib_mcast_join_finish().

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bea1e22df494a729978e7f2c54f7bda328f74bc3 upstream.

Fix a crash in ipoib_mcast_join_task().  (with help from Or Gerlitz)

Commit c8c2afe360b7 ("IPoIB: Use rtnl lock/unlock when changing device
flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which
is run from the ipoib_workqueue, and hence the workqueue can't be
flushed from the context of ipoib_stop().

In the current code, ipoib_stop() (which doesn't flush the workqueue)
calls ipoib_mcast_dev_flush(), which goes and deletes all the
multicast entries.  This takes place without any synchronization with
a possible running instance of ipoib_mcast_join_task() for the same
ipoib device, leading to a crash due to NULL pointer dereference.

Fix this by making sure that the workqueue is flushed before
ipoib_mcast_dev_flush() is called.  To make that possible, we move the
RTNL-lock wrapped code to ipoib_mcast_join_finish().

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srp: Fix a race condition</title>
<updated>2012-09-12T02:37:04+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2012-08-14T13:18:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fc7834af1b974a2578440d1210a991eedd2151b1'/>
<id>fc7834af1b974a2578440d1210a991eedd2151b1</id>
<content type='text'>
commit 220329916c72ee3d54ae7262b215a050f04a18fc upstream.

Avoid a crash caused by the scmnd-&gt;scsi_done(scmnd) call in
srp_process_rsp() being invoked with scsi_done == NULL.  This can
happen if a reply is received during or after a command abort.

Reported-by: Joseph Glanville &lt;joseph.glanville@orionvm.com.au&gt;
Reference: http://marc.info/?l=linux-rdma&amp;m=134314367801595
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 220329916c72ee3d54ae7262b215a050f04a18fc upstream.

Avoid a crash caused by the scmnd-&gt;scsi_done(scmnd) call in
srp_process_rsp() being invoked with scsi_done == NULL.  This can
happen if a reply is received during or after a command abort.

Reported-by: Joseph Glanville &lt;joseph.glanville@orionvm.com.au&gt;
Reference: http://marc.info/?l=linux-rdma&amp;m=134314367801595
Acked-by: David Dillow &lt;dillowda@ornl.gov&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/iser: Post initial receive buffers before sending the final login request</title>
<updated>2012-04-02T16:52:36+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2012-03-05T16:21:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9af865fc7d42e347f679405000fc22ecbfbf2653'/>
<id>9af865fc7d42e347f679405000fc22ecbfbf2653</id>
<content type='text'>
commit 89e984e2c2cd14f77ccb26c47726ac7f13b70ae8 upstream.

An iser target may send iscsi NO-OP PDUs as soon as it marks the iSER
iSCSI session as fully operative.  This means that there is window
where there are no posted receive buffers on the initiator side, so
it's possible for the iSER RC connection to break because of RNR NAK /
retry errors.  To fix this, rely on the flags bits in the login
request to have FFP (0x3) in the lower nibble as a marker for the
final login request, and post an initial chunk of receive buffers
before sending that login request instead of after getting the login
response.

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 89e984e2c2cd14f77ccb26c47726ac7f13b70ae8 upstream.

An iser target may send iscsi NO-OP PDUs as soon as it marks the iSER
iSCSI session as fully operative.  This means that there is window
where there are no posted receive buffers on the initiator side, so
it's possible for the iSER RC connection to break because of RNR NAK /
retry errors.  To fix this, rely on the flags bits in the login
request to have FFP (0x3) in the lower nibble as a marker for the
final login request, and post an initial chunk of receive buffers
before sending that login request instead of after getting the login
response.

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>IPoIB: Stop lying about hard_header_len and use skb-&gt;cb to stash LL addresses</title>
<updated>2012-03-01T00:31:03+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2012-02-07T14:51:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=afd87adacb5de00768b2e54f0bd851278f2e6179'/>
<id>afd87adacb5de00768b2e54f0bd851278f2e6179</id>
<content type='text'>
[ Upstream commit 936d7de3d736e0737542641269436f4b5968e9ef ]

Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb-&gt;cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 936d7de3d736e0737542641269436f4b5968e9ef ]

Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb-&gt;cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next</title>
<updated>2011-11-30T02:01:53+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2011-11-30T02:01:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a493f1a24a496711d96b91c4dc0a1bd35eb6954b'/>
<id>a493f1a24a496711d96b91c4dc0a1bd35eb6954b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>IB: Fix RCU lockdep splats</title>
<updated>2011-11-29T21:37:11+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-11-29T21:31:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=580da35a31f91a594f3090b7a2c39b85cb051a12'/>
<id>580da35a31f91a594f3090b7a2c39b85cb051a12</id>
<content type='text'>
Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France &lt;tsi@ualberta.ca&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France &lt;tsi@ualberta.ca&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Prevent hung task or softlockup processing multicast response</title>
<updated>2011-11-29T21:20:02+00:00</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@qlogic.com</email>
</author>
<published>2011-11-21T13:43:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3874397c0bdec3c21ce071711cd105165179b8eb'/>
<id>3874397c0bdec3c21ce071711cd105165179b8eb</id>
<content type='text'>
This following can occur with ipoib when processing a multicast reponse:

    BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
    Modules linked in: ...
    CPU 0:
    Modules linked in: ...
    Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
    RIP: 0010:[&lt;ffffffff814ddb27&gt;]  [&lt;ffffffff814ddb27&gt;] _spin_unlock_irqrestore+0x17/0x20
    RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
    0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
    RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
    R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
    R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
    [&lt;ffffffffa032c247&gt;] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
    [&lt;ffffffff8100bc8e&gt;] ? apic_timer_interrupt+0xe/0x20
    [&lt;ffffffff8100bc8e&gt;] ? apic_timer_interrupt+0xe/0x20
    [&lt;ffffffffa03283d4&gt;] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
    [&lt;ffffffffa03286fc&gt;] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
    [&lt;ffffffff8141e758&gt;] ? dev_hard_start_xmit+0x2c8/0x3f0
    [&lt;ffffffff81439d0a&gt;] ? sch_direct_xmit+0x15a/0x1c0
    [&lt;ffffffff81423098&gt;] ? dev_queue_xmit+0x388/0x4d0
    [&lt;ffffffffa032d6b7&gt;] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
    [&lt;ffffffffa032dab8&gt;] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
    [&lt;ffffffffa02a0946&gt;] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
    [&lt;ffffffffa015f01e&gt;] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
    [&lt;ffffffffa00f6c93&gt;] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
    [&lt;ffffffffa02a0f9b&gt;] ? join_handler+0xeb/0x200 [ib_sa]
    [&lt;ffffffffa029e4fc&gt;] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
    [&lt;ffffffffa029e79c&gt;] ? recv_handler+0x3c/0x70 [ib_sa]
    [&lt;ffffffffa01603a4&gt;] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
    [&lt;ffffffffa015fb60&gt;] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
    [&lt;ffffffff81088830&gt;] ? worker_thread+0x170/0x2a0
    [&lt;ffffffff8108e160&gt;] ? autoremove_wake_function+0x0/0x40
    [&lt;ffffffff810886c0&gt;] ? worker_thread+0x0/0x2a0
    [&lt;ffffffff8108ddf6&gt;] ? kthread+0x96/0xa0
    [&lt;ffffffff8100c1ca&gt;] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

    ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

                ah = ipoib_create_ah(dev, priv-&gt;pd, &amp;av);
                if (!ah) {
                        ipoib_warn(priv, "ib_address_create failed\n");
                } else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast-&gt;pkt_queue and eventually
end up in ipoib_mcast_send():

        if (!mcast-&gt;ah) {
                if (skb_queue_len(&amp;mcast-&gt;pkt_queue) &lt; IPOIB_MAX_MCAST_QUEUE)
                        skb_queue_tail(&amp;mcast-&gt;pkt_queue, skb);
                else {
                        ++dev-&gt;stats.tx_dropped;
                        dev_kfree_skb_any(skb);
                }

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets.  Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.

Reviewed-by: Ram Vepa &lt;ram.vepa@qlogic.com&gt;
Reviewed-by: Gary Leshner &lt;gary.leshner@qlogic.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@qlogic.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This following can occur with ipoib when processing a multicast reponse:

    BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
    Modules linked in: ...
    CPU 0:
    Modules linked in: ...
    Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
    RIP: 0010:[&lt;ffffffff814ddb27&gt;]  [&lt;ffffffff814ddb27&gt;] _spin_unlock_irqrestore+0x17/0x20
    RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
    0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
    RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
    R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
    R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
    [&lt;ffffffffa032c247&gt;] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
    [&lt;ffffffff8100bc8e&gt;] ? apic_timer_interrupt+0xe/0x20
    [&lt;ffffffff8100bc8e&gt;] ? apic_timer_interrupt+0xe/0x20
    [&lt;ffffffffa03283d4&gt;] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
    [&lt;ffffffffa03286fc&gt;] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
    [&lt;ffffffff8141e758&gt;] ? dev_hard_start_xmit+0x2c8/0x3f0
    [&lt;ffffffff81439d0a&gt;] ? sch_direct_xmit+0x15a/0x1c0
    [&lt;ffffffff81423098&gt;] ? dev_queue_xmit+0x388/0x4d0
    [&lt;ffffffffa032d6b7&gt;] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
    [&lt;ffffffffa032dab8&gt;] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
    [&lt;ffffffffa02a0946&gt;] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
    [&lt;ffffffffa015f01e&gt;] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
    [&lt;ffffffffa00f6c93&gt;] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
    [&lt;ffffffffa02a0f9b&gt;] ? join_handler+0xeb/0x200 [ib_sa]
    [&lt;ffffffffa029e4fc&gt;] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
    [&lt;ffffffffa029e79c&gt;] ? recv_handler+0x3c/0x70 [ib_sa]
    [&lt;ffffffffa01603a4&gt;] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
    [&lt;ffffffffa015fb60&gt;] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
    [&lt;ffffffff81088830&gt;] ? worker_thread+0x170/0x2a0
    [&lt;ffffffff8108e160&gt;] ? autoremove_wake_function+0x0/0x40
    [&lt;ffffffff810886c0&gt;] ? worker_thread+0x0/0x2a0
    [&lt;ffffffff8108ddf6&gt;] ? kthread+0x96/0xa0
    [&lt;ffffffff8100c1ca&gt;] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

    ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

                ah = ipoib_create_ah(dev, priv-&gt;pd, &amp;av);
                if (!ah) {
                        ipoib_warn(priv, "ib_address_create failed\n");
                } else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast-&gt;pkt_queue and eventually
end up in ipoib_mcast_send():

        if (!mcast-&gt;ah) {
                if (skb_queue_len(&amp;mcast-&gt;pkt_queue) &lt; IPOIB_MAX_MCAST_QUEUE)
                        skb_queue_tail(&amp;mcast-&gt;pkt_queue, skb);
                else {
                        ++dev-&gt;stats.tx_dropped;
                        dev_kfree_skb_any(skb);
                }

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets.  Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.

Reviewed-by: Ram Vepa &lt;ram.vepa@qlogic.com&gt;
Reviewed-by: Gary Leshner &lt;gary.leshner@qlogic.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@qlogic.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
