<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/infiniband, branch v2.6.18</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>RDMA/cma: Increase the IB CM retry count in CMA</title>
<updated>2006-09-14T20:55:30+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@mellanox.co.il</email>
</author>
<published>2006-09-13T12:01:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d5bb75999cb5733ad936ff000023221fe7a13c59'/>
<id>d5bb75999cb5733ad936ff000023221fe7a13c59</id>
<content type='text'>
3 seems like a low number of IB Communication Manager retries to set;
we see connections failing under stress, and in any case 3 just looks
like an arbitrary number.  15 is the max value allowed by the
InfiniBand spec.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Acked-by: Sean Hefty &lt;sean.hefty@intel.com&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
3 seems like a low number of IB Communication Manager retries to set;
we see connections failing under stress, and in any case 3 just looks
like an arbitrary number.  15 is the max value allowed by the
InfiniBand spec.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Acked-by: Sean Hefty &lt;sean.hefty@intel.com&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>IPoIB: Retry failed send-only multicast group joins</title>
<updated>2006-09-14T20:51:41+00:00</updated>
<author>
<name>Eli Cohen</name>
<email>eli@mellanox.co.il</email>
</author>
<published>2006-09-14T20:51:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c11bd42a7676b49d41c780f2ede3709204f8da83'/>
<id>c11bd42a7676b49d41c780f2ede3709204f8da83</id>
<content type='text'>
When a send-only multicast group join fails, mcast-&gt;query must be set
to NULL.  Otherwise, IPoIB will never retry the join and the multicast
group will never be reachable.

Signed-off-by: Eli Cohen &lt;eli@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a send-only multicast group join fails, mcast-&gt;query must be set
to NULL.  Otherwise, IPoIB will never retry the join and the multicast
group will never be reachable.

Signed-off-by: Eli Cohen &lt;eli@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/srp: Don't schedule reconnect from srp</title>
<updated>2006-09-14T20:51:40+00:00</updated>
<author>
<name>Ishai Rabinovitz</name>
<email>ishai@mellanox.co.il</email>
</author>
<published>2006-09-07T12:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=add7afc756eddd5d02fd986d19e6300b3e1a5ae8'/>
<id>add7afc756eddd5d02fd986d19e6300b3e1a5ae8</id>
<content type='text'>
If there is a problem in the connection, the SCSI mid-layer will
eventually call srp_reset_host(), which will call srp_reconnect(), so
we do not need to schedule a call to srp_reconnect_work() from
srp_completion().

Removing this prevents srp_reset_host() from failing if a reconnect
scheduled from srp_completion() is already in progress, which in turn
was causing crashes as both SCSI midlayer and srp_reconnect() were
cancelling commands.

Signed-off-by: Ishai Rabinovitz &lt;ishai@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If there is a problem in the connection, the SCSI mid-layer will
eventually call srp_reset_host(), which will call srp_reconnect(), so
we do not need to schedule a call to srp_reconnect_work() from
srp_completion().

Removing this prevents srp_reset_host() from failing if a reconnect
scheduled from srp_completion() is already in progress, which in turn
was causing crashes as both SCSI midlayer and srp_reconnect() were
cancelling commands.

Signed-off-by: Ishai Rabinovitz &lt;ishai@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mthca: Use IRQ safe locks to protect allocation bitmaps</title>
<updated>2006-09-01T00:25:56+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>rolandd@cisco.com</email>
</author>
<published>2006-08-31T23:43:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5a4e6dccbc0cd1b726820b782daebf887dcb95e9'/>
<id>5a4e6dccbc0cd1b726820b782daebf887dcb95e9</id>
<content type='text'>
It is supposed to be OK to call mthca_create_ah() and mthca_destroy_ah()
from any context.  However, for mem-full HCAs, these functions use the
mthca_alloc() and mthca_free() bitmap helpers, and those helpers use
non-IRQ-safe spin_lock() internally.  Lockdep correctly warns that
this could lead to a deadlock.  Fix this by changing mthca_alloc() and
mthca_free() to use spin_lock_irqsave().

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is supposed to be OK to call mthca_create_ah() and mthca_destroy_ah()
from any context.  However, for mem-full HCAs, these functions use the
mthca_alloc() and mthca_free() bitmap helpers, and those helpers use
non-IRQ-safe spin_lock() internally.  Lockdep correctly warns that
this could lead to a deadlock.  Fix this by changing mthca_alloc() and
mthca_free() to use spin_lock_irqsave().

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6</title>
<updated>2006-08-26T20:04:23+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@suse.de</email>
</author>
<published>2006-08-26T20:04:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f834c755423332a6ff4397fae754029a6a7a8249'/>
<id>f834c755423332a6ff4397fae754029a6a7a8249</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mthca: Update HCA firmware revisions</title>
<updated>2006-08-23T20:29:08+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@mellanox.co.il</email>
</author>
<published>2006-08-22T19:45:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=834ac73d4bc804db8ccb3f2a517e36db5f6bc4bd'/>
<id>834ac73d4bc804db8ccb3f2a517e36db5f6bc4bd</id>
<content type='text'>
Update the driver's list of HCA firmware revisions to make sure people
running Sinai firmware older than 1.1.0 get a message suggesting a
firmware upgrade.  Update the Arbel versions as well while we are at it.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update the driver's list of HCA firmware revisions to make sure people
running Sinai firmware older than 1.1.0 get a message suggesting a
firmware upgrade.  Update the Arbel versions as well while we are at it.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mthca: No userspace SRQs if HCA doesn't have SRQ support</title>
<updated>2006-08-18T17:41:46+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>rolandd@cisco.com</email>
</author>
<published>2006-08-18T17:41:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5beba53230351b2d77c317c22e66c415f2ebaf02'/>
<id>5beba53230351b2d77c317c22e66c415f2ebaf02</id>
<content type='text'>
Leave all SRQ methods out of the device's uverbs_cmd_mask if the
device doesn't have SRQ support (because of ancient firmware) so that
we don't allow userspace to call the driver's create_srq method.  This
fixes a userspace-triggerable oops caused by ib_uverbs_create_srq()
following the device's -&gt;create_srq function pointer, which will be
NULL if the device doesn't support SRQs.

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Leave all SRQ methods out of the device's uverbs_cmd_mask if the
device doesn't have SRQ support (because of ancient firmware) so that
we don't allow userspace to call the driver's create_srq method.  This
fixes a userspace-triggerable oops caused by ib_uverbs_create_srq()
following the device's -&gt;create_srq function pointer, which will be
NULL if the device doesn't support SRQs.

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/core: Fix SM LID/LID change with client reregister set</title>
<updated>2006-08-16T16:54:47+00:00</updated>
<author>
<name>Jack Morgenstein</name>
<email>jackm@mellanox.co.il</email>
</author>
<published>2006-08-15T14:20:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=acaea9ee460d0ba5a14f0066ba26cfa43dd5fdf3'/>
<id>acaea9ee460d0ba5a14f0066ba26cfa43dd5fdf3</id>
<content type='text'>
After commit 12bbb2b7be7f5564952ebe0196623e97464b8ac5, when SM LID
change or LID change MAD also has a client reregistration bit set,
only CLIENT_REREGISTER event is generated.

As a result, the sa_query module and the cache module don't update the
port information, and ULPs (e.g. IPoIB) stop working.  This is the
regression we observe as compared to 2.6.17.

Rather than generate multiple events (which would have negative
performance impact), let us simply let cache and SA query respond to
reregister event in the same way as to LID and SM change events.

Signed-off-by: Jack Morgenstein &lt;jackm@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After commit 12bbb2b7be7f5564952ebe0196623e97464b8ac5, when SM LID
change or LID change MAD also has a client reregistration bit set,
only CLIENT_REREGISTER event is generated.

As a result, the sa_query module and the cache module don't update the
port information, and ULPs (e.g. IPoIB) stop working.  This is the
regression we observe as compared to 2.6.17.

Rather than generate multiple events (which would have negative
performance impact), let us simply let cache and SA query respond to
reregister event in the same way as to LID and SM change events.

Signed-off-by: Jack Morgenstein &lt;jackm@mellanox.co.il&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mthca: Fix potential AB-BA deadlock with CQ locks</title>
<updated>2006-08-11T15:56:57+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>rolandd@cisco.com</email>
</author>
<published>2006-08-11T15:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a19aa5c5fdda8b556ab238177ee27c5ef7873c94'/>
<id>a19aa5c5fdda8b556ab238177ee27c5ef7873c94</id>
<content type='text'>
When destroying a QP, mthca locks both the QP's send CQ and receive
CQ.  However, the following scenario is perfectly valid:

    QP_a: send_cq == CQ_x, recv_cq == CQ_y
    QP_b: send_cq == CQ_y, recv_cq == CQ_x

The old mthca code simply locked send_cq and then recv_cq, which in
this case could lead to an AB-BA deadlock if QP_a and QP_b were
destroyed simultaneously.

We can fix this by changing the locking code to lock the CQ with the
lower CQ number first, which will create a consistent lock ordering.
Also, the second CQ is locked with spin_lock_nested() to tell lockdep
that we know what we're doing with the lock nesting.

This bug was found by lockdep.

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When destroying a QP, mthca locks both the QP's send CQ and receive
CQ.  However, the following scenario is perfectly valid:

    QP_a: send_cq == CQ_x, recv_cq == CQ_y
    QP_b: send_cq == CQ_y, recv_cq == CQ_x

The old mthca code simply locked send_cq and then recv_cq, which in
this case could lead to an AB-BA deadlock if QP_a and QP_b were
destroyed simultaneously.

We can fix this by changing the locking code to lock the CQ with the
lower CQ number first, which will create a consistent lock ordering.
Also, the second CQ is locked with spin_lock_nested() to tell lockdep
that we know what we're doing with the lock nesting.

This bug was found by lockdep.

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mthca: Make fence flag work for send work requests</title>
<updated>2006-08-10T17:50:52+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@mellanox.co.il</email>
</author>
<published>2006-08-10T17:46:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e54b82d739d4a2ef992976c8c0692cdf89286420'/>
<id>e54b82d739d4a2ef992976c8c0692cdf89286420</id>
<content type='text'>
The fence bit needs to be set in the doorbell too, not just the WQE.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The fence bit needs to be set in the doorbell too, not just the WQE.

Signed-off-by: Michael S. Tsirkin &lt;mst@mellanox.co.il&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
