linux-toradex.git/drivers/infiniband/ulp, branch v4.4.73

IB/IPoIB: ibX: failed to create mcg debug file

2017-05-20T12:27:00+00:00

commit 771a52584096c45e4565e8aabb596eece9d73d61 upstream.

When udev renames the netdev devices, the ipoib debugfs entries do not
get renamed. As a result, if a subsequent probe of an ipoib device reuses
the name, creating a debugfs entry for the new device fails.

Also move ipoib_create_debug_files and ipoib_delete_debug_files into the
ipoib event handling in order to avoid any race condition between them.
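
The name collision can be illustrated with a small userspace model. This is
only a sketch: toy_debugfs, toy_create, toy_remove and toy_on_rename are
hypothetical names standing in for a debugfs directory whose entry names must
be unique, and it assumes a rename is handled by dropping the stale entry and
creating a new one. The upstream patch may structure this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8
#define NAME_LEN    32

/* Toy stand-in for a debugfs directory: entry names must be unique. */
struct toy_debugfs {
	char names[MAX_ENTRIES][NAME_LEN];
	bool used[MAX_ENTRIES];
};

/* Return the slot index holding @name, or -1 if there is none. */
static int toy_find(const struct toy_debugfs *fs, const char *name)
{
	for (int i = 0; i < MAX_ENTRIES; i++)
		if (fs->used[i] && strcmp(fs->names[i], name) == 0)
			return i;
	return -1;
}

/* Create an entry; fails if the name already exists or the table is full. */
static bool toy_create(struct toy_debugfs *fs, const char *name)
{
	assert(strlen(name) < NAME_LEN); /* limit of this toy model */
	if (toy_find(fs, name) >= 0)
		return false;
	for (int i = 0; i < MAX_ENTRIES; i++) {
		if (!fs->used[i]) {
			snprintf(fs->names[i], NAME_LEN, "%s", name);
			fs->used[i] = true;
			return true;
		}
	}
	return false;
}

/* Remove an entry if it exists; removing a missing entry is a no-op. */
static void toy_remove(struct toy_debugfs *fs, const char *name)
{
	int i = toy_find(fs, name);

	if (i >= 0)
		fs->used[i] = false;
}

/* Hypothetical rename handler: drop the stale entry, create the new one. */
static bool toy_on_rename(struct toy_debugfs *fs, const char *old_name,
			  const char *new_name)
{
	toy_remove(fs, old_name);
	return toy_create(fs, new_name);
}
```

In this model, creating "ib0_mcg" a second time fails while the stale entry
exists, and succeeds once toy_on_rename() has moved the first device's entry
to its new name. The entry names are illustrative.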

Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs)
Signed-off-by: Vijay Kumar 
Signed-off-by: Shamir Rabinovitch 
Reviewed-by: Mark Bloch 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/srp: Fix race conditions related to task management

2017-03-15T01:57:13+00:00

commit 0a6fdbdeb1c25e31763c1fb333fa2723a7d2aba6 upstream.

Prevent srp_process_rsp() from overwriting the status information
in ch if the SRP target response timed out and processing of
another task management function has already started. Also prevent
list corruption when multiple task management functions are issued
concurrently. This patch prevents the following stack trace from
appearing in the system log:

WARNING: CPU: 8 PID: 9269 at lib/list_debug.c:52 __list_del_entry_valid+0xbc/0xc0
list_del corruption. prev->next should be ffffc90004bb7b00, but was ffff8804052ecc68
CPU: 8 PID: 9269 Comm: sg_reset Tainted: G        W       4.10.0-rc7-dbg+ #3
Call Trace:
 dump_stack+0x68/0x93
 __warn+0xc6/0xe0
 warn_slowpath_fmt+0x4a/0x50
 __list_del_entry_valid+0xbc/0xc0
 wait_for_completion_timeout+0x12e/0x170
 srp_send_tsk_mgmt+0x1ef/0x2d0 [ib_srp]
 srp_reset_device+0x5b/0x110 [ib_srp]
 scsi_ioctl_reset+0x1c7/0x290
 scsi_ioctl+0x12a/0x420
 sd_ioctl+0x9d/0x100
 blkdev_ioctl+0x51e/0x9f0
 block_ioctl+0x38/0x40
 do_vfs_ioctl+0x8f/0x700
 SyS_ioctl+0x3c/0x70
 entry_SYSCALL_64_fastpath+0x18/0xad
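
The late-response half of the problem can be sketched in userspace C. This is
a simplified model under stated assumptions, not the ib_srp code: struct
toy_ch and its fields are hypothetical, and the sketch assumes that each task
management request carries a tag and that a response is accepted only when its
tag matches the request currently being waited for. Concurrency is not
modelled; serializing callers (for example with a mutex around send-and-wait)
would be a separate measure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy channel state; field names are illustrative only. */
struct toy_ch {
	uint64_t tsk_mgmt_tag;    /* tag of the request currently awaited */
	int      tsk_mgmt_status; /* status of that request, -1 if none yet */
	bool     tsk_mgmt_done;   /* set once a matching response arrived */
};

/* Start a new task management request and return its tag. */
static uint64_t toy_send_tsk_mgmt(struct toy_ch *ch)
{
	assert(ch != NULL);
	ch->tsk_mgmt_tag++;
	ch->tsk_mgmt_status = -1;
	ch->tsk_mgmt_done = false;
	return ch->tsk_mgmt_tag;
}

/*
 * Handle a task management response. A response whose tag does not match
 * the current request is late: it is ignored so that it cannot overwrite
 * the status of the request that has since been started.
 */
static bool toy_process_tsk_mgmt_rsp(struct toy_ch *ch, uint64_t tag,
				     int status)
{
	assert(ch != NULL);
	if (tag != ch->tsk_mgmt_tag)
		return false;
	ch->tsk_mgmt_status = status;
	ch->tsk_mgmt_done = true;
	return true;
}
```

If a first request times out and a second one is started, a response carrying
the first tag is dropped and the second request's status stays untouched until
its own response arrives.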

Signed-off-by: Bart Van Assche 
Cc: Israel Rukshin 
Cc: Max Gurtovoy 
Cc: Laurence Oberman 
Cc: Steve Feeley 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/srp: Avoid that duplicate responses trigger a kernel bug

2017-03-15T01:57:13+00:00

commit 6cb72bc1b40bb2c1750ee7a5ebade93bed49a5fb upstream.

After srp_process_rsp() returns there is a short time during which
the scsi_host_find_tag() call will return a pointer to the SCSI
command that is being completed. If a duplicate response is received
during that time, prevent the following call stack from appearing:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: srp_recv_done+0x450/0x6b0 [ib_srp]
Oops: 0000 [#1] SMP
CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.10.0-rc7-dbg+ #1
Call Trace:
 
 __ib_process_cq+0x4b/0xd0 [ib_core]
 ib_poll_handler+0x1d/0x70 [ib_core]
 irq_poll_softirq+0xba/0x120
 __do_softirq+0xba/0x4c0
 irq_exit+0xbe/0xd0
 smp_apic_timer_interrupt+0x38/0x50
 apic_timer_interrupt+0x90/0xa0
 
RIP: srp_recv_done+0x450/0x6b0 [ib_srp] RSP: ffff88046f483e20
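
One way to make a duplicate response harmless is to "claim" the request
exactly once. The following userspace sketch shows the idea under stated
assumptions: toy_req_table, toy_claim_req and toy_handle_rsp are hypothetical
names, and a per-tag slot stands in for whatever state the driver uses to tell
an outstanding command from one that is already being completed. It is not
the ib_srp implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_TAGS 4

struct toy_cmd {
	int result;
};

/* One slot per tag; non-NULL only while a command is outstanding. */
struct toy_req_table {
	struct toy_cmd *cmd[TOY_NUM_TAGS];
};

/* Return the outstanding command for @tag and clear the slot, or NULL. */
static struct toy_cmd *toy_claim_req(struct toy_req_table *t, unsigned int tag)
{
	struct toy_cmd *cmd;

	assert(t != NULL);
	if (tag >= TOY_NUM_TAGS)
		return NULL;
	cmd = t->cmd[tag];
	t->cmd[tag] = NULL; /* a second claim for this tag now yields NULL */
	return cmd;
}

/* Returns 1 if the response completed a command, 0 if it was dropped. */
static int toy_handle_rsp(struct toy_req_table *t, unsigned int tag, int result)
{
	struct toy_cmd *cmd = toy_claim_req(t, tag);

	if (!cmd)
		return 0; /* duplicate or unknown tag: nothing to dereference */
	cmd->result = result;
	return 1;
}
```

The first response for a tag completes the command; a duplicate for the same
tag finds the slot empty and is dropped instead of dereferencing a NULL
pointer.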

Signed-off-by: Bart Van Assche 
Cc: Israel Rukshin 
Cc: Max Gurtovoy 
Cc: Laurence Oberman 
Cc: Steve Feeley 
Reviewed-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/IPoIB: Add destination address when re-queue packet

2017-03-15T01:57:13+00:00

commit 2b0841766a898aba84630fb723989a77a9d3b4e6 upstream.

When sending a packet to a destination that has not yet been resolved
via path query, the driver keeps the skb and tries to re-send it
when the path is resolved.

But when re-sending via dev_queue_xmit the kernel doesn't call
dev_hard_header, so IPoIB needs to keep 20 bytes in the skb
and put the destination address inside them.

That way dev_start_xmit will have the correct destination,
and the driver won't take the destination from skb->data when
nothing exists there, which causes the packet to be dropped.

The test flow is:
1. Run the SM on a remote node.
2. Restart the driver.
3. Ping some destination.
4. Observe that the first ICMP request is dropped.
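
The idea of reserving 20 bytes in front of the payload and storing the
destination address there can be sketched with a toy packet buffer. This is a
userspace model, not skb code: struct toy_pkt, toy_push_daddr and
toy_pull_daddr are hypothetical names that mimic pushing and pulling a header
at the front of a buffer with headroom.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ADDR_LEN 20  /* the 20 bytes kept for the destination address */
#define TOY_BUF_LEN  128

struct toy_pkt {
	unsigned char buf[TOY_BUF_LEN];
	size_t data; /* offset of the first valid byte */
	size_t len;  /* number of valid bytes starting at data */
};

/* Store the payload, leaving TOY_ADDR_LEN bytes of headroom in front. */
static void toy_pkt_init(struct toy_pkt *p, const void *payload, size_t n)
{
	assert(n + TOY_ADDR_LEN <= TOY_BUF_LEN);
	p->data = TOY_ADDR_LEN;
	p->len = n;
	memcpy(p->buf + p->data, payload, n);
}

/* Before re-queueing: prepend the destination address to the packet. */
static void toy_push_daddr(struct toy_pkt *p, const unsigned char *daddr)
{
	assert(p->data >= TOY_ADDR_LEN);
	p->data -= TOY_ADDR_LEN;
	p->len += TOY_ADDR_LEN;
	memcpy(p->buf + p->data, daddr, TOY_ADDR_LEN);
}

/* On transmit: read the destination address back and strip it. */
static void toy_pull_daddr(struct toy_pkt *p, unsigned char *daddr_out)
{
	assert(p->len >= TOY_ADDR_LEN);
	memcpy(daddr_out, p->buf + p->data, TOY_ADDR_LEN);
	p->data += TOY_ADDR_LEN;
	p->len -= TOY_ADDR_LEN;
}
```

With the address pushed before the packet is re-queued, the transmit side
always finds a valid destination in the first 20 bytes, and the payload is
unchanged after the pull.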

Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header")
Signed-off-by: Erez Shitrit 
Signed-off-by: Noa Osherovich 
Signed-off-by: Leon Romanovsky 
Tested-by: Yuval Shaia 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/ipoib: Fix deadlock between rmmod and set_mode

2017-03-15T01:57:13+00:00

commit 0a0007f28304cb9fc87809c86abb80ec71317f20 upstream.

When calling set_mode from sysfs, the call flow takes the sysfs lock
first and then tries to take rtnl_lock (when calling ipoib_set_mode).
On the other hand, the rmmod call flow takes the rtnl_lock first
(when calling unregister_netdev) and then tries to take the sysfs
lock. Deadlock a->b, b->a.

The problem starts when ipoib_set_mode releases its rtnl_lock and tries
to take it again afterwards.

    set_mode:
    ? check_preempt_curr+0x6d/0x90
    __mutex_lock_slowpath+0x13e/0x180
    ? __rtnl_unlock+0x15/0x20
    mutex_lock+0x2b/0x50
    rtnl_lock+0x15/0x20
    ipoib_set_mode+0x97/0x160 [ib_ipoib]
    set_mode+0x3b/0x80 [ib_ipoib]
    dev_attr_store+0x20/0x30
    sysfs_write_file+0xe5/0x170
    vfs_write+0xb8/0x1a0
    sys_write+0x51/0x90
    system_call_fastpath+0x16/0x1b

    rmmod:
    ? put_dec+0x10c/0x110
    ? number+0x2ee/0x320
    schedule_timeout+0x215/0x2e0
    ? vsnprintf+0x484/0x5f0
    ? string+0x40/0x100
    wait_for_common+0x123/0x180
    ? default_wake_function+0x0/0x20
    ? ifind_fast+0x5e/0xb0
    wait_for_completion+0x1d/0x20
    sysfs_addrm_finish+0x228/0x270
    sysfs_remove_dir+0xa3/0xf0
    kobject_del+0x16/0x40
    device_del+0x184/0x1e0
    netdev_unregister_kobject+0xab/0xc0
    rollback_registered+0xae/0x130
    unregister_netdevice+0x22/0x70
    unregister_netdev+0x1e/0x30
    ipoib_remove_one+0xe0/0x120 [ib_ipoib]
    ib_unregister_device+0x4f/0x100 [ib_core]
    mlx4_ib_remove+0x41/0x180 [mlx4_ib]
    mlx4_remove_device+0x71/0x90 [mlx4_core]
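
A common way to break an a->b, b->a ordering is for one side to never block on
the second lock: it tries to take it and backs off if that fails. The sketch
below models that in single-threaded userspace C. It is an assumption about
the general technique, not a description of the upstream change; struct
toy_lock, toy_trylock and toy_set_mode are hypothetical names, and the toy
lock is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal single-threaded lock model, for illustration only. */
struct toy_lock {
	bool held;
};

/* Take the lock if it is free; never blocks. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held); /* unlocking a free lock would be a bug */
	l->held = false;
}

enum toy_store_result {
	TOY_STORE_DONE,
	TOY_STORE_RETRY,
};

/*
 * Stand-in for a sysfs store handler. The caller is assumed to hold the
 * sysfs-side lock already, so blocking on rtnl here could deadlock against
 * a path that holds rtnl and waits for sysfs. Back off instead.
 */
static enum toy_store_result toy_set_mode(struct toy_lock *rtnl, int *mode,
					  int new_mode)
{
	if (!toy_trylock(rtnl))
		return TOY_STORE_RETRY; /* caller retries the whole operation */
	*mode = new_mode;
	toy_unlock(rtnl);
	return TOY_STORE_DONE;
}
```

While the removal path holds the rtnl stand-in, toy_set_mode() returns
TOY_STORE_RETRY without changing the mode; once the lock is free, the same
call succeeds.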

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Cc: Or Gerlitz 
Signed-off-by: Feras Daoud 
Signed-off-by: Erez Shitrit 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/ipoib: move back IB LL address into the hard header

2017-02-01T07:30:53+00:00

commit fc791b6335152c5278dc4a4991bcb2d329f806f9 upstream.

After the commit 9207f9d45b0a ("net: preserve IP control block
during GSO segmentation"), the GSO CB and the IPoIB CB conflict.
That destroys the IPoIB address information cached there,
causing a severe performance regression, as better described here:

http://marc.info/?l=linux-kernel&m=146787279825501&w=2

This change moves the data cached by the IPoIB driver from the
skb control block into the IPoIB hard header, as done before
the commit 936d7de3d736 ("IPoIB: Stop lying about hard_header_len
and use skb->cb to stash LL addresses").
In order to avoid GRO issues, on packet reception the IPoIB driver
stashes a dummy pseudo header into the skb, so that the received
packets actually have a hard header matching the declared length.
To avoid changing the connected mode maximum mtu, the allocated
head buffer size is increased by the pseudo header length.

After this commit, IPoIB performance is back to its pre-regression
value.

v2 -> v3: rebased
v1 -> v2: avoid changing the max mtu, increasing the head buf size

Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation")
Signed-off-by: Paolo Abeni 
Signed-off-by: David S. Miller 
Cc: Vasiliy Tolstov 
Cc: Nikolay Borisov 
Cc: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/IPoIB: Remove can't use GFP_NOIO warning

2017-01-26T07:23:47+00:00

commit 0b59970e7d96edcb3c7f651d9d48e1a59af3c3b0 upstream.

Remove the warning print of "can't use of GFP_NOIO" to avoid a print on
each QP creation when devices don't support IB_QP_CREATE_USE_GFP_NOIO.

This print becomes more annoying when the IPoIB interface is configured
to work in connected mode.

Fixes: 09b93088d750 ('IB: Add a QP creation flag to use GFP_NOIO allocations')
Signed-off-by: Kamal Heib 
Signed-off-by: Leon Romanovsky 
Reviewed-by: Yuval Shaia 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IPoIB: Avoid reading an uninitialized member variable

2017-01-09T07:07:51+00:00

commit 11b642b84e8c43e8597de031678d15c08dd057bc upstream.

This patch prevents Coverity from reporting the following:

    Using uninitialized value port_attr.state when calling printk

Fixes: commit 94232d9ce817 ("IPoIB: Start multicast join process only on active ports")
Signed-off-by: Bart Van Assche 
Cc: Erez Shitrit 
Reviewed-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/ipoib: Don't allow MC joins during light MC flush

2016-10-07T13:23:46+00:00

commit 344bacca8cd811809fc33a249f2738ab757d327f upstream.

This fix solves a race between light flush and on-the-fly joins.
Light flush doesn't set the device to down and doesn't unset the
IPOIB_OPER_UP flag. This means that if an MC join is in progress while
flushing and the QP was attached to the BC MGID, we can have a mismatch
when re-attaching a QP to the BC MGID.

The light flush would set the broadcast group to NULL, causing an
on-the-fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411]  0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960]  0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547]  ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015]  dump_stack+0x63/0x8c
[18332.801831]  __warn+0xd1/0xf0
[18332.805403]  warn_slowpath_null+0x1d/0x20
[18332.809706]  ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384]  ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031]  ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220]  ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290]  ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911]  rollback_registered_many+0x1aa/0x2c0
[18332.839741]  rollback_registered+0x31/0x40
[18332.844091]  unregister_netdevice_queue+0x48/0x80
[18332.848880]  ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848]  delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474]  dev_attr_store+0x18/0x30
[18332.862510]  sysfs_kf_write+0x3a/0x50
[18332.866349]  kernfs_fop_write+0x120/0x170
[18332.870471]  __vfs_write+0x28/0xe0
[18332.874152]  ? percpu_down_read+0x1f/0x50
[18332.878274]  vfs_write+0xa2/0x1a0
[18332.881896]  SyS_write+0x46/0xa0
[18332.885632]  do_syscall_64+0x57/0xb0
[18332.889709]  entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman

IB/ipoib: Fix memory corruption in ipoib cm mode connect flow

2016-10-07T13:23:46+00:00

commit 546481c2816ea3c061ee9d5658eb48070f69212e upstream.

When a new CM connection is being requested, the ipoib driver copies data
from the path pointer in the CM/tx object. The path object might be
invalid at that point, and memory corruption will happen later when
the CM driver tries to use that data.

The following scenario demonstrates it:
	neigh_add_path --> ipoib_cm_create_tx -->
	queue_work (pointer to path is in the cm/tx struct)
	#while the work is still in the queue,
	#the port goes down and causes the ipoib_flush_paths:
	ipoib_flush_paths --> path_free --> kfree(path)
	#at this point the work scheduled starts.
	ipoib_cm_tx_start --> copy from the (invalid)path pointer:
	(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
	 -> memory corruption.

To fix that, the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected with the relevant locks, and uses the gid from
the neigh member in the CM/tx object which is valid according to the ref
count that was taken by the CM/tx.
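
The lookup-before-use approach can be sketched in userspace C. This is a
simplified model, not the driver code: toy_path_db, toy_path_find and
toy_cm_tx_start are hypothetical names, the tx object keeps the destination
gid by value in place of the neigh reference described above, and the locking
mentioned above is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_GID_LEN   16
#define TOY_MAX_PATHS 4

struct toy_pathrec {
	unsigned char dgid[TOY_GID_LEN];
	int mtu;
};

/* Stand-in for the general paths database. */
struct toy_path_db {
	struct toy_pathrec rec[TOY_MAX_PATHS];
	bool valid[TOY_MAX_PATHS];
};

/* The tx object remembers the gid, not a pointer to a path. */
struct toy_cm_tx {
	unsigned char dgid[TOY_GID_LEN];
};

/* Return the path record for @gid if it is still in the database. */
static const struct toy_pathrec *toy_path_find(const struct toy_path_db *db,
					       const unsigned char *gid)
{
	for (size_t i = 0; i < TOY_MAX_PATHS; i++)
		if (db->valid[i] &&
		    memcmp(db->rec[i].dgid, gid, TOY_GID_LEN) == 0)
			return &db->rec[i];
	return NULL;
}

/* Models a flush that invalidates every path. */
static void toy_flush_paths(struct toy_path_db *db)
{
	memset(db->valid, 0, sizeof(db->valid));
}

/* Start the connection only if the path still exists; copy, never cache. */
static bool toy_cm_tx_start(const struct toy_path_db *db,
			    const struct toy_cm_tx *tx,
			    struct toy_pathrec *out)
{
	const struct toy_pathrec *rec;

	assert(out != NULL);
	rec = toy_path_find(db, tx->dgid);
	if (!rec)
		return false; /* path was flushed while the work was queued */
	*out = *rec;
	return true;
}
```

If the paths are flushed between queueing the work and running it,
toy_cm_tx_start() reports failure instead of copying from freed memory.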

Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman