<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/infiniband, branch v4.20</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>IB/core: Fix oops in netdev_next_upper_dev_rcu()</title>
<updated>2018-12-12T17:14:49+00:00</updated>
<author>
<name>Mark Zhang</name>
<email>markz@mellanox.com</email>
</author>
<published>2018-12-05T13:50:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=37fbd834b4e492dc41743830cbe435f35120abd8'/>
<id>37fbd834b4e492dc41743830cbe435f35120abd8</id>
<content type='text'>
When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond.  If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released).  We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().

Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
Reviewed-by: Parav Pandit &lt;parav@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond.  If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released).  We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().

Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
Reviewed-by: Parav Pandit &lt;parav@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Block DEVX umem from the non applicable cases</title>
<updated>2018-12-06T16:38:14+00:00</updated>
<author>
<name>Yishai Hadas</name>
<email>yishaih@mellanox.com</email>
</author>
<published>2018-12-05T13:50:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=47f07f03b5ee436fe074c4fb1fb28d013c36a0d8'/>
<id>47f07f03b5ee436fe074c4fb1fb28d013c36a0d8</id>
<content type='text'>
Blocks creating a DEVX UMEM with the non applicable access flags
as of ODP, MW_BIND, etc.

Specifically when an ODP flag is used below WARN call trace is issued.

[ 2510.404131] RIP: 0010:__mlx5_ib_populate_pas+0x207/0x220 [mlx5_ib]
...
[ 2510.404143] Call Trace:
[ 2510.404150]  ? __kmalloc_node+0x1b3/0x280
[ 2510.404156]  ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404158]  ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404162]  mlx5_ib_populate_pas+0x53/0x60 [mlx5_ib]
[ 2510.404167]  mlx5_ib_handler_MLX5_IB_METHOD_DEVX_UMEM_REG+0x273/0x3f0 [mlx5_ib]

Fixes: aeae94579caf ("IB/mlx5: Add DEVX support for memory registration")
Signed-off-by: Yishai Hadas &lt;yishaih@mellanox.com&gt;
Reviewed-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Blocks creating a DEVX UMEM with the non applicable access flags
as of ODP, MW_BIND, etc.

Specifically when an ODP flag is used below WARN call trace is issued.

[ 2510.404131] RIP: 0010:__mlx5_ib_populate_pas+0x207/0x220 [mlx5_ib]
...
[ 2510.404143] Call Trace:
[ 2510.404150]  ? __kmalloc_node+0x1b3/0x280
[ 2510.404156]  ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404158]  ? _uverbs_alloc+0x63/0x90 [ib_uverbs]
[ 2510.404162]  mlx5_ib_populate_pas+0x53/0x60 [mlx5_ib]
[ 2510.404167]  mlx5_ib_handler_MLX5_IB_METHOD_DEVX_UMEM_REG+0x273/0x3f0 [mlx5_ib]

Fixes: aeae94579caf ("IB/mlx5: Add DEVX support for memory registration")
Signed-off-by: Yishai Hadas &lt;yishaih@mellanox.com&gt;
Reviewed-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Fix implicit ODP interrupted page fault</title>
<updated>2018-12-03T21:52:53+00:00</updated>
<author>
<name>Artemy Kovalyov</name>
<email>artemyko@mellanox.com</email>
</author>
<published>2018-11-27T06:51:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=37b06e5078975bb4efe3cbd91e254112851b125f'/>
<id>37b06e5078975bb4efe3cbd91e254112851b125f</id>
<content type='text'>
Since any page fault may be interrupted by a MMU invalidation and implicit
leaf MR may be released during this process. The check for parent value
is unreliable condition for an implicit MR.
Use other condition that we can rely on to determine if MR is implicit.

Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since any page fault may be interrupted by a MMU invalidation and implicit
leaf MR may be released during this process. The check for parent value
is unreliable condition for an implicit MR.
Use other condition that we can rely on to determine if MR is implicit.

Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Fix an out-of-bounds access in get_hw_stats</title>
<updated>2018-12-03T21:05:19+00:00</updated>
<author>
<name>Piotr Stankiewicz</name>
<email>piotr.stankiewicz@intel.com</email>
</author>
<published>2018-11-28T14:44:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36d842194a57f1b21fbc6a6875f2fa2f9a7f8679'/>
<id>36d842194a57f1b21fbc6a6875f2fa2f9a7f8679</id>
<content type='text'>
When running with KASAN, the following trace is produced:

[   62.535888]

==================================================================
[   62.544930] BUG: KASAN: slab-out-of-bounds in
gut_hw_stats+0x122/0x230 [hfi1]
[   62.553856] Write of size 8 at addr ffff88080e8d6330 by task
kworker/0:1/14

[   62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
4.19.0-test-build-kasan+ #8
[   62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
[   62.587951] Workqueue: events work_for_cpu_fn
[   62.594050] Call Trace:
[   62.598023]  dump_stack+0xc6/0x14c
[   62.603089]  ? dump_stack_print_info.cold.1+0x2f/0x2f
[   62.610041]  ? kmsg_dump_rewind_nolock+0x59/0x59
[   62.616615]  ? get_hw_stats+0x122/0x230 [hfi1]
[   62.622985]  print_address_description+0x6c/0x23c
[   62.629744]  ? get_hw_stats+0x122/0x230 [hfi1]
[   62.636108]  kasan_report.cold.6+0x241/0x308
[   62.642365]  get_hw_stats+0x122/0x230 [hfi1]
[   62.648703]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[   62.655088]  ? __kmalloc+0x110/0x240
[   62.660695]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[   62.667142]  setup_hw_stats+0xd8/0x430 [ib_core]
[   62.673972]  ? show_hfi+0x50/0x50 [hfi1]
[   62.680026]  ib_device_register_sysfs+0x165/0x180 [ib_core]
[   62.687995]  ib_register_device+0x5a2/0xa10 [ib_core]
[   62.695340]  ? show_hfi+0x50/0x50 [hfi1]
[   62.701421]  ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
[   62.709222]  ? __vmalloc_node_range+0x2d0/0x380
[   62.716131]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[   62.723735]  ? vmalloc_node+0x5c/0x70
[   62.729697]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[   62.737347]  ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
[   62.744998]  ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
[   62.752315]  ? rvt_rc_error+0x140/0x140 [rdmavt]
[   62.759434]  ? rvt_vma_open+0x30/0x30 [rdmavt]
[   62.766364]  ? mutex_unlock+0x1d/0x40
[   62.772445]  ? kmem_cache_create_usercopy+0x15d/0x230
[   62.780115]  rvt_register_device+0x1f6/0x360 [rdmavt]
[   62.787823]  ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
[   62.796058]  ? __get_txreq+0x400/0x400 [hfi1]
[   62.802969]  ? memcpy+0x34/0x50
[   62.808611]  hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
[   62.816601]  ? hfi1_get_npkeys+0x10/0x10 [hfi1]
[   62.823760]  ? hfi1_init+0x89f/0x9a0 [hfi1]
[   62.830469]  ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
[   62.838204]  ? pcie_capability_clear_and_set_word+0xcd/0xe0
[   62.846429]  ? pcie_capability_read_word+0xd0/0xd0
[   62.853791]  ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
[   62.860958]  init_one+0x67f/0xae0 [hfi1]
[   62.867301]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
[   62.873876]  ? wait_woken+0x130/0x130
[   62.879860]  ? read_word_at_a_time+0xe/0x20
[   62.886329]  ? strscpy+0x14b/0x280
[   62.891998]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
[   62.898405]  local_pci_probe+0x70/0xd0
[   62.904295]  ? pci_device_shutdown+0x90/0x90
[   62.910833]  work_for_cpu_fn+0x29/0x40
[   62.916750]  process_one_work+0x584/0x960
[   62.922974]  ? rcu_work_rcufn+0x40/0x40
[   62.928991]  ? __schedule+0x396/0xdc0
[   62.934806]  ? __sched_text_start+0x8/0x8
[   62.941020]  ? pick_next_task_fair+0x68b/0xc60
[   62.947674]  ? run_rebalance_domains+0x260/0x260
[   62.954471]  ? __list_add_valid+0x29/0xa0
[   62.960607]  ? move_linked_works+0x1c7/0x230
[   62.967077]  ?
trace_event_raw_event_workqueue_execute_start+0x140/0x140
[   62.976248]  ? mutex_lock+0xa6/0x100
[   62.982029]  ? __mutex_lock_slowpath+0x10/0x10
[   62.988795]  ? __switch_to+0x37a/0x710
[   62.994731]  worker_thread+0x62e/0x9d0
[   63.000602]  ? max_active_store+0xf0/0xf0
[   63.006828]  ? __switch_to_asm+0x40/0x70
[   63.012932]  ? __switch_to_asm+0x34/0x70
[   63.019013]  ? __switch_to_asm+0x40/0x70
[   63.025042]  ? __switch_to_asm+0x34/0x70
[   63.031030]  ? __switch_to_asm+0x40/0x70
[   63.037006]  ? __schedule+0x396/0xdc0
[   63.042660]  ? kmem_cache_alloc_trace+0xf3/0x1f0
[   63.049323]  ? kthread+0x59/0x1d0
[   63.054594]  ? ret_from_fork+0x35/0x40
[   63.060257]  ? __sched_text_start+0x8/0x8
[   63.066212]  ? schedule+0xcf/0x250
[   63.071529]  ? __wake_up_common+0x110/0x350
[   63.077794]  ? __schedule+0xdc0/0xdc0
[   63.083348]  ? wait_woken+0x130/0x130
[   63.088963]  ? finish_task_switch+0x1f1/0x520
[   63.095258]  ? kasan_unpoison_shadow+0x30/0x40
[   63.101792]  ? __init_waitqueue_head+0xa0/0xd0
[   63.108183]  ? replenish_dl_entity.cold.60+0x18/0x18
[   63.115151]  ? _raw_spin_lock_irqsave+0x25/0x50
[   63.121754]  ? max_active_store+0xf0/0xf0
[   63.127753]  kthread+0x1ae/0x1d0
[   63.132894]  ? kthread_bind+0x30/0x30
[   63.138422]  ret_from_fork+0x35/0x40

[   63.146973] Allocated by task 14:
[   63.152077]  kasan_kmalloc+0xbf/0xe0
[   63.157471]  __kmalloc+0x110/0x240
[   63.162804]  init_cntrs+0x34d/0xdf0 [hfi1]
[   63.168883]  hfi1_init_dd+0x29a3/0x2f90 [hfi1]
[   63.175244]  init_one+0x551/0xae0 [hfi1]
[   63.181065]  local_pci_probe+0x70/0xd0
[   63.186759]  work_for_cpu_fn+0x29/0x40
[   63.192310]  process_one_work+0x584/0x960
[   63.198163]  worker_thread+0x62e/0x9d0
[   63.203843]  kthread+0x1ae/0x1d0
[   63.208874]  ret_from_fork+0x35/0x40

[   63.217203] Freed by task 1:
[   63.221844]  __kasan_slab_free+0x12e/0x180
[   63.227844]  kfree+0x92/0x1a0
[   63.232570]  single_release+0x3a/0x60
[   63.238024]  __fput+0x1d9/0x480
[   63.242911]  task_work_run+0x139/0x190
[   63.248440]  exit_to_usermode_loop+0x191/0x1a0
[   63.254814]  do_syscall_64+0x301/0x330
[   63.260283]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   63.270199] The buggy address belongs to the object at
ffff88080e8d5500
 which belongs to the cache kmalloc-4096 of size 4096
[   63.287247] The buggy address is located 3632 bytes inside of
 4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
[   63.303564] The buggy address belongs to the page:
[   63.310447] page:ffffea00203a3400 count:1 mapcount:0
mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
[   63.323102] flags: 0x2fffff80008100(slab|head)
[   63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
ffff88081380e840
[   63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
0000000000000000
[   63.350564] page dumped because: kasan: bad access detected

[   63.361974] Memory state around the buggy address:
[   63.369137]  ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[   63.379082]  ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[   63.389032] &gt;ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
fc fc fc
[   63.398944]                                      ^
[   63.406141]  ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   63.416109]  ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   63.426099]
==================================================================

The trace happens because get_hw_stats() assumes there is room in the
memory allocated in init_cntrs() to accommodate the driver counters.
Unfortunately, that routine only allocated space for the device
counters.

Fix by insuring the allocation has room for the additional driver
counters.

Cc: &lt;Stable@vger.kernel.org&gt; # v4.14+
Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface")
Reviewed-by: Mike Marciniczyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Mike Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Piotr Stankiewicz &lt;piotr.stankiewicz@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When running with KASAN, the following trace is produced:

[   62.535888]

==================================================================
[   62.544930] BUG: KASAN: slab-out-of-bounds in
gut_hw_stats+0x122/0x230 [hfi1]
[   62.553856] Write of size 8 at addr ffff88080e8d6330 by task
kworker/0:1/14

[   62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
4.19.0-test-build-kasan+ #8
[   62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
[   62.587951] Workqueue: events work_for_cpu_fn
[   62.594050] Call Trace:
[   62.598023]  dump_stack+0xc6/0x14c
[   62.603089]  ? dump_stack_print_info.cold.1+0x2f/0x2f
[   62.610041]  ? kmsg_dump_rewind_nolock+0x59/0x59
[   62.616615]  ? get_hw_stats+0x122/0x230 [hfi1]
[   62.622985]  print_address_description+0x6c/0x23c
[   62.629744]  ? get_hw_stats+0x122/0x230 [hfi1]
[   62.636108]  kasan_report.cold.6+0x241/0x308
[   62.642365]  get_hw_stats+0x122/0x230 [hfi1]
[   62.648703]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[   62.655088]  ? __kmalloc+0x110/0x240
[   62.660695]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
[   62.667142]  setup_hw_stats+0xd8/0x430 [ib_core]
[   62.673972]  ? show_hfi+0x50/0x50 [hfi1]
[   62.680026]  ib_device_register_sysfs+0x165/0x180 [ib_core]
[   62.687995]  ib_register_device+0x5a2/0xa10 [ib_core]
[   62.695340]  ? show_hfi+0x50/0x50 [hfi1]
[   62.701421]  ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
[   62.709222]  ? __vmalloc_node_range+0x2d0/0x380
[   62.716131]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[   62.723735]  ? vmalloc_node+0x5c/0x70
[   62.729697]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
[   62.737347]  ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
[   62.744998]  ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
[   62.752315]  ? rvt_rc_error+0x140/0x140 [rdmavt]
[   62.759434]  ? rvt_vma_open+0x30/0x30 [rdmavt]
[   62.766364]  ? mutex_unlock+0x1d/0x40
[   62.772445]  ? kmem_cache_create_usercopy+0x15d/0x230
[   62.780115]  rvt_register_device+0x1f6/0x360 [rdmavt]
[   62.787823]  ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
[   62.796058]  ? __get_txreq+0x400/0x400 [hfi1]
[   62.802969]  ? memcpy+0x34/0x50
[   62.808611]  hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
[   62.816601]  ? hfi1_get_npkeys+0x10/0x10 [hfi1]
[   62.823760]  ? hfi1_init+0x89f/0x9a0 [hfi1]
[   62.830469]  ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
[   62.838204]  ? pcie_capability_clear_and_set_word+0xcd/0xe0
[   62.846429]  ? pcie_capability_read_word+0xd0/0xd0
[   62.853791]  ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
[   62.860958]  init_one+0x67f/0xae0 [hfi1]
[   62.867301]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
[   62.873876]  ? wait_woken+0x130/0x130
[   62.879860]  ? read_word_at_a_time+0xe/0x20
[   62.886329]  ? strscpy+0x14b/0x280
[   62.891998]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
[   62.898405]  local_pci_probe+0x70/0xd0
[   62.904295]  ? pci_device_shutdown+0x90/0x90
[   62.910833]  work_for_cpu_fn+0x29/0x40
[   62.916750]  process_one_work+0x584/0x960
[   62.922974]  ? rcu_work_rcufn+0x40/0x40
[   62.928991]  ? __schedule+0x396/0xdc0
[   62.934806]  ? __sched_text_start+0x8/0x8
[   62.941020]  ? pick_next_task_fair+0x68b/0xc60
[   62.947674]  ? run_rebalance_domains+0x260/0x260
[   62.954471]  ? __list_add_valid+0x29/0xa0
[   62.960607]  ? move_linked_works+0x1c7/0x230
[   62.967077]  ?
trace_event_raw_event_workqueue_execute_start+0x140/0x140
[   62.976248]  ? mutex_lock+0xa6/0x100
[   62.982029]  ? __mutex_lock_slowpath+0x10/0x10
[   62.988795]  ? __switch_to+0x37a/0x710
[   62.994731]  worker_thread+0x62e/0x9d0
[   63.000602]  ? max_active_store+0xf0/0xf0
[   63.006828]  ? __switch_to_asm+0x40/0x70
[   63.012932]  ? __switch_to_asm+0x34/0x70
[   63.019013]  ? __switch_to_asm+0x40/0x70
[   63.025042]  ? __switch_to_asm+0x34/0x70
[   63.031030]  ? __switch_to_asm+0x40/0x70
[   63.037006]  ? __schedule+0x396/0xdc0
[   63.042660]  ? kmem_cache_alloc_trace+0xf3/0x1f0
[   63.049323]  ? kthread+0x59/0x1d0
[   63.054594]  ? ret_from_fork+0x35/0x40
[   63.060257]  ? __sched_text_start+0x8/0x8
[   63.066212]  ? schedule+0xcf/0x250
[   63.071529]  ? __wake_up_common+0x110/0x350
[   63.077794]  ? __schedule+0xdc0/0xdc0
[   63.083348]  ? wait_woken+0x130/0x130
[   63.088963]  ? finish_task_switch+0x1f1/0x520
[   63.095258]  ? kasan_unpoison_shadow+0x30/0x40
[   63.101792]  ? __init_waitqueue_head+0xa0/0xd0
[   63.108183]  ? replenish_dl_entity.cold.60+0x18/0x18
[   63.115151]  ? _raw_spin_lock_irqsave+0x25/0x50
[   63.121754]  ? max_active_store+0xf0/0xf0
[   63.127753]  kthread+0x1ae/0x1d0
[   63.132894]  ? kthread_bind+0x30/0x30
[   63.138422]  ret_from_fork+0x35/0x40

[   63.146973] Allocated by task 14:
[   63.152077]  kasan_kmalloc+0xbf/0xe0
[   63.157471]  __kmalloc+0x110/0x240
[   63.162804]  init_cntrs+0x34d/0xdf0 [hfi1]
[   63.168883]  hfi1_init_dd+0x29a3/0x2f90 [hfi1]
[   63.175244]  init_one+0x551/0xae0 [hfi1]
[   63.181065]  local_pci_probe+0x70/0xd0
[   63.186759]  work_for_cpu_fn+0x29/0x40
[   63.192310]  process_one_work+0x584/0x960
[   63.198163]  worker_thread+0x62e/0x9d0
[   63.203843]  kthread+0x1ae/0x1d0
[   63.208874]  ret_from_fork+0x35/0x40

[   63.217203] Freed by task 1:
[   63.221844]  __kasan_slab_free+0x12e/0x180
[   63.227844]  kfree+0x92/0x1a0
[   63.232570]  single_release+0x3a/0x60
[   63.238024]  __fput+0x1d9/0x480
[   63.242911]  task_work_run+0x139/0x190
[   63.248440]  exit_to_usermode_loop+0x191/0x1a0
[   63.254814]  do_syscall_64+0x301/0x330
[   63.260283]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   63.270199] The buggy address belongs to the object at
ffff88080e8d5500
 which belongs to the cache kmalloc-4096 of size 4096
[   63.287247] The buggy address is located 3632 bytes inside of
 4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
[   63.303564] The buggy address belongs to the page:
[   63.310447] page:ffffea00203a3400 count:1 mapcount:0
mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
[   63.323102] flags: 0x2fffff80008100(slab|head)
[   63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
ffff88081380e840
[   63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
0000000000000000
[   63.350564] page dumped because: kasan: bad access detected

[   63.361974] Memory state around the buggy address:
[   63.369137]  ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[   63.379082]  ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00
[   63.389032] &gt;ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
fc fc fc
[   63.398944]                                      ^
[   63.406141]  ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   63.416109]  ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   63.426099]
==================================================================

The trace happens because get_hw_stats() assumes there is room in the
memory allocated in init_cntrs() to accommodate the driver counters.
Unfortunately, that routine only allocated space for the device
counters.

Fix by insuring the allocation has room for the additional driver
counters.

Cc: &lt;Stable@vger.kernel.org&gt; # v4.14+
Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface")
Reviewed-by: Mike Marciniczyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Mike Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Piotr Stankiewicz &lt;piotr.stankiewicz@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/hfi1: Fix a latency issue for small messages</title>
<updated>2018-12-03T21:05:19+00:00</updated>
<author>
<name>Michael J. Ruhl</name>
<email>michael.j.ruhl@intel.com</email>
</author>
<published>2018-11-28T14:44:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=90b2620e6a8aa08c40cc78d61603e0acd853c33a'/>
<id>90b2620e6a8aa08c40cc78d61603e0acd853c33a</id>
<content type='text'>
A recent performance enhancement introduced a latency issue in the
HFI message path.  The new algorithm removed a forced call send for
PIO messages and added a forced schedule event for messages larger
than the MTU.

For PIO, the schedule path can introduce thrashing that can
significantly impact the throughput for small messages.

If a message size is within the PIO threshold, always take the send
path.

Fixes: 0b79b27748cb ("IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of posting")
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A recent performance enhancement introduced a latency issue in the
HFI message path.  The new algorithm removed a forced call send for
PIO messages and added a forced schedule event for messages larger
than the MTU.

For PIO, the schedule path can introduce thrashing that can
significantly impact the throughput for small messages.

If a message size is within the PIO threshold, always take the send
path.

Fixes: 0b79b27748cb ("IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of posting")
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mlx5: Initialize return variable in case pagefault was skipped</title>
<updated>2018-11-29T22:16:45+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@mellanox.com</email>
</author>
<published>2018-11-29T10:25:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7bca603a69c0c239654a8f0bcb99e1a60b30040c'/>
<id>7bca603a69c0c239654a8f0bcb99e1a60b30040c</id>
<content type='text'>
Pagefaults occurred in non-ODP MR are completely valid events, so
initialize return variable to 0.

Fixes: 4d5422a309de ("IB/mlx5: Skip non-ODP MR when handling a page fault")
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pagefaults occurred in non-ODP MR are completely valid events, so
initialize return variable to 0.

Fixes: 4d5422a309de ("IB/mlx5: Skip non-ODP MR when handling a page fault")
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Fix page fault handling for MW</title>
<updated>2018-11-26T23:28:36+00:00</updated>
<author>
<name>Artemy Kovalyov</name>
<email>artemyko@mellanox.com</email>
</author>
<published>2018-11-25T18:34:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=75b7b86bdb0df37e08e44b6c1f99010967f81944'/>
<id>75b7b86bdb0df37e08e44b6c1f99010967f81944</id>
<content type='text'>
Memory windows are implemented with an indirect MKey, when a page fault
event comes for a MW Mkey we need to find the MR at the end of the list of
the indirect MKeys by iterating on all items from the first to the last.

The offset calculated during this process has to be zeroed after the first
iteration or the next iteration will start from a wrong address, resulting
incorrect ODP faulting behavior.

Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Memory windows are implemented with an indirect MKey, when a page fault
event comes for a MW Mkey we need to find the MR at the end of the list of
the indirect MKeys by iterating on all items from the first to the last.

The offset calculated during this process has to be zeroed after the first
iteration or the next iteration will start from a wrong address, resulting
incorrect ODP faulting behavior.

Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/umem: Set correct address to the invalidation function</title>
<updated>2018-11-26T23:28:36+00:00</updated>
<author>
<name>Artemy Kovalyov</name>
<email>artemyko@mellanox.com</email>
</author>
<published>2018-11-25T18:34:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=605728e65ad303a1b639bcae7c0abd2e24e6a930'/>
<id>605728e65ad303a1b639bcae7c0abd2e24e6a930</id>
<content type='text'>
The invalidate range was using PAGE_SIZE instead of the computed 'end',
and had the wrong transformation of page_index due the weird
construction. This can trigger during error unwind and would cause
malfunction.

Inline the code and correct the math.

Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The invalidate range was using PAGE_SIZE instead of the computed 'end',
and had the wrong transformation of page_index due the weird
construction. This can trigger during error unwind and would cause
malfunction.

Inline the code and correct the math.

Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx5: Skip non-ODP MR when handling a page fault</title>
<updated>2018-11-26T23:27:29+00:00</updated>
<author>
<name>Artemy Kovalyov</name>
<email>artemyko@mellanox.com</email>
</author>
<published>2018-11-25T18:34:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4d5422a309deecec906c491f8aea77593a46321d'/>
<id>4d5422a309deecec906c491f8aea77593a46321d</id>
<content type='text'>
It is possible that we call pagefault_single_data_segment() with a MKey
that belongs to a memory region which is not on demand (i.e. pinned
pages). This can happen if, for instance, a WQE that points to multiple
MRs where some of them are ODP MRs and some are not.  In this case we
don't need to handle this MR in the ODP context besides reporting success.

Otherwise the code will call pagefault_mr() which will do to_ib_umem_odp()
on a non-ODP MR and thus access out of bounds.

Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is possible that we call pagefault_single_data_segment() with a MKey
that belongs to a memory region which is not on demand (i.e. pinned
pages). This can happen if, for instance, a WQE that points to multiple
MRs where some of them are ODP MRs and some are not.  In this case we
don't need to handle this MR in the ODP context besides reporting success.

Otherwise the code will call pagefault_mr() which will do to_ib_umem_odp()
on a non-ODP MR and thus access out of bounds.

Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/hns: Bugfix pbl configuration for rereg mr</title>
<updated>2018-11-23T20:11:47+00:00</updated>
<author>
<name>Yixian Liu</name>
<email>liuyixian@huawei.com</email>
</author>
<published>2018-11-23T07:46:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca088320a02537f36c243ac21794525d8eabb3bd'/>
<id>ca088320a02537f36c243ac21794525d8eabb3bd</id>
<content type='text'>
Current hns driver assigned the first two PBL page addresses from previous
registered MR to the hardware when reregister MR changing the memory
locations occurred. This will lead to PBL addressing error as the PBL has
already been released. This patch fixes this wrong assignment by using the
page address from new allocated PBL.

Fixes: a2c80b7b4119 ("RDMA/hns: Add rereg mr support for hip08")
Signed-off-by: Yixian Liu &lt;liuyixian@huawei.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current hns driver assigned the first two PBL page addresses from previous
registered MR to the hardware when reregister MR changing the memory
locations occurred. This will lead to PBL addressing error as the PBL has
already been released. This patch fixes this wrong assignment by using the
page address from new allocated PBL.

Fixes: a2c80b7b4119 ("RDMA/hns: Add rereg mr support for hip08")
Signed-off-by: Yixian Liu &lt;liuyixian@huawei.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
