summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2018-08-24RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error pathKamal Heib
[ Upstream commit d63c46734c545ad0488761059004a65c46efdde3 ] Fix memory leak in the error path of mlx5_ib_create_srq() by making sure to free the allocated srq. Fixes: c2b37f76485f ("IB/mlx5: Fix integer overflows in mlx5_ib_create_srq") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24IB/rxe: Fix missing completion for mem_reg work requestsVijay Immanuel
[ Upstream commit 375dc53d032fc11e98036b5f228ad13f7c5933f5 ] Run the completer task to post a work completion after processing a memory registration or invalidate work request. This covers the case where the memory registration or invalidate was the last work request posted to the qp. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15IB/ocrdma: fix out of bounds access to local bufferMichael Mera
commit 062d0f22a30c39840ea49b72cfcfc1aa4cc538fa upstream. In write to debugfs file 'resource_stats' the local buffer 'tmp_str' is written at index 'count-1' where 'count' is the size of the write, so potentially 0. This patch filters odd values for the write size/position to avoid this type of problem. Signed-off-by: Michael Mera <dev@michaelmera.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15IB/mlx4: Mark user MR as writable if actual virtual memory is writableJack Morgenstein
commit d8f9cc328c8888369880e2527e9186d745f2bbf6 upstream. To allow rereg_user_mr to modify the MR from read-only to writable without using get_user_pages again, we needed to define the initial MR as writable. However, this was originally done unconditionally, without taking into account the writability of the underlying virtual memory. As a result, any attempt to register a read-only MR over read-only virtual memory failed. To fix this, do not add the writable flag bit when the user virtual memory is not writable (e.g. const memory). However, when the underlying memory is NOT writable (and we therefore do not define the initial MR as writable), the IB core adds a "force writable" flag to its user-pages request. If this succeeds, the reg_user_mr caller gets a writable copy of the original pages. If the user-space caller then does a rereg_user_mr operation to enable writability, this will succeed. This should not be allowed, since the original virtual memory was not writable. Cc: <stable@vger.kernel.org> Fixes: 9376932d0c26 ("IB/mlx4_ib: Add support for user MR re-registration") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15IB/core: Make testing MR flags for writability a static inline functionJack Morgenstein
commit 08bb558ac11ab944e0539e78619d7b4c356278bd upstream. Make the MR writability flags check, which is performed in umem.c, a static inline function in file ib_verbs.h This allows the function to be used by low-level infiniband drivers. Cc: <stable@vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-09IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return valuesMichael J. Ruhl
commit b697d7d8c741f27b728a878fc55852b06d0f6f5e upstream. The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL. All of the relevant call sites look for IS_ERR, so the NULL return would lead to a NULL pointer exception. Do not use the ERR_PTR mechanism for this function. Update all call sites to handle the return value correctly. Clean up error paths to reflect return value. Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code") Cc: <stable@vger.kernel.org> # 4.9.x+ Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03RDMA/uverbs: Protect from attempts to create flows on unsupported QPLeon Romanovsky
commit 940efcc8889f0d15567eb07fc9fd69b06e366aa5 upstream. Flows can be created on UD and RAW_PACKET QP types. Attempts to provide other QP types as an input causes to various unpredictable failures. The reason is that in order to support all various types (e.g. XRC), we are supposed to use real_qp handle and not qp handle and expect to driver/FW to fail such (XRC) flows. The simpler and safer variant is to ban all QP types except UD and RAW_PACKET, instead of relying on driver/FW. Cc: <stable@vger.kernel.org> # 3.11 Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs") Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03RDMA/mad: Convert BUG_ONs to error flowsLeon Romanovsky
[ Upstream commit 2468b82d69e3a53d024f28d79ba0fdb8bf43dfbf ] Let's perform checks in-place instead of BUG_ONs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03infiniband: fix a possible use-after-free bugCong Wang
[ Upstream commit cb2595c1393b4a5211534e6f0a0fbad369e21ad8 ] ucma_process_join() will free the new allocated "mc" struct, if there is any error after that, especially the copy_to_user(). But in parallel, ucma_leave_multicast() could find this "mc" through idr_find() before ucma_process_join() frees it, since it is already published. So "mc" could be used in ucma_leave_multicast() after it is been allocated and freed in ucma_process_join(), since we don't refcnt it. Fix this by separating "publish" from ID allocation, so that we can get an ID first and publish it later after copy_to_user(). Fixes: c8f6a362bf3e ("RDMA/cma: Add multicast communication support") Reported-by: Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17RDMA/ucm: Mark UCM interface as BROKENLeon Romanovsky
commit 7a8690ed6f5346f6738971892205e91d39b6b901 upstream. In commit 357d23c811a7 ("Remove the obsolete libibcm library") in rdma-core [1], we removed obsolete library which used the /dev/infiniband/ucmX interface. Following multiple syzkaller reports about non-sanitized user input in the UCMA module, the short audit reveals the same issues in UCM module too. It is better to disable this interface in the kernel, before syzkaller team invests time and energy to harden this unused interface. [1] https://github.com/linux-rdma/rdma-core/pull/279 Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17iw_cxgb4: correctly enforce the max reg_mr depthSteve Wise
commit 7b72717a20bba8bdd01b14c0460be7d15061cd6b upstream. The code was mistakenly using the length of the page array memory instead of the depth of the page array. This would cause MR creation to fail in some cases. Fixes: 8376b86de7d3 ("iw_cxgb4: Support the new memory registration API") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11IB/hfi1: Fix user context tail allocation for DMA_RTAILMike Marciniszyn
commit 1bc0299d976e000ececc6acd76e33b4582646cb7 upstream. The following code fails to allocate a buffer for the tail address that the hardware DMAs into when the user context DMA_RTAIL is set. if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) { rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent( &dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail, gfp_flags); if (!rcd->rcvhdrtail_kvaddr) goto bail_free; rcd->rcvhdrqtailaddr_dma = dma_hdrqtail; } So the rcvhdrtail_kvaddr would then be NULL. The mmap logic fails to check for a NULL rcvhdrtail_kvaddr. The fix is to test for both user and kernel DMA_TAIL options during the allocation as well as testing for a NULL rcvhdrtail_kvaddr during the mmap processing. Additionally, all downstream testing of the capmask for DMA_RTAIL have been eliminated in favor of testing rcvhdrtail_kvaddr. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03RDMA/mlx4: Discard unknown SQP work requestsLeon Romanovsky
commit 6b1ca7ece15e94251d1d0d919f813943e4a58059 upstream. There is no need to crash the machine if unknown work request was received in SQP MAD. Cc: <stable@vger.kernel.org> # 3.6 Fixes: 37bfc7c1e83f ("IB/mlx4: SR-IOV multiplex and demultiplex MADs") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03IB/isert: fix T10-pi check mask settingMax Gurtovoy
commit 0e12af84cdd3056460f928adc164f9e87f4b303b upstream. A copy/paste bug (probably) caused setting of an app_tag check mask in case where a ref_tag check was needed. Fixes: 38a2d0d429f1 ("IB/isert: convert to the generic RDMA READ/WRITE API") Fixes: 9e961ae73c2c ("IB/isert: Support T10-PI protected transactions") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03IB/isert: Fix for lib/dma_debug check_sync warningAlex Estrin
commit 763b69654bfb88ea3230d015e7d755ee8339f8ee upstream. The following error message occurs on a target host in a debug build during session login: [ 3524.411874] WARNING: CPU: 5 PID: 12063 at lib/dma-debug.c:1207 check_sync+0x4ec/0x5b0 [ 3524.421057] infiniband hfi1_0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000000000000] [size=76 bytes] ......snip ..... [ 3524.535846] CPU: 5 PID: 12063 Comm: iscsi_np Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 3524.546764] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015 [ 3524.555740] Call Trace: [ 3524.559102] [<ffffffffa5fe915b>] dump_stack+0x19/0x1b [ 3524.565477] [<ffffffffa58a2f58>] __warn+0xd8/0x100 [ 3524.571557] [<ffffffffa58a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 3524.578610] [<ffffffffa5bf5b8c>] check_sync+0x4ec/0x5b0 [ 3524.585177] [<ffffffffa58efc3f>] ? set_cpus_allowed_ptr+0x5f/0x1c0 [ 3524.592812] [<ffffffffa5bf5cd0>] debug_dma_sync_single_for_cpu+0x80/0x90 [ 3524.601029] [<ffffffffa586add3>] ? x2apic_send_IPI_mask+0x13/0x20 [ 3524.608574] [<ffffffffa585ee1b>] ? native_smp_send_reschedule+0x5b/0x80 [ 3524.616699] [<ffffffffa58e9b76>] ? resched_curr+0xf6/0x140 [ 3524.623567] [<ffffffffc0879af0>] isert_create_send_desc.isra.26+0xe0/0x110 [ib_isert] [ 3524.633060] [<ffffffffc087af95>] isert_put_login_tx+0x55/0x8b0 [ib_isert] [ 3524.641383] [<ffffffffa58ef114>] ? try_to_wake_up+0x1a4/0x430 [ 3524.648561] [<ffffffffc098cfed>] iscsi_target_do_tx_login_io+0xdd/0x230 [iscsi_target_mod] [ 3524.658557] [<ffffffffc098d827>] iscsi_target_do_login+0x1a7/0x600 [iscsi_target_mod] [ 3524.668084] [<ffffffffa59f9bc9>] ? kstrdup+0x49/0x60 [ 3524.674420] [<ffffffffc098e976>] iscsi_target_start_negotiation+0x56/0xc0 [iscsi_target_mod] [ 3524.684656] [<ffffffffc098c2ee>] __iscsi_target_login_thread+0x90e/0x1070 [iscsi_target_mod] [ 3524.694901] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.705446] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.715976] [<ffffffffc098ca78>] iscsi_target_login_thread+0x28/0x60 [iscsi_target_mod] [ 3524.725739] [<ffffffffa58d60ff>] kthread+0xef/0x100 [ 3524.732007] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.739540] [<ffffffffa5fff1b7>] ret_from_fork_nospec_begin+0x21/0x21 [ 3524.747558] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.755088] ---[ end trace 23f8bf9238bd1ed8 ]--- [ 3595.510822] iSCSI/iqn.1994-05.com.redhat:537fa56299: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION. The code calls dma_sync on login_tx_desc->dma_addr prior to initializing it with dma-mapped address. login_tx_desc is a part of iser_conn structure and is used only once during login negotiation, so the issue is fixed by eliminating dma_sync call for this buffer using a special case routine. Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03IB/mlx5: Fetch soft WQE's on fatal error stateErez Shitrit
commit 7b74a83cf54a3747e22c57e25712bd70eef8acee upstream. On fatal error the driver simulates CQE's for ULPs that rely on completion of all their posted work-request. For the GSI traffic, the mlx5 has its own mechanism that sends the completions via software CQE's directly to the relevant CQ. This should be kept in fatal error too, so the driver should simulate such CQE's with the specified error state in order to complete GSI QP work requests. Without the fix the next deadlock might appears: schedule_timeout+0x274/0x350 wait_for_common+0xec/0x240 mcast_remove_one+0xd0/0x120 [ib_core] ib_unregister_device+0x12c/0x230 [ib_core] mlx5_ib_remove+0xc4/0x270 [mlx5_ib] mlx5_detach_device+0x184/0x1a0 [mlx5_core] mlx5_unload_one+0x308/0x340 [mlx5_core] mlx5_pci_err_detected+0x74/0xe0 [mlx5_core] Cc: <stable@vger.kernel.org> # 4.7 Fixes: 89ea94a7b6c4 ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03IB/{hfi1, qib}: Add handling of kernel restartAlex Estrin
commit 8d3e71136a080d007620472f50c7b3e63ba0f5cf upstream. A warm restart will fail to unload the driver, leaving link state potentially flapping up to the point the BIOS resets the adapter. Correct the issue by hooking the shutdown pci method, which will bring port down. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03IB/qib: Fix DMA api warning with debug kernelMike Marciniszyn
commit 0252f73334f9ef68868e4684200bea3565a4fcee upstream. The following error occurs in a debug build when running MPI PSM: [ 307.415911] WARNING: CPU: 4 PID: 23867 at lib/dma-debug.c:1158 check_unmap+0x4ee/0xa20 [ 307.455661] ib_qib 0000:05:00.0: DMA-API: device driver failed to check map error[device address=0x00000000df82b000] [size=4096 bytes] [mapped as page] [ 307.517494] Modules linked in: [ 307.531584] ib_isert iscsi_target_mod ib_srpt target_core_mod rpcrdma sunrpc ib_srp scsi_transport_srp scsi_tgt ib_iser libiscsi ib_ipoib scsi_transport_iscsi rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_qib intel_powerclamp coretemp rdmavt intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel ipmi_ssif ib_core aesni_intel sg ipmi_si lrw gf128mul dca glue_helper ipmi_devintf iTCO_wdt gpio_ich hpwdt iTCO_vendor_support ablk_helper hpilo acpi_power_meter cryptd ipmi_msghandler ie31200_edac shpchp pcc_cpufreq lpc_ich pcspkr ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crct10dif_pclmul crct10dif_common drm crc32c_intel libahci tg3 libata serio_raw ptp i2c_core [ 307.846113] pps_core dm_mirror dm_region_hash dm_log dm_mod [ 307.866505] CPU: 4 PID: 23867 Comm: mpitests-IMB-MP Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 307.911178] Hardware name: HP ProLiant DL320e Gen8, BIOS J05 11/09/2013 [ 307.944206] Call Trace: [ 307.956973] [<ffffffffbd9e915b>] dump_stack+0x19/0x1b [ 307.982201] [<ffffffffbd2a2f58>] __warn+0xd8/0x100 [ 308.005999] [<ffffffffbd2a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 308.034260] [<ffffffffbd5f667e>] check_unmap+0x4ee/0xa20 [ 308.060801] [<ffffffffbd41acaa>] ? page_add_file_rmap+0x2a/0x1d0 [ 308.090689] [<ffffffffbd5f6c4d>] debug_dma_unmap_page+0x9d/0xb0 [ 308.120155] [<ffffffffbd4082e0>] ? might_fault+0xa0/0xb0 [ 308.146656] [<ffffffffc07761a5>] qib_tid_free.isra.14+0x215/0x2a0 [ib_qib] [ 308.180739] [<ffffffffc0776bf4>] qib_write+0x894/0x1280 [ib_qib] [ 308.210733] [<ffffffffbd540b00>] ? __inode_security_revalidate+0x70/0x80 [ 308.244837] [<ffffffffbd53c2b7>] ? security_file_permission+0x27/0xb0 [ 308.266025] qib_ib0.8006: multicast join failed for ff12:401b:8006:0000:0000:0000:ffff:ffff, status -22 [ 308.323421] [<ffffffffbd46f5d3>] vfs_write+0xc3/0x1f0 [ 308.347077] [<ffffffffbd492a5c>] ? fget_light+0xfc/0x510 [ 308.372533] [<ffffffffbd47045a>] SyS_write+0x8a/0x100 [ 308.396456] [<ffffffffbd9ff355>] system_call_fastpath+0x1c/0x21 The code calls a qib_map_page() which has never correctly tested for a mapping error. Fix by testing for pci_dma_mapping_error() in all cases and properly handling the failure in the caller. Additionally, streamline qib_map_page() arguments to satisfy just the single caller. Cc: <stable@vger.kernel.org> Reviewed-by: Alex Estrin <alex.estrin@intel.com> Tested-by: Don Dutile <ddutile@redhat.com> Reviewed-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06IB/core: Fix error code for invalid GID entryParav Pandit
commit a840c93ca7582bb6c88df2345a33f979b7a67874 upstream. When a GID entry is invalid EAGAIN is returned. This is an incorrect error code, there is nothing that will make this GID entry valid again in bounded time. Some user space tools fail incorrectly if EAGAIN is returned here, and this represents a small ABI change from earlier kernels. The first patch in the Fixes list makes entries that were valid before to become invalid, allowing this code to trigger, while the second patch in the Fixes list introduced the wrong EAGAIN. Therefore revert the return result to EINVAL which matches the historical expectations of the ibv_query_gid_type() API of the libibverbs user space library. Cc: <stable@vger.kernel.org> Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/core: Honor port_num while resolving GID for IB link layerParav Pandit
[ Upstream commit 563c4ba3bd2b8b0b21c65669ec2226b1cfa1138b ] ah_attr contains the port number to which cm_id is bound. However, while searching for GID table for matching GID entry, the port number is ignored. This could cause the wrong GID to be used when the ah_attr is converted to an AH. Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/qedr: Fix rc initialization on CNQ allocation failureKalderon, Michal
[ Upstream commit b15606f47b89b0b09936d7f45b59ba6275527041 ] Return code wasn't set properly when CNQ allocation failed. This only affect error message logging, currently user will receive an error message that says the qedr driver load failed with rc '0', instead of ENOMEM Fixes: ec72fce4 ("qedr: Add support for RoCE HW init") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/qedr: fix QP's ack timeout configurationKalderon, Michal
[ Upstream commit c3594f22302cca5e924e47ec1cc8edd265708f41 ] QPs that were configured with ack timeout value lower than 1 msec will not implement re-transmission timeout. This means that if a packet / ACK were dropped, the QP will not retransmit this packet. This can lead to an application hang. Fixes: cecbcddf6 ("qedr: Add support for QP verbs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/ucma: Correct option size check using optlenChien Tin Tung
[ Upstream commit 5f3e3b85cc0a5eae1c46d72e47d3de7bf208d9e2 ] The option size check is using optval instead of optlen causing the set option call to fail. Use the correct field, optlen, for size check. Fixes: 6a21dfc0d0db ("RDMA/ucma: Limit possible option size") Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/core: Fix possible crash to access NULL netdevParav Pandit
[ Upstream commit bb7f8f199c354c4cf155b1d6d55f86eaaed7fa5a ] resolved_dev returned might be NULL as ifindex is transient number. Ignoring NULL check of resolved_dev might crash the kernel. Therefore perform NULL check before accessing resolved_dev. Additionally rdma_resolve_ip_route() invokes addr_resolve() which performs check and address translation for loopback ifindex. Therefore, checking it again in rdma_resolve_ip_route() is not helpful. Therefore, the code is simplified to avoid IFF_LOOPBACK check. Fixes: 200298326b27 ("IB/core: Validate route when we init ah") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()Dan Carpenter
[ Upstream commit 5d414b178e950ce9685c253994cc730893d5d887 ] "err" is either zero or possibly uninitialized here. It should be -EINVAL. Fixes: 427c1e7bcd7e ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/mlx4: Include GID type when deleting GIDs from HW table under RoCEJack M
[ Upstream commit a18177925c252da7801149abe217c05b80884798 ] The commit cited below added a gid_type field (RoCEv1 or RoCEv2) to GID properties. When adding GIDs, this gid_type field was copied over to the hardware gid table. However, when deleting GIDs, the gid_type field was not copied over to the hardware gid table. As a result, when running RoCEv2, all RoCEv2 gids in the hardware gid table were set to type RoCEv1 when any gid was deleted. This problem would persist until the next gid was added (which would again restore the gid_type field for all the gids in the hardware gid table). Fix this by copying over the gid_type field to the hardware gid table when deleting gids, so that the gid_type of all remaining gids is preserved when a gid is deleted. Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDsJack Morgenstein
[ Upstream commit 0077416a3d529baccbe07ab3242e8db541cfadf6 ] When using IPv4 addresses in RoCEv2, the GID format for the mapped IPv4 address should be: ::ffff:<4-byte IPv4 address>. In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords zeroed out by memset, which resulted in deleting the ffff field. However, since procedure ipv6_addr_v4mapped() already verifies that the gid has format ::ffff:<ipv4 address>, no change is needed for the gid, and the memset can simply be removed. Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/qedr: Fix iWARP write and send with immediateKalderon, Michal
[ Upstream commit 551e1c67b4207455375a2e7a285dea1c7e8fc361 ] iWARP does not support RDMA WRITE or SEND with immediate data. Driver should check this before submitting to FW and return an immediate error Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/qedr: Fix kernel panic when running fio over NFSoRDMAKalderon, Michal
[ Upstream commit e3fd112cbf21d049faf64ba1471d72b93c22109a ] Race in qedr_poll_cq, lastest_cqe wasn't protected by lock, leading to a case where two context's accessing poll_cq at the same time lead to one of them having a pointer to an old latest_cqe and reading an invalid cqe element Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/ipoib: Fix for potential no-carrier stateAlex Estrin
[ Upstream commit 1029361084d18cc270f64dfd39529fafa10cfe01 ] On reboot SM can program port pkey table before ipoib registered its event handler, which could result in missing pkey event and leave root interface with initial pkey value from index 0. Since OPA port starts with invalid pkey in index 0, root interface will fail to initialize and stay down with no-carrier flag. For IB ipoib interface may end up with pkey different from value opensm put in pkey table idx 0, resulting in connectivity issues (different mcast groups, for example). Close the window by calling event handler after registration to make sure ipoib pkey is in sync with port pkey table. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failureLeon Romanovsky
[ Upstream commit b081808a66345ba725b77ecd8d759bee874cd937 ] Failure in XRCD FW deallocation command leaves memory leaked and returns error to the user which he can't do anything about it. This patch changes behavior to always free memory and always return success to the user. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30i40iw: Zero-out consumer key on allocate stag for FMRShiraz Saleem
[ Upstream commit 6376e926af1a8661dd1b2e6d0896e07f84a35844 ] If the application invalidates the MR before the FMR WR, HW parses the consumer key portion of the stag and returns an invalid stag key Asynchronous Event (AE) that tears down the QP. Fix this by zeroing-out the consumer key portion of the allocated stag returned to application for FMR. Fixes: ee855d3b93f3 ("RDMA/i40iw: Add base memory management extensions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30IB/hfi1: Use after free race condition in send context error pathMichael J. Ruhl
commit f9e76ca3771bf23d2142a81a88ddd8f31f5c4c03 upstream. A pio send egress error can occur when the PSM library attempts to to send a bad packet. That issue is still being investigated. The pio error interrupt handler then attempts to progress the recovery of the errored pio send context. Code inspection reveals that the handling lacks the necessary locking if that recovery interleaves with a PSM close of the "context" object contains the pio send context. The lack of the locking can cause the recovery to access the already freed pio send context object and incorrectly deduce that the pio send context is actually a kernel pio send context as shown by the NULL deref stack below: [<ffffffff8143d78c>] _dev_info+0x6c/0x90 [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1] [<ffffffff816ab124>] ? __schedule+0x424/0x9b0 [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440 [<ffffffff810ab086>] worker_thread+0x126/0x3c0 [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffff810b252f>] kthread+0xcf/0xe0 [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40 [<ffffffff816b8798>] ret_from_fork+0x58/0x90 [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40 This is the best case scenario and other scenarios can corrupt the already freed memory. Fix by adding the necessary locking in the pio send context error handler. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16IB/device: Convert ib-comp-wq to be CPU-boundSagi Grimberg
commit b7363e67b23e04c23c2a99437feefac7292a88bc upstream. This workqueue is used by our storage target mode ULPs via the new CQ API. Recent observations when working with very high-end flash storage devices reveal that UNBOUND workqueue threads can migrate between cpu cores and even numa nodes (although some numa locality is accounted for). While this attribute can be useful in some workloads, it does not fit in very nicely with the normal run-to-completion model we usually use in our target-mode ULPs and the block-mq irq<->cpu affinity facilities. The whole block-mq concept is that the completion will land on the same cpu where the submission was performed. The fact that our submitter thread is migrating cpus can break this locality. We assume that as a target mode ULP, we will serve multiple initiators/clients and we can spread the load enough without having to use unbound kworkers. Also, while we're at it, expose this workqueue via sysfs which is harmless and can be useful for debug. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>-- Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09IB/hfi1: Fix NULL pointer dereference when invalid num_vls is usedSebastian Sanchez
commit 45d924571a5e1329580811f2419da61b07ac3613 upstream. When an invalid num_vls is used as a module parameter, the code execution follows an exception path where the macro dd_dev_err() expects dd->pcidev->dev not to be NULL in hfi1_init_dd(). This causes a NULL pointer dereference. Fix hfi1_init_dd() by initializing dd->pcidev and dd->pcidev->dev earlier in the code. If a dd exists, then dd->pcidev and dd->pcidev->dev always exists. BUG: unable to handle kernel NULL pointer dereference at 00000000000000f0 IP: __dev_printk+0x15/0x90 Workqueue: events work_for_cpu_fn RIP: 0010:__dev_printk+0x15/0x90 Call Trace: dev_err+0x6c/0x90 ? hfi1_init_pportdata+0x38d/0x3f0 [hfi1] hfi1_init_dd+0xdd/0x2530 [hfi1] ? pci_conf1_read+0xb2/0xf0 ? pci_read_config_word.part.9+0x64/0x80 ? pci_conf1_write+0xb0/0xf0 ? pcie_capability_clear_and_set_word+0x57/0x80 init_one+0x141/0x490 [hfi1] local_pci_probe+0x3f/0xa0 work_for_cpu_fn+0x10/0x20 process_one_work+0x152/0x350 worker_thread+0x1cf/0x3e0 kthread+0xf5/0x130 ? max_active_store+0x80/0x80 ? kthread_bind+0x10/0x10 ? do_syscall_64+0x6e/0x1a0 ? SyS_exit_group+0x10/0x10 ret_from_fork+0x35/0x40 Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09IB/mlx5: Use unlimited rate when static rate is not supportedDanit Goldberg
commit 4f32ac2e452c2180cd2df581cbadac183e27ecd0 upstream. Before the change, if the user passed a static rate value different than zero and the FW doesn't support static rate, it would end up configuring rate of 2.5 GBps. Fix this by using rate 0; unlimited, in cases where FW doesn't support static rate configuration. Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09RDMA/mlx5: Protect from shift operand overflowLeon Romanovsky
commit 002bf2282b2d7318e444dca9ffcb994afc5d5f15 upstream. Ensure that user didn't supply values too large that can cause overflow. UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:263:23 shift exponent -2147483648 is negative CPU: 0 PID: 292 Comm: syzkaller612609 Not tainted 4.16.0-rc1+ #131 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ubsan_epilogue+0xe/0x81 set_rq_size+0x7c2/0xa90 create_qp_common+0xc18/0x43c0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x2260 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 do_syscall_64+0x1aa/0x740 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x433569 RSP: 002b:00007ffc6e62f448 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433569 RDX: 0000000000000070 RSI: 00000000200042c0 RDI: 0000000000000003 RBP: 00000000006d5018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040c9f0 R14: 000000000040ca80 R15: 0000000000000006 Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09RDMA/ucma: Allow resolving address w/o specifying source addressRoland Dreier
commit 09abfe7b5b2f442a85f4c4d59ecf582ad76088d7 upstream. The RDMA CM will select a source device and address by consulting the routing table if no source address is passed into rdma_resolve_address(). Userspace will ask for this by passing an all-zero source address in the RESOLVE_IP command. Unfortunately the new check for non-zero address size rejects this with EINVAL, which breaks valid userspace applications. Fix this by explicitly allowing a zero address family for the source. Fixes: 2975d5de6428 ("RDMA/ucma: Check AF family prior resolving address") Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09RDMA/cxgb4: release hw resources on device removalRaju Rangoju
commit 26bff1bd74a4f7417509a83295614e9dab995b2a upstream. The c4iw_rdev_close() logic was not releasing all the hw resources (PBL and RQT memory) during the device removal event (driver unload / system reboot). This can cause panic in gen_pool_destroy(). The module remove function will wait for all the hw resources to be released during the device removal event. Fixes c12a67fe(iw_cxgb4: free EQ queue memory on last deref) Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPsLeon Romanovsky
commit 75a4598209cbe45540baa316c3b51d9db222e96e upstream. mlx5 modify_qp() relies on FW that the error will be thrown if wrong state is supplied. The missing check in FW causes the following crash while using XRC_TGT QPs. [ 14.769632] BUG: unable to handle kernel NULL pointer dereference at (null) [ 14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0 [ 14.773126] Oops: 0002 [#1] SMP PTI [ 14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119 [ 14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246 [ 14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000 [ 14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000 [ 14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504 [ 14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240 [ 14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000 [ 14.785800] FS: 00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000 [ 14.787073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0 [ 14.788689] Call Trace: [ 14.789007] _ib_modify_qp+0x71/0x120 [ 14.789475] modify_qp.isra.20+0x207/0x2f0 [ 14.790010] ib_uverbs_modify_qp+0x90/0xe0 [ 14.790532] ib_uverbs_write+0x1d2/0x3c0 [ 14.791049] ? __handle_mm_fault+0x93c/0xe40 [ 14.791644] __vfs_write+0x36/0x180 [ 14.792096] ? handle_mm_fault+0xc1/0x210 [ 14.792601] vfs_write+0xad/0x1e0 [ 14.793018] SyS_write+0x52/0xc0 [ 14.793422] do_syscall_64+0x75/0x180 [ 14.793888] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 14.794527] RIP: 0033:0x7f545ad76099 [ 14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099 [ 14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003 [ 14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480 [ 14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760 [ 14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000 [ 14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00 00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00 00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c [ 14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8 [ 14.804838] CR2: 0000000000000000 [ 14.805288] ---[ end trace 3f1da0df5c8b7c37 ]--- Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24IB/srp: Fix completion vector assignment algorithmBart Van Assche
commit 3a148896b24adf8688dc0c59af54531931677a40 upstream. Ensure that cv_end is equal to ibdev->num_comp_vectors for the NUMA node with the highest index. This patch improves spreading of RDMA channels over completion vectors and thereby improves performance, especially on systems with only a single NUMA node. This patch drops support for the comp_vector login parameter by ignoring the value of that parameter since I have not found a good way to combine support for that parameter and automatic spreading of RDMA channels over completion vectors. Fixes: d92c0da71a35 ("IB/srp: Add multichannel support") Reported-by: Alexander Schmid <alex@modula-shop-systems.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Alexander Schmid <alex@modula-shop-systems.de> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24IB/srp: Fix srp_abort()Bart Van Assche
commit e68088e78d82920632eba112b968e49d588d02a2 upstream. Before commit e494f6a72839 ("[SCSI] improved eh timeout handler") it did not really matter whether or not abort handlers like srp_abort() called .scsi_done() when returning another value than SUCCESS. Since that commit however this matters. Hence only call .scsi_done() when returning SUCCESS. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24RDMA/rxe: Fix an out-of-bounds readBart Van Assche
commit a6544a624c3ff92a64e4aca3931fa064607bd3da upstream. This patch avoids that KASAN reports the following when the SRP initiator calls srp_post_send(): ================================================================== BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x5c4/0x980 [rdma_rxe] Read of size 8 at addr ffff880066606e30 by task 02-mq/1074 CPU: 2 PID: 1074 Comm: 02-mq Not tainted 4.16.0-rc3-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x85/0xc7 print_address_description+0x65/0x270 kasan_report+0x231/0x350 rxe_post_send+0x5c4/0x980 [rdma_rxe] srp_post_send.isra.16+0x149/0x190 [ib_srp] srp_queuecommand+0x94d/0x1670 [ib_srp] scsi_dispatch_cmd+0x1c2/0x550 [scsi_mod] scsi_queue_rq+0x843/0xa70 [scsi_mod] blk_mq_dispatch_rq_list+0x143/0xac0 blk_mq_do_dispatch_ctx+0x1c5/0x260 blk_mq_sched_dispatch_requests+0x2bf/0x2f0 __blk_mq_run_hw_queue+0xdb/0x160 __blk_mq_delay_run_hw_queue+0xba/0x100 blk_mq_run_hw_queue+0xf2/0x190 blk_mq_sched_insert_request+0x163/0x2f0 blk_execute_rq+0xb0/0x130 scsi_execute+0x14e/0x260 [scsi_mod] scsi_probe_and_add_lun+0x366/0x13d0 [scsi_mod] __scsi_scan_target+0x18a/0x810 [scsi_mod] scsi_scan_target+0x11e/0x130 [scsi_mod] srp_create_target+0x1522/0x19e0 [ib_srp] kernfs_fop_write+0x180/0x210 __vfs_write+0xb1/0x2e0 vfs_write+0xf6/0x250 SyS_write+0x99/0x110 do_syscall_64+0xee/0x2b0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 The buggy address belongs to the page: page:ffffea0001998180 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880066606d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 ffff880066606d80: f1 00 f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 >ffff880066606e00: f2 00 00 00 00 00 f2 f2 f2 f3 f3 f3 f3 00 00 00 ^ ffff880066606e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880066606f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Moni Shoua <monis@mellanox.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA deviceRoland Dreier
commit 8435168d50e66fa5eae01852769d20a36f9e5e83 upstream. Check to make sure that ctx->cm_id->device is set before we use it. Otherwise userspace can trigger a NULL dereference by doing RDMA_USER_CM_CMD_SET_OPTION on an ID that is not bound to a device. Cc: <stable@vger.kernel.org> Reported-by: <syzbot+a67bc93e14682d92fc2f@syzkaller.appspotmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13RDMA/hfi1: fix array termination by appending NULL to attr arraySteven L. Roberts
[ Upstream commit c4dd4b69f55abcc8dd079f8de55d9d8c2ddbefce ] This fixes a kernel panic when loading the hfi driver as a dynamic module. Signed-off-by: Steven L Roberts <robers97@gmail.com> Reviewed-by: Leon Romanovsky <leon@kernel.org> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlersRaju Rangoju
[ Upstream commit 1dad0ebeea1cd890b8892523f736916e245b0aef ] The patch 761e19a504af (RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016 leads to the following static checker warning: drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure() warn: passing freed memory 'skb' Also fixes skb leak when l2t resolution fails Fixes: 761e19a504afa55 (RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13IB/rdmavt: Allocate CQ memory on the correct nodeMike Marciniszyn
[ Upstream commit db9a2c6f9b6196b889b98e961cb9a37617b11ccf ] CQ allocation does not ensure that completion queue entries and the completion queue structure are allocated on the correct numa node. Fix by allocating the rvt_cq and kernel CQ entries on the device node, leaving the user CQ entries on the default local node. Also ensure CQ resizes use the correct allocator when extending a CQ. Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13i40iw: Correct Q1/XF object count equationShiraz Saleem
[ Upstream commit fe99afd1febd74e0ef1fed7e3283f09effe1f4f0 ] Lower Inbound RDMA Read Queue (Q1) object count by a factor of 2 as it is incorrectly doubled. Also, round up Q1 and Transmit FIFO (XF) object count to power of 2 to satisfy hardware requirement. Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13i40iw: Fix sequence number for the first partial FPDUShiraz Saleem
[ Upstream commit df8b13a1b23356d01dfc4647a5629cdb0f4ce566 ] Partial FPDU processing is broken as the sequence number for the first partial FPDU is wrong due to incorrect Q2 buffer offset. The offset should be 64 rather than 16. Fixes: 786c6adb3a94 ("i40iw: add puda code") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13IB/srpt: Avoid that aborting a command triggers a kernel warningBart Van Assche
[ Upstream commit bd2c52d733f126ff75f99c537a27655b2db07e28 ] Avoid that the following warning is triggered: WARNING: CPU: 10 PID: 166 at ../drivers/infiniband/ulp/srpt/ib_srpt.c:2674 srpt_release_cmd+0x139/0x140 [ib_srpt] CPU: 10 PID: 166 Comm: kworker/u24:8 Not tainted 4.9.4-1-default #1 Workqueue: tmr-fileio target_tmr_work [target_core_mod] Call Trace: [<ffffffffaa3c4f70>] dump_stack+0x63/0x83 [<ffffffffaa0844eb>] __warn+0xcb/0xf0 [<ffffffffaa0845dd>] warn_slowpath_null+0x1d/0x20 [<ffffffffc06ba429>] srpt_release_cmd+0x139/0x140 [ib_srpt] [<ffffffffc06e4377>] target_release_cmd_kref+0xb7/0x120 [target_core_mod] [<ffffffffc06e4d7f>] target_put_sess_cmd+0x2f/0x60 [target_core_mod] [<ffffffffc06e15e0>] core_tmr_lun_reset+0x340/0x790 [target_core_mod] [<ffffffffc06e4816>] target_tmr_work+0xe6/0x140 [target_core_mod] [<ffffffffaa09e4d3>] process_one_work+0x1f3/0x4d0 [<ffffffffaa09e7f8>] worker_thread+0x48/0x4e0 [<ffffffffaa09e7b0>] ? process_one_work+0x4d0/0x4d0 [<ffffffffaa0a46da>] kthread+0xca/0xe0 [<ffffffffaa0a4610>] ? kthread_park+0x60/0x60 [<ffffffffaa71b775>] ret_from_fork+0x25/0x30 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>