<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/infiniband/ulp, branch v4.19</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop</title>
<updated>2018-09-19T21:28:24+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2018-09-18T01:10:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ee92efe41cf358f4b99e73509f2bfd4733609f26'/>
<id>ee92efe41cf358f4b99e73509f2bfd4733609f26</id>
<content type='text'>
Use different loop variables for the inner and outer loop. This avoids
that an infinite loop occurs if there are more RDMA channels than
target-&gt;req_ring_size.

Fixes: d92c0da71a35 ("IB/srp: Add multichannel support")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use different loop variables for the inner and outer loop. This avoids
that an infinite loop occurs if there are more RDMA channels than
target-&gt;req_ring_size.

Fixes: d92c0da71a35 ("IB/srp: Add multichannel support")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler</title>
<updated>2018-09-05T21:32:06+00:00</updated>
<author>
<name>Aaron Knister</name>
<email>aaron.s.knister@nasa.gov</email>
</author>
<published>2018-08-24T12:42:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=816e846c2eb9129a3e0afa5f920c8bbc71efecaa'/>
<id>816e846c2eb9129a3e0afa5f920c8bbc71efecaa</id>
<content type='text'>
Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh-&gt;queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv-&gt;lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&amp;neigh-&gt;queue) &lt; IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr-&gt;hwaddr);
               spin_lock_irqsave(&amp;priv-&gt;lock, flags);
               __skb_queue_tail(&amp;neigh-&gt;queue, skb);
               spin_unlock_irqrestore(&amp;priv-&gt;lock, flags);
       } else {
               ++dev-&gt;stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Knister &lt;aaron.s.knister@nasa.gov&gt;
Tested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh-&gt;queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv-&gt;lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&amp;neigh-&gt;queue) &lt; IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr-&gt;hwaddr);
               spin_lock_irqsave(&amp;priv-&gt;lock, flags);
               __skb_queue_tail(&amp;neigh-&gt;queue, skb);
               spin_unlock_irqrestore(&amp;priv-&gt;lock, flags);
       } else {
               ++dev-&gt;stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Knister &lt;aaron.s.knister@nasa.gov&gt;
Tested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'linus/master' into rdma.git for-next</title>
<updated>2018-08-16T20:21:29+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-08-16T20:13:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0a3173a5f09bc58a3638ecfd0a80bdbae55e123c'/>
<id>0a3173a5f09bc58a3638ecfd0a80bdbae55e123c</id>
<content type='text'>
rdma.git merge resolution for the 4.19 merge window

Conflicts:
 drivers/infiniband/core/rdma_core.c
   - Use the rdma code and revise with the new spelling for
     atomic_fetch_add_unless
 drivers/nvme/host/rdma.c
   - Replace max_sge with max_send_sge in new blk code
 drivers/nvme/target/rdma.c
   - Use the blk code and revise to use NULL for ib_post_recv when
     appropriate
   - Replace max_sge with max_recv_sge in new blk code
 net/rds/ib_send.c
   - Use the net code and revise to use NULL for ib_post_recv when
     appropriate

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
rdma.git merge resolution for the 4.19 merge window

Conflicts:
 drivers/infiniband/core/rdma_core.c
   - Use the rdma code and revise with the new spelling for
     atomic_fetch_add_unless
 drivers/nvme/host/rdma.c
   - Replace max_sge with max_send_sge in new blk code
 drivers/nvme/target/rdma.c
   - Use the blk code and revise to use NULL for ib_post_recv when
     appropriate
   - Replace max_sge with max_recv_sge in new blk code
 net/rds/ib_send.c
   - Use the net code and revise to use NULL for ib_post_recv when
     appropriate

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi</title>
<updated>2018-08-16T05:06:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-16T05:06:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=72f02ba66bd83b54054da20eae550123de84da6f'/>
<id>72f02ba66bd83b54054da20eae550123de84da6f</id>
<content type='text'>
Pull SCSI updates from James Bottomley:
 "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
  hisi_sas, smartpqi, megaraid_sas, arcmsr.

  In addition, with the continuing absence of Nic we have target updates
  for tcmu and target core (all with reviews and acks).

  The biggest observable change is going to be that we're (again) trying
  to switch to mulitqueue as the default (a user can still override the
  setting on the kernel command line).

  Other major core stuff is the removal of the remaining Microchannel
  drivers, an update of the internal timers and some reworks of
  completion and result handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
  scsi: ufs: remove unnecessary query(DM) UPIU trace
  scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
  scsi: aacraid: Spelling fix in comment
  scsi: mpt3sas: Fix calltrace observed while running IO &amp; reset
  scsi: aic94xx: fix an error code in aic94xx_init()
  scsi: st: remove redundant pointer STbuffer
  scsi: qla2xxx: Update driver version to 10.00.00.08-k
  scsi: qla2xxx: Migrate NVME N2N handling into state machine
  scsi: qla2xxx: Save frame payload size from ICB
  scsi: qla2xxx: Fix stalled relogin
  scsi: qla2xxx: Fix race between switch cmd completion and timeout
  scsi: qla2xxx: Fix Management Server NPort handle reservation logic
  scsi: qla2xxx: Flush mailbox commands on chip reset
  scsi: qla2xxx: Fix unintended Logout
  scsi: qla2xxx: Fix session state stuck in Get Port DB
  scsi: qla2xxx: Fix redundant fc_rport registration
  scsi: qla2xxx: Silent erroneous message
  scsi: qla2xxx: Prevent sysfs access when chip is down
  scsi: qla2xxx: Add longer window for chip reset
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull SCSI updates from James Bottomley:
 "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
  hisi_sas, smartpqi, megaraid_sas, arcmsr.

  In addition, with the continuing absence of Nic we have target updates
  for tcmu and target core (all with reviews and acks).

  The biggest observable change is going to be that we're (again) trying
  to switch to mulitqueue as the default (a user can still override the
  setting on the kernel command line).

  Other major core stuff is the removal of the remaining Microchannel
  drivers, an update of the internal timers and some reworks of
  completion and result handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
  scsi: ufs: remove unnecessary query(DM) UPIU trace
  scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
  scsi: aacraid: Spelling fix in comment
  scsi: mpt3sas: Fix calltrace observed while running IO &amp; reset
  scsi: aic94xx: fix an error code in aic94xx_init()
  scsi: st: remove redundant pointer STbuffer
  scsi: qla2xxx: Update driver version to 10.00.00.08-k
  scsi: qla2xxx: Migrate NVME N2N handling into state machine
  scsi: qla2xxx: Save frame payload size from ICB
  scsi: qla2xxx: Fix stalled relogin
  scsi: qla2xxx: Fix race between switch cmd completion and timeout
  scsi: qla2xxx: Fix Management Server NPort handle reservation logic
  scsi: qla2xxx: Flush mailbox commands on chip reset
  scsi: qla2xxx: Fix unintended Logout
  scsi: qla2xxx: Fix session state stuck in Get Port DB
  scsi: qla2xxx: Fix redundant fc_rport registration
  scsi: qla2xxx: Silent erroneous message
  scsi: qla2xxx: Prevent sysfs access when chip is down
  scsi: qla2xxx: Add longer window for chip reset
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2018-08-15T22:04:25+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-15T22:04:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9a76aba02a37718242d7cdc294f0a3901928aa57'/>
<id>9a76aba02a37718242d7cdc294f0a3901928aa57</id>
<content type='text'>
Pull networking updates from David Miller:
 "Highlights:

   - Gustavo A. R. Silva keeps working on the implicit switch fallthru
     changes.

   - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
     Luca Coelho.

   - Re-enable ASPM in r8169, from Kai-Heng Feng.

   - Add virtual XFRM interfaces, which avoids all of the limitations of
     existing IPSEC tunnels. From Steffen Klassert.

   - Convert GRO over to use a hash table, so that when we have many
     flows active we don't traverse a long list during accumluation.

   - Many new self tests for routing, TC, tunnels, etc. Too many
     contributors to mention them all, but I'm really happy to keep
     seeing this stuff.

   - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.

   - Lots of cleanups and fixes in L2TP code from Guillaume Nault.

   - Add IPSEC offload support to netdevsim, from Shannon Nelson.

   - Add support for slotting with non-uniform distribution to netem
     packet scheduler, from Yousuk Seung.

   - Add UDP GSO support to mlx5e, from Boris Pismenny.

   - Support offloading of Team LAG in NFP, from John Hurley.

   - Allow to configure TX queue selection based upon RX queue, from
     Amritha Nambiar.

   - Support ethtool ring size configuration in aquantia, from Anton
     Mikaev.

   - Support DSCP and flowlabel per-transport in SCTP, from Xin Long.

   - Support list based batching and stack traversal of SKBs, this is
     very exciting work. From Edward Cree.

   - Busyloop optimizations in vhost_net, from Toshiaki Makita.

   - Introduce the ETF qdisc, which allows time based transmissions. IGB
     can offload this in hardware. From Vinicius Costa Gomes.

   - Add parameter support to devlink, from Moshe Shemesh.

   - Several multiplication and division optimizations for BPF JIT in
     nfp driver, from Jiong Wang.

   - Lots of prepatory work to make more of the packet scheduler layer
     lockless, when possible, from Vlad Buslov.

   - Add ACK filter and NAT awareness to sch_cake packet scheduler, from
     Toke Høiland-Jørgensen.

   - Support regions and region snapshots in devlink, from Alex Vesker.

   - Allow to attach XDP programs to both HW and SW at the same time on
     a given device, with initial support in nfp. From Jakub Kicinski.

   - Add TLS RX offload and support in mlx5, from Ilya Lesokhin.

   - Use PHYLIB in r8169 driver, from Heiner Kallweit.

   - All sorts of changes to support Spectrum 2 in mlxsw driver, from
     Ido Schimmel.

   - PTP support in mv88e6xxx DSA driver, from Andrew Lunn.

   - Make TCP_USER_TIMEOUT socket option more accurate, from Jon
     Maxwell.

   - Support for templates in packet scheduler classifier, from Jiri
     Pirko.

   - IPV6 support in RDS, from Ka-Cheong Poon.

   - Native tproxy support in nf_tables, from Máté Eckl.

   - Maintain IP fragment queue in an rbtree, but optimize properly for
     in-order frags. From Peter Oskolkov.

   - Improvde handling of ACKs on hole repairs, from Yuchung Cheng"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
  bpf: test: fix spelling mistake "REUSEEPORT" -&gt; "REUSEPORT"
  hv/netvsc: Fix NULL dereference at single queue mode fallback
  net: filter: mark expected switch fall-through
  xen-netfront: fix warn message as irq device name has '/'
  cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
  net: dsa: mv88e6xxx: missing unlock on error path
  rds: fix building with IPV6=m
  inet/connection_sock: prefer _THIS_IP_ to current_text_addr
  net: dsa: mv88e6xxx: bitwise vs logical bug
  net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
  ieee802154: hwsim: using right kind of iteration
  net: hns3: Add vlan filter setting by ethtool command -K
  net: hns3: Set tx ring' tc info when netdev is up
  net: hns3: Remove tx ring BD len register in hns3_enet
  net: hns3: Fix desc num set to default when setting channel
  net: hns3: Fix for phy link issue when using marvell phy driver
  net: hns3: Fix for information of phydev lost problem when down/up
  net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
  net: hns3: Add support for serdes loopback selftest
  bnxt_en: take coredump_record structure off stack
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking updates from David Miller:
 "Highlights:

   - Gustavo A. R. Silva keeps working on the implicit switch fallthru
     changes.

   - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
     Luca Coelho.

   - Re-enable ASPM in r8169, from Kai-Heng Feng.

   - Add virtual XFRM interfaces, which avoids all of the limitations of
     existing IPSEC tunnels. From Steffen Klassert.

   - Convert GRO over to use a hash table, so that when we have many
     flows active we don't traverse a long list during accumluation.

   - Many new self tests for routing, TC, tunnels, etc. Too many
     contributors to mention them all, but I'm really happy to keep
     seeing this stuff.

   - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.

   - Lots of cleanups and fixes in L2TP code from Guillaume Nault.

   - Add IPSEC offload support to netdevsim, from Shannon Nelson.

   - Add support for slotting with non-uniform distribution to netem
     packet scheduler, from Yousuk Seung.

   - Add UDP GSO support to mlx5e, from Boris Pismenny.

   - Support offloading of Team LAG in NFP, from John Hurley.

   - Allow to configure TX queue selection based upon RX queue, from
     Amritha Nambiar.

   - Support ethtool ring size configuration in aquantia, from Anton
     Mikaev.

   - Support DSCP and flowlabel per-transport in SCTP, from Xin Long.

   - Support list based batching and stack traversal of SKBs, this is
     very exciting work. From Edward Cree.

   - Busyloop optimizations in vhost_net, from Toshiaki Makita.

   - Introduce the ETF qdisc, which allows time based transmissions. IGB
     can offload this in hardware. From Vinicius Costa Gomes.

   - Add parameter support to devlink, from Moshe Shemesh.

   - Several multiplication and division optimizations for BPF JIT in
     nfp driver, from Jiong Wang.

   - Lots of prepatory work to make more of the packet scheduler layer
     lockless, when possible, from Vlad Buslov.

   - Add ACK filter and NAT awareness to sch_cake packet scheduler, from
     Toke Høiland-Jørgensen.

   - Support regions and region snapshots in devlink, from Alex Vesker.

   - Allow to attach XDP programs to both HW and SW at the same time on
     a given device, with initial support in nfp. From Jakub Kicinski.

   - Add TLS RX offload and support in mlx5, from Ilya Lesokhin.

   - Use PHYLIB in r8169 driver, from Heiner Kallweit.

   - All sorts of changes to support Spectrum 2 in mlxsw driver, from
     Ido Schimmel.

   - PTP support in mv88e6xxx DSA driver, from Andrew Lunn.

   - Make TCP_USER_TIMEOUT socket option more accurate, from Jon
     Maxwell.

   - Support for templates in packet scheduler classifier, from Jiri
     Pirko.

   - IPV6 support in RDS, from Ka-Cheong Poon.

   - Native tproxy support in nf_tables, from Máté Eckl.

   - Maintain IP fragment queue in an rbtree, but optimize properly for
     in-order frags. From Peter Oskolkov.

   - Improvde handling of ACKs on hole repairs, from Yuchung Cheng"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
  bpf: test: fix spelling mistake "REUSEEPORT" -&gt; "REUSEPORT"
  hv/netvsc: Fix NULL dereference at single queue mode fallback
  net: filter: mark expected switch fall-through
  xen-netfront: fix warn message as irq device name has '/'
  cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
  net: dsa: mv88e6xxx: missing unlock on error path
  rds: fix building with IPV6=m
  inet/connection_sock: prefer _THIS_IP_ to current_text_addr
  net: dsa: mv88e6xxx: bitwise vs logical bug
  net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
  ieee802154: hwsim: using right kind of iteration
  net: hns3: Add vlan filter setting by ethtool command -K
  net: hns3: Set tx ring' tc info when netdev is up
  net: hns3: Remove tx ring BD len register in hns3_enet
  net: hns3: Fix desc num set to default when setting channel
  net: hns3: Fix for phy link issue when using marvell phy driver
  net: hns3: Fix for information of phydev lost problem when down/up
  net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
  net: hns3: Add support for serdes loopback selftest
  bnxt_en: take coredump_record structure off stack
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Consolidate checking of the proposed child interface</title>
<updated>2018-08-03T02:27:44+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-07-29T08:35:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=760109760455a0a35491cb02a3bc3e15f0c180f6'/>
<id>760109760455a0a35491cb02a3bc3e15f0c180f6</id>
<content type='text'>
Move all the checking for pkey and other validity to the __ipoib_vlan_add
function. This removes the last difference from the control flow
of the __ipoib_vlan_add to make the overall design simpler to
understand.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move all the checking for pkey and other validity to the __ipoib_vlan_add
function. This removes the last difference from the control flow
of the __ipoib_vlan_add to make the overall design simpler to
understand.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Erez Shitrit &lt;erezsh@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Maintain the child_intfs list from ndo_init/uninit</title>
<updated>2018-08-03T02:27:44+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-07-29T08:34:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=13476d35bba60b59521ff25d902fdb552b8bf2ac'/>
<id>13476d35bba60b59521ff25d902fdb552b8bf2ac</id>
<content type='text'>
This fixes a bug in the netlink path where the vlan_rwsem was not
held around __ipoib_vlan_add causing the child_intfs to be manipulated
unsafely.

In the process this greatly simplifies the vlan_rwsem write side locking
to only cover a single non-sleeping statement.

This also further increases the safety of the removal ordering by holding
the netdev of the parent while the child is active to ensure most bugs
become either an oops on a NULL priv or a deadlock on the netdev refcount.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes a bug in the netlink path where the vlan_rwsem was not
held around __ipoib_vlan_add causing the child_intfs to be manipulated
unsafely.

In the process this greatly simplifies the vlan_rwsem write side locking
to only cover a single non-sleeping statement.

This also further increases the safety of the removal ordering by holding
the netdev of the parent while the child is active to ensure most bugs
become either an oops on a NULL priv or a deadlock on the netdev refcount.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Do not remove child devices from within the ndo_uninit</title>
<updated>2018-08-03T02:27:43+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-07-29T08:34:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=25405d98a2aa0b9983bb9c36b0b00815d39394f4'/>
<id>25405d98a2aa0b9983bb9c36b0b00815d39394f4</id>
<content type='text'>
Switching to priv_destructor and needs_free_netdev created a subtle
ordering problem in ipoib_remove_one.

Now that unregister_netdev frees the netdev and priv we must ensure that
the children are unregistered before trying to unregister the parent,
or child unregister will use after free.

The solution is to unregister the children, then parent, in the same batch
all while holding the rtnl_lock. This closes all the races where a new
child could have been added and ensures proper ordering.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Switching to priv_destructor and needs_free_netdev created a subtle
ordering problem in ipoib_remove_one.

Now that unregister_netdev frees the netdev and priv we must ensure that
the children are unregistered before trying to unregister the parent,
or child unregister will use after free.

The solution is to unregister the children, then parent, in the same batch
all while holding the rtnl_lock. This closes all the races where a new
child could have been added and ensures proper ordering.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/ipoib: Get rid of the sysfs_mutex</title>
<updated>2018-08-03T02:27:43+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-07-29T08:34:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ee190ab734ba4d3c7887bd193ce8124385738e44'/>
<id>ee190ab734ba4d3c7887bd193ce8124385738e44</id>
<content type='text'>
This mutex was introduced to deal with the deadlock formed by calling
unregister_netdev from within the sysfs callback of a netdev.

Now that we have priv_destructor and needs_free_netdev we can switch
to the more targeted solution of running the unregister from a
work queue. This avoids the deadlock and gets rid of the mutex.

The next patch in the series needs this mutex eliminated to create
atomicity of unregisteration.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This mutex was introduced to deal with the deadlock formed by calling
unregister_netdev from within the sysfs callback of a netdev.

Now that we have priv_destructor and needs_free_netdev we can switch
to the more targeted solution of running the unregister from a
work queue. This avoids the deadlock and gets rid of the mutex.

The next patch in the series needs this mutex eliminated to create
atomicity of unregisteration.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/netdev: Use priv_destructor for netdev cleanup</title>
<updated>2018-08-03T02:27:43+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2018-07-29T08:34:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9f49a5b5c21d58aa84e16cfdc5e99e49faefcb7a'/>
<id>9f49a5b5c21d58aa84e16cfdc5e99e49faefcb7a</id>
<content type='text'>
Now that the unregister_netdev flow for IPoIB no longer relies on external
code we can now introduce the use of priv_destructor and
needs_free_netdev.

The rdma_netdev flow is switched to use the netdev common priv_destructor
instead of the special free_rdma_netdev and the IPOIB ULP adjusted:
 - priv_destructor needs to switch to point to the ULP's destructor
   which will then call the rdma_ndev's in the right order
 - We need to be careful around the error unwind of register_netdev
   as it sometimes calls priv_destructor on failure
 - ULPs need to use ndo_init/uninit to ensure proper ordering
   of failures around register_netdev

Switching to priv_destructor is a necessary pre-requisite to using
the rtnl new_link mechanism.

The VNIC user for rdma_netdev should also be revised, but that is left for
another patch.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Denis Drozdov &lt;denisd@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the unregister_netdev flow for IPoIB no longer relies on external
code we can now introduce the use of priv_destructor and
needs_free_netdev.

The rdma_netdev flow is switched to use the netdev common priv_destructor
instead of the special free_rdma_netdev and the IPOIB ULP adjusted:
 - priv_destructor needs to switch to point to the ULP's destructor
   which will then call the rdma_ndev's in the right order
 - We need to be careful around the error unwind of register_netdev
   as it sometimes calls priv_destructor on failure
 - ULPs need to use ndo_init/uninit to ensure proper ordering
   of failures around register_netdev

Switching to priv_destructor is a necessary pre-requisite to using
the rtnl new_link mechanism.

The VNIC user for rdma_netdev should also be revised, but that is left for
another patch.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Denis Drozdov &lt;denisd@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
