<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/mlx4, branch v3.14.3</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2014-01-25T19:17:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-01-25T19:17:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4ba9920e5e9c0e16b5ed24292d45322907bb9035'/>
<id>4ba9920e5e9c0e16b5ed24292d45322907bb9035</id>
<content type='text'>
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ip-roce' into for-next</title>
<updated>2014-01-23T07:24:21+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2014-01-23T07:24:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fb1b5034e4987b158179a62732fb6dfb8f7ec88e'/>
<id>fb1b5034e4987b158179a62732fb6dfb8f7ec88e</id>
<content type='text'>
Conflicts:
	drivers/infiniband/hw/mlx4/main.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/infiniband/hw/mlx4/main.c
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing</title>
<updated>2014-01-18T22:12:53+00:00</updated>
<author>
<name>Moni Shoua</name>
<email>monis@mellanox.com</email>
</author>
<published>2013-12-12T16:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=297e0dad720664dad44baa2cdd13f871979fb58c'/>
<id>297e0dad720664dad44baa2cdd13f871979fb58c</id>
<content type='text'>
IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

Therefore, we need to extract them from the CQE and place them in
struct ib_wc (to be used for cases were they were taken from the gid).

Also, when modifying a QP or building address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the address
handle attributes.

Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

Therefore, we need to extract them from the CQE and place them in
struct ib_wc (to be used for cases were they were taken from the gid).

Also, when modifying a QP or building address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the address
handle attributes.

Signed-off-by: Moni Shoua &lt;monis@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/core: Ethernet L2 attributes in verbs/cm structures</title>
<updated>2014-01-14T22:20:54+00:00</updated>
<author>
<name>Matan Barak</name>
<email>matanb@mellanox.com</email>
</author>
<published>2013-12-12T16:03:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=dd5f03beb4f76ae65d76d8c22a8815e424fc607c'/>
<id>dd5f03beb4f76ae65d76d8c22a8815e424fc607c</id>
<content type='text'>
This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills. its internal structures from the
path provided by the ULP.  We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message.  We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB.  Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills. its internal structures from the
path provided by the ULP.  We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message.  We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB.  Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mlx4_core: Add support for steerable IB UD QPs</title>
<updated>2014-01-14T22:06:50+00:00</updated>
<author>
<name>Matan Barak</name>
<email>matanb@mellanox.com</email>
</author>
<published>2013-11-07T13:25:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4de6580360867d44adecb2d05febed1c8d186c82'/>
<id>4de6580360867d44adecb2d05febed1c8d186c82</id>
<content type='text'>
This patch adds support for allocating IB UD QPs that we can steer
traffic from.  We introduce a new firmware command FLOW_STEERING_IB_UC_QP_RANGE
and a capability bit.

This command isn't supported for VFs.

Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds support for allocating IB UD QPs that we can steer
traffic from.  We introduce a new firmware command FLOW_STEERING_IB_UC_QP_RANGE
and a capability bit.

This command isn't supported for VFs.

Signed-off-by: Matan Barak &lt;matanb@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx4_core: Add basic support for TCP/IP offloads under tunneling</title>
<updated>2013-12-31T19:31:43+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2013-12-23T14:09:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7ffdf726cfe0d188907bdbb0e7729fb35a69c219'/>
<id>7ffdf726cfe0d188907bdbb0e7729fb35a69c219</id>
<content type='text'>
Add the low-level device commands and definitions used for TCP/IP HW offloads
of tunneled/vxlan traffic which are supported by the ConnectX3-pro NIC.

This is done through the following elements:

 - read tunneling device caps in QUERY_DEV_CAP
 - add helper function to do SET_PORT for tunneling
 - add DMFS VXLAN steering rule definitions
 - add CQE and WQE checksum offload field definitions

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the low-level device commands and definitions used for TCP/IP HW offloads
of tunneled/vxlan traffic which are supported by the ConnectX3-pro NIC.

This is done through the following elements:

 - read tunneling device caps in QUERY_DEV_CAP
 - add helper function to do SET_PORT for tunneling
 - add DMFS VXLAN steering rule definitions
 - add CQE and WQE checksum offload field definitions

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx4_core: Expose physical port id as PF/VF capability</title>
<updated>2013-12-20T00:04:43+00:00</updated>
<author>
<name>Hadar Hen Zion</name>
<email>hadarh@mellanox.com</email>
</author>
<published>2013-12-19T19:20:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e1a28e8e6797449dfdfa4739002d1e5939355a8'/>
<id>8e1a28e8e6797449dfdfa4739002d1e5939355a8</id>
<content type='text'>
Add the infrastructure needed to support ndo_get_phys_port_id which
allows users to identify to which physical port a net-device is connected
to by reading a unique port id.
This will work for VFs and PFs.
The driver uses a new device capability - phys_port_id, The PF driver
reads the port phys_port_id from Firmware and stores it. The VF driver
reads the port phys_port_id from the PF using QUERY_FUNC_CAP command.

Signed-off-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the infrastructure needed to support ndo_get_phys_port_id which
allows users to identify to which physical port a net-device is connected
to by reading a unique port id.
This will work for VFs and PFs.
The driver uses a new device capability - phys_port_id, The PF driver
reads the port phys_port_id from Firmware and stores it. The VF driver
reads the port phys_port_id from the PF using QUERY_FUNC_CAP command.

Signed-off-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx4_en: Datapath structures are allocated per NUMA node</title>
<updated>2013-11-08T00:22:48+00:00</updated>
<author>
<name>Eugenia Emantayev</name>
<email>eugenia@mellanox.com</email>
</author>
<published>2013-11-07T10:19:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=163561a4e2f8af44e96453bc10c7a4f9bcc736e1'/>
<id>163561a4e2f8af44e96453bc10c7a4f9bcc736e1</id>
<content type='text'>
For each RX/TX ring and its CQ, allocation is done on a NUMA node that
corresponds to the core that the data structure should operate on.
The assumption is that the core number is reflected by the ring index.
The affected allocations are the ring/CQ data structures,
the TX/RX info and the shared HW/SW buffer.
For TX rings, each core has rings of all UPs.

Signed-off-by: Yevgeny Petrilin &lt;yevgenyp@mellanox.com&gt;
Signed-off-by: Eugenia Emantayev &lt;eugenia@mellanox.com&gt;
Reviewed-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For each RX/TX ring and its CQ, allocation is done on a NUMA node that
corresponds to the core that the data structure should operate on.
The assumption is that the core number is reflected by the ring index.
The affected allocations are the ring/CQ data structures,
the TX/RX info and the shared HW/SW buffer.
For TX rings, each core has rings of all UPs.

Signed-off-by: Yevgeny Petrilin &lt;yevgenyp@mellanox.com&gt;
Signed-off-by: Eugenia Emantayev &lt;eugenia@mellanox.com&gt;
Reviewed-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx4_core: ICM pages are allocated on device NUMA node</title>
<updated>2013-11-08T00:22:48+00:00</updated>
<author>
<name>Eugenia Emantayev</name>
<email>eugenia@mellanox.com</email>
</author>
<published>2013-11-07T10:19:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6e7136ed7793fa4948b0192dcd6862d12a50d67c'/>
<id>6e7136ed7793fa4948b0192dcd6862d12a50d67c</id>
<content type='text'>
This is done to optimize FW/HW access to host memory.

Signed-off-by: Yevgeny Petrilin &lt;yevgenyp@mellanox.com&gt;
Signed-off-by: Eugenia Emantayev &lt;eugenia@mellanox.com&gt;
Reviewed-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is done to optimize FW/HW access to host memory.

Signed-off-by: Yevgeny Petrilin &lt;yevgenyp@mellanox.com&gt;
Signed-off-by: Eugenia Emantayev &lt;eugenia@mellanox.com&gt;
Reviewed-by: Hadar Hen Zion &lt;hadarh@mellanox.com&gt;
Signed-off-by: Amir Vadai &lt;amirv@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mlx4: Structures and init/teardown for VF resource quotas</title>
<updated>2013-11-04T21:19:07+00:00</updated>
<author>
<name>Jack Morgenstein</name>
<email>jackm@dev.mellanox.co.il</email>
</author>
<published>2013-11-03T08:03:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5a0d0a6161aecbbc76e4c1d2b82e4c7cef88bb29'/>
<id>5a0d0a6161aecbbc76e4c1d2b82e4c7cef88bb29</id>
<content type='text'>
This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent
any entity from simply grabbing all the resources for itself and leaving
the other entities unable to obtain such resources.

Resources which are allocated using quotas:  QPs, CQs, SRQs, MPTs, MTTs, MAC,
                                             VLAN, and Counters.

The quota system works as follows:
Each entity (VF or PF) is given a max number of a given resource (its quota),
and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs:
50% of the available quantity for the resource is divided equally among
the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module
parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool.
The other 50% is the "free pool", allocated on a first-come-first-serve basis.
For each VF/PF, resources are first allocated from its "guaranteed-minimum"
pool. When that pool is exhausted, the driver attempts to allocate from
the resource "free-pool".

The quota (i.e., max) for the VFs and the PF is:
  The free-pool amount (50% of the real max) + the guaranteed minimum

For MACs:
  Guarantee 2 MACs per VF/PF per port. As a result, since we have only
  128 MACs per port, reduce the allowable number of VFs from 64 to 63.
  Any remaining MACs are put into a free pool.

For VLANs:
  For the PF, the per-port quota is 128 and guarantee is 64
     (to allow the PF to register at least a VLAN per VF in VST mode).
  For the VFs, the per-port quota is 64 and the guarantee is 0.
      We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters:
  For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the
resource-tracker struct.  In addition, we do initialization
for the resource quota, and adjust the query_device response to use quotas
rather than resource maxima.

As part of the implementation, we introduce a new field in
mlx4_dev: quotas.  This field holds the resource quotas used
to report maxima to the upper layers (ib_core, via query_device).

The HCA maxima of these values are passed to the VFs (via
QUERY_HCA) so that they may continue to use these in handling
QPs, CQs, SRQs and MPTs.

Signed-off-by: Jack Morgenstein &lt;jackm@dev.mellanox.co.il&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent
any entity from simply grabbing all the resources for itself and leaving
the other entities unable to obtain such resources.

Resources which are allocated using quotas:  QPs, CQs, SRQs, MPTs, MTTs, MAC,
                                             VLAN, and Counters.

The quota system works as follows:
Each entity (VF or PF) is given a max number of a given resource (its quota),
and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs:
50% of the available quantity for the resource is divided equally among
the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module
parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool.
The other 50% is the "free pool", allocated on a first-come-first-serve basis.
For each VF/PF, resources are first allocated from its "guaranteed-minimum"
pool. When that pool is exhausted, the driver attempts to allocate from
the resource "free-pool".

The quota (i.e., max) for the VFs and the PF is:
  The free-pool amount (50% of the real max) + the guaranteed minimum

For MACs:
  Guarantee 2 MACs per VF/PF per port. As a result, since we have only
  128 MACs per port, reduce the allowable number of VFs from 64 to 63.
  Any remaining MACs are put into a free pool.

For VLANs:
  For the PF, the per-port quota is 128 and guarantee is 64
     (to allow the PF to register at least a VLAN per VF in VST mode).
  For the VFs, the per-port quota is 64 and the guarantee is 0.
      We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters:
  For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the
resource-tracker struct.  In addition, we do initialization
for the resource quota, and adjust the query_device response to use quotas
rather than resource maxima.

As part of the implementation, we introduce a new field in
mlx4_dev: quotas.  This field holds the resource quotas used
to report maxima to the upper layers (ib_core, via query_device).

The HCA maxima of these values are passed to the VFs (via
QUERY_HCA) so that they may continue to use these in handling
QPs, CQs, SRQs and MPTs.

Signed-off-by: Jack Morgenstein &lt;jackm@dev.mellanox.co.il&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
