summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-04-19qlcnic: Fix loopback test for SR-IOV PF.Rajesh Borundia
o Do not disable mailbox interrupts while running loopback test through SR-IOV PF. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19qlcnic: Support VLAN id config.Rajesh Borundia
o Add support for VLAN id configuration per VF using iproute2 tool. o VLAN id's 1-4094 are treated as PVID by the PF and Guest VLAN tagging is not allowed by default. o PVID is disabled when the VLAN id is set to 0 o Guest VLAN tagging is allowed when the VLAN id is set to 4095. o Only one Guest VLAN id is supported. o VLAN id can be changed only when the VF driver is not loaded. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19qlcnic: Support MAC address, Tx rate config.Rajesh Borundia
o Add support for MAC address and Tx rate configuration per VF via iproute2 tool. o Tx rate change is allowed while the guest is running and the VF driver is loaded. o MAC address change is allowed only when VF driver is not loaded. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19qlcnic: VF reset recovery implementation.Rajesh Borundia
o Implement recovery mechanism for VF to recover from adapter resets. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19qlcnic: VF FLR implementation.Rajesh Borundia
o FLR from Hypervisor - When hypervisor issues a VF FLR request, adapter notifies the parent PF driver of the FLR request for PF driver to perform any cleanup on behalf of that VF. o FLR from VF Driver - VF driver may initiate a VF FLR request, if VF state needs to be cleaned up before a re-initialization. VF re-initialization during kdump is an example. o PF driver cleans up all resources allocated on behalf of a VF, on VF FLR notifications from the adapter or from the VF driver. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19qlcnic: Change 82xx adapter VLAN id endian type.Rajesh Borundia
o 82xx adapter requires VLAN id in little endian format. Instead of passing vlan id parameter as __le16, pass the parameter as u16 and use cpu_to_le16 at appropriate places. Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19Merge branch 'netlink-mmap'David S. Miller
Patrick McHardy says: ==================== The following patches contain an implementation of memory mapped I/O for netlink. The implementation is modelled after AF_PACKET memory mapped I/O with a few differences: - In order to perform memory mapped I/O to userspace, the kernel allocates skbs with the data area pointing to the data area of the mapped frames. All netlink subsystems assume a linear data area, so for the sake of simplicity, the mapped data area is not attached to the paged area but to skb->data. This requires introduction of a special skb alloction function that just allocates an skb head without the data area. Since this is a quite rare use case, I introduced a new function based on __alloc_skb instead of splitting it up into head and data alloction. The alternative would be to introduce an __alloc_skb_head and __alloc_skb_data function, which would actually be useful for a specific error case in memory mapped netlink, but would require a couple of extra instructions for the common skb allocation case, so it doesn't really seem worth it. In order to get the destination memory area for skb->data before message construction, memory mapped netlink I/O needs to look up the destination socket during allocation instead of during transmission because the ring is owned by the receiveing socket/process. A special skb allocation function (netlink_alloc_skb) taking the destination pid as an argument is used for this, all subsystems that want to support memory mapped I/O need to use this function, automatic fallback to the receive queue happens for unconverted subsystems. Dumps automatically use memory mapped I/O if the receiving socket has enabled it. The visible effect of looking up the destination socket during allocation instead of transmission is that message ordering in userspace might change in case allocation and transmission aren't performed atomically. This usually doesn't matter since most subsystems have a BKL-like lock like the rtnl mutex, to my knowledge the currently only existing case where it might matter is nfnetlink_queue combined with the recently introduced batched verdicts, but a) that subsystem already includes sequence numbers which allow userspace to reorder messages in case it cares to, also the reodering window is quite small and b) with memory mapped transmission batching can be performed in a subsystem indepandant manner. - AF_NETLINK contains flow control for database dumps, with regular I/O dump continuation are triggered based on the sockets receive queue space and by recvmsg() calls. Since with memory mapped I/O there are no recvmsg() calls under normal operation, this is done in netlink_poll(), under the assumption that userspace has processed all pending frames before invoking poll(), thus the ring is expected to have room for new messages. Dumps currently don't benefit as much as they could from memory mapped I/O because each single continuation requires a poll() call. A more agressive approach seems like a good idea to me, especially in case the socket is not subscribed to any multicast groups (IOW only receiving explicitly requested data). Besides that, the memory mapped netlink implementation extends the states defined by AF_PACKET between userspace and the kernel by a SKIP status, this is intended for the case that userspace wants to queue frames (specifically when using nfnetlink_queue, an IDS and stream reassembly, requested by Eric Leblond) for a longer period of time. The kernel skips over all frames marked with SKIP when looking or unused frames and only fails when not finding a free frame or when having skipped the entire ring. Also noteworthy is memory mapped sendmsg: the kernel performs validation of messages before accepting and processing them, in order to prevent userspace from changing the messages contents after validation, the kernel checks that the ring is only mapped once and the file descriptor is not shared (in order to avoid having userspace set up another mapping after the first mentioned check). If either of both is not true, the message copied to an allocated skb and processed as with regular I/O. I'd especially appreciate review of this part since I'm not really versed in memory, file and process management, The remaining interesting details are included in the changelogs of the individual patches and the documentation, so I won't repeat them here. As an example, nfnetlink_queue is convererted to support memory mapped I/O. Other subsystems that would probably benefit are nfnetlink_log, audit and maybe ISCSI, not sure. Following are some numbers collected by Florian Westphal based on a slightly older version, which included an experimental patch for the nfnetlink_queue ordering issue. === Test hardware is a 12-core machine Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz ixgbe interfaces are used (i.e., multiqueue nics). irqs are distributed across the cpus. I've made several tests. The simple one consists of 3GBit UDP traffic, packets are 1500 bytes in size (i.e., no fragmentation), with a single nfqueue and the test client programs in libmnl examples directory. Packets are sent from one /24 net to another /24 net, i.e. there are a few hundred flows active at any given time. I've also tested with snort, but I disabled all rules. 6Gbit UDP traffic is generated in the snort case, and 6 nfqueues are used (i.e., 6 snorts run in parallel). I've tested with 3 different kernels, all based on 3.7.1. - 3.7.1, without the mmap patches - 3.7.1, with Patricks mmap patches - 3.7.1, with mmap patches and extended spinlock to ensure packet ids are monotonically increasing and cannot be re-ordered. This is what we currently ship in our product. [ the spinlock that is extended is the per nfqueue spinlock, it will be held from the time the netlink skb is allocated until the netlink skb is sent to userspace: http://1984.lsi.us.es/git/nf-next/commit/?h=mmap-netlink3&id=b8eb19c46650fef4e9e4fe53f367f99bbf72afc9 ] snort is normally used in "batch mode", i.e., after processing 25 packets a single "batch verdict" is sent to accept the packets seen so far. "mmap snort" means RX_RING + sendmsg(), i.e. TX_RING is not used at this time (except where noted below). One reason is that snort has a reload thread, so kernel needs to copy; also in the snort case no payload rewrite takes place, so compared to the rx path the tx path is cheap. Results: 3.7.1, without mmap patches, i.e. recv()+sendmsg() for everyone nfq-queue: 1.7 gbit out snort-recv-batch-25 5.1 gbit out snort-recv-no-batch 3.1 gbit out 3.7.1 + mmap + without extended spinlocked section nfq-queue: 1.7 gbit out (recv/sendmsg) nfq-queue-mmap: 2.4 gbit out snort-mmap-batch-25 5.6 gbit out (warning: since ids can be re-ordered, this version is "broken"). snort-recv-batch-25 5.1 gbit out snort-mmap-no-batch 4.6 gbit out (i.e., one verdict per packet) Kernel 3.7.1 + mmap + extended spinlock section: nfq-queue: 1.4 gbit out nfq-queue-mmap: 2.3 gbit out snort: 5.6 gbit out Conclusions: - The "extended spinlocked section" hurts performance in the single queue case; with 6 snorts there is no measureable slowdown. - I tried to re-write the mmap-snort to work without batch verdicts, but results were not very encouraging: kernel 3.7.1 + mmap (without extended spinlocked section): snort-mmap-batch-25 5.6 gbit out (what we currenlty ship) snort-recv-batch-25 5.1 gbit out (without using mmap) snort-mmap-batch-1 4.6 gbit out (with mmap but without batch verdicts) snort-mmap-txring-25 5.2 gbit out (with mmap but without batch verdicts) snort-mmap-txring-1 4.6 gbit out (with mmap but without batch verdicts) The difference between the last two is that in the txring-25 case, we put a verdict into the tx ring after every packet, but will only invoke sendmsg(, NULL, 0) after processing 25 packets. So the only difference is the number of sendmsg calls/context switches. So, i.o.w, kernel 3.7.1 + mmap + the extra locking crap is faster than 3.7.1 + mmap-without-extra-locking and single-verdict-per packet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19nfnetlink: add support for memory mapped netlinkPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netfilter: rename netlink related "pid" variables to "portid"Patrick McHardy
Get rid of the confusing mix of pid and portid and use portid consistently for all netlink related socket identities. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add documentation for memory mapped I/OPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add RX/TX-ring support to netlink diagPatrick McHardy
Based on AF_PACKET. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add flow control for memory mapped I/OPatrick McHardy
Add flow control for memory mapped RX. Since user-space usually doesn't invoke recvmsg() when using memory mapped I/O, flow control is performed in netlink_poll(). Dumps are allowed to continue if at least half of the ring frames are unused. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: implement memory mapped recvmsg()Patrick McHardy
Add support for mmap'ed recvmsg(). To allow the kernel to construct messages into the mapped area, a dataless skb is allocated and the data pointer is set to point into the ring frame. This means frames will be delivered to userspace in order of allocation instead of order of transmission. This usually doesn't matter since the order is either not determinable by userspace or message creation/transmission is serialized. The only case where this can have a visible difference is nfnetlink_queue. Userspace can't assume mmap'ed messages have ordered IDs anymore and needs to check this if using batched verdicts. For non-mapped sockets, nothing changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: implement memory mapped sendmsg()Patrick McHardy
Add support for mmap'ed sendmsg() to netlink. Since the kernel validates received messages before processing them, the code makes sure userspace can't modify the message contents after invoking sendmsg(). To do that only a single mapping of the TX ring is allowed to exist and the socket must not be shared. If either of these two conditions does not hold, it falls back to copying. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add mmap'ed netlink helper functionsPatrick McHardy
Add helper functions for looking up mmap'ed frame headers, reading and writing their status, allocating skbs with mmap'ed data areas and a poll function. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: mmaped netlink: ring setupPatrick McHardy
Add support for mmap'ed RX and TX ring setup and teardown based on the af_packet.c code. The following patches will use this to add the real mmap'ed receive and transmit functionality. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add netlink_skb_set_owner_r()Patrick McHardy
For mmap'ed I/O a netlink specific skb destructor needs to be invoked after the final kfree_skb() to clean up state. This doesn't work currently since the skb's ownership is transfered to the receiving socket using skb_set_owner_r(), which orphans the skb, thereby invoking the destructor prematurely. Since netlink doesn't account skbs to the originating socket, there's no need to orphan the skb. Add a netlink specific skb_set_owner_r() variant that does not orphan the skb and use a netlink specific destructor to call sock_rfree(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: don't orphan skb in netlink_trim()Patrick McHardy
Netlink doesn't account skbs to the sending socket, so the there's no need to orphan the skb before trimming it. Removing the skb_orphan() call is required for mmap'ed netlink, which uses a netlink specific skb destructor that must not be invoked before the final freeing of the skb. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: add function to allocate sk_buff head without data areaPatrick McHardy
Add a function to allocate a sk_buff head without any data. This will be used by memory mapped netlink to attach data from the mmaped area to the skb. Additionally change skb_release_all() to check whether the skb has a data area to allow the skb destructor to clear the data pointer in case only a head has been allocated. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: rename ssk to sk in struct netlink_skb_paramsPatrick McHardy
Memory mapped netlink needs to store the receiving userspace socket when sending from the kernel to userspace. Rename 'ssk' to 'sk' to avoid confusion. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19netlink: add symbolic value for congested statePatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19Merge branch '8021ad'David S. Miller
Patrick McHardy says: ==================== The following patches add support for 802.1ad (provider tagging) to the VLAN driver. The patchset consists of the following parts: - renaming of the NET_F_HW_VLAN feature flags to indicate that they only operate on CTAGs - preparation for 802.1ad VLAN filtering offload by adding a proto argument to the rx_{add,kill}_vid net_device_ops callbacks - preparation of the VLAN code to support multiple protocols by making the protocol used for tagging a property of the VLAN device and converting the device lookup functions accordingly - second step of preparation of the VLAN code by making the packet tagging functions take a protocol argument - introducation of 802.1ad support in the VLAN code, consisting mainly of checking for ETH_P_8021AD in a couple of places and testing the netdevice offload feature checks to take the protocol into account - announcement of STAG offloading capabilities in a couple of drivers for virtual network devices The patchset is based on net-next.git and has been tested with single and double tagging with and without HW acceleration (for CTAGs). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: announce STAG offload capability in some driversPatrick McHardy
- macvlan: propagate STAG filtering capabilities from underlying device - ifb: announce STAG tagging support in addition to CTAG tagging support - veth: announce STAG tagging/stripping support in addition to CTAG support Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: add 802.1ad supportPatrick McHardy
Add support for 802.1ad VLAN devices. This mainly consists of checking for ETH_P_8021AD in addition to ETH_P_8021Q in a couple of places and check offloading capabilities based on the used protocol. Configuration is done using "ip link": # ip link add link eth0 eth0.1000 \ type vlan proto 802.1ad id 1000 # ip link add link eth0.1000 eth0.1000.1000 \ type vlan proto 802.1q id 1000 52:54:00:12:34:56 > 92:b1:54:28:e4:8c, ethertype 802.1Q (0x8100), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84) 20.1.0.2 > 20.1.0.1: ICMP echo request, id 3003, seq 8, length 64 92:b1:54:28:e4:8c > 52:54:00:12:34:56, ethertype 802.1Q-QinQ (0x88a8), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 47944, offset 0, flags [none], proto ICMP (1), length 84) 20.1.0.1 > 20.1.0.2: ICMP echo reply, id 3003, seq 8, length 64 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: add protocol argument to packet tagging functionsPatrick McHardy
Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the sks's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: prepare for 802.1ad supportPatrick McHardy
Make the encapsulation protocol value a property of VLAN devices and change the device lookup functions to take the protocol value into account. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: prepare for 802.1ad VLAN filtering offloadPatrick McHardy
Change the rx_{add,kill}_vid callbacks to take a protocol argument in preparation of 802.1ad support. The protocol argument used so far is always htons(ETH_P_8021Q). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*Patrick McHardy
Rename the hardware VLAN acceleration features to include "CTAG" to indicate that they only support CTAGs. Follow up patches will introduce 802.1ad server provider tagging (STAGs) and require the distinction for hardware not supporting acclerating both. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19Merge branch 'intel'David S. Miller
Jeff Kirsher says: ==================== This series contains updates to ixgbe and igb. The ixgbe changes contains 2 patches from the community, one which is a fix from akepner to fix a issue where netif_running() in shutdown was not done under rtnl_lock. The other community fix from Joe Perches cleans up #ifdef CONFIG_DEBUG_FS which is no longer necessary. The last ixgbe patch, from Jacob Keller, adds support for WoL on 82559 SFP+ LOM. The remaining patches are against igb, 10 of which were previously submitted in a pull request where changes were requested. The following igb patches: igb: Support for 100base-fx SFP igb: Support to read and export SFF-8472/8079 data are v2 based on feedback from Dan Carpenter and Ben Hutchings in the previous pull request. The largest set of changes are in my patch to cleanup code comments and whitespace to align the igb driver with the networking style of code comments. While cleaning up the code comments, fixed several other whitespace/checkpatch.pl code formatting issues. Other notable igb patches are EEE capable devices query the PHY to determine what the link partner is advertising, added support for i354 devices and added support for spoofchk config. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18igb: Add support for i354 devicesCarolyn Wyborny
This patch adds base support for new i354 devices. Loopback test is unsupported for this release. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: add support for spoofchk configLior Levy
Add support for spoofchk configuration per VF via iproute2 tool. Signed-off-by: Lior Levy <lior.levy@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Enable EEE LP advertisementMatthew Vick
On EEE-capable devices, query the PHY to determine what the link partner is advertising. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Fix code comments and whitespaceJeff Kirsher
Aligns the multi-line code comments with the desired style for the networking tree. Also cleaned up whitespace issues found during the cleanup of code comments (i.e. remove unnecessary blank lines, use tabs where possible, properly wrap lines and keep strings on a single line) Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2013-04-18igb: Fix sparse warnings on function pointersAkeem G. Abodunrin
This patch fixes sparse warnings on function pointers that are not defined as static. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Use rx/tx_itr_setting when setting up initial value of itrAlexander Duyck
It turns out that the InterruptThrottleRate module parameter was only having the effect of locking the ITR at the starting ITR value. This was because the values stored in rx_itr_setting and tx_itr_setting were being ignored when configuring the initial itr_val of the q_vector. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Pull adapter out of main path in igb_xmit_frame_ringAlexander Duyck
We only need the adapter pointer in the case of ptp. As such we can pull the adapter out of the main path and place it inside the if statement to avoid the temptation of accessing the adapter pointer in the fast path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Mask off check of frag_off as we only want fragment offsetAlexander Duyck
We were incorrectly checking the entire frag_off field when we only wanted the fragment offset. As a result we were not pulling in TCP headers when the DNF flag was set. To correct that we will now check for frag off using the IP_OFFSET mask. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: random code and comments fixAkeem G. Abodunrin
This patch fixes code and comments as identified in the driver. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Implement support to power sfp cage and turn on I2CAkeem G. Abodunrin
Based on original patch from Aurélien Guillaume <footplus@gmail.com> This patch adds support to turn on I2C, with sfp cage powered. CC: Aurélien Guillaume <footplus@gmail.com> Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Support to read and export SFF-8472/8079 dataAkeem G. Abodunrin
This patch adds support to read and export SFF-8472/8079 (SFP data) over i2c, through Ethtool. v2: Changed implementation to accommodate any offset within SFF module length boundary. Reported-by: Aurélien Guillaume <footplus@gmail.com> CC: Aurélien Guillaume <footplus@gmail.com> CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18igb: Support for 100base-fx SFPAkeem G. Abodunrin
This patch adds support for 100base-fx SFP and report proper link speed/duplex via Ethtool. v2: fix smatch warnings CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18ixgbe: Remove unnecessary #ifdef CONFIG_DEBUG_FS testsJoe Perches
Add some empty static inlines instead to make the code more readable. Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18ixgbe: Add support for WoL on 82599 SFP+ LOMJacob Keller
This patch adds software support for WoL for the 82599 SFP+ LOM device, (ID 0x8976) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18ixgbe: in shutdown, do netif_running() under rtnl_lockakepner
During shutdown it's possible for __dev_close() (which holds rtnl_lock) to clear the __LINK_STATE_START bit, and for ixgbe to then read that bit (without holding rtnl_lock), and then not fail to free irqs, etc. The result is a crash like this: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:313! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map CPU 1 Pid: 5910, comm: reboot Tainted: P ---------------- 2.6.32 #1 empty RIP: 0010:[<ffffffff81305c2b>] [<ffffffff81305c2b>] free_msi_irqs+0x11b/0x130 RSP: 0018:ffff880185c9bc88 EFLAGS: 00010282 RAX: ffff880219f58bc0 RBX: ffff88021ac53b00 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000246 RDI: 000000000000004a RBP: ffff880185c9bcc8 R08: 0000000000000002 R09: 0000000000000106 R10: 0000000000000000 R11: 0000000000000006 R12: ffff88021e524778 R13: 0000000000000001 R14: ffff88021e524000 R15: 0000000000000000 FS: 00007f90821b7700(0000) GS:ffff880028220000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f90818bd010 CR3: 0000000132c64000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process reboot (pid: 5910, threadinfo ffff880185c9a000, task ffff88021bf04a80) Stack: ffff880185c9bc98 000000018130529d ffff880185c9bcc8 ffff88021e524000 <0> 0000000000000004 ffff88021948c700 0000000000000000 ffff880185c9bda7 <0> ffff880185c9bce8 ffffffff81305cbd ffff880185c9bce8 ffff88021948c700 Call Trace: [<ffffffff81305cbd>] pci_disable_msix+0x3d/0x50 [<ffffffffa00501d5>] ixgbe_reset_interrupt_capability+0x65/0x90 [ixgbe] [<ffffffffa00512f6>] ixgbe_clear_interrupt_scheme+0xb6/0xd0 [ixgbe] [<ffffffffa005330b>] __ixgbe_shutdown+0x5b/0x200 [ixgbe] [<ffffffffa00534ca>] ixgbe_shutdown+0x1a/0x60 [ixgbe] [<ffffffff812f6c7c>] pci_device_shutdown+0x2c/0x50 [<ffffffff813727fb>] device_shutdown+0x4b/0x160 [<ffffffff8107d98c>] kernel_restart_prepare+0x2c/0x40 ehci timer_action, mod_timer io_watchdog [<ffffffff8107d9e6>] kernel_restart+0x16/0x60 [<ffffffff8107dbfd>] sys_reboot+0x1ad/0x200 [<ffffffff811676cf>] ? __d_free+0x3f/0x60 [<ffffffff81167748>] ? d_free+0x58/0x60 [<ffffffff8116f7c0>] ? mntput_no_expire+0x30/0x100 [<ffffffff81152b11>] ? __fput+0x191/0x200 [<ffffffff816565fe>] ? do_page_fault+0x3e/0xa0 [<ffffffff8100b132>] system_call_fastpath+0x16/0x1b Code: 4c 89 ef e8 98 8c e3 ff 4d 39 f4 48 8b 43 10 75 cf 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 49 8b 7d 20 e8 07 5a d3 ff eb c9 <0f> 0b 0f 1f 00 eb fb 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 ehci timer_action, mod_timer io_watchdog RIP [<ffffffff81305c2b>] free_msi_irqs+0x11b/0x130 RSP <ffff880185c9bc88> ---[ end trace 27de882a0fe75593 ]--- (This was seen on a pretty old kernel/driver, but looks like the same bug is still possible.) Signed-off-by: <akepner@riverbed.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-18Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to ixgbe only. v2- Dropped the following 2 patches from the series: ixgbe: Support using build_skb in the case that jumbo frames are disabled ixgbe: walk pci-e bus to find minimum width Ben Hutchings found a bug with Alex's patch, so that patch was dropped permanently. Jacob's "walk PCIe bus" patch is being re-worked for a more generic solution so that other drivers can benefit. In the remaining patches... Alex provides a fix where we were incorrectly checking the entire frag_off field when we only wanted the fragment offset. Alex also cleans up the check for PAGE_SIZE, since the default configuration allocates 32K for all buffers. Emil provides a change to the calculation of eerd so that it is consistent between the read and write functions by using | instead of +. Jacob adds support for displaying PCIe Gen3 link speed, which was previously missing from the ixgbe driver. He also provides a patch to clean up ixgbe_get_bus_info_generic to call some conversion functions, which are used also in another patch provided by Jacob. Jacob modifies the driver to enable certain devices (which have an internal switch) to read from the physical slot rather than reading data from the internal switch. Don provides a couple of fixes (which are more appropriate for net-next), one of which resolves an issue where ixgbe was only turning on the laser when the adapter was up which caused issues for those who wanted to access the MNG firmware while the port was in a down state. The other fix is for WoL when currently linked at 1G. Lastly Don bumps the driver version keep the in-kernel driver up to date with the current functionality. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18tcp: introduce TCPSpuriousRtxHostQueues SNMP counterEric Dumazet
Host queues (Qdisc + NIC) can hold packets so long that TCP can eventually retransmit a packet before the first transmit even left the host. Its not clear right now if we could avoid this in the first place : - We could arm RTO timer not at the time we enqueue packets, but at the time we TX complete them (tcp_wfree()) - Cancel the sending of the new copy of the packet if prior one is still in queue. This patch adds instrumentation so that we can at least see how often this problem happens. TCPSpuriousRtxHostQueues SNMP counter is incremented every time we detect the fast clone is not yet freed in tcp_transmit_skb() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18fec: Remove unneeded asm header filesFabio Estevam
There is nothing in the driver that requires <asm/coldfire.h> and <asm/mcfsim.h>. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-17ixgbe: bump version numberDon Skidmore
Bump the version number reflect the corresponding functionality in the out of tree driver. Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-17ixgbe: Fix 1G link WoLDon Skidmore
We reset during the shutdown path which will reset AUTOC register. This would change LMS to 10G. If we were currently linked at 1G we will lose link, which is a bad thing if we wanted WoL to work. For the fix I needed to know if WoL is supported so I created a new bool in the ixgbe_hw struct. If this is set we will not allow the reset to change the current LMS value in AUTOC. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-17ixgbe: fix MNG FW support when adapter not upDon Skidmore
We were only turning the laser on when the adapter was up. This causes issues for those who wanted to access the MNG FW while the port was in a down state. This patch makes sure the laser is turned on in probe and remain up even after the port is brought down. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>