Age | Commit message (Collapse) | Author |
|
commit a4f2dacbf2a5045e34b98a35d9a3857800f25a7b upstream.
For VXLAN/NVGRE encapsulation, the current HW doesn't support offloading
both the outer UDP TX checksum and the inner TCP/UDP TX checksum.
The driver doesn't advertize SKB_GSO_UDP_TUNNEL_CSUM, however we are wrongly
telling the HW to offload the outer UDP checksum for encapsulated packets,
fix that.
Fixes: 837052d0ccc5 ('net/mlx4_en: Add netdev support for TCP/IP
offloads of vxlan tunneling')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 08e8331654d1d7b2c58045e549005bc356aa7810 upstream.
There is a race condition between e1000_change_mtu's cleanups and
netpoll, when we change the MTU across jumbo size:
Changing MTU frees all the rx buffers:
e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
e1000_clean_rx_ring
Then, close to the end of e1000_change_mtu:
pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag
And when we come back to do the rest of the MTU change:
e1000_up -> e1000_configure -> e1000_configure_rx ->
e1000_alloc_jumbo_rx_buffers
alloc_jumbo finds the buffers already != NULL, since data (shared with
page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
or at least not what is expected when in jumbo state.
This results in an unusable adapter (packets don't get through), and a
NULL pointer dereference on the next call to e1000_clean_rx_ring
(other mtu change, link down, shutdown):
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
[...]
Call Trace:
[<ffffffff81195445>] put_page+0x55/0x60
[<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
[<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
[<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
[<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
[<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
[<ffffffff81647050>] dev_set_mtu+0xa0/0x140
[<ffffffff81664218>] do_setlink+0x218/0xac0
[<ffffffff814459e9>] ? nla_parse+0xb9/0x120
[<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
[<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
[<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
[<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260
By setting the allocator to a dummy version, netpoll can't mess up our
rx buffers. The allocator is set back to a sane value in
e1000_configure_rx.
Fixes: edbbb3ca1077 ("e1000: implement jumbo receive with partial descriptors")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
netpoll can call functions in hard irq context that are ordinarily
called in lesser contexts. For those functions use dev_kfree_skb_any
and dev_consume_skb_any so skbs are freed safely from hard irq
context.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace free_skb with dev_kfree_skb_any in be_tx_compl_process as
which can be called in hard irq by netpoll, softirq context
by normal napi polling, and in normal sleepable context
by the network device close method.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace kfree_skb with dev_kfree_skb_any in cp_start_xmit
as it can be called in both hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e5eda89d97ec256ba14e7e861387cc0468259c18 ]
Netdevice registration should be performed a the end of the driver
initialization flow. If we don't do that, after calling register_netdevice,
device callbacks may be issued by higher layers of the stack before
final configuration of the device is done.
For example (VXLAN configuration race), mlx4_SET_PORT_VXLAN was issued
after the register_netdev command. System network scripts may configure
the interface (UP) right after the registration, which also attach
unicast VXLAN steering rule, before mlx4_SET_PORT_VXLAN was called,
causing the firmware to fail the rule attachment.
Fixes: 837052d0ccc5 ("net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling")
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 87f966d97b89774162df04d2106c6350c8fe4cb3 upstream.
On a MIPS Malta board, tons of fifo underflow errors have been observed
when using u-boot as bootloader instead of YAMON. The reason for that
is that YAMON used to set the pcnet device to SRAM mode but u-boot does
not. As a result, the default Tx threshold (64 bytes) is now too small to
keep the fifo relatively used and it can result to Tx fifo underflow errors.
As a result of which, it's best to setup the SRAM on supported controllers
so we can always use the NOUFLO bit.
Cc: <netdev@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: Don Fry <pcnet32@frontier.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit da293700568ed3d96fcf062ac15d7d7c41377f11 upstream.
EEH recovery for bnx2x based adapters is not reliable on all Power
systems using the default hot reset, which can result in an
unrecoverable EEH error. Forcing the use of fundamental reset
during EEH recovery fixes this.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 24e579c8898aa641ede3149234906982290934e5 ]
With the commit d75b1ade567ffab ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. When in busy_poll is we return 0
in napi_poll. We should return budget.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6088beef3f7517717bd21d90b379714dd0837079 ]
NAPI poll logic now enforces that a poller returns exactly the budget
when it wants to be called again.
If a driver limits TX completion, it has to return budget as well when
the limit is hit, not the number of received packets.
Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: d75b1ade567f ("net: less interrupt masking in NAPI")
Cc: Manish Chopra <manish.chopra@qlogic.com>
Acked-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 02a54164c52ed6eca3089a0d402170fbf34d6cf5 upstream.
In Dual EMAC, the default VLANs are used to segregate Rx packets between
the ports, so adding the same default VLAN to the switch will affect the
normal packet transfers. So returning error on addition of dual EMAC
default VLANs.
Even if EMAC 0 default port VLAN is added to EMAC 1, it will lead to
break dual EMAC port separations.
Fixes: d9ba8f9e6298 (driver: net: ethernet: cpsw: dual emac interface implementation)
Reported-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 25906052d953d3fbdb7e19480b9de5e6bb949f3f upstream.
Since ALE table is a common resource for both the interfaces in Dual EMAC
mode and while bringing up the second interface in cpsw_ndo_set_rx_mode()
all the multicast entries added by the first interface is flushed out and
only second interface multicast addresses are added. Fixing this by
flushing multicast addresses based on dual EMAC port vlans which will not
affect the other emac port multicast addresses.
Fixes: d9ba8f9 (driver: net: ethernet: cpsw: dual emac interface implementation)
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7ce67a38f799d1fb332f672b117efbadedaa5352 upstream.
The CPSW IP implements pulse-signaled interrupts. Due to
that we must write a correct, pre-defined value to the
CPDMA_MACEOIVECTOR register so the controller generates
a pulse on the correct IRQ line to signal the End Of
Interrupt.
The way the driver is written today, all four IRQ lines
are requested using the same IRQ handler and, because of
that, we could fall into situations where a TX IRQ fires
but we tell the controller that we ended an RX IRQ (or
vice-versa). This situation triggers an IRQ storm on the
reserved IRQ 127 of INTC which will in turn call ack_bad_irq()
which will, then, print a ton of:
unexpected IRQ trap at vector 00
In order to fix the problem, we are moving all calls to
cpdma_ctlr_eoi() inside the IRQ handler and making sure
we *always* write the correct value to the CPDMA_MACEOIVECTOR
register. Note that the algorithm assumes that IRQ numbers and
value-to-be-written-to-EOI are proportional, meaning that a
write of value 0 would trigger an EOI pulse for the RX_THRESHOLD
Interrupt and that's the IRQ number sitting in the 0-th index
of our irqs_table array.
This, however, is safe at least for current implementations of
CPSW so we will refrain from making the check smarter (and, as
a side-effect, slower) until we actually have a platform where
IRQ lines are swapped.
This patch has been tested for several days with AM335x- and
AM437x-based platforms. AM57x was left out because there are
still pending patches to enable ethernet in mainline for that
platform. A read of the TRM confirms the statement on previous
paragraph.
Reported-by: Yegor Yefremov <yegorslists@googlemail.com>
Fixes: 510a1e7 (drivers: net: davinci_cpdma: acknowledge interrupt properly)
Signed-off-by: Felipe Balbi <balbi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 17e96834fd35997ca7cdfbf15413bcd5a36ad448 ]
Hardware always provides compliment of IP pseudo checksum. Stack expects
whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.
This causes checksum error in nf & ovs.
kernel: qg-19546f09-f2: hw csum failure
kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF O-------------- 3.10.0-123.8.1.el7.x86_64 #1
kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
kernel: Call Trace:
kernel: <IRQ> [<ffffffff815e237b>] dump_stack+0x19/0x1b
kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
Hardware verifies IP & tcp/udp header checksum but does not provide payload
checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Stefan Assmann <sassmann@redhat.com>
Reported-by: Sunil Choudhary <schoudha@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7a05dc64e2e4c611d89007b125b20c0d2a4d31a5 ]
Commit d75b1ade567f ("net: less interrupt masking in NAPI") uncovered
wrong alx_poll() behavior.
A NAPI poll() handler is supposed to return exactly the budget when/if
napi_complete() has not been called.
It is also supposed to return number of frames that were received, so
that netdev_budget can have a meaning.
Also, in case of TX pressure, we still have to dequeue received
packets : alx_clean_rx_irq() has to be called even if
alx_clean_tx_irq(alx) returns false, otherwise device is half duplex.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: d75b1ade567f ("net: less interrupt masking in NAPI")
Reported-by: Oded Gabbay <oded.gabbay@amd.com>
Bisected-by: Oded Gabbay <oded.gabbay@amd.com>
Tested-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 05b0aa579397b734f127af58e401a30784a1e315 ]
During driver load in tg3_init_one, if the driver detects DMA activity before
intializing the chip tg3_halt is called. As part of tg3_halt interrupts are
disabled using routine tg3_disable_ints. This routine was using mailbox value
which was not initialized (default value is 0). As a result driver was writing
0x00000001 to pci config space register 0, which is the vendor id / device id.
This driver bug was exposed because of the commit a7877b17a667 (PCI: Check only
the Vendor ID to identify Configuration Request Retry). Also this issue is only
seen in older generation chipsets like 5722 because config space write to offset
0 from driver is possible. The newer generation chips ignore writes to offset 0.
Also without commit a7877b17a667, for these older chips when a GRC reset is
issued the Bootcode would reprogram the vendor id/device id, which is the reason
this bug was masked earlier.
Fixed by initializing the interrupt mailbox registers before calling tg3_halt.
Please queue for -stable.
Reported-by: Nils Holland <nholland@tisys.org>
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit aec653c43b0c55667355e26d7de1236bda9fb4e3 upstream.
Call igb_setup_link() when the PHY is powered up.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Jeff Westfahl <jeff.westfahl@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Vincent Donnefort <vdonnefort@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5f478b41033606d325e420df693162e2524c2b94 ]
mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.
Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit aebea2ba0f7495e1a1c9ea5e753d146cb2f6b845 ]
The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.
The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.
In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.
The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.
Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.
No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.
This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2d5c57d7fbfaa642fb7f0673df24f32b83d9066c ]
Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.
Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.
As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.
Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests"
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a620a6bc1c94c22d6c312892be1e0ae171523125 ]
If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.
This fix only initializes the RX ring if it has been allocated.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d0aedcd4f14a22e23b313f42b7e6e6ebfc0fbc31 ]
vio_dring_avail() will allow use of every dring entry, but when the last
entry is allocated then dr->prod == dr->cons which is indistinguishable from
the ring empty condition. This causes the next allocation to reuse an entry.
When this happens in sunvdc, the server side vds driver begins nack'ing the
messages and ends up resetting the ldc channel. This problem does not effect
sunvnet since it checks for < 2.
The fix here is to just never allocate the very last dring slot so that full
and empty are not the same condition. The request start path was changed to
check for the ring being full a bit earlier, and to stop the blk_queue if
there is no space left. The blk_queue will be restarted once the ring is
only half full again. The number of ring entries was increased to 512 which
matches the sunvnet and Solaris vdc drivers, and greatly reduces the
frequency of hitting the ring full condition and the associated blk_queue
stop/starting. The checks in sunvent were adjusted to account for
vio_dring_avail() returning 1 less.
Orabug: 19441666
OraBZ: 14983
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ccf899a27c08038db91765ff12bb0380dcd85887 ]
With commit be9dad1f9f26604fb ("net: phy: suspend phydev when going
to HALTED"), the PHY device will be put in a low-power mode using
BMCR_PDOWN if the the interface is set down. The smsc911x driver does
a software_reset opening the device driver (ndo_open). In such case,
the PHY must be powered-up before access to any register and before
calling the software_reset function. Otherwise, as the PHY is powered
down the software reset fails and the interface can not be enabled
again.
This patch fixes this scenario that is easy to reproduce setting down
the network interface and setting up again.
$ ifconfig eth0 down
$ ifconfig eth0 up
ifconfig: SIOCSIFFLAGS: Input/output error
Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9026968abe7ad102f4ac5c6d96d733643f75399c ]
This reverts commit 8ef29f8aae524bd51298fb10ac6a5ce6c4c5a3d8.
The driver core already calls pinctrl_get() and claims the default
state. There is no need to replicate this in the driver.
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7d3083ee36b51e425b6abd76778a2046906b0fd3 ]
When receiving a vlan-tagged frame that still contains
a vlan header, the length of the packet will be greater
then MTU+ETH_HLEN since it will account of the extra
vlan header. TG3 checks this for the case for 802.1Q,
but not for 802.1ad. As a result, full sized 802.1ad
frames get dropped by the card.
Add a check for 802.1ad protocol when receving full
sized frames.
Suggested-by: Prashant Sreedharan <prashant@broadcom.com>
CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 476c18850c6cbaa3f2bb661ae9710645081563b9 ]
TG3 appears to have an issue performing TSO and checksum offloading
correclty when the frame has been vlan encapsulated (non-accelrated).
In these cases, tcp checksum is not correctly updated.
This patch attempts to work around this issue. After the patch,
802.1ad vlans start working correctly over tg3 devices.
CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit id: 014269ff376f552363ecdab78d3d947fbe2237d9 in Linus's tree
should be queued up for stable 3.14 & 3.15 since the i40e driver will
not load when DCB is enabled, unless this patch is applied.
In case of any AQ command to query port's DCB configuration fails
during driver's probe time; the probe fails and returns an error.
This patch prevents this issue by continuing the driver probe even
when an error is returned.
Also, added an error message to dump the AQ error status to show what
error caused the failure to get the DCB configuration from firmware.
Change-ID: Ifd5663512588bca684069bb7d4fb586dd72221af
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 10545937e866ccdbb7ab583031dbdcc6b14e4eb4 ]
On IOMMU systems DMA mapping can fail, we need to check for
that possibility.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cbd5228199d8be45d895d9d0cc2b8ce53835fc21 upstream.
Hidden away in the last 8 bytes of the buffer_list page is a solitary
statistic. It needs to be byte swapped or else ethtool -S will
produce numbers that terrify the user.
Since we do this in multiple places, create a helper function with a
comment explaining what is going on.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c36c9d50cc6af5c5bfcc195f21b73f55520c15f9 ]
The recent commit "e29aa33 bna: Enable Multi Buffer RX" is causing
a performance regression. It does not properly update 'cmpl' pointer
at the end of the loop in NAPI handler bnad_cq_process(). The result is
only one packet / per NAPI-schedule is processed.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit fe26566d8a05151ba1dce75081f6270f73ec4ae1 ]
When TSO packet is transmitted additional BD w/o mapping is used
to describe the packed. The BD needs special handling in tx
completion.
kernel: Call Trace:
kernel: <IRQ> [<ffffffff815e19ba>] dump_stack+0x19/0x1b
kernel: [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80
kernel: [<ffffffff8105df5c>] warn_slowpath_fmt+0x5c/0x80
kernel: [<ffffffff814a8c0d>] ? find_iova+0x4d/0x90
kernel: [<ffffffff814ab0e2>] intel_unmap_page.part.36+0x142/0x160
kernel: [<ffffffff814ad0e6>] intel_unmap_page+0x26/0x30
kernel: [<ffffffffa01f55d7>] bnx2x_free_tx_pkt+0x157/0x2b0 [bnx2x]
kernel: [<ffffffffa01f8dac>] bnx2x_tx_int+0xac/0x220 [bnx2x]
kernel: [<ffffffff8101a0d9>] ? read_tsc+0x9/0x20
kernel: [<ffffffffa01f8fdb>] bnx2x_poll+0xbb/0x3c0 [bnx2x]
kernel: [<ffffffff814d041a>] net_rx_action+0x15a/0x250
kernel: [<ffffffff81067047>] __do_softirq+0xf7/0x290
kernel: [<ffffffff815f3a5c>] call_softirq+0x1c/0x30
kernel: [<ffffffff81014d25>] do_softirq+0x55/0x90
kernel: [<ffffffff810673e5>] irq_exit+0x115/0x120
kernel: [<ffffffff815f4358>] do_IRQ+0x58/0xf0
kernel: [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
kernel: <EOI> [<ffffffff810bbff7>] ? clockevents_notify+0x127/0x140
kernel: [<ffffffff814834df>] ? cpuidle_enter_state+0x4f/0xc0
kernel: [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
kernel: [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
kernel: [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
kernel: [<ffffffff815cfee1>] start_secondary+0x265/0x27b
kernel: ---[ end trace 11aa7726f18d7e80 ]---
Fixes: a848ade408b ("bnx2x: add CSUM and TSO support for encapsulation protocols")
Reported-by: Yulong Pei <ypei@redhat.com>
Cc: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a4b70a07ed12a71131cab7adce2ce91c71b37060 ]
Nothing cleans up the objects created by
vnet_new(), they are completely leaked.
vnet_exit(), after doing the vio_unregister_driver() to clean
up ports, should call a helper function that iterates over vnet_list
and cleans up those objects. This includes unregister_netdevice()
as well as free_netdev().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4cad9f3b61c7268fa89ab8096e23202300399b5d ]
On BE3, if the clear-interrupt bit of the EQ doorbell is not set the first
time it is armed, ocassionally we have observed that the EQ doesn't raise
anymore interrupts even if it is in armed state.
This patch fixes this by setting the clear-interrupt bit when EQs are
armed for the first time in be_open().
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0a1985879437d14bda8c90d0dae3455c467d7642 ]
This commit fixes the command value generated for CSUM calculation
when running in big endian mode. The Ethernet protocol ID for IP was
being unconditionally byte-swapped in the layer 3 protocol check (with
swab16), which caused the mvneta driver to not function correctly in
big endian mode. This patch byte-swaps the ID conditionally with
htons.
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Thomas Fitzsimmons <fitzsim@fitzsim.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4d12bc63ab5e48c1d78fa13883cf6fefcea3afb1 ]
As reported by Maggie Mae Roxas, the mvneta driver doesn't behave
properly in 10 Mbit/s mode. This is due to a misconfiguration of the
MVNETA_GMAC_AUTONEG_CONFIG register: bit MVNETA_GMAC_CONFIG_MII_SPEED
must be set for a 100 Mbit/s speed, but cleared for a 10 Mbit/s speed,
which the driver was not properly doing. This commit adjusts that by
setting the MVNETA_GMAC_CONFIG_MII_SPEED bit only in 100 Mbit/s mode,
and relying on the fact that all the speed related bits of this
register are cleared at the beginning of the mvneta_adjust_link()
function.
This problem exists since c5aff18204da0 ("net: mvneta: driver for
Marvell Armada 370/XP network unit") which is the commit that
introduced the mvneta driver in the kernel.
Cc: <stable@vger.kernel.org> # v3.8+
Fixes: c5aff18204da0 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Reported-by: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Cc: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 07b0f00964def8af9321cfd6c4a7e84f6362f728 ]
While it is legal to kfree(NULL), it is not wise to use :
put_page(virt_to_head_page(NULL))
BUG: unable to handle kernel paging request at ffffeba400000000
IP: [<ffffffffc01f5928>] virt_to_head_page+0x36/0x44 [bnx2x]
Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Fixes: d46d132cc021 ("bnx2x: use netdev_alloc_frag()")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 96dee024ca4799d6d21588951240035c21ba1c67 upstream.
Previous commit c3a0dce35af0 fixed an overrun for the RAR on i218 devices.
This commit also attempted to homogenize the RAR/SHRA access for all parts
accessed by the e1000e driver. This change introduced an error for
assigning MAC addresses to guest OS's for 82579 devices.
Only RAR[0] is accessible to the driver for 82579 parts, and additional
addresses must be placed into the SHRA[L|H] registers. The rar_entry_count
was changed in the previous commit to an inaccurate value that accounted
for all RAR and SHRA registers, not just the ones usable by the driver.
This patch fixes the count to the correct value and adjusts the
e1000_rar_set_pch2lan() function to user the correct index.
Cc: John Greene <jogreene@redhat.com>
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "Alexander Y. Fomichev" <aleksandr.fomichev@x5.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 76252723e88681628a3dbb9c09c963e095476f73 upstream.
To properly re-initialize SR-IOV it is necessary to reset the device
even if it is already down. Not doing this may result in Tx unit hangs.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 948264879b6894dc389a44b99fae4f0b72932619 upstream.
On some devices, the internal PLL circuit occasionally provides the
wrong clock frequency after power up. The probability of failure is less
than one failure per 1000 power cycles. When the failure occurs, the
internal clock frequency is around 1/20 of the correct frequency.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bc82878baa10c2a6a4a6affaf52c152935112142 upstream.
Commit eb17711bc1d6 ("net/mlx4_core: Introduce nic_info new flag in
QUERY_FUNC_CAP") did:
if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_OFFSET) {
which should be:
if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_FORCE_VLAN) {
Fix that.
Fixes: eb17711bc1d6 ("net/mlx4_core: Introduce nic_info new flag in QUERY_FUNC_CAP")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b91113282bf44df46aba374a0b8f88a75bfd4b3f upstream.
If the mdio probe function fails in emac_open, the interrupt we just requested
isn't freed. If emac_open is called again, for example because we try to set up
the interface again, the kernel will oops because the interrupt wasn't properly
released.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6954cc1f238199e971ec905c5cc87120806ac981 upstream.
Fix null-pointer dereference at probe when the mdio platform device is
missing (e.g. when it has been disabled in DT).
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 59993f48b38fd46863b23bb1bb1dc3291e7278fb upstream.
This reverts commit f8d56d8f892be43a2404356073e16401eb5a42e6 ("net:
eth: cpsw: Correctly attach to GPIO bitbang MDIO driver").
Fix potential null-pointer dereference at probe if the mdio-gpio device
has not been successfully probed yet.
The offending commit is plain wrong for a number of reasons. First of
all it accesses internal driver data of an unrelated device. Neither
does it check that the data is non-null (which it is in case the device
has not been probed yet).
Furthermore, the decision on whether to treat any driver data according
to the mdio-gpio driver's internals is made based on the node name. But
the name is not compared against "mdio" which is the normal name for the
node, but rather against "gpio" which the node does not have to be named
(and shouldn't be according to the binding documentation). [ If this
hack is to be kept out-of-tree it should at least be matching against
the compatible property. ]
Cc: Stefan Roese <sr@denx.de>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit da1de8dfff09d33d4a5345762c21b487028e25f5 ]
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.
This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.
During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.
Fix that by keeping only one driver entry which releases mlx4_priv.
Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|