Age | Commit message (Collapse) | Author |
|
|
|
commit 1de63d60cd5b0d33a812efa455d5933bf1564a51 upstream.
There was a serious problem in samsung-laptop that its platform driver is
designed to run under BIOS and running under EFI can cause the machine to
become bricked or can cause Machine Check Exceptions.
Discussion about this problem:
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
https://bugzilla.kernel.org/show_bug.cgi?id=47121
The patches to fix this problem:
efi: Make 'efi_enabled' a function to query EFI facilities
83e68189745ad931c2afd45d8ee3303929233e7f
samsung-laptop: Disable on EFI hardware
e0094244e41c4d0c7ad69920681972fc45d8ce34
Unfortunately this problem comes back again if users specify "noefi" option.
This parameter clears EFI_BOOT and that driver continues to run even if running
under EFI. Refer to the document, this parameter should clear
EFI_RUNTIME_SERVICES instead.
Documentation/kernel-parameters.txt:
===============================================================================
...
noefi [X86] Disable EFI runtime services support.
...
===============================================================================
Documentation/x86/x86_64/uefi.txt:
===============================================================================
...
- If some or all EFI runtime services don't work, you can try following
kernel command line parameters to turn off some or all EFI runtime
services.
noefi turn off all EFI runtime services
...
===============================================================================
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 249bfb83cf8ba658955f0245ac3981d941f746ee upstream.
Devices are added to pci_pme_list when drivers use pci_enable_wake()
or pci_wake_from_d3(), but they aren't removed from the list unless
the driver explicitly disables wakeup. Many drivers never disable
wakeup, so their devices remain on the list even after they are
removed, e.g., via hotplug. A subsequent PME poll will oops when
it tries to touch the device.
This patch disables PME# on a device before removing it, which removes
the device from pci_pme_list. This is safe even if the device never
had PME# enabled.
This oops can be triggered by unplugging a Thunderbolt ethernet adapter
on a Macbook Pro, as reported by Daniel below.
[bhelgaas: changelog]
Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.com
Reported-and-tested-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream.
This fixes CVE-2013-0228 / XSA-42
Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the > guest with the panic like this:
-------------
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]
Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
EIP is at xen_iret+0x12/0x2b
EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
Stack:
00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
Call Trace:
Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
general protection fault: 0000 [#2]
---[ end trace ab0d29a492dcd330 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1250, comm: r Tainted: G D ---------------
2.6.32-356.el6.i686 #1
Call Trace:
[<c08476df>] ? panic+0x6e/0x122
[<c084b63c>] ? oops_end+0xbc/0xd0
[<c084b260>] ? do_general_protection+0x0/0x210
[<c084a9b7>] ? error_code+0x73/
-------------
Petr says: "
I've analysed the bug and I think that xen_iret() cannot cope with
mangled DS, in this case zeroed out (null selector/descriptor) by either
xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
entry was invalidated by the reproducer. "
Jan took a look at the preliminary patch and came up a fix that solves
this problem:
"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."
The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.
Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment. In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.
Reviewed-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream.
A user reported the following oops when a backup process reads
/proc/kcore:
BUG: unable to handle kernel paging request at ffffbb00ff33b000
IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
[...]
Call Trace:
[<ffffffff811b8aaa>] read_kcore+0x17a/0x370
[<ffffffff811ad847>] proc_reg_read+0x77/0xc0
[<ffffffff81151687>] vfs_read+0xc7/0x130
[<ffffffff811517f3>] sys_read+0x53/0xa0
[<ffffffff81449692>] system_call_fastpath+0x16/0x1b
Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt->phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.
The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD. If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.
This patch adds the necessary pud_large() check.
Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.coM>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cb214ede7657db458fd0b2a25ea0b28dbf900ebc upstream.
When a HP ProLiant DL980 G7 Server boots a regular kernel,
there will be intermittent lost interrupts which could
result in a hang or (in extreme cases) data loss.
The reason is that this system only supports x2apic physical
mode, while the kernel boots with a logical-cluster default
setting.
This bug can be worked around by specifying the "x2apic_phys" or
"nox2apic" boot option, but we want to handle this system
without requiring manual workarounds.
The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table.
As all apicids are smaller than 255, BIOS need to pass the
control to the OS with xapic mode, according to x2apic-spec,
chapter 2.9.
Current code handle x2apic when BIOS pass with xapic mode
enabled:
When user specifies x2apic_phys, or FADT indicates PHYSICAL:
1. During madt oem check, apic driver is set with xapic logical
or xapic phys driver at first.
2. enable_IR_x2apic() will enable x2apic_mode.
3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe()
will install the correct x2apic phys driver and use x2apic phys mode.
Otherwise it will skip the driver will let x2apic_cluster_probe to
take over to install x2apic cluster driver (wrong one) even though FADT
indicates PHYSICAL, because x2apic_phys_probe does not check
FADT PHYSICAL.
Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the
problem.
Signed-off-by: Stoney Wang <song-bo.wang@hp.com>
[ updated the changelog and simplified the code ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e575a86fdc50d013bf3ad3aa81d9100e8e6cc60d upstream.
Without this patch, it is trivial to determine kernel page
mappings by examining the error code reported to dmesg[1].
Instead, declare the entire kernel memory space as a violation
of a present page.
Additionally, since show_unhandled_signals is enabled by
default, switch branch hinting to the more realistic
expectation, and unobfuscate the setting of the PF_PROT bit to
improve readability.
[1] http://vulnfactory.org/blog/2013/02/06/a-linux-memory-trick/
Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130207174413.GA12485@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d911e03d097bdc01363df5d81c43f69432eb785c upstream.
Since ed4f209 "s390/time: fix sched_clock() overflow" a new helper function
is used to avoid overflows when converting TOD format values to nanosecond
values.
The kvm interrupt code formerly however only worked by accident because of
an overflow. It tried to program a timer that would expire in more than ~29
years. Because of the old TOD-to-nanoseconds overflow bug the real expiry
value however was much smaller, but now it isn't anymore.
This however triggers yet another bug in the function that programs the clock
comparator s390_next_ktime(): if the absolute "expires" value is after 2042
this will result in an overflow and the programmed value is lower than the
current TOD value which immediatly triggers a clock comparator (= timer)
interrupt.
Since the timer isn't expired it will be programmed immediately again and so
on... the result is a dead system.
To fix this simply program the maximum possible value if an overflow is
detected.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9977f0f164d46613288e0b5778eae500dfe06f31 upstream.
With commit 8e72033f2a48 ("thp: make MADV_HUGEPAGE check for
mm->def_flags") the VM_NOHUGEPAGE flag may be set on s390 in
mm->def_flags for certain processes, to prevent future thp mappings.
This would be overwritten by do_mlockall(), which sets it back to 0 with
an optional VM_LOCKED flag set.
To fix this, instead of overwriting mm->def_flags in do_mlockall(), only
the VM_LOCKED flag should be set or cleared.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3399cfb5df9594495b876d1843a7165f77366b2b upstream.
Commit e7e034e18a0a ("drivers/rtc/rtc-pl031.c: fix the missing operation
on enable") accidentally broke the ST variants of PL031.
The bit that is being poked as "clockwatch" enable bit for the ST
variants does the work of bit 0 on this variant. Bit 0 is used for a
clock divider on the ST variants, and setting it to 1 will affect
timekeeping in a very bad way.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Mian Yousaf KAUKAB <mian.yousaf.kaukab@stericsson.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit a56040731e5b00081c6d6c26b99e6e257a5d63d7 which was
commit eb178619f930fa2ba2348de332a1ff1c66a31424 upstream.
It has been reported to cause problems:
http://bugzilla.redhat.com/show_bug.cgi?id=909602
Acked-by: Ben Myers <bpm@sgi.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: CAI Qian <caiqian@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
commit 5f97ab913cf0fbc378ea8ffc3ee66f4890d11c55 upstream.
1) Lockdep thinks all nouveau subdevs belong to the same class and can be
locked in arbitrary order, which is not true (at least in general case).
Tell it to distinguish subdevs by (o)class type.
2) DRM client can be locked under user client lock - tell lockdep to put
DRM client lock in a separate class.
Reported-by: Arend van Spriel <arend@broadcom.com>
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bc9540c637c3d8712ccbf9dcf28621f380ed5e64 ]
commit 9ca1b22d6d2 (net: splice: avoid high order page splitting)
forgot that skb->head could need a copy into several page frags.
This could be the case for loopback traffic mostly.
Also remove now useless skb argument from linear_to_page()
and __splice_segment() prototypes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 82bda6195615891181115f579a480aa5001ce7e9 ]
splice() can handle pages of any order, but network code tries hard to
split them in PAGE_SIZE units. Not quite successfully anyway, as
__splice_segment() assumed poff < PAGE_SIZE. This is true for
the skb->data part, not necessarily for the fragments.
This patch removes this logic to give the pages as they are in the skb.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f26845b43c75d3f32f98d194c1327b5b1e6b3fb0 ]
Under unusual circumstances, TCP collapse can split a big GRO TCP packet
while its being used in a splice(socket->pipe) operation.
skb_splice_bits() releases the socket lock before calling
splice_to_pipe().
[ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc()
[ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]-
[ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13
To fix this problem, we must eat skbs in tcp_recv_skb().
Remove the inline keyword from tcp_recv_skb() definition since
it has three call sites.
Reported-by: Christian Becker <c.becker@traviangames.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ff905b1e4aad8ccbbb0d42f7137f19482742ff07 ]
commit 02275a2ee7c0 (tcp: don't abort splice() after small transfers)
added a regression.
[ 83.843570] INFO: rcu_sched self-detected stall on CPU
[ 83.844575] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 0, t=21002 jiffies, g=4457, c=4456, q=13132)
[ 83.844582] Task dump for CPU 6:
[ 83.844584] netperf R running task 0 8966 8952 0x0000000c
[ 83.844587] 0000000000000000 0000000000000006 0000000000006c6c 0000000000000000
[ 83.844589] 000000000000006c 0000000000000096 ffffffff819ce2bc ffffffffffffff10
[ 83.844592] ffffffff81088679 0000000000000010 0000000000000246 ffff880c4b9ddcd8
[ 83.844594] Call Trace:
[ 83.844596] [<ffffffff81088679>] ? vprintk_emit+0x1c9/0x4c0
[ 83.844601] [<ffffffff815ad449>] ? schedule+0x29/0x70
[ 83.844606] [<ffffffff81537bd2>] ? tcp_splice_data_recv+0x42/0x50
[ 83.844610] [<ffffffff8153beaa>] ? tcp_read_sock+0xda/0x260
[ 83.844613] [<ffffffff81537b90>] ? tcp_prequeue_process+0xb0/0xb0
[ 83.844615] [<ffffffff8153c0f0>] ? tcp_splice_read+0xc0/0x250
[ 83.844618] [<ffffffff814dc0c2>] ? sock_splice_read+0x22/0x30
[ 83.844622] [<ffffffff811b820b>] ? do_splice_to+0x7b/0xa0
[ 83.844627] [<ffffffff811ba4bc>] ? sys_splice+0x59c/0x5d0
[ 83.844630] [<ffffffff8119745b>] ? putname+0x2b/0x40
[ 83.844633] [<ffffffff8118bcb4>] ? do_sys_open+0x174/0x1e0
[ 83.844636] [<ffffffff815b6202>] ? system_call_fastpath+0x16/0x1b
if recv_actor() returns 0, we should stop immediately,
because looping wont give a chance to drain the pipe.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 02275a2ee7c0ea475b6f4a6428f5df592bc9d30b ]
TCP coalescing added a regression in splice(socket->pipe) performance,
for some workloads because of the way tcp_read_sock() is implemented.
The reason for this is the break when (offset + 1 != skb->len).
As we released the socket lock, this condition is possible if TCP stack
added a fragment to the skb, which can happen with TCP coalescing.
So let's go back to the beginning of the loop when this happens,
to give a chance to splice more frags per system call.
Doing so fixes the issue and makes GRO 10% faster than LRO
on CPU-bound splice() workloads instead of the opposite.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6731d2095bd4aef18027c72ef845ab1087c3ba63 ]
There are transients during normal FRTO procedure during which
the packets_in_flight can go to zero between write_queue state
updates and firing the resulting segments out. As FRTO processing
occurs during that window the check must be more precise to
not match "spuriously" :-). More specificly, e.g., when
packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic
branch that set cwnd into zero would not be taken and new segments
might be sent out later.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2e5f421211ff76c17130b4597bc06df4eeead24f ]
Commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start())
uncovered a bug in FRTO code :
tcp_process_frto() is setting snd_cwnd to 0 if the number
of in flight packets is 0.
As Neal pointed out, if no packet is in flight we lost our
chance to disambiguate whether a loss timeout was spurious.
We should assume it was a proper loss.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 973ec449bb4f2b8c514bacbcb4d9506fc31c8ce3 ]
Since commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()),
a nul snd_cwnd triggers an infinite loop in tcp_slow_start()
Avoid this infinite loop and log a one time error for further
analysis. FRTO code is suspected to cause this bug.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 66555e92fb7a619188c02cceae4bbc414f15f96d ]
On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN
retransmission to detect SYN/data drops. But if F-RTO is disabled,
icsk_retransmit is reset at step D of tcp_fastretrans_alert() (
under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use
total_retrans instead which accounts for SYN retransmission regardless
the use of F-RTO.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b5c37fe6e24eec194bb29d22fdd55d73bcc709bf ]
On sctp_endpoint_destroy, previously used sensitive keying material
should be zeroed out before the memory is returned, as we already do
with e.g. auth keys when released.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6ba542a291a5e558603ac51cda9bded347ce7627 ]
In sctp_setsockopt_auth_key, we create a temporary copy of the user
passed shared auth key for the endpoint or association and after
internal setup, we free it right away. Since it's sensitive data, we
should zero out the key before returning the memory back to the
allocator. Thus, use kzfree instead of kfree, just as we do in
sctp_auth_key_put().
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86 ]
Jamie Parsons reported a problem recently, in which the re-initalization of an
association (The duplicate init case), resulted in a loss of receive window
space. He tracked down the root cause to sctp_outq_teardown, which discarded
all the data on an outq during a re-initalization of the corresponding
association, but never reset the outq->outstanding_data field to zero. I wrote,
and he tested this fix, which does a proper full re-initalization of the outq,
fixing this problem, and hopefully future proofing us from simmilar issues down
the road.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b44108dbdbaa07c609bb5755e8dd6c2035236251 ]
git commit 9cb3a50c (ipv4: Invalidate the socket cached route on
pmtu events if possible) introduced a refcount problem. We don't
get a refcount on the route if we get it from__sk_dst_get(), but
we need one if we want to reuse this route because __sk_dst_set()
releases the refcount of the old route. This patch adds proper
refcount handling for that case. We introduce a 'new' flag to
indicate that we are going to use a new route and we release the
old route only if we replace it by a new one.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8141ed9fcedb278f4a3a78680591bef1e55f75fb ]
This implements a socket release callback function to check
if the socket cached route got invalid during the time
we owned the socket. The function is used from udp, raw
and ping sockets.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9cb3a50c5f63ed745702972f66eaee8767659acd ]
The route lookup in ipv4_sk_update_pmtu() might return a route
different from the route we cached at the socket. This is because
standart routes are per cpu, so each cpu has it's own struct rtable.
This means that we do not invalidate the socket cached route if the
NET_RX_SOFTIRQ is not served by the same cpu that the sending socket
uses. As a result, the cached route reused until we disconnect.
With this patch we invalidate the socket cached route if possible.
If the socket is owened by the user, we can't update the cached
route directly. A followup patch will implement socket release
callback functions for datagram sockets to handle this case.
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6f809da27c94425e07be4a64d5093e1df95188e9 ]
pmtu and redirect events are now handled in the protocols error handler,
so add an error handler for icmp6 to do this. It is needed in the case
when we have no socket context. Based on a patch by Duan Jiong.
Reported-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit fa1e492aa3cbafba9f8fc6d05e5b08a3091daf4a ]
Routes with locked mtu should not use learned pmtu informations,
so do not update the pmtu on these routes.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 38d523e2948162776903349c89d65f7b9370dadb ]
The output route check was introduced with git commit 261663b0
(ipv4: Don't use the cached pmtu informations for input routes)
during times when we cached the pmtu informations on the
inetpeer. Now the pmtu informations are back in the routes,
so this check is obsolete. It also had some unwanted side effects,
as reported by Timo Teras and Lukas Tribus.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b9149729ebdcfce63f853aa54a404c6a8f6ebbf3 ]
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4cc7c1cb7b11b6f3515bd9075527576a1eecc4aa ]
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7d5145d8eb2b9791533ffe4dc003b129b9696c48 ]
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 48856286b64e4b66ec62b94e504d0b29c1ade664 ]
A buggy or malicious frontend should not be able to confuse netback.
If we spot anything which is not as it should be then shutdown the
device and don't try to continue with the ring in a potentially
hostile state. Well behaved and non-hostile frontends will not be
penalised.
As well as making the existing checks for such errors fatal also add a
new check that ensures that there isn't an insane number of requests
on the ring (i.e. more than would fit in the ring). If the ring
contains garbage then previously is was possible to loop over this
insane number, getting an error each time and therefore not generating
any more pending requests and therefore not exiting the loop in
xen_netbk_tx_build_gops for an externded period.
Also turn various netdev_dbg calls which no precipitate a fatal error
into netdev_err, they are rate limited because the device is shutdown
afterwards.
This fixes at least one known DoS/softlockup of the backend domain.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ab54ee80aa7585f9666ff4dd665441d7ce41f1e8 ]
We have conflicting type qualifiers for "freg_t" in s390's ptrace.h and the
iphase atm device driver, which causes the compile error below.
Unfortunately the s390 typedef can't be renamed, since it's a user visible api,
nor can I change the include order in s390 code to avoid the conflict.
So simply rename the iphase typedef to a new name. Fixes this compile error:
In file included from drivers/atm/iphase.c:66:0:
drivers/atm/iphase.h:639:25: error: conflicting type qualifiers for 'freg_t'
In file included from next/arch/s390/include/asm/ptrace.h:9:0,
from next/arch/s390/include/asm/lowcore.h:12,
from next/arch/s390/include/asm/thread_info.h:30,
from include/linux/thread_info.h:54,
from include/linux/preempt.h:9,
from include/linux/spinlock.h:50,
from include/linux/seqlock.h:29,
from include/linux/time.h:5,
from include/linux/stat.h:18,
from include/linux/module.h:10,
from drivers/atm/iphase.c:43:
next/arch/s390/include/uapi/asm/ptrace.h:197:3: note: previous declaration of 'freg_t' was here
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 41ab3e31bd50b42c85ac0aa0469642866aee2a9a ]
ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch,
and the -1 return value from this function is bogus. Use the error
handling we already have in place in ip6gre_tunnel_xmit() for this error
case to fix this.
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9665d5d62487e8e7b1f546c00e11107155384b9a ]
When releasing a packet socket, the routine packet_set_ring() is reused
to free rings instead of allocating them. But when calling it for the
first time, it fills req->tp_block_nr with the value of rb->pg_vec_len
which in the second invocation makes it bail out since req->tp_block_nr
is greater zero but req->tp_block_size is zero.
This patch solves the problem by passing a zeroed auto-variable to
packet_set_ring() upon each invocation from packet_release().
As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING
and packet mmap), i.e. the original inclusion of TX ring support into
af_packet, but applies only to sockets with both RX and TX ring
allocated, which is probably why this was unnoticed all the time.
Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Cc: Johann Baudy <johann.baudy@gnu-log.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 559bcac35facfed49ab4f408e162971612dcfdf3 ]
1) rhine_tx() should use dev_kfree_skb() not dev_kfree_skb_irq()
2) rhine_slow_event_task's NAPI triggering logic is racey, it
should just hit the interrupt mask register. This is the
same as commit 7dbb491878a2c51d372a8890fa45a8ff80358af1
("r8169: avoid NAPI scheduling delay.") made to fix the same
problem in the r8169 driver. From Francois Romieu.
Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bd30e947207e2ea0ff2c08f5b4a03025ddce48d3 ]
They will be created at output, if ever needed. This avoids creating
empty neighbor entries when TPROXYing/Forwarding packets for addresses
that are not even directly reachable.
Note that IPv4 already handles it this way. No neighbor entries are
created for local input.
Tested by myself and customer.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 604dfd6efc9b79bce432f2394791708d8e8f6efc ]
The return value of pktgen_add_device() is not checked, so
even if we fail to add some device, for example, non-exist one,
we still see "OK:...". This patch fixes it.
After this patch, I got:
# echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0
-bash: echo: write error: No such device
# cat /proc/net/pktgen/kpktgend_0
Running:
Stopped:
Result: ERROR: can not add device non-exist
# echo "add_device eth0" > /proc/net/pktgen/kpktgend_0
# cat /proc/net/pktgen/kpktgend_0
Running:
Stopped: eth0
Result: OK: add_device=eth0
(Candidate for -stable)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5465740ace36f179de5bb0ccb5d46ddeb945e309 ]
Due to IP_GRE GSO support, GRE can recieve non linear skb which
results in panic in case of GRE_CSUM. Following patch fixes it by
using correct csum API.
Bug introduced in commit 6b78f16e4bdde3936b (gre: add GSO support)
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 794ed393b707f01858f5ebe2ae5eabaf89d00022 ]
Ben Greear reported crashes in ip_rcv_finish() on a stress
test involving many macvlans.
We tracked the bug to a dst use after free. ip_rcv_finish()
was calling dst->input() and got garbage for dst->input value.
It appears the bug is in loopback driver, lacking
a skb_dst_force() before calling netif_rx().
As a result, a non refcounted dst, normally protected by a
RCU read_lock section, was escaping this section and could
be freed before the packet being processed.
[<ffffffff813a3c4d>] loopback_xmit+0x64/0x83
[<ffffffff81477364>] dev_hard_start_xmit+0x26c/0x35e
[<ffffffff8147771a>] dev_queue_xmit+0x2c4/0x37c
[<ffffffff81477456>] ? dev_hard_start_xmit+0x35e/0x35e
[<ffffffff8148cfa6>] ? eth_header+0x28/0xb6
[<ffffffff81480f09>] neigh_resolve_output+0x176/0x1a7
[<ffffffff814ad835>] ip_finish_output2+0x297/0x30d
[<ffffffff814ad6d5>] ? ip_finish_output2+0x137/0x30d
[<ffffffff814ad90e>] ip_finish_output+0x63/0x68
[<ffffffff814ae412>] ip_output+0x61/0x67
[<ffffffff814ab904>] dst_output+0x17/0x1b
[<ffffffff814adb6d>] ip_local_out+0x1e/0x23
[<ffffffff814ae1c4>] ip_queue_xmit+0x315/0x353
[<ffffffff814adeaf>] ? ip_send_unicast_reply+0x2cc/0x2cc
[<ffffffff814c018f>] tcp_transmit_skb+0x7ca/0x80b
[<ffffffff814c3571>] tcp_connect+0x53c/0x587
[<ffffffff810c2f0c>] ? getnstimeofday+0x44/0x7d
[<ffffffff810c2f56>] ? ktime_get_real+0x11/0x3e
[<ffffffff814c6f9b>] tcp_v4_connect+0x3c2/0x431
[<ffffffff814d6913>] __inet_stream_connect+0x84/0x287
[<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49
[<ffffffff8108d695>] ? _local_bh_enable_ip+0x84/0x9f
[<ffffffff8108d6c8>] ? local_bh_enable+0xd/0x11
[<ffffffff8146763c>] ? lock_sock_nested+0x6e/0x79
[<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49
[<ffffffff814d6b49>] inet_stream_connect+0x33/0x49
[<ffffffff814632c6>] sys_connect+0x75/0x98
This bug was introduced in linux-2.6.35, in commit
7fee226ad2397b (net: add a noref bit on skb dst)
skb_dst_force() is enforced in dev_queue_xmit() for devices having a
qdisc.
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5d0feaff230c0abfe4a112e6f09f096ed99e0b2d ]
This was introduced in commit 6dccd16 "r8169: merge with version
6.001.00 of Realtek's r8169 driver". I did not find the version
6.001.00 online, but in 6.002.00 or any later r8169 from Realtek
this hunk is no longer present.
Also commit 05af214 "r8169: fix Ethernet Hangup for RTL8110SC
rev d" claims to have fixed this issue otherwise.
The magic compare mask of 0xfffe000 is dubious as it masks
parts of the Reserved part, and parts of the VLAN tag. But this
does not make much sense as the VLAN tag parts are perfectly
valid there. In matter of fact this seems to be triggered with
any VLAN tagged packet as RxVlanTag bit is matched. I would
suspect 0xfffe0000 was intended to test reserved part only.
Finally, this hunk is evil as it can cause more packets to be
handled than what was NAPI quota causing net/core/dev.c:
net_rx_action(): WARN_ON_ONCE(work > weight) to trigger, and
mess up the NAPI state causing device to hang.
As result, any system using VLANs and having high receive
traffic (so that NAPI poll budget limits rtl_rx) would result
in device hang.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a05948f296ce103989b28a2606e47d2e287c3c89 ]
Christoph Paasch found netxen could trigger a BUG in its dismantle
phase, in netxen_release_tx_buffer(), using full size TSO packets.
cmd_buf->frag_count includes the skb->data part, so the loop must
start at index 1 instead of 0, or else we can make an out
of bound access to cmd_buff->frag_array[MAX_SKB_FRAGS + 2]
Christoph provided the fixes in netxen_map_tx_skb() function.
In case of a dma mapping error, its better to clear the dma fields
so that we don't try to unmap them again in netxen_release_tx_buffer()
Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d721a1752ba544df8d7d36959038b26bc92bdf80 ]
If subtracting 12 from l leaves zero we'd do a zero size allocation,
leading to an oops later when we try to set the NUL terminator.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b74aa930ef49a3c0d8e4c1987f89decac768fb2c ]
commit 563d34d057 (tcp: dont drop MTU reduction indications)
added an error leading to incorrect accounting of
LINUX_MIB_LOCKDROPPEDICMPS
If socket is owned by the user, we want to increment
this SNMP counter, unless the message is a
(ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one.
Reported-by: Maciej ¯enczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej ¯enczykowski <maze@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ca4c7b35f75492de7fbf5ee95be07481c348caee ]
The lines
if (mlx4_is_mfunc(dev)) {
nreq = 2;
} else {
which hard code the number of requested msi-x vectors under multi-function
mode to two can be removed completely, since the firmware sets num_eqs and
reserved_eqs appropriately Thus, the code line:
nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq);
is by itself sufficient and correct for all cases. Currently, for mfunc
mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled.
This triples (one vector is used for the async events and commands EQ) the
horse power provided for processing of incoming packets on netdev RSS scheme,
IO initiators/targets commands processing flows, etc.
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 213815a1e6ae70b9648483b110bc5081795f99e8 ]
Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on
same host" introduced a regression under which a bridge acting as vSwitch
whose uplink is an mlx4 Ethernet device become non-operative in native
(non sriov) mode. This happens since broadcast ARP requests sent by VMs
were loopback-ed by the HW and hence the bridge learned VM source MACs
on both the VM and the uplink ports.
The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch
configuration or when the device is in selftest.
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d6fb3be544b46a7611a3373fcaa62b5b0be01888 ]
The xgmac driver assumes 1 frame per descriptor. If a frame larger than
the descriptor's buffer size is received, the frame will spill over into
the next descriptor. So check for received frames that span more than one
descriptor and discard them. This prevents a crash if we receive erroneous
large packets.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|