Age | Commit message (Collapse) | Author |
|
CONFIG_NO_HZ_FULL requires possibility of smp_send_reschedule()
for the calling CPU. Currently, it is used in inc_nr_running()
scheduler primitive only.
Nobody calls smp_send_reschedule() from preemptible context
(furthermore, it looks like it will be save if anybody use it
another way in the future). But anyway I add WARN_ON() here
just to return here if anything changes.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Simba-bridges
The SIMBA APB Bridges lacks the 'ranges' of-property describing the
PCI I/O and memory areas located beneath the bridge. Faking this
information has been performed by reading range registers in the
APB bridge, and calculating the corresponding areas.
In commit 01f94c4a6ced476ce69b895426fc29bfc48c69bd
("Fix sabre pci controllers with new probing scheme.") a bug was
introduced into this calculation, causing the PCI memory areas
to be calculated incorrectly: The shift size was set to be
identical for I/O and MEM ranges, which is incorrect.
This patch set the shift size of the MEM range back to the
value used before 01f94c4a6ced476ce69b895426fc29bfc48c69bd.
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that we have 64-bits for PMDs we can stop using special encodings
for the huge PMD values, and just put real PTEs in there.
We allocate a _PAGE_PMD_HUGE bit to distinguish between plain PMDs and
huge ones. It is the same for both 4U and 4V PTE layouts.
We also use _PAGE_SPECIAL to indicate the splitting state, since a
huge PMD cannot also be special.
All of the PMD --> PTE translation code disappears, and most of the
huge PMD bit modifications and tests just degenerate into the PTE
operations. In particular USER_PGTABLE_CHECK_PMD_HUGE becomes
trivial.
As a side effect, normal PMDs don't shift the physical address around.
This also speeds up the page table walks in the TLB miss paths since
they don't have to do the shifts any more.
Another non-trivial aspect is that pte_modify() has to be changed
to preserve the _PAGE_PMD_HUGE bits as well as the page size field
of the pte.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To make the page tables compact, we were using 32-bit PGDs and PMDs.
We only had to support <= 43 bits of physical addresses so this was
quite feasible.
In order to support larger physical addresses we have to move to
64-bit PGDs and PMDs.
Most of the changes are straight-forward:
1) {pgd,pmd}_t --> unsigned long
2) Anything that tries to use plain "unsigned int" types with pgd/pmd
values needs to be adjusted. In particular things like "0U" become
"0UL".
3) {PGDIR,PMD}_BITS decrease by one.
4) In the assembler page table walkers, use "ldxa" instead of "lduwa"
and adjust the low bit masks to clear out the low 3 bits instead of
just the low 2 bits during pgd/pmd address formation.
Also, use PTRS_PER_PGD and PTRS_PER_PMD in the sizing of the
swapper_{pg_dir,low_pmd_dir} arrays.
This patch does not try to take advantage of having 64-bits in the
PMDs to simplify the hugepage code, that will come in a subsequent
change.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The impetus for this is that we would like to move to 64-bit PMDs and
PGDs, but that would result in only supporting a 42-bit address space
with the current page table layout. It'd be nice to support at least
43-bits.
The reason we'd end up with only 42-bits after making PMDs and PGDs
64-bit is that we only use half-page sized PTE tables in order to make
PMDs line up to 4MB, the hardware huge page size we use.
So what we do here is we make huge pages 8MB, and fabricate them using
4MB hw TLB entries.
Facilitate this by providing a "REAL_HPAGE_SHIFT" which is used in
places that really need to operate on hardware 4MB pages.
Use full pages (512 entries) for PTE tables, and adjust PMD_SHIFT,
PGD_SHIFT, and the build time CPP test as needed. Use a CPP test to
make sure REAL_HPAGE_SHIFT and the _PAGE_SZHUGE_* we use match up.
This makes the pgtable cache completely unused, so remove the code
managing it and the state used in mm_context_t. Now we have less
spinlocks taken in the page table allocation path.
The technique we use to fabricate the 8MB pages is to transfer bit 22
from the missing virtual address into the PTEs physical address field.
That takes care of the transparent huge pages case.
For hugetlb, we fill things in at the PTE level and that code already
puts the sub huge page physical bits into the PTEs, based upon the
offset, so there is nothing special we need to do. It all just works
out.
So, a small amount of complexity in the THP case, but this code is
about to get much simpler when we move the 64-bit PMDs as we can move
away from the fancy 32-bit huge PMD encoding and just put a real PTE
value in there.
With bug fixes and help from Bob Picco.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Choose PAGE_OFFSET dynamically based upon cpu type.
Original UltraSPARC-I (spitfire) chips only supported a 44-bit
virtual address space.
Newer chips (T4 and later) support 52-bit virtual addresses
and up to 47-bits of physical memory space.
Therefore we have to adjust PAGE_SIZE dynamically based upon
the capabilities of the chip.
Note that this change alone does not allow us to support > 43-bit
physical memory, to do that we need to re-arrange our page table
support. The current encodings of the pmd_t and pgd_t pointers
restricts us to "32 + 11" == 43 bits.
This change can waste quite a bit of memory for the various tables.
In particular, a future change should work to size and allocate
kern_linear_bitmap[] and sparc64_valid_addr_bitmap[] dynamically.
This isn't easy as we really cannot take a TLB miss when accessing
kern_linear_bitmap[]. We'd have to lock it into the TLB or similar.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
Some parts of the code use '41' others use '42', make them
all use the same value.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
This way we can see exactly what they are derived from, and in particular
how they would change if we were to use a different PAGE_OFFSET value.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
This makes clearer the implications for a given choosen
value.
Based upon patches by Bob Picco.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
This pertains to all of the computations of the kernel fast
TLB miss xor values.
Based upon a patch by Bob Picco.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
Older UltraSPARC chips had an address space hole due to the MMU only
supporting 44-bit virtual addresses.
The top end of this hole also has the same value as the current
definition of PAGE_OFFSET, so this can be confusing.
Consolidate the defines for the userspace mmap exclusion range into
page_64.h and use them in sys_sparc_64.c and hugetlbpage.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DeviceTree updates for 3.13. This is a bit larger pull request than
usual for this cycle with lots of clean-up.
- Cross arch clean-up and consolidation of early DT scanning code.
- Clean-up and removal of arch prom.h headers. Makes arch specific
prom.h optional on all but Sparc.
- Addition of interrupts-extended property for devices connected to
multiple interrupt controllers.
- Refactoring of DT interrupt parsing code in preparation for
deferred probe of interrupts.
- ARM cpu and cpu topology bindings documentation.
- Various DT vendor binding documentation updates"
* tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
powerpc: add missing explicit OF includes for ppc
dt/irq: add empty of_irq_count for !OF_IRQ
dt: disable self-tests for !OF_IRQ
of: irq: Fix interrupt-map entry matching
MIPS: Netlogic: replace early_init_devtree() call
of: Add Panasonic Corporation vendor prefix
of: Add Chunghwa Picture Tubes Ltd. vendor prefix
of: Add AU Optronics Corporation vendor prefix
of/irq: Fix potential buffer overflow
of/irq: Fix bug in interrupt parsing refactor.
of: set dma_mask to point to coherent_dma_mask
of: add vendor prefix for PHYTEC Messtechnik GmbH
DT: sort vendor-prefixes.txt
of: Add vendor prefix for Cadence
of: Add empty for_each_available_child_of_node() macro definition
arm/versatile: Fix versatile irq specifications.
of/irq: create interrupts-extended property
microblaze/pci: Drop PowerPC-ism from irq parsing
of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
of/irq: Use irq_of_parse_and_map()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
"The main changes in this cycle are:
- (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
van Riel, Peter Zijlstra et al. Yay!
- optimize preemption counter handling: merge the NEED_RESCHED flag
into the preempt_count variable, by Peter Zijlstra.
- wait.h fixes and code reorganization from Peter Zijlstra
- cfs_bandwidth fixes from Ben Segall
- SMP load-balancer cleanups from Peter Zijstra
- idle balancer improvements from Jason Low
- other fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
stop_machine: Fix race between stop_two_cpus() and stop_cpus()
sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
sched: Fix asymmetric scheduling for POWER7
sched: Move completion code from core.c to completion.c
sched: Move wait code from core.c to wait.c
sched: Move wait.c into kernel/sched/
sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
sched: Avoid throttle_cfs_rq() racing with period_timer stopping
sched: Guarantee new group-entities always have weight
sched: Fix hrtimer_cancel()/rq->lock deadlock
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
sched: Fix race on toggling cfs_bandwidth_used
sched: Remove extra put_online_cpus() inside sched_setaffinity()
sched/rt: Fix task_tick_rt() comment
sched/wait: Fix build breakage
sched/wait: Introduce prepare_to_wait_event()
sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
sched: Remove get_online_cpus() usage
sched: Fix race in migrate_swap_stop()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ changes from Ingo Molnar:
"The biggest change this cycle are the softirq/hardirq stack
interaction and nesting fixes, cleanups and reorganizations from
Frederic. This is the longer followup story to the softirq nesting
fix that is already upstream (commit ded797547548: "irq: Force hardirq
exit's softirq processing on its own stack")"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: bcm2835: Convert to use IRQCHIP_DECLARE macro
powerpc: Tell about irq stack coverage
x86: Tell about irq stack coverage
irq: Optimize softirq stack selection in irq exit
irq: Justify the various softirq stack choices
irq: Improve a bit softirq debugging
irq: Optimize call to softirq on hardirq exit
irq: Consolidate do_softirq() arch overriden implementations
x86/irq: Correct comment about i8259 initialization
|
|
|
|
Resolve cherry-picking conflicts:
Conflicts:
mm/huge_memory.c
mm/memory.c
mm/mprotect.c
See this upstream merge commit for more details:
52469b4fcd4f Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull networking fixes from David Miller:
"Sorry I let so much accumulate, I was in Buffalo and wanted a few
things to cook in my tree for a while before sending to you. Anyways,
it's a lot of little things as usual at this stage in the game"
1) Make bonding MAINTAINERS entry reflect reality, from Andy
Gospodarek.
2) Fix accidental sock_put() on timewait mini sockets, from Eric
Dumazet.
3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6
addresses, from François CACHEREUL.
4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
Carpenter.
5) tcp_shifted_skb() doesn't take handle FINs properly, from Eric
Dumazet.
6) SFC driver bug fixes from Ben Hutchings.
7) Fix TX packet scheduling wedge after channel change in ath9k driver,
from Felix Fietkau.
8) Fix user after free in BPF JIT code, from Alexei Starovoitov.
9) Source address selection test is reversed in
__ip_route_output_key(), fix from Jiri Benc.
10) VLAN and CAN layer mis-size netlink attributes, from Marc
Kleine-Budde.
11) Fix permission checks in sysctls to use current_euid() instead of
current_uid(). From Eric W Biederman.
12) IPSEC policies can go away while a timer is still pending for them,
add appropriate ref-counting to fix, from Steffen Klassert.
13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
chips, from Nguyen Hong Ky and Simon Horman.
14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.
15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
Ghorbel.
16) Fix typo in fq_change(), we were assigning "initial quantum" to
"quantum". From Eric Dumazet.
17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
FQ packet scheduler does not pace those flows properly. Also from
Eric Dumazet.
18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.
19) l2tp_xmit_skb() can be called from process context, not just softirq
context, so we must always make sure to BH disable around it. From
Eric Dumazet.
20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
packet scheduler. From Stephen Hemminger.
21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
Carpenter and Salva Peiró.
22) Fix PHY reset and other issues in dm9000 driver, from Nikita
Kiryanov and Michael Abbott.
23) When hardware can do SCTP crc32 checksums, we accidently don't
disable the csum offload when IPSEC transformations have been
applied. From Fan Du and Vlad Yasevich.
24) Tail loss probing in TCP leaves the socket in the wrong congestion
avoidance state. From Yuchung Cheng.
25) In CPSW driver, enable NAPI before interrupts are turned on, from
Markus Pargmann.
26) Integer underflow and dual-assignment in YAM hamradio driver, from
Dan Carpenter.
27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
unclone it. This fixes various hard to track down crashes in
drivers where the SKBs ->gso_segs was changing right from underneath
the driver during TX queueing. From Eric Dumazet.
28) Fix the handling of VLAN IDs, and in particular the special IDs 0
and 4095, in the bridging layer. From Toshiaki Makita.
29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.
30) Fix race in socket credential passing, from Daniel Borkmann.
31) WHen NETLABEL is disabled, we don't validate CIPSO packets properly,
from Seif Mazareeb.
32) Fix identification of fragmented frames in ipv4/ipv6 UDP
Fragmentation Offload output paths, from Jiri Pirko.
33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
Elior.
34) When we removed the explicit neighbour pointer from ipv6 routes a
slight regression was introduced for users such as IPVS, xt_TEE, and
raw sockets. We mix up the users requested destination address with
the routes assigned nexthop/gateway. From Julian Anastasov and
Simon Horman.
35) Fix stack overruns in rt6_probe(), the issue is that can end up
doing two full packet xmit paths at the same time when emitting
neighbour discovery messages. From Hannes Frederic Sowa.
36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
Mariusz Ceier.
37) Make sure to set TCP sk_pacing_rate after the first legitimate RTT
sample, from Neal Cardwell.
38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
Steffen Klassert.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits)
ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
ax88179_178a: Correct the RX error definition in RX header
Revert "bridge: only expire the mdb entry when query is received"
tcp: initialize passive-side sk_pacing_rate after 3WHS
davinci_emac.c: Fix IFF_ALLMULTI setup
mac802154: correct a typo in ieee802154_alloc_device() prototype
ipv6: probe routes asynchronous in rt6_probe
netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
ipv6: fill rt6i_gateway with nexthop address
ipv6: always prefer rt6i_gateway if present
bnx2x: Set NETIF_F_HIGHDMA unconditionally
bnx2x: Don't pretend during register dump
bnx2x: Lock DMAE when used by statistic flow
bnx2x: Prevent null pointer dereference on error flow
bnx2x: Fix config when SR-IOV and iSCSI are enabled
bnx2x: Fix Coalescing configuration
bnx2x: Unlock VF-PF channel on MAC/VLAN config error
bnx2x: Prevent an illegal pointer dereference during panic
bnx2x: Fix Maximum CoS estimation for VFs
drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing
...
|
|
Use for_each_node_by_type() to iterate all cpu nodes in the
system.
Provide and overridable function arch_find_n_match_cpu_physical_id,
which sees if the given device node matches 'cpu' and if so sets
'*thread' when non-NULL to the cpu thread number within the core.
The default implementation behaves the same as the existing code.
Add a sparc64 implementation.
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
|
|
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Move of_address_to_resource and of_iomap declarations to common code. These
only differ on sparc, but the declarations are the same and don't need to
be in arch header.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
|
|
Implement of_node_to_nid as weak function to remove the dependency on
asm/prom.h. This is in preparation to make prom.h optional.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
|
|
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.
Conflicts:
arch/avr32/include/asm/Kbuild
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
on x86 system with net.core.bpf_jit_enable = 1
sudo tcpdump -i eth1 'tcp port 22'
causes the warning:
[ 56.766097] Possible unsafe locking scenario:
[ 56.766097]
[ 56.780146] CPU0
[ 56.786807] ----
[ 56.793188] lock(&(&vb->lock)->rlock);
[ 56.799593] <Interrupt>
[ 56.805889] lock(&(&vb->lock)->rlock);
[ 56.812266]
[ 56.812266] *** DEADLOCK ***
[ 56.812266]
[ 56.830670] 1 lock held by ksoftirqd/1/13:
[ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
[ 56.849757]
[ 56.849757] stack backtrace:
[ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
[ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
[ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
[ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
[ 56.923006] Call Trace:
[ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76
[ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
[ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
[ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
[ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
[ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
[ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
[ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
[ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
[ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
[ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
[ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
[ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
[ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
[ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
[ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
[ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
[ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
[ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
[ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
[ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
[ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70
cannot reuse jited filter memory, since it's readonly,
so use original bpf insns memory to hold work_struct
defer kfree of sk_filter until jit completed freeing
tested on x86_64 and i386
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit ebd97be635 ('PCI: remove ARCH_SUPPORTS_MSI kconfig option')
removes the ARCH_SUPPORTS_MSI Kconfig option that allowed
architectures to indicate whether they support PCI MSI or not. Now,
PCI MSI support can be compiled in on any architecture thanks to the
use of weak functions thanks to 4287d824f265 ('PCI: use weak functions
for MSI arch-specific functions').
So, architecture specific code is now responsible to ensure that its
PCI MSI code builds in all cases, or be appropriately conditionally
compiled.
On Sparc, the MSI support is only provided for Sparc64, so the
ARCH_SUPPORTS_MSI kconfig option was only selected for SPARC64, and
not for the Sparc architecture as a whole. Therefore, removing
ARCH_SUPPORTS_MSI broke Sparc32 configurations with CONFIG_PCI_MSI=y,
because the Sparc-specific MSI code is not designed to be built on
Sparc32.
To solve this, this commit ensures that the Sparc MSI code is only
built on Sparc64. This is done thanks to a new Kconfig Makefile helper
option SPARC64_PCI_MSI, modeled after the existing SPARC64_PCI. The
SPARC64_PCI_MSI option is an hidden option that is true when both
Sparc64 PCI support is enabled and MSI is enabled. The
arch/sparc/kernel/pci_msi.c file is now only built when
SPARC64_PCI_MSI is true.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch proposes to remove the IRQF_DISABLED flag from sparc architecture
code. It's a NOOP since 2.6.35 and it will be removed one day.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The length argument to strlcpy was still wrong. It could overflow the end of
full_boot_str by 5 bytes. Instead of strcat and strlcpy, just use snprint.
Reported-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All arch overriden implementations of do_softirq() share the following
common code: disable irqs (to avoid races with the pending check),
check if there are softirqs pending, then execute __do_softirq() on
a specific stack.
Consolidate the common parts such that archs only worry about the
stack switch.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 117a0c5fc9c2d06045bd217385b2b39ea426b5a6 ("sparc: kernel: using
strlcpy() instead of strcpy()") added a bug to ldom_reboot in
arch/sparc/kernel/ds.c
- strcpy(full_boot_str + strlen("boot "), boot_command);
+ strlcpy(full_boot_str + strlen("boot "), boot_command,
+ sizeof(full_boot_str + strlen("boot ")));
That last sizeof() expression evaluates to sizeof(size_t) which is
not what was intended.
Also even the corrected:
sizeof(full_boot_str) + strlen("boot ")
is not right as the destination buffer length is just plain
"sizeof(full_boot_str)" and that's what the final argument
should be.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I found the following pattern that leads in to interesting findings:
grep -r "ret.*|=.*__put_user" *
grep -r "ret.*|=.*__get_user" *
grep -r "ret.*|=.*__copy" *
The __put_user() calls in compat_ioctl.c, ptrace compat, signal compat,
since those appear in compat code, we could probably expect the kernel
addresses not to be reachable in the lower 32-bit range, so I think they
might not be exploitable.
For the "__get_user" cases, I don't think those are exploitable: the worse
that can happen is that the kernel will copy kernel memory into in-kernel
buffers, and will fail immediately afterward.
The alpha csum_partial_copy_from_user() seems to be missing the
access_ok() check entirely. The fix is inspired from x86. This could
lead to information leak on alpha. I also noticed that many architectures
map csum_partial_copy_from_user() to csum_partial_copy_generic(), but I
wonder if the latter is performing the access checks on every
architectures.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently hugepage migration works well only for pmd-based hugepages
(mainly due to lack of testing,) so we had better not enable migration of
other levels of hugepages until we are ready for it.
Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
page table walk and check pud/pmd_huge() there, so they are safe. But the
other users (softoffline and memory hotremove) don't do this, so without
this patch they can try to migrate unexpected types of hugepages.
To prevent this, we introduce hugepage_migration_support() as an
architecture dependent check of whether hugepage are implemented on a pmd
basis or not. And on some architecture multiple sizes of hugepages are
available, so hugepage_migration_support() also checks hugepage size.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform changes from Olof Johansson:
"This branch contains mostly additions and changes to platform
enablement and SoC-level drivers. Since there's sometimes a
dependency on device-tree changes, there's also a fair amount of
those in this branch.
Pieces worth mentioning are:
- Mbus driver for Marvell platforms, allowing kernel configuration
and resource allocation of on-chip peripherals.
- Enablement of the mbus infrastructure from Marvell PCI-e drivers.
- Preparation of MSI support for Marvell platforms.
- Addition of new PCI-e host controller driver for Tegra platforms
- Some churn caused by sharing of macro names between i.MX 6Q and 6DL
platforms in the device tree sources and header files.
- Various suspend/PM updates for Tegra, including LP1 support.
- Versatile Express support for MCPM, part of big little support.
- Allwinner platform support for A20 and A31 SoCs (dual and quad
Cortex-A7)
- OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
The code that touches other architectures are patches moving MSI
arch-specific functions over to weak symbols and removal of
ARCH_SUPPORTS_MSI, acked by PCI maintainers"
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (266 commits)
tegra-cpuidle: provide stub when !CONFIG_CPU_IDLE
PCI: tegra: replace devm_request_and_ioremap by devm_ioremap_resource
ARM: tegra: Drop ARCH_SUPPORTS_MSI and sort list
ARM: dts: vf610-twr: enable i2c0 device
ARM: dts: i.MX51: Add one more I2C2 pinmux entry
ARM: dts: i.MX51: Move pins configuration under "iomuxc" label
ARM: dtsi: imx6qdl-sabresd: Add USB OTG vbus pin to pinctrl_hog
ARM: dtsi: imx6qdl-sabresd: Add USB host 1 VBUS regulator
ARM: dts: imx27-phytec-phycore-som: Enable AUDMUX
ARM: dts: i.MX27: Disable AUDMUX in the template
ARM: dts: wandboard: Add support for SDIO bcm4329
ARM: i.MX5 clocks: Remove optional clock setup (CKIH1) from i.MX51 template
ARM: dts: imx53-qsb: Make USBH1 functional
ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module
ARM i.MX6Q: dts: Enable SPI NOR flash on Phytec phyFLEX-i.MX6 Ouad module
ARM: dts: imx6qdl-sabresd: Add touchscreen support
ARM: imx: add ocram clock for imx53
ARM: dts: imx: ocram size is different between imx6q and imx6dl
ARM: dts: imx27-phytec-phycore-som: Fix regulator settings
ARM: dts: i.MX27: Remove clock name from CPU node
...
|
|
ERROR: "flush_ptrace_access" [drivers/staging/lustre/lustre/libcfs/libcfs.ko]
undefined!
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The functions
__down_read
__down_read_trylock
__down_write
__down_write_trylock
__up_read
__up_write
__downgrade_write
are implemented inline, so remove corresponding EXPORT_SYMBOLs
(They lead to compile errors on RT kernel).
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra into next/soc
From: Stephen Warren:
ARM: tegra: core SoC enhancements for 3.12
This branch includes a number of enhancements to core SoC support for
Tegra devices. The major new features are:
* Adds a new CPU-power-gated cpuidle state for Tegra114.
* Adds initial system suspend support for Tegra114, initially supporting
just CPU-power-gating during suspend.
* Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
both gates CPU power, and places the DRAM into self-refresh mode.
* A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
from arch/arm/mach-tegra/ to drivers/pci/host/.
The PCIe driver work depends on the following tag from Thomas Petazzoni:
git://git.infradead.org/linux-mvebu.git mis-3.12.2
... which is merged into the middle of this pull request.
* tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra: (33 commits)
ARM: tegra: disable LP2 cpuidle state if PCIe is enabled
MAINTAINERS: Add myself as Tegra PCIe maintainer
PCI: tegra: set up PADS_REFCLK_CFG1
PCI: tegra: Add Tegra 30 PCIe support
PCI: tegra: Move PCIe driver to drivers/pci/host
PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platforms
ARM: tegra: add LP1 suspend support for Tegra114
ARM: tegra: add LP1 suspend support for Tegra20
ARM: tegra: add LP1 suspend support for Tegra30
ARM: tegra: add common LP1 suspend support
clk: tegra114: add LP1 suspend/resume support
ARM: tegra: config the polarity of the request of sys clock
ARM: tegra: add common resume handling code for LP1 resuming
ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pci
of: pci: add registry of MSI chips
PCI: Introduce new MSI chip infrastructure
PCI: remove ARCH_SUPPORTS_MSI kconfig option
PCI: use weak functions for MSI arch-specific functions
ARM: tegra: unify Tegra's Kconfig a bit more
ARM: tegra: remove the limitation that Tegra114 can't support suspend
...
Signed-off-by: Kevin Hilman <khilman@linaro.org>
|
|
Now that we have weak versions for each of the PCI MSI architecture
functions, we can actually build the MSI support for all platforms,
regardless of whether they provide or not architecture-specific
versions of those functions. For this reason, the ARCH_SUPPORTS_MSI
hidden kconfig boolean becomes useless, and this patch gets rid of it.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
1)Use kvmap_itlb_longpath instead of kvmap_dtlb_longpath.
2)Handle page #0 only, don't handle page #1: bleu -> blu
(KERNBASE is 0x400000, so #1 does not exist too. But everything
is possible in the future. Fix to not to have problems later.)
3)Remove unused kvmap_itlb_nonlinear.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
(From v1 to v2: changed comment)
On the way linux_sparc_syscall32->linux_syscall_trace32->goto 2f,
register %o5 doesn't clear its second 32-bit.
Fix that.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename to make the function name better conform to its goal.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass 1 in %o1 to indicate that syscall_trace accounts exit.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Syscall number is passed instead of return value. Fix that.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add SPARC64X chip type in cpumap.c to correctly
build CPU distribution map that spans all CPUs.
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.
This removes all the arch/sparc uses of the __cpuinit macros from
C files and removes __CPUINIT from assembly files. Note that even
though arch/sparc/kernel/trampoline_64.S has instances of ".previous"
in it, they are all paired off against explicit ".section" directives,
and not implicitly paired with __CPUINIT (unlike mips and arm were).
[1] https://lkml.org/lkml/2013/5/20/589
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs stuff from Al Viro:
"O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
making simple_lookup() usable for filesystems with non-NULL s_d_op,
which allows us to get rid of quite a bit of ugliness"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sunrpc: now we can just set ->s_d_op
cgroup: we can use simple_lookup() now
efivarfs: we can use simple_lookup() now
make simple_lookup() usable for filesystems that set ->s_d_op
configfs: don't open-code d_alloc_name()
__rpc_lookup_create_exclusive: pass string instead of qstr
rpc_create_*_dir: don't bother with qstr
llist: llist_add() can use llist_add_batch()
llist: fix/simplify llist_add() and llist_add_batch()
fput: turn "list_head delayed_fput_list" into llist_head
fs/file_table.c:fput(): add comment
Safer ABI for O_TMPFILE
|
|
Pull networking fixes from David Miller:
"Just a bunch of small fixes and tidy ups:
1) Finish the "busy_poll" renames, from Eliezer Tamir.
2) Fix RCU stalls in IFB driver, from Ding Tianhong.
3) Linearize buffers properly in tun/macvtap zerocopy code.
4) Don't crash on rmmod in vxlan, from Pravin B Shelar.
5) Spinlock used before init in alx driver, from Maarten Lankhorst.
6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry
Kravkov.
7) Dummy and ifb driver load failure paths can oops, fixes from Tan
Xiaojun and Ding Tianhong.
8) Correct MTU calculations in IP tunnels, from Alexander Duyck.
9) Account all TCP retransmits in SNMP stats properly, from Yuchung
Cheng.
10) atl1e and via-rhine do not handle DMA mapping failures properly,
from Neil Horman.
11) Various equal-cost multipath route fixes in ipv6 from Hannes
Frederic Sowa"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
ipv6: only static routes qualify for equal cost multipathing
via-rhine: fix dma mapping errors
atl1e: fix dma mapping warnings
tcp: account all retransmit failures
usb/net/r815x: fix cast to restricted __le32
usb/net/r8152: fix integer overflow in expression
net: access page->private by using page_private
net: strict_strtoul is obsolete, use kstrtoul instead
drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe
drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe
drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe
net/usb: add relative mii functions for r815x
net/tipc: use %*phC to dump small buffers in hex form
qlcnic: Adding Maintainers.
gre: Fix MTU sizing check for gretap tunnels
pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
pkt_sched: sch_qfq: improve efficiency of make_eligible
gso: Update tunnel segmentation to support Tx checksum offload
inet: fix spacing in assignment
ifb: fix oops when loading the ifb failed
...
|