path: root/arch/sparc/mm/init_64.c
Age | Commit message | Author
2013-06-19 | sparc64 address-congruence property | bob picco
The Machine Description (MD) property "address-congruence-offset" is optional. According to the MD specification the value is assumed 0UL when not present. This caused early boot failure on T5. Signed-off-by: Bob Picco <bob.picco@oracle.com> CC: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
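The fallback described above can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: `struct md_prop`, `md_find_prop()` and `congruence_offset()` are hypothetical stand-ins for the machine-description lookup, and only the property name and the 0UL default come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one machine-description property. */
struct md_prop {
	const char *name;
	unsigned long value;
};

/* Hypothetical lookup: returns NULL when the property is absent. */
static const struct md_prop *md_find_prop(const struct md_prop *props,
					  size_t count, const char *name)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(props[i].name, name) == 0)
			return &props[i];
	}
	return NULL;
}

/* An absent optional property is treated as 0UL rather than as an error. */
static unsigned long congruence_offset(const struct md_prop *props,
				       size_t count)
{
	const struct md_prop *p;

	assert(props != NULL || count == 0);
	p = md_find_prop(props, count, "address-congruence-offset");
	return p ? p->value : 0UL;
}
```

The point of the sketch is only that "property missing" and "property present with value 0" lead to the same result, so a caller never has to treat absence as a fatal condition.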
2013-05-07 | mm/SPARC: use common help functions to free reserved pages | Jiang Liu
Use common help functions to free reserved pages. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-04 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc | David S. Miller
Merge sparc bug fixes that didn't make it into v3.9 into sparc-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-29 | sparse-vmemmap: specify vmemmap population range in bytes | Johannes Weiner
The sparse code, when asking the architecture to populate the vmemmap, specifies the section range as a starting page and a number of pages. This is an awkward interface, because none of the arch-specific code actually thinks of the range in terms of 'struct page' units and always translates it to bytes first. In addition, later patches mix huge page and regular page backing for the vmemmap. For this, they need to call vmemmap_populate_basepages() on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these are not necessarily multiples of the 'struct page' size and so this unit is too coarse. Just translate the section range into bytes once in the generic sparse code, then pass byte ranges down the stack. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: David S. Miller <davem@davemloft.net> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
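The one-time translation from a page range to a byte range might look roughly like the following. It is a simplified sketch under stated assumptions: `struct byte_range` and `section_to_bytes()` are hypothetical names, and the vmemmap base address and the `struct page` size are passed as parameters only so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte range handed down to architecture code. */
struct byte_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/*
 * Convert "first page frame + number of pages" into the byte range that
 * the matching struct page entries occupy inside the vmemmap array.
 * vmemmap_base and struct_page_size are assumptions of this sketch.
 */
static struct byte_range section_to_bytes(unsigned long vmemmap_base,
					  size_t struct_page_size,
					  unsigned long start_pfn,
					  unsigned long nr_pages)
{
	struct byte_range r;

	assert(struct_page_size > 0);
	r.start = vmemmap_base + start_pfn * struct_page_size;
	r.end = r.start + nr_pages * struct_page_size;
	return r;
}
```

Once the range is in bytes, a caller can split it at PAGE_SIZE or PMD_SIZE boundaries without caring whether those boundaries fall on a multiple of the `struct page` size.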
2013-04-08 | sparc64: Do not save/restore interrupts in get_new_mmu_context() | Kirill Tkhai
get_new_mmu_context() is always called with interrupts disabled. So it's possible to do this micro optimization. (Also fix the comment to switch_mm, which is called in both cases) Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 | sparc64: Do not change num_physpages during initmem freeing | Tkhai Kirill
Common hibernation code looks at num_physpages during suspend and restore. Restore is able to be called from initcall, which is before initmem freeing. This case leads to restore fail. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> CC: David Miller <davem@davemloft.net> CC: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-23 | memory-hotplug: remove memmap of sparse-vmemmap | Tang Chen
Introduce a new API vmemmap_free() to free and remove vmemmap pagetables. Since pagetable implements are different, each architecture has to provide its own version of vmemmap_free(), just like vmemmap_populate(). Note: vmemmap_free() is not implemented for ia64, ppc, s390, and sparc. [mhocko@suse.cz: fix implicit declaration of remove_pagetable] Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 | memory-hotplug: implement register_page_bootmem_info_section of sparse-vmemmap | Yasuaki Ishimatsu
For removing memmap region of sparse-vmemmap which is allocated bootmem, memmap region of sparse-vmemmap needs to be registered by get_page_bootmem(). So the patch searches pages of virtual mapping and registers the pages by get_page_bootmem(). NOTE: register_page_bootmem_memmap() is not implemented for ia64, ppc, s390, and sparc. So introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when platform doesn't support it. It's implemented by adding a new Kconfig option named CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected by memory-hotplug feature fully supported archs (currently only on x86_64). Since we have 2 config options called MEMORY_HOTPLUG and MEMORY_HOTREMOVE used for memory hot-add and hot-remove separately, and codes in function register_page_bootmem_info_node() are only used for collecting information for hot-remove, so reside it under MEMORY_HOTREMOVE. Besides page_isolation.c selected by MEMORY_ISOLATION under MEMORY_HOTPLUG is also such case, move it too. [mhocko@suse.cz: put register_page_bootmem_memmap inside CONFIG_MEMORY_HOTPLUG_SPARSE] [linfeng@cn.fujitsu.com: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node()] [mhocko@suse.cz: remove the arch specific functions without any implementation] [linfeng@cn.fujitsu.com: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed] [rientjes@google.com: fix defined but not used warning] Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Wu Jianguo <wujianguo@huawei.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 | Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull x86 mm changes from Peter Anvin: "This is a huge set of several partly interrelated (and concurrently developed) changes, which is why the branch history is messier than one would like. The *really* big items are two humongous patchsets mostly developed by Yinghai Lu at my request, which completely revamp the way we create initial page tables. In particular, rather than estimating how much memory we will need for page tables and then building them into that memory -- a calculation that has been shown to be incredibly fragile -- we now build them (on 64 bits) with the aid of a "pseudo-linear mode" -- a #PF handler which creates temporary page tables on demand. This has several advantages: 1. It makes it much easier to support things that need access to data very early (a follow-on patchset uses this to load microcode way early in the kernel startup). 2. It allows the kernel and all the kernel data objects to be invoked from above the 4 GB limit. This allows kdump to work on very large systems. 3. It greatly reduces the difference between Xen and native (Xen's equivalent of the #PF handler are the temporary page tables created by the domain builder), eliminating a bunch of fragile hooks. The patch series also gets us a bit closer to W^X. Additional work in this pull is the 64-bit get_user() work which you were also involved with, and a bunch of cleanups/speedups to __phys_addr()/__pa()."
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits) x86, mm: Move reserving low memory later in initialization x86, doc: Clarify the use of asm("%edx") in uaccess.h x86, mm: Redesign get_user with a __builtin_choose_expr hack x86: Be consistent with data size in getuser.S x86, mm: Use a bitfield to mask nuisance get_user() warnings x86/kvm: Fix compile warning in kvm_register_steal_time() x86-32: Add support for 64bit get_user() x86-32, mm: Remove reference to alloc_remap() x86-32, mm: Remove reference to resume_map_numa_kva() x86-32, mm: Rip out x86_32 NUMA remapping code x86/numa: Use __pa_nodebug() instead x86: Don't panic if can not alloc buffer for swiotlb mm: Add alloc_bootmem_low_pages_nopanic() x86, 64bit, mm: hibernate use generic mapping_init x86, 64bit, mm: Mark data/bss/brk to nx x86: Merge early kernel reserve for 32bit and 64bit x86: Add Crash kernel low reservation x86, kdump: Remove crashkernel range find limit for 64bit memblock: Add memblock_mem_size() x86, boot: Not need to check setup_header version for setup_data ...
2013-02-20 | sparc64: Fix tsb_grow() in atomic context. | David S. Miller
If our first THP installation for an MM is via the set_pmd_at() done during khugepaged's collapsing we'll end up in tsb_grow() trying to do a GFP_KERNEL allocation with several locks held. Simply using GFP_ATOMIC in this situation is not the best option because we really can't have this fail, so we'd really like to keep this an order 0 GFP_KERNEL allocation if possible. Also, doing the TSB allocation from khugepaged is a really bad idea because we'll allocate it potentially from the wrong NUMA node in that context. So what we do is defer the hugepage TSB allocation until the first TLB miss we take on a hugepage. This is slightly tricky because we have to handle two unusual cases: 1) Taking the first hugepage TLB miss in the window trap handler. We'll call the winfix_trampoline when that is detected. 2) An initial TSB allocation via TLB miss races with a hugetlb fault on another cpu running the same MM. We handle this by unconditionally loading the TSB we see into the current cpu even if it's non-NULL at hugetlb_setup time. Reported-by: Meelis Roos <mroos@ut.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-20 | sparc64: Handle hugepage TSB being NULL. | David S. Miller
Accommodate the possibility that the TSB might be NULL at the point that update_mmu_cache() is invoked. This is necessary because we will sometimes need to defer the TSB allocation to the first fault that happens in the 'mm'. Separate out the hugepage PTE test into a separate function so that the logic is clearer. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 | Merge remote-tracking branch 'origin/x86/boot' into x86/mm2 | H. Peter Anvin
Coming patches to x86/mm2 require the changes and advanced baseline in x86/boot. Resolved Conflicts: arch/x86/kernel/setup.c mm/nobootmem.c Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-03 | SPARC: drivers: remove __dev* attributes. | Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17 | sparc, mm: Remove calling of free_all_bootmem_node() | Yinghai Lu
Now NO_BOOTMEM version free_all_bootmem_node() does not really do free_bootmem at all, and it only call register_page_bootmem_info_node instead. That is confusing, try to kill that free_all_bootmem_node(). Before that, this patch will remove calling of free_all_bootmem_node() We add register_page_bootmem_info() to call register_page_bootmem_info_node directly. Also could use free_all_bootmem() for numa case, and it is just the same as free_low_memory_core_early(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1353123563-3103-45-git-send-email-yinghai@kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: sparclinux@vger.kernel.org Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-10-14 | sparc64: clear syscall_noerror on the entry to syscall, not on the exit | Al Viro
Move that sucker to just before TI_FPDEPTH and replace stb with sth in etrap_save(). Take current_ds to its old place, so that we don't push wsaved into TI_... flags. That allows to lose clearing syscall_noerror on return from syscall. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-09 | sparc64: Support transparent huge pages. | David Miller
This is relatively easy since PMDs now cover exactly 4MB of memory. Our PMD entries are 32 bits each, so we use a special encoding. The lowest bit, PMD_ISHUGE, determines the interpretation. This is possible because sparc64's page tables are purely software entities so we can use whatever encoding scheme we want. We just have to make the TLB miss assembler page table walkers aware of the layout. set_pmd_at() works much like set_pte_at() but it has to operate in two regimes. In the first regime we are going to a huge page from a table of non-huge PTEs, so we have to queue up TLB flushes based upon what mappings are valid in the PTE table. In the second regime we are going from huge-page to non-huge-page, and in that case we need only queue up a single TLB flush to push out the huge page mapping. We still have 5 bits remaining in the huge PMD encoding so we can very likely support any new pieces of THP state tracking that might get added in the future. With lots of help from Johannes Weiner. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 | sparc64: Eliminate PTE table memory wastage. | David Miller
We've split up the PTE tables so that they take up half a page instead of a full page. This is in order to facilitate transparent huge page support, which works much better if our PMDs cover 4MB instead of 8MB. What we do is have a one-behind cache for PTE table allocations in the mm struct. This logic triggers only on allocations. For example, we don't try to keep track of free'd up page table blocks in the style that the s390 port does. There were only two slightly annoying aspects to this change: 1) Changing pgtable_t to be a "pte_t *". There's all of this special logic in the TLB free paths that needed adjustments, as did the PMD populate interfaces. 2) init_new_context() needs to zap the pointer, since the mm struct just gets copied from the parent on fork. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 | sparc64: Only support 4MB huge pages and 8KB base pages. | David Miller
Narrowing the scope of the page size configurations will make the transparent hugepage changes much simpler. In the end what we really want to do is have the kernel support multiple huge page sizes and use whatever is appropriate as the context dictates. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-02 | sparc: fix format string argument for prom_printf() | Akinobu Mita
prom_printf() takes printf style arguments. Specifying GCC's format attribute reveals that there are several wrong usages of prom_printf(). This fixes those wrong format strings and arguments, and also leaves format attributes in order to detect similar mistakes at compile time. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-06 | sparc64: Use cpu_pgsz_mask for linear kernel mapping config. | David S. Miller
This required a little bit of reordering of how we set up the memory management early on. We now only know the final values of kern_linear_pte_xor[] after we take over the trap table and start processing TLB misses ourselves. So once we fill those values in we re-clear the kernel's 4M TSB and flush the TLBs. That way if we find we support larger than 4M pages we won't have any stale smaller page size entries in the TSB. SUN4U Panther support for larger page sizes should now be extremely trivial but I have no hardware on which to test it and I believe that some of the sun4u TLB miss assembler needs to be audited first to make sure it really can handle larger than 4M PTEs properly. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-06 | sparc64: Probe cpu page size support more portably. | David S. Miller
On sun4v, interrogate the machine description. This code is extremely defensive in nature, and a lot of the checks can probably be removed. On sun4u things are a lot simpler. There are the page sizes all chips support, and then Panther adds 32MB and 256MB pages. Report the probed value in /proc/cpuinfo Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-06 | sparc64: Support 2GB and 16GB page sizes for kernel linear mappings. | David S. Miller
SPARC-T4 supports 2GB pages. So convert kpte_linear_bitmap into an array of 2-bit values which index into kern_linear_pte_xor. Now kern_linear_pte_xor is used for 4 page size aligned regions, 4MB, 256MB, 2GB, and 16GB respectively. Enabling 2GB pages is currently hardcoded using a check against sun4v_chip_type. In the future this will be done more cleanly by interrogating the machine description which is the correct way to determine this kind of thing. Signed-off-by: David S. Miller <davem@davemloft.net>
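An array of 2-bit values packed into machine words, used as an index into a four-entry table, can be sketched as follows. The helper names `kpte_set()` and `kpte_get()` and the table `region_bytes[]` are hypothetical; in particular the real kern_linear_pte_xor holds PTE xor values, whereas this illustration stores the four region sizes named in the commit message so the lookup is easy to check.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ENTRY   2U
#define ENTRIES_PER_WORD ((sizeof(unsigned long) * CHAR_BIT) / BITS_PER_ENTRY)

/* Illustrative 4-entry table: 4MB, 256MB, 2GB and 16GB regions. */
static const unsigned long long region_bytes[4] = {
	4ULL << 20, 256ULL << 20, 2ULL << 30, 16ULL << 30,
};

/* Store a 2-bit value (0..3) at position idx of the packed array. */
static void kpte_set(unsigned long *map, size_t idx, unsigned int val)
{
	size_t word = idx / ENTRIES_PER_WORD;
	unsigned int shift = (unsigned int)(idx % ENTRIES_PER_WORD) * BITS_PER_ENTRY;

	assert(val < 4);
	map[word] &= ~(3UL << shift);		/* clear the old 2-bit field */
	map[word] |= (unsigned long)val << shift;
}

/* Read back the 2-bit value stored at position idx. */
static unsigned int kpte_get(const unsigned long *map, size_t idx)
{
	size_t word = idx / ENTRIES_PER_WORD;
	unsigned int shift = (unsigned int)(idx % ENTRIES_PER_WORD) * BITS_PER_ENTRY;

	return (unsigned int)((map[word] >> shift) & 3UL);
}
```

Compared with a one-bit-per-region bitmap, the packed form doubles the storage but lets each region select one of four entries instead of two.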
2012-08-15 | sparc64: Be less verbose during vmemmap population. | David S. Miller
On a 2-node machine with 256GB of ram we get 512 lines of console output, which is just too much. This mimics Yinghai Lu's x86 commit c2b91e2eec9678dbda274e906cc32ea8f711da3b (x86_64/mm: check and print vmemmap allocation continuous) except that we aren't ever going to get contiguous block pointers in between calls so just print when the virtual address or node changes. This decreases the output by a factor of 16. Also demote this to KERN_DEBUG. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-09 | sparc: fix build fail in mm/init_64.c when NEED_MULTIPLE_NODES is off | Paul Gortmaker
Commit 625d693e9784f988371e69c2b41a2172c0be6c11 (linux-next) "sparc64: Convert over to NO_BOOTMEM." causes the following compile failure for sparc64 allnoconfig: arch/sparc/mm/init_64.c:822:16: error: unused variable 'paddr' arch/sparc/mm/init_64.c:1759:7: error: unused variable 'node' arch/sparc/mm/init_64.c:809:12: error: 'memblock_nid_range' defined but not used The paddr decl can easily be shuffled within the ifdef. The memblock_nid_range is just a stub function for when NEED_MULTIPLE_NODES is off, but the only caller is within a NEED_MULTIPLE_NODES enabled section, so we can simply delete it. The unused "node" is slightly more interesting. In the case of "# CONFIG_NEED_MULTIPLE_NODES is not set" we no longer get the definition of: #define NODE_DATA(nid) (node_data[nid]) from arch/sparc/include/asm/mmzone.h - but instead we get: #define NODE_DATA(nid) (&contig_page_data) from include/linux/mmzone.h -- and since the arg is ignored, the thing really is unused. Rather than put in a confusing looking __maybe_unused, simply splitting the declaration from the assignment seemed to me to be the least offensive. Cc: Sam Ravnborg <sam@ravnborg.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-27 | sparc64: Do not set max_mapnr. | David S. Miller
There is no need, since nothing relevant to sparc64 makes use of this value. Noticed by Sam Ravnborg. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-26 | sparc64: Use node local allocations for IRQ stacks. | David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-26 | sparc64: Convert over to NO_BOOTMEM. | David S. Miller
With help from Sam Ravnborg. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-28 | Disintegrate asm/system.h for Sparc | David Howells
Disintegrate asm/system.h for Sparc. Signed-off-by: David Howells <dhowells@redhat.com> cc: sparclinux@vger.kernel.org
2011-12-08 | sparc: Use HAVE_MEMBLOCK_NODE_MAP | Tejun Heo
sparc doesn't access early_node_map[] directly and enabling HAVE_MEMBLOCK_NODE_MAP is trivial - replacing add_active_range() calls with memblock_set_node() and selecting HAVE_MEMBLOCK_NODE_MAP is enough. -v2: Use select in Kconfig instead as suggested by Sam Ravnborg. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: sparclinux@vger.kernel.org
2011-12-08 | memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users | Tejun Heo
The only function of memblock_analyze() is now allowing resize of memblock region arrays. Rename it to memblock_allow_resize() and update its users. * The following users remain the same other than renaming. arm/mm/init.c::arm_memblock_init() microblaze/kernel/prom.c::early_init_devtree() powerpc/kernel/prom.c::early_init_devtree() openrisc/kernel/prom.c::early_init_devtree() sh/mm/init.c::paging_init() sparc/mm/init_64.c::paging_init() unicore32/mm/init.c::uc32_memblock_init() * In the following users, analyze was used to update total size which is no longer necessary. powerpc/kernel/machine_kexec.c::reserve_crashkernel() powerpc/kernel/prom.c::early_init_devtree() powerpc/mm/init_32.c::MMU_init() powerpc/mm/tlb_nohash.c::__early_init_mmu() powerpc/platforms/ps3/mm.c::ps3_mm_add_memory() powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups() sh/kernel/machine_kexec.c::reserve_crashkernel() * x86/kernel/e820.c::memblock_x86_fill() was directly setting memblock_can_resize before populating memblock and calling analyze afterwards. Call memblock_allow_resize() before start populating. memblock_can_resize is now static inside memblock.c. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-12-08 | memblock: Kill memblock_init() | Tejun Heo
memblock_init() initializes arrays for regions and memblock itself; however, all these can be done with struct initializers and memblock_init() can be removed. This patch kills memblock_init() and initializes memblock with struct initializer. The only difference is that the first dummy entries don't have .nid set to MAX_NUMNODES initially. This doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-11-28 | Merge branch 'master' into x86/memblock | Tejun Heo
Conflicts & resolutions:
* arch/x86/xen/setup.c dc91c728fd "xen: allow extra memory to be in multiple regions" 24aa07882b "memblock, x86: Replace memblock_x86_reserve/free..." conflicted on xen_add_extra_mem() updates. The resolution is trivial as the latter just wants to replace memblock_x86_reserve_range() with memblock_reserve().
* drivers/pci/intel-iommu.c 166e9278a3f "x86/ia64: intel-iommu: move to drivers/iommu/" 5dfe8660a3d "bootmem: Replace work_with_active_regions() with..." conflicted as the former moved the file under drivers/iommu/. Resolved by applying the changes from the latter on the moved file.
* mm/Kconfig 6661672053a "memblock: add NO_BOOTMEM config symbol" c378ddd53f9 "memblock, x86: Make ARCH_DISCARD_MEMBLOCK a config option" conflicted trivially. Both added config options. Just letting both add their own options resolves the conflict.
* mm/memblock.c d1f0ece6cdc "mm/memblock.c: small function definition fixes" ed7b56a799c "memblock: Remove memblock_memory_can_coalesce()" conflicted. The former updates a function removed by the latter. Resolution is trivial.
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-09-29 | sparc64: Force the execute bit in OpenFirmware's translation entries. | David S. Miller
In the OF 'translations' property, the template TTEs in the mappings never specify the executable bit. This is the case even though some of these mappings are for OF's code segment. Therefore, we need to force the execute bit on in every mapping. This problem can only really trigger on Niagara/sun4v machines and the history behind this is a little complicated. Previous to sun4v, the sun4u TTE entries lacked a hardware execute permission bit. So OF didn't have to ever worry about setting anything to handle executable pages. Any valid TTE loaded into the I-TLB would be respected by the chip. But sun4v Niagara chips have a real hardware enforced executable bit in their TTEs. So it has to be set or else the I-TLB throws an instruction access exception with type code 6 (protection violation). We've been extremely fortunate to not get bitten by this in the past. The best I can tell is that the OF's mappings for its executable code were mapped using permanent locked mappings on sun4v in the past. Therefore, the fact that we didn't have the exec bit set in the OF translations we would use did not matter in practice. Thanks to Greg Onufer for helping me track this down. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-06 | sparc: Fix build with DEBUG_PAGEALLOC enabled. | David S. Miller
arch/sparc/mm/init_64.c:1622:22: error: unused variable '__swapper_4m_tsb_phys_patch_end' [-Werror=unused-variable] arch/sparc/mm/init_64.c:1621:22: error: unused variable '__swapper_4m_tsb_phys_patch' [-Werror=unused-variable] Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-05 | sparc: Access kernel TSB using physical addressing when possible. | David S. Miller
On sun4v this is basically required since we point the hypervisor and the TSB walking hardware at these tables using physical addressing too. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 | memblock: Don't allow archs to override memblock_nid_range() | Tejun Heo
memblock_nid_range() is used to implement memblock_[try_]alloc_nid(). The generic version determines the range by walking early_node_map with for_each_mem_pfn_range(). The generic version is defined __weak to allow arch override. Currently, only sparc overrides it; however, with the previous update to the generic implementation, there isn't much to be gained with arch override. Sparc would behave exactly the same with the generic implementation. This patch disallows arch override for memblock_nid_range() and make both generic and sparc versions static. sparc is only compile tested. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310460395-30913-6-git-send-email-tj@kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-07 | sparc: Remove unnecessary semicolons | Joe Perches
Semicolons are not necessary after switch/while/for/if braces so remove them. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 | sparc: convert old cpumask API into new one | KOSAKI Motohiro
Adapt to the new API. Almost all of the changes are trivial; the most important change is to remove assignments via the = operator like the following: cpumask_t cpu_mask = *mm_cpumask(mm); cpus_allowed = current->cpus_allowed; This is because cpumask_var_t is not safe to copy with the = operator. Such usage might prevent core kernel improvements. No functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25 | Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6 | Linus Torvalds
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: mtd/m25p80: add support to parse the partitions by OF node of/irq: of_irq.c needs to include linux/irq.h of/mips: Cleanup some include directives/files. of/mips: Add device tree support to MIPS of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch of/device: Rework to use common platform_device_alloc() for allocating devices of/xsysace: Fix OF probing on little-endian systems of: use __be32 types for big-endian device tree data of/irq: remove references to NO_IRQ in drivers/of/platform.c of/promtree: add package-to-path support to pdt of/promtree: add of_pdt namespace to pdt code of/promtree: no longer call prom_ functions directly; use an ops structure of/promtree: make drivers/of/pdt.c no longer sparc-only sparc: break out some PROM device-tree building code out into drivers/of of/sparc: convert various prom_* functions to use phandle sparc: stop exporting openprom.h header powerpc, of_serial: Endianness issues setting up the serial ports of: MTD: Fix OF probing on little-endian systems of: GPIO: Fix OF probing on little-endian systems
2010-10-12 | memblock, bootmem: Round pfn properly for memory and reserved regions | Yinghai Lu
We need to round memory regions correctly -- specifically, we need to round reserved region in the more expansive direction (lower limit down, upper limit up) whereas usable memory regions need to be rounded in the more restrictive direction (lower limit up, upper limit down). This introduces two set of inlines: memblock_region_memory_base_pfn() memblock_region_memory_end_pfn() memblock_region_reserved_base_pfn() memblock_region_reserved_end_pfn() Although they are antisymmetric (and therefore are technically duplicates) the use of the different inlines explicitly documents the programmer's intention. The lack of proper rounding caused a bug on ARM, which was then found to also affect other architectures. Reported-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CB4CDFD.4020105@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
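The two rounding directions can be illustrated with a small stand-alone sketch. The helper names below are hypothetical simplifications that take a base address and size directly, and `SKETCH_PAGE_SHIFT` is an arbitrary choice for the example; the inlines named in the commit message operate on memblock regions instead.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 13	/* 8KB pages, chosen only for this example */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Round an address up or down to a page frame number. */
static unsigned long pfn_up(unsigned long addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

static unsigned long pfn_down(unsigned long addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Usable memory shrinks inward: only fully covered pages count. */
static unsigned long memory_base_pfn(unsigned long base)
{
	return pfn_up(base);
}

static unsigned long memory_end_pfn(unsigned long base, unsigned long size)
{
	return pfn_down(base + size);
}

/* Reserved regions grow outward: any partially covered page is reserved. */
static unsigned long reserved_base_pfn(unsigned long base)
{
	return pfn_down(base);
}

static unsigned long reserved_end_pfn(unsigned long base, unsigned long size)
{
	return pfn_up(base + size);
}
```

For an unaligned region the usable range is strictly inside the reserved range; for a page-aligned region both pairs give the same answer.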
2010-10-09 | of/sparc: convert various prom_* functions to use phandle | Andres Salomon
Rather than passing around ints everywhere, use the phandle type where appropriate for the various functions that talk to the PROM. Signed-off-by: Andres Salomon <dilinger@queued.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-05 | memblock: Separate memblock_alloc_nid() and memblock_alloc_try_nid() | Benjamin Herrenschmidt
The former is now strict: it will fail if it cannot honor the allocation within the node, while the latter implements the previous semantics, which fall back to allocating anywhere. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
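The difference between a strict per-node allocation and one that falls back to other nodes can be sketched with a toy allocator. Everything here is hypothetical (`struct sketch_node`, `sketch_alloc_nid()`, `sketch_alloc_try_nid()`, the two-node limit); it only models the control flow, not memblock itself.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_NODES     2
#define SKETCH_ALLOC_FAILED (-1)

/* Toy model of a NUMA node: just a count of free bytes. */
struct sketch_node {
	unsigned long free_bytes;
};

/* Strict: succeed only on the requested node; returns the node used. */
static int sketch_alloc_nid(struct sketch_node *nodes, int nid,
			    unsigned long size)
{
	assert(nid >= 0 && nid < SKETCH_NR_NODES);
	if (nodes[nid].free_bytes < size)
		return SKETCH_ALLOC_FAILED;
	nodes[nid].free_bytes -= size;
	return nid;
}

/* Fallback: try the requested node first, then any other node. */
static int sketch_alloc_try_nid(struct sketch_node *nodes, int nid,
				unsigned long size)
{
	int got = sketch_alloc_nid(nodes, nid, size);

	if (got != SKETCH_ALLOC_FAILED)
		return got;
	for (int i = 0; i < SKETCH_NR_NODES; i++) {
		got = sketch_alloc_nid(nodes, i, size);
		if (got != SKETCH_ALLOC_FAILED)
			return got;
	}
	return SKETCH_ALLOC_FAILED;
}
```

A caller that must have node-local memory uses the strict variant and handles failure itself; a caller that merely prefers a node uses the fallback variant.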
2010-08-05memblock: Remove nid_range argument, arch provides memblock_nid_range() insteadBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-04memblock/sparc: Use new accessorsBenjamin Herrenschmidt
CC: David S. Miller <davem@davemloft.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-04memblock: Rename memblock_region to memblock_type and memblock_property to memblock_regionBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14lmb: rename to memblockYinghai Lu
Done via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/lmb/memblock/g' \
    -e 's/LMB/MEMBLOCK/g' \
    $FILES

  for N in $(find . -name lmb.[ch]); do
    M=$(echo $N | sed 's/lmb/memblock/g')
    mv $N $M
  done

Then revert the unintended replacements, such as lmbench and dlmb. Also move memblock.c from lib/ to mm/.

Suggested-by: Ingo Molnar <mingo@elte.hu> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-05Merge branch 'master' into export-slabhTejun Heo
2010-04-03sparc64: Fix array size reported by vmemmap_populate()Ben Hutchings
vmemmap_populate() attempts to report the used index and total size of vmemmap_table, but it wrongly shifts the total size so that it is always shown as 0. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
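The class of bug described here, where a shift is applied to a value that is already in the right unit, can be illustrated as follows. The table size and shift amount are made-up values for the sketch; the actual expression in vmemmap_populate() is not reproduced here.

```c
#include <assert.h>

/* Hypothetical values: the table's entry count and a chunk-size shift. */
#define SKETCH_TABLE_ENTRIES 1024UL
#define SKETCH_CHUNK_SHIFT   22

/*
 * Returns the "total size" that a diagnostic line would report next to
 * the used index. With shift_total set, the entry count is shifted as if
 * it were a byte count, which truncates it to 0.
 */
static unsigned long sketch_reported_total(unsigned long used_index,
                                           int shift_total)
{
    assert(used_index <= SKETCH_TABLE_ENTRIES);
    if (shift_total)
        return SKETCH_TABLE_ENTRIES >> SKETCH_CHUNK_SHIFT; /* buggy form */
    return SKETCH_TABLE_ENTRIES;                           /* fixed form */
}
```

Any entry count smaller than `1UL << SKETCH_CHUNK_SHIFT` shifts down to 0, which matches the symptom of the total always being shown as 0.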
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.hTejun Heo
percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h, making everything defined by the two files universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script is used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following:

* Scan files for gfp and slab usages and update includes such that only the necessary includes are there, i.e. if only gfp is used, gfp.h; if slab is used, slab.h.

* When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surroundings. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files.

2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files.

3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-02-20MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itselfRussell King
On VIVT ARM, when we have multiple shared mappings of the same file in the same MM, we need to ensure that we have coherency across all copies. We do this via make_coherent() by making the pages uncacheable.

This used to work fine, until we allowed highmem with highpte - we now have a page table which is mapped as required, and is not available for modification via update_mmu_cache().

Ralf Baechle suggested getting rid of the PTE value passed to update_mmu_cache():

  On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables to construct a pointer to the pte again. Passing a pte_t * is much more elegant. Maybe we might even replace the pte argument with the pte_t?

Ben Herrenschmidt would also like the pte pointer for PowerPC:

  Passing the ptep in there is exactly what I want. I want that -instead- of the PTE value, because I have issue on some ppc cases, for I$/D$ coherency, where set_pte_at() may decide to mask out the _PAGE_EXEC.

So, pass in the mapped page table pointer into update_mmu_cache(), and remove the PTE value, updating all implementations and call sites to suit.

Includes a fix from Stephen Rothwell:

  sparc: fix fallout from update_mmu_cache API change
  Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
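The PowerPC concern quoted above, that the value the caller holds may differ from what set_pte_at() actually stored, can be sketched with mock types. The `sketch_` names and the bit value are assumptions for the example; the real kernel types and hooks are not reproduced.

```c
#include <assert.h>

/* Hypothetical "executable" permission bit. */
#define SKETCH_PAGE_EXEC 0x4UL

/* Mock PTE type standing in for the kernel's pte_t. */
typedef struct { unsigned long val; } sketch_pte_t;

/*
 * Mock of a set_pte_at() that decides to mask out the exec bit before
 * storing the entry, as described for some ppc cases.
 */
static void sketch_set_pte_at(sketch_pte_t *ptep, sketch_pte_t entry)
{
    assert(ptep);
    entry.val &= ~SKETCH_PAGE_EXEC;
    *ptep = entry;
}

/* Old-style hook: receives the caller's copy of the PTE value. */
static unsigned long sketch_hook_by_value(sketch_pte_t pte)
{
    return pte.val;
}

/* New-style hook: receives a pointer and reads what was really stored. */
static unsigned long sketch_hook_by_pointer(const sketch_pte_t *ptep)
{
    assert(ptep);
    return ptep->val;
}
```

After `sketch_set_pte_at()` stores an entry, the by-value hook still sees the exec bit from the caller's copy, while the by-pointer hook sees the stored entry with the bit cleared.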