linux-toradex.git/arch, branch v3.2.45

x86/mm: account for PGDIR_SIZE alignment

2013-05-13T14:02:44+00:00

Patch for 3.0-stable.  Function find_early_table_space removed upstream.

Fixes panic in alloc_low_page due to pgt_buf overflow during
init_memory_mapping.

find_early_table_space sizes pgt_buf based upon the size of the
memory being mapped, but it does not take into account the alignment
of the memory.  When the region being mapped spans a 512GB (PGDIR_SIZE)
alignment, a panic from alloc_low_pages occurs.

kernel_physical_mapping_init takes into account PGDIR_SIZE alignment.
This causes an extra call to alloc_low_page to be made.  This extra call
isn't accounted for by find_early_table_space and causes a kernel panic.

Change is to take into account PGDIR_SIZE alignment in find_early_table_space.

Signed-off-by: Jerry Hoemann 
Signed-off-by: Ben Hutchings

powerpc: fix numa distance for form0 device tree

2013-05-13T14:02:44+00:00

commit 7122beeee7bc1757682049780179d7c216dd1c83 upstream.

The following commit breaks numa distance setup for old powerpc
systems that use form0 encoding in device tree.

commit 41eab6f88f24124df89e38067b3766b7bef06ddb
powerpc/numa: Use form 1 affinity to setup node distance

Device tree node /rtas/ibm,associativity-reference-points would
index into /cpus/PowerPCxxxx/ibm,associativity based on form0 or
form1 encoding detected by ibm,architecture-vec-5 property.

All modern systems use form1 and current kernel code is correct.
However, on older systems with form0 encoding, the numa distance
will get hard coded as LOCAL_DISTANCE for all nodes.  This causes
task scheduling anomaly since scheduler will skip building numa
level domain (topmost domain with all cpus) if all numa distances
are same.  (value of 'level' in sched_init_numa() will remain 0)

Prior to the above commit:
((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)

Restoring compatible behavior with this patch for old powerpc systems
with device tree where numa distance are encoded as form0.

Signed-off-by: Vaidyanathan Srinivasan 
Signed-off-by: Michael Ellerman 
Signed-off-by: Ben Hutchings

sparc64: Fix race in TLB batch processing.

2013-05-13T14:02:42+00:00

[ Commits f36391d2790d04993f48da6a45810033a2cdf847 and
  f0af97070acbad5d6a361f485828223a4faaa0ee upstream. ]

As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.

So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.

Fix this by using generic infrastructure to synchonize on the
completion of the cross call.

This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().

We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls.  If we're not in such a
region, we flush TLBs synchronously.

1) Get rid of xcall_flush_tlb_pending and per-cpu type
   implementations.

2) Do TLB batch cross calls instead via:

	smp_call_function_many()
		tlb_pending_func()
			__flush_tlb_pending()

3) Batch only in lazy mmu sequences:

	a) Add 'active' member to struct tlb_batch
	b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
	c) Set 'active' in arch_enter_lazy_mmu_mode()
	d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
	e) Check 'active' in tlb_batch_add_one() and do a synchronous
           flush if it's clear.

4) Add infrastructure for synchronous TLB page flushes.

	a) Implement __flush_tlb_page and per-cpu variants, patch
	   as needed.
	b) Likewise for xcall_flush_tlb_page.
	c) Implement smp_flush_tlb_page() to invoke the cross-call.
	d) Wire up global_flush_tlb_page() to the right routine based
           upon CONFIG_SMP

5) It turns out that singleton batches are very common, 2 out of every
   3 batch flushes have only a single entry in them.

   The batch flush waiting is very expensive, both because of the poll
   on sibling cpu completeion, as well as because passing the tlb batch
   pointer to the sibling cpus invokes a shared memory dereference.

   Therefore, in flush_tlb_pending(), if there is only one entry in
   the batch perform a completely asynchronous global_flush_tlb_page()
   instead.

Reported-by: Dave Kleikamp 
Signed-off-by: David S. Miller 
Acked-by: Dave Kleikamp 
Signed-off-by: Ben Hutchings

s390: move dummy io_remap_pfn_range() to asm/pgtable.h

2013-05-13T14:02:33+00:00

commit 4f2e29031e6c67802e7370292dd050fd62f337ee upstream.

Commit b4cbb197c7e7 ("vm: add vm_iomap_memory() helper function") added
a helper function wrapper around io_remap_pfn_range(), and every other
architecture defined it in .

The s390 choice of  may make sense, but is not very convenient
for this case, and gratuitous differences like that cause unexpected errors like this:

   mm/memory.c: In function 'vm_iomap_memory':
   mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]

Glory be the kbuild test robot who noticed this, bisected it, and
reported it to the guilty parties (ie me).

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Signed-off-by: Linus Torvalds 
[bwh: Backported to 3.2: the macro was not defined, so this is an addition
 and not a move]
Signed-off-by: Ben Hutchings

perf/x86: Fix offcore_rsp valid mask for SNB/IVB

2013-05-13T14:02:33+00:00

commit f1923820c447e986a9da0fc6bf60c1dccdf0408e upstream.

The valid mask for both offcore_response_0 and
offcore_response_1 was wrong for SNB/SNB-EP,
IVB/IVB-EP. It was possible to write to
reserved bit and cause a GP fault crashing
the kernel.

This patch fixes the problem by correctly marking the
reserved bits in the valid mask for all the processors
mentioned above.

A distinction between desktop and server parts is introduced
because bits 24-30 are only available on the server parts.

This version of the  patch is just a rebase to perf/urgent tree
and should apply to older kernels as well.

Signed-off-by: Stephane Eranian 
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: security@kernel.org
Cc: ak@linux.intel.com
Signed-off-by: Ingo Molnar 
[bwh: Backported to 3.2: adjust context; drop the IVB case]
Signed-off-by: Ben Hutchings

ARM: u300: fix ages old copy/paste bug

2013-05-13T14:02:28+00:00

commit 0259d9eb30d003af305626db2d8332805696e60d upstream.

The UART1 is on the fast AHB bridge, not on the slow bus.

Acked-by: Arnd Bergmann 
Signed-off-by: Linus Walleij 
Signed-off-by: Olof Johansson 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

powerpc: Add isync to copy_and_flush

2013-05-13T14:02:27+00:00

commit 29ce3c5073057991217916abc25628e906911757 upstream.

In __after_prom_start we copy the kernel down to zero in two calls to
copy_and_flush.  After the first call (copy from 0 to copy_to_here:)
we jump to the newly copied code soon after.

Unfortunately there's no isync between the copy of this code and the
jump to it.  Hence it's possible that stale instructions could still be
in the icache or pipeline before we branch to it.

We've seen this on real machines and it's results in no console output
after:
  calling quiesce...
  returning from prom_init

The below adds an isync to ensure that the copy and flushing has
completed before any branching to the new instructions occurs.

Signed-off-by: Michael Neuling 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Ben Hutchings

powerpc/spufs: Initialise inode->i_ino in spufs_new_inode()

2013-05-13T14:02:25+00:00

commit 6747e83235caecd30b186d1282e4eba7679f81b7 upstream.

In commit 85fe402 (fs: do not assign default i_ino in new_inode), the
initialisation of i_ino was removed from new_inode() and pushed down
into the callers. However spufs_new_inode() was not updated.

This exhibits as no files appearing in /spu, because all our dirents
have a zero inode, which readdir() seems to dislike.

Signed-off-by: Michael Ellerman 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Ben Hutchings

xen/time: Fix kasprintf splat when allocating timer%d IRQ line.

2013-05-13T14:02:20+00:00

commit 7918c92ae9638eb8a6ec18e2b4a0de84557cccc8 upstream.

When we online the CPU, we get this splat:

smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
 [] __might_sleep+0xda/0x100
 [] __kmalloc_track_caller+0x1e7/0x2c0
 [] ? kasprintf+0x38/0x40
 [] kvasprintf+0x5b/0x90
 [] kasprintf+0x38/0x40
 [] xen_setup_timer+0x30/0xb0
 [] xen_hvm_setup_cpu_clockevents+0x1f/0x30
 [] start_secondary+0x19c/0x1a8

The solution to that is use kasprintf in the CPU hotplug path
that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
and remove the call to in xen_hvm_setup_cpu_clockevents.

Unfortunatly the later is not a good idea as the bootup path
does not use xen_hvm_cpu_notify so we would end up never allocating
timer%d interrupt lines when booting. As such add the check for
atomic() to continue.

Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Ben Hutchings

xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline

2013-05-13T14:02:20+00:00

commit 66ff0fe9e7bda8aec99985b24daad03652f7304e upstream.

While we don't use the spinlock interrupt line (see for details
commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 -
xen: disable PV spinlocks on HVM) - we should still do the proper
init / deinit sequence. We did not do that correctly and for the
CPU init for PVHVM guest we would allocate an interrupt line - but
failed to deallocate the old interrupt line.

This resulted in leakage of an irq_desc but more importantly this splat
as we online an offlined CPU:

genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
 [] __setup_irq+0x23e/0x4a0
 [] ? kmem_cache_alloc_trace+0x221/0x250
 [] request_threaded_irq+0xfb/0x160
 [] ? xen_spin_trylock+0x20/0x20
 [] bind_ipi_to_irqhandler+0xa3/0x160
 [] ? kasprintf+0x38/0x40
 [] ? xen_spin_trylock+0x20/0x20
 [] ? update_max_interval+0x15/0x40
 [] xen_init_lock_cpu+0x3c/0x78
 [] xen_hvm_cpu_notify+0x29/0x33
 [] notifier_call_chain+0x4d/0x70
 [] __raw_notifier_call_chain+0x9/0x10
 [] __cpu_notify+0x1b/0x30
 [] _cpu_up+0xa0/0x14b
 [] cpu_up+0xd9/0xec
 [] store_online+0x94/0xd0
 [] dev_attr_store+0x1b/0x20
 [] sysfs_write_file+0xf4/0x170
 [] vfs_write+0xb4/0x130
 [] sys_write+0x5a/0xa0
 [] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2

And if one looks at the /proc/interrupts right after
offlining (CPU1):

  70:          0          0  xen-percpu-ipi       spinlock0
  71:          0          0  xen-percpu-ipi       spinlock1
  77:          0          0  xen-percpu-ipi       spinlock2

There is the oddity of the 'spinlock1' still being present.

Signed-off-by: Konrad Rzeszutek Wilk 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings