linux-toradex.git/arch, branch v3.2.9

x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors

2012-03-01T00:31:14+00:00

commit 32c3233885eb10ac9cb9410f2f8cd64b8df2b2a1 upstream.

For L1 instruction cache and L2 cache the shared CPU information
is wrong. On current AMD family 15h CPUs those caches are shared
between both cores of a compute unit.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=42607

Signed-off-by: Andreas Herrmann 
Cc: Petkov Borislav 
Cc: Dave Jones 
Link: http://lkml.kernel.org/r/20120208195229.GA17523@alberich.amd.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman

ARM: omap: fix oops in arch/arm/mach-omap2/vp.c when pmic is not found

2012-03-01T00:31:13+00:00

commit d980e0f8d858c6963d676013e976ff00ab7acb2b upstream.

When the PMIC is not found, voltdm->pmic will be NULL.  vp.c's
initialization function tries to dereferences this, which causes an
oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (3.3.0-rc2+ #204)
PC is at omap_vp_init+0x5c/0x15c
LR is at omap_vp_init+0x58/0x15c
pc : []    lr : []    psr: 60000013
sp : c181ff30  ip : c181ff68  fp : c181ff64
r10: c0407808  r9 : c040786c  r8 : c0407814
r7 : c0026868  r6 : c00264fc  r5 : c040ad6c  r4 : 00000000
r3 : 00000040  r2 : 000032c8  r1 : 0000fa00  r0 : 000032c8
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 80004019  DAC: 00000015
Process swapper (pid: 1, stack limit = 0xc181e2e8)
Stack: (0xc181ff30 to 0xc1820000)
ff20:                                     c0381d00 c02e9c6d c0383582 c040786c
ff40: c040ad6c c00264fc c0026868 c0407814 00000000 c03d9de4 c181ff8c c181ff68
ff60: c03db448 c03db830 c02e982c c03fdfb8 c03fe004 c0039988 00000013 00000000
ff80: c181ff9c c181ff90 c03d9df8 c03db390 c181ffdc c181ffa0 c0008798 c03d9df0
ffa0: c181ffc4 c181ffb0 c0055a44 c0187050 c0039988 c03fdfb8 c03fe004 c0039988
ffc0: 00000013 00000000 00000000 00000000 c181fff4 c181ffe0 c03d1284 c0008708
ffe0: 00000000 c03d1208 00000000 c181fff8 c0039988 c03d1214 1077ce40 01f7ee08
Backtrace:
[] (omap_vp_init+0x0/0x15c) from [] (omap_voltage_late_init+0xc4/0xfc)
[] (omap_voltage_late_init+0x0/0xfc) from [] (omap2_common_pm_late_init+0x14/0x54)
 r8:00000000 r7:00000013 r6:c0039988 r5:c03fe004 r4:c03fdfb8
[] (omap2_common_pm_late_init+0x0/0x54) from [] (do_one_initcall+0x9c/0x164)
[] (do_one_initcall+0x0/0x164) from [] (kernel_init+0x7c/0x120)
[] (kernel_init+0x0/0x120) from [] (do_exit+0x0/0x2cc)
 r5:c03d1208 r4:00000000
Code: e5ca300b e5900034 ebf69027 e5994024 (e5941000)
---[ end trace aed617dddaf32c3d ]---
Kernel panic - not syncing: Attempted to kill init!

Signed-off-by: Russell King 
Cc: Igor Grinberg 
Signed-off-by: Greg Kroah-Hartman

ARM: 7325/1: fix v7 boot with lockdep enabled

2012-03-01T00:30:57+00:00

commit 8e43a905dd574f54c5715d978318290ceafbe275 upstream.

Bootup with lockdep enabled has been broken on v7 since b46c0f74657d
("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").

This is because v7_setup (which is called very early during boot) calls
v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
ends up attempting to call into lockdep C code (trace_hardirqs_off())
when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs.  The code
already uses the notrace variant of restore_irqs.

Reviewed-by: Nicolas Pitre 
Acked-by: Stephen Boyd 
Cc: Catalin Marinas 
Signed-off-by: Rabin Vincent 
Signed-off-by: Russell King 
Signed-off-by: Greg Kroah-Hartman

ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR

2012-03-01T00:30:56+00:00

commit b46c0f74657d1fe1c1b0c1452631cc38a9e6987f upstream.

armv7's flush_cache_all() flushes caches via set/way. To
determine the cache attributes (line size, number of sets,
etc.) the assembly first writes the CSSELR register to select a
cache level and then reads the CCSIDR register. The CSSELR register
is banked per-cpu and is used to determine which cache level CCSIDR
reads. If the task is migrated between when the CSSELR is written and
the CCSIDR is read the CCSIDR value may be for an unexpected cache
level (for example L1 instead of L2) and incorrect cache flushing
could occur.

Disable interrupts across the write and read so that the correct
cache attributes are read and used for the cache flushing
routine. We disable interrupts instead of disabling preemption
because the critical section is only 3 instructions and we want
to call v7_dcache_flush_all from __v7_setup which doesn't have a
full kernel stack with a struct thread_info.

This fixes a problem we see in scm_call() when flush_cache_all()
is called from preemptible context and sometimes the L2 cache is
not properly flushed out.

Signed-off-by: Stephen Boyd 
Acked-by: Catalin Marinas 
Reviewed-by: Nicolas Pitre 
Signed-off-by: Russell King 
Signed-off-by: Greg Kroah-Hartman

ARM: 7326/2: PL330: fix null pointer dereference in pl330_chan_ctrl()

2012-03-01T00:30:53+00:00

commit 46e33c606af8e0caeeca374103189663d877c0d6 upstream.

This fixes the thrd->req_running field being accessed before thrd
is checked for null. The error was introduced in

   abb959f: ARM: 7237/1: PL330: Fix driver freeze

Reference: <1326458191-23492-1-git-send-email-mans.rullgard@linaro.org>

Signed-off-by: Mans Rullgard 
Acked-by: Javi Merino 
Signed-off-by: Russell King 
Signed-off-by: Greg Kroah-Hartman

S390: correct ktime to tod clock comparator conversion

2012-03-01T00:30:53+00:00

commit cf1eb40f8f5ea12c9e569e7282161fc7f194fd62 upstream.

The conversion of the ktime to a value suitable for the clock comparator
does not take changes to wall_to_monotonic into account. In fact the
conversion just needs the boot clock (sched_clock_base_cc) and the
total_sleep_time.

This is applicable to 3.2+ kernels.

Signed-off-by: Martin Schwidefsky 
Signed-off-by: Greg Kroah-Hartman

ARM: at91: USB AT91 gadget registration for module

2012-03-01T00:30:49+00:00

commit e8c9dc93e27d891636defbc269f182a83e6abba8 upstream.

Registration of at91_udc as a module will enable SoC
related code.

Fix following an idea from Karel Znamenacek.

Signed-off-by: Nicolas Ferre 
Acked-by: Karel Znamenacek 
Acked-by: Jean-Christophe PLAGNIOL-VILLARD 
Signed-off-by: Greg Kroah-Hartman

powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events

2012-03-01T00:30:49+00:00

commit 9a45a9407c69d068500923480884661e2b9cc421 upstream.

perf on POWER stopped working after commit e050e3f0a71b (perf: Fix
broken interrupt rate throttling). That patch exposed a bug in
the POWER perf_events code.

Since the PMCs count upwards and take an exception when the top bit
is set, we want to write 0x80000000 - left in power_pmu_start. We were
instead programming in left which effectively disables the counter
until we eventually hit 0x80000000. This could take seconds or longer.

With the patch applied I get the expected number of samples:

          SAMPLE events:       9948

Signed-off-by: Anton Blanchard 
Acked-by: Paul Mackerras 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Greg Kroah-Hartman

i387: re-introduce FPU state preloading at context switch time

2012-02-27T18:25:55+00:00

commit 34ddc81a230b15c0e345b6b253049db731499f7e upstream.

After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870ef3ff ("i387:
do not preload FPU state at task switch time").

However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably

 - properly abstracted as a true FPU state switch, rather than as
   open-coded save and restore with various hacks.

   In particular, implementing it as a proper FPU state switch allows us
   to optimize the CR0.TS flag accesses: there is no reason to set the
   TS bit only to then almost immediately clear it again.  CR0 accesses
   are quite slow and expensive, don't flip the bit back and forth for
   no good reason.

 - Make sure that the same model works for both x86-32 and x86-64, so
   that there are no gratuitous differences between the two due to the
   way they save and restore segment state differently due to
   architectural differences that really don't matter to the FPU state.

 - Avoid exposing the "preload" state to the context switch routines,
   and in particular allow the concept of lazy state restore: if nothing
   else has used the FPU in the meantime, and the process is still on
   the same CPU, we can avoid restoring state from memory entirely, just
   re-expose the state that is still in the FPU unit.

   That optimized lazy restore isn't actually implemented here, but the
   infrastructure is set up for it.  Of course, older CPU's that use
   'fnsave' to save the state cannot take advantage of this, since the
   state saving also trashes the state.

In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage.  Hopefully it's easier to
follow as a result.

Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

i387: move TS_USEDFPU flag from thread_info to task_struct

2012-02-27T18:25:54+00:00

commit f94edacf998516ac9d849f7bc6949a703977a7f3 upstream.

This moves the bit that indicates whether a thread has ownership of the
FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
(called 'has_fpu') in task_struct->thread.has_fpu.

This fixes two independent bugs at the same time:

 - changing 'thread_info->status' from the scheduler causes nasty
   problems for the other users of that variable, since it is defined to
   be thread-synchronous (that's what the "TS_" part of the naming was
   supposed to indicate).

   So perfectly valid code could (and did) do

	ti->status |= TS_RESTORE_SIGMASK;

   and the compiler was free to do that as separate load, or and store
   instructions.  Which can cause problems with preemption, since a task
   switch could happen in between, and change the TS_USEDFPU bit. The
   change to TS_USEDFPU would be overwritten by the final store.

   In practice, this seldom happened, though, because the 'status' field
   was seldom used more than once, so gcc would generally tend to
   generate code that used a read-modify-write instruction and thus
   happened to avoid this problem - RMW instructions are naturally low
   fat and preemption-safe.

 - On x86-32, the current_thread_info() pointer would, during interrupts
   and softirqs, point to a *copy* of the real thread_info, because
   x86-32 uses %esp to calculate the thread_info address, and thus the
   separate irq (and softirq) stacks would cause these kinds of odd
   thread_info copy aliases.

   This is normally not a problem, since interrupts aren't supposed to
   look at thread information anyway (what thread is running at
   interrupt time really isn't very well-defined), but it confused the
   heck out of irq_fpu_usable() and the code that tried to squirrel
   away the FPU state.

   (It also caused untold confusion for us poor kernel developers).

It also turns out that using 'task_struct' is actually much more natural
for most of the call sites that care about the FPU state, since they
tend to work with the task struct for other reasons anyway (ie
scheduling).  And the FPU data that we are going to save/restore is
found there too.

Thanks to Arjan Van De Ven  for pointing us to
the %esp issue.

Cc: Arjan van de Ven 
Reported-and-tested-by: Raphael Prevost 
Acked-and-tested-by: Suresh Siddha 
Tested-by: Peter Anvin 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman