linux-toradex.git/arch/powerpc, branch v3.12.45

powerpc/perf: Fix book3s kernel to userspace backtraces

2015-07-30T12:10:42+00:00

commit 72e349f1124a114435e599479c9b8d14bfd1ebcd upstream.

When we take a PMU exception or a software event we call
perf_read_regs(). This overloads regs->result with a boolean that
describes if we should use the sampled instruction address register
(SIAR) or the regs.

If the exception is in kernel, we start with the kernel regs and
backtrace through the kernel stack. At this point we switch to the
userspace regs and backtrace the user stack with perf_callchain_user().

Unfortunately these regs have not got the perf_read_regs() treatment,
so regs->result could be anything. If it is non zero,
perf_instruction_pointer() decides to use the SIAR, and we get issues
like this:

0.11%  qemu-system-ppc  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
       |
       ---_raw_spin_lock_irqsave
          |
          |--52.35%-- 0
          |          |
          |          |--46.39%-- __hrtimer_start_range_ns
          |          |          kvmppc_run_core
          |          |          kvmppc_vcpu_run_hv
          |          |          kvmppc_vcpu_run
          |          |          kvm_arch_vcpu_ioctl_run
          |          |          kvm_vcpu_ioctl
          |          |          do_vfs_ioctl
          |          |          sys_ioctl
          |          |          system_call
          |          |          |
          |          |          |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
          |          |          |          |
          |          |          |           --100.00%-- 0x7e714
          |          |          |                     0x7e714

Notice the bogus _raw_spin_lock_irqsave when we transition from kernel
(system_call) to userspace (0x7e714). We inserted what was in the SIAR.

Add a check in regs_use_siar() to check that the regs in question
are from a PMU exception. With this fix the backtrace makes sense:

     0.47%  qemu-system-ppc  [kernel.vmlinux]         [k] _raw_spin_lock_irqsave
            |
            ---_raw_spin_lock_irqsave
               |
               |--53.83%-- 0
               |          |
               |          |--44.73%-- hrtimer_try_to_cancel
               |          |          kvmppc_start_thread
               |          |          kvmppc_run_core
               |          |          kvmppc_vcpu_run_hv
               |          |          kvmppc_vcpu_run
               |          |          kvm_arch_vcpu_ioctl_run
               |          |          kvm_vcpu_ioctl
               |          |          do_vfs_ioctl
               |          |          sys_ioctl
               |          |          system_call
               |          |          __ioctl
               |          |          0x7e714
               |          |          0x7e714
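
A minimal userspace sketch of the check described above. The struct, the
function names and the vector numbers (0xf00 for the performance monitor
exception, 0xc00 for a syscall) are assumptions made for illustration; this
is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct pt_regs; only the two fields used here. */
struct sketch_regs {
	unsigned long trap;	/* exception vector that produced these regs */
	unsigned long result;	/* overloaded by perf_read_regs() as "use SIAR" */
};

/* Assumed vector number of the performance monitor exception. */
#define SKETCH_PMU_TRAP 0xf00UL

/* Trust the overloaded flag only if the regs came from a PMU exception. */
static bool sketch_regs_use_siar(const struct sketch_regs *regs)
{
	return regs->trap == SKETCH_PMU_TRAP && regs->result != 0;
}

/* Usage: syscall regs with junk in result must not select the SIAR. */
static void sketch_regs_demo(void)
{
	struct sketch_regs pmu = { SKETCH_PMU_TRAP, 1 };
	struct sketch_regs sys = { 0xc00UL, 0xdeadbeefUL };

	assert(sketch_regs_use_siar(&pmu));
	assert(!sketch_regs_use_siar(&sys));
}
```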

Signed-off-by: Anton Blanchard 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc: Align TOC to 256 bytes

2015-06-03T09:33:11+00:00

commit 5e95235ccd5442d4a4fe11ec4eb99ba1b7959368 upstream.

Recent toolchains force the TOC to be 256 byte aligned. We need
to enforce this alignment in our linker script, otherwise pointers
to our TOC variables (__toc_start, __prom_init_toc_start) could
be incorrect.

If they are bad, we die a few hundred instructions into boot.
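
For illustration, the arithmetic behind a 256 byte alignment can be sketched
in C as below. The helper name is hypothetical; the actual fix lives in the
linker script, not in C code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TOC_ALIGN 256u

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Usage: an unaligned address moves up, an aligned one stays put. */
static void sketch_align_demo(void)
{
	assert(sketch_align_up(0x1001, SKETCH_TOC_ALIGN) == 0x1100);
	assert(sketch_align_up(0x1000, SKETCH_TOC_ALIGN) == 0x1000);
}
```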

Signed-off-by: Anton Blanchard 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the allowed address space

2015-06-02T09:54:48+00:00

commit 19751c07b3728748c1253627ce94e6906fa5e273 upstream.

According to POSIX, if MAP_FIXED is specified, mmap shall fail with ENOMEM if
the requested mapping exceeds the allowed range for the address space of
the process. The generic code gets this right, but the powerpc-specific
slice_get_unmapped_area() function currently returns -EINVAL in that
case.
This patch corrects it.
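
A rough sketch of the range check with the corrected errno. The limit and the
function name are hypothetical; in the kernel the limit is per process.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical address-space limit, chosen only for this sketch. */
#define SKETCH_TASK_SIZE 0x10000UL

/* Returns 0 if [addr, addr + len) fits below the limit, else -ENOMEM. */
static long sketch_check_fixed_range(unsigned long addr, unsigned long len)
{
	if (len > SKETCH_TASK_SIZE)
		return -ENOMEM;
	if (addr > SKETCH_TASK_SIZE - len)
		return -ENOMEM;	/* the case that used to yield -EINVAL */
	return 0;
}

/* Usage: a mapping that ends exactly at the limit is still accepted. */
static void sketch_range_demo(void)
{
	assert(sketch_check_fixed_range(0x8000, 0x8000) == 0);
	assert(sketch_check_fixed_range(0x8001, 0x8000) == -ENOMEM);
}
```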

Signed-off-by: Jerome Marchand 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Jiri Slaby

powerpc: Fix missing L2 cache size in /sys/devices/system/cpu

2015-05-04T09:50:11+00:00

commit f7e9e358362557c3aa2c1ec47490f29fe880a09e upstream.

This problem appears to have been introduced in 2.6.29 by commit
93197a36a9c1 "Rewrite sysfs processor cache info code".

This caused lscpu to error out on at least e500v2 devices, eg:

  error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory

Some embedded powerpc systems use cache-size in DTS for the unified L2
cache size, not d-cache-size, so we need to allow for both DTS names.
Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle
this.
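
The idea of accepting either DTS name can be sketched as follows. Note that
this uses a simple ordered fallback over a made-up property table, which is a
different mechanism from the CACHE_TYPE_UNIFIED_D structure the patch adds;
all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node's properties. */
struct sketch_prop {
	const char *name;
	unsigned int value;
};

/* Look up a property by name; returns NULL if absent. */
static const struct sketch_prop *sketch_find(const struct sketch_prop *props,
					     size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return &props[i];
	return NULL;
}

/* Try each candidate name in order; returns 0 on success, -1 if none exist. */
static int sketch_unified_cache_size(const struct sketch_prop *props, size_t n,
				     unsigned int *size)
{
	static const char *const names[] = { "d-cache-size", "cache-size" };

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		const struct sketch_prop *p = sketch_find(props, n, names[i]);
		if (p) {
			*size = p->value;
			return 0;
		}
	}
	return -1;
}

/* Usage: a node that only carries "cache-size" still yields a size. */
static void sketch_cache_demo(void)
{
	struct sketch_prop l2[] = { { "cache-size", 524288u } };
	unsigned int size = 0;

	assert(sketch_unified_cache_size(l2, 1, &size) == 0);
	assert(size == 524288u);
}
```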

Fixes: 93197a36a9c1 ("powerpc: Rewrite sysfs processor cache info code")
Signed-off-by: Dave Olson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH

2015-05-04T09:50:05+00:00

commit 9a5cbce421a283e6aea3c4007f141735bf9da8c3 upstream.

We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH
(currently 127), but we forgot to do the same for 64bit backtraces.
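
A small sketch of why the cap matters: without it, a corrupt or cyclic user
stack could make the walk run unbounded. The frame type and function name are
hypothetical; only the value 127 comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_STACK_DEPTH 127	/* value quoted in the commit message */

/* Toy frame: each frame points at its caller's frame. */
struct sketch_frame {
	const struct sketch_frame *next;
};

/* Count frames, stopping at the cap even if the chain is longer or cyclic. */
static int sketch_walk(const struct sketch_frame *f)
{
	int depth = 0;

	while (f && depth < SKETCH_MAX_STACK_DEPTH) {
		depth++;
		f = f->next;
	}
	return depth;
}

/* Usage: a self-referencing chain terminates at the cap. */
static void sketch_walk_demo(void)
{
	struct sketch_frame loop = { &loop };

	assert(sketch_walk(&loop) == SKETCH_MAX_STACK_DEPTH);
}
```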

Signed-off-by: Anton Blanchard 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

nosave: consolidate __nosave_{begin,end} in

2015-05-04T09:49:04+00:00

commit 7f8998c7aef3ac9c5f3f2943e083dfa6302e90d0 upstream.

The different architectures used their own (and different) declarations:

    extern __visible const void __nosave_begin, __nosave_end;
    extern const void __nosave_begin, __nosave_end;
    extern long __nosave_begin, __nosave_end;

Consolidate them using the first variant in .

Signed-off-by: Geert Uytterhoeven 
Cc: Russell King 
Cc: Ralf Baechle 
Cc: Benjamin Herrenschmidt 
Cc: Martin Schwidefsky 
Cc: "David S. Miller" 
Cc: Guan Xuetao 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Jiri Slaby 
[js -- port to 3.12: arm does not have hibernation yet]

powerpc/pseries: Little endian fixes for post mobility device tree update

2015-04-21T14:30:02+00:00

commit f6ff04149637723261aa4738958b0098b929ee9e upstream.

We currently use the device tree update code in the kernel after resuming
from a suspend operation to re-sync the kernel's view of the device tree with
that of the hypervisor. The code as it stands is not endian safe, as it relies
on parsing buffers returned by RTAS calls, which contain data in big
endian format.

This patch annotates variables and structure members with __be types as well
as performing necessary byte swaps to cpu endian for data that needs to be
parsed.
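
As an illustration of the byte swap involved, the helper below decodes a big
endian 32 bit value from a raw buffer on any host. The kernel uses
be32_to_cpu() on __be32 typed values instead; this portable byte-wise decoder
and its name are stand-ins for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a big endian 32 bit value from a byte buffer, on any host. */
static uint32_t sketch_be32_to_cpu(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Usage: the most significant byte comes first in the buffer. */
static void sketch_endian_demo(void)
{
	const unsigned char buf[4] = { 0x00, 0x00, 0x01, 0x02 };

	assert(sketch_be32_to_cpu(buf) == 0x102u);
}
```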

Signed-off-by: Tyrel Datwyler 
Cc: Nathan Fontenot 
Cc: Cyril Bur 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc: Fix sys_call_table declaration to enable syscall tracing

2015-04-09T12:13:46+00:00

commit 1028ccf560b97adbf272381a61a67e17d44d1054 upstream.

Declaring sys_call_table as a pointer causes the compiler to generate
the wrong lookup code in arch_syscall_addr().

     <arch_syscall_addr>:
        lis     r9,-16384
        rlwinm  r3,r3,2,0,29
  -     lwz     r11,30640(r9)
  -     lwzx    r3,r11,r3
  +     addi    r9,r9,30640
  +     lwzx    r3,r9,r3
        blr

The actual sys_call_table symbol, declared in assembler, is an
array. If we lie about that to the compiler we get the wrong code
generated, as above.
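
The difference between the two declarations can be seen in plain C. With an
array symbol, the symbol's address is the table itself; with a pointer
symbol, the symbol is a separate object that must be loaded first, which
matches the extra lwz in the listing above. The names below are hypothetical.

```c
#include <assert.h>

/*
 * Two ways a table symbol might be declared elsewhere:
 *   extern const unsigned long tbl[];	(array: the symbol is the table)
 *   extern const unsigned long *tbl;	(pointer: the symbol holds an address)
 */
static const unsigned long sketch_table[3] = { 10, 20, 30 };
static const unsigned long *const sketch_table_ptr = sketch_table;

/* Array symbol: its address equals the address of the first entry. */
static int sketch_array_is_in_place(void)
{
	return (const void *)&sketch_table == (const void *)&sketch_table[0];
}

/* Pointer symbol: the pointer object lives somewhere other than the table. */
static int sketch_pointer_is_indirect(void)
{
	return (const void *)&sketch_table_ptr != (const void *)sketch_table_ptr;
}

/* Usage: both views still read the same entry once resolved correctly. */
static void sketch_table_demo(void)
{
	assert(sketch_table[1] == sketch_table_ptr[1]);
}
```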

This definition seems only to be used by the syscall tracing code in
kernel/trace/trace_syscalls.c. With this patch I can successfully use
the syscall tracepoints:

  bash-3815  [002] ....   333.239082: sys_write -> 0x2
  bash-3815  [002] ....   333.239087: sys_dup2(oldfd: a, newfd: 1)
  bash-3815  [002] ....   333.239088: sys_dup2 -> 0x1
  bash-3815  [002] ....   333.239092: sys_fcntl(fd: a, cmd: 1, arg: 0)
  bash-3815  [002] ....   333.239093: sys_fcntl -> 0x1
  bash-3815  [002] ....   333.239094: sys_close(fd: a)
  bash-3815  [002] ....   333.239094: sys_close -> 0x0

Signed-off-by: Romeo Cane 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc/smp: Wait until secondaries are active & online

2015-04-09T12:13:43+00:00

commit 875ebe940d77a41682c367ad799b4f39f128d3fa upstream.

Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:

  BUG_ON(td->cpu != smp_processor_id());

Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
output confirms it:

  CPU: 0
  Comm: watchdog/130

The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isn't set we choose some other CPU to run on.

This seems to have been introduced by 6acbfb96976f "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.

The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.
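
The spin can be sketched with C11 atomics as below. The struct, the function
name and the max_spins bound are assumptions added so the sketch terminates
in a test; the kernel loop described above simply waits on the CPU's online
and active bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the per CPU online and active bits. */
struct sketch_cpu {
	atomic_bool online;
	atomic_bool active;
};

/* Spin until both bits are set; max_spins bounds the loop for the sketch only. */
static bool sketch_wait_for_cpu(struct sketch_cpu *cpu, unsigned long max_spins)
{
	while (!atomic_load(&cpu->online) || !atomic_load(&cpu->active)) {
		if (max_spins-- == 0)
			return false;
	}
	return true;
}

/* Usage: online alone is not enough, the master must also see active. */
static void sketch_wait_demo(void)
{
	struct sketch_cpu cpu;

	atomic_init(&cpu.online, true);
	atomic_init(&cpu.active, false);
	assert(!sketch_wait_for_cpu(&cpu, 1000));

	atomic_store(&cpu.active, true);
	assert(sketch_wait_for_cpu(&cpu, 1000));
}
```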

Fixes: 6acbfb96976f ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: Michael Ellerman 
Signed-off-by: Anton Blanchard 
Signed-off-by: Michael Ellerman 
Signed-off-by: Jiri Slaby

powerpc/perf: Fix ABIv2 kernel backtraces

2015-04-09T12:13:35+00:00

commit 85101af13bb854a6572fa540df7c7201958624b9 upstream.

ABIv2 kernels are failing to backtrace through the kernel. An example:

39.30%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               __GI___libc_read

The problem is in valid_next_sp() where we check that the new stack
pointer is at least STACK_FRAME_OVERHEAD below the previous one.

ABIv1 has a minimum stack frame size of 112 bytes consisting of 48 bytes
and 64 bytes of parameter save area. ABIv2 changes that to 32 bytes
with no parameter save area.

STACK_FRAME_OVERHEAD is in theory the minimum stack frame size,
but we have over 240 uses of it, some of which assume that it includes
space for the parameter area.

We need to work through all our stack defines and rationalise them,
but let's fix perf now by creating STACK_FRAME_MIN_SIZE and using
it in valid_next_sp(). This fixes the issue:

30.64%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               pagecache_get_page
               generic_file_read_iter
               new_sync_read
               vfs_read
               sys_read
               syscall_exit
               __GI___libc_read
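
A simplified sketch of the distance check, to show why a 112 byte minimum
rejects legitimate ABIv2 frames. It assumes a downward-growing stack (the
next frame up the chain sits at a higher address) and 16 byte alignment; the
function and macro names are hypothetical and the real valid_next_sp() does
more than this.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MIN_FRAME_ABIV1 112UL	/* 48 bytes + 64 bytes parameter save area */
#define SKETCH_MIN_FRAME_ABIV2 32UL	/* no parameter save area */

/* Accept next_sp only if aligned and at least min_frame bytes past prev_sp. */
static bool sketch_valid_next_sp(unsigned long next_sp, unsigned long prev_sp,
				 unsigned long min_frame)
{
	if (next_sp & 0xf)	/* assumed 16 byte stack alignment */
		return false;
	return next_sp >= prev_sp + min_frame;
}

/* Usage: a 32 byte frame only passes with the ABIv2 minimum. */
static void sketch_sp_demo(void)
{
	assert(!sketch_valid_next_sp(0x1020, 0x1000, SKETCH_MIN_FRAME_ABIV1));
	assert(sketch_valid_next_sp(0x1020, 0x1000, SKETCH_MIN_FRAME_ABIV2));
}
```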

Cc: stable@vger.kernel.org # 3.16+
Reported-by: Aneesh Kumar K.V 
Signed-off-by: Anton Blanchard 
Signed-off-by: Jiri Slaby