Age | Commit message (Collapse) | Author |
|
commit a4a81d8eebdc1d209d034f62a082a5131e4242b5 upstream.
In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF
notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified
as note name. Otherwise the notes are ignored.
For /proc/vmcore we currently use "CORE" for these notes.
Up to now this has not been a real problem because the dump analysis tool
"crash" does not check the note name. But it will break all programs that
use libbfd for processing ELF notes.
So fix this and use "LINUX" for all s390 specific notes to comply with
libbfd.
Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4920e3cf77347d7d7373552d4839e8d832321313 upstream.
The current implementation of setup_randomness uses the stack address
and therefore the pointer to the SYSIB 3.2.2 block as input data
address. Furthermore the length of the input data is the number of
virtual-machine description blocks which is typically one.
This means that typically a single zero byte is fed to
add_device_randomness.
Fix both of these and use the address of the first virtual machine
description block as input data address and also use the correct
length.
Fixes: bcfcbb6bae64 ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit da8fd820f389a0e29080b14c61bf5cf1d8ef5ca1 upstream.
Commit bcfcbb6bae64 ("s390: add system information as device
randomness") intended to add some virtual machine specific information
to the randomness pool.
Unfortunately it uses the page allocator before it is ready to use. In
result the page allocator always returns NULL and the setup_randomness
function never adds anything to the randomness pool.
To fix this use memblock_alloc and memblock_free instead.
Fixes: bcfcbb6bae64 ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9dce990d2cf57b5ed4e71a9cdbd7eae4335111ff upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
convert_vx_to_fp() is adapted to handle only a specified number of
registers rather than unconditionally handling all of them: other
callers of this function are adapted appropriately.
Based on an initial patch by Dave Martin.
Reported-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5419447e2142d6ed68c9f5c1a28630b3a290a845 upstream.
This reverts commit 852ffd0f4e23248b47531058e531066a988434b5.
There are use cases where an intermediate boot kernel (1) uses kexec
to boot the final production kernel (2). For this scenario we should
provide the original boot information to the production kernel (2).
Therefore clearing the boot information during kexec() should not
be done.
Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8f100bb1ff27873dd71f636da670e503b9ade3c6 upstream.
Add the missing lpp magic initialization for cpu 0. Without this all
samples on cpu 0 do not have the most significant bit set in the
program parameter field, which we use to distinguish between guest and
host samples if the pid is also 0.
We did initialize the lpp magic in the absolute zero lowcore but
forgot that when switching to the allocated lowcore on cpu 0 only.
Reported-by: Shu Juan Zhang <zhshuj@cn.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e370e4769463a65dcf8806fa26d2874e0542ac41 upstream.
There is a tricky interaction between the machine check handler
and the critical sections of load_fpu_regs and save_fpu_regs
functions. If the machine check interrupts one of the two
functions the critical section cleanup will complete the function
before the machine check handler s390_do_machine_check is called.
Trouble is that the machine check handler needs to validate the
floating point registers *before* and not *after* the completion
of load_fpu_regs/save_fpu_regs.
The simplest solution is to rewind the PSW to the start of the
load_fpu_regs/save_fpu_regs and retry the function after the
return from the machine check handler.
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7a76aa95f6f6682db5629449d763251d1c9f8c4e upstream.
we have to check bit 40 of the facility list before issuing LPP
and not bit 48. Otherwise a guest running on a system with
"The decimal-floating-point zoned-conversion facility" and without
the "The set-program-parameters facility" might crash on an lpp
instruction.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 342300cc9cd3428bc6bfe5809bfcc1b9a0f06702 upstream.
git commit 8070361799ae1e3f4ef347bd10f0a508ac10acfb
"s390: add support for vector extension"
broke 31-bit compat processes in regard to signal handling.
The restore_sigregs_ext32() function is used to restore the additional
elements from the user space signal frame. Among the additional elements
are the upper registers halves for 64-bit register support for 31-bit
processes. The copy_from_user that is used to retrieve the high-gprs
array from the user stack uses an incorrect length, 8 bytes instead of
64 bytes. This causes incorrect upper register halves to get loaded.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d9a3a09af54d01ab8b0c320580f4f95328d4a7ac upstream.
Replace the offsets based on the struct area_area with the offset
constants from asm-offsets.c based on the struct _lowcore.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The print_insn() function returns strings like "lghi %r1,0". To escape the
'%' character in sprintf() a second '%' is used. For example "lghi %%r1,0"
is converted into "lghi %r1,0".
After print_insn() the output string is passed to printk(). Because format
specifiers like "%r" or "%f" are ignored by printk() this works by chance
most of the time. But for instructions with control registers like
"lctl %c6,%c6,780" this fails because printk() interprets "%c" as
character format specifier.
Fix this problem and escape the '%' characters twice.
For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf()
into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780".
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
There is no known user, therefore remove the code.
Acked-by: Rob Van Der Heij <robvdheij@nl.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Passes mlock2-tests test case in 64 bit and compat mode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Remove dead code, since this could only happen on a 31 bit machine
where the kernel wouldn't IPL.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
commit 1f6b83e5e4d3 ("s390: avoid z13 cache aliasing") checks for the
machine type to optimize address space randomization and zero page
allocation to avoid cache aliases.
This check might fail under a hypervisor with migration support.
z/VMs "Single System Image and Live Guest Relocation" facility will
"fake" the machine type of the oldest system in the group. For example
in a group of zEC12 and Z13 the guest appears to run on a zEC12
(architecture fencing within the relocation domain)
Remove the machine type detection and always use cache aliasing
rules that are known to work for all machines. These are the z13
aliasing rules.
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
There's no reason to clear all PSW mask bits other than the addressing
mode bits. Just use the previous PSW mask as-is.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Allow to ipl from CCW based devices residing in any subchannel set.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The input buffer in reipl_fcp_scpdata_write is accessed out of bounds
when an offset is specified. The problem is that the offset refers to
the data we should write to and not to the buffer we read from.
So instead of
memcpy(scp_data, buf + off, count);
we could just do
memcpy(scp_data + off, buf, count);
However we not only modify the data but also store its length. For this to
work we'd need to remember a state per open FH. Since that's not possible
with sysfs callbacks let's just fail when an offset is specified.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Documentation/trace/tracepoints.txt states that the naming scheme
for tracepoints is "subsys_event" to avoid collisions. Rename
the 'diagnose' tracepoint to 's390_diagnose'.
Reported-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
startup calls the C function _sclp_print_early() if the machine we're
running on is not supported by the kernel. sclp.c is getting built
with -m64, so _sclp_print_early() expects the zSeries ELF ABI to be
used.
We previously called _sclp_print_early() using the S/390 ELF ABI, with
a stack frame size of 96 bytes and while being in 31-bit address
mode. This caused _sclp_wait_int() (called indirectly from
_sclp_print_early()) to jump to an undefined address. While
_sclp_wait_int() contained some code to deal with being called in
31-bit addressing mode, it didn't quite work. While fixing this is
possible, the code would still only work by chance and could break any
time.
Ensure compliance with the zSeries ELF ABI by switching to 64-bit
addressing mode early and using a minimum stack frame size of 160
bytes.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
"There is only one new feature in this pull for the 4.4 merge window,
most of it is small enhancements, cleanup and bug fixes:
- Add the s390 backend for the software dirty bit tracking. This
adds two new pgtable functions pte_clear_soft_dirty and
pmd_clear_soft_dirty which is why there is a hit to
arch/x86/include/asm/pgtable.h in this pull request.
- A series of cleanup patches for the AP bus, this includes the
removal of the support for two outdated crypto cards (PCICC and
PCICA).
- The irq handling / signaling on buffer full in the runtime
instrumentation code is dropped.
- Some micro optimizations: remove unnecessary memory barriers for a
couple of functions: [smb_]rmb, [smb_]wmb, atomics, bitops, and for
spin_unlock. Use the builtin bswap if available and make
test_and_set_bit_lock more cache friendly.
- Statistics and a tracepoint for the diagnose calls to the
hypervisor.
- The CPU measurement facility support to sample KVM guests is
improved.
- The vector instructions are now always enabled for user space
processes if the hardware has the vector facility. This simplifies
the FPU handling code. The fpu-internal.h header is split into fpu
internals, api and types just like x86.
- Cleanup and improvements for the common I/O layer.
- Rework udelay to solve a problem with kprobe. udelay has busy loop
semantics but still uses an idle processor state for the wait"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (66 commits)
s390: remove runtime instrumentation interrupts
s390/cio: de-duplicate subchannel validation
s390/css: unneeded initialization in for_each_subchannel
s390/Kconfig: use builtin bswap
s390/dasd: fix disconnected device with valid path mask
s390/dasd: fix invalid PAV assignment after suspend/resume
s390/dasd: fix double free in dasd_eckd_read_conf
s390/kernel: fix ptrace peek/poke for floating point registers
s390/cio: move ccw_device_stlck functions
s390/cio: move ccw_device_call_handler
s390/topology: reduce per_cpu() invocations
s390/nmi: reduce size of percpu variable
s390/nmi: fix terminology
s390/nmi: remove casts
s390/nmi: remove pointless error strings
s390: don't store registers on disabled wait anymore
s390: get rid of __set_psw_mask()
s390/fpu: split fpu-internal.h into fpu internals, api, and type headers
s390/dasd: fix list_del corruption after lcu changes
s390/spinlock: remove unneeded serializations at unlock
...
|
|
The external interrupts for runtime instrumentation buffer-full
and runtime instrumentation halted are unused and have no current
user. Remove the support and ignore the second parameter of the
s390_runtime_instr system call from now on.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
git commit 155e839a814834a3b4b31e729f4716e59d3d2dd4
"s390/kernel: dynamically allocate FP register save area"
introduced a regression in regard to ptrace.
If the vector register extension is not present or unused the
ptrace peek of a floating pointer register return incorrect data
and the ptrace poke to a floating pointer register overwrites the
task structure starting at task->thread.fpu.fprs.
Cc: stable@kernel.org # v4.3
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Each per_cpu() invocation generates extra code. Since there are a lot
of similiar calls in the topology code we can avoid a lot of them.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Change the flag fields within struct mcck_struct to simple bit fields
to reduce the size of the structure which is used as percpu variable.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
According to the architecture registers are validated and not
revalidated. So change comments and functions names to match.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Remove all the casts to and from the machine check interruption code.
This patch changes struct mci to a union, which contains an anonymous
structure with the already known bits and in addition an unsigned
long field, which contains the raw machine check interruption code.
This allows to simply assign and decoce the interruption code value
without the need for all those casts we had all the time.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
s390_handle_damage() has character string parameter which was used as
a pointer to verbose error message. The hope was (a lot of years ago)
when analyzing dumps that register R2 would still contain the pointer
and therefore it would be rather easy to tell what went wrong.
However gcc optimizes the strings away since a long time. And even if
it wouldn't it is necessary to have a close look at the machine check
interruption code to tell what's wrong.
So remove the pointless error strings.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Split the API and FPU type definitions into separate header files
similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b).
The new header files and their meaning are:
asm/fpu/types.h:
FPU related data types, needed for 'struct thread_struct' and
'struct task_struct'.
asm/fpu/api.h:
FPU related 'public' functions for other subsystems and device
drivers.
asm/fpu/internal.h:
FPU internal functions mainly used to convert
FPU register contents in signal handling.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The first level machine check handler for etr and stp machine checks may
call queue_work() while in nmi context. This may deadlock e.g. if the
machine check happened when the interrupted context did hold a lock, that
also will be acquired by queue_work().
Therefore split etr and stp machine check handling into first and second
level handling. The second level handling will then issue the queue_work()
call in process context which avoids the potential deadlock.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
When using systemtap it was observed that our udelay implementation is
rather suboptimal if being called from a kprobe handler installed by
systemtap.
The problem observed when a kprobe was installed on lock_acquired().
When the probe was hit the kprobe handler did call udelay, which set
up an (internal) timer and reenabled interrupts (only the clock comparator
interrupt) and waited for the interrupt.
This is an optimization to avoid that the cpu is busy looping while waiting
that enough time passes. The problem is that the interrupt handler still
does call irq_enter()/irq_exit() which then again can lead to a deadlock,
since some accounting functions may take locks as well.
If one of these locks is the same, which caused lock_acquired() to be
called, we have a nice deadlock.
This patch reworks the udelay code for the interrupts disabled case to
immediately leave the low level interrupt handler when the clock
comparator interrupt happens. That way no C code is being called and the
deadlock cannot happen anymore.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The program parameter can be used to mark hardware samples with
some token. Previously, it was used to mark guest samples only.
Improve the program parameter doubleword by combining two parts,
the leftmost LPP part and the rightmost PID part. Set the PID
part for processes by using the task PID.
To distinguish host and guest samples for the kernel (PID part
is zero), the guest must always set the program paramater to a
non-zero value. Use the leftmost bit in the LPP part of the
program parameter to be able to detect guest kernel samples.
[brueckner@linux.vnet.ibm.com]: Split __LC_CURRENT and introduced
__LC_LPP. Corrected __LC_CURRENT users and adjusted assembler parts.
And updated the commit message accordingly.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The use of OFFSET instead of DEFINE makes the definitions in asm-offsets.c
more readable. While we are at it sort the defines for struct _lowcore
according to the field order and remove some unneeded defines.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Various functions in entry.S perform test-under-mask instructions
to test for particular bits in memory. Because test-under-mask uses
a mask value of one byte, the mask value and the offset into the
memory must be calculated manually. This easily introduces errors
and is hard to review and read.
Introduce the TSTMSK assembler macro to specify a mask constant and
let the macro calculate the offset and the byte mask to generate a
test-under-mask instruction. The benefit is that existing symbolic
constants can now be used for tests. Also the macro checks for
zero mask values and mask values that consist of multiple bytes.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Previously, the init task did not have an allocated FPU save area and
saving an FPU state was not possible. Now if the vector extension is
always enabled, provide a static FPU save area to save FPU states of
vector instructions that can be executed quite early.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
If the kernel detects that the s390 hardware supports the vector
facility, it is enabled by default at an early stage. To force
it off, use the novx kernel parameter. Note that there is a small
time window, where the vector facility is enabled before it is
forced to be off.
With enabling the vector facility by default, the FPU save and
restore functions can be improved. They do not longer require
to manage expensive control register updates to enable or disable
the vector enablement control for particular processes.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
To be able to analyse problems in regard to hypervisor overhead
add a tracepoing for diagnose calls. It reports the number of
the diagnose issued, e.g.
sshd-1385 [002] .... 42.701431: diagnose: nr=0x9c
<idle>-0 [001] ..s. 43.587528: diagnose: nr=0x9c
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Introduce /sys/debug/kernel/diag_stat with a statistic how many diagnose
calls have been done by each CPU in the system.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The <linux/memblock.h> already provides for_each_mem_range() macro that
iterates through memblock areas from type_a and not included in type_b.
We can remove custom for_each_dump_mem_range() macro and use the
for_each_mem_range() instead.
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
By definition smp_wmb only orders writes against writes. (Finish all
previous writes, and do not start any future write). To protect the
vdso init code against early reads on other CPUs, let's use a full
smp_mb at the end of vdso init. As right now smp_wmb is implemented
as full serialization, this needs no stable backport, but this change
will be necessary if we reimplement smp_wmb.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The calculation for the SMT scaling factor for a hardware thread
which has been partially idle needs to disregard the cycles spent
by the other threads of the core while the thread is idle.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
As discussed on linux-arch all architectures should wire up the separate
system calls that are hidden behind the socketcall multiplexer system call.
It's just a couple more system calls and gives us a very small performance
improvement.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
A couple of compat wrapper functions are simply trampolines to the real
system call. This happened because the compat wrapper defines will only
sign and zero extend system call parameters which are of different size
on s390/s390x (longs and pointers).
All other parameters will be correctly sign and zero extended by the
normal system call wrappers.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Add notrace to the compat wrapper define to disable tracing of compat
wrapper functions. These are supposed to be very small and more or less
just a trampoline to the real system call.
Also fix indentation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: linux-api@vger.kernel.org
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: linux-s390@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The scaled cputime is supposed to be derived from the normal per-thread
cputime by dividing it with the average thread density in the last interval.
The calculation of the scaling values for the average thread density is
incorrect. The current, incorrect calculation:
Ci = cycle count with i active threads
T = unscaled cputime, sT = scaled cputime
sT = T * (C1 + C2 + ... + Cn) / (1*C1 + 2*C2 + ... + n*Cn)
The calculation happens to yield the correct numbers for the simple cases
with only one Ci value not zero. But for cases with multiple Ci values not
zero it fails. E.g. on a SMT-2 system with one thread active half the time
and two threads active for the other half of the time it fails, the scaling
factor should be 3/4 but the formula gives 2/3.
The correct formula is
sT = T * (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn)
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Previously, the cpum_cf PMU returned -EPERM if a counter is requested and
the counter set to which the counter belongs is not authorized. According
to the perf_event_open() system call manual, an error code of EPERM indicates
an unsupported exclude setting or CAP_SYS_ADMIN is missing.
Use ENOENT to indicate that particular counters are not available when the
counter set which contains the counter is not authorized. For generic events,
this might trigger a fall back, for example, to a software event.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|