summaryrefslogtreecommitdiff
path: root/arch/s390/kernel
AgeCommit message (Collapse)Author
2018-11-27s390/vdso: add missing FORCE to build targetsVasily Gorbik
[ Upstream commit b44b136a3773d8a9c7853f8df716bd1483613cbb ] According to Documentation/kbuild/makefiles.txt all build targets using if_changed should use FORCE as well. Add missing FORCE to make sure vdso targets are rebuild properly when not just immediate prerequisites have changed but also when build command differs. Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-09-15s390/kdump: Fix memleak in nt_vmcoreinfoPhilipp Rudo
[ Upstream commit 2d2e7075b87181ed0c675e4936e20bdadba02e1f ] The vmcoreinfo of a crashed system is potentially fragmented. Thus the crash kernel has an intermediate step where the vmcoreinfo is copied into a temporary, continuous buffer in the crash kernel memory. This temporary buffer is never freed. Free it now to prevent the memleak. While at it replace all occurrences of "VMCOREINFO" by its corresponding macro to prevent potential renaming issues. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11s390: Correct register corruption in critical section cleanupChristian Borntraeger
commit 891f6a726cacbb87e5b06076693ffab53bd378d7 upstream. In the critical section cleanup we must not mess with r1. For march=z9 or older, larl + ex (instead of exrl) are used with r1 as a temporary register. This can clobber r1 in several interrupt handlers. Fix this by using r11 as a temp register. r11 is being saved by all callers of cleanup_critical. Fixes: 6dd85fbb87 ("s390: move expoline assembler macros to a header") Cc: stable@vger.kernel.org #v4.16 Reported-by: Oliver Kurz <okurz@suse.com> Reported-by: Petr Tesařík <ptesarik@suse.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-25s390: extend expoline to BC instructionsMartin Schwidefsky
[ Upstream commit 6deaa3bbca804b2a3627fd685f75de64da7be535 ] The BPF JIT uses a 'b <disp>(%r<x>)' instruction in the definition of the sk_load_word and sk_load_half functions. Add support for branch-on-condition instructions contained in the thunk code of an expoline. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-25s390: move spectre sysfs attribute codeMartin Schwidefsky
[ Upstream commit 4253b0e0627ee3461e64c2495c616f1c8f6b127b ] The nospec-branch.c file is compiled without the gcc options to generate expoline thunks. The return branch of the sysfs show functions cpu_show_spectre_v1 and cpu_show_spectre_v2 is an indirect branch as well. These need to be compiled with expolines. Move the sysfs functions for spectre reporting to a separate file and loose an '.' for one of the messages. Cc: stable@vger.kernel.org # 4.16 Fixes: d424986f1d ("s390: add sysfs attributes for spectre") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-25s390/kernel: use expoline for indirect branchesMartin Schwidefsky
[ Upstream commit c50c84c3ac4d5db683904bdb3257798b6ef980ae ] The assember code in arch/s390/kernel uses a few more indirect branches which need to be done with execute trampolines for CONFIG_EXPOLINE=y. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-25s390/ftrace: use expoline for indirect branchesMartin Schwidefsky
[ Upstream commit 23a4d7fd34856da8218c4cfc23dba7a6ec0a423a ] The return from the ftrace_stub, _mcount, ftrace_caller and return_to_handler functions is done with "br %r14" and "br %r1". These are indirect branches as well and need to use execute trampolines for CONFIG_EXPOLINE=y. The ftrace_caller function is a special case as it returns to the start of a function and may only use %r0 and %r1. For a pre z10 machine the standard execute trampoline uses a LARL + EX to do this, but this requires *two* registers in the range %r1..%r15. To get around this the 'br %r1' located in the lowcore is used, then the EX instruction does not need an address register. But the lowcore trick may only be used for pre z14 machines, with noexec=on the mapping for the first page may not contain instructions. The solution for that is an ALTERNATIVE in the expoline THUNK generated by 'GEN_BR_THUNK %r1' to switch to EXRL, this relies on the fact that a machine that supports noexec=on has EXRL as well. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-25s390: move expoline assembler macros to a headerMartin Schwidefsky
[ Upstream commit 6dd85fbb87d1d6b87a3b1f02ca28d7b2abd2e7ba ] To be able to use the expoline branches in different assembler files move the associated macros from entry.S to a new header nospec-insn.h. While we are at it make the macros a bit nicer to use. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22s390: remove indirect branch from do_softirq_own_stackMartin Schwidefsky
commit 9f18fff63cfd6f559daa1eaae60640372c65f84b upstream. The inline assembly to call __do_softirq on the irq stack uses an indirect branch. This can be replaced with a normal relative branch. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22s390/cpum_sf: ensure sample frequency of perf event attributes is non-zeroHendrik Brueckner
commit 4bbaf2584b86b0772413edeac22ff448f36351b1 upstream. Correct a trinity finding for the perf_event_open() system call with a perf event attribute structure that uses a frequency but has the sampling frequency set to zero. This causes a FP divide exception during the sample rate initialization for the hardware sampling facility. Fixes: 8c069ff4bd606 ("s390/perf: add support for the CPU-Measurement Sampling Facility") Cc: stable@vger.kernel.org # 3.14+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390/uprobes: implement arch_uretprobe_is_alive()Heiko Carstens
commit 783c3b53b9506db3e05daacfe34e0287eebb09d8 upstream. Implement s390 specific arch_uretprobe_is_alive() to avoid SIGSEGVs observed with uretprobes in combination with setjmp/longjmp. See commit 2dea1d9c38e4 ("powerpc/uprobes: Implement arch_uretprobe_is_alive()") for more details. With this implemented all test cases referenced in the above commit pass. Reported-by: Ziqian SUN <zsun@redhat.com> Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: correct module section names for expoline code revertMartin Schwidefsky
[ Upstream commit 6cf09958f32b9667bb3ebadf74367c791112771b ] The main linker script vmlinux.lds.S for the kernel image merges the expoline code patch tables into two section ".nospec_call_table" and ".nospec_return_table". This is *not* done for the modules, there the sections retain their original names as generated by gcc: ".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg". The module_finalize code has to check for the compiler generated section names, otherwise no code patching is done. This slows down the module code in case of "spectre_v2=off". Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: correct nospec auto detection init orderMartin Schwidefsky
[ Upstream commit 6a3d1e81a434fc311f224b8be77258bafc18ccc6 ] With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via early_initcall is done *after* the early_param functions. This overwrites any settings done with the nobp/no_spectre_v2/spectre_v2 parameters. The code patching for the kernel is done after the evaluation of the early parameters but before the early_initcall is done. The end result is a kernel image that is patched correctly but the kernel modules are not. Make sure that the nospec auto detection function is called before the early parameters are evaluated and before the code patching is done. Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add sysfs attributes for spectreMartin Schwidefsky
[ Upstream commit d424986f1d6b16079b3231db0314923f4f8deed1 ] Set CONFIG_GENERIC_CPU_VULNERABILITIES and provide the two functions cpu_show_spectre_v1 and cpu_show_spectre_v2 to report the spectre mitigations. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: report spectre mitigation via syslogMartin Schwidefsky
[ Upstream commit bc035599718412cfba9249aa713f90ef13f13ee9 ] Add a boot message if either of the spectre defenses is active. The message is "Spectre V2 mitigation: execute trampolines." or "Spectre V2 mitigation: limited branch prediction." Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add automatic detection of the spectre defenseMartin Schwidefsky
[ Upstream commit 6e179d64126b909f0b288fa63cdbf07c531e9b1d ] Automatically decide between nobp vs. expolines if the spectre_v2=auto kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set. The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set can be overruled with the nobp, nospec and spectre_v2 kernel parameters. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: move nobp parameter functions to nospec-branch.cMartin Schwidefsky
[ Upstream commit b2e2f43a01bace1a25bdbae04c9f9846882b727a ] Keep the code for the nobp parameter handling with the code for expolines. Both are related to the spectre v2 mitigation. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390/entry.S: fix spurious zeroing of r0Christian Borntraeger
[ Upstream commit d3f468963cd6fd6d2aa5e26aed8b24232096d0e1 ] when a system call is interrupted we might call the critical section cleanup handler that re-does some of the operations. When we are between .Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the problem state registers r0-r7: .Lcleanup_system_call: [...] 0: # update accounting time stamp mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer # fill pt_regs mvc __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC ---> stmg %r0,%r7,__PT_R0(%r9) The problem is now, that we might have already zeroed out r0. The fix is to move the zeroing of r0 after sysc_do_svc. Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com> Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: do not bypass BPENTER for interrupt system callsMartin Schwidefsky
[ Upstream commit d5feec04fe578c8dbd9e2e1439afc2f0af761ed4 ] The system call path can be interrupted before the switch back to the standard branch prediction with BPENTER has been done. The critical section cleanup code skips forward to .Lsysc_do_svc and bypasses the BPENTER. In this case the kernel and all subsequent code will run with the limited branch prediction. Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)Eugeniu Rosca
[ Upstream commit 2cb370d615e9fbed9e95ed222c2c8f337181aa90 ] I've accidentally stumbled upon the IS_ENABLED(EXPOLINE_*) lines, which obviously always evaluate to false. Fix this. Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: introduce execute-trampolines for branchesMartin Schwidefsky
[ Upstream commit f19fbd5ed642dc31c809596412dab1ed56f2f156 ] Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and -mfunction_return= compiler options to create a kernel fortified against the specte v2 attack. With CONFIG_EXPOLINE=y all indirect branches will be issued with an execute type instruction. For z10 or newer the EXRL instruction will be used, for older machines the EX instruction. The typical indirect call basr %r14,%r1 is replaced with a PC relative call to a new thunk brasl %r14,__s390x_indirect_jump_r1 The thunk contains the EXRL/EX instruction to the indirect branch __s390x_indirect_jump_r1: exrl 0,0f j . 0: br %r1 The detour via the execute type instruction has a performance impact. To get rid of the detour the new kernel parameter "nospectre_v2" and "spectre_v2=[on,off,auto]" can be used. If the parameter is specified the kernel and module code will be patched at runtime. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: run user space and KVM guests with modified branch predictionMartin Schwidefsky
[ Upstream commit 6b73044b2b0081ee3dd1cd6eaab7dee552601efb ] Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary plumbing in entry.S to be able to run user space and KVM guests with limited branch prediction. To switch a user space process to limited branch prediction the s390_isolate_bp() function has to be call, and to run a vCPU of a KVM guest associated with the current task with limited branch prediction call s390_isolate_bp_guest(). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add options to change branch prediction behaviour for the kernelMartin Schwidefsky
[ Upstream commit d768bd892fc8f066cd3aa000eb1867bcf32db0ee ] Add the PPA instruction to the system entry and exit path to switch the kernel to a different branch prediction behaviour. The instructions are added via CPU alternatives and can be disabled with the "nospec" or the "nobp=0" kernel parameter. If the default behaviour selected with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be used to enable the changed kernel branch prediction. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390/alternative: use a copy of the facility bit maskMartin Schwidefsky
[ Upstream commit cf1489984641369611556bf00c48f945c77bcf02 ] To be able to switch off specific CPU alternatives with kernel parameters make a copy of the facility bit mask provided by STFLE and use the copy for the decision to apply an alternative. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: scrub registers on kernel entry and KVM exitMartin Schwidefsky
[ Upstream commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 ] Clear all user space registers on entry to the kernel and all KVM guest registers on KVM guest exit if the register does not contain either a parameter or a result value. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: enable CPU alternatives unconditionallyHeiko Carstens
[ Upstream commit 049a2c2d486e8cc82c5cd79fa479c5b105b109e9 ] Remove the CPU_ALTERNATIVES config option and enable the code unconditionally. The config option was only added to avoid a conflict with the named saved segment support. Since that code is gone there is no reason to keep the CPU_ALTERNATIVES config option. Just enable it unconditionally to also reduce the number of config options and make it less likely that something breaks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: introduce CPU alternativesVasily Gorbik
[ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ] Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability. A new kernel boot parameter "noaltinstr" disables patching. Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length. alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2); Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes. .altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization. Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-20s390/ipl: ensure loadparm valid flag is setVasily Gorbik
commit 15deb080a6087b73089139569558965750e69d67 upstream. When loadparm is set in reipl parm block, the kernel should also set DIAG308_FLAGS_LP_VALID flag. This fixes loadparm ignoring during z/VM fcp -> ccw reipl and kvm direct boot -> ccw reipl. Cc: <stable@vger.kernel.org> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13s390: move _text symbol to address higher than zeroHeiko Carstens
[ Upstream commit d04a4c76f71dd5335f8e499b59617382d84e2b8d ] The perf tool assumes that kernel symbols are never present at address zero. In fact it assumes if functions that map symbols to addresses return zero, that the symbol was not found. Given that s390's _text symbol historically is located at address zero this yields at least a couple of false errors and warnings in one of perf's test cases about not present symbols ("perf test 1"). To fix this simply move the _text symbol to address 0x200, just behind the initial psw and channel program located at the beginning of the kernel image. This is now hard coded within the linker script. I tried a nicer solution which moves the initial psw and channel program into an own section. However that would move the symbols within the "real" head.text section to different addresses, since the ".org" statements within head.S are relative to the head.text section. If there is a new section in front, everything else will be moved. Alternatively I could have adjusted all ".org" statements. But this current solution seems to be the easiest one, since nobody really cares where the _text symbol is actually located. Reported-by: Zvonko Kosic <zkosic@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22s390/topology: fix typo in early topology codeHeiko Carstens
[ Upstream commit 4fd4dd8bffb112d1e6549e0ff09e9fa3c8cc2b96 ] Use MACHINE_FLAG_TOPOLOGY instead of MACHINE_HAS_TOPOLOGY when clearing the bit that indicates if the machine provides topology information (and if it should be used). Currently works anyway. Fixes: 68cc795d1933 ("s390/topology: make "topology=off" parameter work") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22s390: fix handling of -1 in set{,fs}[gu]id16 syscallsEugene Syromiatnikov
commit 6dd0d2d22aa363fec075cb2577ba273ac8462e94 upstream. For some reason, the implementation of some 16-bit ID system calls (namely, setuid16/setgid16 and setfsuid16/setfsgid16) used type cast instead of low2highgid/low2highuid macros for converting [GU]IDs, which led to incorrect handling of value of -1 (which ought to be considered invalid). Discovered by strace test suite. Cc: stable@vger.kernel.org Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-10kernel: make groups_sort calling a responsibility group_info allocatorsThiago Rafael Becker
commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream. In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14s390: fix compat system call tableHeiko Carstens
commit e779498df587dd2189b30fe5b9245aefab870eb8 upstream. When wiring up the socket system calls the compat entries were incorrectly set. Not all of them point to the corresponding compat wrapper functions, which clear the upper 33 bits of user space pointers, like it is required. Fixes: 977108f89c989 ("s390: wire up separate socketcalls system calls") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09s390/runtime instrumentation: simplify task exit handlingHeiko Carstens
commit 8d9047f8b967ce6181fd824ae922978e1b055cc0 upstream. Free data structures required for runtime instrumentation from arch_release_task_struct(). This allows to simplify the code a bit, and also makes the semantics a bit easier: arch_release_task_struct() is never called from the task that is being removed. In addition this allows to get rid of exit_thread() in a later patch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390/disassembler: increase show_code buffer sizeVasily Gorbik
commit b192571d1ae375e0bbe0aa3ccfa1a3c3704454b9 upstream. Current buffer size of 64 is too small. objdump shows that there are instructions which would require up to 75 bytes buffer (with current formating). 128 bytes "ought to be enough for anybody". Also replaces 8 spaces with a single tab to reduce the memory footprint. Fixes the following KASAN finding: BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538 Write of size 1 at addr 000000005a4a75a0 by task bash/1282 CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215 Hardware name: IBM 2964 N96 702 (z/VM 6.4.0) Call Trace: ([<000000000011eeb6>] show_stack+0x56/0x88) [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0 [<00000000004e2994>] print_address_description+0xf4/0x288 [<00000000004e2cf2>] kasan_report+0x13a/0x230 [<0000000000e38ae6>] number+0x3fe/0x538 [<0000000000e3dfe4>] vsnprintf+0x194/0x948 [<0000000000e3ea42>] sprintf+0xa2/0xb8 [<00000000001198dc>] print_insn+0x374/0x500 [<0000000000119346>] show_code+0x4ee/0x538 [<000000000011f234>] show_registers+0x34c/0x388 [<000000000011f2ae>] show_regs+0x3e/0xa8 [<000000000011f502>] die+0x1ea/0x2e8 [<0000000000138f0e>] do_no_context+0x106/0x168 [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0 [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0 [<000000000090639e>] sysrq_handle_crash+0x46/0x58 ([<0000000000000007>] 0x7) [<00000000009073fa>] __handle_sysrq+0x102/0x218 [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100 [<000000000061d67a>] proc_reg_write+0xb2/0x128 [<0000000000520be6>] __vfs_write+0xee/0x368 [<0000000000521222>] vfs_write+0x21a/0x278 [<000000000052156a>] SyS_write+0xda/0x178 [<0000000000e555cc>] system_call+0xc4/0x270 The buggy address belongs to the page: page:000003d1016929c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x0() raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000 raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: 000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00 >000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 ^ 000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8 000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00 ================================================================== Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390/disassembler: add missing end marker for e7 tableHeiko Carstens
commit 5c50538752af7968f53924b22dede8ed4ce4cb3b upstream. The e7 opcode table does not have an end marker. Hence when trying to find an unknown e7 instruction the code will access memory behind the table until it finds something that matches the opcode, or the kernel crashes, whatever comes first. This affects not only the in-kernel disassembler but also uprobes and kprobes which refuse to set a probe on unknown instructions, and therefore search the opcode tables to figure out if instructions are known or not. Fixes: 3585cb0280654 ("s390/disassembler: add vector instructions") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390/runtime instrumention: fix possible memory corruptionHeiko Carstens
commit d6e646ad7cfa7034d280459b2b2546288f247144 upstream. For PREEMPT enabled kernels the runtime instrumentation (RI) code contains a possible use-after-free bug. If a task that makes use of RI exits, it will execute do_exit() while still enabled for preemption. That function will call exit_thread_runtime_instr() via exit_thread(). If exit_thread_runtime_instr() gets preempted after the RI control block of the task has been freed but before the pointer to it is set to NULL, then save_ri_cb(), called from switch_to(), will write to already freed memory. Avoid this and simply disable preemption while freeing the control block and setting the pointer to NULL. Fixes: e4b8b3f33fca ("s390: add support for runtime instrumentation") Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390: fix transactional execution control register handlingHeiko Carstens
commit a1c5befc1c24eb9c1ee83f711e0f21ee79cbb556 upstream. Dan Horák reported the following crash related to transactional execution: User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000] CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1 Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) task: 00000000fafc8000 task.stack: 00000000fafc4000 User PSW : 0705200180000000 000003ff93c14e70 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3 User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e 0000000000000000 0000000000000002 0000000000000000 000003ff00000000 0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770 000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0 User Code: 000003ff93c14e62: 60e0b030 std %f14,48(%r11) 000003ff93c14e66: 60f0b038 std %f15,56(%r11) #000003ff93c14e6a: e5600000ff0e tbegin 0,65294 >000003ff93c14e70: a7740006 brc 7,3ff93c14e7c 000003ff93c14e74: a7080000 lhi %r0,0 000003ff93c14e78: a7f40023 brc 15,3ff93c14ebe 000003ff93c14e7c: b2220000 ipm %r0 000003ff93c14e80: 8800001c srl %r0,28 There are several bugs with control register handling with respect to transactional execution: - on task switch update_per_regs() is only called if the next task has an mm (is not a kernel thread). This however is incorrect. This breaks e.g. for user mode helper handling, where the kernel creates a kernel thread and then execve's a user space program. Control register contents related to transactional execution won't be updated on execve. If the previous task ran with transactional execution disabled then the new task will also run with transactional execution disabled, which is incorrect. Therefore call update_per_regs() unconditionally within switch_to(). - on startup the transactional execution facility is not enabled for the idle thread. This is not really a bug, but an inconsistency to other facilities. Therefore enable the facility if it is available. - on fork the new thread's per_flags field is not cleared. This means that a child process inherits the PER_FLAG_NO_TE flag. This flag can be set with a ptrace request to disable transactional execution for the current process. It should not be inherited by new child processes in order to be consistent with the handling of all other PER related debugging options. Therefore clear the per_flags field in copy_thread_tls(). Reported-and-tested-by: Dan Horák <dan@danny.cz> Fixes: d35339a42dd1 ("s390: add support for transactional memory") Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-15s390/topology: make "topology=off" parameter workHeiko Carstens
[ Upstream commit 68cc795d1933285705ced6d841ef66c00ce98cbe ] The "topology=off" kernel parameter is supposed to prevent the kernel to use hardware topology information to generate scheduling domains etc. For an unknown reason I implemented this in a very odd way back then: instead of simply clearing the MACHINE_HAS_TOPOLOGY flag within the lowcore I added a second variable which indicated that topology information should not be used. This is more than suboptimal since it partially doesn't work. For the fake NUMA case topology information is still considered and scheduling domains will be created based on this. To fix this and to simplify the code get rid of the extra variable and implement the "topology=off" case like it is done for other features. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17s390/kvm: do not rely on the ILC on kvm host protection faulsChristian Borntraeger
commit c0e7bb38c07cbd8269549ee0a0566021a3c729de upstream. For most cases a protection exception in the host (e.g. copy on write or dirty tracking) on the sie instruction will indicate an instruction length of 4. Turns out that there are some corner cases (e.g. runtime instrumentation) where this is not necessarily true and the ILC is unpredictable. Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for all possible ILCs. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/cputime: fix incorrect system timeMartin Schwidefsky
commit 07a63cbe8bcb6ba72fb989dcab1ec55ec6c36c7e upstream. git commit c5328901aa1db134 "[S390] entry[64].S improvements" removed the update of the exit_timer lowcore field from the critical section cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done blocks. If the PSW is updated by the critical section cleanup to point to user space again, the interrupt entry code will do a vtime calculation after the cleanup completed with an exit_timer value which has *not* been updated. Due to this incorrect system time deltas are calculated. If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry time of the interrupt. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25s390/kdump: Add final noteMichael Holzheu
commit dcc00b79fc3d076832f7240de8870f492629b171 upstream. Since linux v3.14 with commit 38dfac843cb6d7be1 ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreded as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: use correct input data address for setup_randomnessHeiko Carstens
commit 4920e3cf77347d7d7373552d4839e8d832321313 upstream. The current implementation of setup_randomness uses the stack address and therefore the pointer to the SYSIB 3.2.2 block as input data address. Furthermore the length of the input data is the number of virtual-machine description blocks which is typically one. This means that typically a single zero byte is fed to add_device_randomness. Fix both of these and use the address of the first virtual machine description block as input data address and also use the correct length. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: make setup_randomness workHeiko Carstens
commit da8fd820f389a0e29080b14c61bf5cf1d8ef5ca1 upstream. Commit bcfcbb6bae64 ("s390: add system information as device randomness") intended to add some virtual machine specific information to the randomness pool. Unfortunately it uses the page allocator before it is ready to use. In result the page allocator always returns NULL and the setup_randomness function never adds anything to the randomness pool. To fix this use memblock_alloc and memblock_free instead. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390/kdump: Use "LINUX" ELF note name instead of "CORE"Michael Holzheu
commit a4a81d8eebdc1d209d034f62a082a5131e4242b5 upstream. In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified as note name. Otherwise the notes are ignored. For /proc/vmcore we currently use "CORE" for these notes. Up to now this has not been a real problem because the dump analysis tool "crash" does not check the note name. But it will break all programs that use libbfd for processing ELF notes. So fix this and use "LINUX" for all s390 specific notes to comply with libbfd. Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01s390/ptrace: Preserve previous registers for short regset writeMartin Schwidefsky
commit 9dce990d2cf57b5ed4e71a9cdbd7eae4335111ff upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. convert_vx_to_fp() is adapted to handle only a specified number of registers rather than unconditionally handling all of them: other callers of this function are adapted appropriately. Based on an initial patch by Dave Martin. Reported-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12s390/topology: always use s390 specific sched_domain_topology_levelHeiko Carstens
commit ebb299a51059017ec253bd30781a83d1f6e11b24 upstream. The s390 specific sched_domain_topology_level should always be used, not only if the machine provides topology information. Luckily this odd behaviour, that was by accident introduced with git commit d05d15da18f5 ("s390/topology: delay initialization of topology cpu masks") has currently no side effect. Fixes: d05d15da18f5 ("s390/topology: delay initialization of topology cpumasks") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09s390/kexec: use node 0 when re-adding crash kernel memoryHeiko Carstens
commit 9f88eb4df728aebcd2ddd154d99f1d75b428b897 upstream. When re-adding crash kernel memory within setup_resources() the function memblock_add() is used. That function will add memory by default to node "MAX_NUMNODES" instead of node 0, like the memory detection code does. In case of !NUMA this will trigger this warning when the kernel generates the vmemmap: Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220 CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16 Call Trace: [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8 [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50 [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0 [<0000000000840136>] sparse_mem_map_populate+0x46/0x68 [<0000000000d0c59c>] sparse_init+0x184/0x238 [<0000000000cf45f6>] paging_init+0xbe/0xf8 [<0000000000cf1d4a>] setup_arch+0xa02/0xae0 [<0000000000ced75a>] start_kernel+0x72/0x450 [<0000000000100020>] _stext+0x20/0x80 If NUMA is selected numa_setup_memory() will fix the node assignments before the vmemmap will be populated; so this warning will only appear if NUMA is not selected. To fix this simply use memblock_add_node() and re-add crash kernel memory explicitly to node 0. Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: 4e042af463f8 ("s390/kexec: fix crash on resize of reserved memory") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-11mm: kmemleak: scan .data.ro_after_initJakub Kicinski
Limit the number of kmemleak false positives by including .data.ro_after_init in memory scanning. To achieve this we need to add symbols for start and end of the section to the linker scripts. The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark families as __ro_after_init"). Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-24s390/dumpstack: use pr_cont within show_stack and dieHeiko Carstens
Use pr_cont instead of printk calls also within show_stack and die in order to avoid extra line breaks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>